Solr: How to make spellcheck case-insensitive, but return the original word with uppercase letters?

I am working on a Solr project to create a spellchecker.

Why is it that if I type "britne", it automatically suggests "britney", but when I type "Britne", it does not find any result? Here is my spellcheck field type:

<fieldType name="suggestText" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1" ignoreCase="true"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt" ignoreCase="true"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory" ignoreCase="true"/>
    </analyzer>
    <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1" ignoreCase="true"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt" ignoreCase="true"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory" ignoreCase="true"/>
    </analyzer>
</fieldType>

It has the LowerCaseFilterFactory in both the query analyzer and the index analyzer, so I assumed it would convert my query to lowercase and compare it with the words stored in lowercase, but apparently it does not.

Also, when I type "Britne", "britne" or "BriTnE", I would like the result to be "Britney" (and not "britney"). How can I make my spellchecker case-insensitive, but have it return the words with their original case?

+4
2 answers

I'm not sure whether this works, but maybe you can use copy fields (copyField) for this:

Do not use LowerCaseFilterFactory in the suggestText field, but do use LowerCaseFilterFactory in a second field (let's call it suggestText_lower). Then copy the field into the suggestText field.

Thus, "BriTnE" will be matched by typing "britne" without lowercasing the "suggestText" field.
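A minimal sketch of what such a schema.xml setup might look like. The field type names suggestTextCased and suggestTextLower are hypothetical, and the copy direction (from suggestText into suggestText_lower) is an assumption about what the answer intends; it has not been tested against the asker's setup.

```xml
<!-- Sketch only: the type names below are hypothetical, not from the question. -->

<!-- Keeps the original case, so suggestions come back as "Britney". -->
<fieldType name="suggestTextCased" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>

<!-- Lowercases at index and query time, so "BriTnE" and "britne" produce the same term. -->
<fieldType name="suggestTextLower" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="suggestText" type="suggestTextCased" indexed="true" stored="true"/>
<field name="suggestText_lower" type="suggestTextLower" indexed="true" stored="false"/>

<!-- Assumed direction: the cased value is copied into the lowercased field. -->
<copyField source="suggestText" dest="suggestText_lower"/>
```

The idea would be to match against suggestText_lower while reading the displayed value from the stored suggestText field.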

0

You are confusing a few things about indexing and storing here.

About storing: when you define a field with stored="true", the value is stored "as is" and does not reflect what the index contains. For example:

<field name="FIELDNAME" type="text" indexed="false" stored="true" multiValued="false" required="true" />

To check what was stored, just run a simple query that returns all the fields.

Next, the indexes. Here you process (analyze and filter) your values to make them searchable. For the same value, you may have to build several indexes in order to support different types of searches. Consider this seriously; it is often the best option. This is done with "copyField", and you need to store the value only once.

To check your indexed values, use the "Schema Browser" (open the admin console, select your instance and select the schema browser, then select the field you want to check and click "Load Term Info"). There you will see how the value was tokenized and whether it was really lowercased as you think: I have already had a few surprises there. If it was not, you can try the <tokenizer class="solr.StandardTokenizerFactory"/> tokenizer in combination with LowerCaseFilterFactory; this worked for me.
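As a rough illustration of that last suggestion, a stripped-down version of the asker's field type could look like the following. This is only a sketch: it deliberately drops the other filters from the question to show the tokenizer and lowercase pair in isolation, and a single analyzer element without a type attribute is used so the same chain applies at both index and query time.

```xml
<!-- Sketch only: a reduced variant of the question's suggestText type. -->
<fieldType name="suggestText" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Split the input into word tokens. -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Lowercase every token, for indexed values and queries alike. -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

After reindexing, the Schema Browser should show whether the terms for this field are actually lowercased.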

Finally, your query matters too, and it is probably the solution to your problem. When you search for Britne, you must build the query with the similarity function (fuzzy search), or configure it as the default search behavior. You can try searching for Britne~ (the same as Britne~0.5), or Britne~0.8, or some other value. You will need to fine-tune it to your needs and context.

0
