Faceted search problem

Question

I am doing some faceted searches but have run into a problem: I do not get the desired results when a value in the facet field contains several words.

Example: field "animal" with the following data:

    A horse
    Black horse
    Black horse

The faceted search returns “horse (3)” as the top result, whereas I would like it to return “Black horse (2)”.

Here is my schema.xml. The search field is BUSQUEDA and the facet field is SUPERFICIE. I think I have tried most of the possible combinations of field types for these two fields, but it still doesn't work.

    <?xml version="1.0" encoding="UTF-8" ?>
    <schema name="example" version="1.2">
      <types>
        <fieldType name="string" class="solr.StrField"/>
        <fieldType name="facet_texPersonal" class="solr.StrField" sortMissingLast="true" omitNorms="true">
          <analyzer>
            <tokenizer class="solr.KeywordTokenizerFactory"/>
          </analyzer>
        </fieldType>
        <fieldType name="facet_tex" class="solr.TextField" sortMissingLast="true" omitNorms="true">
          <analyzer>
            <tokenizer class="solr.KeywordTokenizerFactory"/>
            <filter class="solr.LowerCaseFilterFactory" />
            <filter class="solr.TrimFilterFactory" />
          </analyzer>
        </fieldType>
        <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
          <analyzer type="index">
            <tokenizer class="solr.WhitespaceTokenizerFactory"/>
            <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
            <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
          </analyzer>
          <analyzer type="query">
            <tokenizer class="solr.WhitespaceTokenizerFactory"/>
            <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
            <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
            <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
          </analyzer>
        </fieldType>
        <fieldType name="textTight" class="solr.TextField" positionIncrementGap="100" >
          <analyzer>
            <tokenizer class="solr.WhitespaceTokenizerFactory"/>
            <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
            <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
            <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="0" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
            <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
          </analyzer>
        </fieldType>
        <fieldType name="textMultidioma" class="solr.TextField" positionIncrementGap="100">
          <analyzer type="index">
            <tokenizer class="solr.WhitespaceTokenizerFactory"/>
            <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
            <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="0"/>
            <filter class="solr.LowerCaseFilterFactory"/>
          </analyzer>
          <analyzer type="query">
            <tokenizer class="solr.WhitespaceTokenizerFactory"/>
            <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
            <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
            <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
            <filter class="solr.LowerCaseFilterFactory"/>
          </analyzer>
        </fieldType>
      </types>
      <fields>
        <field name="BUSQUEDA" type="facet_tex" indexed="true" stored="true"/>
        <field name="SUPERFICIE" type="facet_tex" indexed="true" stored="true"/>
        <field name="NOMBRE" type="string" indexed="true" stored="true"/>
      </fields>
      <uniqueKey>NOMBRE</uniqueKey>
      <defaultSearchField>BUSQUEDA</defaultSearchField>
    </schema>

Any suggestions?

Thanks a bunch in advance!

solr

Carlos Feb 08 '10 at 16:35

2 answers

Mauricio Scheffer · Answer 1 · 2010-02-08T20:25:45+0000

You need to facet on a non-tokenized field (a field whose type uses the class solr.StrField, or one that uses solr.KeywordTokenizerFactory). This thread explains this in detail.

Jonathan williams · Answer 2 · 2011-01-05T12:56:38+0000

We had multi-word facet fields working on a project I worked on earlier. Here is the relevant part of its schema.xml:

    <schema name="example" version="1.2">
      <types>
        <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true" />
        ...
      </types>
      <fields>
        <field name="grant_type" type="string" indexed="true" stored="true" />
        ...
      </fields>
    </schema>

As Mauricio noted, the facet field must be non-tokenized (not split into separate words). In the configuration above, we use the field type "string", which is backed by solr.StrField (non-tokenized).
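If the same data also needs to stay searchable through an analyzed field, one common arrangement is to copy it into a separate untokenized field and facet on that copy. The fragment below is only a sketch of that idea applied to the schema from the question: the field name SUPERFICIE_FACET is made up for illustration, and it assumes the "string" type (solr.StrField) is defined as in the question's schema.

```xml
<!-- Sketch only: SUPERFICIE_FACET is a hypothetical field name,
     not part of the original schema. -->
<fields>
  <!-- Existing field from the question, left unchanged -->
  <field name="SUPERFICIE" type="facet_tex" indexed="true" stored="true"/>
  <!-- Untokenized copy used only for faceting; "string" is solr.StrField -->
  <field name="SUPERFICIE_FACET" type="string" indexed="true" stored="false"/>
</fields>

<!-- Copies the raw input value of SUPERFICIE into the facet field at index time -->
<copyField source="SUPERFICIE" dest="SUPERFICIE_FACET"/>
```

The facet request would then use facet=true&facet.field=SUPERFICIE_FACET. Note that a schema change only affects documents indexed afterwards, so existing documents have to be reindexed before the facet counts change.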

Further hints on field types for faceting (no lowercasing, no splitting on punctuation, etc.) can be found on the Solr faceting overview page.
