When rewriting a multi-term query, apply constant_score to each term, not the entire query

Question

I am searching for cities from the geonames database. A typical search string would be "San Francisco CA". My documents have a city field and a state field. I run a match query for the search string against both city and state, then combine the matches with bool:

 "query" : {
   "bool" : {
     "must" : { "match" : { "state" : { "query" : "San Francisco CA" } } },
     "should" : { "match" : { "city" : { "query" : "San Francisco CA" } } }
   }
 }

I have these two documents in my db:

 {"city" : "San Francisco", "state" : "CA"}
 {"city" : "San Marino", "state" : "San Marino"}

The problem is that the match of "San" against the state "San Marino" scores much higher than the match of "CA" against the state of San Francisco, because there are many cities with the state "CA" and very few cities with the state "San Marino".
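To make the effect concrete, here is a minimal sketch of the classic Lucene TF-IDF inverse document frequency, idf = 1 + ln(numDocs / (docFreq + 1)), which I believe is what the default similarity in Elasticsearch 1.x uses. The document counts below are made-up numbers for illustration, not figures from the geonames data.

```python
import math

def classic_idf(doc_freq: int, num_docs: int) -> float:
    """Classic TF-IDF inverse document frequency: rarer terms get larger values."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))

# Hypothetical counts: many cities with state "CA", very few with "San Marino".
NUM_DOCS = 100_000
idf_ca = classic_idf(doc_freq=5_000, num_docs=NUM_DOCS)
idf_san = classic_idf(doc_freq=10, num_docs=NUM_DOCS)

print(f"idf('ca' in state)  = {idf_ca:.2f}")
print(f"idf('san' in state) = {idf_san:.2f}")
```

The rare term ends up with a much larger weight than the common one, which is why a single "San" match in the state field can outrank a "CA" match.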

I tried to disable IDF using constant_score, but this leads to another problem: matching "San Francisco CA" against the city "San Francisco", where two terms match, scores the same as matching "San Francisco CA" against the city "San Marino", where only one term matches. When a multi-term match query is rewritten into separate term queries, is it possible to have constant_score applied to each of the rewritten queries, so that I get a score of 2 for San Francisco and a score of 1 for just San?
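For reference, the behaviour being asked for can be approximated on the client side by splitting the search string and wrapping each term in its own constant_score clause. The sketch below only builds the request body as a Python dict; the helper name `per_term_constant_score` is hypothetical, the whitespace split is a naive stand-in for the real analyzer, and the `query` key inside constant_score is the Elasticsearch 1.x form (newer releases expect `filter` instead).

```python
import json

def per_term_constant_score(text, field):
    """Build a bool query with one constant_score clause per search term.

    Hypothetical helper: splitting on whitespace is an assumption and only
    approximates what the field's analyzer would do.
    """
    clauses = []
    for term in text.split():
        clauses.append({
            "constant_score": {
                # Each matching term contributes the same fixed score,
                # regardless of how rare the term is.
                "query": {"match": {field: term}},
                "boost": 1.0,
            }
        })
    return {"bool": {"should": clauses}}

city_query = per_term_constant_score("San Francisco CA", "city")
print(json.dumps({"query": city_query}, indent=2))
```

With this shape, a city matching both "San" and "Francisco" satisfies two clauses and a city matching only "San" satisfies one, so the score should grow with the number of matching terms rather than with their rarity. The exact values still depend on query normalization and the coordination factor.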

elasticsearch

Beowulfenator Sep 22 '15 at 19:27

1 answer

Beowulfenator · Accepted Answer · 2015-09-24T07:44:33+0000

With good help from the Elasticsearch discussion forum, I have a solution.

The easiest way to make IDF constant is to write your own similarity class. Here is my updated example for Elasticsearch 1.7.0.

The class forces the IDF to always be 1, which solves my problem.
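The actual class is Java code that plugs into Elasticsearch and is not reproduced here. As a rough illustration of the idea only, the toy model below mimics a TF-IDF similarity in Python and overrides its `idf` method to return 1. Both class names are hypothetical, and the scoring formula is simplified: it ignores norms, boosts, query normalization and the coordination factor.

```python
import math

class ToyTfIdfSimilarity:
    """Toy stand-in for a TF-IDF similarity; not Lucene's actual class."""

    def tf(self, freq):
        return math.sqrt(freq)

    def idf(self, doc_freq, num_docs):
        return 1.0 + math.log(num_docs / (doc_freq + 1))

    def term_score(self, freq, doc_freq, num_docs):
        # Simplified per-term score: tf * idf^2.
        return self.tf(freq) * self.idf(doc_freq, num_docs) ** 2


class ConstantIdfSimilarity(ToyTfIdfSimilarity):
    """Same model, but every term is treated as equally rare."""

    def idf(self, doc_freq, num_docs):
        return 1.0


default_sim = ToyTfIdfSimilarity()
constant_sim = ConstantIdfSimilarity()

# Hypothetical document frequencies for a rare and a common term.
for name, sim in (("default", default_sim), ("constant", constant_sim)):
    rare = sim.term_score(freq=1, doc_freq=10, num_docs=100_000)
    common = sim.term_score(freq=1, doc_freq=5_000, num_docs=100_000)
    print(f"{name}: rare={rare:.2f} common={common:.2f}")
```

With the override in place, a rare term and a common term contribute the same amount per match, so the number of matching terms decides the ranking.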
