ElasticSearch - Increasing relevance based on field value

I need to find a way in Elasticsearch to boost the relevance of a document based on the value of a specific field. Specifically, all my documents contain a special field, and the higher its value, the more relevant the document should be, regardless of the search.

Consider the following document structure:

{
    "_all" : {"enabled" : "true"},
    "properties" : {
        "_id": {"type" : "string", "store" : "yes", "index" : "not_analyzed"},
        "first_name": {"type" : "string", "store" : "yes", "index" : "yes"},
        "last_name": {"type" : "string", "store" : "yes", "index" : "yes"},
        "boosting_field": {"type" : "integer", "store" : "yes", "index" : "yes"}
    }
}

I would like documents with a higher boosting_field to be inherently more relevant than those with a lower boosting_field. This is just a starting point: how well the query matches the other fields will also be taken into account when determining the final relevance score of each document. But, all else being equal, the higher the boosting_field, the more relevant the document.

Does anyone have an idea how to do this?

Thank you so much!

+63
search elasticsearch
Sep 14 '12 at 15:20
3 answers

You can boost either at index time or at query time. I usually prefer query-time boosting, even though it makes queries a little slower; otherwise I would need to reindex every time I want to change my boost factors, which usually need fine-tuning and should be quite flexible.

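There are various ways to apply query-time boosting with the Elasticsearch query DSL: the Boosting Query, the Custom Filters Score Query, the Custom Boost Factor Query, and the Custom Score Query.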

The first three queries are useful if you want to give a specific boost to documents that match specific queries or filters, for example if you want to boost only documents published in the last month. You could use this approach with your boosting_field, but you would need to define some boosting_field intervals by hand and give each a different boost, which is not so great.
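
To show what defining intervals by hand would involve, here is a minimal Python sketch that only builds a request body as a dict and prints it; it does not talk to a cluster. It assumes the old custom_filters_score query with range filters, and the interval bounds and boost values are invented purely for illustration.

```python
import json

# Hypothetical intervals for boosting_field:
# (lower bound inclusive, upper bound exclusive, boost).
INTERVALS = [
    (0, 1000, 1.0),
    (1000, 10000, 1.5),
    (10000, 100001, 2.0),
]


def interval_boost_query(inner_query, intervals=INTERVALS):
    """Build a custom_filters_score body with one range filter per interval."""
    filters = [
        {
            "filter": {"range": {"boosting_field": {"gte": low, "lt": high}}},
            "boost": boost,
        }
        for low, high, boost in intervals
    ]
    return {
        "query": {
            "custom_filters_score": {
                "query": inner_query,
                "filters": filters,
                # Assumed setting: use the boost of the first matching filter.
                "score_mode": "first",
            }
        }
    }


body = interval_boost_query({"match_all": {}})
print(json.dumps(body, indent=2))
```

Every change to the intervals means editing this list, which is the maintenance cost mentioned above.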

The best solution would be to use a Custom Score Query, which lets you run a query and customize its score with a script. It is quite powerful: the script can modify the score itself directly. First of all, I would scale the boosting_field values to a range such as 0 to 1, so that your final score does not become a very large number. To do this, you need to know roughly the minimum and maximum values the field can contain. Suppose, for example, a minimum of 0 and a maximum of 100000. Once you scale the boosting_field value to a number from 0 to 1, you can add the result to the actual score as follows:

{
    "query" : {
        "custom_score" : {
            "query" : { "match_all" : {} },
            "script" : "_score + (1 * doc.boosting_field.doubleValue / 100000)"
        }
    }
}
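
As a rough illustration of the arithmetic in that script, the following Python sketch mirrors the formula outside Elasticsearch. The text scores and field values are made-up numbers, and the maximum of 100000 is the assumption from the example.

```python
MAX_BOOSTING_FIELD = 100000  # assumed maximum, as in the script above


def additive_score(text_score, boosting_field, max_value=MAX_BOOSTING_FIELD):
    """Mirror of the script: _score + boosting_field / max_value."""
    return text_score + boosting_field / max_value


# Two documents that match the text query equally well (hypothetical score 1.0).
low = additive_score(1.0, 500)
high = additive_score(1.0, 80000)
print(low, high)

# A document with a better text match can still win over a high boosting_field,
# because the scaled boost never adds more than 1 to the score.
strong_match = additive_score(2.5, 0)
print(strong_match > high)
```

With equal text scores the document with the larger boosting_field ranks first, while the text match still dominates when it differs by more than 1.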

You can also use boosting_field as a boost factor (_score * rather than _score +), but then you need to scale it to an interval with a minimum value of 1 (just add 1).
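
The reason for the minimum of 1 is easy to see in a small Python sketch of the multiplicative variant. This is only the arithmetic, not an Elasticsearch call, and the scores are hypothetical.

```python
MAX_BOOSTING_FIELD = 100000  # assumed maximum, as in the example above


def multiplicative_score(text_score, boosting_field, max_value=MAX_BOOSTING_FIELD):
    """Mirror of a script like: _score * (1 + boosting_field / max_value)."""
    # The +1 shifts the scaled value from [0, 1] to [1, 2], so a document
    # with boosting_field = 0 keeps its text score instead of dropping to 0.
    factor = 1 + boosting_field / max_value
    return text_score * factor


unboosted = multiplicative_score(2.0, 0)
boosted = multiplicative_score(2.0, 100000)
print(unboosted, boosted)

# Without the +1, the same document would be wiped out entirely.
print(2.0 * (0 / MAX_BOOSTING_FIELD))
```

A document with boosting_field 0 keeps its text score unchanged, and one at the assumed maximum has it doubled.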

You can also tune the result by applying a weight to the value you use to influence the score. This matters even more when you combine several boost factors and want to give each of them a different weight.
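
One way to keep such a weight adjustable is to pass it to the script as a parameter. The Python sketch below only builds the request body; it assumes that custom_score accepts a "params" object whose entries are visible to the script by name, and the parameter names weight and max_value are hypothetical.

```python
import json


def weighted_custom_score(inner_query, weight, max_value=100000.0):
    """Build a custom_score body whose script applies a tunable weight."""
    return {
        "query": {
            "custom_score": {
                "query": inner_query,
                # Assumed: params are exposed to the script under these names.
                "params": {"weight": weight, "max_value": max_value},
                "script": "_score + weight * doc.boosting_field.doubleValue / max_value",
            }
        }
    }


# Doubling the influence of boosting_field without touching the script text.
body = weighted_custom_score({"match_all": {}}, weight=2.0)
print(json.dumps(body, indent=2))
```

With several boost factors, each one would get its own scaled term and its own weight parameter in the script.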

+70
Sep 14 '12 at 19:17

With a recent version of Elasticsearch (version 1.3+) you will want to use "function score queries":

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html

A scored query_string query looks like this:

{
    "query": {
        "function_score": {
            "query": {
                "query_string": { "query": "my search terms" }
            },
            "functions": [
                { "field_value_factor": { "field": "my_boost" } }
            ]
        }
    }
}

"my_boost" is a numeric field in your search index that contains the boost factor for individual documents. Its mapping might look like this:

 { "my_boost": { "type": "float", "index": "not_analyzed" } } 
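
If the raw field value is too strong as a multiplier, field_value_factor can be tuned. The Python sketch below builds such a request body without contacting a cluster; the factor, modifier and boost_mode options are taken from the linked function_score reference as I understand it, so check them against the documentation for your version. The function name boosted_search is hypothetical.

```python
import json


def boosted_search(terms, boost_field="my_boost", factor=1.0, modifier="log1p"):
    """Build a function_score body that scales the text score by a field value."""
    return {
        "query": {
            "function_score": {
                "query": {"query_string": {"query": terms}},
                "functions": [
                    {
                        "field_value_factor": {
                            "field": boost_field,
                            # Multiplier applied to the field value.
                            "factor": factor,
                            # "log1p" dampens large values; "none" uses the value as is.
                            "modifier": modifier,
                        }
                    }
                ],
                # "multiply" combines the text score with the function result.
                "boost_mode": "multiply",
            }
        }
    }


body = boosted_search("my search terms")
print(json.dumps(body, indent=2))
```

Keeping the options as function arguments makes it easy to experiment with different settings at query time without reindexing.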
+13
Dec 14 '14 at 19:36

If you want to avoid boosting inside every query, you might consider putting the boost in your mapping by adding "boost": factor to the field.

So your mapping might look like this:

{
    "_all" : {"enabled" : "true"},
    "properties" : {
        "_id": {"type" : "string", "store" : "yes", "index" : "not_analyzed"},
        "first_name": {"type" : "string", "store" : "yes", "index" : "yes"},
        "last_name": {"type" : "string", "store" : "yes", "index" : "yes"},
        "boosting_field": {"type" : "integer", "store" : "yes", "index" : "yes", "boost" : 10.0}
    }
}
+3
Jan 26 '14 at 17:23