Prefix matching with cross_fields across multiple fields

I have a multi_match query of type cross_fields, which I want to improve with prefix matching.

    {
      "index": "companies",
      "size": 25,
      "from": 0,
      "body": {
        "_source": {
          "include": [ "name", "address" ]
        },
        "query": {
          "filtered": {
            "query": {
              "multi_match": {
                "type": "cross_fields",
                "query": "Google",
                "operator": "and",
                "fields": [ "name", "address" ]
              }
            }
          }
        }
      }
    }

It works perfectly for queries like google mountain view. The filtered wrapper is there because I need to add geo filters dynamically. A sample document:

 { "id": 1, "name": "Google", "address": "Mountain View" } 

Now I want to allow prefix matching without breaking cross_fields.

Queries like these should match:

  • goog
  • google mount
  • google mountain vi
  • mountain view goo
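
To make the expected behaviour concrete, here is a minimal Python sketch of the rule I am after: every query term must be a prefix of some token in any of the fields. It only models the desired result and is not how Elasticsearch evaluates the query; the helper names tokenize and should_match are made up for illustration, and whitespace splitting is a rough stand-in for the standard analyzer.

```python
# Hypothetical helpers: they model the desired matching rule, not Elasticsearch itself.

def tokenize(text):
    """Lowercase and split on whitespace (rough stand-in for the standard analyzer)."""
    return text.lower().split()


def should_match(query, doc, fields=("name", "address")):
    """Return True if every query term is a prefix of some token in any field."""
    tokens = [tok for field in fields for tok in tokenize(doc[field])]
    return all(
        any(tok.startswith(term) for tok in tokens)
        for term in tokenize(query)
    )


doc = {"id": 1, "name": "Google", "address": "Mountain View"}

for q in ["goog", "google mount", "google mountain vi", "mountain view goo"]:
    print(q, "->", should_match(q, doc))
```

All four queries from the list satisfy this rule, while something like google paris does not.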

If I change multi_match.type to phrase_prefix, the whole query is matched against each field separately, so mountain vi matches but google mountain vi does not.

How can I solve this?

1 answer

Since there are no answers yet and someone else may run into this: I had the same problem, and here is the solution.

Use the edgeNGram token filter.

You need to change the index and mapping settings.

Here is an example of settings:

    "settings" : {
      "index" : {
        "analysis" : {
          "analyzer" : {
            "ngram_analyzer" : {
              "type" : "custom",
              "stopwords" : "_none_",
              "filter" : [ "standard", "lowercase", "asciifolding", "word_delimiter", "no_stop", "ngram_filter" ],
              "tokenizer" : "standard"
            },
            "default" : {
              "type" : "custom",
              "stopwords" : "_none_",
              "filter" : [ "standard", "lowercase", "asciifolding", "word_delimiter", "no_stop" ],
              "tokenizer" : "standard"
            }
          },
          "filter" : {
            "no_stop" : {
              "type" : "stop",
              "stopwords" : "_none_"
            },
            "ngram_filter" : {
              "type" : "edgeNGram",
              "min_gram" : "2",
              "max_gram" : "20"
            }
          }
        }
      }
    }

Of course, you should adapt the analyzers to your own use case. You may want to leave the default analyzer untouched, or add the ngram filter to it so that you don't have to change the mappings. The latter means that every field in your index gets the ngram filter.
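
To get a feel for what an edge n-gram filter with min_gram 2 and max_gram 20 emits for a single token, here is a small Python sketch. It is only an approximation for illustration; the function name edge_ngrams is made up, and it ignores everything else in the analyzer chain.

```python
def edge_ngrams(token, min_gram=2, max_gram=20):
    """Return the prefixes of `token` whose length is between min_gram and max_gram."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


print(edge_ngrams("google"))  # ['go', 'goo', 'goog', 'googl', 'google']
print(edge_ngrams("view"))    # ['vi', 'vie', 'view']
```

In this sketch a token shorter than min_gram yields no grams at all, so a one-letter query term would have nothing to match against.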

And for the mappings:

    "mappings" : {
      "patient" : {
        "properties" : {
          "name" : {
            "type" : "string",
            "analyzer" : "ngram_analyzer"
          },
          "address" : {
            "type" : "string",
            "analyzer" : "ngram_analyzer"
          }
        }
      }
    }

Declare every field that you want to autocomplete with ngram_analyzer. The queries in your question should then work. If you ended up using something else, I would be glad to hear about it.
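
As a rough sanity check of why this works with cross_fields and operator and, the following Python sketch simulates the idea outside Elasticsearch: each field is expanded into edge n-grams at index time, and a document matches when every query term is found in at least one field. The function names are hypothetical, whitespace splitting stands in for the standard tokenizer, and the query terms are compared as typed (not n-grammed), which is a simplifying assumption of this sketch.

```python
def edge_ngrams(token, min_gram=2, max_gram=20):
    """Set of prefixes of `token` with lengths from min_gram to max_gram."""
    return {token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)}


def index_field(text):
    """Simulated index-time analysis: lowercase, split, expand to edge n-grams."""
    grams = set()
    for token in text.lower().split():
        grams |= edge_ngrams(token)
    return grams


def cross_fields_and(query, indexed_doc):
    """Every query term must be present in at least one field's gram set."""
    return all(
        any(term in grams for grams in indexed_doc.values())
        for term in query.lower().split()
    )


source = {"name": "Google", "address": "Mountain View"}
indexed = {field: index_field(value) for field, value in source.items()}

for q in ["goog", "google mount", "google mountain vi", "mountain view goo", "google paris"]:
    print(q, "->", cross_fields_and(q, indexed))
```

The four queries from the question match, and google paris does not, because paris is not a prefix of any token in either field.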


Source: https://habr.com/ru/post/1213881/

