How can I build a flexible search query that requires every token in a document field to be matched?

I need to make sure that each field token matches at least one token in a user search.

Here is a simplified, generic example.

Let Store_Name = "Square Steakhouse"

It is easy to build a query that matches this document when a user searches for Square or Steakhouse. In addition, with a kstem filter attached to the default analyzer, Steakhouses will also match.

    {
      "size": 30,
      "query": {
        "match": {
          "Store_Name": {
            "query": "Square",
            "operator": "AND"
          }
        }
      }
    }

However, I need every token of the Store_Name field to be matched. I need the following behavior:

    Query: Square Steakhouse     Result: Match
    Query: Square Steakhouses    Result: Match
    Query: Squared Steakhouse    Result: Match
    Query: Square                Result: No Match
    Query: Steakhouse            Result: No Match
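The rule behind this table can be stated as a set condition on analyzed tokens. The following is only a minimal Python sketch of that rule, not an Elasticsearch query. It assumes both inputs are token lists already produced by the same analyzer, and the stemmed forms are written by hand on the assumption that the analyzer reduces "Steakhouses" and "Squared" as described above. The function name is hypothetical.

```python
def all_field_tokens_matched(field_tokens, query_tokens):
    """Return True when every analyzed field token appears among the query tokens."""
    return set(field_tokens) <= set(query_tokens)


# Analyzed tokens of Store_Name = "Square Steakhouse" (written by hand).
field = ["square", "steakhouse"]

# "Square Steakhouses", assumed to stem to the same two tokens.
print(all_field_tokens_matched(field, ["square", "steakhouse"]))  # True

# "Square" alone leaves the field token "steakhouse" unmatched.
print(all_field_tokens_matched(field, ["square"]))  # False
```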

Finally:

  • I cannot use not_analyzed, since I need the analyzer's features.
  • I intend to use kstem, custom synonyms, a custom char_filter, lowercase, and the standard tokenizer.

However, I need to make sure that each field token is matched.

Is this possible in Elasticsearch?

1 answer

Here is one workable approach.

It is not ideal, but it is a good compromise in terms of simplicity, computation and storage.

  • Store the token count of the field at index time
  • Get the token count of the search text
  • Run a filtered query that requires the two token counts to be equal

You will need to use the analyze API to get the number of tokens. Be sure to use the same analyzer as the field. Here is a VB.NET function that returns the token count:

    Imports PlainElastic.Net
    Imports Newtonsoft.Json.Linq

    Private Function GetTokenCount(ByVal RawString As String, Optional ByVal Analyzer As String = "default") As Integer
        ' Empty or whitespace-only input has no tokens
        If Trim(RawString) = "" Then Return 0
        Dim client = New ElasticConnection()
        ' Submit the analyze request using the PlainElastic.NET API
        Dim result = client.Post("http://localhost:9200/myindex/_analyze?analyzer=" & Analyzer, RawString)
        ' Parse the response into a JSON.NET JObject
        Dim J = JObject.Parse(result.ToString())
        ' Return the number of entries in the "tokens" array
        Return (From X In J("tokens")).Count()
    End Function

Use this at index time to store the field's token count with each document. Make sure the mapping has an entry for TokenCount.
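For readers not working in VB.NET, the same token-count step might look roughly like this in Python. This is a sketch under assumptions. It mirrors the request shape of the VB.NET function (analyzer in the query string, raw text as the body), which matches older Elasticsearch releases. Newer releases expect a JSON request body instead, so check the version in use. The names `count_tokens` and `get_token_count` and the `base_url` default are illustrative.

```python
import json
import urllib.parse
import urllib.request


def count_tokens(analyze_response):
    """Count the tokens in a parsed _analyze response (a dict with a "tokens" list)."""
    return len(analyze_response.get("tokens", []))


def get_token_count(text, analyzer="default", base_url="http://localhost:9200/myindex"):
    """Ask the _analyze endpoint to tokenize text and return the token count."""
    if not text.strip():
        return 0  # nothing to analyze, so skip the HTTP round trip
    url = base_url + "/_analyze?analyzer=" + urllib.parse.quote(analyzer)
    request = urllib.request.Request(url, data=text.encode("utf-8"), method="POST")
    with urllib.request.urlopen(request) as response:
        return count_tokens(json.load(response))


# Parsing a response of the general shape the analyze API returns.
sample = {"tokens": [{"token": "square"}, {"token": "steakhouse"}]}
print(count_tokens(sample))  # 2
```

At index time, the returned count would be written into the document's TokenCount field next to the analyzed text field.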

Here is an Elasticsearch query that uses the stored token count:

    {
      "size": 30,
      "query": {
        "filtered": {
          "query": {
            "match": {
              "MyField": {
                "query": "[query]",
                "operator": "AND"
              }
            }
          },
          "filter": {
            "term": {
              "TokenCount": [tokencount]
            }
          }
        }
      }
    }
  • Replace [query] with the user's search text
  • Replace [tokencount] with the number of tokens in the search text (using the GetTokenCount function above)
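The two substitutions can also be done programmatically rather than by editing the JSON by hand. A minimal Python sketch, assuming the token count has already been obtained from the analyze API; the function name `build_exact_token_query` is hypothetical. It keeps the `filtered` query form used above. Later Elasticsearch releases deprecated `filtered` in favor of a `bool` query with a `filter` clause, so the outer structure may need adjusting for the version in use.

```python
import json


def build_exact_token_query(field, query_text, token_count, size=30):
    """Build the request body: AND-match on the field, filtered by stored token count."""
    return {
        "size": size,
        "query": {
            "filtered": {
                # Every search token must occur in the field.
                "query": {"match": {field: {"query": query_text, "operator": "AND"}}},
                # The field must hold exactly as many tokens as the search text.
                "filter": {"term": {"TokenCount": token_count}},
            }
        },
    }


body = build_exact_token_query("MyField", "Square Steakhouses", 2)
print(json.dumps(body, indent=2))
```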

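This ensures that there are at least as many matched tokens as there are tokens in MyField: the AND operator requires every search token to appear in the field, and the equal token counts leave no field token unmatched.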

There are some drawbacks to the above. For example, if the field contains "blue red" and the user searches for "blue blue", this will produce a match. To avoid this, you can use a unique token filter. You can also configure the filter so that
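To see why repeated tokens defeat the count check, and why removing duplicates helps, here is a small Python simulation. It is a stand-in for what a unique token filter would do inside the analyzer and does not call Elasticsearch. The function names are hypothetical and the token lists are written by hand.

```python
def unique_tokens(tokens):
    """Drop repeated tokens while keeping first-seen order."""
    return list(dict.fromkeys(tokens))


def count_check_matches(field_tokens, query_tokens, dedupe):
    """Simulate the AND match plus the equal-token-count filter."""
    if dedupe:
        field_tokens = unique_tokens(field_tokens)
        query_tokens = unique_tokens(query_tokens)
    # AND operator: every search token must occur in the field.
    all_query_terms_present = set(query_tokens) <= set(field_tokens)
    return all_query_terms_present and len(field_tokens) == len(query_tokens)


field = ["blue", "red"]
query = ["blue", "blue"]

print(count_check_matches(field, query, dedupe=False))  # True (false positive)
print(count_check_matches(field, query, dedupe=True))   # False
```

With duplicates removed on both sides, "blue blue" counts as one token, which no longer equals the field's two tokens.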
