Shingles in Elasticsearch: why does this custom analyzer example fail?

I have rewritten my problem as a complete curl script so that it is easier to reproduce. The problem: the search does not appear to be performed with the custom analyzer. I am using the latest ES for this.

Delete old data

curl -XDELETE "http://localhost:9200/test_shingling" 

Create an index with settings

 curl -XPOST "http://localhost:9200/test_shingling/" -d '{ "settings": { "index": { "number_of_shards": 10, "number_of_replicas": 1 }, "analysis": { "analyzer": { "ShingleAnalyzer": { "tokenizer": "BreadcrumbPatternAnalyzer", "filter": [ "standard", "lowercase", "filter_stemmer", "filter_shingle" ] } }, "filter": { "filter_shingle": { "type": "shingle", "max_shingle_size": 2, "min_shingle_size": 2, "output_unigrams": false }, "filter_stemmer": { "type": "porter_stem", "language": "English" } }, "tokenizer": { "BreadcrumbPatternAnalyzer": { "type": "pattern", "pattern": " |\\$\\$\\$" } } } } }' 

Define the mapping

 curl -XPOST "http://localhost:9200/test_shingling/item/_mapping" -d '{ "item": { "properties": { "Title": { "type": "string", "search_analyzer": "ShingleAnalyzer", "index_analyzer": "ShingleAnalyzer" } } } }' 

Create document

 curl -XPOST "http://localhost:9200/test_shingling/item/" -d '{ "Title":"Kyocera Solar Panel Test" }' 

Test the analyzer: PASS

 curl 'localhost:9200/test_shingling/_analyze?pretty=1&analyzer=ShingleAnalyzer' -d 'Kyocera Solar Panel Test' 

Wait for ES to sync (i.e. refresh the indices)

 curl -XPOST "http://localhost:9200/test_shingling/_refresh" 

Search "Kyocera Solar Panel Test" FAIL

 curl -XPOST "http://localhost:9200/test_shingling/item/_search?pretty=true" -d '{ "query": { "term": { "Title": "Kyocera Solar Panel Test" } } }' 

Search "Solar Panel" FAIL

 curl -XPOST "http://localhost:9200/test_shingling/item/_search?pretty=true" -d '{ "query": { "term": { "Title": "Solar Panel" } } }' 

Search "Kyocera Solar Panel Test" FAIL

 curl -XPOST "http://localhost:9200/test_shingling/item/_search?pretty=true" -d '{ "query": { "query_string": { "default_field": "Title", "query": "Kyocera Solar Panel Test" } } }' 

Search "Solar Panel" FAIL

 curl -XPOST "http://localhost:9200/test_shingling/item/_search?pretty=true" -d '{ "query": { "query_string": { "default_field": "Title", "query": "solar panel" } } }' 
2 answers

The term query looks for an exact match against the indexed terms; it does not run your query text through ShingleAnalyzer.

So you should use a match query instead. The match query applies the analyzer to your query string and then searches ES with the resulting tokens.
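To make the point about exact matching concrete, here is a rough local approximation of what ShingleAnalyzer does to the title. It is only a sketch: the function name `shingles` is made up for illustration, stemming is skipped, and the real token stream should be checked with the `_analyze` call from the question.

```shell
# Rough approximation of ShingleAnalyzer, without the stemmer:
# split on spaces or "$$$", lowercase, then emit two-word shingles only
# (min/max shingle size 2, output_unigrams false).
# "shingles" is a hypothetical helper name, not part of Elasticsearch.
shingles() {
  printf '%s\n' "$1" \
    | sed 's/\$\$\$/ /g' \
    | tr '[:upper:]' '[:lower:]' \
    | awk '{ for (i = 1; i < NF; i++) print $i " " $(i + 1) }'
}

shingles 'Kyocera Solar Panel Test'
```

Under these assumptions the title is indexed as pairs such as `solar panel`, never as the whole string, so a term lookup for the literal `Kyocera Solar Panel Test` has nothing to match. Note also that a single word produces no shingle at all in this approximation.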

Full phrase search

  curl -XPOST "http://localhost:9200/test_shingling/item/_search" -d'{ "query": {"match": {"Title": "Kyocera Solar Panel Test"}}}' 

Partial phrase search

  curl -XPOST "http://localhost:9200/test_shingling/item/_search" -d'{ "query": {"match": {"Title": "Panel Test"}}}' 

Another partial phrase search

  curl -XPOST "http://localhost:9200/test_shingling/item/_search" -d'{ "query": {"match": {"Title": "Solar Panel Test"}}}' 

Hope this helps!


I think that a query_string search by default treats solar panel as solar OR panel, and that you have to set the default operator explicitly in the query_string query. This is what the reference guide says:

default_operator:

The default operator used if no explicit operator is specified. For example, with a default operator of OR, the query capital of Hungary is translated to capital OR of OR Hungary, and with a default operator of AND, the same query is translated to capital AND of AND Hungary. The default value is OR.
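As a sketch of what setting the operator explicitly might look like, the request below reuses the question's query_string search and adds `default_operator`. The variable name `QUERY_BODY` is only for illustration, and the request follows the question's curl style; newer Elasticsearch versions also require a `Content-Type: application/json` header. This has not been verified against the shingle index above, so treat it as an illustration of the parameter rather than a confirmed fix.

```shell
# Hypothetical variable name; the body mirrors the question's query_string
# search, with default_operator set explicitly to AND.
QUERY_BODY='{ "query": { "query_string": { "default_field": "Title", "query": "solar panel", "default_operator": "AND" } } }'

# Assumes Elasticsearch is listening on localhost:9200, as in the question.
curl -s --max-time 5 -XPOST "http://localhost:9200/test_shingling/item/_search?pretty=true" -d "$QUERY_BODY" \
  || echo "request failed (is Elasticsearch listening on localhost:9200?)"
```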
