Elasticsearch analyzer - lowercase and whitespace tokenizer

How can I create a mapping that tokenizes a string on whitespace and also lowercases it for indexing?

This is my current mapping, which tokenizes on whitespace. What I can't figure out is how to also lowercase the tokens, both at index time and at search (query) time:

    {
      "mappings": {
        "my_type": {
          "properties": {
            "title": {
              "type": "string",
              "analyzer": "whitespace",
              "tokenizer": "whitespace",
              "search_analyzer": "whitespace"
            }
          }
        }
      }
    }

Please, help...

+7
javascript elasticsearch lucene
2 answers

I managed to write my own analyzer, and it works:

    "settings": {
      "analysis": {
        "analyzer": {
          "lowercasespaceanalyzer": {
            "type": "custom",
            "tokenizer": "whitespace",
            "filter": ["lowercase"]
          }
        }
      }
    },
    "mappings": {
      "my_type": {
        "properties": {
          "title": {
            "type": "string",
            "analyzer": "lowercasespaceanalyzer",
            "tokenizer": "whitespace",
            "search_analyzer": "whitespace",
            "filter": ["lowercase"]
          }
        }
      }
    }
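Conceptually, this custom analyzer is a whitespace tokenizer followed by a lowercase token filter. A minimal Python sketch of that pipeline (illustrative only, not Elasticsearch code):

```python
def analyze(text):
    """Mimic a whitespace tokenizer followed by a lowercase token filter."""
    tokens = text.split()               # whitespace tokenizer: split on runs of whitespace
    return [t.lower() for t in tokens]  # lowercase token filter

print(analyze("Some DATA"))  # ['some', 'data']
```

Because the same analyzer runs at both index time and query time, a query for "data" matches a document that was indexed with "DATA".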
+9

You have two options:

Simple analyzer

The simple analyzer will likely satisfy your needs:

    curl -XGET 'localhost:9200/myindex/_analyze?analyzer=simple&pretty' -d 'Some DATA'
    {
      "tokens" : [ {
        "token" : "some",
        "start_offset" : 0,
        "end_offset" : 4,
        "type" : "word",
        "position" : 1
      }, {
        "token" : "data",
        "start_offset" : 5,
        "end_offset" : 9,
        "type" : "word",
        "position" : 2
      } ]
    }

To use a simple analyzer in your mapping:

    {
      "mappings": {
        "my_type": {
          "properties": {
            "title": {
              "type": "string",
              "analyzer": "simple"
            }
          }
        }
      }
    }

Custom analyzer

The second option is to define your own custom analyzer, specifying how to tokenize and filter the data, and then reference that analyzer in your mapping.
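For example, a custom analyzer (here named `my_lowercase_whitespace`, an illustrative name) that combines the whitespace tokenizer with the lowercase filter would look like this in the index creation body, a sketch along the lines of the asker's own solution:

```json
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_lowercase_whitespace": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "my_type": {
      "properties": {
        "title": {
          "type": "string",
          "analyzer": "my_lowercase_whitespace"
        }
      }
    }
  }
}
```

Since no separate `search_analyzer` is given, the same analyzer is applied to queries against `title`, which keeps matching case-insensitive.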

+2
