Elasticsearch analyzer - lowercase and whitespace tokenizer

How can I create a mapping that tokenizes a string on whitespace and also lowercases it for indexing?

This is my current mapping, which tokenizes on whitespace. What I can't figure out is how to also lowercase the tokens, both at index time and at search (query) time:

    {
      "mappings": {
        "my_type": {
          "properties": {
            "title": {
              "type": "string",
              "analyzer": "whitespace",
              "tokenizer": "whitespace",
              "search_analyzer": "whitespace"
            }
          }
        }
      }
    }

Please, help...

+7
javascript elasticsearch lucene
2 answers

I managed to write my own analyzer, and it works:

    "settings": {
      "analysis": {
        "analyzer": {
          "lowercasespaceanalyzer": {
            "type": "custom",
            "tokenizer": "whitespace",
            "filter": ["lowercase"]
          }
        }
      }
    },
    "mappings": {
      "my_type": {
        "properties": {
          "title": {
            "type": "string",
            "analyzer": "lowercasespaceanalyzer",
            "tokenizer": "whitespace",
            "search_analyzer": "whitespace",
            "filter": ["lowercase"]
          }
        }
      }
    }
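Conceptually, this custom analyzer is a whitespace tokenizer followed by a lowercase token filter. A minimal Python sketch of that pipeline (illustrative only, not Elasticsearch code):

```python
def analyze(text):
    """Mimic a whitespace tokenizer followed by a lowercase token filter."""
    tokens = text.split()               # whitespace tokenizer: split on runs of whitespace
    return [t.lower() for t in tokens]  # lowercase token filter

print(analyze("Some DATA"))  # ['some', 'data']
```

Because the same analyzer runs at both index time and query time, a query for "data" matches a document that was indexed with "DATA".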
+9

You have two options:

Simple analyzer

The simple analyzer will likely satisfy your needs:

    curl -XGET 'localhost:9200/myindex/_analyze?analyzer=simple&pretty' -d 'Some DATA'
    {
      "tokens" : [ {
        "token" : "some",
        "start_offset" : 0,
        "end_offset" : 4,
        "type" : "word",
        "position" : 1
      }, {
        "token" : "data",
        "start_offset" : 5,
        "end_offset" : 9,
        "type" : "word",
        "position" : 2
      } ]
    }

To use a simple analyzer in your mapping:

    {
      "mappings": {
        "my_type": {
          "properties": {
            "title": {
              "type": "string",
              "analyzer": "simple"
            }
          }
        }
      }
    }

Custom analyzer

The second option is to define your own custom analyzer, specifying how to tokenize and filter the data, and then reference that analyzer in your mapping.
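For example, a custom analyzer (here named `my_lowercase_whitespace`, an illustrative name) that combines the whitespace tokenizer with the lowercase filter would look like this in the index creation body, a sketch along the lines of the asker's own solution:

```json
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_lowercase_whitespace": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "my_type": {
      "properties": {
        "title": {
          "type": "string",
          "analyzer": "my_lowercase_whitespace"
        }
      }
    }
  }
}
```

Since no separate `search_analyzer` is given, the same analyzer is applied to queries against `title`, which keeps matching case-insensitive.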

+2
