This question is very similar to an older one posted here: Get analyzed tokens from ElasticSearch documents, but I thought it made sense to ask it again for the latest version of ElasticSearch in case anything has changed.
We are trying to search bodies of text in ElasticSearch using a search query and a field mapping that uses the snowball stemmer built into ElasticSearch. The performance and results are great, but because we need the stemmed text for post-analysis, we would like the search result to return the actual stemmed tokens of the text field for each document in the search results.
The mapping for the field currently looks like this:
    "TitleEnglish": {
      "type": "string",
      "analyzer": "standard",
      "fields": {
        "english": {
          "type": "string",
          "analyzer": "english"
        },
        "stemming": {
          "type": "string",
          "analyzer": "snowball"
        }
      }
    }
and the search query is run specifically against TitleEnglish.stemming. Ideally, I would like the search to return this field, but requesting it returns the original field value rather than the analyzed one.
Does anyone know how to do this? We looked at Term Vectors, but they seem to be returnable only for an individual document or an explicit set of documents, not as part of a search result.
Or do other solutions, such as Solr or Sphinx, perhaps offer this capability?
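To make the Term Vectors route concrete, here is a rough Python sketch of the two-step workaround we would like to avoid: take the IDs from the search hits, ask the multi term vectors endpoint (`_mtermvectors`) for them, and rebuild the token sequence from the positions. The helper names are hypothetical, the sketch only builds and parses plain dictionaries (no HTTP call is made), and it assumes term vectors can be requested for the `TitleEnglish.stemming` sub-field, which we have not verified.

```python
from typing import Any, Dict, List

# Sub-field from our mapping; assumed to be addressable by the term vectors API.
FIELD = "TitleEnglish.stemming"


def mtermvectors_body(search_response: Dict[str, Any], field: str = FIELD) -> Dict[str, Any]:
    """Build a multi term vectors request body covering every hit of a search response."""
    hits = search_response["hits"]["hits"]
    return {"docs": [{"_id": hit["_id"], "fields": [field]} for hit in hits]}


def ordered_tokens(doc: Dict[str, Any], field: str = FIELD) -> List[str]:
    """Rebuild the token sequence of one document from its term vector positions."""
    terms = doc.get("term_vectors", {}).get(field, {}).get("terms", {})
    positioned = [
        (token["position"], term)
        for term, info in terms.items()
        for token in info.get("tokens", [])
    ]
    # Sorting by position restores the order in which the tokens were emitted.
    return [term for _, term in sorted(positioned)]
```

The body from `mtermvectors_body` would be posted to `/<index>/_mtermvectors`, and `ordered_tokens` applied to each entry under `docs` in the response. This costs a second round trip per search, which is why we are hoping for something built into the search request itself.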
To add some more information: if we run the following query:
GET /_analyze?analyzer=snowball&text=Eight issue of Industrial Lorestan eliminate barriers to facilitate the Committees review of
It returns the tokens eight, issu, industri, etc. This is exactly the output we would like to get back for each matching document, covering all words in the text (so not only the ones that matched the query).
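As a fallback, we could re-run the analyzer ourselves on the original field value of every hit. The Python sketch below shows the idea under some assumptions: the helper names are hypothetical, the request is sent as a JSON body to `_analyze` rather than as query-string parameters, and the response is assumed to have the usual shape with a `tokens` list whose entries carry `token` and `position`. No HTTP call is made here; only the body building and response parsing are shown.

```python
from typing import Any, Dict, List


def analyze_body(text: str, analyzer: str = "snowball") -> Dict[str, str]:
    """Build a JSON body for the _analyze endpoint (analyzer name taken from our mapping)."""
    return {"analyzer": analyzer, "text": text}


def token_strings(analyze_response: Dict[str, Any]) -> List[str]:
    """Extract the token strings from an _analyze response, ordered by position."""
    tokens = sorted(analyze_response.get("tokens", []), key=lambda t: t["position"])
    return [t["token"] for t in tokens]
```

Doing this for every hit means one extra request per document and repeats work that was already done at index time, so it is more of a last resort than the answer we are looking for.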