Elasticsearch string length search

I am using Elasticsearch through NEST (C#). I have a large list of information about people:

{ firstName: 'Frank', lastName: 'Jones', City: 'New York' } 

I would like to filter this list by lastName and also sort it by the length of the name, so that people with only 5 characters in their name come first in the result set, followed by people with 10 characters.

So, in pseudocode, I would like to do something like list.wildcard("j*").sort(m => lastName.length)

I am new to ElasticSearch, so any examples would be helpful.

1 answer

You can do this with script-based sorting.

As a toy example, I set up a trivial index with a few documents:

    PUT /test_index

    POST /test_index/doc/_bulk
    {"index":{"_id":1}}
    {"name":"Bob"}
    {"index":{"_id":2}}
    {"name":"Jeff"}
    {"index":{"_id":3}}
    {"name":"Darlene"}
    {"index":{"_id":4}}
    {"name":"Jose"}

Then I can order the search results as follows:

    POST /test_index/_search
    {
       "query": {
          "match_all": {}
       },
       "sort": {
          "_script": {
             "script": "doc['name'].value.length()",
             "type": "number",
             "order": "asc"
          }
       }
    }
    ...
    {
       "took": 2,
       "timed_out": false,
       "_shards": {
          "total": 5,
          "successful": 5,
          "failed": 0
       },
       "hits": {
          "total": 4,
          "max_score": null,
          "hits": [
             {
                "_index": "test_index",
                "_type": "doc",
                "_id": "1",
                "_score": null,
                "_source": { "name": "Bob" },
                "sort": [ 3 ]
             },
             {
                "_index": "test_index",
                "_type": "doc",
                "_id": "4",
                "_score": null,
                "_source": { "name": "Jose" },
                "sort": [ 4 ]
             },
             {
                "_index": "test_index",
                "_type": "doc",
                "_id": "2",
                "_score": null,
                "_source": { "name": "Jeff" },
                "sort": [ 4 ]
             },
             {
                "_index": "test_index",
                "_type": "doc",
                "_id": "3",
                "_score": null,
                "_source": { "name": "Darlene" },
                "sort": [ 7 ]
             }
          ]
       }
    }
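To get closer to the pseudocode in the question, the same script sort can be combined with a wildcard query instead of match_all. The following is only a minimal sketch in Python that builds the request body as a plain dictionary. The helper name `length_sorted_search` is hypothetical, and the field name `lastName` and pattern `j*` are taken from the question rather than from the test index above.

```python
import json


def length_sorted_search(field, pattern):
    """Build a search body: wildcard match on `field`, sorted by value length.

    Hypothetical helper for illustration; it only assembles the JSON body
    in the same shape as the request shown above.
    """
    return {
        "query": {"wildcard": {field: pattern}},
        "sort": {
            "_script": {
                # Same script as in the answer, with the field name substituted.
                "script": "doc['%s'].value.length()" % field,
                "type": "number",
                "order": "asc",
            }
        },
    }


body = length_sorted_search("lastName", "j*")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of a POST to the index's _search endpoint. Whether `j*` matches `Jones` depends on how the field is analyzed, so treat the pattern as an assumption to verify against your own mapping.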

To filter by length, I can use a script filter as follows:

    POST /test_index/_search
    {
       "query": {
          "filtered": {
             "query": {
                "match_all": {}
             },
             "filter": {
                "script": {
                   "script": "doc['name'].value.length() > 3",
                   "params": {}
                }
             }
          }
       },
       "sort": {
          "_script": {
             "script": "doc['name'].value.length()",
             "type": "number",
             "order": "asc"
          }
       }
    }
    ...
    {
       "took": 3,
       "timed_out": false,
       "_shards": {
          "total": 5,
          "successful": 5,
          "failed": 0
       },
       "hits": {
          "total": 3,
          "max_score": null,
          "hits": [
             {
                "_index": "test_index",
                "_type": "doc",
                "_id": "4",
                "_score": null,
                "_source": { "name": "Jose" },
                "sort": [ 4 ]
             },
             {
                "_index": "test_index",
                "_type": "doc",
                "_id": "2",
                "_score": null,
                "_source": { "name": "Jeff" },
                "sort": [ 4 ]
             },
             {
                "_index": "test_index",
                "_type": "doc",
                "_id": "3",
                "_score": null,
                "_source": { "name": "Darlene" },
                "sort": [ 7 ]
             }
          ]
       }
    }
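The empty "params" object in the script filter suggests the length threshold could be passed in as a parameter rather than hard-coded. Below is a rough sketch in Python that builds such a body. The helper name `min_length_search` and the parameter name `min_len` are made up for illustration, and it assumes that script parameters are visible to the script as variables by name, which is how the script filter of this Elasticsearch generation is documented to behave. The last lines reproduce the filter and sort locally on the four test names, only to show which sort values to expect.

```python
import json


def min_length_search(field, min_len):
    """Build a filtered search body keeping values longer than `min_len`.

    Hypothetical helper; assumes script params are bound as variables
    inside the script (here: `min_len`).
    """
    length_script = "doc['%s'].value.length()" % field
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {
                    "script": {
                        "script": length_script + " > min_len",
                        "params": {"min_len": min_len},
                    }
                },
            }
        },
        # Same ascending sort by length as in the request above.
        "sort": {
            "_script": {
                "script": length_script,
                "type": "number",
                "order": "asc",
            }
        },
    }


body = min_length_search("name", 3)
print(json.dumps(body, indent=2))

# Local illustration of what the filter and sort compute for the test data.
names = ["Bob", "Jeff", "Darlene", "Jose"]
expected_lengths = sorted(len(n) for n in names if len(n) > 3)
print(expected_lengths)  # [4, 4, 7]
```

The local check matches the "sort" values in the response above (4, 4, 7). The relative order of the two four-letter names is not determined by the length sort alone.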

Here is the code I used:

http://sense.qbox.io/gist/22fef6dc5453eaaae3be5fb7609663cc77c43dab

PS: If any of the last names contain spaces, you may want to set "index": "not_analyzed" on that field, so that the script sees the whole name rather than individual tokens.
