Elasticsearch: get the average document length

Is there a better way in Elasticsearch to get the average length of a document in a particular index, other than running a match_all query and manually averaging the lengths of all returned documents?

+8
elasticsearch
3 answers

The _size field, if enabled, gives you the size of each document (in bytes) for free. Combining it with an avg aggregation should give you what you want. Something like:

 { "query" : {"match_all" : {}}, "aggs" : {"avg_size" : {"avg" : {"field" : "_size"}}} } 
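
As a rough sketch of how this request could be put together and its result read from client code, the Python below builds the body as a dict and pulls the average out of a response. The helper names (`build_avg_size_query`, `extract_avg`) and the sample response values are made up for illustration; the `"size": 0` entry is an addition that asks Elasticsearch to skip returning hits, since only the aggregation is needed.

```python
import json


def build_avg_size_query(field="_size", agg_name="avg_size"):
    """Build a search body that averages a numeric field over all documents."""
    return {
        "size": 0,  # assumption: hits are not needed, only the aggregation
        "query": {"match_all": {}},
        "aggs": {agg_name: {"avg": {"field": field}}},
    }


def extract_avg(response, agg_name="avg_size"):
    """Read the average from a search response (None if nothing was averaged)."""
    return response["aggregations"][agg_name]["value"]


body = build_avg_size_query()
print(json.dumps(body))

# Hypothetical response with invented numbers, shaped like an avg aggregation result.
sample_response = {"aggregations": {"avg_size": {"value": 512.5}}}
print(extract_avg(sample_response))
```

The printed body can be sent to the index's _search endpoint with whatever HTTP client you already use.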
+9

I used this (I have _source enabled):

 { "query" : {"match_all" : {}}, "aggs":{ "avg_length" : { "avg" : { "script" : "_source.toString().length()"}} } } 

Well, that counts characters. To get the length in bytes instead, assuming the string is encoded as UTF-8:

 { "query" : {"match_all" : {}}, "aggs":{ "avg_length" : { "avg" : { "script" : "_source.toString().getBytes(\"UTF-8\").length"}} } } 
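
To illustrate why the two scripts can give different averages, here is a small standalone Python sketch, run outside Elasticsearch, that compares character counts with UTF-8 byte counts. The function name `avg_lengths` and the two sample documents are invented for the example.

```python
def avg_lengths(docs):
    """Return (average characters, average UTF-8 bytes) for a list of strings."""
    char_counts = [len(d) for d in docs]
    byte_counts = [len(d.encode("utf-8")) for d in docs]
    return sum(char_counts) / len(docs), sum(byte_counts) / len(docs)


# Two made-up documents; the second contains one non-ASCII character (e with acute).
docs = ['{"name": "cafe"}', '{"name": "caf\u00e9"}']
avg_chars, avg_bytes = avg_lengths(docs)
print(avg_chars, avg_bytes)
```

Both strings have the same number of characters, but the accented letter takes two bytes in UTF-8, so the byte average comes out higher than the character average.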
+2

A shot in the dark, but facets or aggregations combined with a script might do this.

 { ..., "aggs" : { "avg_length" : { "avg" : { "script" : "doc['_all'].length" } } } } 
+1