Get the latest version of each document and aggregate the results

My index contains many documents, each of which has several versions, for example:

{"doc_id": 13,
"version": 1,
"text": "bar"}

{"doc_id": 13,
"version": 2,
"text": "bar"}

{"doc_id": 13,
"version": 3,
"text": "bar"}

{"doc_id": 14,
"version": 1,
"text": "foo"}

{"doc_id": 14,
"version": 2,
"text": "bar"}

I want to get the latest version of each document and then aggregate over those latest versions only, using a terms aggregation.
I tried to use top_hits to extract the latest versions:

{
    "size" : 0,
    "aggs" : {
        "doc_id_groups" : {
            "terms" : {
                "field" : "doc_id",
                "size" : "0"
            },
            "aggs" : {
                "docs" : {
                    "top_hits" : {
                        "size" : 1,
                        "sort" : {
                            "version" : {
                                "order" : "desc"
                            }
                        }
                    }
                }
            }
        }
    }
}

But I can't run an aggregation on the result, because top_hits does not support sub-aggregations.
I assume that fetching the IDs and then aggregating over them would be too heavy for the client. Maybe scripting can help?
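
To make the desired result concrete, here is a minimal client-side sketch in plain Python, using the sample documents above. It only illustrates the intended semantics (keep the highest version per doc_id, then count the values of text); it is not an Elasticsearch query and it does not address the scaling concern. The helper names latest_versions and terms_counts are hypothetical and chosen for this illustration.

```python
from collections import Counter

# Sample documents copied from the question.
docs = [
    {"doc_id": 13, "version": 1, "text": "bar"},
    {"doc_id": 13, "version": 2, "text": "bar"},
    {"doc_id": 13, "version": 3, "text": "bar"},
    {"doc_id": 14, "version": 1, "text": "foo"},
    {"doc_id": 14, "version": 2, "text": "bar"},
]


def latest_versions(documents):
    """Return a dict mapping doc_id to the document with the highest version."""
    latest = {}
    for doc in documents:
        current = latest.get(doc["doc_id"])
        if current is None or doc["version"] > current["version"]:
            latest[doc["doc_id"]] = doc
    return latest


def terms_counts(documents, field):
    """Count how often each value of `field` occurs (like terms-aggregation buckets)."""
    return Counter(doc[field] for doc in documents)


latest = latest_versions(docs)
counts = terms_counts(latest.values(), "text")
print(dict(counts))  # {'bar': 2}
```

For the sample data, the older "foo" version of document 14 is ignored, so only "bar" is counted, once per document.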


chat , , . :

  • " " Boolean, true . ​​ - " current" false true .
  • " timepoints", . ( ) ( , . "09.30.2016" "" ) " " .

Pros

  • , , timepoints.

  • .

  • , . .

  • , .., .

  • , , .

  • " " , , .

  • . ( ) timepoints . , ( ), ( ) ( , ), . , .

  • , , . "" , .

, , .


. .

  • api, . , text.
  • elasticsearch doc_id version. . OR, doc_id . text.

. , . , .
