Efficiently retrieve all documents in an Elasticsearch index

I want to get all the results from a match-all query in an Elasticsearch cluster. I don't care whether the results are up to date, and I don't care about the order; I just want to steadily work through all the results, and then start again from the very beginning. Are scroll and scan best suited for this? It seems like a bit of a hit to take a snapshot that I don't need. I will be looking at processing tens of millions of documents.

1 answer

In some ways this is a duplicate of "Elasticsearch query to return all records". But we can add a bit more detail to address the concern about overhead (namely, "it seems like a bit of a hit to take a snapshot that I don't need").

A scroll-scan search is definitely what you want in this case. The "snapshot" is not much overhead here. The documentation describes it metaphorically as "like a snapshot in time" (emphasis added). The actual implementation details are a bit more subtle, and quite clever.
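As a rough illustration of what that looks like from the client side, here is a minimal Python sketch. It assumes a client object exposing search(), scroll() and clear_scroll() with the keyword arguments used by the 7.x-style elasticsearch-py client; scroll_all, FakeClient and the index name "my-index" are hypothetical names invented for this example. It sorts by _doc, which is the approach recommended in later Elasticsearch versions after the separate scan search type was removed.

```python
from typing import Any, Dict, Iterable, Iterator


def scroll_all(client: Any, index: str, batch_size: int = 1000,
               keep_alive: str = "1m") -> Iterator[Dict[str, Any]]:
    """Yield every hit in `index` by paging through a scroll.

    `client` is assumed to behave like the elasticsearch-py client.
    """
    # Sorting by _doc avoids scoring and ordering work, which suits the
    # case where the order of the results does not matter.
    response = client.search(
        index=index,
        body={"query": {"match_all": {}}, "sort": ["_doc"]},
        scroll=keep_alive,
        size=batch_size,
    )
    scroll_id = response["_scroll_id"]
    try:
        while response["hits"]["hits"]:
            for hit in response["hits"]["hits"]:
                yield hit
            # Each scroll call renews the keep-alive and fetches the next batch.
            response = client.scroll(scroll_id=scroll_id, scroll=keep_alive)
            scroll_id = response["_scroll_id"]
    finally:
        # Release the search context as soon as it is no longer needed.
        client.clear_scroll(scroll_id=scroll_id)


class FakeClient:
    """In-memory stand-in (hypothetical) so the sketch runs without a cluster."""

    def __init__(self, docs: Iterable[Dict[str, Any]]) -> None:
        self._docs = list(docs)
        self._pos = 0
        self._size = 0
        self.cleared = False

    def _page(self) -> Dict[str, Any]:
        hits = self._docs[self._pos:self._pos + self._size]
        self._pos += self._size
        return {"_scroll_id": "fake-scroll-id", "hits": {"hits": hits}}

    def search(self, index, body, scroll, size):
        self._pos, self._size = 0, size
        return self._page()

    def scroll(self, scroll_id, scroll):
        return self._page()

    def clear_scroll(self, scroll_id):
        self.cleared = True


demo_client = FakeClient({"_id": str(i)} for i in range(5))
print([hit["_id"] for hit in scroll_all(demo_client, "my-index", batch_size=2)])
# prints ['0', '1', '2', '3', '4']
```

To keep cycling through the index, as the question describes, the call to scroll_all can simply be wrapped in an outer loop: each pass opens a fresh scroll and therefore sees the index as it is at that moment.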

The documentation later gives a slightly more detailed explanation of that snapshot:

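"Normally, the background merge process optimizes the index by merging together smaller segments to create new bigger segments, at which time the smaller segments are deleted. This process continues during scrolling, but an open search context prevents the old segments from being deleted while they are still in use. This is how Elasticsearch is able to return the results of the initial search request, regardless of subsequent changes to documents."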

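So the reason the scroll's search context is cheap to preserve comes down to how Lucene index segments behave. A Lucene index is partitioned into multiple segments, each of which is like a stand-alone mini-index. As documents are added (and updated), Lucene simply appends a new segment to the index. Segments are write-once: after they are created, they are never updated again.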

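Over time, as segments accumulate, Lucene periodically merges them in the background. Several small segments are combined into one larger segment, and the small ones are then deleted. If a search is still reading the old segments, Lucene keeps them around until that search is finished, and only then removes them.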

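In this respect, Lucene works quite differently from a structure that is updated in place, such as a B-tree. Because segments are only ever written once, IO is mostly sequential. That also makes a stable view of the index cheap to provide.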

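So, when you start a scroll, Elasticsearch does not copy any data: it notes which segments make up the index at that moment, and keeps them in use for as long as the search context is alive. The cost is indirect: segments that have been merged away cannot be deleted while a context still needs them. As a result, keeping many contexts open for a long time uses extra disk space and file handles, so use a keep-alive that is only as long as you need to process one batch.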
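To make the idea concrete, here is a toy Python model of write-once segments and an open search context that pins them. It is only a sketch for intuition: ToyIndex and all of its methods are hypothetical and do not reflect Lucene's or Elasticsearch's actual data structures or APIs.

```python
from typing import Dict, List, Set, Tuple

Segment = Tuple[str, ...]  # an immutable batch of document ids


class ToyIndex:
    """Toy model of write-once segments; not how Lucene is implemented."""

    def __init__(self) -> None:
        self.live: List[Segment] = []       # segments seen by new searches
        self.on_disk: Set[Segment] = set()  # segments whose files still exist
        self.contexts: Dict[int, List[Segment]] = {}  # open search contexts
        self._next_id = 0

    def add(self, *doc_ids: str) -> None:
        # New documents always go into a brand-new segment.
        segment = tuple(doc_ids)
        self.live.append(segment)
        self.on_disk.add(segment)

    def open_context(self) -> int:
        # The "snapshot" is just a list of the segments that are live now.
        self._next_id += 1
        self.contexts[self._next_id] = list(self.live)
        return self._next_id

    def docs(self, context_id: int) -> List[str]:
        # A context only ever reads the segments it pinned when it was opened.
        return [doc for seg in self.contexts[context_id] for doc in seg]

    def close_context(self, context_id: int) -> None:
        del self.contexts[context_id]
        self._delete_unreferenced()

    def merge(self) -> None:
        # Combine all live segments into one bigger segment.
        if len(self.live) < 2:
            return
        merged = tuple(doc for seg in self.live for doc in seg)
        self.live = [merged]
        self.on_disk.add(merged)
        self._delete_unreferenced()

    def _delete_unreferenced(self) -> None:
        # Old segments survive only while some open context still pins them.
        pinned = {seg for ctx in self.contexts.values() for seg in ctx}
        self.on_disk = {s for s in self.on_disk if s in self.live or s in pinned}


index = ToyIndex()
index.add("a", "b")
index.add("c")
ctx = index.open_context()   # the scroll starts here
index.add("d")               # later write, invisible to the scroll
index.merge()                # background merge happens mid-scroll
print(index.docs(ctx))       # prints ['a', 'b', 'c']
print(len(index.on_disk))    # prints 3 (merged segment plus two pinned ones)
index.close_context(ctx)
print(len(index.on_disk))    # prints 1 (only the merged segment remains)
```

Opening the context copies no documents, only a short list of references. The price is paid later, as old segments that cannot be deleted until the context is closed.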
