MongoDB: what is the most efficient way to request a single random document?

I need to select a document from a collection at random (or, alternatively, a small number of consecutive documents from a randomly located “window”). I found two solutions: 1 and 2. The first is unacceptable because I expect the collection to be large and want to keep each document as small as possible. The second seems inefficient (I'm not sure about the complexity of the skip operation). And here there is a mention of requesting a document at a specified index, but I do not know how to do that (I use the C++ driver).

Are there other solutions to the problem? Which one is the most efficient?

+7
2 answers

I had a similar problem. In my case, the documents had a date property and I knew the earliest date in the dataset. In my application code I generated a random date between EARLIEST_DATE_IN_SET and NOW, then queried MongoDB with a $gte condition on the date property and limited the query to 1 result.

There was a small chance that the random date would be later than the latest date in the dataset, so I handled that case in the application code.

With an index on the date property, this was a fast query.
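The random-date idea can be sketched in C++ without the driver. This is only an illustration under stated assumptions: a sorted vector of timestamps stands in for the indexed date field, `std::lower_bound` stands in for a `$gte` query sorted ascending and limited to 1 result, and the names `first_at_or_after` and `pick_random_by_date` are hypothetical. The fallback to the earliest document when the pivot is later than every stored date is also an assumption, since the answer does not say how it handled that case.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Document dates as Unix timestamps, kept sorted ascending the way an index
// on the date field would keep them (stand-in for the real collection).
using Timestamp = std::int64_t;

// Emulates a {date: {$gte: pivot}} query sorted by date ascending, limit 1.
// Returns the position of the first date >= pivot, or dates.size() if none.
std::size_t first_at_or_after(const std::vector<Timestamp>& dates,
                              Timestamp pivot) {
    return static_cast<std::size_t>(
        std::lower_bound(dates.begin(), dates.end(), pivot) - dates.begin());
}

// Draws a pivot uniformly from [earliest, now] and returns the matching
// position. Assumption: if the pivot is later than every stored date, fall
// back to the earliest document.
template <class Rng>
std::size_t pick_random_by_date(const std::vector<Timestamp>& dates,
                                Timestamp earliest, Timestamp now, Rng& rng) {
    assert(!dates.empty() && earliest <= now);
    std::uniform_int_distribution<Timestamp> dist(earliest, now);
    const std::size_t pos = first_at_or_after(dates, dist(rng));
    return pos == dates.size() ? 0 : pos;
}
```

Note that this picks a uniformly random date, not a uniformly random document: a document that follows a long gap in the dates is returned more often than one in a dense cluster.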

+2

It seems you could adapt solution 1 (assuming your _id key is an auto-incrementing value): count the documents, use the count as the upper bound for a random integer generated in C++, and then fetch the document with that _id.

Similarly, if you do not have an auto-incrementing _id key, just add such a field to your documents. An extra integer field should not increase the document size by much.

If you don't have an auto-increment field, the MongoDB documentation describes how to add one quickly here:

Auto Inc Field.
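A minimal C++ sketch of the counter-based lookup follows. It does not use the MongoDB driver: an `std::unordered_map` stands in for a collection with a unique index on a hypothetical integer field `seq` holding 1..N, and `random_seq` and `find_by_seq` are illustrative names, not driver calls. In real code the count and the point lookup would each be one query against the server.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <string>
#include <unordered_map>

// Stand-in for a collection indexed on a hypothetical auto-incremented
// integer field `seq`; the mapped string plays the role of the document.
using Collection = std::unordered_map<std::int64_t, std::string>;

// Draws a sequence number uniformly from [1, count], where count plays the
// role of the collection's document count.
template <class Rng>
std::int64_t random_seq(std::int64_t count, Rng& rng) {
    assert(count > 0);
    std::uniform_int_distribution<std::int64_t> dist(1, count);
    return dist(rng);
}

// Emulates an indexed point lookup on `seq`. Returns nullptr when no
// document has that value, e.g. because it was deleted.
const std::string* find_by_seq(const Collection& coll, std::int64_t seq) {
    const auto it = coll.find(seq);
    return it == coll.end() ? nullptr : &it->second;
}
```

If documents are ever deleted, the sequence has gaps and the count no longer matches the highest value, so a caller would need to retry on a miss or draw from the maximum `seq` instead of the count.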

+2
