I have never used MongoDB from Python, but there is a general solution to your problem. Here is a MongoDB shell script that retrieves a single random document:
N = db.collection.count(condition)
db.collection.find(condition).limit(1).skip(Math.floor(Math.random()*N))
Here condition is a MongoDB query. If you want to query the entire collection, use condition = null.
This is a general solution, so it works with any MongoDB driver.
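The count-then-skip idea can be wrapped in a small helper. This is a minimal sketch, not a definitive implementation: the name pickRandomBySkip is hypothetical, and it assumes a shell-style synchronous collection handle exposing count(), find(), and a cursor with limit(), skip(), and toArray(). With an asynchronous driver each call would need to be awaited. The rand parameter is an assumption added so the random source can be swapped out.

```javascript
// Sketch: return one random document matching `condition`, or null if none match.
// `coll` is assumed to behave like a shell collection (synchronous calls).
function pickRandomBySkip(coll, condition, rand = Math.random) {
  const query = condition || {};          // treat null as "match everything"
  const n = coll.count(query);            // number of matching documents
  if (n === 0) return null;               // nothing to pick from
  const offset = Math.floor(rand() * n);  // integer in [0, n - 1]
  const docs = coll.find(query).limit(1).skip(offset).toArray();
  // The result can be empty if documents were removed between the two calls.
  return docs.length > 0 ? docs[0] : null;
}
```

In the shell this would be called as pickRandomBySkip(db.collection, null). The empty-collection check avoids issuing a pointless query when nothing matches.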
Update
I ran a benchmark to compare several implementations. First, I created a test collection of 5567249 documents, each with an indexed random field rnd.
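A setup like this could be sketched as follows. It is only an illustration under stated assumptions: the helper name seedWithRandomField is hypothetical, the collection handle is assumed to be shell-style (synchronous insertMany() and createIndex()), and the documents inserted here carry nothing but the rnd field.

```javascript
// Sketch: insert `n` documents with a random `rnd` field in [0, 1)
// and create an ascending index on that field.
function seedWithRandomField(coll, n, rand = Math.random) {
  const docs = [];
  for (let i = 0; i < n; i++) {
    docs.push({ rnd: rand() });
  }
  if (docs.length > 0) coll.insertMany(docs); // one bulk insert; skipped when n is 0
  coll.createIndex({ rnd: 1 });               // index used by the rnd-based queries
  return docs.length;
}
```

For a collection of several million documents, inserting in smaller batches would likely be more practical than a single call.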
I chose three methods to compare with each other:
First method:
db.collection.find().limit(1).skip(Math.floor(Math.random()*N))
Second method:
db.collection.find({rnd: {$gte: Math.random()}}).sort({rnd:1}).limit(1)
Third method:
db.collection.findOne({rnd: {$gte: Math.random()}})
I ran each method 10 times and recorded its average execution time:
method 1: 882.1 msec
method 2: 1.2 msec
method 3: 0.6 msec
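Averages like these could be collected with a small timing helper. The sketch below is an assumption about how such a measurement might be done, not the harness actually used for the numbers above; the name averageMillis is hypothetical.

```javascript
// Sketch: call `fn` `runs` times and return the mean wall-clock time in milliseconds.
function averageMillis(fn, runs = 10) {
  let total = 0;
  for (let i = 0; i < runs; i++) {
    const start = Date.now();
    fn();                         // the query under test
    total += Date.now() - start;  // elapsed time for this run
  }
  return total / runs;
}
```

Date.now() has millisecond resolution, so very fast queries are only resolved through averaging. When timing a find() call, the function passed in should also consume the cursor (for example with toArray()), otherwise the query may not actually be executed.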
This test shows that my solution (the first method) is by far the slowest of the three.
But the third method is not very good either, because it returns the first document in the collection (in natural order) that satisfies rnd >= random(). Thus, its result is not truly random.
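The difference between the second and third methods can be illustrated with a small in-memory model. This is a sketch of the behaviour described above, not of MongoDB's actual query execution; the function names and the three sample documents are hypothetical.

```javascript
// In-memory model of the two rnd-based queries (no database involved).
// `docs` is listed in natural (insertion) order.

// Third method as described: first document in natural order with rnd >= r.
function firstInNaturalOrder(docs, r) {
  const hit = docs.find((d) => d.rnd >= r);
  return hit === undefined ? null : hit;
}

// Second method: among documents with rnd >= r, the one with the smallest rnd.
function smallestRndAtLeast(docs, r) {
  let best = null;
  for (const d of docs) {
    if (d.rnd >= r && (best === null || d.rnd < best.rnd)) best = d;
  }
  return best;
}

const docs = [
  { name: "a", rnd: 0.9 },
  { name: "b", rnd: 0.1 },
  { name: "c", rnd: 0.5 },
];
```

With these values, the natural-order model returns "a" for every r up to 0.9, so "b" and "c" are never selected, while the sorted model returns "b", "c", or "a" depending on where r falls. Both return null when r exceeds the largest rnd value (0.9 here), a case the caller would need to handle, for example by retrying.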
I think the second method is the best for frequent use. But it has one drawback: it requires modifying every document in the collection to add the rnd field, and maintaining an additional index.
Leonid Beschastny