MongoDB - pagination on non-unique fields

I am familiar with best practices for pagination on large MongoDB collections, however I am struggling to figure out how to paginate a collection in which the sort value is on a non-unique field.

For example, I have a large collection of users, and there is a field for the number of times they did something. This field is not unique, and there can be large groups of documents with the same value.

I would like to return results sorted by the 'numTimesDoneSomething' field.

Here is an example dataset:

{_id: ObjectId("50c480d81ff137e805000003"), numTimesDoneSomething: 12}
{_id: ObjectId("50c480d81ff137e805000005"), numTimesDoneSomething: 9}
{_id: ObjectId("50c480d81ff137e805000006"), numTimesDoneSomething: 7}
{_id: ObjectId("50c480d81ff137e805000007"), numTimesDoneSomething: 1}
{_id: ObjectId("50c480d81ff137e805000002"), numTimesDoneSomething: 15}
{_id: ObjectId("50c480d81ff137e805000008"), numTimesDoneSomething: 1}
{_id: ObjectId("50c480d81ff137e805000009"), numTimesDoneSomething: 1}
{_id: ObjectId("50c480d81ff137e805000004"), numTimesDoneSomething: 12}
{_id: ObjectId("50c480d81ff137e805000010"), numTimesDoneSomething: 1}
{_id: ObjectId("50c480d81ff137e805000011"), numTimesDoneSomething: 1}

How can I return this dataset sorted by 'numTimesDoneSomething' with 2 records per page?

3 answers

@cubbuk shows a good example using an offset (skip), but you can also adapt the query shown there for range-based pagination:

 db.collection.find().sort({numTimesDoneSomething:-1, _id:1}) 

Since _id is unique here and you are sorting on it as a secondary key, you can then range by _id, and the results, even between the two documents that both have numTimesDoneSomething of 12, should be consistent as to whether they fall on one page or the next.

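So, a range query on both sort keys, something as simple as

 var q = db.collection.find({$or: [{numTimesDoneSomething: {$lt: last_num}}, {numTimesDoneSomething: last_num, _id: {$gt: last_id}}]}).sort({numTimesDoneSomething:-1, _id:1}).limit(2) 

should work pretty well for pagination. Here last_num and last_id are the values from the last document of the previous page. Filtering on _id alone would drop documents that have a smaller _id but sort later.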
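To see why the tie-breaker matters, here is a minimal sketch in plain JavaScript. It runs against an in-memory copy of the sample data rather than a real MongoDB instance. The helper names compareUsers and nextPage are hypothetical, and the _id values are shortened to two-character strings for readability. It assumes the cursor carries both numTimesDoneSomething and _id of the last document seen:

```javascript
// In-memory stand-in for the collection; short string ids replace ObjectIds.
const users = [
  { _id: "03", numTimesDoneSomething: 12 },
  { _id: "05", numTimesDoneSomething: 9 },
  { _id: "06", numTimesDoneSomething: 7 },
  { _id: "07", numTimesDoneSomething: 1 },
  { _id: "02", numTimesDoneSomething: 15 },
  { _id: "08", numTimesDoneSomething: 1 },
  { _id: "09", numTimesDoneSomething: 1 },
  { _id: "04", numTimesDoneSomething: 12 },
  { _id: "10", numTimesDoneSomething: 1 },
  { _id: "11", numTimesDoneSomething: 1 },
];

// Same ordering as sort({numTimesDoneSomething: -1, _id: 1}).
function compareUsers(a, b) {
  if (a.numTimesDoneSomething !== b.numTimesDoneSomething) {
    return b.numTimesDoneSomething - a.numTimesDoneSomething;
  }
  return a._id < b._id ? -1 : a._id > b._id ? 1 : 0;
}

// Return up to pageSize documents that sort strictly after `last`
// (pass null to get the first page).
function nextPage(docs, last, pageSize) {
  const sorted = [...docs].sort(compareUsers);
  const remaining =
    last === null ? sorted : sorted.filter((d) => compareUsers(last, d) < 0);
  return remaining.slice(0, pageSize);
}

// Walk every page, carrying the last document forward as the cursor.
const pages = [];
let last = null;
for (;;) {
  const page = nextPage(users, last, 2);
  if (page.length === 0) break;
  pages.push(page.map((d) => d._id));
  last = page[page.length - 1];
}
console.log(pages);
```

Each document appears on exactly one page, including the two documents with a value of 12 and the five with a value of 1, because the comparison falls back to _id whenever the counts are equal.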


You can sort by several fields; in this case, sort by numTimesDoneSomething and _id. Since the _id field already increases with the insertion timestamp, you can page through the results without returning duplicates, unless new data is inserted during the iteration.

 db.collection.find().sort({numTimesDoneSomething:-1, _id:1}).skip(index).limit(2) 
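
The caveat about inserts during iteration can be illustrated with a small in-memory sketch in plain JavaScript. This is not a real MongoDB query: pageBySkip is a hypothetical helper that imitates sorting followed by an offset and a limit, and the short string ids are made up for the example.

```javascript
// Hypothetical helper imitating sort + offset + limit on an array.
function pageBySkip(docs, index, pageSize) {
  const sorted = [...docs].sort((a, b) => {
    if (a.numTimesDoneSomething !== b.numTimesDoneSomething) {
      return b.numTimesDoneSomething - a.numTimesDoneSomething; // descending
    }
    return a._id < b._id ? -1 : a._id > b._id ? 1 : 0; // ascending tie-breaker
  });
  return sorted.slice(index, index + pageSize);
}

const docs = [
  { _id: "02", numTimesDoneSomething: 15 },
  { _id: "03", numTimesDoneSomething: 12 },
  { _id: "04", numTimesDoneSomething: 12 },
  { _id: "05", numTimesDoneSomething: 9 },
];

// Read the first page.
const firstPage = pageBySkip(docs, 0, 2).map((d) => d._id);

// A new document that sorts ahead of everything arrives between page reads.
docs.push({ _id: "12", numTimesDoneSomething: 20 });

// Read the second page with the offset computed before the insert.
const secondPage = pageBySkip(docs, 2, 2).map((d) => d._id);

console.log(firstPage, secondPage);
```

Here document "03" shows up on both pages, because the insert shifted every later document by one position while the offset stayed the same.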

@Sammaye We have a similar use case of sorting on non-unique fields, and we don't want to go with skip. Are $min and $max the best option so far? I see that your last comment is about 3 years old. Is there a better option than this now?
