Indexes in MongoDB are stored in a B-tree structure, where each index entry points to a specific disk space. Using a B-tree structure also means that the MongoDB index is stored in sorted order, always scanned in order, and is cheap for MongoDB to get a series of documents in sorted order through indexes.
Update : The B-tree structure is correct for the MMAPv1 storage engine, but implemented in a slightly different way to the WiredTiger storage engine (by default starting with MongoDB 3.2). The basic idea remains the same when index traversal is cheap in sorted order.
The SORT stage (i.e. sorting in memory) in the request is limited to 32 MB of memory usage. The request will not be executed if the SORT level exceeds this limit. This limit can be circumvented by using the sorted nature of indexes so that MongoDB can return a query with the sort() parameter without sorting in memory.
Suppose the request has the form:
db.a.find({b:{$gt:100}, c:{$gt:200}}).sort(...)
with a collection having index:
db.a.createIndex({b:1,c:1})
There are two possible scenarios when the sort() step is specified in the request:
1. MongoDB cannot use the sorted nature of the index and must complete the SORT step in memory .
This is the result if the query cannot use the "index prefix". For example:
db.a.find({b:{$gt:100}, c:{$gt:200}}).sort({c:1})
In the above query, the index {b:1,c:1} can be used to:
- Compliance documents having
b greater than 100 for the {b:{$gt:100}} request. - However, there is no guarantee that the returned documents are sorted in accordance with
c .
Therefore, MongoDB has no choice but to sort in memory. The explain() output of this request will have a SORT stage. This SORT stage will be limited to 32 MB of memory.
2. MongoDB can use the sorted nature of the index .
This is the result if the query uses:
- Sort keys according to index order and
- Indicates the same order as the index (that is, the index
{b:1,c:1} can be used for sort({b:1,c:1}) or sort({b:-1,c:-1}) , but not for sort({b:1,c:-1}) )
For example:
db.a.find({b:{$gt:100}, c:{$gt:200}}).sort({b:1})
In the above query, the index {b:1,c:1} can be used to:
- Compliance documents that have
b greater than 100 for the {b:{$gt:100}} request. - In this case, MongoDB can guarantee that the returned documents are sorted according to
b .
The explain() output in the request above will not have a SORT stage. In addition, the output of the explain() request with and without sort() identical. Essentially, we get sort() for free.
A useful resource for understanding this topic is to optimize complex MongoDB indexes . Please note that this post was written back in 2012. Although some terms may be outdated, the technical relevance of the post is still relevant.
Updated follow-up information
MongoDB uses only one index for most queries . So, for example, to avoid a step in the SORT memory in the request
db.a.find({a:1}).sort({b:1})
the index must simultaneously cover fields a and b ; for example, a composite index such as {a:1,b:1} is required. You cannot have two separate indexes {a:1} and {b:1} and expect the index {a:1} be used for the equality part, and the index {b:1} be used for the sort part. In this case, MongoDB will choose one of two indexes.
Thus, it is true that the results are sorted as they are searched and returned in index order.
To avoid sorting in memory using a composite index, the first part of the index should serve the part of the equality request, and the second part should serve the part of the sort request (as shown in the explanation to (1) above).
If you have a request like this:
db.a.find({}).sort({a:1})
the index {a:1,b:1} can be used for the sort part (since you basically return the entire collection). And if your request looks like this:
db.a.find({a:1}).sort({b:1})
the same index {a:1,b:1} can also be used for both parts of the query. Also:
db.a.find({a:1,b:1})
can also use the same index {a:1,b:1}
Pay attention to the diagram here: the find() parameters, followed by sort() , follow the index order {a:1,b:1} . Therefore, the composite index must be ordered by equality β sorting .
Update on sorting of various types
If the field has different types between documents (for example, if a is a string in one document, a number in others, a logical value in another), how will sorting be done?
The answer is MongoDB BSON type comparison order . To paraphrase a man page, order:
- MinKey (inner type)
- Null
- Numbers (integer, long, double, decimal)
- Symbol, String
- Object Array
- Binata
- Objectid
- Logical
- date
- Mark
- Regular expression
- MaxKey (internal type)
Thus, in the above example, using ascending order, documents containing numbers, then strings, and then a boolean value will first appear.