MongoDB: simple regex query with sorting is slow

I am stuck on this simple prefix query. Although the Mongo docs claim that you can get pretty good performance using the regex prefix format (/^a/), the query is pretty slow when I try to sort the results:

940 millis

db.posts.find({hashtags: /^noticias/}).limit(15).sort({rank: -1}).hint('hashtags_1_rank_-1').explain()

{ "cursor" : "BtreeCursor hashtags_1_rank_-1 multi", "isMultiKey" : true, "n" : 15, "nscannedObjects" : 142691, "nscanned" : 142692, "nscannedObjectsAllPlans" : 142691, "nscannedAllPlans" : 142692, "scanAndOrder" : true, "indexOnly" : false, "nYields" : 1, "nChunkSkips" : 0, "millis" : 934, "indexBounds" : { "hashtags" : [ [ "noticias", "noticiat" ], [ /^noticias/, /^noticias/ ] ], "rank" : [ [ { "$maxElement" : 1 }, { "$minElement" : 1 } ] ] }, "server" : "XRTZ048.local:27017" } 

However, an unsorted version of the same query is super fast:

0 millis

db.posts.find({hashtags: /^noticias/}).limit(15).hint('hashtags_1_rank_-1').explain()

{ "cursor" : "BtreeCursor hashtags_1_rank_-1 multi", "isMultiKey" : true, "n" : 15, "nscannedObjects" : 15, "nscanned" : 15, "nscannedObjectsAllPlans" : 15, "nscannedAllPlans" : 15, "scanAndOrder" : false, "indexOnly" : false, "nYields" : 0, "nChunkSkips" : 0, "millis" : 0, "indexBounds" : { "hashtags" : [ [ "noticias", "noticiat" ], [ /^noticias/, /^noticias/ ] ], "rank" : [ [ { "$maxElement" : 1 }, { "$minElement" : 1 } ] ] }, "server" : "XRTZ048.local:27017" }

The query also executes quickly if I drop the regex and keep the sort:

0 millis

db.posts.find({hashtags: 'noticias'}).limit(15).sort({rank: -1}).hint('hashtags_1_rank_-1').explain()

{ "cursor" : "BtreeCursor hashtags_1_rank_-1", "isMultiKey" : true, "n" : 15, "nscannedObjects" : 15, "nscanned" : 15, "nscannedObjectsAllPlans" : 15, "nscannedAllPlans" : 15, "scanAndOrder" : false, "indexOnly" : false, "nYields" : 0, "nChunkSkips" : 0, "millis" : 0, "indexBounds" : { "hashtags" : [ [ "noticias", "noticias" ] ], "rank" : [ [ { "$maxElement" : 1 }, { "$minElement" : 1 } ] ] }, "server" : "XRTZ048.local:27017" }

When I use both the regex and the sort, Mongo seems to scan a lot of records. Yet the sorted query scans only 15 as long as I don't use a regex. What is wrong here?

1 answer

scanAndOrder: true in the explain output indicates that the query has to retrieve the documents and then sort them in memory before returning the result. This is an expensive operation and will hurt the performance of your query.
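To see why the regex version needs the in-memory sort while the equality version does not, here is a minimal sketch in plain JavaScript. It models the index { hashtags: 1, rank: -1 } as a sorted array; the entries and the helper isRankDescending are made up for illustration, and multikey details are ignored.

```javascript
// Simplified model of the compound index { hashtags: 1, rank: -1 }.
// The entries are hypothetical sample data, not taken from the question.
const entries = [
  { hashtags: "noticias", rank: 9 },
  { hashtags: "noticias", rank: 4 },
  { hashtags: "noticiasdeportes", rank: 7 },
  { hashtags: "noticiasdeportes", rank: 2 },
  { hashtags: "noticiashoy", rank: 8 },
];

// Keep the array in index order: hashtags ascending, then rank descending.
entries.sort((a, b) =>
  a.hashtags < b.hashtags ? -1 : a.hashtags > b.hashtags ? 1 : b.rank - a.rank
);

// True when the ranks in the list are already in descending order.
function isRankDescending(list) {
  return list.every((e, i) => i === 0 || list[i - 1].rank >= e.rank);
}

// Equality match: a single hashtag value, so rank order comes for free.
const exact = entries.filter((e) => e.hashtags === "noticias");

// Prefix match: several hashtag values, each with its own run of ranks.
const prefix = entries.filter((e) => e.hashtags.startsWith("noticias"));

console.log(isRankDescending(exact));
console.log(isRankDescending(prefix));
```

With a single hashtags value, the matching index entries are already ordered by rank. With a prefix range, rank is only ordered within each distinct hashtags value, so sorting by rank alone requires collecting the whole range first.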

The presence of scanAndOrder: true, as well as the difference between nscanned and n in the explain output (142692 scanned for 15 returned), indicates that the query is not using an optimal index. In this case it appears to be scanning a large part of the collection. You may be able to resolve this by including the index keys in your sort criteria. From my testing:

 db.posts.find({hashtags: /^noticias/ }).limit(15).sort({hashtags:1, rank : -1}).explain() 

This does not require scanAndOrder, and it returns n and nscanned equal to the number of records you are looking for. It also means the results are sorted on the hashtags key first, which may or may not be useful to you, but it should improve query performance.
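As a rough illustration of the difference in scanned entries, the following plain JavaScript sketch compares the two strategies on a toy index. The data and the function names scanInIndexOrder and scanThenSortByRank are hypothetical; this is not how the server is implemented, only a model of the behaviour described above.

```javascript
// Hypothetical index entries, already in { hashtags: 1, rank: -1 } order.
const index = [
  { hashtags: "musica", rank: 5 },
  { hashtags: "noticias", rank: 9 },
  { hashtags: "noticias", rank: 4 },
  { hashtags: "noticiasdeportes", rank: 7 },
  { hashtags: "noticiashoy", rank: 8 },
  { hashtags: "zoo", rank: 1 },
];

// Sort matches the index order: walk the prefix range and stop at the limit.
function scanInIndexOrder(prefix, limit) {
  const results = [];
  let scanned = 0;
  for (const entry of index) {
    if (entry.hashtags < prefix) continue;          // before the range (a real B-tree would seek here)
    if (!entry.hashtags.startsWith(prefix)) break;  // past the end of the range
    scanned++;
    results.push(entry);
    if (results.length === limit) break;            // limit reached, no need to read further
  }
  return { results, scanned };
}

// Sort by rank only: every entry in the range must be read before sorting.
function scanThenSortByRank(prefix, limit) {
  const matches = index.filter((e) => e.hashtags.startsWith(prefix));
  const results = [...matches].sort((a, b) => b.rank - a.rank).slice(0, limit);
  return { results, scanned: matches.length };
}

console.log(scanInIndexOrder("noticias", 2).scanned);
console.log(scanThenSortByRank("noticias", 2).scanned);
```

Note the trade-off: the first function stops early but returns entries grouped by hashtag, so a high-ranked post under a later hashtag can be missed within the limit. The second returns the globally top-ranked entries but has to read the whole range.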
