Projection Makes Query Slower

I have over 600 thousand documents in MongoDB. My user schema is as follows:

 {
   "_id" : ObjectId,
   "password" : String,
   "email" : String,
   "location" : Object,
   "followers" : Array,
   "following" : Array,
   "dateCreated" : Number,
   "loginCount" : Number,
   "settings" : Object,
   "roles" : Array,
   "enabled" : Boolean,
   "name" : Object
 }

The following query:

 db.users.find( {}, { name:1, settings:1, email:1, location:1 } ).skip(656784).limit(10).explain() 

leads to the following:

 {
   "cursor" : "BasicCursor",
   "isMultiKey" : false,
   "n" : 10,
   "nscannedObjects" : 656794,
   "nscanned" : 656794,
   "nscannedObjectsAllPlans" : 656794,
   "nscannedAllPlans" : 656794,
   "scanAndOrder" : false,
   "indexOnly" : false,
   "nYields" : 5131,
   "nChunkSkips" : 0,
   "millis" : 1106,
   "server" : "shreyance:27017",
   "filterSet" : false
 }

After removing the projection, the query

 db.users.find().skip(656784).limit(10).explain()

leads to the following:

 {
   "cursor" : "BasicCursor",
   "isMultiKey" : false,
   "n" : 10,
   "nscannedObjects" : 656794,
   "nscanned" : 656794,
   "nscannedObjectsAllPlans" : 656794,
   "nscannedAllPlans" : 656794,
   "scanAndOrder" : false,
   "indexOnly" : false,
   "nYields" : 5131,
   "nChunkSkips" : 0,
   "millis" : 209,
   "server" : "shreyance:27017",
   "filterSet" : false
 }

As far as I know, projection always improves query performance, so I cannot understand why MongoDB behaves this way. Can someone explain this? When should projection be used, and when not? And how is projection actually implemented in MongoDB?

+7
mongodb mongodb-query
2 answers

You are right that projection makes this skip query slower in MongoDB 2.6.3. This is due to an optimization issue in the 2.6 query planner, which is tracked as SERVER-13946.

The 2.6 query planner (as of 2.6.3) adds the SKIP (and LIMIT) stages after projection analysis, so the projection is unnecessarily applied to results that are then thrown away by the skip for this query. I tested a similar query in MongoDB 2.4.10, and nscannedObjects was equal to the number of results returned by my limit, not skip + limit.
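To make the effect of stage ordering concrete, here is a minimal sketch in plain JavaScript. It is a toy model, not MongoDB's implementation: an in-memory array stands in for the collection, and the helper names (project, projectThenSkip, skipThenProject) are made up for illustration. It only counts how many times a projection would be applied under each ordering.

```javascript
// Toy model only: plain arrays stand in for a collection.
// None of these helpers are MongoDB APIs; the names are hypothetical.

// Keep _id plus the requested fields, and count each call.
function project(doc, fields, counter) {
  counter.calls += 1;
  const out = { _id: doc._id };
  for (const f of fields) {
    if (f in doc) out[f] = doc[f];
  }
  return out;
}

// Ordering described for 2.6.3: project every scanned document, then skip/limit.
function projectThenSkip(docs, fields, skip, limit) {
  const counter = { calls: 0 };
  const results = docs
    .map(d => project(d, fields, counter))
    .slice(skip, skip + limit);
  return { results, projections: counter.calls };
}

// Alternative ordering: skip/limit first, project only what is returned.
function skipThenProject(docs, fields, skip, limit) {
  const counter = { calls: 0 };
  const results = docs
    .slice(skip, skip + limit)
    .map(d => project(d, fields, counter));
  return { results, projections: counter.calls };
}

// 1000 fake user documents (made-up data).
const docs = Array.from({ length: 1000 }, (_, i) => ({
  _id: i,
  name: "user" + i,
  email: "u" + i + "@example.com",
  password: "x"
}));

const fields = ["name", "email"];
const slow = projectThenSkip(docs, fields, 990, 10);
const fast = skipThenProject(docs, fields, 990, 10);

// Same 10 results either way, but very different amounts of projection work.
console.log(slow.projections, fast.projections);
```

Both orderings return the same ten documents; the difference is only in how many discarded documents get projected first, which is the kind of wasted work described above.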

There are several factors that affect the performance of your request:

1) You did not specify any query criteria ( {} ), so this query performs a collection scan in natural order rather than using an index.

2) The query cannot be covered, because it does not use an index that contains all of the projected fields.

3) You have a very large skip value of 656,784.

There is definitely room for improvement in the query plan, but I would not expect skip values of this magnitude to be reasonable in normal usage. For example, if this were an application query for pagination with 50 results per page, your skip() value would be the equivalent of page number 13,135.
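The page number quoted above follows from simple arithmetic, sketched here in plain JavaScript. The page size of 50 is the assumption used in the example, not something taken from the question.

```javascript
// Arithmetic check for the pagination example (page size of 50 is assumed).
const skip = 656784;
const pageSize = 50;

// Number of complete pages that the skip passes over.
const fullPagesSkipped = Math.floor(skip / pageSize);

// Documents skipped beyond the last complete page.
const remainder = skip % pageSize;

console.log(fullPagesSkipped, remainder);
```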

+4

Unless your projection produces an "index only" query, meaning that all of the fields "projected" in the result are present in the index, you are always creating more work for the query engine.

You should consider this process:

  • How do I match? On the document or on an index? Find the appropriate primary or other index.

  • Given the index, scan it and find the matching entries.

  • Now, what do I need to return? Is all of the data in the index? If not, go back to the collection and pull out the documents.

This is the basic process. So unless one of these stages is "optimized" in some way, then of course things "take longer".

You need to look at this as if you were designing a "server engine" and understand the steps that need to be taken. Given that none of your conditions matches anything that would produce the "optimal" result for these steps, you need to learn to accept that.

Your "best" case is where the only projected fields are the fields present in the chosen index. But in fact, even this has the overhead of loading the index.
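The "best" case can be expressed as a small check, sketched below in plain JavaScript. This is a simplified illustration and not MongoDB's planner logic: it assumes an index has already been chosen, ignores nested fields and multikey indexes, and the function names (returnedFields, isCovered) are hypothetical. It does reflect one real detail: _id is returned by default unless the projection explicitly excludes it.

```javascript
// Simplified check, not MongoDB's planner. Function names are hypothetical.

// Fields a projection would return. _id is included unless explicitly excluded.
function returnedFields(projection) {
  const fields = Object.keys(projection)
    .filter(f => f !== "_id" && projection[f] === 1);
  if (projection._id !== 0 && projection._id !== false) {
    fields.push("_id");
  }
  return fields;
}

// A query can be answered from the index alone only if every field it
// filters on and every field it returns is stored in that index.
function isCovered(indexFields, queryFields, projection) {
  const inIndex = f => indexFields.includes(f);
  return queryFields.every(inIndex) &&
         returnedFields(projection).every(inIndex);
}

// Assumed single-field index on "email" (not present in the question).
const emailIndex = ["email"];

// Only indexed fields are returned: no need to fetch documents.
const coveredCase = isCovered(emailIndex, ["email"], { email: 1, _id: 0 });

// _id is returned by default and is not in this index: documents must be fetched.
const defaultIdCase = isCovered(emailIndex, ["email"], { email: 1 });

// The projection from the question asks for fields outside the index.
const questionCase = isCovered(
  emailIndex, [], { name: 1, settings: 1, email: 1, location: 1 });

console.log(coveredCase, defaultIdCase, questionCase);
```

In the last two cases the engine has to go back to the collection for the documents, which corresponds to the final step of the process listed earlier.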

So choose wisely, and understand the constraints and memory requirements of what you are writing your query for. That is what "optimization" is all about.

+1