MongoDB index does not help query with multikey index

Question

I have a collection of documents with a multikey index. However, query performance is rather poor for only 43K documents. Is ~215 ms for this query considered poor? Did I define the index correctly, given that nscanned is 43902 (which equals the total number of documents in the collection)?

Document

 {
   "_id": { "$oid": "50f7c95b31e4920008dc75dc" },
   "bank_accounts": [
     {
       "bank_id": { "$oid": "50f7c95a31e4920009b5fc5d" },
       "account_id": [
         "ff39089358c1e7bcb880d093e70eafdd",
         "adaec507c755d6e6cf2984a5a897f1e2"
       ]
     }
   ],
   "created_date": "2013,01,17,09,50,19,274089"
 }

Index

 { "bank_accounts.bank_id" : 1 , "bank_accounts.account_id" : 1}

Query:

 db.visitor.find({ "bank_accounts.account_id" : "ff39089358c1e7bcb880d093e70eafdd" , "bank_accounts.bank_id" : ObjectId("50f7c95a31e4920009b5fc5d")}).explain()

Explain output:

 {
   "cursor" : "BtreeCursor bank_accounts.bank_id_1_bank_accounts.account_id_1",
   "isMultiKey" : true,
   "n" : 1,
   "nscannedObjects" : 43902,
   "nscanned" : 43902,
   "nscannedObjectsAllPlans" : 43902,
   "nscannedAllPlans" : 43902,
   "scanAndOrder" : false,
   "indexOnly" : false,
   "nYields" : 0,
   "nChunkSkips" : 0,
   "millis" : 213,
   "indexBounds" : {
     "bank_accounts.bank_id" : [ [ ObjectId("50f7c95a31e4920009b5fc5d"), ObjectId("50f7c95a31e4920009b5fc5d") ] ],
     "bank_accounts.account_id" : [ [ { "$minElement" : 1 }, { "$maxElement" : 1 } ] ]
   },
   "server" : "Not_Important"
 }

performance database mongodb mongoengine mlab

Jason Feb 17 '13 at 0:29

1 answer

Eric · Accepted Answer · 2013-02-19T21:52:18+0000

I see three factors at play.

First, for the application's sake, check whether $elemMatch is a more suitable operator for this use case: http://docs.mongodb.org/manual/reference/operator/elemMatch/ . It would be bad if wrong results were returned because different subdocuments each satisfied only part of the query.
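The difference between the two matching behaviours can be sketched in plain JavaScript. This is a toy in-memory model, not MongoDB's implementation; the function names and the placeholder values ("bank-A", "acct-3", and so on) are invented for illustration.

```javascript
// Toy model of the two matching semantics on an in-memory document.
// Function names and values are hypothetical; this is not MongoDB code.

// True when `field` equals `value`, or is an array that contains `value`.
function fieldMatches(field, value) {
  return Array.isArray(field) ? field.includes(value) : field === value;
}

// Dotted-field semantics: each condition may be met by a different array element.
function matchesDotted(doc, conditions) {
  return Object.entries(conditions).every(([key, value]) =>
    doc.bank_accounts.some((sub) => fieldMatches(sub[key], value))
  );
}

// $elemMatch semantics: a single array element must meet every condition.
function matchesElemMatch(doc, conditions) {
  return doc.bank_accounts.some((sub) =>
    Object.entries(conditions).every(([key, value]) => fieldMatches(sub[key], value))
  );
}

// A visitor with accounts at two banks.
const visitor = {
  bank_accounts: [
    { bank_id: "bank-A", account_id: ["acct-1", "acct-2"] },
    { bank_id: "bank-B", account_id: ["acct-3"] },
  ],
};

// bank-A paired with acct-3, which actually belongs to bank-B.
const crossed = { bank_id: "bank-A", account_id: "acct-3" };

console.log(matchesDotted(visitor, crossed));    // true
console.log(matchesElemMatch(visitor, crossed)); // false
```

In the shell, the element-wise form of the original query would be written as db.visitor.find({ bank_accounts: { $elemMatch: { bank_id: ObjectId("50f7c95a31e4920009b5fc5d"), account_id: "ff39089358c1e7bcb880d093e70eafdd" } } }).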

Second, I believe the high nscanned value can be accounted for by querying each of the field values independently: .find({"bank_accounts.bank_id": X}) versus .find({"bank_accounts.account_id": Y}). You may see that nscanned for the full query is roughly equal to nscanned for the larger of the two subqueries. If the index key were being evaluated completely as a range, this would not be expected, but ...

Third, the "bank_accounts.account_id" : [ [ { "$minElement" : 1 }, { "$maxElement" : 1 } ] ] entry in the explain plan indicates that no key range is being applied to this part of the index.

I don't know why, but I suspect it has something to do with the nature of account_id (an array inside a subdocument inside an array). 200 ms seems about right for an nscanned that high.
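To make the effect of the unbounded second key concrete, here is a rough toy model in plain JavaScript. It treats a compound index as a sorted list of [bank_id, account_id] pairs and counts how many entries fall inside the bounds. It assumes, purely for illustration, that many documents share the queried bank_id; the data, names and counts are made up and this is not how MongoDB's B-tree is implemented.

```javascript
// Toy model: a compound index as a sorted list of [bank_id, account_id] keys.
// All data and names are hypothetical; a real B-tree works differently.

function compareKeys(x, y) {
  if (x[0] !== y[0]) return x[0] < y[0] ? -1 : 1;
  if (x[1] !== y[1]) return x[1] < y[1] ? -1 : 1;
  return 0;
}

// 1000 accounts under one bank, plus 10 under another.
const index = [];
for (let i = 0; i < 1000; i++) {
  index.push(["bank-A", "acct-" + String(i).padStart(4, "0")]);
}
for (let i = 0; i < 10; i++) {
  index.push(["bank-B", "other-" + i]);
}
index.sort(compareKeys);

// boundAccount = false models account_id bounds of [$minElement, $maxElement].
function scan(idx, bankId, accountId, boundAccount) {
  let nscanned = 0;
  let n = 0;
  for (const [bank, account] of idx) {
    const inBounds = bank === bankId && (!boundAccount || account === accountId);
    if (!inBounds) continue; // a real B-tree would seek past these entries
    nscanned++;              // entry lies inside the index bounds
    if (account === accountId) n++; // entry also satisfies the full query
  }
  return { nscanned, n };
}

const unbounded = scan(index, "bank-A", "acct-0042", false);
const bounded = scan(index, "bank-A", "acct-0042", true);

console.log(unbounded); // { nscanned: 1000, n: 1 }
console.log(bounded);   // { nscanned: 1, n: 1 }
```

In this model the unbounded case examines every entry for the bank to return a single match, which has the same shape as the explain output in the question (n of 1 against an nscanned of 43902).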

A more efficient document organization might be to denormalize the account_id → bank_id relationship within the subdocument, storing:

 {"bank_accounts": [ { "bank_id": X, "account_id": Y }, { "bank_id": X, "account_id": Z } ]}

instead of:

 {"bank_accounts": [ { "bank_id": X, "account_id": [Y, Z] } ]}

My tests below suggest that with this organization the query optimizer goes back to working as expected and produces a range on both keys:

 > db.accounts.insert({"something": true, "blah": [{ a: "1", b: "2"} ] })
 > db.accounts.ensureIndex({"blah.a": 1, "blah.b": 1})
 > db.accounts.find({"blah.a": 1, "blah.b": "A RANGE"}).explain()
 {
   "cursor" : "BtreeCursor blah.a_1_blah.b_1",
   "isMultiKey" : false,
   "n" : 0,
   "nscannedObjects" : 0,
   "nscanned" : 0,
   "nscannedObjectsAllPlans" : 0,
   "nscannedAllPlans" : 0,
   "scanAndOrder" : false,
   "indexOnly" : false,
   "nYields" : 0,
   "nChunkSkips" : 0,
   "millis" : 0,
   "indexBounds" : {
     "blah.a" : [ [ 1, 1 ] ],
     "blah.b" : [ [ "A RANGE", "A RANGE" ] ]
   }
 }
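The reorganization suggested above could be applied to existing documents with a small transformation like the following. This is a minimal sketch in plain JavaScript working on an in-memory object; the function name flattenBankAccounts is hypothetical, and writing the result back to the collection is left out.

```javascript
// Sketch: split each { bank_id, account_id: [...] } subdocument into one
// subdocument per account. flattenBankAccounts is a hypothetical helper.
function flattenBankAccounts(doc) {
  const flattened = doc.bank_accounts.flatMap((sub) => {
    // Tolerate subdocuments that already hold a single scalar account_id.
    const ids = Array.isArray(sub.account_id) ? sub.account_id : [sub.account_id];
    return ids.map((id) => ({ bank_id: sub.bank_id, account_id: id }));
  });
  // Return a copy so the input document is left unchanged.
  return { ...doc, bank_accounts: flattened };
}

const before = {
  bank_accounts: [{ bank_id: "X", account_id: ["Y", "Z"] }],
};
const after = flattenBankAccounts(before);

console.log(JSON.stringify(after));
// {"bank_accounts":[{"bank_id":"X","account_id":"Y"},{"bank_id":"X","account_id":"Z"}]}
```

With this shape, a query that must pair a bank_id with an account_id in the same subdocument would still want $elemMatch, since the array can now hold several subdocuments per bank.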
