I see three factors at play here.
First, for the sake of application correctness, make sure that $elemMatch isn't the more appropriate query for this use case: http://docs.mongodb.org/manual/reference/operator/elemMatch/ . It would be bad if the wrong results came back because several different subdocuments together satisfied the query.
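To make the difference concrete, here is a minimal sketch in plain JavaScript that models the two matching rules without touching a database. The sample document, its values, and the helper names (fieldMatches, dottedMatch, elemMatch) are all hypothetical and only illustrate the semantics; the real queries are shown in the comments.

```javascript
// Plain-JavaScript model of the two matching rules (no database needed).
// Hypothetical sample document; field names follow the question.
const doc = {
  bank_accounts: [
    { bank_id: 1, account_id: [10, 11] },
    { bank_id: 2, account_id: [20] },
  ],
};

// Does a field equal the wanted value, or contain it if the field is an array?
function fieldMatches(fieldValue, wanted) {
  return Array.isArray(fieldValue)
    ? fieldValue.includes(wanted)
    : fieldValue === wanted;
}

// Models find({"bank_accounts.bank_id": b, "bank_accounts.account_id": a}):
// each condition may be satisfied by a different array element.
function dottedMatch(d, bankId, accountId) {
  return (
    d.bank_accounts.some((s) => fieldMatches(s.bank_id, bankId)) &&
    d.bank_accounts.some((s) => fieldMatches(s.account_id, accountId))
  );
}

// Models find({bank_accounts: {$elemMatch: {bank_id: b, account_id: a}}}):
// one single array element must satisfy both conditions.
function elemMatch(d, bankId, accountId) {
  return d.bank_accounts.some(
    (s) => fieldMatches(s.bank_id, bankId) && fieldMatches(s.account_id, accountId)
  );
}

console.log(dottedMatch(doc, 1, 20)); // true: bank 1 and account 20 sit in different subdocuments
console.log(elemMatch(doc, 1, 20));   // false: no single subdocument has both
```

If the application only wants documents where the bank and the account belong together, the second form is the one that expresses it.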
Second, I believe the high nscanned value can be accounted for by running each half of the query on its own: .find({"bank_accounts.bank_id": X}) versus .find({"bank_accounts.account_id": Y}). You can see that nscanned for the full query is roughly equal to nscanned for the largest subquery. If the index key were being evaluated entirely as a range, this would not be expected, but ...
Third, the "bank_accounts.account_id" : [ [ { "$minElement" : 1 }, { "$maxElement" : 1 } ] ] entry in the explain plan indicates that no key range is being applied to that part of the index.
I don't know exactly why, but I suspect it has something to do with the nature of account_id (an array inside a subdocument inside an array). 200 ms seems about right for an nscanned value that high.
A more efficient document organization might be to denormalize the account_id to bank_id relationship inside the subdocument and store:
{"bank_accounts": [ { "bank_id": X, "account_id": Y }, { "bank_id": X, "account_id": Z } ]}
instead of:
{"bank_accounts": [ { "bank_id": X, "account_id": [Y, Z] } ]}
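As a rough sketch of that restructuring, existing documents could be converted with something like the function below. The name flattenBankAccounts and the sample values are hypothetical, and it assumes every subdocument has an account_id that is either a single value or an array.

```javascript
// Split every subdocument whose account_id is an array into one
// subdocument per (bank_id, account_id) pair; other fields are copied as-is.
// Note: a subdocument with an empty account_id array produces no output rows.
function flattenBankAccounts(doc) {
  const flattened = [];
  for (const sub of doc.bank_accounts) {
    const ids = Array.isArray(sub.account_id) ? sub.account_id : [sub.account_id];
    for (const id of ids) {
      flattened.push({ ...sub, account_id: id });
    }
  }
  // Return a new document; the input is left untouched.
  return { ...doc, bank_accounts: flattened };
}

const before = { bank_accounts: [{ bank_id: "X", account_id: ["Y", "Z"] }] };
console.log(JSON.stringify(flattenBankAccounts(before)));
// {"bank_accounts":[{"bank_id":"X","account_id":"Y"},{"bank_id":"X","account_id":"Z"}]}
```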
My test below shows that with this organization the query optimizer goes back to working as expected and produces a range for both keys:
    > db.accounts.insert({"something": true, "blah": [{ a: "1", b: "2"} ] })
    > db.accounts.ensureIndex({"blah.a": 1, "blah.b": 1})
    > db.accounts.find({"blah.a": 1, "blah.b": "A RANGE"}).explain()
    {
        "cursor" : "BtreeCursor blah.a_1_blah.b_1",
        "isMultiKey" : false,
        "n" : 0,
        "nscannedObjects" : 0,
        "nscanned" : 0,
        "nscannedObjectsAllPlans" : 0,
        "nscannedAllPlans" : 0,
        "scanAndOrder" : false,
        "indexOnly" : false,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "millis" : 0,
        "indexBounds" : {
            "blah.a" : [ [ 1, 1 ] ],
            "blah.b" : [ [ "A RANGE", "A RANGE" ] ]
        }
    }