MongoDB too many records?

I have a PHP application that uses MongoDB. Until recently the application worked fine, but after a few days I found that it starts to respond REALLY slowly. One of the collections has grown to 500K+ records, and as a result MongoCursor times out on any query against that collection.

I do not think 500K records is too many. Other pages that use MongoDB are also starting to slow down, but not as much as the one that uses the 500K-record collection. Static pages that do not touch MongoDB still respond quickly.

I'm not sure what the problem might be. I have indexed the collections, so indexing does not seem to be the issue. Another point worth noting is that the server has 512 MB of RAM, and while PHP is querying Mongo, the top command shows about 15000k of free memory.

Any help would be greatly appreciated.

2 answers

To summarize the follow-up from the chat room: the problem is with the find() query, which scans all ~500k documents to find 15:

    db.tweet_data.find({
        $or: [
            { in_reply_to_screen_name: /^kunalnayyar$/i, handle: /^kaleycuoco$/i, id: { $gt: 0 } },
            { in_reply_to_screen_name: /^kaleycuoco$/i, handle: /^kunalnayyar$/i, id: { $gt: 0 } }
        ],
        in_reply_to_status_id_str: { $ne: null }
    }).explain()
    {
        "cursor" : "BtreeCursor id_1",
        "nscanned" : 523248,
        "nscannedObjects" : 523248,
        "n" : 15,
        "millis" : 23682,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "isMultiKey" : false,
        "indexOnly" : false,
        "indexBounds" : { "id" : [ [ 0, 1.7976931348623157e+308 ] ] }
    }

This query uses case-insensitive regular expressions, which cannot use an index efficiently (and in this case there was no suitable index to use anyway).

Proposed Approach:

  • create lowercase search fields handle_lc and inreply_lc

  • add a compound index:

    db.tweet_data.ensureIndex({handle_lc: 1, inreply_lc: 1})

  • the field order in the compound index makes it efficient to find all tweets either by handle alone or by (handle, in_reply_to)

  • query by exact match on the lowercase fields instead of by regular expression:

    db.tweet_data.find({
        $or: [
            { inreply_lc: 'kunalnayyar', handle_lc: 'kaleycuoco', id: { $gt: 0 } },
            { inreply_lc: 'kaleycuoco', handle_lc: 'kunalnayyar', id: { $gt: 0 } }
        ]
    })
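The lowercase fields have to be maintained by the application whenever a tweet is saved. A minimal sketch of that normalization step (plain JavaScript; the function name and document shape are assumptions for illustration, not part of the original app):

```javascript
// Build the document to insert, adding lowercase duplicates of the two
// fields so the compound index {handle_lc: 1, inreply_lc: 1} can serve
// exact-match, case-insensitive lookups. Original fields stay untouched.
function prepareTweetDoc(tweet) {
  return Object.assign({}, tweet, {
    handle_lc: tweet.handle.toLowerCase(),
    inreply_lc: tweet.in_reply_to_screen_name.toLowerCase(),
  });
}

const doc = prepareTweetDoc({
  handle: 'KaleyCuoco',
  in_reply_to_screen_name: 'KunalNayyar',
  id: 123,
});
// doc.handle_lc === 'kaleycuoco', doc.inreply_lc === 'kunalnayyar'
// In the mongo shell the insert would then be: db.tweet_data.insert(doc)
```

The same normalization (strtolower in PHP) would go wherever the application inserts or updates tweets.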


Yes, 500K+ should be fine. As far as I know, there is no real "limit" on the number of documents in a collection; if anything, it would be the number of unique values the _id field can hold, which is far more than 500K. In your case I suspect the query is simply not very selective, so when the collection had fewer documents you did not notice a problem, but as it grows the query becomes sluggish. For example, how many documents does your MongoCursor return?
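One way to check selectivity from the mongo shell (a sketch to run against your own collection and a real query; it needs a live MongoDB, so treat the placeholder below as your actual filter):

```javascript
// Compare how many documents the query returns with how many the
// server had to scan to find them.
var query = { /* your $or filter from the question */ };
db.tweet_data.find(query).count();             // documents matched
db.tweet_data.find(query).explain().nscanned;  // documents examined
// A large nscanned alongside a small count means the query is not
// being served by a selective index.
```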

