I have a Python application (I'm a Python and mongo newbie) that runs every hour through cron to fetch data, clean it, and insert it into mongo. At run time, the application queries mongo to check for duplicates and inserts the document only if it is new.
I recently noticed that mongod is running at 100% CPU usage ... and I'm not sure when / why it started.
I am running this on an EC2 micro instance with a dedicated EBS volume for mongo, which is ~2.2 GB in size.
I'm not sure where to start diagnosing the problem. Here is the output of stats() and serverStatus() on the system:
> db.myApp.stats()
{
    "ns" : "myApp.myApp",
    "count" : 138096,
    "size" : 106576816,
    "avgObjSize" : 771.7588923647318,
    "storageSize" : 133079040,
    "numExtents" : 13,
    "nindexes" : 1,
    "lastExtentSize" : 27090944,
    "paddingFactor" : 1,
    "flags" : 1,
    "totalIndexSize" : 4496800,
    "indexSizes" : {
        "_id_" : 4496800
    },
    "ok" : 1
}

> db.serverStatus()
{
    "host" : "kar",
    "version" : "2.0.4",
    "process" : "mongod",
    "uptime" : 4146089,
    "uptimeEstimate" : 3583433,
    "localTime" : ISODate("2013-04-07T21:18:05.466Z"),
    "globalLock" : {
        "totalTime" : 4146088784941,
        "lockTime" : 1483742858,
        "ratio" : 0.0003578656741237909,
        "currentQueue" : {
            "total" : 0,
            "readers" : 0,
            "writers" : 0
        },
        "activeClients" : {
            "total" : 2,
            "readers" : 2,
            "writers" : 0
        }
    },
    "mem" : {
        "bits" : 64,
        "resident" : 139,
        "virtual" : 1087,
        "supported" : true,
        "mapped" : 208,
        "mappedWithJournal" : 416
    },
    "connections" : {
        "current" : 7,
        "available" : 812
    },
    "extra_info" : {
        "note" : "fields vary by platform",
        "heap_usage_bytes" : 359456,
        "page_faults" : 634
    },
    "indexCounters" : {
        "btree" : {
            "accesses" : 3431,
            "hits" : 3431,
            "misses" : 0,
            "resets" : 0,
            "missRatio" : 0
        }
    },
    "backgroundFlushing" : {
        "flushes" : 69092,
        "total_ms" : 448897,
        "average_ms" : 6.497090835407862,
        "last_ms" : 0,
        "last_finished" : ISODate("2013-04-07T21:17:15.620Z")
    },
    "cursors" : {
        "totalOpen" : 0,
        "clientCursors_size" : 0,
        "timedOut" : 1
    },
    "network" : {
        "bytesIn" : 297154435,
        "bytesOut" : 222773714,
        "numRequests" : 1721768
    },
    "opcounters" : {
        "insert" : 138004,
        "query" : 359,
        "update" : 0,
        "delete" : 0,
        "getmore" : 0,
        "command" : 1583416
    },
    "asserts" : {
        "regular" : 0,
        "warning" : 0,
        "msg" : 0,
        "user" : 0,
        "rollovers" : 0
    },
    "writeBacksQueued" : false,
    "dur" : {
        "commits" : 9,
        "journaledMB" : 0,
        "writeToDataFilesMB" : 0,
        "compression" : 0,
        "commitsInWriteLock" : 0,
        "earlyCommits" : 0,
        "timeMs" : {
            "dt" : 3180,
            "prepLogBuffer" : 0,
            "writeToJournal" : 0,
            "writeToDataFiles" : 0,
            "remapPrivateView" : 0
        }
    },
    "ok" : 1
}
And the top output:
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
18477 mongodb   20   0 1087m 139m 122m R 99.9 23.7  10729:36 mongod

I'm curious how to debug mongo to determine where / what / why this terrible performance is happening.
UPDATE:
I found out that I can use explain() to get the details, although I'm not sure how to interpret the results:

> db.myApp.find({'id':'320969221423124481'}).explain()
{
    "cursor" : "BasicCursor",
    "nscanned" : 138124,
    "nscannedObjects" : 138124,
    "n" : 0,
    "millis" : 3949,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "isMultiKey" : false,
    "indexOnly" : false,
    "indexBounds" : {

    }
}
UPDATE:
OK, now I see that the sample query (which the application runs a BUNCH of times) takes about 4 seconds. I think it is NOT using any index. I need to find out how to add an index ... doing that now.
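For anyone reading the explain() output above for the first time, here is a rough Python sketch of the same reasoning. It only inspects a plain dict shaped like the legacy (2.x) explain document shown above; the function name `diagnose_explain` and the 100x scan-to-result threshold are my own assumptions for illustration, not anything mongo provides.

```python
def diagnose_explain(plan, ratio_threshold=100):
    """Return human-readable warnings for a legacy (MongoDB 2.x) explain() dict.

    `ratio_threshold` is an arbitrary, hypothetical cutoff for "scanned far
    more documents than were returned".
    """
    warnings = []

    # In the 2.x explain format, "BasicCursor" means a full collection scan;
    # an indexed query reports "BtreeCursor <index name>" instead.
    if plan.get("cursor") == "BasicCursor":
        warnings.append("full collection scan: no index was used")

    scanned = plan.get("nscanned", 0)
    returned = plan.get("n", 0)
    # Compare documents examined against documents actually returned.
    if scanned > ratio_threshold * max(returned, 1):
        warnings.append(
            "scanned %d documents to return %d" % (scanned, returned)
        )
    return warnings


# The values below are copied from the explain() output in the question.
plan = {
    "cursor": "BasicCursor",
    "nscanned": 138124,
    "nscannedObjects": 138124,
    "n": 0,
    "millis": 3949,
}

for warning in diagnose_explain(plan):
    print(warning)
```

Both warnings fire for the query in question: every document in the collection is examined on each duplicate check, which is consistent with the ~4 second query time.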
UPDATE:
So, I did the following
db.myApp.ensureIndex({'id':1})
And that fixed everything. heh.
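In case it helps someone else, the same thing can be done from the Python side so the index is guaranteed to exist before the hourly job starts its duplicate checks. This is a minimal sketch, assuming PyMongo 3 or newer (where `create_index`, `find_one` and `insert_one` are available); the helper names `ensure_id_index` and `insert_if_new` are hypothetical, and `collection` is assumed to be the `myApp` collection in the `myApp` database.

```python
def ensure_id_index(collection):
    """Create an ascending index on 'id' (shell: ensureIndex({'id': 1})).

    PyMongo's create_index does nothing if an identical index already exists,
    so this is safe to call on every cron run.
    """
    return collection.create_index([("id", 1)])


def insert_if_new(collection, doc):
    """Insert `doc` unless a document with the same 'id' is already stored.

    Returns True if the document was inserted, False if it was a duplicate.
    The find_one lookup is the query that was slow without the index.
    """
    if collection.find_one({"id": doc["id"]}) is not None:
        return False
    collection.insert_one(doc)
    return True


# Hypothetical usage against a real server (not executed here):
#
#   from pymongo import MongoClient
#   collection = MongoClient()["myApp"]["myApp"]
#   ensure_id_index(collection)
#   insert_if_new(collection, {"id": "320969221423124481"})
```

The check-then-insert flow mirrors what my application already does. If several writers could run at the same time, a unique index on `id` would be a safer way to reject duplicates, but I have not needed that for a single cron job.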