MongoDB reliably slows down every 2 hours and 10 minutes

For the past 3 months, my MongoDB server has slowed down badly every 2 hours and 10 minutes, with almost clockwork precision.

My server configuration:

  • 3 replica sets; for data backup, one of them has a delay of 3600 seconds (see the config sketch after this list).
  • There are no master/slave servers besides those 3 members of the replica set.
  • mongoose + node.js are used to provide a REST API.
  • About 9 queries and 1.5 writes per second on average, from 24 hours of statistics.
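
For illustration, assuming this really means one replica set where the third member is the delayed backup, the configuration looks roughly like the following (host names and _id values are placeholders, not my real ones):

    // Rough shape of rs.conf() for this setup (placeholders only)
    {
      _id: "rs0",
      members: [
        { _id: 0, host: "db1.example.com:27017" },
        { _id: 1, host: "db2.example.com:27017" },
        { _id: 2, host: "db3.example.com:27017",
          priority: 0, hidden: true, slaveDelay: 3600 }  // delayed backup member
      ]
    }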

What I have tried after searching Stack Overflow and Google:

  • Restarting the server does NOT shift the 2-hour-10-minute slow interval.
  • Created indexes on every field that I query: no effect.
  • Deleted the data files on one server and resynced them from another, then did the same the other way round: no effect.
  • Switched the primary to another server: no effect.
  • Ran db.currentOp() while the database was slow: I see lots of queries hanging there (far too much output to paste here), but no obviously abnormal query (a small filtering sketch follows this list).
  • Checked serverStatus in the mongo console: while the database is slow, the command itself hangs until the database recovers.
  • top shows no increase in memory usage while the database is slow.
  • The parts of the REST API that do not touch the database keep working fine.
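
For illustration, this is roughly the kind of check I mean by inspecting current operations while the database is slow; it is only a sketch using the stock mongo shell, and the 5-second threshold is arbitrary:

    // List operations that have been running for more than 5 seconds,
    // or that are stuck waiting for a lock, while the server is slow.
    db.currentOp().inprog
      .filter(function (op) { return (op.secs_running || 0) > 5 || op.waitingForLock; })
      .forEach(printjson);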

I suspect something is taking a blocking lock, and the most likely candidate is index creation. A few things are unusual about my database:

  • I have about 14,000 collections in one database, and the number is still growing. A single collection can hold anywhere from 1 to 3,000 documents.
  • Both the number of collections and the number of documents grow dynamically.
  • The index fields are specified at the moment a new collection is created (a rough mongoose sketch of this pattern follows below).
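
For illustration, the pattern looks roughly like this in mongoose (a simplified sketch with made-up schema and naming, not my actual code):

    var mongoose = require('mongoose');

    // A new model (and therefore a new collection) is created on the fly,
    // and its index field is declared at creation time; mongoose then
    // ensures the index automatically when the model is first used.
    function modelForDevice(id) {
      var name = 'device_' + id;                  // placeholder naming scheme
      var schema = new mongoose.Schema({
        ts:    { type: Date, index: true },       // index field declared up front
        value: Number
      });
      return mongoose.model(name, schema, name);  // define each model only once
    }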

I have been obsessed with this problem for 3 months. Any comments / suggestions would be much appreciated!

Below are some of the logs from my log file:

Fri Jul  5 15:20:11.040 [conn2765] serverStatus was very slow: { after basic: 0, after asserts: 0, after backgroundFlushing: 0, after connections: 0, after cursors: 0, after dur: 0, after extra_info: 0, after globalLock: 0, after indexCounters: 0, after locks: 0, after network: 0, after opcounters: 0, after opcountersRepl: 0, after recordStats: 222694, after repl: 222694, at end: 222694 }

Fri Jul  5 17:30:09.367 [conn4711] serverStatus was very slow: { after basic: 0, after asserts: 0, after backgroundFlushing: 0, after connections: 0, after cursors: 0, after dur: 0, after extra_info: 0, after globalLock: 0, after indexCounters: 0, after locks: 0, after network: 0, after opcounters: 0, after opcountersRepl: 0, after recordStats: 199498, after repl: 199498, at end: 199528 }

Fri Jul  5 19:40:12.697 [conn6488] serverStatus was very slow: { after basic: 0, after asserts: 0, after backgroundFlushing: 0, after connections: 0, after cursors: 0, after dur: 0, after extra_info: 0, after globalLock: 0, after indexCounters: 0, after locks: 0, after network: 0, after opcounters: 0, after opcountersRepl: 0, after recordStats: 204061, after repl: 204061, at end: 204081 }

Here is a screenshot of my Pingdom report: the server now goes down for about 4 minutes every 2 hours and 7 minutes; at the beginning it went down for about 2 minutes every 2 hours and 6 minutes. (Pingdom report screenshot)

[EDIT 1] More monitoring results from the hosting provider: CPU http://i.minus.com/iZBNyMPzLSLRr.png DiskIO http://i.minus.com/ivgrHr0Ghoz92.png Connections http://i.minus.com/itbfYq0SSMlNs.png The periodic spikes in connections happen because connections are left waiting; the current-connections counter keeps accumulating until the database unlocks again. It is not caused by a surge in traffic.

+8
mongodb mongoose
3 answers

We ran into this exact 2:10 problem. In our case it was caused by dbStats being run by MMS monitoring. We had to upgrade the cluster, and the problem was resolved.
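
If you suspect the same cause, one quick check (my sketch, not part of the original answer) is to time dbStats yourself while the server is under load; on a database with very many collections it can take a long time and hold locks:

    // In the mongo shell: measure how long dbStats takes on this database.
    var start = new Date();
    db.runCommand({ dbStats: 1 });
    print("dbStats took " + (new Date() - start) + " ms");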

+2

I had a similar problem. I would start with mongostat / mongotop and work from there: identify the dominant workload with mongostat, then use mongotop to find out which collection is driving that activity.

In my specific case, I had a cron job that removed stale entries. It turned out that the way those deletes are replicated is extremely resource-intensive: deleting, say, 3 million documents from a collection on the primary forces every secondary to work hard afterwards just applying the replicated deletes. A rough sketch of batching such deletes is shown below.
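
As a rough illustration (my sketch, not the exact job; collection name and cutoff are made up), deleting in small batches with pauses in between spreads the replication load instead of producing one huge burst of oplog entries:

    // Delete stale documents in batches of 1000, pausing between batches.
    var cutoff = new Date(Date.now() - 30 * 24 * 3600 * 1000);  // 30 days ago
    while (true) {
      var stale = db.events.find({ createdAt: { $lt: cutoff } }, { _id: 1 })
                           .limit(1000).toArray();
      if (stale.length === 0) break;
      db.events.remove({ _id: { $in: stale.map(function (d) { return d._id; }) } });
      sleep(500);  // give the secondaries time to catch up
    }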

If you can see anything in db.currentOp, I would focus on the operations that have been running for a long time and try to pinpoint the root cause by elimination.

Hope this helps.

+1

I think you mean one replica set with three nodes rather than "3 replica sets."

If you are still experiencing the same problem, here are my thoughts:

  • You are hosting at linode.com, so your server is actually a virtual machine sharing resources with other tenants. Periodic slowdowns can be caused by another tenant running disk-heavy jobs on a schedule. Since you have already explored so many other possibilities, this is worth investigating, even though it takes some effort.

  • Otherwise it is caused by some job performed periodically by mongodb or by your system, so look for anything that runs on a schedule. For example, try removing the 3600-second delay on your delayed secondary (see the rs.reconfig() sketch below). 3600 seconds is not 2 hours and 10 minutes, but it could still be the trigger.
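
For reference, removing that delay is a small reconfig run on the primary; this is only a sketch, and the member index is a placeholder: pick whichever member shows slaveDelay: 3600 in rs.conf().

    // In the mongo shell, on the primary:
    var cfg = rs.conf();
    cfg.members[2].slaveDelay = 0;  // drop the 1-hour apply delay (index 2 is a placeholder)
    rs.reconfig(cfg);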

I cannot post my suggestions as a comment because the site will not let me, so I am posting them as an answer.

0
