Reading from MongoDB without blocking

We are using MongoDB 2.2.0 at work. The database contains about 51 GB of data (at the moment), and I would like to run some analytics on the user data we have collected so far. The problem is that it's a live machine, and at the moment we can't afford another slave. I know that MongoDB takes a read lock, which can block any writes that occur, especially during complex queries. Is there any way to tell MongoDB to treat my (specific) query with the lowest priority?

+6
2 answers

In MongoDB, reads and writes do affect each other. Read locks are shared, but a read lock blocks write locks from being acquired, and of course no other reads or writes can happen while a write lock is held. MongoDB operations yield their locks periodically so that other threads waiting on locks don't starve. You can read more about this in MongoDB's concurrency documentation.

What does this mean for your use case? Since there is no way to access data without taking read locks, and no way to prioritize queries (at least not yet), whether reading your data will significantly affect write performance depends on how much headroom you have while write activity continues.

One suggestion I can make is to structure your analytics so that you don't scan the entire data set at once (i.e., don't run a single aggregation query over all historical data). Instead, try running smaller aggregation queries over shorter time slices. This will do two things:

  • The read jobs will be shorter and will therefore finish faster; this gives you a chance to assess what impact these queries have on your "live" performance.
  • You won't pull all of the old data into RAM at once; by spacing these analytical queries out over time, you minimize their impact on ongoing write performance.
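The time-slicing approach above can be sketched as follows. This is a minimal illustration, not the answer's own code; the collection, field, and pipeline names in the commented section are assumptions for demonstration.

```python
from datetime import datetime, timedelta

def time_slices(start, end, step_days=7):
    """Yield (window_start, window_end) pairs covering [start, end)."""
    cur = start
    while cur < end:
        nxt = min(cur + timedelta(days=step_days), end)
        yield cur, nxt
        cur = nxt

def window_filter(field, window_start, window_end):
    """Build a MongoDB query filter restricting `field` to one time window."""
    return {field: {"$gte": window_start, "$lt": window_end}}

# Hypothetical usage with pymongo (database, collection, and field names
# are assumptions); each small aggregation holds locks briefly and can be
# spaced out with a pause between windows:
#
# from pymongo import MongoClient
# coll = MongoClient()["app"]["events"]
# for ws, we in time_slices(datetime(2012, 1, 1), datetime(2012, 10, 1)):
#     coll.aggregate([
#         {"$match": window_filter("created_at", ws, we)},
#         {"$group": {"_id": "$user_id", "n": {"$sum": 1}}},
#     ])
```

Merging the per-window results client-side keeps any single query from touching all historical data at once.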

Depending on exactly what it is you can't afford about getting another server, you might consider a spot AWS instance, which may not be very powerful but would be available to run a long analytics query against a copy of your data set. Just be careful how you create that copy: a full sync from the production system will put a heavy load on it (a more efficient way is to restore from a recent backup / filesystem snapshot).
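A rough sketch of seeding the analytics copy from a backup rather than syncing against production (hostnames and paths are hypothetical):

```shell
# Dump from a recent backup host or a secondary, not the busy primary
# (hostname and output path are assumptions).
mongodump --host backup-host:27017 --out /data/dump

# On the analytics instance, restore the dump into a local mongod;
# this avoids a full sync against the production system.
mongorestore --host localhost:27017 /data/dump
```

A filesystem snapshot of the dbpath (with journaling enabled) works too and avoids the dump/restore round trip.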

+6

Such operations are best left to the secondaries (slaves) of a replica set. For one thing, read locks are shared, allowing many reads at once, but a write lock blocks reads. And, although you cannot prioritize queries, MongoDB does yield during long read/write operations. Their concurrency docs should help.

If you cannot afford another server, you could run the slave on the same machine, provided you have spare RAM and disk and the machine is used lightly/occasionally. Be careful, though: your disk I/O will increase significantly.
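In the MongoDB 2.2 era, a same-machine slave could be started with master/slave replication flags along these lines (ports, paths, and log locations are assumptions for illustration):

```shell
# The existing production mongod must be started with --master.
# Start a second mongod as a slave on a different port and dbpath
# (values here are hypothetical):
mongod --slave --source localhost:27017 \
       --port 27018 --dbpath /data/analytics-slave \
       --fork --logpath /var/log/mongod-slave.log
```

Analytics queries would then target port 27018, keeping their read locks off the master, though both processes still share the same disk.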

+2
