MongoDB load balancing across multiple AWS instances

Question

We run a business application on Amazon Web Services, with a Node.js server and MongoDB as the database. The Node.js server currently runs on an EC2 instance, and the MongoDB database lives on a separate micro instance. We now want to deploy a replica set for MongoDB, so that if one MongoDB node hangs or becomes unavailable, we can still run the database and get data from it.

So we are trying to keep each member of the replica set on a separate instance, so that we can still get data from the database even if the instance hosting the primary member goes down.

Now I want to add a load balancer for the database so that it keeps working well even under a heavy load of traffic. In this case, I can balance reads by setting slaveOK on the replica set. But that will not balance the load if the heavy traffic consists of write operations.

To solve this problem, I have two options so far.

Option 1: I have to shard the database and keep each shard on a separate instance. Each shard would have its replica set on that same instance. But there is a problem: since sharding divides the database into several parts, the shards do not hold the same data. So if one instance goes down, we cannot access the data in the shard on that instance.

To solve this problem, I am trying to split the database into shards, with each shard having a replica set whose members run on separate instances. Then even if one instance goes down, we will not have a problem. But if we have 2 shards and each shard's replica set has 3 members, I need 6 AWS instances. So I think this is not an optimal solution.

Option 2: We can create a master-master configuration in MongoDB, meaning every database is a primary and all of them accept reads and writes, but they also synchronize with each other automatically and frequently, so that they all end up as clones of each other. Each of these primary databases would run on a separate instance. But I do not know whether MongoDB supports this structure.

I have not found any MongoDB docs or blog posts covering this situation. So please suggest what the best solution to this problem would be.

database mongodb amazon-web-services amazon-ec2 load-balancing

Indra Jul 10 '14 at 7:55

2 answers

Sammaye · Answer 1 · 2014-07-10T08:22:21+0000

This is far from a complete answer: there are too many details, and I could write an entire essay on this question, as could many others. Since I do not have that kind of time to spare, I will just comment on what I see.

Now I want to add a load balancer for the database so that it keeps working well even under a heavy load of traffic.

Replica sets are not designed for that kind of work. If you want to load balance, you are probably looking for sharding, which will allow you to do this.

Replication is designed to automatically recover from a failure.
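To make the distinction concrete, here is a toy sketch of why sharding spreads write load while a replica set does not. It is not MongoDB's actual chunk routing: `shard_for` and `write_distribution` are hypothetical helpers that simply hash a key and take it modulo the number of shards. With one "shard" (a single primary) every write lands on the same node; with two, the writes split between them.

```python
import hashlib
from collections import Counter


def shard_for(key, shard_count):
    """Pick a shard index for a key by hashing it (toy routing, not MongoDB's)."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then reduce modulo the shard count.
    return int.from_bytes(digest[:8], "big") % shard_count


def write_distribution(keys, shard_count):
    """Count how many writes each shard would receive for the given keys."""
    return Counter(shard_for(key, shard_count) for key in keys)


if __name__ == "__main__":
    keys = range(1000)
    # A replica set has a single primary, so it takes every write.
    print("replica set :", dict(write_distribution(keys, 1)))
    # Two shards each take a share of the same writes.
    print("two shards  :", dict(write_distribution(keys, 2)))
```

Each shard's primary still replicates its share of the writes to its own secondaries, so replication gives you failover per shard and sharding gives you the write scaling.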

In this case, I can balance reads by setting slaveOK on the replica set.

Since your secondaries have to receive as many write ops as the primary in order to stay up to date, it seems this may not help much.

In reality, instead of having one server with many connections queued, you have many connections on many servers queueing for stale data, because consistency between members is eventual, not immediate as in ACID technologies. That said, members end up consistent within 32-odd ms, which means they are not lagging enough to give you decent extra throughput if the primary is loaded.

Since reads ARE concurrent, you will get the same speed whether you read from the primary or a secondary. I suppose you could delay a slave to create a pause in ops, but that would bring back massively stale data in return.

Not to mention that MongoDB is not multi-master, so you can only write to one node at a time, which makes slaveOK not the most useful setting in the world, and I have seen numerous times where 10gen themselves recommend sharding over this setting.
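For reference, this is roughly what opting into secondary reads looks like at the connection level. It is a minimal sketch: the host names, database name and replica set name are made up, and `replica_set_uri` is a hypothetical helper. It only builds a standard MongoDB connection string using the `replicaSet` and `readPreference` options, which is how drivers expose what slaveOK used to do.

```python
from urllib.parse import urlencode


def replica_set_uri(hosts, database, replica_set, read_preference="primary"):
    """Build a MongoDB connection string for a replica set.

    hosts is a list of "host:port" strings; read_preference is one of the
    standard modes, e.g. "primary" or "secondaryPreferred".
    """
    options = urlencode({"replicaSet": replica_set, "readPreference": read_preference})
    return "mongodb://{}/{}?{}".format(",".join(hosts), database, options)


if __name__ == "__main__":
    # Hypothetical hosts, one per EC2 instance.
    hosts = [
        "db1.example.internal:27017",
        "db2.example.internal:27017",
        "db3.example.internal:27017",
    ]
    print(replica_set_uri(hosts, "app", "rs0", "secondaryPreferred"))
```

Whatever the read preference, writes still go to the single primary, which is the point made above.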

Option 2: We can create a master-master configuration in MongoDB,

This would require your own coding. At that point, you may want to consider using a database that actually supports multi-master replication: http://en.wikipedia.org/wiki/Multi-master_replication

This is because the speed you are looking for is most likely in writes rather than reads, as I discussed above.

Option 1: I have to shard the database and keep each shard on a separate instance.

This is the recommended way, but you have found the caveat with it. Unfortunately, this remains an unsolved problem that multi-master replication is supposed to solve; however, multi-master replication brings its own ship of plague rats to Europe, and I strongly recommend that you do some serious research before you conclude that MongoDB currently cannot meet your needs.

You might be worrying about nothing, since the fsync queue is designed to deal with the IO bottleneck that would slow down your writes, as it would in SQL, and reads are concurrent, so if you plan your schema and working set correctly you should be able to get a massive number of ops.

In fact, there is a linked question around here with an answer from a 10gen employee that is very good to read: https://stackoverflow.com/a/166646/ and it shows just how much throughput MongoDB can achieve under load.

It will soon grow further with the new document-level locking, which is already in the dev branch.

Lalit agarwal · Answer 2 · 2014-07-10T08:56:42+0000

Option 1 is the recommended method, as pointed out by @Sammaye, but you will not need 6 instances and can manage it with 4 instances.

Assume you need the setup below.

2 shards (S1, S2)
1 secondary for each shard (replica set secondary) (RS1, RS2)
1 arbiter for each shard (RA1, RA2)

Then you can split your server configuration as shown below.

Instance 1 : Runs : S1 (Primary Node)
Instance 2 : Runs : S2 (Primary Node)
Instance 3 : Runs : RS1 (Secondary Node S1) and RA2 (Arbiter Node S2)
Instance 4 : Runs : RS2 (Secondary Node S2) and RA1 (Arbiter Node S1)

You can run the arbiter nodes alongside your secondary nodes, which will help with elections during failures.
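As a sanity check on the 4-instance layout, here is a small sketch that models it and verifies that losing any single instance still leaves each shard with a majority of voting members and at least one data-bearing node. The instance names, the `example.internal` host names and the ports are assumptions for illustration only, and config servers and mongos routers are left out. `replica_set_config` produces a document in the shape accepted by `rs.initiate()`, with `arbiterOnly` set for the arbiter.

```python
from collections import defaultdict

# Layout from the answer: instance -> list of (shard, role) processes.
LAYOUT = {
    "instance1": [("S1", "data")],
    "instance2": [("S2", "data")],
    "instance3": [("S1", "data"), ("S2", "arbiter")],
    "instance4": [("S2", "data"), ("S1", "arbiter")],
}


def members_by_shard(layout):
    """Group processes by shard: shard -> list of (instance, role)."""
    shards = defaultdict(list)
    for instance, processes in layout.items():
        for shard, role in processes:
            shards[shard].append((instance, role))
    return dict(shards)


def shard_survives(members, failed_instance):
    """A shard can still elect a primary if a strict majority of its voting
    members is up and at least one surviving member holds data."""
    alive = [(inst, role) for inst, role in members if inst != failed_instance]
    has_majority = len(alive) > len(members) // 2
    has_data_node = any(role == "data" for _, role in alive)
    return has_majority and has_data_node


def replica_set_config(shard, members, data_port=27017, arbiter_port=27018):
    """Build a replica set config document for one shard.

    Host names and ports are hypothetical; the arbiter uses a different port
    because it shares an instance with another shard's data node.
    """
    config_members = []
    for index, (instance, role) in enumerate(members):
        port = arbiter_port if role == "arbiter" else data_port
        member = {"_id": index, "host": "{}.example.internal:{}".format(instance, port)}
        if role == "arbiter":
            member["arbiterOnly"] = True
        config_members.append(member)
    return {"_id": shard, "members": config_members}


if __name__ == "__main__":
    shards = members_by_shard(LAYOUT)
    for failed in LAYOUT:
        status = {shard: shard_survives(m, failed) for shard, m in shards.items()}
        print("lose", failed, "->", status)
    print(replica_set_config("S1", shards["S1"]))
```

Note that this only covers a single instance failing. If two instances that carry both data nodes of the same shard go down together, that shard is left with just its arbiter and cannot serve data.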
