The desired replication strategy is not formally supported by MongoDB.
A MongoDB replica set consists of one primary with asynchronous replication to one or more secondary servers in the same replica set. You cannot configure a replica set with multiple primaries, or with replication from one replica set to another.
However, there are several possible approaches to your use case, depending on how actively you want to update the central server and the amount of data / updates that you need to manage.
Some common caveats:
Combining data from multiple stand-alone servers can cause unexpected conflicts. For example, unique indexes cannot account for documents created on other servers.
Ideally, the data you consolidate will still be separated into a distinct database per source server, so that you do not get strange crosstalk between disparate documents that share the same namespace and _id but come from different origin servers.
Approach # 1: use mongodump and mongorestore
If you just need to periodically synchronize content to a central server, one way to do this is to use mongodump and mongorestore. You can schedule periodic mongodump runs from each individual instance and use mongorestore to import them into the central server.
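As a minimal sketch of how such a scheduled sync could be wired up (the hostnames, database name, and backup paths below are illustrative placeholders, not anything from your setup):

```python
import subprocess  # used when you actually execute the commands, as shown below

def dump_cmd(source_host, db, out_dir):
    # Build a mongodump invocation for one source server.
    return ["mongodump", "--host", source_host, "--db", db, "--out", out_dir]

def restore_cmd(central_host, dump_dir):
    # Build a mongorestore invocation against the central server.
    return ["mongorestore", "--host", central_host, dump_dir]

# Example usage (requires the MongoDB tools on PATH and reachable servers):
#   for store in ["store1.example.net", "store2.example.net"]:
#       subprocess.check_call(dump_cmd(store, "sales", "/backups/" + store))
#       subprocess.check_call(restore_cmd("central.example.net", "/backups/" + store))
```

A cron job (or Windows scheduled task) running a script like this on whatever interval you can tolerate is usually enough for periodic consolidation.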
Warning:
There is a --db option for mongorestore which allows you to restore a dump into a database with a different name (if needed)
mongorestore only performs inserts into the existing database (i.e. it does not do updates or upserts). If a document with the same _id already exists in the target database, mongorestore will not replace it.
You can use mongodump options such as --query to be more selective about which data is exported (e.g. only the most recent data rather than everything)
If you want to limit the amount of data dumped and restored on each run (for example, only exporting "changed" data), you will also need to decide how to handle updates and deletions on the central server.
Given these caveats, the simplest use of this approach would be a full drop and restore (i.e. using mongorestore --drop) to ensure all changes are copied.
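The two variants above can be sketched as command builders as well. Note the assumptions: the incremental variant presumes each document carries a "lastModified" date field (adapt the query to your own schema), and mongodump requires --collection when --query is used:

```python
import json

def incremental_dump_cmd(source_host, db, coll, since_iso, out_dir):
    # Dump only documents changed since a given ISO timestamp, assuming a
    # "lastModified" field in each document (an assumption about your schema).
    # --query takes a JSON document; mongodump requires --collection with it.
    query = json.dumps({"lastModified": {"$gte": {"$date": since_iso}}})
    return ["mongodump", "--host", source_host, "--db", db,
            "--collection", coll, "--query", query, "--out", out_dir]

def full_resync_cmd(central_host, dump_dir):
    # Simplest-but-heaviest option: --drop removes each target collection
    # before restoring, so updates and deletes on the source are reflected.
    return ["mongorestore", "--host", central_host, "--drop", dump_dir]
```

The incremental variant transfers less data but leaves you responsible for propagating updates and deletes yourself; the --drop variant handles those for free at the cost of re-copying everything.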
Approach # 2: use a tailable cursor with the MongoDB oplog
If you need more real-time or incremental replication, one possibility is to create tailable cursors on the MongoDB oplog.
This approach is essentially "rolling your own replication". You would need to write an application that tails the oplog on each of your MongoDB instances and looks for changes of interest to save to the central server. For example, you might only replicate changes for selected namespaces (databases or collections).
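A bare-bones sketch of that idea, using PyMongo (assumptions: each source runs with an oplog enabled, e.g. as a single-node replica set; only inserts are handled, while updates and deletes are left as an exercise; all names are illustrative):

```python
def ns_filter(namespaces):
    # Oplog query restricting replication to selected "db.collection"
    # namespaces ("ns" is the namespace field of each oplog entry).
    return {"ns": {"$in": list(namespaces)}}

def tail_oplog(source_uri, central_uri, namespaces):
    # Tail the source's oplog and apply matching inserts to the central
    # server. No resume point, no error handling: a sketch, not production.
    from pymongo import MongoClient, CursorType  # imported here so the
    # helpers above can be used without the driver installed
    source = MongoClient(source_uri)
    central = MongoClient(central_uri)
    oplog = source.local["oplog.rs"]
    cursor = oplog.find(ns_filter(namespaces),
                        cursor_type=CursorType.TAILABLE_AWAIT)
    for entry in cursor:
        if entry["op"] == "i":  # insert: "o" holds the full document
            db_name, coll = entry["ns"].split(".", 1)
            central[db_name][coll].replace_one(
                {"_id": entry["o"]["_id"]}, entry["o"], upsert=True)
```

A real implementation would also record the last-seen oplog timestamp so it can resume after restarts, and handle the 'u' (update) and 'd' (delete) operation types.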
A related tool that may be of interest is the experimental Mongo Connector from 10gen Labs. This is a Python module that provides an interface for tailing the oplog and replicating selected data.
Warning:
To do this, you need to implement your own code and learn/understand how to work with oplog documents.
There may be an alternative product that better supports your desired replication model out of the box.
Stennie