CouchDB or MongoDB for very high update rates and volume?

What would be the best NoSQL option for storing user data with very high update rates and data volume?

For example, dumping tens to hundreds of lines of user state / navigation state data on each page request, for a site with high traffic.

I am currently looking at MongoDB or CouchDB, but I am open to other alternatives.

EDIT (in response to kprobst's request): It will be hosted on Linux, and multiple instances (hardware or VM) can be provided.

The system would be used to store the state of visitors to the site: for 1-2 weeks for unauthenticated users, and (potentially) indefinitely for authenticated users.
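
As a rough illustration of the retention rule just described, here is a minimal Python sketch. The record layout (`last_seen`, `authenticated`) and the two-week constant are assumptions made for the example, not part of any particular database.

```python
from datetime import datetime, timedelta, timezone

# Assumption: use the upper end of the "1-2 weeks" window for anonymous visitors.
ANON_RETENTION = timedelta(weeks=2)


def expires_at(last_seen, authenticated):
    """Return the expiry time of a state record, or None if it is kept indefinitely."""
    return None if authenticated else last_seen + ANON_RETENTION


def is_expired(record, now):
    """True if an (assumed) record dict has outlived its retention window."""
    deadline = expires_at(record["last_seen"], record["authenticated"])
    return deadline is not None and now >= deadline


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    visitor = {"last_seen": now - timedelta(weeks=3), "authenticated": False}
    member = {"last_seen": now - timedelta(weeks=3), "authenticated": True}
    print(is_expired(visitor, now), is_expired(member, now))
```

Whichever store is chosen, a periodic job (or a built-in expiry feature, where one exists) would apply a rule like this to purge stale anonymous state.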

I think the current thinking in the business is to use CouchDB, since we already use it elsewhere, but I keep reading that it is not well suited to constant updates, and this system could write 30-400 JSON lines across several documents per user as the user interacts with the site (usage is expected to be very high).

In addition to this "dump" state, other user information will be stored, and the ability to query it would be useful.

+6
mongodb couchdb nosql
4 answers

I recently researched a number of NoSQL technologies, including CouchDB and MongoDB. My impression was that MongoDB is more performance-oriented than CouchDB, possibly because of certain design choices. For example, MongoDB is accessed through native language drivers, whereas CouchDB is accessed over a REST (HTTP) interface. MongoDB updates documents in place, whereas CouchDB uses MVCC. MongoDB stores data in memory-mapped files.

I chose MongoDB because it matched the type of data I want to store and because of the performance it offers. IMHO, I do not think an MVCC-based solution would be the best fit for the usage you describe. When a document is updated, instead of overwriting the existing document, it creates a new revision and marks the old one as obsolete; obsolete revisions then have to be removed periodically by compaction. The more updates there are, the more work that compaction task has to do.
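
To make the difference concrete, here is a toy Python model of the two update strategies. It is only a sketch: the class names are made up for the example, and it does not reproduce how MongoDB or CouchDB actually lay out data on disk. It only counts how many document bodies each approach holds before and after compaction.

```python
from dataclasses import dataclass, field


@dataclass
class InPlaceStore:
    """Toy store that overwrites a document on every update."""
    docs: dict = field(default_factory=dict)

    def update(self, doc_id, body):
        self.docs[doc_id] = body  # the old body is simply replaced

    def stored_bodies(self):
        return len(self.docs)


@dataclass
class AppendOnlyStore:
    """Toy MVCC-style store: every update appends a new revision."""
    log: list = field(default_factory=list)     # entries are (doc_id, rev, body)
    latest: dict = field(default_factory=dict)  # doc_id -> index of newest entry

    def update(self, doc_id, body):
        prev = self.latest.get(doc_id)
        rev = 1 if prev is None else self.log[prev][1] + 1
        self.log.append((doc_id, rev, body))
        self.latest[doc_id] = len(self.log) - 1

    def stored_bodies(self):
        return len(self.log)

    def compact(self):
        """Drop obsolete revisions and return how many were removed."""
        live = [self.log[i] for i in sorted(self.latest.values())]
        removed = len(self.log) - len(live)
        self.log = live
        self.latest = {doc_id: i for i, (doc_id, _, _) in enumerate(live)}
        return removed


def simulate(users=10, updates_per_user=100):
    """Apply the same updates to both stores and report the stored body counts."""
    in_place, append_only = InPlaceStore(), AppendOnlyStore()
    for n in range(updates_per_user):
        for u in range(users):
            state = {"user": u, "page_views": n + 1}
            in_place.update(u, state)
            append_only.update(u, state)
    before = append_only.stored_bodies()
    removed = append_only.compact()
    return in_place.stored_bodies(), before, removed, append_only.stored_bodies()


if __name__ == "__main__":
    print(simulate())
```

In this model the append-only store accumulates one obsolete revision per update until `compact()` runs, which is the extra work referred to above; the in-place store never holds more than one body per document.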

This is not to say that MongoDB is a "better" choice than CouchDB. They offer different things, and what is a drawback of one technology in a particular scenario may well be an advantage in another. You obviously have an advantage with CouchDB in that it is already used within the business, so there should be less of a learning curve.

There is a bit more of a comparison of the two on MongoDB.org.

+6

You do not say which platform you are running on, or which platform you can host the NoSQL solution on. You also do not say whether you want a straight distributed key-value store or a NoSQL database, which is what MongoDB would be. The two are not the same thing, although a NoSQL database can be used as a key-value store, I suppose.
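
The distinction can be sketched in a few lines of Python. This is an in-memory illustration only, with made-up helper names (`kv_put`, `doc_find`, and so on); it does not use the API of Redis, MongoDB, or any other product.

```python
import json

# Key-value style: the store only sees an opaque blob behind each key,
# so the only possible lookup is by key.
kv_store = {}


def kv_put(key, state):
    kv_store[key] = json.dumps(state)


def kv_get(key):
    blob = kv_store.get(key)
    return None if blob is None else json.loads(blob)


# Document style: the store understands the fields of each document,
# so it can also filter on them.
doc_store = []


def doc_insert(doc):
    doc_store.append(dict(doc))


def doc_find(**criteria):
    """Return every document whose fields match all of the given criteria."""
    return [d for d in doc_store
            if all(d.get(k) == v for k, v in criteria.items())]


if __name__ == "__main__":
    kv_put("user:1", {"page": "/home", "authenticated": False})
    print(kv_get("user:1"))

    doc_insert({"user": 1, "page": "/home", "authenticated": False})
    doc_insert({"user": 2, "page": "/cart", "authenticated": True})
    print(doc_find(authenticated=True))
```

If the state is only ever read back by user or session key, the first style is enough; if it also needs to be queried by its contents, the second style is what a document database provides.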

However, if you need a simple key-value store that works well on Linux, I would go with Redis. Of the NoSQL solutions, MongoDB is the one I have used; it works well on Windows Server 2008 (64-bit) and works fine on Linux (CentOS).

It really depends on what you need and where you can host it. For example, MongoDB requires at least two instances. If you provide more information, someone may be able to give you a better recommendation.

+1

Membase is a clustered, memory-based NoSQL database. It was developed by several of the memcached project leads. In addition to its own protocol, it is 100% compatible with the memcached API. Membase is already used in very large applications such as FarmVille.
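
In practice, memcached API compatibility means that existing memcached clients can talk to the server unchanged. As a small illustration, the Python sketch below builds the raw `set` and `get` commands of the memcached text protocol for a piece of user state. The helper names and the `user:42` key are made up for the example, and nothing is sent over the network.

```python
import json


def build_set(key, value, ttl_seconds=0, flags=0):
    """Return the bytes of a memcached text-protocol `set` command.

    `value` must already be bytes; a TTL of 0 means "never expire".
    """
    header = f"set {key} {flags} {ttl_seconds} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def build_get(key):
    """Return the bytes of a memcached text-protocol `get` command."""
    return f"get {key}\r\n".encode("ascii")


if __name__ == "__main__":
    state = json.dumps({"page": "/home"}).encode("utf-8")
    # Assumption for the example: anonymous state expires after two weeks.
    print(build_set("user:42", state, ttl_seconds=14 * 24 * 3600))
    print(build_get("user:42"))
```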

Membase and CouchOne are merging into Couchbase (where I work, FWIW, although I do not work on Membase). So it seems reasonable to expect that future versions of Membase will gain CouchDB features: map/reduce querying, off-site replication / backup, an HTTP REST interface, etc.

+1

Another option to consider is Berkeley DB, which is often used to back large web applications and infrastructure (such as Amazon.com). Berkeley DB supports both a key/value API (NoSQL) and a SQL API. If you are building a Java SOA solution, you could consider BDB Java Edition, which is used by Heritrix (Wayback Machine).

Disclaimer: I am one of the product managers for Berkeley DB, so I am a little biased. However, BDB was designed to provide a fast, scalable, reliable embedded data store for exactly the kind of operations you describe.

+1