Voldemort vs CouchDB

I am trying to decide whether to use Voldemort or CouchDB for an upcoming healthcare project. I want a storage system that is highly available and fault tolerant, and that can scale to the huge amount of data the project will generate.

What are the advantages / disadvantages of each?

thanks

+6
database couchdb voldemort
3 answers

Project Voldemort looks nice, but I have not looked into it in depth yet.

In its current state, CouchDB may not be the right choice for "massive amounts of data." Distributing data between nodes and routing requests accordingly is on the roadmap but has not been implemented yet. The largest known production installations of CouchDB use "tables" ("databases" in CouchDB terms) of about 200 GB.

HA is not natively supported by CouchDB, but it can be built easily: all CouchDB nodes replicate the databases among themselves in a multi-master setup. We put two Varnish proxies in front of the CouchDB machines, and the Varnish boxes are made redundant with CARP. CouchDB's "built of the Web" design makes such things very easy.
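As a rough sketch of the replication part of such a setup, where every node replicates a database to every other node, the requests could be generated as below. It assumes a CouchDB version whose `_replicate` endpoint accepts `source`, `target` and `continuous`; the node URLs and the `patients` database name are hypothetical, and authentication is left out.

```python
import json
import urllib.request


def replication_requests(nodes, db):
    """Build one continuous replication request per ordered pair of nodes."""
    requests = []
    for source in nodes:
        for target in nodes:
            if source == target:
                continue
            body = {
                "source": f"{source}/{db}",
                "target": f"{target}/{db}",
                "continuous": True,
            }
            # Each request is addressed to the source node's /_replicate endpoint.
            requests.append((f"{source}/_replicate", body))
    return requests


def start_replication(url, body):
    """POST one replication request; only works against reachable CouchDB nodes."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Hypothetical node URLs and database name, for illustration only.
NODES = ["http://couch1.example:5984", "http://couch2.example:5984"]
plan = replication_requests(NODES, "patients")
for url, body in plan:
    print(url, json.dumps(body))
```

Calling `start_replication(url, body)` for each entry in `plan` would start the replications. The proxy and CARP layers sit in front of this and are not shown.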

The most pressing issue in our setup is that replicating large (multi-megabyte) attachments to CouchDB documents is still problematic.

I suggest you also check out the traditional RDBMS route. Finding people with experience outside the RDBMS approach is a huge problem, and Oracle and Co. have very affordable offers.

+5

Without knowing more than what is in your question, I would say that Project Voldemort, or distributed hash tables (DHTs) like CouchDB in general, are a good solution to your HA problem.

These DHTs are very good for high availability, but when it comes to consistency they are harder to write code for than traditional relational databases (RDBMS).

They are good at storing document-type information, which may fit your healthcare project well, but they make working with the data harder for developers.

  • The biggest limitation of most of these stores is that they are not transaction safe (see Scalaris for a transaction-safe store), so you need to ensure data consistency yourself (most handle it at read time by merging conflicting data). An RDBMS is much easier to use where data consistency (ACID) is concerned.
  • Aggregating data is much more complicated. In an RDBMS you can easily query data across multiple tables; in CouchDB you need to write code to aggregate the data. For other stores, Hadoop may be a good choice for aggregating information.
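To give an idea of what merging conflicting data at read time can look like in application code, here is a minimal sketch. The record layout (an `updated` timestamp plus a `fields` dict) and the "newest value wins per field" policy are assumptions made for illustration; they are not how Voldemort or CouchDB represent conflicts internally.

```python
def merge_versions(versions):
    """Merge conflicting versions of one record into a single record.

    Each version is a dict with an "updated" timestamp and a "fields" dict.
    Assumed policy: for every field, the value from the newest version wins.
    """
    merged = {}
    newest_seen = {}
    for version in versions:
        for name, value in version["fields"].items():
            if name not in newest_seen or version["updated"] > newest_seen[name]:
                merged[name] = value
                newest_seen[name] = version["updated"]
    return merged


# Two replicas accepted different writes for the same (hypothetical) patient.
replica_a = {"updated": 10, "fields": {"name": "J. Doe", "allergy": "none"}}
replica_b = {
    "updated": 12,
    "fields": {"name": "J. Doe", "allergy": "penicillin", "ward": "3B"},
}

record = merge_versions([replica_a, replica_b])
print(record)
```

With an RDBMS this code would not be needed, because a transaction would have rejected or serialized one of the two writes. Here the application has to decide which policy is safe for each kind of data.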

Read about BASE and about what the CAP theorem says regarding consistency and availability.
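One way to make the consistency versus availability trade-off concrete is the quorum arithmetic used by Dynamo-style stores. The sketch below assumes a store with `n` replicas per key that waits for `r` replicas on a read and `w` replicas on a write; the function and key names are made up for illustration.

```python
def quorum_properties(n, r, w):
    """Describe a setup with n replicas, read quorum r and write quorum w."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return {
        # If r + w > n, every read set overlaps every write set, so at least
        # one replica in any read holds the latest acknowledged write.
        "read_sees_latest_write": r + w > n,
        # Replicas that may be down while reads / writes still reach quorum.
        "read_tolerates_failures": n - r,
        "write_tolerates_failures": n - w,
    }


for n, r, w in [(3, 2, 2), (3, 1, 1), (3, 1, 3)]:
    print((n, r, w), quorum_properties(n, r, w))
```

Lower `r` and `w` tolerate more failed replicas but allow stale reads; higher values do the opposite.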


+4

Is MemcacheDB an option? I have heard that Digg uses it to deal with HA issues.

+1
