Redistributing data in Cassandra when adding new servers

Question

Suppose I have a Cassandra cluster with 3 nodes, each of which has 100 GB of free hard disk space. The replication factor (RF) for this cluster is set to 3, and the read/write consistency level (CL) is 2; that is, I can take one of my nodes down without sacrificing consistency or availability.

Now imagine that my servers start to fill up (say, 80 GB each), and I would like to add 3 more servers of the same specification to my cluster, keeping the same CL and RF.

My question is: after I have added the new nodes to my cluster and run the nodetool repair tool, is it fair to assume that each of my nodes should contain roughly 40 GB of data (give or take a few GB)?

If not, how can I add new nodes without worrying about running out of disk space?

Some background on why I am asking: I am developing an application that connects to a server running Cassandra to store data. Since I am the only developer and have a limited budget for servers, I decided to buy small, cheap servers instead of more expensive rack options, but I really worry about nodes running out of space if data is not distributed (at least roughly) evenly across the disks.

Thank you very much for your help,

+4

cassandra cassandra-2.0

kha May 28 '15 at 17:48

1 answer

RussS · Accepted Answer · 2015-05-28T18:13:09+0000

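Short answer: yes, once the new nodes have joined and the old ones have been cleaned up, each node should hold (roughly) 40 GB.

Until you run nodetool cleanup, only the new nodes will be at about 40 GB per node. The original nodes keep the data they no longer own on disk, so their usage does not drop until cleanup has run.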
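
As a rough check of the 40 GB figure from the question, the arithmetic can be sketched as below. This is only an estimate under stated assumptions: token ownership is perfectly even, and the helper name `per_node_load_gb` is made up for this illustration (it is not part of Cassandra or any driver).

```python
def per_node_load_gb(unique_data_gb: float, replication_factor: int, node_count: int) -> float:
    """Average data per node, assuming perfectly even token ownership (hypothetical helper)."""
    if node_count < replication_factor:
        raise ValueError("need at least as many nodes as the replication factor")
    # Every row is stored replication_factor times, spread over node_count nodes.
    return unique_data_gb * replication_factor / node_count


# With 3 nodes and RF 3, every node holds a full copy of the data,
# so 80 GB on each node means 80 GB of unique data.
unique_gb = 80 * 3 / 3

before = per_node_load_gb(unique_gb, replication_factor=3, node_count=3)
after = per_node_load_gb(unique_gb, replication_factor=3, node_count=6)

print(f"per node with 3 nodes: {before} GB")
print(f"per node with 6 nodes: {after} GB")
```

The result is an average under even ownership; the actual load on each node can be compared against it with nodetool status.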
