Redistributing data in Cassandra when adding new servers

Suppose I have a Cassandra cluster with 3 nodes, each with 100 GB of hard disk space. The replication factor (RF) for this cluster is 3, and the read/write consistency level (CL) is 2, so I can take one of my nodes down without sacrificing consistency or availability.

Now imagine that my servers start to fill up (say, to 80 GB each), and I would like to add 3 more servers of the same specification to my cluster, keeping the same CL and RF.

My question is: after I have added the new nodes to my cluster and run the node repair tool, is it fair to assume that each of my nodes will hold roughly 40 GB of data (give or take a few GB)?

If not, how can I add new nodes without fear of running out of disk space?

Some background on why I am asking: I am developing an application that connects to a server running Cassandra to store its data. Since I am the only developer and have limited money for buying servers, I decided to buy small, cheap servers instead of more expensive rack options. However, I am worried about nodes running out of space if the data is not distributed (at least roughly) evenly across the disks.

Thank you very much for your help,

1 answer

Yes: after you add the new nodes, each node should hold (roughly) 40 GB.

nodetool 40 node. , node . , .
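The arithmetic behind the 40 GB figure can be sketched as follows. This is a rough estimate only: it assumes data is spread evenly across the nodes and ignores compaction overhead, snapshots, and commit logs. The function name `per_node_gb` and its parameters are illustrative and not part of any Cassandra API.

```python
def per_node_gb(unique_data_gb: float, replication_factor: int, node_count: int) -> float:
    """Estimate the data held by each node, assuming a perfectly even distribution."""
    if node_count < replication_factor:
        raise ValueError("node_count must be at least the replication factor")
    # Every row is stored replication_factor times, spread over node_count nodes.
    return unique_data_gb * replication_factor / node_count


# With 3 nodes and RF = 3, every node holds a full copy of the data,
# so 80 GB per node means 80 GB of unique data.
before = per_node_gb(unique_data_gb=80, replication_factor=3, node_count=3)
after = per_node_gb(unique_data_gb=80, replication_factor=3, node_count=6)

print(f"before: {before} GB per node, after: {after} GB per node")
```

Note that the original nodes only reach this figure once the data they no longer own has been removed from disk; in Cassandra that is done by running `nodetool cleanup` on each of them after the new nodes have joined.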
