Best way to run nodetool upgradesstables after an upgrade?

I am currently upgrading a 21-node cluster from 0.8 to 1.0.11. The Cassandra upgrade process requires sstables to be rewritten in the latest format after the software upgrade (via nodetool upgradesstables). This process seems to be very time-consuming: one node has been running it for 48 hours and still has not finished.

I would like to know whether it is advisable to run this in parallel on all nodes. In particular, what would the performance implications be? The cluster is under fairly heavy read/write load and must be available 24/7.

+8
cassandra upgrade
2 answers

The upgrade runs as a compaction, so your nodes will rewrite each sstable at a rate capped by "compaction_throughput_mb_per_sec".

I assume the performance impact is directly related to the value of this parameter. A low value (the default is 16 MB/s, and you can go lower) should let you upgrade the cluster without slowing it down.
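
As a rough sketch of how the throttle could be lowered on a running node: this assumes your nodetool build provides the setcompactionthroughput subcommand (check the nodetool help output for your version first). The helper name throttle_node, the NODETOOL variable and the host name are made up for illustration.

```shell
# Hypothetical helper: set the compaction throttle (in MB/s) on one node.
# Assumes nodetool supports "setcompactionthroughput"; verify on your version.
NODETOOL=${NODETOOL:-nodetool}

throttle_node() {
    host=$1
    mb_per_sec=$2
    "$NODETOOL" -h "$host" setcompactionthroughput "$mb_per_sec"
}

# Example (host name is a placeholder):
#   throttle_node cass-node-01 8
```

A value set this way is not written back to the configuration file, so compaction_throughput_mb_per_sec in cassandra.yaml still decides what the node uses after a restart.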

+6

I run the upgrade simultaneously on all nodes. I run this command (on Linux):

nohup nodetool upgradesstables & 

and then log out and let it work. It is a low-priority task and takes as long as it takes to rewrite every sstable that needs rewriting. I did not notice any latency problems during the upgrade.

If, for example, you have 1 TB of data per node (naughty!), then the upgrade has to rewrite all 1 TB, spread across many files. Reading and rewriting that much data at a throttled rate can take several days.
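
To put a number on that, here is a back-of-the-envelope sketch that divides the data size by the compaction throttle. The function name estimate_rewrite_hours is made up for illustration, and the result is only a lower bound: it assumes the rewrite runs at exactly the throttle rate and ignores contention with live reads and writes.

```shell
# Rough lower bound on rewrite time: data size divided by the throttle.
# Ignores contention with live traffic, which makes real runs slower.
estimate_rewrite_hours() {
    data_gb=$1          # data per node, in GB
    throttle_mb_s=$2    # compaction_throughput_mb_per_sec
    seconds=$(( data_gb * 1024 / throttle_mb_s ))
    echo $(( seconds / 3600 ))   # whole hours, rounded down
}

estimate_rewrite_hours 1024 16   # 1 TB at the default 16 MB/s: prints 18
estimate_rewrite_hours 1024 4    # same data throttled to 4 MB/s: prints 72
```

So 1 TB takes at least 18 hours at the default throttle, and about three days if the throttle is lowered to 4 MB/s.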

Note: sstables are immutable, and a snapshot is taken by creating hard links to the sstable files. As the upgrade rewrites each sstable, the old file stays on disk for as long as a snapshot still links to it, so disk usage can double. Monitor your disk space and delete snapshots if necessary to free space, especially if your nodes use more than 50% of the disk for data.
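
A minimal sketch of that disk check before starting the upgrade. The function names and the data directory path are assumptions for illustration; the 50% threshold is the rule of thumb from the note above.

```shell
# Print the used percentage (integer, no % sign) of the filesystem holding a path.
disk_used_pct() {
    df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Succeed only if usage is below 50%, i.e. there is room for a second copy.
has_upgrade_headroom() {
    [ "$1" -lt 50 ]
}

# Example (data directory path is an assumption; use your own):
#   pct=$(disk_used_pct /var/lib/cassandra/data)
#   has_upgrade_headroom "$pct" || echo "disk is ${pct}% full; consider clearing snapshots"
```

If the check fails, old snapshots can be removed with nodetool clearsnapshot before or during the upgrade.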

0
