Running Cassandra 1.0, I shrank the ring from 5 nodes to 4. To do this, I ran nodetool decommission on the node I wanted to remove, then stopped Cassandra on that host, and used nodetool move and nodetool cleanup on the remaining 4 nodes to update their tokens and rebalance the cluster.
My seed nodes are A and B. Remote node is C.
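The sequence I ran looked roughly like the following sketch. The host names (nodeA, nodeB, nodeD, nodeE for the survivors, nodeC for the removed node) are placeholders; the tokens are the evenly spaced 4-node values from the ring output below.

```shell
# On the node being removed (C): stream its ranges to the rest of the ring
nodetool -h nodeC decommission

# After decommission completes, stop Cassandra on C, then rebalance the
# remaining 4 nodes with evenly spaced tokens (i * 2**127 / 4):
nodetool -h nodeA move 0
nodetool -h nodeB move 42535295865117307932921825928971026432
nodetool -h nodeD move 85070591730234615865843651857942052864
nodetool -h nodeE move 127605887595351923798765477786913079296

# Remove data that no longer belongs to each node after the moves
nodetool -h nodeA cleanup
nodetool -h nodeB cleanup
nodetool -h nodeD cleanup
nodetool -h nodeE cleanup
```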
It seemed to work fine for 6-7 days, but now one of my four nodes believes that the decommissioned node is still part of the ring.
Why did this happen, and what is the right way to completely remove a decommissioned node from the ring?
Here's the output of nodetool ring on the one node that still believes the decommissioned node is part of the ring:
Address         DC          Rack   Status  State   Load       Owns    Token
                                                                      127605887595351923798765477786913079296
xx.x.xxx.xx     datacenter1 rack1  Up      Normal  616.17 MB  25.00%  0
xx.xxx.xxx.xxx  datacenter1 rack1  Up      Normal  1.17 GB    25.00%  42535295865117307932921825928971026432
xx.xxx.xx.xxx   datacenter1 rack1  Down    Normal  ?          9.08%   57981914123659253974350789668785134662
xx.xx.xx.xxx    datacenter1 rack1  Up      Normal  531.99 MB  15.92%  85070591730234615865843651857942052864
xx.xxx.xxx.xx   datacenter1 rack1  Up      Normal  659.92 MB  25.00%  127605887595351923798765477786913079296
Here is the output of nodetool ring on the other 3 nodes:
Address         DC          Rack   Status  State   Load       Owns    Token
                                                                      127605887595351923798765477786913079296
xx.x.xxx.xx     datacenter1 rack1  Up      Normal  616.17 MB  25.00%  0
xx.xxx.xxx.xxx  datacenter1 rack1  Up      Normal  1.17 GB    25.00%  42535295865117307932921825928971026432
xx.xx.xx.xxx    datacenter1 rack1  Up      Normal  531.99 MB  25.00%  85070591730234615865843651857942052864
xx.xxx.xxx.xx   datacenter1 rack1  Up      Normal  659.92 MB  25.00%  127605887595351923798765477786913079296
UPDATE: I tried to remove the node using nodetool removetoken on node B, which is the node that still claims node C is in the ring. The command ran for 5 hours and did not seem to do anything. The only change is that node C's entry now disappears when I run nodetool ring on node B.
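For reference, this is roughly what I ran on node B, plus the related subcommands I am aware of but have not yet used. This is a sketch: the host name is a placeholder, and the token is node C's token from the ring output above.

```shell
# On node B: remove the phantom node by its token
nodetool -h nodeB removetoken 57981914123659253974350789668785134662

# Check the progress of a pending token removal
nodetool -h nodeB removetoken status

# If the removal hangs (as it did here), it can be forced
nodetool -h nodeB removetoken force
```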