I am trying, without success, to join a new (well, old but wiped) node to an existing cluster.
Currently, the cluster consists of 2 nodes and runs C* 2.1.2. I start the third node with 2.1.2, it starts joining and bootstraps, i.e. it streams some data, as shown by nodetool netstats, but after a while it gets stuck. From that point on nothing is streamed and the new node stays in the joining state. I have restarted the node twice; each time it streamed some more data and then got stuck again. (I'm in the third round now.)
Other facts:
- I do not see any errors in the logs on any of the nodes.
- Connectivity seems fine: I can ping and netcat to port 7000 in all directions.
- I have a load of 267 GB per running node, replication factor 2, 16 tokens.
- The load on the new node is now around 100 GB.
- I assume that the node, after several rounds of restarts, will eventually pull all the data from the running nodes and join the cluster. But this is definitely not how it should work.
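The connectivity check from the list above can also be scripted. This is only a minimal sketch: the helper name `can_connect` and the 3-second timeout are my own choices, and it only shows that a TCP connection to port 7000 can be opened, not that a long-running stream over it stays healthy.

```python
import socket

def can_connect(host, port=7000, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        # create_connection resolves the host and opens a TCP socket;
        # the with-block closes it again immediately.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Running something like `can_connect("192.168.200.16")` from each existing node, and the reverse from the new node, mirrors the netcat test.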
EDIT: I found one more piece of information:
The bootstrap process stalls in the middle of streaming some table, always after sending exactly 10 MB (10485760 bytes) of some SSTable, for example:
$ nodetool netstats | grep -P -v "bytes\(100"
Mode: NORMAL
Bootstrap e0abc160-7ca8-11e4-9bc2-cf6aed12690e
/192.168.200.16
Sending 516 files, 124933333900 bytes total
/home/data/cassandra/data/leadbullet/page_view-2a2410103f4411e4a266db7096512b05/leadbullet-page_view-ka-13890-Data.db 10485760/167797071 bytes(6%) sent to idx:0/192.168.200.16
Read Repair Statistics:
Attempted: 2016371
Mismatch (Blocking): 0
Mismatch (Background): 168721
Pool Name Active Pending Completed
Commands n/a 0 55802918
Responses n/a 0 425963
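To confirm that a stream is really stuck rather than just slow, two netstats snapshots taken a few minutes apart can be compared. The following is a rough sketch that assumes the per-file progress lines look like the one above; the function names `parse_progress` and `stalled_files` are hypothetical helpers, not part of nodetool.

```python
import re

# Matches a per-file progress line of `nodetool netstats`, assumed to look like:
#   /path/to/x-Data.db 10485760/167797071 bytes(6%) sent to idx:0/192.168.200.16
PROGRESS = re.compile(r"^\s*(?P<path>\S+)\s+(?P<done>\d+)/(?P<total>\d+)\s+bytes\(")

def parse_progress(netstats_output):
    """Return {sstable_path: (bytes_done, bytes_total)} for every file line."""
    progress = {}
    for line in netstats_output.splitlines():
        m = PROGRESS.match(line)
        if m:
            progress[m.group("path")] = (int(m.group("done")), int(m.group("total")))
    return progress

def stalled_files(before, after):
    """Incomplete files whose byte count is identical in both snapshots."""
    return sorted(
        path
        for path, (done, total) in after.items()
        if done < total and path in before and before[path][0] == done
    )
```

Feeding it the output of two `nodetool netstats` runs (for example via `subprocess`) lists the files that made no progress in between; in my case that would be the single `page_view` SSTable frozen at 10485760 bytes.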
I cannot diagnose the error and I will be grateful for any help!