Cassandra does not do master-slave replication. There is no master in Cassandra. Rather, data is distributed across the cluster, and how it is distributed depends on a number of things.
Data is stored on nodes in partitions. Remember that Cassandra is a partitioned row store? This is where the partitions come in. All rows for a partition are stored together on a single node (and on its replicas). How many replicas there are depends on the replication factor (RF) that applies to the table; it is configured on the table's keyspace. If the replication factor is 3, each partition of the table (and as such, all rows in that partition) is stored on one node plus two additional replicas. It is a way of saying: "I want 3 copies of this data."
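To make "RF copies of each partition" concrete, here is a toy Python sketch of placing a partition on a ring of nodes. The node names, the use of MD5 as the hash, and the rule "take the owning node and the next RF - 1 nodes around the ring" are simplifying assumptions for illustration only; Cassandra's real partitioner and replication strategies are more involved.

```python
import hashlib
from bisect import bisect_left


def token(value: str) -> int:
    # Hypothetical token function: MD5 is used here only for illustration.
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


def replicas_for(partition_key: str, nodes: list[str], rf: int) -> list[str]:
    """Return the rf distinct nodes holding this partition in the toy model."""
    if not 1 <= rf <= len(nodes):
        raise ValueError("rf must be between 1 and the number of nodes")
    # Each node owns one position on the ring (no virtual nodes in this sketch).
    ring = sorted((token(n), n) for n in nodes)
    tokens = [t for t, _ in ring]
    # First node at or after the key's token, wrapping around the ring.
    start = bisect_left(tokens, token(partition_key)) % len(ring)
    return [ring[(start + i) % len(ring)][1] for i in range(rf)]


if __name__ == "__main__":
    print(replicas_for("user:42", ["n1", "n2", "n3", "n4", "n5"], rf=3))
```

With RF = 3, every key maps to three different nodes, and all rows sharing that partition key land on the same three.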
Clients can specify a consistency level (CL) when writing. It is the number of replica nodes that must acknowledge the write for it to count as successful. Clients can also specify a CL for reads. Cassandra sends the read request to n = CL replicas and takes the most recently written value as the result of the query.
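The read side can be sketched in a few lines of Python. This is only a model of the idea, not Cassandra's code: `ReplicaResponse` and `resolve_read` are hypothetical names, and the sketch assumes each replica returns its value together with the timestamp of the write that produced it.

```python
from typing import NamedTuple


class ReplicaResponse(NamedTuple):
    value: str
    write_timestamp: int  # assumed: larger means written later


def resolve_read(responses: list[ReplicaResponse], read_cl: int) -> str:
    """Return the newest value once at least read_cl replicas have answered."""
    if len(responses) < read_cl:
        raise RuntimeError("not enough replicas answered for this read CL")
    # Last write wins: the response with the highest write timestamp is used.
    return max(responses, key=lambda r: r.write_timestamp).value


if __name__ == "__main__":
    answers = [ReplicaResponse("old", 100), ReplicaResponse("new", 200)]
    print(resolve_read(answers, read_cl=2))
```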
By tuning the read and write CL, you control consistency. If Read CL + Write CL > Replication Factor (RF), you get full consistency, because every read then touches at least one replica that acknowledged the latest write.
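The rule is simple arithmetic, sketched below in Python. The function names are made up for this example; the second function shows why the rule works, by computing the smallest possible overlap between the replicas a write reached and the replicas a read asks.

```python
def is_fully_consistent(read_cl: int, write_cl: int, rf: int) -> bool:
    """True when Read CL + Write CL > RF."""
    return read_cl + write_cl > rf


def min_overlap(read_cl: int, write_cl: int, rf: int) -> int:
    """Fewest replicas that any read set and any write set must share."""
    return max(0, read_cl + write_cl - rf)


if __name__ == "__main__":
    for r, w, rf in [(2, 2, 3), (1, 1, 3), (1, 3, 3)]:
        print(r, w, rf, is_fully_consistent(r, w, rf), min_overlap(r, w, rf))
```

When the overlap is at least 1, a read is guaranteed to see a replica holding the latest acknowledged write.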
In terms of fault tolerance, you can tune CL and RF to be what you need. For example, if you have RF = 3, read CL = 2, write CL = 2, then you get full consistency, and you can tolerate one node going down. For RF = 5, read CL = 3, write CL = 3, you have the same thing, but you can tolerate 2 nodes going down.
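The two examples follow from one small calculation, sketched here in Python under the assumption that a request at a given CL succeeds as long as at least CL of the RF replicas for that partition are up. The function name is hypothetical.

```python
def tolerated_replica_failures(read_cl: int, write_cl: int, rf: int) -> int:
    """Replicas that can be down while both reads and writes still succeed."""
    if not (1 <= read_cl <= rf and 1 <= write_cl <= rf):
        raise ValueError("consistency levels must be between 1 and rf")
    # The stricter of the two consistency levels is the limiting one.
    return rf - max(read_cl, write_cl)


if __name__ == "__main__":
    print(tolerated_replica_failures(2, 2, 3))
    print(tolerated_replica_failures(3, 3, 5))
```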
A two-node cluster is not really a good idea. You can set RF = 2 (all data replicated to both nodes), write CL = 2 and read CL = 1. However, this means that if one node is down, you can read, but not write. Alternatively, you can set read CL = 2 and write CL = 1, in which case, if one node is down, you can write, but not read. Realistically, you should go for at least 5 nodes (4 at a minimum) with RF = 3. Anything lower than that, and you are asking for trouble.
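The two-node trade-off can be checked with a short Python sketch. It assumes RF = 2 on a two-node cluster, so each node holds every partition, and that an operation succeeds only if at least CL replicas are up. The function name and the returned keys are made up for this example.

```python
def availability(rf: int, read_cl: int, write_cl: int, replicas_down: int) -> dict[str, bool]:
    """Report whether reads and writes can still succeed with some replicas down."""
    live = rf - replicas_down
    return {"can_read": live >= read_cl, "can_write": live >= write_cl}


if __name__ == "__main__":
    # RF = 2, one node down, two different CL choices.
    print(availability(rf=2, read_cl=1, write_cl=2, replicas_down=1))
    print(availability(rf=2, read_cl=2, write_cl=1, replicas_down=1))
```

Either way, losing one of the two nodes takes away reads or writes, which is why a larger cluster with RF = 3 is the safer choice.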
ashic