Why does ZooKeeper require a majority to run?

I was wondering why ZooKeeper needs a majority of the machines in an ensemble to operate. Let's say we have a very simple ensemble of three machines: A, B, and C.

When A fails, a new leader is elected and everything keeps working. But when another machine dies, say B, the service becomes unavailable. Why is that? Why can't machine C handle everything by itself until A and B come back?

After all, one machine is enough to do all the work (for example, a single-machine ensemble works fine) ...

Is there any special reason why ZooKeeper is designed this way? Is there a way to configure ZooKeeper so that, for example, an ensemble stays available as long as at least one of its N servers is up?

Edit: Maybe there is a way to plug in our own leader election algorithm? Or to configure the quorum size?

Thanks in advance.

3 answers

ZooKeeper is designed to replicate state reliably. If the network partitions into segments, you do not want the two halves to keep operating independently and possibly diverge, because when the partition heals, neither side would know how to reconcile its state with the other. If the service instead refuses to operate whenever it can see less than a majority, you can be sure that when the partition is resolved, everything comes back into a consistent state without any extra intervention.


The reason for requiring a majority of votes is to avoid a problem called split brain.

Basically, when the network fails, you do not want both parts of the system to continue as usual. You want one part to continue and the other to recognize that it is no longer part of the cluster.

There are two main ways to achieve this. One is a shared resource, for example a shared disk on which the leader holds a lock: if you can see the lock, you are part of the cluster; if you hold the lock, you are the leader; if you do not, you are not. The problem with this approach is that you need that shared resource.

The other way to prevent split brain is to count votes: if you gather enough votes, you are the leader. This even works with just two live nodes in a three-node ensemble: the leader claims leadership and the other node, acting as a "witness", agrees. This method is preferable because it works in a shared-nothing architecture, and it is what ZooKeeper uses.

As Michael noted, a node cannot know whether the reason it does not see the other nodes in the cluster is that those nodes are down or that there is a network problem; the safe bet is to assume there is no quorum.
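To make the vote-counting rule concrete, here is a small sketch (not ZooKeeper's actual code) of the strict-majority test, and why a partition can never produce two leaders: at most one side of any split can hold more than half of the ensemble.

```python
def has_quorum(votes: int, ensemble_size: int) -> bool:
    """A strict majority: more than half of the whole ensemble must agree.
    Note the denominator is the full ensemble size, not the number of
    nodes currently reachable."""
    return votes > ensemble_size // 2

# A network partition splits a 3-node ensemble into sides of 2 and 1.
# Only the 2-node side passes the majority test, so at most one side
# can ever elect a leader and keep serving.
partition_sides = (2, 1)
serving_sides = [side for side in partition_sides if has_quorum(side, 3)]
```

This also explains the questioner's 3-machine scenario: once two machines are down, the survivor sees only 1 vote out of 3, which is not a majority, so it must stop serving rather than risk diverging from the (possibly still alive but unreachable) others.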


Let's look at an example that shows how things can go wrong if the quorum (the minimum number of servers that must acknowledge an operation) is too small.

Say we have five servers, and a quorum can be any set of two servers. Now suppose servers s1 and s2 confirm that they have replicated a request to create znode /z. The service returns a message to the client saying that the znode has been created. Suppose now that s1 and s2 are partitioned away from the other servers and from all clients for an arbitrarily long time, before they can replicate the new znode to the other servers. The service in this state can still make progress, because three servers are available and, by our assumption, only two are required, but those three servers have never seen the new znode /z. The request to create /z is therefore not durable.

This is an example of a split-brain scenario. To avoid this problem, the quorum in this example must be at least three, which is a majority of the five servers in the ensemble. To make progress, the ensemble needs at least three available servers. To confirm that a state-update request completed successfully, it likewise needs at least three servers to acknowledge that they replicated it.
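The key property behind "at least three of five" is that any two majority quorums must share at least one server, so every new quorum contains a server that saw every previously acknowledged write. A brute-force sketch (illustrative only, not ZooKeeper code) checks this intersection property for a given quorum size:

```python
from itertools import combinations

def quorums_always_intersect(n: int, q: int) -> bool:
    """True if every pair of q-sized quorums over n servers shares a server.
    This holds exactly when 2*q > n, i.e. q is a strict majority."""
    servers = range(n)
    return all(set(a) & set(b)
               for a in combinations(servers, q)
               for b in combinations(servers, q))
```

With n = 5 and q = 2, the quorums {s1, s2} and {s3, s4} are disjoint, which is precisely the lost-write scenario above; with q = 3, no two quorums can be disjoint, so the acknowledged znode can never be invisible to a later quorum.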

