What happens if Zookeeper fails?

We have set up a Kafka / Zookeeper cluster consisting of 3 brokers. One producer sends messages to one specific Kafka topic, and several consumer groups read from this topic. These consumers also run their own leader election through Zookeeper (independently of Kafka).

Versions used:

  • Kafka: 0.9.0.1
  • Zookeeper: 3.4.6 (included in the Kafka package)

All processes are controlled by Supervisor. So far, everything is working fine. As a test, we then killed all the Zookeeper processes to see what would happen.

As we expected, our consumer processes could no longer connect to Zookeeper. Surprisingly, though, the Kafka brokers kept working. Our producer did not complain at all and could still write to the topic. Although I could not use kafka/bin/kafka-topics.sh or similar tools, since they all require the zookeeper parameter, I could still see the topic's log files growing on disk. After we restarted the Zookeeper processes, everything worked again the same way as before.

What we cannot understand is: what actually happened? We thought Kafka required a working Zookeeper connection, and we could not find any explanation for this behavior on the Internet.

1 answer

With a single-node Zookeeper, the broker can no longer contact Zookeeper once that node is down. After the broker detects that Zookeeper is unavailable, the broker becomes unavailable as well, and with it the producer and the consumer. The producer starts to fail (its records are rejected). On the consumer side, a record that was read but not yet processed may be processed again once the broker is back up and ready ...

With a 3-node Zookeeper ensemble, the failure of one node is acceptable, since the quorum is still satisfied ... but the failure of 2 nodes cannot be tolerated and leads to the consequences described above ...
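
The quorum arithmetic behind this can be sketched in a few lines of Python. This is only an illustration, assuming the usual majority quorum (more than half of the ensemble must be alive); the function names are made up for this example and are not part of Zookeeper or Kafka.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of an ensemble (assumes majority quorum)."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many nodes may fail while a majority is still alive."""
    return ensemble_size - quorum_size(ensemble_size)


def has_quorum(ensemble_size: int, failed_nodes: int) -> bool:
    """True if the surviving nodes still form a majority."""
    alive = ensemble_size - failed_nodes
    return alive >= quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in (1, 3, 5):
        print(f"{n} node(s): quorum={quorum_size(n)}, "
              f"tolerated failures={tolerated_failures(n)}")
```

For 3 nodes this gives a quorum of 2, so one failure is tolerated and two are not; a single node tolerates no failure at all.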
