Kafka: failure of a single consumer in a group

I'm in the early stages of learning Kafka, version 0.8.1.1.

I have a consumer group example running successfully with many partitions, and it distributes messages among the consumers pretty well.

One test case I wanted to run was a consumer in the group dying suddenly (e.g. kill -9). When I did this, I expected a rebalance to happen, but it did not. So can I do either of the following?

  • Trigger a rebalance through the API
  • Configure Kafka to wait a certain time for consumer activity and then rebalance automatically, on the assumption that the consumer was shut down ungracefully.

The problem is that all messages in the partitions assigned to the dead consumer stay in the queue and are never processed until a rebalance occurs.

+8
apache-kafka
1 answer

Rebalancing happens automatically after a timeout that you can set in the consumer configuration (zookeeper.session.timeout.ms). According to the documentation:

zookeeper.session.timeout.ms : ZooKeeper session timeout. If the consumer fails to heartbeat to ZooKeeper for this period of time, it is considered dead and a rebalance will occur. The default value is 6000 ms.

Once the timeout has expired, another live consumer in the same group will begin to receive the messages.

Set this timeout to suit your requirements.
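As a minimal sketch of how that setting might be supplied, the snippet below builds the properties for the 0.8 high-level consumer. The class name, the ZooKeeper address, the group name and the 3000 ms value are placeholders chosen for illustration, not recommendations.

```java
import java.util.Properties;

public class ConsumerTimeoutConfig {

    // Builds consumer properties with an explicit ZooKeeper session timeout.
    // The keys are the consumer property names used by the 0.8 high-level consumer.
    public static Properties build(String zkConnect, String groupId, int sessionTimeoutMs) {
        Properties props = new Properties();
        props.setProperty("zookeeper.connect", zkConnect);
        props.setProperty("group.id", groupId);
        // A consumer that stops heartbeating for this long is treated as dead.
        props.setProperty("zookeeper.session.timeout.ms", Integer.toString(sessionTimeoutMs));
        return props;
    }

    public static void main(String[] args) {
        // Placeholder values; adjust to your own cluster and latency tolerance.
        Properties props = build("localhost:2181", "example-group", 3000);
        System.out.println(props);
        // With the Kafka 0.8 client on the classpath, these properties would be
        // passed to new kafka.consumer.ConsumerConfig(props).
    }
}
```

A shorter timeout means the partitions of a killed consumer are reassigned sooner, but it also makes a slow or briefly paused consumer more likely to be declared dead.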

The Kafka documentation also has this additional information:

Consumer rebalancing fails (you will see a ConsumerRebalanceFailedException): This is caused by conflicts when two consumers try to own the same topic partition. The log will show you what caused the conflict (search for "conflict in").

  • If your consumer subscribes to many topics and your ZK server is busy, this could be caused by consumers not having enough time to see a consistent view of all consumers in the same group. If so, try increasing rebalance.max.retries and rebalance.backoff.ms.
  • Another reason could be that one of the consumers was hard-killed. During a rebalance, the other consumers will not notice that this consumer is gone until zookeeper.session.timeout.ms has elapsed. In this case, make sure that rebalance.max.retries * rebalance.backoff.ms > zookeeper.session.timeout.ms.
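The inequality in the last point can be checked before starting a consumer. The following is only a sketch: the class and method names are hypothetical, the numbers in main are illustrative rather than defaults, and it assumes all three properties have been set explicitly.

```java
import java.util.Properties;

public class RebalanceWindowCheck {

    // Returns true when the total rebalance retry window is longer than the
    // ZooKeeper session timeout, so retries can outlast a hard-killed consumer.
    // Throws NumberFormatException if any of the three properties is missing.
    public static boolean retriesOutlastSession(Properties props) {
        long retries = Long.parseLong(props.getProperty("rebalance.max.retries"));
        long backoffMs = Long.parseLong(props.getProperty("rebalance.backoff.ms"));
        long sessionMs = Long.parseLong(props.getProperty("zookeeper.session.timeout.ms"));
        return retries * backoffMs > sessionMs;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // Illustrative values: 4 retries * 2000 ms = 8000 ms, against a 6000 ms session timeout.
        props.setProperty("rebalance.max.retries", "4");
        props.setProperty("rebalance.backoff.ms", "2000");
        props.setProperty("zookeeper.session.timeout.ms", "6000");
        System.out.println("retry window covers session timeout: " + retriesOutlastSession(props));
    }
}
```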
+7