When does kafka change leader?

I have been running my services with Kafka for a year now, and there were no spontaneous leader changes. But over the past 2 weeks this has started happening quite often. Kafka writes:

  • [2015-09-27 15:35:14,826] INFO [ReplicaFetcherManager on broker 2] Removed fetcher for partitions [myTopic] (kafka.server.ReplicaFetcherManager)
  • [2015-09-27 15:35:14,830] INFO Truncating log myTopic-0 to offset 11520979. (kafka.log.Log)
  • [2015-09-27 15:35:14,845] WARN [Replica Manager on Broker 2]: Fetch request with correlation id 713276 from client ReplicaFetcherThread-0-2 on partition [myTopic,0] failed due to Leader not local for partition [myTopic,0] on broker 2 (kafka.server.ReplicaManager)
  • [2015-09-27 15:35:14,857] WARN [Replica Manager on Broker 2]: Fetch request with correlation id 256685 from client mirrormaker-1 on partition [myTopic,0] failed due to Leader not local for partition [myTopic,0] on broker 2 (kafka.server.ReplicaManager)
  • [2015-09-27 15:35:20,171] INFO [ReplicaFetcherManager on broker 2] Removed fetcher for partitions [myTopic,0] (kafka.server.ReplicaFetcherManager)

What can cause a leader change? If this is covered anywhere in the Kafka documentation, please just post a link; I could not find it.


System configuration

kafka version: kafka_2.10-0.8.2.1

os: Red Hat Enterprise Linux Server release 6.5 (Santiago)

server.properties (different from the default):

  • broker.id = 001
  • socket.send.buffer.bytes = 1048576
  • socket.receive.buffer.bytes = 1048576
  • socket.request.max.bytes = 104857600
  • log.flush.interval.messages = 10000
  • log.flush.interval.ms = 1000
  • log.retention.bytes = -1
  • controlled.shutdown.enable = true
  • auto.create.topics.enable = false
2 answers

It seems that the leader broker for this partition went down. It is possible that the data directory (log.dirs) configured in server.properties ran out of space and the broker could not keep hosting it. Also, what is the replication factor of the topic, and how many brokers are in the cluster?
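
If it helps, here is a quick way to check both the free space and the replication factor; the data directory path, host and port below are placeholders, adjust them to your setup:

# Free space on the volume holding the Kafka data directory, and the size of the partition's directory
df -h /path/to/kafka-logs
du -sh /path/to/kafka-logs/myTopic-0

# Replication factor, current leader and ISR of the topic
bin/kafka-topics.sh --describe --zookeeper Zookeeperhost:Port --topic myTopic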


I assume that you have a single topic with one partition and a replication factor of 2, which is not a good configuration for optimal performance of Kafka and its consumers.

Your logs are not detailed enough to explain the leader switch. The main constraint of your topic is that there can only ever be one leader, because it has a single partition, and that one partition's log file keeps growing every day. Kafka does some internal rebalancing at a certain point (I have not confirmed the details), and that may be why your leader is moving, but I'm not sure.
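
One mechanism that can move a leader without a broker failing is automatic preferred-replica leader rebalancing, but that is only a guess here. You could check whether it is configured on your brokers, for example:

# Look for the auto-rebalance settings in the broker config (option names are from the 0.8.2 broker configuration)
grep -E "auto\.leader\.rebalance\.enable|leader\.imbalance" config/server.properties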

Also, the second line of your log says that the log was truncated. Could you please look through the logs and check whether the leader switch happens only after a truncation?
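
To check that, you could pull the relevant messages out of the broker logs and compare the timestamps; the log file paths below are assumptions, adjust them to your installation:

# Compare when truncation happens vs. when fetch requests start failing
grep -E "Truncating|Leader not local" /path/to/kafka/logs/server.log | grep myTopic
# Leader transitions themselves are usually logged by the brokers in state-change.log
grep myTopic /path/to/kafka/logs/state-change.log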

As mentioned above, check the Kafka log directory files and their sizes. Also, please run the describe command below when you hit this problem; the leader switch will show up in its output. Or, if you can set up some dashboard that tracks the leader over time, it will be easier to find the root cause.

bin/kafka-topics.sh --describe --zookeeper Zookeeperhost:Port --topic TopicName 
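
For reference, the output looks roughly like this (the broker ids here are made up); the Leader column shows which broker currently leads the partition, so running the command before and after the incident tells you whether the leader moved:

Topic:myTopic   PartitionCount:1        ReplicationFactor:2     Configs:
        Topic: myTopic  Partition: 0    Leader: 2       Replicas: 2,1   Isr: 2,1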

Suggestion: I suggest you create a new topic with a larger number of partitions (read the Kafka documentation to get an idea of the optimal partition count) and start writing to it. Or you can look into how to add partitions to the current topic. A rough sketch follows below.
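
The partition count, topic names, host and port below are placeholders, pick real values based on the documentation and your cluster:

# Create a new topic with more partitions
bin/kafka-topics.sh --create --zookeeper Zookeeperhost:Port --topic NewTopicName --partitions 4 --replication-factor 2

# Or add partitions to the existing topic (partitions can only be increased, and key-based ordering may change)
bin/kafka-topics.sh --alter --zookeeper Zookeeperhost:Port --topic TopicName --partitions 4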

One last thing: does the leader switch cause any problems for your clients, or are you only worried about the warnings?

