Is it possible to create a Kafka topic with a dynamic number of partitions?

I use Kafka to stream page-visit events from website users to an analytics service. Each event will contain the following data for the consumer:

  • user ID
  • user IP address

I need very high throughput, so I decided to partition the topic using userId-ipAddress as the partition key, i.e.

For userId 1000 and IP address 10.0.0.1, the event will have the partition key "1000-10.0.0.1".

In this case the partition key is dynamic, so I cannot set the number of partitions in advance when creating the topic. Is it possible to create a topic in Kafka with a dynamic number of partitions?

Is this kind of partitioning a good idea, or is there another way to achieve this?

+7
partitioning apache-kafka kafka-consumer-api
1 answer

It is not possible to create a Kafka topic with a dynamic number of partitions. When you create a topic, you must specify the number of partitions. You can change it later manually using the Replication Tools, although the partition count can only be increased, not decreased.

But I do not understand why you need a dynamic number of partitions. The partition key is not related to the number of partitions. You can use the same partition key with ten partitions or with thousands of partitions. When you publish a message to a Kafka topic, Kafka must send it to a specific partition. Each partition is identified by an ID, which is simply a number. Kafka computes something like this

 partition_id = hash(partition_key) % number_of_partitions 

and sends the message to the partition with that partition_id. If you have many more users than partitions, you should be fine. Additional suggestions:

  • Use userId as the partition key. You probably don't need the IP address as part of the partition key. What is it for? As a rule, you want all messages from one user to land in the same partition. If the IP address is part of the partition key, messages from a single user may end up in several partitions. I do not know your use case, but in general that is not what you want.
  • Measure the number of partitions needed to process all messages. Then create, let's say, ten times as many partitions. You can create more partitions than you really need. Kafka will not mind, and at this scale there should be no significant penalty. See How to choose the number of topics / partitions in a Kafka cluster?
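
To make the first suggestion concrete, here is a minimal Python sketch of the hash-modulo rule above. It is only an illustration: `partition_for`, `NUM_PARTITIONS` and the sample IP addresses are hypothetical, and `zlib.crc32` is a deterministic stand-in rather than the hash Kafka's own partitioner uses.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition ID as hash(key) % num_partitions.

    zlib.crc32 is a deterministic stand-in here; it is NOT the hash
    function that Kafka itself applies to keys.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


NUM_PARTITIONS = 10  # hypothetical, fixed when the topic is created

# Events of one user arriving from several (made-up) IP addresses.
user_id = "1000"
ips = ["10.0.0.1", "10.0.0.2", "192.168.1.7", "172.16.0.9"]

# Key = userId: every event of this user maps to the same partition.
by_user = {partition_for(user_id, NUM_PARTITIONS) for _ in ips}

# Key = userId-ipAddress: each distinct IP may map to a different partition.
by_user_ip = {partition_for(f"{user_id}-{ip}", NUM_PARTITIONS) for ip in ips}

print("userId key    ->", sorted(by_user))
print("userId-ip key ->", sorted(by_user_ip))
```

Whatever the key looks like, the result is always a number between 0 and NUM_PARTITIONS - 1, which is why a "dynamic" key does not require a dynamic partition count.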

Now you can process all messages in your system. If traffic grows, you can add more Kafka brokers and use the replication tools to reassign leaders / replicas for partitions. If traffic grows more than tenfold, you will need to create new partitions.
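
One thing to keep in mind before adding partitions later: under a hash-modulo scheme, changing the partition count changes which partition a given key maps to. The sketch below illustrates this under the same assumptions as the formula above; `partition_for`, `OLD_COUNT`, `NEW_COUNT` and the key range are hypothetical, and `zlib.crc32` is again only a stand-in for Kafka's real hash.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    # zlib.crc32 is a deterministic stand-in, not Kafka's actual hash.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


OLD_COUNT = 10  # hypothetical partition count before the change
NEW_COUNT = 20  # hypothetical partition count after adding partitions

# Hypothetical user IDs used as partition keys.
keys = [str(user_id) for user_id in range(1000, 2000)]

# Keys whose target partition differs between the two partition counts.
moved = [
    key for key in keys
    if partition_for(key, OLD_COUNT) != partition_for(key, NEW_COUNT)
]

print(f"{len(moved)} of {len(keys)} keys map to a different partition")
```

With a uniformly distributed hash, doubling the count moves roughly half of the keys, so new messages for those users go to a different partition than their older messages. This is one more reason to over-provision partitions up front, as suggested above.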

+10