Kafka in a Kubernetes cluster - how to publish / consume messages from outside the Kubernetes cluster

  1. My Kafka is deployed and running in a Kubernetes cluster. I use this image from Docker Hub - https://hub.docker.com/r/cloudtrackinc/kubernetes-kafka/
  2. I have 3 kube nodes in my Kubernetes cluster. I have 3 Kafka instances and 3 ZooKeeper instances running, with the corresponding services zoo1, zoo2, zoo3 and kafka-1, kafka-2 and kafka-3. I can publish / consume from inside the Kubernetes cluster, but I cannot publish / consume from outside it, i.e. from an external machine that is not part of the Kubernetes cluster.
  3. I can reach the kube nodes from the external machine - basically I can ping them by name / IP.
  4. I do not use any external load balancer, but I have a DNS that can resolve both my external machine and the kube nodes.
  5. Exposing the Kafka service via NodePort or ExternalIP does not work in this case.
  6. Setting KAFKA_ADVERTISED_HOST_NAME or KAFKA_ADVERTISED_LISTENERS in the Kafka RC YML, which ultimately sets the advertised.host.name / advertised.listeners properties in server.properties, also does not help with accessing Kafka from outside the Kubernetes cluster.

Please suggest how I can publish / consume from outside the Kubernetes cluster. Many thanks!

+11
docker kubernetes apache-kafka
4 answers

I had the same issue accessing Kafka running in a k8s cluster on AWS. I managed to solve it using Kafka's listeners feature, which supports multiple interfaces since version 0.10.2.

This is how I configured the Kafka container:

  ports:
    - containerPort: 9092
    - containerPort: 9093
  env:
    - name: KAFKA_ZOOKEEPER_CONNECT
      value: "zookeeper:2181"
    - name: KAFKA_LISTENER_SECURITY_PROTOCOL_MAP
      value: "INTERNAL_PLAINTEXT:PLAINTEXT,EXTERNAL_PLAINTEXT:PLAINTEXT"
    - name: KAFKA_ADVERTISED_LISTENERS
      value: "INTERNAL_PLAINTEXT://kafka-internal-service:9092,EXTERNAL_PLAINTEXT://123.us-east-2.elb.amazonaws.com:9093"
    - name: KAFKA_LISTENERS
      value: "INTERNAL_PLAINTEXT://0.0.0.0:9092,EXTERNAL_PLAINTEXT://0.0.0.0:9093"
    - name: KAFKA_INTER_BROKER_LISTENER_NAME
      value: "INTERNAL_PLAINTEXT"

In addition, I configured two services: one internal (headless) and one external (LoadBalancer).
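
Something like the following could work for those two services - note that the names and labels here (kafka-internal-service, kafka-external-service, app: kafka) are placeholders and must match your own broker deployment, with the external service fronting the ELB hostname used in the advertised listener above:

  apiVersion: v1
  kind: Service
  metadata:
    name: kafka-internal-service     # matches the INTERNAL_PLAINTEXT advertised listener
  spec:
    clusterIP: None                  # headless: clients inside the cluster resolve broker pods directly
    selector:
      app: kafka                     # placeholder label
    ports:
      - name: internal
        port: 9092
        targetPort: 9092
  ---
  apiVersion: v1
  kind: Service
  metadata:
    name: kafka-external-service     # placeholder name for the external entry point
  spec:
    type: LoadBalancer               # on AWS this provisions the ELB referenced by EXTERNAL_PLAINTEXT
    selector:
      app: kafka
    ports:
      - name: external
        port: 9093
        targetPort: 9093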

Hope this saves people some time.

+12

I was able to solve my problem by making the following changes -

  1. Use a nodeSelector in the YML to run the Kafka POD on a specific node of the kube cluster.

  2. Set KAFKA_ADVERTISED_HOST_NAME to the hostname of the kube node that this Kafka POD is pinned to (as configured in step 1; see the sketch after the service spec below).

  3. Expose the Kafka service using NodePort and set the service port to the same value as the exposed nodePort, as shown below -

  spec:
    ports:
      - name: broker-2
        port: 30031
        targetPort: 9092
        nodePort: 30031
        protocol: TCP
    selector:
      application: kafka-2
      broker_id: "2"
    type: NodePort

Now you can access the Kafka brokers from outside the kube cluster using nodeHostName:exposedNodePort.
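
For steps 1 and 2, a POD template fragment along these lines should do - the node name kube-node-1 is only an assumption (use one of your own kube node hostnames), while the labels match the service selector above and the image is the one from the question:

  metadata:
    labels:
      application: kafka-2
      broker_id: "2"
  spec:
    nodeSelector:
      kubernetes.io/hostname: kube-node-1   # step 1: pin this broker to a known kube node
    containers:
      - name: kafka-2
        image: cloudtrackinc/kubernetes-kafka
        ports:
          - containerPort: 9092             # matches targetPort in the service above
        env:
          - name: KAFKA_ADVERTISED_HOST_NAME
            value: "kube-node-1"            # step 2: a hostname the external machine can resolve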

+6

I solved this problem using the Confluent Kafka REST proxy.

https://hub.docker.com/r/confluentinc/cp-kafka-rest/

The REST proxy documentation is here:

http://docs.confluent.io/3.1.2/kafka-rest/docs/index.html

Step A: Create a Kafka broker Docker image using the latest Kafka version

I used a custom Kafka broker image based on the same image you are using. You just need to update your cloudtrackinc image to use Kafka version 0.10.1.0, or else it will not work. Just update the Dockerfile from the cloudtrackinc image to use the latest wurstmeister Kafka image and rebuild the Docker image.

  FROM wurstmeister/kafka:0.10.1.0

I set ADVERTISED_HOST_NAME for each Kafka broker to the POD IP so that each broker gets a unique URL.

  - name: ADVERTISED_HOST_NAME
    valueFrom:
      fieldRef:
        fieldPath: status.podIP

Step B: Install the cp-kafka-rest proxy to use the Kafka broker cluster

The Kafka Rest proxy server must run in the same cluster as your Kafka broker cluster.

You need to provide at least two environment variables for the cp-kafka-rest image to run: KAFKA_REST_HOST_NAME and KAFKA_REST_ZOOKEEPER_CONNECT. You can set KAFKA_REST_HOST_NAME to use the POD IP.

  - name: KAFKA_REST_HOST_NAME
    valueFrom:
      fieldRef:
        fieldPath: status.podIP
  - name: KAFKA_REST_ZOOKEEPER_CONNECT
    value: "zookeeper-svc-1:2181,zookeeper-svc-2:2181,zookeeper-svc-3:2181"

Step C: Run Kafka REST Proxy as a Service

  spec:
    type: NodePort            # or LoadBalancer
    ports:
      - name: kafka-rest-port
        port: 8082
        protocol: TCP

You can use NodePort or LoadBalancer to expose one or more Kafka REST proxy PODs.
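
For reference, a minimal sketch of the proxy Deployment behind that service, combining the environment variables from Step B - the name, labels and image tag (taken from the docs version linked above) are assumptions to adapt, and the Step C service needs a selector matching these labels:

  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: kafka-rest-proxy
  spec:
    replicas: 1
    selector:
      matchLabels:
        app: kafka-rest-proxy
    template:
      metadata:
        labels:
          app: kafka-rest-proxy
      spec:
        containers:
          - name: kafka-rest
            image: confluentinc/cp-kafka-rest:3.1.2
            ports:
              - containerPort: 8082           # the REST endpoint exposed by the Step C service
            env:
              - name: KAFKA_REST_HOST_NAME
                valueFrom:
                  fieldRef:
                    fieldPath: status.podIP
              - name: KAFKA_REST_ZOOKEEPER_CONNECT
                value: "zookeeper-svc-1:2181,zookeeper-svc-2:2181,zookeeper-svc-3:2181"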

Pros and cons of using the Kafka REST proxy server

Pros:

  • You can easily scale the Kafka broker cluster.
  • You do not need to expose the Kafka brokers outside the cluster.
  • You can use a load balancer in front of the proxy.
  • You can use any type of client to access the Kafka cluster over HTTP (even plain curl). Very lightweight.

Cons:

  • Another component / layer on top of the Kafka cluster.
  • Consumer instances are created within the proxy server and have to be tracked by your REST client.
  • Performance is not ideal: REST instead of Kafka's native protocol, although deploying multiple proxy instances may help a bit. I would not use this setup for high-volume traffic, but for low message volumes it may be fine.

So, if you can live with the above issues, try the Kafka REST proxy.

+4

This currently seems impossible; Kafka's network architecture is rather weak on this topic. The new consumer uses a bootstrap list of brokers, which return the ZooKeeper host, but unfortunately that is on a different network, so it cannot be reached from your local client. The bad part about Kafka is that it is not possible to specify the brokers and ZooKeeper servers independently, which prevents clients from accessing the system from the outside.

We are working around this at the moment using a busybox-based pod inside the cluster, on which we installed tools for interacting with Kafka. In our case, plunger.
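
As a sketch of that workaround, a throwaway client pod inside the cluster could look like this - the image and the kafka-1:9092 address are assumptions (any image that ships the Kafka console tools and any of your in-cluster broker services will do):

  apiVersion: v1
  kind: Pod
  metadata:
    name: kafka-toolbox                      # placeholder name
  spec:
    containers:
      - name: toolbox
        image: wurstmeister/kafka:0.10.1.0   # ships kafka-console-producer.sh / kafka-console-consumer.sh
        # Keep the pod idle so you can kubectl exec into it and run the console
        # tools against an in-cluster broker service such as kafka-1:9092.
        command: ["tail", "-f", "/dev/null"]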

0
