With the development of microservice architecture, new tools are emerging that make working with microservice applications easier and more streamlined. One of these tools is Apache Kafka — a popular platform and system for stream data processing and real-time messaging. It is used by various companies around the world to build scalable message transmission systems, data analytics, and integration with microservice applications.
As a core service in application architecture, Kafka requires monitoring. Without proper monitoring, the cluster may experience failures that can lead to data loss or information leaks. Today, we will examine in detail how to organize monitoring for Apache Kafka.
Before moving on to the process of organizing monitoring and securing Kafka, let’s break down the program’s architecture.
Kafka is a distributed system consisting of several key components:
Brokers — physical or virtual servers (hosts) that receive, store, and process messages. Each broker is responsible for specific topic partitions.
Topics — logical categories where messages arrive. Topics are divided into partitions for parallel processing.
Producers — data sources or, more simply, clients that send data to topics.
Consumers — clients that read data from topics, often combined in groups for load distribution.
ZooKeeper — used to coordinate brokers; it also stores metadata and configuration. Starting from version 3.3, Kafka can work without ZooKeeper thanks to KRaft (a protocol for storing and managing metadata inside Kafka itself). The key feature of KRaft is that it removes Apache Kafka’s dependence on an external ZooKeeper service.
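For reference, here is a minimal sketch of a single-node KRaft setup, based on the standard commands shipped with the Kafka distribution and its bundled config/kraft/server.properties sample config. The paths assume the /opt/kafka layout used later in this article; the guide itself uses the classic ZooKeeper-based setup.
cd /opt/kafka
# Generate a cluster ID and format the metadata storage for KRaft mode
KAFKA_CLUSTER_ID="$(bin/kafka-storage.sh random-uuid)"
bin/kafka-storage.sh format -t "$KAFKA_CLUSTER_ID" -c config/kraft/server.properties
# Start the broker, which also acts as its own controller, with no ZooKeeper involved
bin/kafka-server-start.sh config/kraft/server.properties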
Messages in Kafka are key-value pairs written to partitions as logs. Consumers read these messages by tracking their position in the log. This architecture ensures high throughput but makes the system vulnerable to failures if monitoring and security are not given sufficient attention.
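To illustrate, the console tools shipped with Kafka can produce keyed messages and print the key and offset of every record read. A minimal sketch, assuming a broker on localhost:9092 and a hypothetical topic named demo (topic creation is shown later in the article):
# Produce key:value pairs, using ':' as the separator between key and value
bin/kafka-console-producer.sh --topic demo --bootstrap-server localhost:9092 --property parse.key=true --property key.separator=:
# Read from the beginning, printing the key and the offset of each message
bin/kafka-console-consumer.sh --topic demo --from-beginning --bootstrap-server localhost:9092 --property print.key=true --property print.offset=true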
Kafka often plays the role of a central component in the infrastructure of large applications, especially in microservice architectures. For example, it can transmit millions of events per second between multiple systems or databases. Any delay, failure, or data loss can lead to serious consequences, including financial damage. Therefore, it is necessary to build Kafka monitoring that addresses the following tasks:
Performance control. Broker performance degrades when there are delivery delays or when the broker itself is overloaded, and such problems slow down the entire data processing chain.
Data integrity control. With data integrity monitoring, it is possible to minimize problems associated with message loss, duplication, or data corruption.
Scaling planning. Monitoring helps understand when to add brokers (horizontal scaling) or increase server resources (vertical scaling).
Effective monitoring requires tracking metrics at all system levels. Let’s look at the main categories and examples.
Broker Metrics
Incoming and Outgoing Traffic. Shows how much data the broker receives and sends. If the values approach network or disk limits, this is a signal for scaling.
Request Processing Latency. The average time to process requests from clients. Growth in latency may indicate a lack of resources.
Number of Active Connections. An abnormally high number of connections may indicate an attack or incorrect client behavior.
Resource Utilization. CPU, RAM, and disk space usage.
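For a quick host-level look at the last two items (not a replacement for the full monitoring setup built later in this article), connection counts and disk usage can be checked directly on the broker host; port 9092 and /tmp/kafka-logs are the defaults assumed in this guide.
# Count established TCP connections to the broker's default listener port
ss -Htn state established '( sport = :9092 )' | wc -l
# Disk space consumed by Kafka log segments (default log.dirs is /tmp/kafka-logs)
du -sh /tmp/kafka-logs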
Topic and Partition Metrics
Log Size. The total volume of data in a topic. If it grows uncontrollably, the cleanup policy should be reviewed.
Number of Messages. Data arrival rate. Sharp spikes may indicate peak loads.
Offset. The position of the last recorded message and the position up to which consumers have read.
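Once the cluster from the practical part of this article is running, the on-disk size of a topic can also be checked with Kafka's bundled kafka-log-dirs.sh tool; new-topic1 here refers to the test topic created later.
# Show how much disk space each partition of the topic occupies on the broker
bin/kafka-log-dirs.sh --bootstrap-server localhost:9092 --describe --topic-list new-topic1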
Consumer and Producer Metrics
Consumer Lag. The lag of consumers behind producers. For example, if the lag exceeds 10,000 messages, it may mean that consumers cannot keep up with processing.
Producer Request Rate. The frequency of producer requests. A drop in this metric may signal failures on the sender side.
Fetch Latency. The time required by the consumer to fetch data. High values indicate network or broker problems.
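Consumer lag and offsets can be inspected directly with the kafka-consumer-groups.sh tool shipped with Kafka; its output includes CURRENT-OFFSET, LOG-END-OFFSET, and LAG columns per partition. The group test-group below is the one used for the load test at the end of this article.
# Show per-partition offsets and lag for a consumer group
bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group test-group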
Let’s break down how to set up Kafka monitoring in practice.
We will need one server or virtual machine with a pre-installed Linux distribution. In this article, we will use Ubuntu 24.04 as an example.
The server must meet the following requirements:
At least 4 GB of RAM. This amount is suitable only for setting up and test usage of Apache Kafka and is not intended for high-resource tasks. For more serious tasks, at least 8 GB of RAM is required.
At least a single-core processor for basic configuration. For real workloads (for example, working with large data volumes, mathematical or scientific calculations), a 4-core processor is recommended.
A public IP address, which can be rented when creating the server in the “Network” section.
The server can be created in the control panel under Cloud Servers. During setup, we recommend choosing a region with minimal ping for fast data transfer. Other parameters can be left unchanged.
The server will launch in a couple of minutes, and you will find its IP address, login, and password in the server’s dashboard.
Let’s start by installing Kafka using these steps:
Update the repository index and install the OpenJDK 11 package needed to run Kafka:
apt update && apt -y install openjdk-11-jdk
Check that Java was successfully installed by displaying its version:
java -version
If a version is returned, Java was successfully installed.
Next, use wget to download the program archive (version 3.9.1):
wget https://downloads.apache.org/kafka/3.9.1/kafka_2.13-3.9.1.tgz
Unpack the downloaded archive with the command:
tar -xvzf kafka_2.13-3.9.1.tgz
A directory named kafka_2.13-3.9.1 will appear. Move it to /opt/kafka:
mv kafka_2.13-3.9.1 /opt/kafka
Next, for convenient Kafka management, create systemd units. Let’s start with ZooKeeper. Using any text editor, create a file zookeeper.service:
nano /etc/systemd/system/zookeeper.service
Use the following content:
[Unit]
Description=Apache Zookeeper service
Requires=network.target remote-fs.target
After=network.target remote-fs.target
[Service]
Type=simple
ExecStart=/opt/kafka/bin/zookeeper-server-start.sh /opt/kafka/config/zookeeper.properties
ExecStop=/opt/kafka/bin/zookeeper-server-stop.sh
Restart=on-abnormal
[Install]
WantedBy=multi-user.target
Save changes and exit the file.
Also create a systemd file for Kafka:
nano /etc/systemd/system/kafka.service
Use this content:
[Unit]
Description=Apache Kafka Service
Requires=zookeeper.service
[Service]
Type=simple
Environment="JAVA_HOME=/usr/lib/jvm/java-1.11.0-openjdk-amd64"
ExecStart=/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties
ExecStop=/opt/kafka/bin/kafka-server-stop.sh
[Install]
WantedBy=multi-user.target
Reload the daemon configuration files with:
systemctl daemon-reload
Start ZooKeeper:
systemctl start zookeeper
Check its status:
systemctl status zookeeper
It should show active (running), indicating ZooKeeper started successfully.
Next, start Kafka:
systemctl start kafka
And also check its status:
systemctl status kafka
It should show active (running), indicating Kafka started successfully.
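Optionally, add both services to autostart so they come back up after a reboot:
systemctl enable zookeeper kafka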
Additionally, create a separate user who will be assigned as the owner of all Kafka-related files and directories:
useradd -r -m -s /bin/false kafka
Set the necessary permissions:
chown -R kafka:kafka /opt/kafka
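If you also want the services to run under this account rather than root, add User=kafka and Group=kafka to the [Service] section of both units, give the kafka user ownership of the data directories, and restart the services. A sketch, assuming the default data directories from the bundled configs:
# Default dataDir and log.dirs from the sample configs; adjust if you changed them
chown -R kafka:kafka /tmp/zookeeper /tmp/kafka-logs
systemctl daemon-reload && systemctl restart zookeeper kafka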
After both services—ZooKeeper and Kafka—have been started, let’s test Kafka’s operation.
All commands below should be run from the /opt/kafka directory:
cd /opt/kafka
Create a new topic called new-topic1:
bin/kafka-topics.sh --create --topic new-topic1 --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1
If successful, the terminal will display Created topic new-topic1.
Also list all topics in the current Kafka instance:
bin/kafka-topics.sh --list --bootstrap-server localhost:9092
The topic new-topic1 should be listed.
Next, test the producer. Launch it with:
bin/kafka-console-producer.sh --topic new-topic1 --bootstrap-server localhost:9092
Send a test message:
Hello from kafka!
Without closing the current SSH session, open a new one and go to /opt/kafka:
cd /opt/kafka
Start the consumer:
bin/kafka-console-consumer.sh --topic new-topic1 --from-beginning --bootstrap-server localhost:9092
If everything works correctly, you will see the previously sent message.
Now let’s install Prometheus, which will collect and store the metrics. Create a user named prometheus:
useradd --no-create-home --shell /bin/false prometheus
Create directories for Prometheus configuration files:
mkdir /etc/prometheus
mkdir /var/lib/prometheus
Assign the directory owner:
chown prometheus:prometheus /var/lib/prometheus
Move to the /tmp directory:
cd /tmp/
And download the program archive:
wget https://github.com/prometheus/prometheus/releases/download/v2.53.5/prometheus-2.53.5.linux-amd64.tar.gz
Unpack the downloaded archive:
tar xvfz prometheus-2.53.5.linux-amd64.tar.gz
Go into the extracted directory:
cd prometheus-2.53.5.linux-amd64
Move the console directories, the prometheus.yml config file, and the Prometheus binary, and set ownership:
mv console* /etc/prometheus
mv prometheus.yml /etc/prometheus
mv prometheus /usr/local/bin/
chown -R prometheus:prometheus /etc/prometheus
chown prometheus:prometheus /usr/local/bin/prometheus
Additionally, create a systemd unit for Prometheus:
nano /etc/systemd/system/prometheus.service
Use the following content:
[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target
[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus \
--config.file /etc/prometheus/prometheus.yml \
--storage.tsdb.path /var/lib/prometheus/ \
--web.console.templates=/etc/prometheus/consoles \
--web.console.libraries=/etc/prometheus/console_libraries
[Install]
WantedBy=multi-user.target
By default, Prometheus is only accessible from localhost. Let’s allow access from all addresses by editing the main config:
nano /etc/prometheus/prometheus.yml
At the end of the file, find the targets parameter under static_configs and replace localhost with the external IP address of your server (your IP will differ):
    static_configs:
      - targets: ["166.1.227.100:9090"]
Save and exit.
Start Prometheus, add it to autostart, and check its status:
systemctl start prometheus && systemctl enable prometheus && systemctl status prometheus
If the status shows active (running), Prometheus has started successfully.
Restart the systemd daemon and Prometheus and check the status again:
systemctl daemon-reload && systemctl restart prometheus && systemctl status prometheus
If active (running) is displayed, Prometheus is running successfully.
Now open a browser and go to the server’s IP address on port 9090 (the default Prometheus port). You should see the Prometheus web interface.
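You can also verify from the console that Prometheus is responding; the /-/healthy endpoint is part of the standard Prometheus HTTP API:
# Should print 200 if Prometheus is up
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:9090/-/healthy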
Next, install Grafana, which we will use to visualize the metrics. Install the prerequisite packages:
apt-get install -y apt-transport-https software-properties-common wget
Create a directory to store the key:
mkdir -p /etc/apt/keyrings/
Download the Grafana GPG key and save it to that directory:
wget -q -O - https://apt.grafana.com/gpg.key | gpg --dearmor | tee /etc/apt/keyrings/grafana.gpg > /dev/null
Add the repository:
echo "deb [signed-by=/etc/apt/keyrings/grafana.gpg] https://apt.grafana.com stable main" | tee -a /etc/apt/sources.list.d/grafana.list
Update the package index and install Grafana:
apt update && apt -y install grafana
Start the service with the following commands:
systemctl daemon-reload && systemctl enable grafana-server && systemctl start grafana-server
Check Grafana’s status:
systemctl status grafana-server
If it shows active (running), Grafana has started successfully.
Using the server’s IP address and port 3000 (Grafana’s default port), go to the web interface. The initial login and password for the web interface are admin / admin. On first login, the system will prompt you to set a new password for the admin user.
After authentication, the web interface will open.
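As with Prometheus, Grafana can also be checked from the console via its health endpoint:
# Returns a small JSON document with "database": "ok" when Grafana is healthy
curl -s http://localhost:3000/api/health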
JMX Exporter is a utility that collects metrics from Java applications via JMX and exposes them to monitoring systems such as Prometheus. To install JMX Exporter, perform the following steps:
Download the utility from the official repository using wget:
wget https://repo.maven.apache.org/maven2/io/prometheus/jmx/jmx_prometheus_javaagent/0.20.0/jmx_prometheus_javaagent-0.20.0.jar
Move the downloaded JAR file to the /opt/kafka/libs directory:
mv jmx_prometheus_javaagent-0.20.0.jar /opt/kafka/libs/
Open the kafka-server-start.sh file for editing:
nano /opt/kafka/bin/kafka-server-start.sh
And add the following line before the final exec command, so that the JMX Exporter java agent is loaded on port 9091 together with its rule file, which we will create shortly (note that Kafka will fail to restart until that file exists):
export KAFKA_OPTS="-javaagent:/opt/kafka/libs/jmx_prometheus_javaagent-0.20.0.jar=9091:/opt/kafka/config/sample_jmx_exporter.yml"
Save the changes and exit the file.
Restart Kafka using the commands:
systemctl daemon-reload && systemctl restart kafka
Let's proceed to configure JMX Exporter.
Go to the /opt/kafka/config directory:
cd /opt/kafka/config
Create the sample_jmx_exporter.yml file:
nano sample_jmx_exporter.yml
And use the following content:
lowercaseOutputName: true

rules:
  # Special cases and very specific rules
  - pattern: kafka.server<type=(.+), name=(.+), clientId=(.+), topic=(.+), partition=(.*)><>Value
    name: kafka_server_$1_$2
    type: GAUGE
    labels:
      clientId: "$3"
      topic: "$4"
      partition: "$5"
  - pattern: kafka.server<type=(.+), name=(.+), clientId=(.+), brokerHost=(.+), brokerPort=(.+)><>Value
    name: kafka_server_$1_$2
    type: GAUGE
    labels:
      clientId: "$3"
      broker: "$4:$5"
  - pattern: kafka.coordinator.(\w+)<type=(.+), name=(.+)><>Value
    name: kafka_coordinator_$1_$2_$3
    type: GAUGE

  # Generic per-second counters with 0-2 key/value pairs
  - pattern: kafka.(\w+)<type=(.+), name=(.+)PerSec\w*, (.+)=(.+), (.+)=(.+)><>Count
    name: kafka_$1_$2_$3_total
    type: COUNTER
    labels:
      "$4": "$5"
      "$6": "$7"
  - pattern: kafka.(\w+)<type=(.+), name=(.+)PerSec\w*, (.+)=(.+)><>Count
    name: kafka_$1_$2_$3_total
    type: COUNTER
    labels:
      "$4": "$5"
  - pattern: kafka.(\w+)<type=(.+), name=(.+)PerSec\w*><>Count
    name: kafka_$1_$2_$3_total
    type: COUNTER

  - pattern: kafka.server<type=(.+), client-id=(.+)><>([a-z-]+)
    name: kafka_server_quota_$3
    type: GAUGE
    labels:
      resource: "$1"
      clientId: "$2"
  - pattern: kafka.server<type=(.+), user=(.+), client-id=(.+)><>([a-z-]+)
    name: kafka_server_quota_$4
    type: GAUGE
    labels:
      resource: "$1"
      user: "$2"
      clientId: "$3"

  # Generic gauges with 0-2 key/value pairs
  - pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.+), (.+)=(.+)><>Value
    name: kafka_$1_$2_$3
    type: GAUGE
    labels:
      "$4": "$5"
      "$6": "$7"
  - pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.+)><>Value
    name: kafka_$1_$2_$3
    type: GAUGE
    labels:
      "$4": "$5"
  - pattern: kafka.(\w+)<type=(.+), name=(.+)><>Value
    name: kafka_$1_$2_$3
    type: GAUGE

  # Emulate Prometheus 'Summary' metrics for the exported 'Histogram's.
  #
  # Note that these are missing the '_sum' metric!
  - pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.+), (.+)=(.+)><>Count
    name: kafka_$1_$2_$3_count
    type: COUNTER
    labels:
      "$4": "$5"
      "$6": "$7"
  - pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.*), (.+)=(.+)><>(\d+)thPercentile
    name: kafka_$1_$2_$3
    type: GAUGE
    labels:
      "$4": "$5"
      "$6": "$7"
      quantile: "0.$8"
  - pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.+)><>Count
    name: kafka_$1_$2_$3_count
    type: COUNTER
    labels:
      "$4": "$5"
  - pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.*)><>(\d+)thPercentile
    name: kafka_$1_$2_$3
    type: GAUGE
    labels:
      "$4": "$5"
      quantile: "0.$6"
  - pattern: kafka.(\w+)<type=(.+), name=(.+)><>Count
    name: kafka_$1_$2_$3_count
    type: COUNTER
  - pattern: kafka.(\w+)<type=(.+), name=(.+)><>(\d+)thPercentile
    name: kafka_$1_$2_$3
    type: GAUGE
    labels:
      quantile: "0.$4"
Save the changes and exit the file.
Next, open the main Prometheus configuration file prometheus.yml for editing:
nano /etc/prometheus/prometheus.yml
We need to add the Kafka endpoint so that Prometheus can collect data. To do this, add the following block at the very bottom, where 166.1.227.100 is the external IP address of the server (do not forget to change it to your actual external IP address):
  - job_name: 'kafka'
    static_configs:
      - targets: ["166.1.227.100:9091"]
Save the changes and exit the file.
Restart Prometheus and check its status:
systemctl daemon-reload && systemctl restart prometheus && systemctl status prometheus
Next, we need to change how Kafka is started so that the JMX Exporter java agent and its configuration file are passed to the JVM.
Open the Kafka systemd file for editing:
nano /etc/systemd/system/kafka.service
And add the following line to the [Service] block:
Environment="KAFKA_OPTS=-javaagent:/opt/kafka/libs/jmx_prometheus_javaagent-0.20.0.jar=9091:/opt/kafka/config/sample_jmx_exporter.yml"
Save the changes and exit the file.
Restart Kafka and check its status:
systemctl daemon-reload && systemctl restart kafka && systemctl status kafka
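At this point the JMX Exporter agent should be serving metrics on port 9091 of the Kafka host, which can be checked before looking at Prometheus:
# The agent exposes Prometheus-format metrics over HTTP; look for kafka_* series
curl -s http://localhost:9091/metrics | grep '^kafka_' | head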
Go to the Prometheus web interface, then to the Status section, and in the dropdown menu select Targets:
A new kafka target will appear in the list.
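With the target up, broker metrics can already be queried. For example, with the rules from sample_jmx_exporter.yml above, the broker’s incoming byte rate is available under a name derived from those rules (it may differ if you change them) and can be requested through the Prometheus HTTP API:
# Per-second incoming byte rate on the broker over the last 5 minutes
curl -sG http://localhost:9090/api/v1/query --data-urlencode 'query=rate(kafka_server_brokertopicmetrics_bytesin_total[5m])'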
The final step is to connect Grafana to Prometheus so that the collected metrics can be visualized as graphs. In Grafana, go to the Connections section on the left panel, open Data sources, click Add data source, and select Prometheus. In the URL field, enter the Prometheus address (http://<server IP>:9090) and click Save & test.
If the connection to Prometheus is successful, a corresponding message will be displayed.
After we have configured monitoring, it is time to add a dashboard for visualization in Grafana.
On the left panel, go to the Dashboards section.
In the opened window, click the New button on the right and in the dropdown menu select New dashboard.
Next, go to the Import dashboard section:
Use dashboard number 11962 to add it to Grafana and click the Load button:
In the opened section, you can set a name for the dashboard. At the bottom, as the data source, select the previously added Prometheus instance:
Click the Import button.
The added dashboard currently does not show any load. Let’s simulate it ourselves.
On the server, go to the /opt/kafka directory:
cd /opt/kafka
Create a new topic named test-load:
bin/kafka-topics.sh --create --topic test-load --bootstrap-server localhost:9092 --partitions 4 --replication-factor 1
Kafka has a built-in tool, kafka-producer-perf-test.sh, which allows you to simulate message sending by a producer. Let’s launch it to create a test load:
bin/kafka-producer-perf-test.sh --topic test-load --num-records 1000000 --record-size 100 --throughput -1 --producer-props bootstrap.servers=localhost:9092
The command above will generate and send 1,000,000 messages.
Also create load on the consumer side by reading 1,000,000 messages:
bin/kafka-consumer-perf-test.sh --topic test-load --messages 1000000 --bootstrap-server localhost:9092 --group test-group
Go back to the Grafana dashboard, and you will see the graphs populate:
Monitoring Apache Kafka is a comprehensive process that requires close attention to detail. It starts with metrics collection, which can be organized using modern tools such as Prometheus and Grafana. Once metric collection is set up, the cluster’s state must be checked regularly for potential problems. Proper monitoring ensures stable operation: Apache Kafka is a powerful tool that reveals its full potential only when correctly configured and operated.