---
id: kafka
title: Kafka - Classic Collector
sidebar_label: Kafka
description: This guide provides an overview of Kafka related features and technologies.
slug: /help/docs/integrations/containers-orchestration/kafka/
canonical: https://www.sumologic.com/help/docs/integrations/containers-orchestration/kafka/
---
import useBaseUrl from '@docusaurus/useBaseUrl';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
This guide provides an overview of Kafka related features and technologies. In addition, it contains recommendations on best practices, tutorials for getting started, and troubleshooting information for common situations.
The Sumo Logic App for Kafka is a unified logs and metrics app. The app helps you to monitor the availability, performance, and resource utilization of Kafka messaging/streaming clusters. Pre-configured dashboards provide insights into the cluster status, throughput, broker operations, topics, replication, zookeepers, node resource utilization, and error logs.
This app has been tested with the following Kafka versions:
* 2.6.0
* 2.7.0
## Sample log messages
The first service in the pipeline is Telegraf. Telegraf collects metrics from Kafka. We run Telegraf as a sidecar deployment in each pod we want to collect metrics from. In other words, Telegraf runs in the same pod as the containers it monitors. Telegraf uses the [Jolokia input plugin](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/jolokia2) to obtain metrics. (For simplicity, the diagram doesn’t show the input plugins.) The injection of the Telegraf sidecar container is done by the Telegraf Operator. Prometheus pulls metrics from Telegraf and sends them to the [Sumo Logic Distribution for OpenTelemetry Collector](https://github.com/SumoLogic/sumologic-otel-collector), which enriches metadata and sends metrics to Sumo Logic.
In the logs pipeline, Sumo Logic Distribution for OpenTelemetry Collector collects logs written to standard out and forwards them to another instance of Sumo Logic Distribution for OpenTelemetry Collector, which enriches metadata and sends logs to Sumo Logic.
#### Configure Metrics Collection
Follow these steps to collect metrics from a Kubernetes environment:
1. **Set up Kubernetes Collection with the Telegraf operator**. Ensure that you are monitoring your Kubernetes clusters with the Telegraf operator **enabled**. If you are not, then follow [these instructions](/docs/send-data/collect-from-other-data-sources/collect-metrics-telegraf/install-telegraf) to do so.
2. **Add annotations on your Kafka pods**.
1. Open [this yaml file](https://sumologic-app-data.s3.amazonaws.com/Kafka/KAfka_PodAnnotations.yaml) and add the annotations mentioned there.
2. Enter values for the parameters marked with `CHANGE_ME` in the yaml file:
* `telegraf.influxdata.com/inputs`. Because Telegraf runs as a sidecar, the `urls` should always be localhost.
* In the input plugins section:
* `urls` - The URL to the Kafka server. Because Telegraf runs as a sidecar, the `urls` should always be localhost. This can be a comma-separated list to connect to multiple Kafka servers.
* In the tags sections, (`[inputs.jolokia2_agent.tags]` and `[inputs.disk.tags]`):
* `environment`. This is the deployment environment where the Kafka cluster identified by the value of servers resides. For example: dev, prod or qa. While this value is optional we highly recommend setting it.
* `messaging_cluster`. Enter a name to identify this Kafka cluster. This cluster name will be shown in the Sumo Logic dashboards.
**Do not modify the following values**, as doing so will cause the Sumo Logic app to not function correctly.
* `telegraf.influxdata.com/class: sumologic-prometheus`. This instructs the Telegraf operator what output to use. This should not be changed.
* `prometheus.io/scrape: "true"`. This ensures our Prometheus plugin will scrape the metrics.
* `prometheus.io/port: "9273"`. This tells Prometheus what ports to scrape metrics from. This should not be changed.
* `telegraf.influxdata.com/inputs`
* In the tags sections `[inputs.jolokia2_agent/diskio/disk]`
* `component: "messaging"` - This value is used by Sumo Logic apps to identify application components.
* `messaging_system: "kafka"` - This value identifies the messaging system.
For more information on other parameters that can be configured in the Telegraf agent globally, see [this doc](/docs/send-data/collect-from-other-data-sources/collect-metrics-telegraf/install-telegraf).
For more information on configuring the Jolokia input plugin for Telegraf, see [this doc](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/jolokia2).
3. Configure your Kafka Pod to use the Jolokia Telegraf Input Plugin. The Jolokia agent needs to be available to the Kafka Pods. Starting with Kubernetes 1.10.0, you can store a binary file in a [configMap](https://kubernetes.io/docs/concepts/storage/volumes/#configmap). This makes it easy to load the Jolokia jar file and make it available to your pods.
4. Download the latest version of the **Jolokia JVM-Agent** from [Jolokia](https://jolokia.org/download.html).
5. Rename the file to `jolokia.jar`.
6. Create a ConfigMap named `jolokia` from the binary file:
```bash
kubectl create configmap jolokia --from-file=jolokia.jar
```
7. Modify your Kafka Pod definition to include a volume (type [ConfigMap](https://kubernetes.io/docs/concepts/storage/volumes/#configmap)) and `volumeMounts`. Finally, update the `env` (environment variable) to start Jolokia, and apply the updated Kafka pod definition.
```yml
spec:
  volumes:
    - name: jolokia
      configMap:
        name: jolokia
  containers:
    - name: XYZ
      image: XYZ
      env:
        - name: KAFKA_OPTS
          value: "-javaagent:/opt/jolokia/jolokia.jar=port=8778,host=0.0.0.0"
      volumeMounts:
        - mountPath: "/opt/jolokia"
          name: jolokia
```
8. **Verification Step:** You can ssh to the Kafka pod and run the following commands to make sure Telegraf (and Jolokia) is scraping metrics from your Kafka Pod:
```bash
curl localhost:9273/metrics
curl http://localhost:8778/jolokia/list
echo $KAFKA_OPTS
```
The `echo $KAFKA_OPTS` command should give you the following result:
```bash
-javaagent:/opt/jolokia/jolokia.jar=port=8778,host=0.0.0.0
```
9. Make sure `jolokia.jar` exists in the `/opt/jolokia/` directory of the Kafka pod. This is an example of what a [Pod definition file](https://sumologic-app-data.s3.amazonaws.com/Kafka/Kafka_Pod_annotations_Labels_MountVolume.yaml) looks like.
10. Once this has been done, the Sumo Logic Kubernetes collection will automatically start collecting metrics from the pods having the labels and annotations defined in the previous step. Verify metrics are flowing into Sumo Logic by running the following metrics query:
```sql
component="messaging" and messaging_system="kafka"
```
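The verification steps above can also be scripted. The following is a minimal Python sketch, not part of the Sumo Logic tooling, that takes text in the Prometheus exposition format (such as the output of `curl localhost:9273/metrics`) and lists the metric names that carry the `component="messaging"` and `messaging_system="kafka"` labels the app relies on. The function name `kafka_metric_names` and the sample metric lines are hypothetical, and the parser is a simplification that only handles label values without escaped quotes.

```python
import re

# Labels the Sumo Logic Kafka app expects on every metric (from the steps above).
REQUIRED_LABELS = {"component": "messaging", "messaging_system": "kafka"}

# Matches `name{labels} value`; a simplification of the Prometheus text format.
SAMPLE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)\{(.*)\}\s+\S+')
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="([^"]*)"')


def kafka_metric_names(exposition_text):
    """Return the sorted names of metrics that carry all required labels."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = SAMPLE_RE.match(line)
        if not match:
            continue  # metric without labels, or a line this sketch cannot parse
        labels = dict(LABEL_RE.findall(match.group(2)))
        if all(labels.get(k) == v for k, v in REQUIRED_LABELS.items()):
            names.add(match.group(1))
    return sorted(names)


# Hypothetical sample of what the Telegraf endpoint might return.
SAMPLE = """\
# TYPE kafka_controller_ActiveControllerCount_Value untyped
kafka_controller_ActiveControllerCount_Value{component="messaging",messaging_system="kafka",messaging_cluster="demo"} 1
kafka_broker_disk_used_percent{component="messaging",messaging_system="kafka",messaging_cluster="demo"} 42.5
some_other_metric{component="database"} 7
"""

print(kafka_metric_names(SAMPLE))
```

An empty result for real endpoint output would suggest that the tags sections of the pod annotations were changed or not applied.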
#### Configure Logs Collection
This section explains the steps to collect Kafka logs from a Kubernetes environment.
1. **Collect Kafka logs written to standard output**. If your Kafka helm chart/pod is writing the logs to standard output then follow the steps listed below to collect the logs:
1. Apply the following labels to your Kafka pods:
`environment: "prod-CHANGE_ME"` \
`component: "messaging"` \
`messaging_system: "kafka"` \
`messaging_cluster: "kafka_prod_cluster01-CHANGE_ME"`
2. Enter values for the following parameters (marked with `CHANGE_ME` above):
* `environment`. This is the deployment environment where the Kafka cluster identified by the value of **servers** resides. For example: dev, prod or qa. While this value is optional we highly recommend setting it.
* `messaging_cluster`. Enter a name to identify this Kafka cluster. This cluster name will be shown in the Sumo Logic dashboards.
* **Do not modify the following values**, as doing so will cause the Sumo Logic app to not function correctly.
* `component: "messaging"` - This value is used by Sumo Logic apps to identify application components.
* `messaging_system: "kafka"` - This value identifies the messaging system.
* For more information on other parameters that can be configured in the Telegraf agent globally, see [this doc](/docs/send-data/collect-from-other-data-sources/collect-metrics-telegraf/install-telegraf).
3. The Sumologic-Kubernetes-Collection will automatically capture the logs from stdout and send them to Sumo Logic. For more information on deploying Sumologic-Kubernetes-Collection, see [this page](/docs/integrations/containers-orchestration/kubernetes/#collecting-metrics-and-logs-for-the-kubernetes-app).
2. **Collect Kafka logs written to log files (Optional)**. If your Kafka helm chart/pod is writing its logs to log files, you can use a [sidecar](https://github.com/SumoLogic/tailing-sidecar/tree/main/operator) to send log files to standard out. To do this:
1. Determine the location of the Kafka log file on Kubernetes. This can be determined from helm chart configurations.
2. Install the Sumo Logic [tailing sidecar operator](https://github.com/SumoLogic/tailing-sidecar/tree/main/operator#deploy-tailing-sidecar-operator).
3. Add the following annotation in addition to the existing annotations.
```yml
annotations:
  tailing-sidecar: sidecarconfig;
```
This section provides instructions for configuring log and metric collection for the Kafka app in Non-Kubernetes environments.
#### Prerequisite
Metrics collection setup can be done in two ways: [using Telegraf with an installed collector](#using-telegraf-and-installed-collector); or by using OpenTelemetry. Both methods require you to configure Jolokia JVM Agent to collect metrics:
1. Download the latest version of the **Jolokia JVM-Agent** from [Jolokia](https://jolokia.org/download.html).
1. Rename the downloaded JAR file to `jolokia-agent.jar`.
1. Save the file `jolokia-agent.jar` on your Kafka server in `/opt/kafka/libs`.
1. Configure Kafka to use Jolokia by adding the following to `kafka-server-start.sh`:
```
export JMX_PORT=9999
export RMI_HOSTNAME=0.0.0.0
export KAFKA_JMX_OPTS="-javaagent:/opt/kafka/libs/jolokia-agent.jar=port=8778,host=$RMI_HOSTNAME -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=$RMI_HOSTNAME -Dcom.sun.management.jmxremote.rmi.port=$JMX_PORT"
```
1. Restart the Kafka service.
1. Verify that you can access Jolokia on port 8778 using the following command:
```
curl http://KAFKA_SERVER_IP_ADDRESS:8778/jolokia/
```
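The same check can be done from a script. The sketch below is a hedged Python illustration rather than an official tool: it builds the agent URL used in the `curl` command above and treats a response as healthy when its JSON body has a `status` field equal to 200, which is how Jolokia responses normally report success. The helper names `jolokia_url`, `is_healthy`, and `check_jolokia` are hypothetical.

```python
import json
import urllib.request


def jolokia_url(host, port=8778):
    """Build the agent URL used in the curl command above."""
    return f"http://{host}:{port}/jolokia/"


def is_healthy(payload):
    """Return True if a decoded Jolokia response reports status 200."""
    return isinstance(payload, dict) and payload.get("status") == 200


def check_jolokia(host, port=8778, timeout=5):
    """Fetch the agent endpoint and report whether it answered successfully."""
    with urllib.request.urlopen(jolokia_url(host, port), timeout=timeout) as resp:
        return is_healthy(json.load(resp))


# Example (requires a reachable broker, so it is not run here):
# check_jolokia("KAFKA_SERVER_IP_ADDRESS")
```

Keeping the response check separate from the network call makes it possible to exercise `is_healthy` without a running broker.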
#### Using Telegraf and Installed Collector
We use Telegraf for Kafka metric collection and the Sumo Logic Installed Collector for collecting Kafka logs. The diagram below illustrates the components of the Kafka collection in a non-Kubernetes environment. Telegraf runs on the same system as Kafka, uses the Kafka Jolokia input plugin to obtain Kafka metrics, and uses the Sumo Logic output plugin to send the metrics to Sumo Logic. Kafka logs are sent to a Sumo Logic Local File Source on an Installed Collector.
This section provides instructions for configuring metrics collection for the Sumo Logic App for Kafka. Follow the instructions documented below to set up metrics collection for a given Broker in your Kafka Cluster:
#### Configure Collection of Kafka Metrics
1. Configure a Hosted Collector. To create a new Sumo Logic hosted collector, perform the steps in the [Configure a Hosted Collector](/docs/send-data/hosted-collectors/configure-hosted-collector) section of the Sumo Logic documentation.
2. Configure an HTTP Logs and Metrics Source. Create a new HTTP Logs and Metrics Source in the hosted collector created above by following [these instructions](/docs/send-data/hosted-collectors/http-source). Make a note of the **HTTP Source URL**.
3. Install Telegraf. Follow the steps in [this document](/docs/send-data/collect-from-other-data-sources/collect-metrics-telegraf/install-telegraf) to install Telegraf on each Kafka Broker node.
4. Configure and start Telegraf. Create or modify the `telegraf.conf` file in `/etc/telegraf/telegraf.d` and copy and paste the text [from this file](https://sumologic-app-data.s3.amazonaws.com/Kafka/config_telegraf.conf).
5. Enter values for the following parameters (marked with `CHANGE_ME`) in the downloaded file:
* In the input plugins section, which is `[[inputs.jolokia2_agent]]`:
* `urls` - The URL to the Kafka server. This can be a comma-separated list to connect to multiple Kafka servers. See [this doc](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/jolokia2) for more information on additional parameters for configuring the Jolokia input plugin for Telegraf.
* In the tags sections (three in total), including `[inputs.jolokia2_agent.tags]` and `[inputs.disk.tags]`:
* `environment`. This is the deployment environment where the Kafka cluster identified by the value of `urls` parameter resides. For example: dev, prod or qa. While this value is optional we highly recommend setting it.
* `messaging_cluster`. Enter a name to identify this Kafka cluster. This cluster name will be shown in the Sumo Logic dashboards.
* In the output plugins section:
* `url` - This is the HTTP source URL created in step 2. See [this doc](/docs/send-data/collect-from-other-data-sources/collect-metrics-telegraf/configure-telegraf-output-plugin) for more information on additional parameters for configuring the Sumo Logic Telegraf output plugin.
**Do not modify these values**, as changing them will cause the Sumo Logic apps to not function correctly:
* `data_format`: "prometheus" - In the output plugins section. This indicates that metrics should be sent in the Prometheus format to Sumo Logic.
* `component`: "messaging" - In the input plugins sections. This value is used by Sumo Logic apps to identify application components.
* `messaging_system`: "kafka" - In the input plugins sections. This value identifies the messaging system.
Here is an example [telegraf.conf](https://sumologic-app-data.s3.amazonaws.com/Kafka/telegraf.conf+) file.
For more information on other properties that can be configured in the Telegraf agent globally, see [this doc](https://github.com/influxdata/telegraf/blob/master/docs/CONFIGURATION.md).
6. Restart Telegraf. Once you have finalized your telegraf.conf file, you can start or reload the telegraf service using instructions from their [doc](https://docs.influxdata.com/telegraf/v1.17/introduction/getting-started/#start-telegraf-service).
At this point, Kafka metrics should start flowing into Sumo Logic.
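Before restarting Telegraf, it can help to confirm that no `CHANGE_ME` placeholder was left behind in the configuration. The following Python sketch is one possible way to do that and is not part of Telegraf or Sumo Logic. The function name `find_placeholders` and the sample fragment are hypothetical, and the real file content comes from the download link in the steps above.

```python
def find_placeholders(config_text, marker="CHANGE_ME"):
    """Return (line_number, line) pairs for uncommented lines still containing the marker."""
    hits = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#"):  # comment line in the config; ignore it
            continue
        if marker in stripped:
            hits.append((number, stripped))
    return hits


# Hypothetical fragment used only to demonstrate the check.
SAMPLE_CONF = """\
[[inputs.jolokia2_agent]]
  urls = ["http://localhost:8778/jolokia"]
  [inputs.jolokia2_agent.tags]
    environment = "prod-CHANGE_ME"
    messaging_cluster = "kafka_prod_cluster01"
    component = "messaging"
    messaging_system = "kafka"
"""

for number, line in find_placeholders(SAMPLE_CONF):
    print(f"line {number}: {line}")
```

To check a real configuration, read the file under `/etc/telegraf/telegraf.d` into a string and pass it to `find_placeholders`. An empty list means every marked parameter has been filled in.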
#### Configure Collection of Kafka Logs on each Kafka Broker node
This section provides instructions for configuring log collection for Kafka running on a non-Kubernetes environment for the Sumo Logic App for Kafka. By default, Kafka logs are stored in a log file. Perform the steps outlined below for each Kafka Broker node.
1. Configure logging in Kafka. By default Kafka logs (server.log and controller.log) are stored in the directory: `/opt/Kafka/kafka_
### Kafka - Outlier Analysis
The **Kafka - Outlier Analysis** dashboard helps you identify outliers for key metrics across your Kafka clusters.
Use this dashboard to:
* Analyze trends and quickly discover outliers across key metrics of your Kafka clusters.
### Kafka - Replication
The Kafka - Replication dashboard helps you understand the state of replicas in your Kafka clusters.
Use this dashboard to monitor the following key metrics:
* In-Sync Replicas (ISR) Expand Rate - The ISR Expand Rate metric displays the one-minute rate of increases in the number of In-Sync Replicas (ISR). ISR expansions occur when a broker comes online, such as when recovering from a failure or adding a new node. This increases the number of in-sync replicas available for each partition on that broker. The expected value for this rate is normally zero.
* In-Sync Replicas (ISR) Shrink Rate - The ISR Shrink Rate metric displays the one-minute rate of decreases in the number of In-Sync Replicas (ISR). ISR shrinks occur when an in-sync broker goes down, as it decreases the number of in-sync replicas available for each partition replica on that broker. The expected value for this rate is normally zero.
* ISR Shrink Vs Expand Rate - A spike in ISR Shrink Rate followed by a spike in ISR Expand Rate may indicate nodes that fell behind on replication and have either recovered or are in the process of recovering.
* Failed ISR Updates
* Under Replicated Partitions Count
* Under Min ISR Partitions Count - The Under Min ISR Partitions metric displays the number of partitions where the number of In-Sync Replicas (ISR) is less than the minimum number of in-sync replicas specified. The two most common causes of under-min ISR partitions are that one or more brokers are unresponsive, or the cluster is experiencing performance issues and one or more brokers are falling behind. The expected value for this metric is normally zero.
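The relationship between the replication metrics above can be illustrated with a short Python sketch. It is an assumption-laden simplification, not how Kafka or the dashboard computes these values: the function names and the sample snapshot are hypothetical, and each partition is assigned only its most severe state, whereas in Kafka a partition below the minimum ISR is typically also counted as under-replicated.

```python
def partition_state(assigned_replicas, in_sync_replicas, min_insync_replicas):
    """Classify one partition from its replica counts.

    assigned_replicas: replicas the partition should have (replication factor)
    in_sync_replicas: current size of the ISR
    min_insync_replicas: the configured minimum ISR (min.insync.replicas)
    """
    if in_sync_replicas < min_insync_replicas:
        return "under_min_isr"
    if in_sync_replicas < assigned_replicas:
        return "under_replicated"
    return "healthy"


def count_states(partitions):
    """Tally states for an iterable of (assigned, isr, min_isr) tuples."""
    counts = {"healthy": 0, "under_replicated": 0, "under_min_isr": 0}
    for assigned, isr, min_isr in partitions:
        counts[partition_state(assigned, isr, min_isr)] += 1
    return counts


# Hypothetical cluster snapshot: three partitions with replication factor 3.
print(count_states([(3, 3, 2), (3, 2, 2), (3, 1, 2)]))
```

In a healthy cluster both non-healthy counts stay at zero, which matches the expected values described above.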
### Kafka - Zookeeper
The **Kafka - Zookeeper** dashboard provides an at-a-glance view of the state of your partitions, active controllers, leaders, throughput and network across Kafka brokers and clusters.
Use this dashboard to monitor key Zookeeper metrics such as:
* **Zookeeper disconnect rate** - This metric indicates if a Zookeeper node has lost its connection to a Kafka broker.
* **Authentication Failures** - This metric indicates a Kafka Broker is unable to connect to its Zookeeper node.
* **Session Expiration** - When the session between a Kafka broker and a Zookeeper node expires, leader changes can occur and the broker can be assigned a new controller. If this metric is increasing, we recommend that you:
1. Check the health of your network.
2. Check for garbage collection issues and tune your JVMs accordingly.
* Connection Rate.
### Kafka - Broker
The Kafka - Broker dashboard provides an at-a-glance view of the state of your partitions, active controllers, leaders, throughput, and network across Kafka brokers and clusters.
Use this dashboard to:
* Monitor under-replicated and offline partitions to quickly identify if a Kafka broker is down or overutilized.
* Monitor the Unclean Leader Election count. This metric shows the number of failures to elect a suitable leader per second. Unclean leader elections occur when there are no available in-sync replicas for a partition (due to network issues, lag causing the broker to fall behind, or brokers going down completely), so an out-of-sync replica is the only option for the leader. When an out-of-sync replica is elected leader, all data not replicated from the previous leader is lost forever.
* Monitor producer and fetch request rates.
* Monitor Log flush rate to determine the rate at which log data is written to disk.
### Kafka - Failures and Delayed Operations
The **Kafka - Failures and Delayed Operations** dashboard gives you insight into all failures and delayed operations associated with your Kafka clusters.
Use this dashboard to:
* Analyze failed produce requests - A failed produce request occurs when a problem is encountered when processing a produce request. This could be for a variety of reasons, however some common reasons are:
* The destination topic doesn’t exist (if auto-create is enabled then subsequent messages should be sent successfully).
* The message is too large.
* The producer is using `request.required.acks=all` or `-1`, and fewer than the required number of acknowledgements are received.
* Analyze failed fetch requests - A failed fetch request occurs when a problem is encountered when processing a fetch request. This could be for a variety of reasons, but the most common cause is consumer requests timing out.
* Monitor delayed operations metrics - These metrics show the number of requests that are delayed and waiting in purgatory. The purgatory size metric can be used to determine the root cause of latency. For example, increased consumer fetch times could be explained by an increased number of fetch requests waiting in purgatory. Available metrics are:
* Fetch Purgatory Size - The Fetch Purgatory Size metric shows the number of fetch requests currently waiting in purgatory. Fetch requests are added to purgatory if there is not enough data to fulfil the request (determined by fetch.min.bytes in the consumer configuration) and the requests wait in purgatory until the time specified by fetch.wait.max.ms is reached, or enough data becomes available.
* Produce Purgatory Size - The Produce Purgatory Size metric shows the number of produce requests currently waiting in purgatory. Produce requests are added to purgatory if `request.required.acks` is set to `-1` or `all`, and the requests wait in purgatory until the partition leader receives an acknowledgement from all its followers. If the purgatory size metric keeps growing, some partition replicas may be overloaded. If this is the case, you can choose to increase the capacity of your cluster, or decrease the number of produce requests being generated.
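The purgatory rules described above can be modeled in a few lines of Python. This is a conceptual sketch of the conditions stated in the text, not Kafka's broker implementation. All function names are hypothetical, and the parameters mirror the `fetch.min.bytes` and `fetch.wait.max.ms` settings mentioned above.

```python
def fetch_request_ready(available_bytes, waited_ms, fetch_min_bytes, fetch_wait_max_ms):
    """A fetch request leaves purgatory once enough data exists or the wait expires."""
    return available_bytes >= fetch_min_bytes or waited_ms >= fetch_wait_max_ms


def produce_request_ready(acks_received, followers_required):
    """With acks set to all (-1), a produce request waits for every required follower."""
    return acks_received >= followers_required


def fetch_purgatory_size(pending_fetches, fetch_min_bytes, fetch_wait_max_ms):
    """Count fetch requests, given as (available_bytes, waited_ms), still waiting."""
    return sum(
        1
        for available, waited in pending_fetches
        if not fetch_request_ready(available, waited, fetch_min_bytes, fetch_wait_max_ms)
    )


# Hypothetical pending requests: only the first one is still waiting.
pending = [(0, 100), (2048, 10), (10, 600)]
print(fetch_purgatory_size(pending, fetch_min_bytes=1024, fetch_wait_max_ms=500))
```

A purgatory size that keeps growing in this model means requests are arriving faster than either condition can release them, which is the situation the dashboard is meant to surface.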
### Kafka - Request-Response Times
The **Kafka - Request-Response Times** dashboard helps you get insight into key request and response latencies of your Kafka cluster.
Use this dashboard to:
* Monitor request time metrics - The Request Metrics metric group contains information regarding different types of request to and from the cluster. Important request metrics to monitor:
1. **Fetch Consumer Request Total Time** - The Fetch Consumer Request Total Time metric shows the maximum and mean amount of time taken for processing, and the number of requests from consumers to get new data. Reasons for increased time taken could be increased load on the node (creating processing delays), or requests being held in purgatory for a long time (determined by the `fetch.min.bytes` and `fetch.wait.max.ms` settings).
2. **Fetch Follower Request Total Time** - The Fetch Follower Request Total Time metric displays the maximum and mean amount of time taken while processing, and the number of requests to get new data from Kafka brokers that are followers of a partition. Common causes of increased time taken are increased load on the node causing delays in processing requests, or that some partition replicas may be overloaded or temporarily unavailable.
3. **Produce Request Total Time** - The Produce Request Total Time metric displays the maximum and mean amount of time taken for processing, and the number of requests from producers to send data. Some reasons for increased time taken could be increased load on the node causing delays in processing the requests, or requests being held in purgatory for a long time (if `request.required.acks` is set to `-1` or `all`).
### Kafka - Logs
This dashboard helps you quickly analyze your Kafka error logs across all clusters.
Use this dashboard to:
* Identify critical events in your Kafka broker and controller logs.
* Examine trends to detect spikes in Error or Fatal events.
* Monitor Broker added/started and shutdown events in your cluster.
* Quickly determine patterns across all logs in a given Kafka cluster.
### Kafka Broker - Performance Overview
The **Kafka Broker - Performance Overview** dashboard helps you get an at-a-glance view of the performance and resource utilization of your Kafka brokers and their JVMs.
Use this dashboard to:
* Monitor the number of open file descriptors. If the number of open file descriptors reaches the maximum file descriptor limit, it can cause an IOException error.
* Get insight into garbage collection and its impact on CPU usage and memory.
* Examine how threads are distributed.
* Understand the behavior of class count. If class count keeps on increasing, you may have a problem with the same classes loaded by multiple classloaders.
### Kafka Broker - CPU
The **Kafka Broker - CPU** dashboard shows information about the CPU utilization of individual Broker machines.
Use this dashboard to:
* Get insights into the process and user CPU load of Kafka brokers. High CPU utilization can make Kafka flaky and can cause read/write timeouts.
### Kafka Broker - Memory
The **Kafka Broker - Memory** dashboard shows the percentage of the heap and non-heap memory used, physical and swap memory usage of your Kafka broker’s JVM.
Use this dashboard to:
* Understand how memory is used across Heap and Non-Heap memory.
* Examine physical and swap memory usage and make resource adjustments as needed.
* Examine the pending object finalization count, which can lead to excessive memory usage when high.
### Kafka Broker - Disk Usage
The **Kafka Broker - Disk Usage** dashboard helps monitor disk usage across your Kafka Brokers.
Use this dashboard to:
* Monitor Disk Usage percentage on Kafka Brokers. This is critical as Kafka brokers use disk space to store messages for each topic. Other factors that affect disk utilization are:
1. Topic replication factor of Kafka topics.
2. Log retention settings.
* Analyze trends in disk throughput and find any spikes. This is especially important as disk throughput can be a performance bottleneck.
* Monitor inodes used, bytes used, and disk reads vs. writes. These metrics are important to monitor because Kafka may not necessarily redistribute data away from a heavily occupied disk, which itself can bring Kafka down.
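To see how the replication factor and log retention settings drive disk usage, a back-of-the-envelope estimate can be sketched in Python. This is a rough illustration under stated assumptions, not a sizing tool: it assumes purely time-based retention and a steady ingest rate, and it ignores compression, index files, and size-based retention. The function name `estimated_disk_bytes` and the sample figures are hypothetical.

```python
def estimated_disk_bytes(ingest_bytes_per_sec, retention_hours, replication_factor):
    """Rough cluster-wide disk need: bytes written per second x retention x replicas."""
    retention_seconds = retention_hours * 3600
    return ingest_bytes_per_sec * retention_seconds * replication_factor


# Hypothetical workload: 1 MB/s of incoming data, 7 days of retention, 3 replicas.
need = estimated_disk_bytes(1_000_000, retention_hours=168, replication_factor=3)
print(f"{need / 1e12:.2f} TB across the cluster")
```

Because the three factors multiply, doubling either the retention period or the replication factor doubles the estimate, which is why both settings are worth reviewing when disk usage trends upward.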
### Kafka Broker - Garbage Collection
The **Kafka Broker - Garbage Collection** dashboard shows key Garbage Collector statistics like the duration of the last GC run, objects collected, threads used, and memory cleared in the last GC run of your Java virtual machine.
Use this dashboard to:
* Understand the amount of time spent in garbage collection. If this time keeps increasing, your Kafka brokers may see higher CPU usage.
* Understand the amount of memory cleared by garbage collectors across memory pools and their impact on the Heap memory.
### Kafka Broker - Threads
The **Kafka Broker - Threads** dashboard shows key insights into the usage and type of threads created in your Kafka broker JVM.
Use this dashboard to:
* Understand the dynamic behavior of the system using peak, daemon, and current threads.
* Gain insights into the memory and CPU time of the last executed thread.
### Kafka Broker - Class Loading and Compilation
The **Kafka Broker - Class Loading and Compilation** dashboard helps you get insights into the behavior of class count trends.
Use this dashboard to:
* Determine whether the class count keeps increasing, which indicates that the same classes are loaded by multiple classloaders.
* Get insights into time spent by the Java virtual machine during compilation.
### Kafka - Topic Overview
The Kafka - Topic Overview dashboard helps you quickly identify under-replicated partitions, and incoming bytes by Kafka topic, server and cluster.
Use this dashboard to:
* Monitor under replicated partitions - The Under Replicated Partitions metric displays the number of partitions that do not have enough replicas to meet the desired replication factor. A partition will also be considered under-replicated if the correct number of replicas exist, but one or more of the replicas have fallen significantly behind the partition leader. The two most common causes of under-replicated partitions are that one or more brokers are unresponsive, or the cluster is experiencing performance issues and one or more brokers have fallen behind.
This metric is tagged with cluster, server, and topic info for easy troubleshooting. The colors in the Honeycomb chart are coded as follows:
1. Green indicates there are no under-replicated partitions.
2. Red indicates a given partition is under-replicated.
### Kafka - Topic Details
The Kafka - Topic Details dashboard gives you insight into throughput, partition sizes and offsets across Kafka brokers, topics and clusters.
Use this dashboard to:
* Monitor metrics like log partition size, log start offset, and log segment count.
* Identify offline/under replicated partitions count. Partitions can be in this state on account of resource shortages or broker unavailability.
* Monitor the In Sync replica (ISR) Shrink rate. ISR shrinks occur when an in-sync broker goes down, as it decreases the number of in-sync replicas available for each partition replica on that broker.
* Monitor In Sync replica (ISR) Expand rate. ISR expansions occur when a broker comes online, such as when recovering from a failure or adding a new node. This increases the number of in-sync replicas available for each partition on that broker.
## Create monitors for Kafka app
import CreateMonitors from '../../reuse/apps/create-monitors.md';
| Kafka Metrics List |
|---|
| kafka_broker_disk_free |
| kafka_broker_disk_inodes_total |
| kafka_broker_disk_inodes_used |
| kafka_broker_disk_total |
| kafka_broker_disk_used_percent |
| kafka_broker_diskio_io_time |
| kafka_broker_diskio_iops_in_progress |
| kafka_broker_diskio_merged_reads |
| kafka_broker_diskio_merged_writes |
| kafka_broker_diskio_read_bytes |
| kafka_broker_diskio_read_time |
| kafka_broker_diskio_reads |
| kafka_broker_diskio_weighted_io_time |
| kafka_broker_diskio_write_bytes |
| kafka_broker_diskio_write_time |
| kafka_broker_diskio_writes |
| kafka_controller_ActiveControllerCount_Value |
| kafka_controller_AutoLeaderBalanceRateAndTimeMs_50thPercentile |
| kafka_controller_AutoLeaderBalanceRateAndTimeMs_75thPercentile |
| kafka_controller_AutoLeaderBalanceRateAndTimeMs_98thPercentile |
| kafka_controller_AutoLeaderBalanceRateAndTimeMs_99thPercentile |
| kafka_controller_AutoLeaderBalanceRateAndTimeMs_Count |
| kafka_controller_AutoLeaderBalanceRateAndTimeMs_FifteenMinuteRate |
| kafka_controller_AutoLeaderBalanceRateAndTimeMs_Max |
| kafka_controller_AutoLeaderBalanceRateAndTimeMs_Mean |
| kafka_controller_AutoLeaderBalanceRateAndTimeMs_Min |
| kafka_controller_AutoLeaderBalanceRateAndTimeMs_StdDev |
| kafka_controller_ControlledShutdownRateAndTimeMs_99thPercentile |
| kafka_controller_ControlledShutdownRateAndTimeMs_FiveMinuteRate |
| kafka_controller_ControlledShutdownRateAndTimeMs_Min |
| kafka_controller_ControllerChangeRateAndTimeMs_50thPercentile |
| kafka_controller_ControllerChangeRateAndTimeMs_75thPercentile |
| kafka_controller_ControllerChangeRateAndTimeMs_98thPercentile |
| kafka_controller_ControllerChangeRateAndTimeMs_99thPercentile |
| kafka_controller_ControllerChangeRateAndTimeMs_Max |
| kafka_controller_ControllerChangeRateAndTimeMs_MeanRate |
| kafka_controller_ControllerChangeRateAndTimeMs_StdDev |
| kafka_controller_ControllerShutdownRateAndTimeMs_50thPercentile |
| kafka_controller_ControllerShutdownRateAndTimeMs_75thPercentile |
| kafka_controller_ControllerShutdownRateAndTimeMs_99thPercentile |
| kafka_controller_ControllerShutdownRateAndTimeMs_Count |
| kafka_controller_ControllerShutdownRateAndTimeMs_FifteenMinuteRate |
| kafka_controller_ControllerShutdownRateAndTimeMs_Min |
| kafka_controller_ControllerShutdownRateAndTimeMs_StdDev |
| kafka_controller_EventQueueSize_Value |
| kafka_controller_EventQueueTimeMs_95thPercentile |
| kafka_controller_EventQueueTimeMs_98thPercentile |
| kafka_controller_EventQueueTimeMs_999thPercentile |
| kafka_controller_EventQueueTimeMs_Min |
| kafka_controller_GlobalPartitionCount_Value |
| kafka_controller_GlobalTopicCount_Value |
| kafka_controller_IsrChangeRateAndTimeMs_50thPercentile |
| kafka_controller_IsrChangeRateAndTimeMs_75thPercentile |
| kafka_controller_IsrChangeRateAndTimeMs_95thPercentile |
| kafka_controller_IsrChangeRateAndTimeMs_98thPercentile |
| kafka_controller_IsrChangeRateAndTimeMs_99thPercentile |
| kafka_controller_IsrChangeRateAndTimeMs_Count |
| kafka_controller_IsrChangeRateAndTimeMs_FifteenMinuteRate |
| kafka_controller_IsrChangeRateAndTimeMs_FiveMinuteRate |
| kafka_controller_LeaderAndIsrResponseReceivedRateAndTimeMs_75thPercentile |
| kafka_controller_LeaderAndIsrResponseReceivedRateAndTimeMs_95thPercentile |
| kafka_controller_LeaderAndIsrResponseReceivedRateAndTimeMs_FiveMinuteRate |
| kafka_controller_LeaderAndIsrResponseReceivedRateAndTimeMs_MeanRate |
| kafka_controller_LeaderAndIsrResponseReceivedRateAndTimeMs_Min |
| kafka_controller_LeaderAndIsrResponseReceivedRateAndTimeMs_OneMinuteRate |
| kafka_controller_LeaderElectionRateAndTimeMs_95thPercentile |
| kafka_controller_LeaderElectionRateAndTimeMs_999thPercentile |
| kafka_controller_LeaderElectionRateAndTimeMs_FifteenMinuteRate |
| kafka_controller_LeaderElectionRateAndTimeMs_Max |
| kafka_controller_LeaderElectionRateAndTimeMs_Min |
| kafka_controller_ListPartitionReassignmentRateAndTimeMs_50thPercentile |
| kafka_controller_ListPartitionReassignmentRateAndTimeMs_95thPercentile |
| kafka_controller_ListPartitionReassignmentRateAndTimeMs_999thPercentile |
| kafka_controller_ListPartitionReassignmentRateAndTimeMs_Mean |
| kafka_controller_ListPartitionReassignmentRateAndTimeMs_Min |
| kafka_controller_ListPartitionReassignmentRateAndTimeMs_OneMinuteRate |
| kafka_controller_LogDirChangeRateAndTimeMs_75thPercentile |
| kafka_controller_LogDirChangeRateAndTimeMs_999thPercentile |
| kafka_controller_LogDirChangeRateAndTimeMs_99thPercentile |
| kafka_controller_LogDirChangeRateAndTimeMs_Count |
| kafka_controller_LogDirChangeRateAndTimeMs_FifteenMinuteRate |
| kafka_controller_ManualLeaderBalanceRateAndTimeMs_50thPercentile |
| kafka_controller_ManualLeaderBalanceRateAndTimeMs_75thPercentile |
| kafka_controller_ManualLeaderBalanceRateAndTimeMs_98thPercentile |
| kafka_controller_ManualLeaderBalanceRateAndTimeMs_999thPercentile |
| kafka_controller_ManualLeaderBalanceRateAndTimeMs_FiveMinuteRate |
| kafka_controller_ManualLeaderBalanceRateAndTimeMs_Mean |
| kafka_controller_ManualLeaderBalanceRateAndTimeMs_Min |
| kafka_controller_ManualLeaderBalanceRateAndTimeMs_OneMinuteRate |
| kafka_controller_PartitionReassignmentRateAndTimeMs_50thPercentile |
| kafka_controller_PartitionReassignmentRateAndTimeMs_75thPercentile |
| kafka_controller_PartitionReassignmentRateAndTimeMs_98thPercentile |
| kafka_controller_PartitionReassignmentRateAndTimeMs_999thPercentile |
| kafka_controller_PartitionReassignmentRateAndTimeMs_99thPercentile |
| kafka_controller_PartitionReassignmentRateAndTimeMs_Count |
| kafka_controller_PartitionReassignmentRateAndTimeMs_FiveMinuteRate |
| kafka_controller_PartitionReassignmentRateAndTimeMs_Max |
| kafka_controller_PartitionReassignmentRateAndTimeMs_Mean |
| kafka_controller_PartitionReassignmentRateAndTimeMs_MeanRate |
| kafka_controller_PartitionReassignmentRateAndTimeMs_OneMinuteRate |
| kafka_controller_PreferredReplicaImbalanceCount_Value |
| kafka_controller_ReplicasIneligibleToDeleteCount_Value |
| kafka_controller_TopicChangeRateAndTimeMs_99thPercentile |
| kafka_controller_TopicChangeRateAndTimeMs_Count |
| kafka_controller_TopicChangeRateAndTimeMs_FiveMinuteRate |
| kafka_controller_TopicChangeRateAndTimeMs_Mean |
| kafka_controller_TopicChangeRateAndTimeMs_MeanRate |
| kafka_controller_TopicChangeRateAndTimeMs_Min |
| kafka_controller_TopicChangeRateAndTimeMs_StdDev |
| kafka_controller_TopicDeletionRateAndTimeMs_75thPercentile |
| kafka_controller_TopicDeletionRateAndTimeMs_95thPercentile |
| kafka_controller_TopicDeletionRateAndTimeMs_98thPercentile |
| kafka_controller_TopicDeletionRateAndTimeMs_Count |
| kafka_controller_TopicDeletionRateAndTimeMs_FifteenMinuteRate |
| kafka_controller_TopicDeletionRateAndTimeMs_Max |
| kafka_controller_TopicDeletionRateAndTimeMs_OneMinuteRate |
| kafka_controller_TopicsToDeleteCount_Value |
| kafka_controller_TopicUncleanLeaderElectionEnableRateAndTimeMs_98thPercentile |
| kafka_controller_TopicUncleanLeaderElectionEnableRateAndTimeMs_999thPercentile |
| kafka_controller_TopicUncleanLeaderElectionEnableRateAndTimeMs_Count |
| kafka_controller_TopicUncleanLeaderElectionEnableRateAndTimeMs_FifteenMinuteRate |
| kafka_controller_TotalQueueSize_Value |
| kafka_controller_UncleanLeaderElectionEnableRateAndTimeMs_50thPercentile |
| kafka_controller_UncleanLeaderElectionEnableRateAndTimeMs_75thPercentile |
| kafka_controller_UncleanLeaderElectionEnableRateAndTimeMs_95thPercentile |
| kafka_controller_UncleanLeaderElectionEnableRateAndTimeMs_98thPercentile |
| kafka_controller_UncleanLeaderElectionEnableRateAndTimeMs_Count |
| kafka_controller_UncleanLeaderElectionEnableRateAndTimeMs_FifteenMinuteRate |
| kafka_controller_UncleanLeaderElectionEnableRateAndTimeMs_FiveMinuteRate |
| kafka_controller_UncleanLeaderElectionEnableRateAndTimeMs_MeanRate |
| kafka_controller_UncleanLeaderElectionEnableRateAndTimeMs_Min |
| kafka_controller_UncleanLeaderElectionsPerSec_FifteenMinuteRate |
| kafka_controller_UpdateFeaturesRateAndTimeMs_MeanRate |
| kafka_controller_UpdateFeaturesRateAndTimeMs_StdDev |
| kafka_java_lang_GarbageCollector_CollectionCount |
| kafka_java_lang_GarbageCollector_CollectionTime |
| kafka_java_lang_GarbageCollector_LastGcInfo_endTime |
| kafka_java_lang_GarbageCollector_LastGcInfo_GcThreadCount |
| kafka_java_lang_GarbageCollector_LastGcInfo_id |
| kafka_java_lang_GarbageCollector_LastGcInfo_memoryUsageAfterGc_Code_Cache_max |
| kafka_java_lang_GarbageCollector_LastGcInfo_memoryUsageAfterGc_Code_Cache_used |
| kafka_java_lang_GarbageCollector_LastGcInfo_memoryUsageAfterGc_CodeHeap__non_nmethods__init |
| kafka_java_lang_GarbageCollector_LastGcInfo_memoryUsageAfterGc_CodeHeap__non_profiled_nmethods__used |
| kafka_java_lang_GarbageCollector_LastGcInfo_memoryUsageAfterGc_Compressed_Class_Space_init |
| kafka_java_lang_GarbageCollector_LastGcInfo_memoryUsageAfterGc_G1_Eden_Space_committed |
| kafka_java_lang_GarbageCollector_LastGcInfo_memoryUsageAfterGc_G1_Eden_Space_init |
| kafka_java_lang_GarbageCollector_LastGcInfo_memoryUsageAfterGc_G1_Eden_Space_max |
| kafka_java_lang_GarbageCollector_LastGcInfo_memoryUsageAfterGc_G1_Old_Gen_committed |
| kafka_java_lang_GarbageCollector_LastGcInfo_memoryUsageAfterGc_G1_Old_Gen_used |
| kafka_java_lang_GarbageCollector_LastGcInfo_memoryUsageAfterGc_G1_Survivor_Space_init |
| kafka_java_lang_GarbageCollector_LastGcInfo_memoryUsageAfterGc_G1_Survivor_Space_used |
| kafka_java_lang_GarbageCollector_LastGcInfo_memoryUsageBeforeGc_Code_Cache_init |
| kafka_java_lang_GarbageCollector_LastGcInfo_memoryUsageBeforeGc_Code_Cache_max |
| kafka_java_lang_GarbageCollector_LastGcInfo_memoryUsageBeforeGc_CodeHeap__non_nmethods__committed |
| kafka_java_lang_GarbageCollector_LastGcInfo_memoryUsageBeforeGc_CodeHeap__profiled_nmethods__used |
| kafka_java_lang_GarbageCollector_LastGcInfo_memoryUsageBeforeGc_Compressed_Class_Space_used |
| kafka_java_lang_GarbageCollector_LastGcInfo_memoryUsageBeforeGc_G1_Eden_Space_committed |
| kafka_java_lang_GarbageCollector_LastGcInfo_memoryUsageBeforeGc_G1_Eden_Space_init |
| kafka_java_lang_GarbageCollector_LastGcInfo_memoryUsageBeforeGc_G1_Eden_Space_max |
| kafka_java_lang_GarbageCollector_LastGcInfo_memoryUsageBeforeGc_G1_Old_Gen_committed |
| kafka_java_lang_GarbageCollector_LastGcInfo_memoryUsageBeforeGc_G1_Old_Gen_init |
| kafka_java_lang_GarbageCollector_LastGcInfo_memoryUsageBeforeGc_G1_Old_Gen_used |
| kafka_java_lang_GarbageCollector_LastGcInfo_memoryUsageBeforeGc_G1_Survivor_Space_max |
| kafka_java_lang_GarbageCollector_LastGcInfo_memoryUsageBeforeGc_Metaspace_used |
| kafka_java_lang_GarbageCollector_LastGcInfo_startTime |
| kafka_java_lang_Memory_HeapMemoryUsage_committed |
| kafka_java_lang_Memory_HeapMemoryUsage_init |
| kafka_java_lang_Memory_HeapMemoryUsage_used |
| kafka_java_lang_MemoryPool_CollectionUsage_committed |
| kafka_java_lang_MemoryPool_CollectionUsage_init |
| kafka_java_lang_MemoryPool_CollectionUsage_max |
| kafka_java_lang_MemoryPool_CollectionUsage_used |
| kafka_java_lang_MemoryPool_CollectionUsageThresholdSupported |
| kafka_java_lang_MemoryPool_PeakUsage_committed |
| kafka_java_lang_MemoryPool_PeakUsage_init |
| kafka_java_lang_MemoryPool_PeakUsage_max |
| kafka_java_lang_MemoryPool_PeakUsage_used |
| kafka_java_lang_MemoryPool_Usage_committed |
| kafka_java_lang_MemoryPool_Usage_init |
| kafka_java_lang_MemoryPool_Usage_max |
| kafka_java_lang_MemoryPool_Usage_used |
| kafka_java_lang_MemoryPool_UsageThresholdSupported |
| kafka_java_lang_OperatingSystem_CommittedVirtualMemorySize |
| kafka_java_lang_OperatingSystem_FreePhysicalMemorySize |
| kafka_java_lang_OperatingSystem_MaxFileDescriptorCount |
| kafka_java_lang_OperatingSystem_ProcessCpuTime |
| kafka_java_lang_OperatingSystem_TotalSwapSpaceSize |
| kafka_java_lang_Runtime_BootClassPathSupported |
| kafka_java_lang_Threading_CurrentThreadCpuTime |
| kafka_java_lang_Threading_SynchronizerUsageSupported |
| kafka_java_lang_Threading_ThreadAllocatedMemoryEnabled |
| kafka_java_lang_Threading_ThreadAllocatedMemorySupported |
| kafka_java_lang_Threading_ThreadCpuTimeEnabled |
| kafka_network_ResponseQueueSizeValue |
| kafka_partition_LogEndOffset |
| kafka_partition_LogStartOffset |
| kafka_partition_NumLogSegments |
| kafka_partition_Size |
| kafka_partition_UnderReplicatedPartitions |
| kafka_purgatory_Heartbeat_NumDelayedOperations |
| kafka_purgatory_Produce_NumDelayedOperations |
| kafka_purgatory_Produce_PurgatorySize |
| kafka_purgatory_Rebalance_NumDelayedOperations |
| kafka_purgatory_topic_NumDelayedOperations |
| kafka_purgatory_topic_PurgatorySize |
| kafka_replica_manager_FailedIsrUpdatesPerSec_Count |
| kafka_replica_manager_FailedIsrUpdatesPerSec_MeanRate |
| kafka_replica_manager_FailedIsrUpdatesPerSec_OneMinuteRate |
| kafka_replica_manager_IsrExpandsPerSec_FifteenMinuteRate |
| kafka_replica_manager_IsrExpandsPerSec_FiveMinuteRate |
| kafka_replica_manager_IsrExpandsPerSec_MeanRate |
| kafka_replica_manager_IsrShrinksPerSec_MeanRate |
| kafka_replica_manager_LeaderCount_Value |
| kafka_replica_manager_PartitionCount_Value |
| kafka_replica_manager_ReassigningPartitions_Value |
| kafka_replica_manager_UnderMinIsrPartitionCount_Value |
| kafka_replica_manager_UnderReplicatedPartitions_Value |
| kafka_request_handlers_MeanRate |
| kafka_request_LocalTimeMs_50thPercentile |
| kafka_request_LocalTimeMs_75thPercentile |
| kafka_request_LocalTimeMs_95thPercentile |
| kafka_request_LocalTimeMs_98thPercentile |
| kafka_request_LocalTimeMs_999thPercentile |
| kafka_request_LocalTimeMs_99thPercentile |
| kafka_request_LocalTimeMs_Count |
| kafka_request_LocalTimeMs_Max |
| kafka_request_LocalTimeMs_Mean |
| kafka_request_LocalTimeMs_Min |
| kafka_request_LocalTimeMs_StdDev |
| kafka_request_MessageConversionsTimeMs_50thPercentile |
| kafka_request_MessageConversionsTimeMs_75thPercentile |
| kafka_request_MessageConversionsTimeMs_95thPercentile |
| kafka_request_MessageConversionsTimeMs_98thPercentile |
| kafka_request_MessageConversionsTimeMs_99thPercentile |
| kafka_request_MessageConversionsTimeMs_Count |
| kafka_request_MessageConversionsTimeMs_Max |
| kafka_request_MessageConversionsTimeMs_Min |
| kafka_request_RemoteTimeMs_50thPercentile |
| kafka_request_RemoteTimeMs_75thPercentile |
| kafka_request_RemoteTimeMs_95thPercentile |
| kafka_request_RemoteTimeMs_98thPercentile |
| kafka_request_RemoteTimeMs_999thPercentile |
| kafka_request_RemoteTimeMs_99thPercentile |
| kafka_request_RemoteTimeMs_Count |
| kafka_request_RemoteTimeMs_Max |
| kafka_request_RemoteTimeMs_Mean |
| kafka_request_RemoteTimeMs_Min |
| kafka_request_RemoteTimeMs_StdDev |
| kafka_request_RequestBytes_50thPercentile |
| kafka_request_RequestBytes_75thPercentile |
| kafka_request_RequestBytes_95thPercentile |
| kafka_request_RequestBytes_98thPercentile |
| kafka_request_RequestBytes_999thPercentile |
| kafka_request_RequestBytes_99thPercentile |
| kafka_request_RequestBytes_Count |
| kafka_request_RequestBytes_Max |
| kafka_request_RequestBytes_Mean |
| kafka_request_RequestBytes_Min |
| kafka_request_RequestBytes_StdDev |
| kafka_request_RequestQueueTimeMs_50thPercentile |
| kafka_request_RequestQueueTimeMs_75thPercentile |
| kafka_request_RequestQueueTimeMs_95thPercentile |
| kafka_request_RequestQueueTimeMs_98thPercentile |
| kafka_request_RequestQueueTimeMs_999thPercentile |
| kafka_request_RequestQueueTimeMs_99thPercentile |
| kafka_request_RequestQueueTimeMs_Count |
| kafka_request_RequestQueueTimeMs_Max |
| kafka_request_RequestQueueTimeMs_Mean |
| kafka_request_RequestQueueTimeMs_Min |
| kafka_request_RequestQueueTimeMs_StdDev |
| kafka_request_ResponseQueueTimeMs_50thPercentile |
| kafka_request_ResponseQueueTimeMs_75thPercentile |
| kafka_request_ResponseQueueTimeMs_95thPercentile |
| kafka_request_ResponseQueueTimeMs_98thPercentile |
| kafka_request_ResponseQueueTimeMs_999thPercentile |
| kafka_request_ResponseQueueTimeMs_99thPercentile |
| kafka_request_ResponseQueueTimeMs_Count |
| kafka_request_ResponseQueueTimeMs_Max |
| kafka_request_ResponseQueueTimeMs_Mean |
| kafka_request_ResponseQueueTimeMs_Min |
| kafka_request_ResponseQueueTimeMs_StdDev |
| kafka_request_ResponseSendTimeMs_50thPercentile |
| kafka_request_ResponseSendTimeMs_75thPercentile |
| kafka_request_ResponseSendTimeMs_95thPercentile |
| kafka_request_ResponseSendTimeMs_98thPercentile |
| kafka_request_ResponseSendTimeMs_999thPercentile |
| kafka_request_ResponseSendTimeMs_99thPercentile |
| kafka_request_ResponseSendTimeMs_Count |
| kafka_request_ResponseSendTimeMs_Max |
| kafka_request_ResponseSendTimeMs_Mean |
| kafka_request_ResponseSendTimeMs_Min |
| kafka_request_ResponseSendTimeMs_StdDev |
| kafka_request_TemporaryMemoryBytes_75thPercentile |
| kafka_request_TemporaryMemoryBytes_98thPercentile |
| kafka_request_TemporaryMemoryBytes_999thPercentile |
| kafka_request_TemporaryMemoryBytes_99thPercentile |
| kafka_request_TemporaryMemoryBytes_Max |
| kafka_request_TemporaryMemoryBytes_Mean |
| kafka_request_TemporaryMemoryBytes_Min |
| kafka_request_TemporaryMemoryBytes_StdDev |
| kafka_request_ThrottleTimeMs_50thPercentile |
| kafka_request_ThrottleTimeMs_75thPercentile |
| kafka_request_ThrottleTimeMs_95thPercentile |
| kafka_request_ThrottleTimeMs_98thPercentile |
| kafka_request_ThrottleTimeMs_999thPercentile |
| kafka_request_ThrottleTimeMs_99thPercentile |
| kafka_request_ThrottleTimeMs_Count |
| kafka_request_ThrottleTimeMs_Max |
| kafka_request_ThrottleTimeMs_Mean |
| kafka_request_ThrottleTimeMs_Min |
| kafka_request_ThrottleTimeMs_StdDev |
| kafka_request_TotalTimeMs_50thPercentile |
| kafka_request_TotalTimeMs_75thPercentile |
| kafka_request_TotalTimeMs_95thPercentile |
| kafka_request_TotalTimeMs_98thPercentile |
| kafka_request_TotalTimeMs_999thPercentile |
| kafka_request_TotalTimeMs_99thPercentile |
| kafka_request_TotalTimeMs_Count |
| kafka_request_TotalTimeMs_Max |
| kafka_request_TotalTimeMs_Mean |
| kafka_request_TotalTimeMs_Min |
| kafka_request_TotalTimeMs_StdDev |
| kafka_topic_BytesInPerSec_Count |
| kafka_topic_BytesInPerSec_FiveMinuteRate |
| kafka_topic_BytesInPerSec_MeanRate |
| kafka_topic_BytesInPerSec_OneMinuteRate |
| kafka_topic_BytesOutPerSec_FiveMinuteRate |
| kafka_topic_BytesOutPerSec_MeanRate |
| kafka_topic_MessagesInPerSec_Count |
| kafka_topic_TotalFetchRequestsPerSec_FifteenMinuteRate |
| kafka_topic_TotalFetchRequestsPerSec_FiveMinuteRate |
| kafka_topic_TotalFetchRequestsPerSec_MeanRate |
| kafka_topic_TotalFetchRequestsPerSec_OneMinuteRate |
| kafka_topic_TotalProduceRequestsPerSec_Count |
| kafka_topic_TotalProduceRequestsPerSec_FifteenMinuteRate |
| kafka_topic_TotalProduceRequestsPerSec_FiveMinuteRate |
| kafka_topic_TotalProduceRequestsPerSec_MeanRate |
| kafka_topics_BytesInPerSec_Count |
| kafka_topics_BytesInPerSec_FifteenMinuteRate |
| kafka_topics_BytesInPerSec_MeanRate |
| kafka_topics_BytesInPerSec_OneMinuteRate |
| kafka_topics_BytesOutPerSec_MeanRate |
| kafka_topics_BytesOutPerSec_OneMinuteRate |
| kafka_topics_BytesRejectedPerSec_Count |
| kafka_topics_BytesRejectedPerSec_FiveMinuteRate |
| kafka_topics_BytesRejectedPerSec_MeanRate |
| kafka_topics_FailedFetchRequestsPerSec_MeanRate |
| kafka_topics_FailedProduceRequestsPerSec_FifteenMinuteRate |
| kafka_topics_FailedProduceRequestsPerSec_FiveMinuteRate |
| kafka_topics_FailedProduceRequestsPerSec_MeanRate |
| kafka_topics_FailedProduceRequestsPerSec_OneMinuteRate |
| kafka_topics_InvalidMagicNumberRecordsPerSec_FifteenMinuteRate |
| kafka_topics_InvalidMagicNumberRecordsPerSec_FiveMinuteRate |
| kafka_topics_InvalidMagicNumberRecordsPerSec_MeanRate |
| kafka_topics_InvalidMessageCrcRecordsPerSec_FifteenMinuteRate |
| kafka_topics_InvalidOffsetOrSequenceRecordsPerSec_FiveMinuteRate |
| kafka_topics_InvalidOffsetOrSequenceRecordsPerSec_MeanRate |
| kafka_topics_InvalidOffsetOrSequenceRecordsPerSec_OneMinuteRate |
| kafka_topics_MessagesInPerSec_Count |
| kafka_topics_MessagesInPerSec_FifteenMinuteRate |
| kafka_topics_MessagesInPerSec_FiveMinuteRate |
| kafka_topics_NoKeyCompactedTopicRecordsPerSec_Count |
| kafka_topics_NoKeyCompactedTopicRecordsPerSec_FifteenMinuteRate |
| kafka_topics_NoKeyCompactedTopicRecordsPerSec_FiveMinuteRate |
| kafka_topics_NoKeyCompactedTopicRecordsPerSec_MeanRate |
| kafka_topics_ProduceMessageConversionsPerSec_FifteenMinuteRate |
| kafka_topics_ProduceMessageConversionsPerSec_OneMinuteRate |
| kafka_topics_ReassignmentBytesInPerSec_Count |
| kafka_topics_ReassignmentBytesInPerSec_FifteenMinuteRate |
| kafka_topics_ReassignmentBytesInPerSec_FiveMinuteRate |
| kafka_topics_ReassignmentBytesInPerSec_MeanRate |
| kafka_topics_ReassignmentBytesInPerSec_OneMinuteRate |
| kafka_topics_ReassignmentBytesOutPerSec_Count |
| kafka_topics_ReassignmentBytesOutPerSec_FifteenMinuteRate |
| kafka_topics_ReassignmentBytesOutPerSec_MeanRate |
| kafka_topics_ReassignmentBytesOutPerSec_OneMinuteRate |
| kafka_topics_ReplicationBytesInPerSec_Count |
| kafka_topics_ReplicationBytesInPerSec_MeanRate |
| kafka_topics_ReplicationBytesOutPerSec_Count |
| kafka_topics_ReplicationBytesOutPerSec_FiveMinuteRate |
| kafka_topics_ReplicationBytesOutPerSec_MeanRate |
| kafka_topics_ReplicationBytesOutPerSec_OneMinuteRate |
| kafka_topics_TotalFetchRequestsPerSec_Count |
| kafka_topics_TotalFetchRequestsPerSec_FifteenMinuteRate |
| kafka_topics_TotalFetchRequestsPerSec_FiveMinuteRate |
| kafka_topics_TotalFetchRequestsPerSec_MeanRate |
| kafka_topics_TotalProduceRequestsPerSec_FiveMinuteRate |
| kafka_topics_TotalProduceRequestsPerSec_MeanRate |
| kafka_topics_TotalProduceRequestsPerSec_OneMinuteRate |
| kafka_zookeeper_auth_failures_FifteenMinuteRate |
| kafka_zookeeper_auth_failures_FiveMinuteRate |
| kafka_zookeeper_authentications_Count |
| kafka_zookeeper_authentications_OneMinuteRate |
| kafka_zookeeper_disconnects_FiveMinuteRate |
| kafka_zookeeper_expires_FifteenMinuteRate |
| kafka_zookeeper_expires_FiveMinuteRate |
| kafka_zookeeper_expires_MeanRate |
| kafka_zookeeper_expires_OneMinuteRate |
| kafka_zookeeper_readonly_connects_FifteenMinuteRate |
| kafka_zookeeper_readonly_connects_MeanRate |
| kafka_zookeeper_sync_connects_FifteenMinuteRate |
| kafka_zookeeper_sync_connects_MeanRate |
| kafka_zookeeper_sync_connects_OneMinuteRate |