Kubernetes monitoring - definition & overview

What is Kubernetes monitoring?

Kubernetes monitoring helps DevOps practitioners manage containerized applications as they modernize and scale. By monitoring Kubernetes clusters, you can manage your entire container infrastructure, identify issues quickly, and track uptime, cluster resource usage, and the interaction between cluster components.

Kubernetes has built-in cluster operators to monitor clusters and send alerts based on pods running. However, many DevOps and engineering teams use third-party tools for monitoring Kubernetes clusters and other applications.

Key takeaways

Kubernetes monitoring is complex as it involves hundreds of thousands or more containers — in private, cloud or hybrid environments.
Kubernetes has limited capabilities to view and collect internal logs within various clusters and containers, which is why leveraging third-party tools for Kubernetes monitoring is so important.
Kubernetes security is challenging because it is a sprawling platform composed of many different parts. Each of those components carries security risks and vulnerabilities.
There are different types of Kubernetes monitoring and metrics for measuring pods, clusters, networks, costs, security and application performance.

What is Kubernetes?

Kubernetes is an open-source container management system developed by Google and made available to the public in June 2014. A container is a virtualized environment that packages an application together with all the configuration files, libraries, binaries and dependencies needed to execute it. Kubernetes aims to make deploying and managing complex distributed systems easier for DevOps engineers and developers who want to break up an application monolith into microservices. A cluster is typically composed of a parent machine (the control plane) and multiple child machines (called nodes) that run the applications in containers.

Kubernetes architecture overview

Kubernetes, commonly called K8s, was the third container cluster manager developed by Google, after Borg and Omega. It refines their core scheduling architecture and is built around a shared persistent store. The Kubernetes application programming interface (API) processes REST operations, so it can be used like other REST APIs.

Learn more: Kubernetes observability

Why do we use Kubernetes?

The main reason to use Kubernetes is to automate container orchestration, thereby eliminating the need to perform tedious tasks, like manually starting and stopping containers or assigning containers to individual servers.

If you have a large-scale container deployment, Kubernetes (or a similar orchestration tool) is essential for making it practical to manage the environment. You can get away with managing a half-dozen container instances by hand. Beyond that point, it becomes unfeasible to manage an application environment without the automation provided by Kubernetes.

Beyond its automation benefits, Kubernetes (also called K8s) provides some other valuable features. Although Kubernetes is not a security tool, it lets you implement some security protections (using features like role-based access control and Pod Security admission, the successor to the pod security policies removed in Kubernetes 1.25) that add security to containerized application environments. K8s also makes it easy to migrate an application deployment from one infrastructure to another since its configurations and data are portable across different infrastructures.

Kubernetes adoption from 2020 Sumo Logic customer study.

Due to the complexity of managing several containers and environments, monitoring Kubernetes is essential to ensure the best application experience.

Learn more: Why consider Kubernetes

Kubernetes monitoring basics

Kubernetes is an orchestrator that manages application environments by automating tasks that human operators would otherwise have to perform manually. Those tasks include starting and stopping different infrastructure components, providing load-balancing to ensure that requests are distributed evenly across an environment, and managing the exchange of information between other parts of an application environment.

While Kubernetes is often used to orchestrate containers, it can orchestrate other application infrastructures, like virtual machines (VMs) and microservices running on bare metal (directly on physical servers, without a hypervisor).

Learn more: Get started with Kubernetes

Why is Kubernetes monitoring so complex?

Kubernetes allows companies to harness more computing power when running software applications. It automates the deployment, scheduling, and operation of application containers on clusters of machines, often hundreds or thousands of them, in private, cloud or hybrid environments. It also allows developers to create a “container-centric” environment with container images deployed on Kubernetes or integrated with a continuous integration/deployment (CI/CD) system.

When things go wrong in Kubernetes, you must navigate a complex web of dependencies, from Kubernetes to the underlying infrastructure to the application layer.

Learn more: Continuous Intelligence with Kubernetes

As a platform, K8s can be combined with other technologies for added functionality and does not limit the types of supported applications or services. Some container-based Platform-as-a-Service (PaaS) systems run on Kubernetes. K8s differs from these PaaS systems in that it is not all-inclusive and does not provide middleware, deploy source code, build an application, or have a click-to-deploy marketplace. In essence, K8s is a lower-level, more flexible orchestration platform, whereas a PaaS is usually more opinionated and bundles more features.

Kubernetes monitoring terminology

There are a few Kubernetes-specific terms that are useful to know when starting with K8s:

Kubernetes API – flexible API (can be accessed directly or with tools) with a RESTful interface through which the state of the cluster is read and updated.

Note: A Representational State Transfer (REST) API follows an architectural style for exchanging information between systems over HTTP, the same protocol used across the web.

Kubectl – command line interface for running commands.

Kubelet – an agent that uses PodSpecs to ensure containers are healthy and running according to specifications.

Note: A PodSpec is the part of a pod's configuration that describes its desired state, such as which containers to run, the images they use, and the volumes they mount.

Image – files that make up the application that runs inside the container.

Pod – a set of containers that are running on a cluster.

Cluster – parent with multiple child machines (called nodes) that run the applications in a container. You might see these explicitly described as “Kubernetes clusters.”

Node – a child machine with services to run a pod, managed by the parent component.

Note: The parent component is the Kubernetes control plane. It runs separately from your applications and includes the API server, scheduler and controllers that manage the nodes.

Minikube – a tool that runs a cluster node inside a VM on a local computer.

Controller – a control loop that ensures the desired state matches the observed state of the cluster.

Container – a package that contains everything needed to run an application, including code, runtime, libraries, and settings.

DaemonSet – ensures that all (or selected) nodes run a copy of a pod, including nodes added to the cluster later.
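
The Controller entry above describes a control loop that compares desired state with observed state. A minimal sketch of one pass of such a loop is shown below in Python. The function name, the pod names and the action strings are hypothetical and chosen only for illustration; this is not how Kubernetes controllers are actually implemented.

```python
def reconcile(desired_replicas, running_pods):
    """One pass of a simplified control loop.

    Compares the desired number of pods with the pods observed running and
    returns the actions needed to close the gap. Names are illustrative only.
    """
    observed = len(running_pods)
    actions = []
    if observed < desired_replicas:
        # Too few pods: ask for the missing ones to be started.
        actions += ["start pod"] * (desired_replicas - observed)
    elif observed > desired_replicas:
        # Too many pods: stop the surplus (here, simply the last ones listed).
        actions += [f"stop {name}" for name in running_pods[desired_replicas:]]
    return actions


if __name__ == "__main__":
    print(reconcile(3, ["web-a"]))
    print(reconcile(1, ["web-a", "web-b", "web-c"]))
```

A real controller repeats this comparison continuously, which is why the cluster drifts back toward the declared state after a failure.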

Kubernetes distributions

Kubernetes is open source. You can download the Kubernetes source code from GitHub and compile it yourself. However, installing and updating Kubernetes this way is complicated. Unless you want to build it from scratch to learn the ins and outs of the platform, or you are using a host environment where prebuilt Kubernetes distributions are unavailable, compiling Kubernetes from source is usually not worth the trouble and effort.

For most teams, using a Kubernetes distribution makes more sense. A Kubernetes distribution is a prebuilt version of Kubernetes that you can install using packages instead of compiling from source. Most Kubernetes distributions are also preconfigured to make installation and setup easier. Many come with additional tools or integrations that add functionality to the core Kubernetes platform. In this way, Kubernetes distributions are akin to Linux distributions: most people use Linux distributions that come prebuilt and preconfigured to serve different purposes (like powering desktops, servers, or networking equipment).

Popular Kubernetes distributions include Red Hat OpenShift, Rancher, Canonical's Kubernetes distribution for Ubuntu, and SUSE's CaaS platform. These distributions can be installed on-premises or on a cloud-based infrastructure that you provision yourself. There are also special Kubernetes distributions designed for different types of deployments.

In addition, all of the major public cloud providers offer hosted Kubernetes services, such as AWS EKS and Azure AKS. These cloud-based services allow you to set up a Kubernetes cluster with minimal infrastructure management, although they typically offer fewer opportunities for configuration tweaks.

The value of Kubernetes monitoring and container services

Kubernetes and container services enable software to run reliably when moved from one computing environment to another, regardless of differences between those environments. They allow application developers and IT administrators to run multiple application containers on a shared system across clusters of servers.

Every component of Kubernetes exposes its metrics in a Prometheus format. The running processes behind those components serve up the metrics on an HTTP URL.
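
To give a feel for what that format looks like, the sketch below parses a few lines of Prometheus text exposition in Python. The sample metric names and values are made up for illustration rather than captured from a real cluster, and the parser handles only the simple cases shown; in practice a Prometheus server or client library would do this work.

```python
import re

# Hypothetical sample of the Prometheus text format; names and values are
# illustrative, not output captured from a real Kubernetes component.
SAMPLE = """\
# HELP http_requests_total Total HTTP requests.
# TYPE http_requests_total counter
http_requests_total{code="200",method="get"} 1027
http_requests_total{code="500",method="get"} 3
process_open_fds 42
"""

# metric_name, optional {labels}, then the sample value
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None:
            continue  # ignore anything this simplified parser cannot read
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


if __name__ == "__main__":
    for name, labels, value in parse_metrics(SAMPLE):
        print(name, labels, value)
```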

Application containers are isolated from each other but share the host's OS kernel, and the shared parts of the operating system are read-only. In this way, all components of an application are separate from the underlying host infrastructure, which makes deploying and scaling in different cloud and on-premises environments easier.

Containers are more lightweight and use fewer resources than VMs. A container typically consists of an application, its dependencies, libraries, binaries and configuration files. A VM contains the runtime environment plus its operating system, making it more cumbersome and less portable.

A Kubernetes orchestration platform is virtualization at the OS level. It provides a virtual platform for applications to run on with OS resources called via a REST API. It is a form of microservices architecture using portable executable images containing software and its dependencies.

In the past, heavy, non-portable applications were the standard. With automated container systems like Kubernetes, applications are built with a single OS operation supporting multiple containers across different computing environments—regardless of platform. As an example, Google runs billions of containers weekly.

Learn more: Kubernetes cluster testing

Kubernetes monitoring best practices

Track the API gateway for microservices to detect application issues. API metrics are some of the best KPIs for identifying microservices issues. Request rates, call errors and latency are clear-cut metrics that highlight component degradation.
Create alerts for high disk utilization. When your application utilizes a high volume of disk space and resources, it usually indicates a problem with your application. All disk volumes should be monitored (including the root file system), and alerts should be set for 75-80% utilization.
Monitor end-user experience of Kubernetes applications. While end-user or customer experience is not measured natively in the Kubernetes platform, your Kubernetes monitoring strategy should include real-user monitoring.
Focus on cloud monitoring. Your Kubernetes applications and clusters are likely running in the cloud. Therefore, you’ll want cloud monitoring to monitor identity and access management (IAM) events, Cloud API, cost, and network performance.
Leverage Kubernetes DaemonSets. Deploy monitoring agents as DaemonSets in your Kubernetes environment. DaemonSets are workload objects responsible for running a copy of a pod on every node, so each host, including newly added ones, is prepared to provide metrics.
Use labels to manage complex clusters better. Creating an effective and easy-to-navigate label and tagging taxonomy can make it easier for your DevOps teams to identify different components.
Monitor with Kube-state-metrics (KSM). KSM provides information on different parts of Kubernetes clusters such as node storage space, pod scheduling, container restarts in pods, jobs running/succeeded/failed and node availability.
Use service discovery to connect and collect application metrics. Because all applications are scheduled dynamically through Kubernetes, you will not know in advance where they are running. Service discovery lets Kubernetes monitoring platforms find and collect metrics from constantly moving containers.
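
As one illustration of the disk utilization practice above, the following Python sketch classifies a volume against the 75-80% range. Treating 75% as a warning and 80% as critical is an assumption made for this example, as are the function names; a real deployment would express the same rule in its monitoring platform's alerting configuration.

```python
import shutil

# Assumed mapping of the 75-80% range onto two alert levels.
WARN_THRESHOLD = 0.75
CRIT_THRESHOLD = 0.80


def disk_alert_level(used_bytes, total_bytes):
    """Classify disk utilization as 'ok', 'warning' or 'critical'."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    ratio = used_bytes / total_bytes
    if ratio >= CRIT_THRESHOLD:
        return "critical"
    if ratio >= WARN_THRESHOLD:
        return "warning"
    return "ok"


def check_path(path="/"):
    """Check the volume that holds `path` (the root file system by default)."""
    usage = shutil.disk_usage(path)
    return disk_alert_level(usage.used, usage.total)


if __name__ == "__main__":
    print(check_path("."))
```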

Learn more: Sumo Logic Kubernetes Monitoring Solution

Kubernetes monitoring metrics

After deploying Kubernetes, you can track several categories of performance metrics, which include:

Resource utilization metrics
Cluster status information
Kubernetes log data

Learn more: Advanced Kubernetes metrics

Kubernetes logs

To measure and monitor Kubernetes workloads, you need to review the log outputs of this orchestration engine. Kubernetes has limited capabilities to view and collect internal logs within various clusters and containers.

The most basic form of logging in Kubernetes is the output generated by individual containers using stdout and stderr. The output for the currently running container instance can be accessed via the kubectl logs command.
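
The sketch below shows one way to assemble and run that command from Python. The helper names are hypothetical; the flags used (--namespace, --container, --previous and --tail) are standard kubectl logs options. Running fetch_logs assumes kubectl is installed and configured for a cluster.

```python
import subprocess


def kubectl_logs_args(pod, namespace="default", container=None,
                      previous=False, tail=None):
    """Build the argument list for a `kubectl logs` call."""
    args = ["kubectl", "logs", pod, "--namespace", namespace]
    if container:
        args += ["--container", container]  # pick one container in the pod
    if previous:
        args.append("--previous")           # logs from the prior instance
    if tail is not None:
        args.append(f"--tail={tail}")       # only the last N lines
    return args


def fetch_logs(pod, **options):
    """Run kubectl and return the captured stdout/stderr output of the pod.

    Requires kubectl on the PATH and access to a cluster.
    """
    result = subprocess.run(kubectl_logs_args(pod, **options),
                            capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    print(" ".join(kubectl_logs_args("web-0", tail=50)))
```

The --previous option is useful after a crash, because the default output only covers the currently running container instance.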

Read the article below to learn how Kubernetes logging is structured, how to use its native functionality, and how to use a third-party logging engine to get more out of the logs generated within a Kubernetes environment.

Learn more: Kubernetes logs

Types of Kubernetes monitoring

Kubernetes monitoring is a proactive method of reporting on clusters and Kubernetes containers. Monitoring these clusters involves tracking resource utilization, memory, CPU, storage and more. The article below offers a crash course on what to monitor.

Learn more: Kubernetes monitoring

Kubernetes pod monitoring

A pod is the smallest execution unit in Kubernetes and typically contains a single application. When a pod fails, Kubernetes will usually create a replacement to continue operations, provided the pod is managed by a controller such as a Deployment. By limiting pods to a single process, you can easily monitor the health of each process running in the cluster.

Learn how to set up Kubernetes anywhere—on-premises, AWS, Azure, and GCP.

You can monitor a Kubernetes pod by running the kubectl get pods command and checking the STATUS column. This will show you the running state of pods in your Kubernetes clusters.
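
For scripted checks, the same information is available as JSON from kubectl get pods -o json. The Python sketch below summarizes pods by their status.phase field. The sample document is a hypothetical, heavily trimmed stand-in for real output, and note that the STATUS column can show more detailed reasons than the phase alone.

```python
import json
from collections import Counter

# Hypothetical, trimmed stand-in for `kubectl get pods -o json` output.
SAMPLE_POD_LIST = json.dumps({
    "items": [
        {"metadata": {"name": "web-1"}, "status": {"phase": "Running"}},
        {"metadata": {"name": "web-2"}, "status": {"phase": "Running"}},
        {"metadata": {"name": "job-1"}, "status": {"phase": "Pending"}},
    ]
})


def count_pod_phases(pod_list_json):
    """Count pods by phase (Pending, Running, Succeeded, Failed, Unknown)."""
    pods = json.loads(pod_list_json).get("items", [])
    return Counter(pod.get("status", {}).get("phase", "Unknown") for pod in pods)


def pods_needing_attention(pod_list_json):
    """Names of pods whose phase is neither Running nor Succeeded."""
    pods = json.loads(pod_list_json).get("items", [])
    return [
        pod["metadata"]["name"]
        for pod in pods
        if pod.get("status", {}).get("phase") not in ("Running", "Succeeded")
    ]


if __name__ == "__main__":
    print(dict(count_pod_phases(SAMPLE_POD_LIST)))
    print(pods_needing_attention(SAMPLE_POD_LIST))
```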

The most common Kubernetes pod monitoring metrics are:

Number of instances of a pod
Expected instances of a pod
In-progress deployments
Health checks
Network data and usage
Memory usage on containers
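
The first three metrics in this list can be derived from a Deployment object. The Python sketch below compares expected and ready instances using the spec.replicas, status.readyReplicas and status.updatedReplicas fields. The function name and the sample object are hypothetical, and the rollout check is a simplification of what kubectl rollout status evaluates.

```python
def deployment_health(deployment):
    """Summarize expected vs. ready pod instances for a Deployment dict."""
    spec = deployment.get("spec", {})
    status = deployment.get("status", {})
    expected = spec.get("replicas", 1)       # Kubernetes defaults to 1
    ready = status.get("readyReplicas", 0)   # field is omitted when zero
    updated = status.get("updatedReplicas", 0)
    return {
        "expected": expected,
        "ready": ready,
        "missing": max(expected - ready, 0),
        # Simplified: a rollout is treated as complete once every expected
        # replica has been updated and reports ready.
        "fully_rolled_out": updated >= expected and ready >= expected,
    }


if __name__ == "__main__":
    # Hypothetical Deployment, trimmed to the fields used above.
    sample = {
        "spec": {"replicas": 3},
        "status": {"readyReplicas": 2, "updatedReplicas": 3},
    }
    print(deployment_health(sample))
```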

Using Sumo Logic, you can get instant access to performance metrics, logs, traces, Kubernetes system events, as well as Kubernetes security events. Sumo Logic allows you to monitor and troubleshoot your applications in Kubernetes using an intuitive mental model of Kubernetes hierarchies, instead of the server-based focus.

Kubernetes cluster monitoring

Kubernetes cluster monitoring aims to review the health of the entire Kubernetes cluster. DevOps staff typically want to know if all nodes in clusters are working as expected, what capacity they’re operating at, the resource utilization of each cluster, and the number of applications running on each node.

Kubernetes cluster monitoring metrics include:

Node resource utilization
- Disk utilization
- Network bandwidth
- CPU
- Memory utilization
Number of nodes
- Cluster usage
- Overall cost
Running pods
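
Node resource utilization is usually reported as Kubernetes quantity strings such as 250m for CPU or 512Mi for memory. The Python sketch below converts a common subset of those suffixes and computes a utilization percentage. The helper names are hypothetical, and less common forms (for example exponent notation) are not handled.

```python
# Common memory suffixes; binary (power-of-two) and decimal (power-of-ten).
_BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}
_DECIMAL = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}


def parse_cpu(quantity):
    """Convert a CPU quantity ('250m', '2', '1500000n') to cores."""
    quantity = str(quantity)
    if quantity.endswith("m"):   # millicores
        return int(quantity[:-1]) / 1000
    if quantity.endswith("n"):   # nanocores
        return int(quantity[:-1]) / 1_000_000_000
    return float(quantity)


def parse_memory(quantity):
    """Convert a memory quantity ('512Mi', '2Gi', '500M') to bytes."""
    quantity = str(quantity)
    for suffix, factor in _BINARY.items():
        if quantity.endswith(suffix):
            return float(quantity[:-2]) * factor
    for suffix, factor in _DECIMAL.items():
        if quantity.endswith(suffix):
            return float(quantity[:-1]) * factor
    return float(quantity)       # plain bytes


def utilization_percent(used, allocatable):
    """Utilization as a percentage, rounded to one decimal place."""
    return round(100 * used / allocatable, 1)


if __name__ == "__main__":
    print(utilization_percent(parse_cpu("500m"), parse_cpu("2")))
    print(utilization_percent(parse_memory("512Mi"), parse_memory("2Gi")))
```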

Kubernetes cost monitoring

Kubernetes differs from traditional SaaS platforms and services because workloads vary widely and costs can be variable.

With Kubernetes, you can monitor cluster costs through the following five components:

CPU utilization and memory
Load balancing
Persistent storage
Common services
Cluster management fees

While managing Kubernetes costs can be challenging, here are some best practices for Kubernetes cost monitoring:

Ensure you’re using the right size nodes
Ensure you’re using the correct size pods
Leverage autoscaling and downscaling
Take advantage of discounts from your cloud provider, such as AWS
Use spot instances to run some Kubernetes workloads
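
As a rough illustration of right-sizing pods, the Python sketch below compares a pod's CPU request with its observed peak usage and suggests a request that covers the peak plus some headroom. The function name, the 20% default headroom and the example numbers are all assumptions for illustration, not recommended values.

```python
def suggest_cpu_request(current_request_cores, observed_peak_cores, headroom=0.2):
    """Suggest a CPU request covering the observed peak plus headroom.

    All inputs are in cores. The 20% default headroom is an arbitrary
    assumption chosen for this example.
    """
    suggested = observed_peak_cores * (1 + headroom)
    overprovisioned = max(current_request_cores - suggested, 0)
    return {
        "suggested_request": round(suggested, 3),
        "overprovisioned_cores": round(overprovisioned, 3),
    }


if __name__ == "__main__":
    # Hypothetical pod requesting 1 core while peaking at a quarter core.
    print(suggest_cpu_request(1.0, 0.25))
```

Summing the overprovisioned cores across pods gives a first estimate of how much node capacity is reserved but unused.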

Kubernetes application performance monitoring

Kubernetes is the platform of choice for site reliability engineers and platform teams to implement containerized microservices and accelerate cloud migration. However, it also introduces significant complexity and new security and operational risks. Sumo Logic provides native integrations with best-practice data sources for Kubernetes: Prometheus, Fluentd, Fluent Bit, and Falco. With the easy-to-set-up unified collection deployed using Helm, you get instant access to:

Performance metrics
Logs
Traces
Kubernetes system events
Kubernetes security events

Sumo Logic also makes upstream changes and contributions to open source projects, such as the OpenTelemetry (OTel) standard.

Kubernetes network monitoring

The network your clusters connect to is the source of one set of issues associated with the complexity of monitoring Kubernetes applications. Monitoring these issues includes:

HTTP requests (error rate and response time)
Endpoint monitoring
Endpoint transactions
Service map and health
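
The first item, HTTP error rate and response time, can be computed from request records as in the Python sketch below. Counting only 5xx responses as errors and using a nearest-rank percentile are assumptions made for this example; the function names are hypothetical.

```python
import math


def error_rate(status_codes):
    """Fraction of requests that returned a server error (5xx)."""
    if not status_codes:
        return 0.0
    errors = sum(1 for code in status_codes if code >= 500)
    return errors / len(status_codes)


def percentile(values, pct):
    """Nearest-rank percentile, e.g. pct=95 for p95 response time."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


if __name__ == "__main__":
    # Hypothetical request records: status codes and latencies in milliseconds.
    codes = [200, 200, 500, 404]
    latencies_ms = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
    print(error_rate(codes))
    print(percentile(latencies_ms, 95))
```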

Kubernetes security monitoring

Kubernetes security is challenging because it is a sprawling platform composed of many different parts. Each of those components carries its own security risks and vulnerabilities.

Here’s an overview of the key parts of a Kubernetes environment that have security risks:

Containers
Host operating systems
Container runtimes
Network layer
API
Kubectl (and other management tools)

Learn more: Kubernetes security and DevSecOps

Troubleshooting Kubernetes

Unlike a static set of physical servers or VMs, Kubernetes containers are complex to monitor because of their high volume and short lifespans. Thousands of containers may live for mere minutes while serving millions of users across hundreds of services. In addition to the containers, development teams must monitor the Kubernetes system and its many components, ensuring they are all operating as expected.

Learn more: Troubleshooting Kubernetes

Kubernetes monitoring dashboard

DevOps teams and engineers who manage your Kubernetes environments should have a dashboard to monitor them.

The link below discusses how to deploy and utilize a standard Kubernetes dashboard and its benefits.

Learn more: Kubernetes Dashboard

Kubernetes trends

Kubernetes has become the standard for container management, enabling automated deployment in hybrid, multi-cloud and on-premises environments. Here are key trends driving Kubernetes adoption:

DevSecOps
Organizations realize that integrating security into every stage of their development lifecycle is essential to secure containerized environments properly. As a result, DevSecOps patterns are becoming inseparable from modern containerized environments.

GitOps
GitOps is evolving to support multi-tenant and multi-cluster deployments, making it easy to manage tens of thousands of Kubernetes clusters running at the edge or in hybrid environments. As a result, GitOps is becoming the gold standard for continuous deployment.

Cloud migration
As more companies move to cloud-based services, hyperscalers like Amazon, Microsoft, and Google will provide new tools to simplify the move to container-native environments.

Stateful applications
There is a growing need to run stateful applications in containers as stateful processes have become more challenging to manage. New Kubernetes mechanisms will continue to emerge for stateful use cases.

AI and machine learning
The continuing rise of AI and machine learning (AI/ML) workloads will drive growing adoption of Kubernetes, which makes it easier to harness more computing power. Packaging AI/ML workloads as containers and running them as clusters on Kubernetes allows data science teams to create and consistently replicate tested environments without reconfiguring GPU support each time they run workloads.

Kubernetes alternatives and comparisons

Kubernetes is often compared to other container and deployment services.

Kubernetes monitoring integrations

Sumo Logic offers several applications that integrate with Kubernetes to protect cloud-native systems from vulnerabilities across images, containers, Kubernetes, and your running deployments.

Getting started with Kubernetes monitoring

The Kubernetes container management system allows enterprises to create an automated, virtual, microservices application platform. By using container services, organizations can build, deploy, and horizontally scale lightweight applications more efficiently across multiple types of server hosts, cloud environments, and other infrastructures.

But Kubernetes adoption comes with a fundamental challenge: how to gain comprehensive visibility into your Kubernetes applications.

Sumo Logic provides native integrations with best-practice, open-source data sources for Kubernetes, along with data collection that auto-discovers your Kubernetes architecture.

As part of our commitment to customers, Sumo Logic offers a guided onboarding workflow to get Kubernetes observability up and running in just a few clicks—without needing to know Sumo Logic’s platform in depth beforehand. Leverage instant access to performance metrics, logs, traces, Kubernetes system events and security events.

Learn more about how you can gain Kubernetes visibility in just a few clicks with Sumo Logic.
