Kubernetes is first and foremost an orchestration engine that has well-defined interfaces that allows for a wide variety of plugins and integrations to make it the industry leading platform in the battle to run the world’s workloads. From machine learning to running the applications a restaurant needs, Kubernetes has proven it can run things.
All these workloads, and the Kubernetes platform itself, produce output that is most often in the form of logs. Kubernetes has some very limited capabilities to view – and in some cases collect – its internal logs, and the logs generated by all the individual workloads it is running most often in the form of ephemeral containers. This article will cover how Kubernetes logging is structured, how to use its native functionality, and how to use a third-party logging engine to really enhance what can be done with logos generated within a Kubernetes environment.
Basic Logging Architecture and Node-Level Logging in Kubernetes
The most basic form of logging in Kubernetes is the output generated by individual containers using stdout and stderr. The output for the current running container instance is available to be accessed via the kubectl logs command.
The next level up of logging in the Kubernetes world is called node level logging. This is broken down into two components: the actual log files being stored; and the Kubernetes side, which allows the logs to be viewed remotely and removed, under certain circumstances.
The actual files are created by the container runtime engine – like Docker containerd – and contain the output from stdout and stderr. There are files for every running container on the host, and these are what Kubernetes reads when kubectl logs is run. Kubernetes is configured to know where to find these log files and how to read them through the appropriate log driver, which is specific to the container runtime.
Kubernetes has some log rotating capabilities, but it is limited to when a pod is evicted or restarted. When a pod is evicted, all logs are removed by kubelet. When a pod is restarted, kubelet keeps the current logs and the most recent version of the logs from before the restart. Any older logs are removed. This is great, but does not help keep the logs from long-running pods under control.
For any live environment with a constant stream of new log entries being generated, the reality of disk space not being infinite becomes very real the first time an application crashes due to no available space. To mitigate this, it is best practice to implement some kind of log rotation on each node that will take into account both the number of pods that potentially will be run on the node, and the disk space that is available to support logging.
While Kubernetes itself can not handle scheduled log rotation, there are many tools available that can. One of the more popular tools is logrotate, and like most other tools in the space, it can rotate based on time (like once a day), the size of the file, or a combination of both. Using size as one of the parameters makes it possible to do capacity planning to ensure there is adequate disk space to handle all the number of pods that could potentially run any given node.
There are two types of system components within Kubernetes: those that run as part of the OS, and those that run as containers managed by kubelet. As kubelet and the container runtime run as part of the operating system, their logs are consumed using the standard OS logging frameworks. As most modern Linux operating systems use systemd, all the logs are available via journalctl. In non-systemd Linux distributions these processes create “.log” files in the /var/logs/ directory.
The second type of system components that run as containers – like the schedule, api-manager, and cloud-controller-manager -– have their logs managed by the same mechanisms as any other container on any host in that Kubernetes cluster.
Cluster-Level Logging Architecture in Kubernetes
The bad news about cluster-level logging in Kubernetes is that Kubernetes has no native cluster-level logging. The good news is that there are a few proven methods that can be applied cluster-wide to provide the same effective result of all the logs being collected in a standardized way and sent to a central location.
The most widely-used methods are:
- Configure an agent on every node
- Include a sidecar that attaches to every pod
- Configure every application individually to ship its own logs
Node Logging Agent
Installing a running agent on every node, preferably as a DaemonSet in Kubernetes, but it could be at the Operating System level.
Benefits are that it requires no changes to the Kubernetes cluster and can be extended to capture other system logs. But the downfall is that it requires a container to run with elevated privileges to access the files that some environments will not be friendly too.
This actually has two options for deployment; the first being the sidecar simply diverts all the log traffic to a custom stdout file that is watched by a custom node logging agent and then shipped off. Or, the sidecar can ship traffic directly to the central logging repository.
While this option requires no changes to the individual container images, it does require changes to the deployment specification for every application that is deployed. This is a great option if you can not run containers with elevated privileges, or only want to send different applications to different logs repositories. But that flexibility comes with many more moving parts to configure than the node-agent option that needs to be watched.
Configuring the applications directly has the same benefits as the sidecar option listed above, and can potentially provide even more valuable information as the application development team can tailor what messages are being generated. The biggest downfall revolves around the fact it is upstream in the application lifecycle and, therefore, needs involvement from the development teams to ensure it is implemented. This leads to additional cross-team coordination and can increase timelines when changes are required, due to the nature of a larger group being involved in all related activities.
Viewing Logging with Kubernetes
Viewing logs with Kubernetes native tools is completely centered around the kubectl command line utility. The single-most useful piece of documentation around kubectl is the cheat sheet that is part of the official documentation, as it tracks all the options and parameters that are available through the command.
First up is the most basic commands that will get used to view the logs from a known container. You can view individual log streams (stdout or stderr, for example), but most people just view all the logs for more context.
$ kubectl logs apache-httpd-pod
10.2.1.1 - - [15/Aug/2017:21:30:32 +0000] "GET / HTTP/1.1" 200 576 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36" "127.0.0.1"
If you wish to follow the live stream of log entries (i.e., tail -f in Linux/UNIX) then add the -f flag to the above command before the pod name, and it will provide the same functionality. It would look like:
$ kubectl logs -f apache-httpd-pod
In the event that you only want to view logs that have been generated within a certain time period, then you can use the --since flag to provide that functionality. This command will pull back all log entries created in the last hour.
$ kubectl logs --since=1h apache-httpd-pod
If the pod has multiple containers, and the logs you need are from just one of the containers, then the logs command allows for further refinement by appending -c container_name to the end of the command.
$ kubectl logs apache-httpd-pod -c httpd-server
Kubernetes has the ability to group pods into namespaces for segmentation and easier applications of things like role-based access control. Because of the sheer number of namespaces and pods that can often be in a Kubernetes cluster, it is a common occurance to need to reference a pod or resource that is defined in another namespace. To access other namespaces without changing your default, you can add -n namespace_name to the beginning of a kubectl command to context switch.
$ kubectl -n f5-namespace logs nginx-pod
There are multiple other options available within logs that can be useful, including displaying logs from pods that are part of a specific deployment.
$ kubectl logs deployment/random-deployment
Beyond the logs, which have some traces in them, if you want to get more metrics to see the holistic view of the cluster and get closer to the idea of the Three Pillars of Observability, you can use additional commands like kubectl get pods to list running pods and kubectl top to see how many resources are being used by individual pods or nodes in the cluster.
Example of Creating and Collecting Application Logs in Kubernetes
To show what logging looks like in Kubernetes, we first need to create a pod that will generate logs. For this purpose we will create a simple pod using busybox that will run a continuous loop, and output the current date and time every second.
$ cat <<EOF | kubectl apply -f -
- name: count
args: [/bin/sh, -c,
'i=0; while true; do echo "$i: $(date)"; i=$((i+1)); sleep 1; done']
Will create one pod
The output from these logs will look like
$ kubectl logs counter
0: Mon Jan 1 00:00:00 UTC 2001
1: Mon Jan 1 00:00:01 UTC 2001
2: Mon Jan 1 00:00:02 UTC 2001
Next step is to configure node-level logging so we can see and ship all the logs on each node to a central server. To do this with Fluentd requires it to have a namespace, a secret with several variables set for things like log target and api keys, and finally the actual deployment of Fluentd using something like a DaemonSet so it runs on every node. Explicit details on the installation are maintained by SumoLogic on GitHub.
In the event that logs are produced outside of stdout and stderr, the pod will need to mount a local volume on the node so the logs are available outside of the running containers and then the logging-agent – in this case Fluentd – can be configured to pick up those log files. This is done by adding a new source section to the fluentd.conf file, and restarting the service. Specific details on how this would work are located in Fluentd’s documentation. This below example would pick up an Apache access_log.
Collecting Logs for Sumo Logic and Kubernetes
There are two ways to enable collection of logs for Sumo Logic. The first is via Helm, to install and configure the Kubernetes cluster directly, which will be the method recommended for most deployments using vanilla Kubernetes or an offering from a public cloud provider like EKS or GKE.
The second way is to leverage tools that may already be active in the cluster, like Prometheus. This will be the preferred when the Kubernetes cluster is in one of the distributions that target on-premise enterprise deployments, like Red Hat OpenShift, so they automatically configure advanced monitoring services as part of cluster creation.
Viewing and Managing Logs with Sumo Logic
Sumo Logic has a platform that really helps companies see all Three Pillars of Observability, which are logs, metrics, and traces. The Kubernetes application, which Sumo Logic has created for their platform, actively ingests metrics and logs into their platform from connected Kubernetes clusters so they can be processed and then visualized through both predefined and custom-made dashboards to increase transparency and expose the important information from Kubernetes – like detailed cluster health and resource utilization – in addition to building trends that allow for earlier detection of anomalies in the monitored clusters.
In addition to visualization, once the data from Kubernetes has been processed in the Sumo Logic platform, it can also be queried using Sumo Logic’s powerful query language, to make analysis easier and give the ability to correlate data from additional log sources to provide a holistic view of your infrastructure.