Logging is one of the many ways in which virtual machines fundamentally differ from Docker containers. It’s also an aspect of migrating to Docker that is easy to overlook, but one that you need to plan for if you intend to get the most out of Dockerized infrastructure and Docker logs.
In this article, I’ll explain what makes aggregating, managing and interpreting virtual machine logs so different from doing the same for Docker logs, and offer tips on how to go about revamping your logging strategy when you migrate to Docker.
Virtual Machine Logging
To start, here’s an overview of what the log aggregation and management process entails in most environments built using virtual machines.
In many respects, managing logs for virtual machines is the same as for bare-metal servers. Whether you use Windows, Linux or another operating system as the guest OS in the virtual machines, the operating system will produce its own logs. On Linux, these include syslog and cron logs. By default, Windows tracks “events” instead of generating traditional logs, although you can set up syslog for Windows if you wish.
In most cases, the applications hosted on your virtual machines will also generate logs. If you run an Apache web server or MySQL database, for example, you’ll get log files for them.
In virtual machine guest environments, viewing the system and application logs is straightforward. By default, logs are usually stored in a single location on Linux (typically /var/log), or are accessible via Event Viewer on Windows.
The only thing that makes virtual machine log management different from bare-metal logging is that, in addition to system and application logs for the virtual machines that run your apps, you also have system logs for the bare-metal servers that host your virtual machines, plus logs for the hypervisor. This is really not too complicated, however. Logs on bare-metal servers will be of the same type and stored in the same locations as they would be on your guest machines. Log files for hypervisor platforms are generally stored directly on the bare-metal host servers in locations either under the same directory as system logs (this is the case with KVM, for example) or alongside virtual machine configuration files (this is what VMware typically does).
Because the locations and types of all of your logs are predictable, using log aggregators to collect all of the logs within distributed environments is simple. Tools like Sumo Logic make it even simpler because Sumo’s Collectors are pre-configured to aggregate logs from many sources, including virtual machines.
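As a rough illustration of why predictable locations make collection simple, here is a minimal Python sketch that walks a log directory and lists the files a collector would pick up. The function name `find_log_files` and the `.log` suffix filter are assumptions made for this example; real collectors such as Sumo Logic's are configured rather than hand-written like this.

```python
from pathlib import Path


def find_log_files(log_dir="/var/log", suffix=".log"):
    """Return a sorted list of log files under log_dir, searched recursively.

    The default directory matches the usual Linux location mentioned above;
    the suffix filter is an assumption for illustration only.
    """
    root = Path(log_dir)
    if not root.is_dir():
        # Nothing to collect if the directory does not exist.
        return []
    return sorted(p for p in root.rglob(f"*{suffix}") if p.is_file())


if __name__ == "__main__":
    for path in find_log_files():
        print(path)
```

The same function works unchanged on a guest machine or on the bare-metal host, which is the point: the collector only needs to know one directory.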
Docker Container Logging
Logging gets considerably trickier when you’re working with Docker logs. This is true for several reasons.
One is that logs for Dockerized applications live inside the Docker container, and Docker containers by default do not store data persistently. As a result, any logs stored inside a container will disappear forever when that container is removed, which for short-lived containers often happens as soon as they stop, unless you take steps to move them elsewhere. With virtual machines, you don’t have this problem because most virtual machines have virtual disk images that remain intact after the virtual machine shuts down.
This makes it especially important to set up a log aggregator that can collect log data from all of your Docker applications and store it in a place where it will remain accessible when the containers stop running.
However, setting up Docker log aggregation is more complicated than it would be in virtual server environments. This is another of the major challenges to Docker logging. Because Docker application logs live inside containers, a log aggregator that has access to the Docker host can’t simply pull application logs from the containers as if they were log files on the host. The aggregator instead needs to be sophisticated enough to access the file system inside the container and collect the logs from there, or collect logging information in another way.
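To make the two options concrete, here is a small Python sketch of what "reaching into the container" can look like from the host. It builds command lines for two real Docker CLI commands: `docker cp`, which copies files out of a container's file system, and `docker logs`, which reads what the container wrote to stdout and stderr. The helper names, the container name `web` and the paths are hypothetical and chosen only for this example.

```python
import subprocess


def build_copy_command(container, src_path, dest_dir):
    """Build a `docker cp` command that copies src_path out of a container."""
    return ["docker", "cp", f"{container}:{src_path}", dest_dir]


def build_logs_command(container, timestamps=True):
    """Build a `docker logs` command that reads the container's stdout/stderr."""
    cmd = ["docker", "logs"]
    if timestamps:
        cmd.append("--timestamps")
    cmd.append(container)
    return cmd


def copy_logs_from_container(container, src_path, dest_dir):
    """Run `docker cp`; raises CalledProcessError if the docker CLI fails."""
    subprocess.run(build_copy_command(container, src_path, dest_dir), check=True)


if __name__ == "__main__":
    # "web" and both paths are placeholders for illustration.
    print(" ".join(build_copy_command("web", "/var/log/apache2", "/tmp/web-logs")))
    print(" ".join(build_logs_command("web")))
```

Either command has to be run per container, which is part of why this is more work than pointing an aggregator at a single directory on the host.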
At the same time, you still need to worry about logs on the servers (whether bare-metal or virtual) that host your Docker environment. Those logs include the server system logs, which are important for keeping the servers stable and healthy, as well as the Docker daemon log (which on most Linux distributions usually lives in /var/log or a subdirectory of this location).
Essentially, then, Docker logging means dealing with two very different types of log aggregation. One involves logs from your Dockerized applications, which live inside containers. The other involves system logs and Docker daemon logs from the host servers.
There are a few possible solutions for collecting Docker application logs. One is to use Docker’s native syslog logging driver, then manage syslog as you would normally. The disadvantage here is that the syslog logging driver essentially becomes a middleman: one more component sitting between your containers and your log aggregator. Another approach is to use a tool like Sumo Logic, which can collect Docker application log data in multiple ways: from the Docker host server, directly from containers, via an HTTP stream, from a storage volume, or from another container.
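For the syslog option, the following Python sketch builds a `docker run` command that selects the syslog logging driver. The flags `--log-driver` and `--log-opt` with the `syslog-address` and `tag` options are part of the Docker CLI; the helper name, the image name `httpd`, and the address `udp://192.0.2.10:514` are placeholders assumed for this example.

```python
def build_syslog_run_command(image, syslog_address=None, tag=None):
    """Build a `docker run` command that sends container output to syslog.

    With no syslog_address, Docker uses the local syslog daemon on the host.
    """
    cmd = ["docker", "run", "-d", "--log-driver", "syslog"]
    if syslog_address:
        # Forward to a remote syslog endpoint instead of the local daemon.
        cmd += ["--log-opt", f"syslog-address={syslog_address}"]
    if tag:
        # The tag identifies the container in each syslog message.
        cmd += ["--log-opt", f"tag={tag}"]
    cmd.append(image)
    return cmd


if __name__ == "__main__":
    # Image name and address are placeholders for illustration.
    print(" ".join(build_syslog_run_command("httpd", "udp://192.0.2.10:514", "web")))
```

Once containers are started this way, their output arrives in syslog and can be handled with whatever syslog tooling you already use for virtual machines.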
| | VM Logs | Docker Logs |
|---|---|---|
| Log file location | On host | On host and inside the container |
| Logs stored persistently? | Yes | No (for logs inside the container) |
| Common methods for log aggregation | Syslog | Syslog, HTTP endpoints, special logging containers |
While there is some overlap in the processes and tools you use for aggregating and managing logs on both Docker and virtual machine platforms like VMware or KVM, the processes are different in key respects. Simply put, Docker log management is more complicated because Docker container logs are more abstracted from log aggregation tools. This is why log aggregators that are Docker-aware and capable of automating Docker log collection are so useful.