
Brian Goleno

Posts by Brian Goleno

Blog

Docker Monitoring: A Complete Guide

Docker Monitoring: How It Works

When it comes to monitoring and logging in Docker, the recommended pathway for developers has been for the container to write to its standard output and let Docker collect the output. You then configure Docker to either store it in files or send it to syslog.

Another option is to write to a directory, so the plain log file lives in the typical /var/log location, and then share that directory with another container. In practice, when you start the first container, you declare /var/log as a “volume,” essentially a special directory that can be shared with another container. You can then run tail -f in a separate container to inspect those logs. Running tail by itself isn’t extremely exciting, but the pattern becomes much more meaningful when you run a log collector that takes those logs and ships them somewhere. The reason is that you shouldn’t have to synchronize between application and logging containers (for example, where the logging system needs Java or Node.js because it ships logs that way). The application and logging containers should not have to agree on specific dependencies and risk breaking each other’s code.

Docker Logging: The 12-Factor App

However, this isn’t the only way to log in Docker. Remember the 12-Factor App, a methodology for building SaaS applications, which recommends limiting each container to one process as a best practice, with each process running unbuffered and sending data to stdout. There have been numerous options for container logging from the pre-Docker 1.6 days forward, and some are better than others.
You could:

- Log Directly from an Application
- Install a File Collector in the Container
- Install a File Collector as a Container
- Install a Syslog Collector as a Container
- Use Host Syslog for Local Syslog
- Use a Syslog Container for Local Syslog
- Log to Stdout and Use a File Collector
- Log to Stdout and Use Logspout
- Collect from the Docker File Systems (Not Recommended)
- Inject Collector via Docker Exec

Docker Logging Drivers in Docker Engine

Docker 1.6 added three log drivers: json-file (the default, which backs the docker logs command), syslog, and none (which disables logging). The driver interface was meant to support the smallest subset available for logging drivers to implement their functionality. Stdout and stderr would still be the source of logging for containers, but Docker takes the raw streams from the containers and creates discrete messages, delimited by writes, that are then sent to the logging drivers. Version 1.7 added the ability to pass parameters to drivers, and in Docker 1.9 tags were made available to other drivers. Importantly, Docker 1.10 allows syslog to run encrypted, thus allowing companies like Sumo Logic to send data securely to the cloud.

Recent proposals include a Google Cloud Logging driver and a TCP, UDP, and Unix domain socket driver.

“As part of the Docker engine, you need to go through the engine commit protocol. This is good, because there’s a lot of review stability. But it is also suboptimal because it is not really modular, and it adds more and more dependencies on third party libraries.”

In fact, others have suggested that the drivers be external plugins, similar to how volumes and networks work. Plugins would allow developers to write custom drivers for their specific infrastructure, and would enable third-party developers to build drivers without having to get them merged upstream and wait for the next Docker release.
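To make the driver parameters concrete, here is a minimal sketch of how a deployment script might assemble a docker run command that selects a logging driver and passes options to it. The helper name docker_run_args, the image name, and the syslog endpoint are all hypothetical; only the --log-driver and --log-opt flags and the syslog-address and tag options come from Docker itself.

```python
import shlex


def docker_run_args(image, log_driver="syslog", log_opts=None):
    """Build an argv list for `docker run` with a logging driver and its options.

    Hypothetical helper for illustration; it only assembles arguments and
    does not talk to Docker.
    """
    args = ["docker", "run", "--log-driver", log_driver]
    # Sort the options so the generated command is stable across runs.
    for key, value in sorted((log_opts or {}).items()):
        args += ["--log-opt", f"{key}={value}"]
    args.append(image)
    return args


args = docker_run_args(
    "example/app:1.0",  # hypothetical image name
    log_opts={
        # Hypothetical endpoint; tcp+tls asks the syslog driver for an encrypted connection.
        "syslog-address": "tcp+tls://logs.example.com:6514",
        # Tag each message with the container name.
        "tag": "{{.Name}}",
    },
)
print(shlex.join(args))
```

Building the command as a list rather than a single string avoids quoting mistakes when it is later handed to something like subprocess.run.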
A Comprehensive Approach for Docker Monitoring and Logging

To get real value from machine-generated data, you need to look at “comprehensive monitoring.” There are five requirements to enable comprehensive monitoring.

5 Requirements of Comprehensive Monitoring

Events

Let's start with events. The Docker API makes it trivial to subscribe to the event stream. Events contain lots of interesting information. The full list is well described in the Docker API doc, but let’s just say you can track containers as they come and go, observe containers getting killed, and see other interesting stuff, such as out-of-memory situations. Docker has consistently added new events with every version, so this is a gift that will keep on giving in the future.

Think of Docker events as nothing but logs. And they are very nicely structured: it's all just JSON. If, for example, you load this into your log aggregation solution, you can track which container is running where. You can also track trends, for example, which images are run in the first place and how often they are run. Or why 10x more containers were suddenly started in this period than before, and so on. This probably doesn't matter much for personal development, but once you have fleets, this is a super juicy source of insight. Lifecycle tracking for all your containers will matter a lot.

Configurations

Docker events, among other things, allow us to see containers come and go. What if we also wanted to track the configurations of those containers? Maybe we want to track drift of run parameters, such as volume settings, or capabilities and limits. The container image is immutable, but what about the invocation? Having detailed records of container starting configurations is, in my mind, another piece of the puzzle toward total visibility. Orchestration solutions will provide those settings, sure, but who is telling those solutions what to do?
From experience, we know that deployment configurations are inevitably going to drift, and more than once we have found the root cause of otherwise inscrutable problems there. Docker allows us to use the inspect API to get the container configuration. Again, in my mental model, that's just a log. Send it to your aggregator. Alert on deviations, and use the data after the fact for troubleshooting. Docker provides this info in a clean and convenient format.

Logs

Well, obviously, it would be great to have logs, right? It turns out there are many different ways to deal with logs in Docker, and new options are being enabled by the new log driver API. Not everybody is quite there yet in 12-factor land, but then again there are workarounds for when you need fat containers and have to collect logs from files inside of containers. More and more people are following the best practice of writing logs to standard out and standard error, and it is pretty straightforward to grab those logs from the logs API and forward them from there. The Logspout approach, for example, is really neat. It uses the event API to watch which containers get started, then turns around and attaches to the log endpoint, and then pumps the logs somewhere. Easy and complete, and you have all the logs in one place for troubleshooting, analytics, and alerting.

Stats

Since the release of Docker 1.5, container-level statistics are exposed via a new API. Now you can alert on the "throttling_data" information, for example. How about that? Again (and at this point, this is getting repetitive, perhaps), this data should be sucked into a centralized system. Ideally, this is the same system that already has the events, the configurations, and the logs! Logs can be correlated with the metrics and events. There are many pieces to the puzzle, but all of this data can be extracted from Docker pretty easily today already.
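As an illustration of treating events and stats as structured data, the sketch below counts container starts per image from a stream of JSON events and computes how often a container was CPU-throttled from one stats sample. It assumes the event shape (Type, Action, Actor.Attributes.image) and the cpu_stats.throttling_data fields used by recent Docker Engine API versions. The sample records are invented for illustration, and the function names are hypothetical.

```python
import json
from collections import Counter


def starts_per_image(event_lines):
    """Count container 'start' events per image from newline-delimited JSON events."""
    counts = Counter()
    for line in event_lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if event.get("Type") == "container" and event.get("Action") == "start":
            attrs = event.get("Actor", {}).get("Attributes", {})
            counts[attrs.get("image", "unknown")] += 1
    return counts


def throttled_ratio(stats):
    """Fraction of CPU scheduling periods in which the container was throttled."""
    throttling = stats.get("cpu_stats", {}).get("throttling_data", {})
    periods = throttling.get("periods", 0)
    if periods == 0:
        return 0.0
    return throttling.get("throttled_periods", 0) / periods


# Invented sample data, shaped like the Engine API output assumed above.
sample_events = [
    json.dumps({"Type": "container", "Action": "start",
                "Actor": {"ID": "a1", "Attributes": {"image": "nginx:1.25"}}}),
    json.dumps({"Type": "container", "Action": "die",
                "Actor": {"ID": "a1", "Attributes": {"image": "nginx:1.25"}}}),
    json.dumps({"Type": "container", "Action": "start",
                "Actor": {"ID": "b2", "Attributes": {"image": "redis:7"}}}),
    json.dumps({"Type": "container", "Action": "start",
                "Actor": {"ID": "c3", "Attributes": {"image": "nginx:1.25"}}}),
]
sample_stats = {
    "cpu_stats": {
        "throttling_data": {"periods": 200, "throttled_periods": 50, "throttled_time": 123456789}
    }
}

start_counts = starts_per_image(sample_events)
ratio = throttled_ratio(sample_stats)
print(dict(start_counts), ratio)
```

In a real deployment the event lines would come from the Docker events endpoint and the stats sample from the stats endpoint; an alert could then fire when the ratio stays above a threshold you choose.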
Docker Daemon Logs and Hosts

In all the excitement around APIs for monitoring data, let's not forget that we also need to have host level visibility. A comprehensive solution should therefore also work hard to get the Docker daemon logs, and provide a way to get any other system level logs that factor into the way Docker is being put to use on the hosts of the fleet. Add host level statistics to this and now performance issues can be understood in a holistic fashion - on a container basis, but also related to how the host is doing. Maybe there's some intricate interplay between containers based on placement that pops up on one host but not the other? Without quick access to the actual data, you will scratch your head all day.

User Experience

What's the desirable user experience for a comprehensive monitoring solution for Docker? Thanks to the API-based approach that allows us to get to all the data either locally or remotely, it should be easy to encapsulate all the monitoring data acquisition and forwarding into a container that can either run remotely, if the Docker daemons support remote access, or as a system container on every host. Depending on how the emerging orchestration solutions approach this, it might not even be too crazy to assume that the collection container could simply attach to a master daemon. It seems Docker Swarm might make this possible. Super simple, just add the URL to the collector config and go.

Sumo Logic API and Docker Logging

In its default configuration, our containerized Collector agent will use the Docker API to collect the logs and statistics (metrics) from all containers, and the events that are emitted from the Docker Engine. Unless configured otherwise, the Collector will monitor all containers that are currently active, as well as any containers that are started and stopped subsequently.
Within seconds of starting it, the latest version of the Collector container will be downloaded, and all of the signals coming from your Docker environment will be pumped up to Sumo Logic’s platform. Using the API has its advantages. It allows us to get all three telemetry types (logs, metrics, and events), we can query for additional metadata during container startup, we don’t have to accommodate different log file locations, and the integration is the same regardless of whether you log to files or to journald.

The Benefits of Docker Agent-Based Collection

The other advantage of this approach is the availability of a data collection agent that provides additional data processing capabilities and ensures reliable data delivery. Data processing capabilities include multiline processing, and filtering and masking of data before it leaves the host. This last capability is important when considering compliance requirements such as PCI or HIPAA. Also important from a compliance standpoint is reliability. All distributed logging systems must be able to accommodate networking issues or impedance mismatches, such as latency or endpoint throttling. These are all well-covered issues when using the Sumo Logic Collector Agent.

Docker Multiline Logging

Lack of multiline logging support has always plagued Docker logging. The default Docker logging drivers, and the existing third-party logging drivers, have not supported multiline log messages, and for the most part, they still do not. One of Sumo Logic’s strengths has always been its ability to rejoin multiline log messages back into a single log message. This is an especially important issue to consider when monitoring JVM-based apps and working with stack traces. Sumo Logic automatically infers common boundary patterns, and supports custom message boundary expressions. We ensure that our Docker Log Source and our Docker Logging Plugin maintain these same multiline processing capabilities.
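To show what rejoining multiline messages with a boundary expression involves, here is a minimal sketch. It is not Sumo Logic's implementation: the function name and the default boundary pattern (a line that begins with an ISO-like timestamp) are assumptions chosen for illustration, and the sample log lines are invented.

```python
import re

# Assumed boundary: a new message starts with a timestamp such as "2024-01-15 10:00:00".
DEFAULT_BOUNDARY = re.compile(r"^\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}")


def stitch_multiline(lines, boundary=DEFAULT_BOUNDARY):
    """Group raw lines into messages; a line matching `boundary` starts a new message."""
    messages = []
    current = []
    for line in lines:
        line = line.rstrip("\n")
        # A boundary line closes the message collected so far.
        if boundary.match(line) and current:
            messages.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        messages.append("\n".join(current))
    return messages


# Invented sample: a JVM stack trace split across four raw lines, then a normal line.
raw = [
    "2024-01-15 10:00:00 ERROR Request failed",
    "java.lang.IllegalStateException: boom",
    "\tat com.example.App.handle(App.java:42)",
    "\tat com.example.App.main(App.java:10)",
    "2024-01-15 10:00:01 INFO Recovered",
]
messages = stitch_multiline(raw)
print(len(messages))
```

A custom boundary expression is then just a different compiled pattern passed as the second argument.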
The ability to maintain multiline support is one of the reasons why we recommend using our custom Docker API-based integration over simply reading the log files from the host. Generally speaking, reading container logs from the file system is a fine approach. However, when the logs are wrapped in JSON and ornamented with additional metadata, multiline processing becomes far more difficult. Other logging drivers are starting to consider this issue, no doubt based on market feedback. However, their capabilities are far less mature than Sumo’s.

Instant Gratification of Docker Logging

The installation of the containerized agent couldn’t be simpler. And with a simple query, you can see the data from all of the containers on your host, with all of the fields extracted and ready to explore. From there, it is easy to install our Docker App to monitor your complete Docker environment as you scale this out to all of your hosts.

Going Beyond Docker Basics

When you deploy the Sumo Logic Collector container across a fleet of hosts, monitoring hundreds or thousands of containers, you will want to be a bit more sophisticated than just running with the default container settings. However, that is beyond the scope of this discussion. When you deploy our Collector Agent as a container, all of the Collector agent’s features are available, and all parameters can be configured. To learn about the advanced configuration options, check out the container’s readme on Docker Hub and read more details in our documentation.

Sometimes You Gotta Go Agentless

There are times when you require an agentless solution, or you may just prefer one. If you have another way to collect Docker container metrics, and you just need container logs, then a Docker Logging Plugin (referred to as a Logging Driver in earlier versions) may be the perfect solution.

Note: The agentless approach is an ideal solution for AWS ECS users that rely on CloudWatch for their container metrics and events.
How Sumo Logic's Docker Logging Plugin Works

Our Docker Logging Plugin is written in Go and runs within the Docker Engine. It is configured on a per-container basis, and sends data directly to Sumo Logic’s HTTP Endpoint, using a pre-configured “HTTP Source.” You can access our plugin on the new Docker Store, but the best place to read about how to use it is its GitHub repo. Following the theme set out earlier, it is very easy to use in its default configuration, with a host of advanced options available.

Follow these simple steps:

1. Register the plugin with the Docker Engine:
$ docker plugin install --grant-all-permissions store/sumologic/docker-logging-driver:<ver>
Make sure you go to the Docker Store and get the latest version number. As of this publishing, the latest version is 1.0.1, and Docker Store does not support the ‘latest’ parameter. So, here is the corresponding command line for this version:
$ docker plugin install --grant-all-permissions store/sumologic/docker-logging-driver:1.0.1
2. Specify the driver when you run a container:
$ docker run --log-driver=sumologic --log-opt sumo-url=<sumo_HTTP_url>

Docker Logging Plugin Capabilities

This plugin provides some very important capabilities:

- Buffering and batching: You can configure the size of each HTTP POST.
- Compression: Configurable gzip compression levels to minimize data transfer costs.
- Proxy support: Critical for highly secure enterprise deployments.
- TLS required: This is a Sumo Logic requirement. All data transfer must meet PCI compliance requirements.
- Multiline support: Multiline stitching is processed within the Sumo Logic cloud platform rather than in the logging plugin. This keeps the plugin fast and efficient.
However, we made specific design considerations to ensure that we preserved multiline support while providing rich metadata support.
- Configurable metadata per container: The Docker Logging Plugin framework supports a flexible templating system that is used by our plugin to construct dynamic Source Category metadata that varies per container. The template syntax gives you access to environment variables, Docker labels, and the ability to pass in custom values when starting containers. Our Docker Logging Plugin is the first of our integrations to support this capability. A similar capability will be supported by our Docker Log and Stats Sources with our next Collector release.

Integrating With Other Docker Source Agents

If, for some reason, these two methods do not satisfy your needs, then one of our many other collection methods (aka “Sources”) will most likely do the trick. Sumo Logic also integrates with various other open source agents and cloud platform infrastructures, and relies on some of them for certain scenarios. Details on all of the above integrations are available in our docs.

If you have been using Docker for a while and have implemented a solution from the early days, such as syslog or Logspout, we encourage you to review the approaches defined here and migrate your solution accordingly.
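The buffering, batching, and compression capabilities listed for the logging plugin can be pictured with a short sketch. This is not the plugin's code (the plugin is written in Go). The function names, the 1,000-byte default, and the sample lines are assumptions for illustration only.

```python
import gzip


def make_batches(lines, max_bytes=1000):
    """Group log lines into newline-joined payloads of at most max_bytes each.

    A single line longer than max_bytes still forms its own (oversized) batch.
    """
    batches, current, size = [], [], 0
    for line in lines:
        encoded = line.encode("utf-8")
        added = len(encoded) + (1 if current else 0)  # +1 for the joining newline
        if current and size + added > max_bytes:
            # The current batch is full: flush it and start a new one.
            batches.append(b"\n".join(current))
            current, size = [], 0
            added = len(encoded)
        current.append(encoded)
        size += added
    if current:
        batches.append(b"\n".join(current))
    return batches


def compress(payload, level=6):
    """Gzip a payload; higher levels trade CPU time for smaller bodies."""
    return gzip.compress(payload, compresslevel=level)


# Invented sample: ten 40-byte lines batched into payloads of at most 100 bytes.
lines = ["x" * 40 for _ in range(10)]
batches = make_batches(lines, max_bytes=100)
bodies = [compress(batch) for batch in batches]
print(len(batches), [len(batch) for batch in batches])
```

Each compressed body would then be sent as one HTTP POST over TLS. Batching reduces the number of requests, and the compression level is the knob that trades CPU on the host against transfer volume.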

Blog

Journey to the Cloud, with Pivotal and Sumo Logic

There is no denying it: the digital business transformation movement is real, and the time for this transformation is now. When, according to a survey from Bain & Company, 48 of 50 Fortune Global companies have publicly announced plans to adopt public cloud, it is clear that no industry is immune from this disruption. We are seeing traditional industries such as insurance, banking, and healthcare carving out labs and incubators that bring innovative solutions to market, and establishing processes and platforms to help the rest of the organization with its evolution.

For large enterprises, it is critical that they manage the challenges of moving to public cloud while satisfying the needs of a diverse set of internal customers. They need to support a variety of development languages, multiple deployment tool chains, and a mix of data centers and multiple public cloud vendors. Because these are long-term strategies that involve considerable investment, they are concerned about long-term vendor lock-in, and are being proactive about developing strategies to mitigate those risks.

These organizations are looking toward cloud-neutral commercial vendors to help them migrate to the cloud and maintain consistency in how they deploy and manage their applications across heterogeneous environments. These enterprises are increasingly turning to Pivotal Cloud Foundry® to help them abstract their app deployments from the deployment specifics of individual cloud platforms, and maintain their ability to move apps and workloads across cloud providers when the time comes.

Effective DevOps Analytics for the Modern Application

The migration of enterprise workloads to the cloud, and the rise of public cloud competition, is driving the demand for Sumo Logic as a cloud-native platform for monitoring and securing modern applications. Pivotal Cloud Foundry enables users to abstract the underlying plumbing necessary to deploy, manage, and scale containerized cloud-native applications.
This benefits developers by greatly increasing their productivity and ability to launch applications quickly. Such an environment also exposes a broader set of operational and security constructs that are useful to track, log, and analyze. However, it can also be more complicated to diagnose performance issues with decoupled architectures and composable microservices. Full-stack observability and the ability to trace all the apps and services together are critical to successful cloud deployments.

[Figure: Observability of decoupled architectures with composable services requires the ability to trace all layers of the stack]

With Pivotal Cloud Foundry and tools from Sumo Logic, an organization can have an observable, enterprise-class platform for application delivery, operations, and support across multiple public cloud providers and on-premises data centers.

Beyond platform operations, Cloud Foundry customers want to enable their app teams to be self-sufficient, and promote an agile culture of DevOps. Often, with legacy monitoring and analytics tools, the operations team will have access to the data, but they can’t scale to support the application teams. Or, the apps team may restrict access to their sensitive data, and therefore not support the needs of the security and compliance team.

Sumo Logic believes in democratized analytics. This means that this massive flow of highly valuable data, from across the stack and cloud providers, should be available to everyone that can benefit from it. This requires the right level of scale, security, ubiquity of access, and economics that only Sumo Logic can provide.

Sumo Logic & Pivotal Cloud Foundry Partnership

Through our collaboration with Pivotal®, Sumo Logic has developed an app for Pivotal Cloud Foundry, as well as an easy-to-deploy integration with Pivotal Cloud Foundry Loggregator.
A customer-ready beta of the “Sumo Logic Nozzle for PCF” is available now as an Operations Manager Tile for Pivotal Cloud Foundry, and can be downloaded from Pivotal Network.

[Figure: Sumo Logic Tile installed in the PCF Ops Manager]

If you are already using or evaluating Pivotal Cloud Foundry, you can get started with operational and security analytics in a matter of minutes. With this integration, all of the log and metrics data collated by Cloud Foundry Loggregator will be streamed securely to the Sumo Logic Platform. For deployments with security and compliance requirements, Sumo Logic’s cloud-based service is SOC 2, HIPAA, and PCI-compliant.

The Sumo Logic integration for Pivotal Cloud Foundry will be available in the App Library soon. If you would like early access, please contact your account team.

The Sumo Logic App for Pivotal Cloud Foundry highlights key Pivotal data and KPIs. Sumo Logic’s App for Cloud Foundry operationalizes Pivotal Cloud Foundry’s monitoring best practices for you, and provides a platform for you to build upon to address your unique monitoring and diagnostic requirements.