December 11, 2014 By Christian Beedgen

An Official Docker Image For The Sumo Logic Collector

Note: This post is now superseded by Update On Logging With Docker.

Learning By Listening, And Doing

Over the last couple of months, we have spent a lot of time learning about Docker, the distributed application delivery platform that is taking the world by storm. We have started looking into how we can best leverage Docker for our own service. And of course, we have spent a lot of time talking to our customers. We have so far learned a lot by listening to them describe how they deal with logging in a containerized environment.

We have already re-blogged how Caleb, one of our customers, is Adding Sumo Logic To A Dockerized App. Our very own Dwayne Hoover has written about Four Ways to Collect Docker Logs in Sumo Logic.

Along the way, it has become obvious that it makes sense for us to provide an “official” image for the Sumo Collector. Sumo Logic exposes an easy-to-use HTTP API, but the vast majority of our customers are leveraging our Collector software as a trusted, production-grade data collection conduit. We are and will continue to be excited about folks building their own images for their own custom purposes. Yet, the questions we get make it clear that we should release an official Sumo Logic Collector image for use in a containerized world.

Instant Gratification, With Batteries Included

A common way to integrate logging with containers is to use Syslog. This has been discussed before in various places all over the internet. If you can direct all your logs to Syslog, we now have a Sumo Logic Syslog Collector image that will get you up and running immediately:

docker run -d -p 514:514 -p 514:514/udp --name="sumo-logic-collector" \
  sumologic/collector:latest-syslog [Access ID] [Access key]

Started this way, the default Syslog port 514 is mapped to the same port on the host. To test whether everything is working well, use telnet on the host:

telnet localhost 514

Then type some text, hit return, press CTRL-] to close the connection, and enter quit to exit telnet. After a few moments, what you typed should show up in the Sumo Logic service. Use a search to find the message(s).

To test the UDP listener, on the host, use Netcat, along the lines of:

echo "I'm in ur sysloggz" | nc -v -u -w 0 localhost 514

And again, the message should show up on the Sumo Logic end when searched for.

If you want to start a container that is configured to log to syslog and make it automatically latch on to the Collector container’s exposed port, use linking:

docker run -it --link sumo-logic-collector:sumo ubuntu /bin/bash

From within the container, you can then talk to the Collector listening on port 514 by using the environment variables populated by the linking:

echo "I'm in ur linx" | nc -v -u -w 0 $SUMO_PORT_514_TCP_ADDR $SUMO_PORT_514_TCP_PORT
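The link variables used above can also be resolved with a small fallback, which is handy when the same script runs both inside a linked container and directly on the host. The following is a minimal sketch, not part of the image: collector_endpoint is a hypothetical helper, and it assumes the link alias sumo from the command above and that the Collector image exposes port 514 for both TCP and UDP, so that Docker populates SUMO_PORT_514_UDP_* alongside SUMO_PORT_514_TCP_*.

```shell
#!/bin/sh
# Hypothetical helper: print "<host> <port>" for the Collector's syslog listener.
# Assumes the container was started with --link sumo-logic-collector:sumo.
collector_endpoint() {
  # Prefer the UDP link variables, then the TCP ones, then the host defaults.
  host="${SUMO_PORT_514_UDP_ADDR:-${SUMO_PORT_514_TCP_ADDR:-localhost}}"
  port="${SUMO_PORT_514_UDP_PORT:-${SUMO_PORT_514_TCP_PORT:-514}}"
  printf '%s %s\n' "$host" "$port"
}

# Usage, mirroring the Netcat call above (left unquoted on purpose so that
# host and port become two separate arguments):
#   echo "I'm in ur linx" | nc -v -u -w 0 $(collector_endpoint)
```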

That’s all there is to it. The image is available from Docker Hub. Setting up an Access ID/Access Key combination is described in our online help.

Composing Collector Images From Our Base Image

Following the instructions above will get you going quickly, but of course it can’t possibly cover all the various logging scenarios that we need to support. To that end, we actually started by first creating a base image. The Syslog image extends this base image. Your future images can easily extend this base image as well. Let’s take a look at what is actually going on! Here’s the GitHub repo: https://github.com/SumoLogic/sumologic-collector-docker.

One of the main things we set out to solve was to clarify how to allow creating an image that does not require customer credentials to be baked in. Having credentials in the image itself is obviously a bad idea! Putting them into the Dockerfile is even worse. The trick is to leverage a not-so-well documented command line switch on the Collector executable to pass the Sumo Logic Access ID and Access Key combination to the Collector. Here’s the meat of the run.sh startup script referenced in the Dockerfile:

/opt/SumoCollector/collector console -- -t -i $access_id -k $access_key \
  -n $collector_name -s $sources_json

The rest is really just grabbing the latest Collector Debian package and installing it on top of a base Ubuntu 14.04 system, invoking the start script, checking arguments, and so on.
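To illustrate the argument checking mentioned above, here is a minimal sketch of what such a start script could look like. It is not the run.sh from the repository: the function name build_collector_command, the usage message, and the default Collector name are assumptions made for this example. Only the command line switches and the /etc/sumo-sources.json path come from this post. The sketch prints the command instead of executing it, so it can be tried without a Collector installed.

```shell
#!/bin/sh
# Hypothetical sketch of a run.sh-style wrapper (not the repository's script).
build_collector_command() {
  access_id="${1:-}"
  access_key="${2:-}"
  collector_name="${3:-collector_container}"    # assumed default name
  sources_json="${4:-/etc/sumo-sources.json}"   # path used in the Dockerfile examples

  # Refuse to start without credentials, since they are not baked into the image.
  if [ -z "$access_id" ] || [ -z "$access_key" ]; then
    echo "usage: run.sh <access id> <access key> [collector name] [sources json]" >&2
    return 1
  fi

  # Print the command line; a real start script would run it instead.
  printf '%s\n' "/opt/SumoCollector/collector console -- -t -i $access_id -k $access_key -n $collector_name -s $sources_json"
}

# Usage: build_collector_command "$@"
```

Because the credentials arrive as arguments at container start, the same image can be shared without exposing any account.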

As part of our continuous delivery pipeline, we are getting ready to update the Docker Hub-hosted image every time a new Collector is released. This will ensure that when you pull the image, the latest and greatest code is available.

How To Add The Batteries Yourself

The base image is intentionally kept very sparse and essentially ships with “batteries not included”. In itself, it will not lead to a working container. This is because the Sumo Logic Collector has a variety of ways to set up the actual log collection. It supports tailing files locally and remotely, as well as pulling Windows event logs locally and remotely.

Of course, it can also act as a Syslog sink. And it can do any of this in any combination at the same time. Therefore, the Collector is either configured manually via the Sumo Logic UI, or (and this is almost always the better way) via a configuration file. The configuration file, however, is something that will change from use case to use case and from customer to customer. Baking it into a generic image simply makes no sense.

What we did instead is provide a set of examples. These can be found in the same GitHub repository under “example”: https://github.com/SumoLogic/sumologic-collector-docker/tree/master/example. There are a couple of sumo-sources.json example files illustrating, respectively, how to set up file collection, and how to set up Syslog UDP and Syslog TCP collection. The idea is to allow you to either take one of the example files verbatim, or use it as a starting point for your own sumo-sources.json. Then, you can build a custom image using our image as a base image. To make this more concrete, create a new folder and put this Dockerfile in there:

FROM sumologic/collector
MAINTAINER Happy Sumo Customer
ADD sumo-sources.json /etc/sumo-sources.json

Then, put a sumo-sources.json into the same folder, groomed to fit your use case. Then build the image and enjoy.
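Building the image comes down to running docker build in that folder. As a minimal sketch, the hypothetical helper below (check_build_context is not part of any Sumo Logic tooling) first verifies that both files are in place and that the Dockerfile actually references the sources file. The image tag my-sumo-collector in the usage comment is likewise just a placeholder.

```shell
#!/bin/sh
# Hypothetical pre-flight check for the build folder described above.
check_build_context() {
  dir="${1:-.}"
  # Both files must sit next to each other in the build folder.
  for f in Dockerfile sumo-sources.json; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing $dir/$f" >&2
      return 1
    fi
  done
  # The Dockerfile has to ADD the sources file, otherwise the
  # Collector in the resulting image has nothing to collect.
  grep -q 'sumo-sources.json' "$dir/Dockerfile"
}

# Usage, from inside the folder:
#   check_build_context . && docker build -t my-sumo-collector .
```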

A Full Example

Using this approach, if you want to collect files from various containers, mount a directory on the host to the Sumo Logic Collector container. Then mount the same host directory to all the containers that use file logging. In each container, set up logging to log into a subdirectory of the mounted log directory. Finally, configure the Collector to just pull it all in.
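The per-container subdirectory can be as simple as one directory per hostname. The following is a minimal sketch for an application container, under a few assumptions: the helper names container_log_dir and log_line and the file name app.log are made up for this example, LOG_ROOT stands for wherever the shared host directory is mounted, and the container keeps Docker's default hostname (the short container ID), which keeps the subdirectories distinct.

```shell
#!/bin/sh
# Hypothetical logging helpers for an application container.
# LOG_ROOT is the mount point of the shared host directory.
LOG_ROOT="${LOG_ROOT:-/tmp/clogs}"

container_log_dir() {
  # One subdirectory per container, named after its hostname.
  dir="$LOG_ROOT/$(uname -n)"
  mkdir -p "$dir" && printf '%s\n' "$dir"
}

log_line() {
  # Append a UTC-timestamped line to app.log in this container's subdirectory.
  dir="$(container_log_dir)" || return 1
  printf '%s %s\n' "$(date -u '+%Y-%m-%dT%H:%M:%SZ')" "$*" >> "$dir/app.log"
}

# Usage: log_line "application started"
```

A Collector source whose path expression covers the whole mounted directory recursively would then pick up every container's files.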

The Sumo Logic Collector has for years been used across our customer base in production for pulling logs from files. More often than not, the Collector is pulling from a deep hierarchy of files on some NAS mount or equivalent. The Collector is quite adept and battle tested at dealing with file-based collection.

Let’s say the logs directory on the host is called /tmp/clogs. Before setting up the source configuration accordingly, make a new directory for the files describing the image. Call it, for example, sumo-file. Into this directory, put this Dockerfile:

FROM sumologic/collector
MAINTAINER Happy Sumo Customer
ADD sumo-sources.json /etc/sumo-sources.json

The Dockerfile extends the base image, as discussed. Next to the Dockerfile, in the same directory, there needs to be a file called sumo-sources.json which contains the configuration:

{
 "api.version": "v1",
 "sources": [
 {
 "sourceType" : "LocalFile",
 "name": "localfile-collector-container",
 "pathExpression": "/tmp/clogs/**",
 "multilineProcessingEnabled": false,
 "automaticDateParsing": true,
 "forceTimeZone": false,
 "category": "collector-container"
 }
 ]
}

With this in place, build the image, and run it:

docker run -d -v /tmp/clogs:/tmp/clogs --name="sumo-logic-collector" \
  [image name] [your Access ID] [your Access key]

Finally, add -v /tmp/clogs:/tmp/clogs when running other containers that are configured to log to /tmp/clogs in order for the Collector to pick up the files.

Just like the ready-to-go syslog image we described in the beginning, a canonical image for file collection is available. See the source: https://github.com/SumoLogic/sumologic-collector-docker/tree/master/file.

docker run -v /tmp/clogs:/tmp/clogs -d --name="sumo-logic-collector" \
  sumologic/collector:latest-file [Access ID] [Access key]

If you want to learn more about using JSON to configure sources to collect logs with the Sumo Logic Collector, there is a help page with all the options spelled out.

That’s all for today. We have more coming. Watch this space. And yes, comments are very welcome.

Christian Beedgen

As co-founder and CTO of Sumo Logic, Christian Beedgen brings 18 years of experience creating industry-leading enterprise software products. Since 2010 he has been focused on building Sumo Logic’s multi-tenant, cloud-native machine data analytics platform, which is widely used today by more than 1,600 customers and 50,000 users. Prior to Sumo Logic, Christian was an early engineer, engineering director and chief architect at ArcSight, contributing to ArcSight’s SIEM and log management solutions.