Christian Beedgen


As co-founder and CTO of Sumo Logic, Christian Beedgen brings 18 years of experience creating industry-leading enterprise software products. Since 2010 he has been focused on building Sumo Logic’s multi-tenant, cloud-native machine data analytics platform, which is widely used today by more than 1,600 customers and 50,000 users. Prior to Sumo Logic, Christian was an early engineer, engineering director and chief architect at ArcSight, contributing to ArcSight’s SIEM and log management solutions.

Posts by Christian Beedgen

Blog

Sumo Logic Recognized as Data Analytics Solution of the Year Showcasing the Power of Continuous Intelligence

Blog

Love In The Time Of Coronavirus

Blog

How We Understand Monitoring

Blog

All The Logs For All The Intelligence

Blog

Service Levels––I Want To Buy A Vowel

Blog

See You in September at Illuminate!

Blog

The Super Bowl of the Cloud

Blog

Platforms All The Way Up & Down

Blog

Microservices for Startups Explained

Blog

Christian's Musings from Web Summit 2017

I was able to attend my third Web Summit last week. This is the second time for me in Lisbon, as I was lucky enough to be invited to talk on the Binate.io stage again after last year. If you are interested, check out my musings on instinct, intuition, experience and data analytics.

Web Summit has grown tremendously since I first attended the Dublin incarnation in 2013. This year, the event was sold out at 60,000 attendees (!). The Portuguese came out in force, but it was very clear that this event, while of course drawing most attendees from all across Europe, is ultimately an international affair as well. With so many people attending, Web Summit can be rather overwhelming. There is a bit of everything, and an incredible crowd of curious people. Lisbon is a fantastically beautiful city, mostly off the beaten path when it comes to tech conferences, so the local folks really come out in force to take in the spectacle.

So, what is Web Summit? Originally started in Dublin in 2009, it has over the years become a massive endeavor highlighting every conceivable aspect of technology. There are four massive conference halls with multiple stages for speakers and podium discussions in each hall.

Christian Beedgen on the binate.io stage - Web Summit, Lisbon 2017

Then there is the main arena holding 20,000 people; this is where the most high-profile keynote speakers hit the stage. Web Summit has always brought in government officials and politicians to the show as well in an effort to promote technology. I was actually standing next to Nigel Farage at the speaker cloak room waiting for my coat. There was another guy there who was already berating this unfortunate character, so thankfully I didn't have to do it myself.

I managed to catch a couple of the keynotes in the aforementioned large arena. Three of them left an impression.

Firstly, it was great to see Max Tegmark speak. I am reading his current book, Life 3.0, right now, and it is always a bit of a trip when the author suddenly appears on a stage and you realize you have to throw away your mental image of the voice that has been speaking to you from the pages of the book and adopt reality. In this case, however, this was not a negative, as Max came across as both deeply knowledgeable and quite relaxed. He looked a bit like he plays in the Ramones with his black leather jacket and black jeans; this I didn't see coming. In any case, I highly recommend checking out what he has to say. In light of the current, almost bombastically overblown hype around AI, he takes a very pragmatic view, based on many years of his own research. If you can imagine a future of "beneficial AI", check out his book, Life 3.0, for why and how we have a chance to get there.

I was also impressed by Margrethe Vestager. She is a Danish politician and currently the European Commissioner for Competition. She captured the audience by simply speaking off a couple of cue cards, no PowerPoint slides at all. Being a politician, she cast a very official appearance, of course - but she wore some sick sneakers to a conservative dress, which I thought was just awesome. Gotta love the Danish! Her talk centered on the reasoning behind the antitrust investigation she brought against Google (which eventually led to a $2.7 billion fine!). The details are too complicated to be reasonably summarized here, but they essentially centered on the fact that while nobody in the EU has issues with Google's near-monopoly on search, in the eyes of the competition watchdogs, for Google to use this position to favor their own products in search results creates intolerable fairness issues for other companies. It is very interesting to see how these views are developing outside of the US.

The third and last memorable session had animated AI robots dialoguing with their inventor, plus Einstein, Artificial General Intelligence, distributed AI and models, and the blockchain. It was by and large only missing Taylor Swift. SingularityNET is a new effort to create an open, free and decentralized marketplace for AI technology, enabled by smart contracts. I frankly don't have the slightest clue how that would work, but presenter Ben Goertzel was animatedly excited about the project. The case for needing an AI marketplace for narrow AIs to compose more general intelligences was laid out in a strenuous "discussion" with "lifelike" robots from Hanson Robotics. It is lost on me why everybody thinks they need to co-opt Einstein; first Salesforce calls their machine learning features Einstein, now these robotics guys have an Einstein robot on stage. I guess the path to the future requires still more detours to the past. I guess Einstein can't fight back on this anymore, and at least they are picking an exceptional individual...

Now that I am back in the US for only a day, the techno-optimism that's pervasive at Web Summit already feels like a distant memory.

November 14, 2017

Blog

Machine Data for the Masses

Blog

Update On Logging With Docker

A Simpler & Better Way

In New Docker Logging Drivers, I previously described how to use the new Syslog logging driver introduced in Docker 1.6 to transport container logs to Sumo Logic. Since then, there have been improvements to the Syslog logging driver, which now allows users to specify the address of the Syslog server to send the logs to. In its initial release the Syslog logging driver simply logged to the local Syslog daemon, but this is now configurable. We can exploit this in conjunction with the Sumo Logic Collector container for Syslog to make logging with Docker and Sumo Logic even easier.

Simply run the Syslog Collector container as previously described:

$ docker run -d -p 514:514 -p 514:514/udp \
    --name="sumo-logic-collector" \
    sumologic/collector:latest-syslog \
    [Access ID] [Access key]

Now you have a collector running, listening for Syslog on both ports 514/tcp and 514/udp. For every container required to run on the same host, you can now add the following to the Docker run command in order to make the container log to your Syslog collector:

--log-driver syslog --log-opt syslog-address=udp://localhost:514

Or, in a complete example:

$ docker run --rm --name test \
    --log-driver syslog --log-opt syslog-address=udp://localhost:514 \
    ubuntu \
    bash -c 'for i in `seq 1 10`; do echo Hello $i; sleep 1; done'

You should now see something along these lines in Sumo Logic. This, of course, works remotely as well. You can run the Sumo Logic Collector on one host, and have containers on all other hosts log to it by setting the syslog address accordingly when running the container.

And Here Is An Erratum

In New Docker Logging Drivers, I described the newly added logging drivers in Docker 1.6. At the time, Docker was only able to log to local syslog, and hence our recommendation for integration was as follows:

$ docker run -v /var/log/syslog:/syslog -d \
    --name="sumo-logic-collector" \
    sumologic/collector:latest-logging-driver-syslog \
    [Access ID] [Access Key]

This basically has the Sumo Logic Collector tail the OS /var/log/syslog file. We discovered in the meantime that this will cause issues if /var/log/syslog is being logrotate’d. The container will hang on to the original file into which Syslog initially wrote the messages, and not pick up the new file after the old file was moved out of the way.

There’s a simple solution to the issue: mount the directory into the container, not the file. In other words, please do this:

$ docker pull sumologic/collector:latest-logging-driver-syslog
$ docker run -v /var/log:/syslog -d \
    --name="sumo-logic-collector" \
    sumologic/collector:latest-logging-driver-syslog \
    [Access ID] [Access Key]

Or, of course, switch to the new and improved approach described above!
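As a rough sketch of the remote case, the only thing that changes is the syslog address. Here collector.example.com is just a placeholder for whichever host runs the Collector container:

$ docker run --rm --name test \
    --log-driver syslog \
    --log-opt syslog-address=udp://collector.example.com:514 \
    ubuntu echo "Hello from another host"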

Blog

Comprehensive Monitoring For Docker - More Than "Just" Logs

Today I am happy to be able to talk about something that has been spooking around in my head for the last six months or so. I've been thinking about this ever since we started looking into Docker and how it applies to what we are doing here at Sumo. There are many different and totally valid ways to get logs and statistics out of Docker. Options are great, but I have concluded that the ultimate goal should be a solution that doesn't require users to have in-depth knowledge about all the things that are available for monitoring and the various methods to get to them. Instead, I want something that just pulls all the monitoring data out of the containers and Docker daemons with minimal user effort. In my head, I have been calling this "a comprehensive solution".

Let me introduce you to the components that I think need to be part of a comprehensive monitoring solution for Docker:

- Docker events, to track container lifecycles
- Configuration info on containers
- Logs, naturally
- Statistics on the host and the containers
- Other host stuff (daemon logs, host logs, ...)

Events

Let's start with events. The Docker API makes it trivial to subscribe to the event stream. Events contain lots of interesting information. The full list is well described in the Docker API doc, but let's just say you can track containers coming and going, as well as observe containers getting killed, and other interesting stuff, such as out-of-memory situations. Docker has consistently added new events with every version, so this is a gift that will keep on giving in the future. I think of Docker events as nothing but logs. And they are very nicely structured - it's all just JSON. If, for example, I can load this into my log aggregation solution, I can now track which container is running where. I can also track trends - for example, which images are run in the first place, and how often are they being run. Or, why are suddenly 10x more containers started in this period vs. before, and so on. This probably doesn't matter much for personal development, but once you have fleets, this is a super juicy source of insight. Lifecycle tracking for all your containers will matter a lot.

Configurations

Docker events, among other things, allow us to see containers come and go. What if we also wanted to track the configurations of those containers? Maybe we want to track drift of run parameters, such as volume settings, or capabilities and limits. The container image is immutable, but what about the invocation? Having detailed records of container starting configurations is, in my mind, another piece of the puzzle towards total visibility. Orchestration solutions will provide those settings, sure, but who is telling those solutions what to do? From our own experience, we know that deployment configurations are inevitably going to drift, and we have found the root cause of otherwise inscrutable problems there more than once. Docker allows us to use the inspect API to get the container configuration. Again, in my mental model, that's just a log. Send it to your aggregator. Alert on deviations, use the data after the fact for troubleshooting. Docker provides this info in a clean and convenient format.
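As a minimal sketch of what treating this data as logs can look like in practice (assuming a local Docker daemon on the default Unix socket, curl 7.40 or newer, and placeholder container IDs and output paths), all of it is a couple of API calls away:

# Stream the event feed; every event arrives as a small JSON document
$ curl --no-buffer --unix-socket /var/run/docker.sock http://localhost/events

# Dump the full starting configuration of a container as JSON
$ docker inspect <container-id> > /var/log/docker-config/<container-id>.json

# The same idea works for the per-container stats endpoint discussed below
$ curl --no-buffer --unix-socket /var/run/docker.sock \
    http://localhost/containers/<container-id>/stats

From there, forwarding each line to a log aggregator is no different from shipping any other log.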
Logs

Well, obviously, it would be great to have logs, right? Turns out there are many different ways to deal with logs in Docker, and new options are being enabled by the new log driver API. Not everybody is quite there yet in 12-factor land, but then again there are workarounds for when you need fat containers and have to collect logs from files inside of containers. More and more I see people following the best practice of writing logs to standard out and standard error, and it is pretty straightforward to grab those logs from the logs API and forward them from there. The Logspout approach, for example, is really neat. It uses the event API to watch which containers get started, then turns around and attaches to the log endpoint, and then pumps the logs somewhere. Easy and complete, and you have all the logs in one place for troubleshooting, analytics, and alerting.

Stats

Since the release of Docker 1.5, container-level statistics are exposed via a new API. Now you can alert on the "throttled_data" information, for example - how about that? Again (and at this point, this is getting repetitive, perhaps), this data should be sucked into a centralized system. Ideally, this is the same system that already has the events, the configurations, and the logs! Logs can be correlated with the metrics and events. Now, this is how I think we get to a comprehensive solution. There are many pieces to the puzzle, but all of this data can be extracted from Docker pretty easily today already. I am sure as we all keep learning more about this it will get even easier and more efficient.

Host Stuff

In all the excitement around APIs for monitoring data, let's not forget that we also need host-level visibility. A comprehensive solution should therefore also work hard to get the Docker daemon logs, and provide a way to get any other system-level logs that factor into the way Docker is being put to use on the hosts of the fleet. Add host-level statistics to this and now performance issues can be understood in a holistic fashion - on a container basis, but also related to how the host is doing. Maybe there's some intricate interplay between containers based on placement that pops up on one host but not the other? Without quick access to the actual data, you will scratch your head all day.

User Experience

What's the desirable user experience for a comprehensive monitoring solution for Docker? I think it needs to be brain-dead easy. Thanks to the API-based approach that allows us to get to all the data either locally or remotely, it should be easy to encapsulate all the monitoring data acquisition and forwarding into a container that can either run remotely, if the Docker daemons support remote access, or as a system container on every host. Depending on how the emerging orchestration solutions approach this, it might not even be too crazy to assume that the collection container could simply attach to a master daemon. It seems Docker Swarm might make this possible. Super simple, just add the URL to the collector config and go. I really like the idea of being able to do all of this through the API because now I don't need to introduce other requirements on the hosts. Do they have Syslog? JournalD? Those are of course all great tools, but as the levels of abstraction keep rising, we will be less and less able to make assumptions about the hosts. So the API-based access provides decoupling and allows for composition.

All For One

So, to be completely honest, there's a little bit more going on here on our end than just thinking about this solution. We have started to implement almost all of these ideas into a native Sumo Logic collection Source for Docker.
We are not ready to make it generally available just yet, but we will be showing it off next week at DockerCon (along with another really cool thing I am not going to talk about here). Email docker@sumologic.com to get access to a beta version of the Sumo Logic collection Source for Docker.

Blog

The Power of 5

Five years, five rounds of financing, five hundred customers already, and 500 Sumo employees down the road. And there’s another 5 hidden in this story, which you will have to puzzle out yourself. We welcome our new investors Draper Fisher Jurvetson Growth and Institutional Venture Partners, as well as Glynn Capital and Tenaya Capital. And we say thank you for the continued support of the people and the firms that have added so much value while fueling our journey: Greylock, Sutter Hill Ventures, Accel Partners, and Sequoia.

It is fair to say that we were confident in the beginning that the hypotheses on which Sumo Logic was founded are fundamentally solid. But living through the last 5 years, and seeing what the people in this company have accomplished to build on top of this foundation, is truly breathtaking and fills me with great pride. For us, the last five years have been a time of continuous scaling. And yet we managed to stay true to our vision – to make machine data useful with the best service we can possibly provide. We have become experts at using the power of the scalability that’s on tap in our backend to relentlessly crunch through data. Our customers are telling us that this again and again surfaces the right insights that help them understand their application and security infrastructures. And with our unique machine learning capabilities, we can turn outliers and anomalies into those little “tap tap tap”-on-your-shoulder moments that make the unknown known and that truly turn data into gold.

One of the (many) things that is striking to me when looking back over the last 5 years is just how much I appreciate the difference between building software and building a service. They will have to drag me back kicking and screaming to build a product as a bunch of code to be shipped to customers. That’s right, I am a recovering enterprise software developer. We had a hunch that there must be a better way, and boy were we right. Choosing to build Sumo Logic as a service was a very fundamental decision – we never wanted to ever again be in a situation in which we were unable to observe how our product was being used. As a service, we have the ultimate visibility, and seeing and synthesizing what our customers are doing continuously helps to support our growth. At the same time, we have nearly total control over the execution environment in which our product operates. This is enlightening for us as engineers because it removes the guesswork when bug reports are coming in. No longer do I have to silently suspect that maybe it is related to that old version of Solaris that the customer insists on using to run my product. And no, I don’t want to educate you anymore about which RAID level you need to run the database supporting my software, because if you don’t believe me, we are both going to be in a world of hurt 6 months down the road when everything grinds to a halt. I simply don’t want to talk anymore about you having to learn to run and administer my product. Our commitment and value is simple: let me do it for you, so you can focus on using our service and getting more value. Give us the control to run it right and all will benefit.

Obviously, we are not alone in having realized the many benefits of software as a service – SaaS. This is why the trend to convert all software to services has only started. Software is eating software, quite literally. I see it every day when we replace legacy systems. We are ourselves exclusively consuming services at Sumo Logic – we have no data center. We literally have just one Linksys router sitting alone and lonely in the corner of our office, tying the wireless access points to some fiber coming out of the floor. That’s it. Everything else is a service. We believe this is a better way to live, and we put our money where our mouth is, supporting our fellow product companies that have gone the service route. So in many ways we are all riding the same wave, the big megatrend – a trend that is based on efficiency and the possibility of doing things in a truly better way. And we have the opportunity to both be like and behave like our customers, while actually helping our customers build these great new forward-looking systems.

At Sumo Logic, we have created a purpose-built cloud analytics service that supports, and is needed by, every software development shop over the next number of years as more and more products are built on the new extreme architecture. Those who have adopted and are adopting the new way of doing things are on board already, and we are looking forward to supporting the next waves by continuing to provide the best service to monitor, troubleshoot, and proactively maintain the quality of your applications, infrastructure, and ultimately of your service. In addition, with our unique and patented machine learning analytics capabilities, we can further deliver on our vision to bring machine intelligence to the masses, whereas this was previously only available to the fortunate few.

As we scale along with the great opportunity that the massive wave of change in IT and software is bringing, we will put the money provided by our investors to the best possible use we can think of. First of all, we will continue to bring more engineers and product development talent on board. The addition of this new tech talent will continue to help us further develop our massively elastic scale platform, which has grown more than 1000X in the past few years in terms of data ingested. In fact, we are already processing 50TB of new data every day, and that number will only go up. Our own production footprint has reached a point where we would literally have to invent a product like Sumo Logic in order to keep up – thankfully, we enjoy eating our own dog food, all across the company. Except for the dogs in the office; they’d actually much rather have more human food. In any case, this service is engineering-heavy, full of challenges along many dimensions, and scale is just one of them. If you are looking for a hardcore technical challenge, let’s talk (talk@sumologic.com).

And while we continue to tweak our system and adhere to our SLAs (even for queries!), we will also massively grow the sales, G&A, marketing and customer success side of the company to bring what we believe to be the best purpose-built cloud service for monitoring modern application architectures to more and more people, and to constantly improve on our mission of maniacal customer success. What do you say? Five more years, everybody!!!

June 1, 2015

Blog

Collecting In-Container Log Files

Docker and the use of containers is spreading like wildfire. In a Docker-ized environment, certain legacy practices and approaches are being challenged. Centralized logging is one of them. The most popular way of capturing logs coming from a container is to set up the containerized process such that it logs to stdout. Docker then spools this to disk, from where it can be collected. This is great for many use cases. We have of course blogged about this multiple times already. If the topic fascinates you, also check out a presentation I did in December at the Docker NYC meetup.

At the same time, at Sumo Logic our customers are telling us that the stdout approach doesn’t always work. Not all containers are set up to follow the process-per-container model. This is sometimes referred to as “fat” containers. There are tons of opinions about whether this is the right thing to do or not. Pragmatically speaking, it is a reality for some users. Even some programs that are otherwise easily containerized as single processes pose some challenges to the stdout model. For example, popular web servers write at least two log files: access and error logs. There are of course workarounds to map this back to a single stdout stream. But ultimately there’s only so much multiplexing that can be done before the demuxing operation becomes too painful.

A Powerstrip for Logfiles

Powerstrip-Logfiles presents a proof of concept towards easily centralizing log files from within a container. Simply setting LOGS=/var/log/nginx in the container environment, for example, will use a bind mount to make the Nginx access and error logs available on the host under /var/log/container-logfiles/containers/[ID of the Nginx container]/var/log/nginx. A file-based log collector can now simply be configured to recursively collect from /var/log/container-logfiles/containers and will pick up logs from any container configured with the LOGS environment variable.

Powerstrip-Logfiles is based on the Powerstrip project by ClusterHQ, which is meant to provide a way to prototype extensions to Docker. Powerstrip is essentially a proxy for the Docker API. Prototypical extensions can hook Docker API calls and do whatever work they need to perform. The idea is to allow extensions to Docker to be composable – for example, to add support for overlay networks such as Weave and for storage managers such as Flocker.

Steps to run Powerstrip-Logfiles

Given that the Powerstrip infrastructure is meant to support prototyping of what one day will hopefully become Docker extensions, there are still a couple of steps required to get this to work.
First of all, you need to start a container that contains the powerstrip-logfiles logic:

$ docker run --privileged -it --rm \
    --name powerstrip-logfiles \
    --expose 80 \
    -v /var/log/container-logfiles:/var/log/container-logfiles \
    -v /var/run/docker.sock:/var/run/docker.sock \
    raychaser/powerstrip-logfiles:latest \
    -v --root /var/log/container-logfiles

Next you need to create a Powerstrip configuration file…

$ mkdir -p ~/powerstrip-demo
$ cat > ~/powerstrip-demo/adapters.yml <<EOF
endpoints:
  "POST /*/containers/create":
    pre: [logfiles]
    post: [logfiles]
adapters:
  logfiles: http://logfiles/v1/extension
EOF

…and then you can start the powerstrip container that acts as the Docker API proxy:

$ docker run -d --name powerstrip \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v ~/powerstrip-demo/adapters.yml:/etc/powerstrip/adapters.yml \
    --link powerstrip-logfiles:logfiles \
    -p 2375:2375 \
    clusterhq/powerstrip

Now you can use the normal docker client to run containers. First you must export the DOCKER_HOST variable to point at the powerstrip server:

$ export DOCKER_HOST=tcp://127.0.0.1:2375

Now you can specify as part of the container’s environment which paths are supposed to be considered logfile paths. Those paths will be bind-mounted to appear under the location of the --root specified when running the powerstrip-logfiles container.

$ docker run --cidfile=cid.txt --rm -e "LOGS=/x,/y" ubuntu \
    bash -c 'touch /x/foo; ls -la /x; touch /y/bar; ls -la /y'

You should now be able to see the files “foo” and “bar” under the path specified as the --root:

$ CID=$(cat cid.txt)
$ ls /var/log/container-logfiles/containers/$CID/x
$ ls /var/log/container-logfiles/containers/$CID/y

See the example in the next section on how to most easily hook up a Sumo Logic Collector.

Sending Access And Error Logs From An Nginx Container To Sumo Logic

For this example, you can just run Nginx from a toy image off of Docker Hub:

$ CID=$(DOCKER_HOST=localhost:2375 docker run -d --name nginx-example-powerstrip \
    -p 80:80 -e LOGS=/var/log/nginx \
    raychaser/powerstrip-logfiles:latest-nginx-example) && echo $CID

You should now be able to see the Nginx container’s /var under the host’s /var/log/container-logfiles/containers/$CID/:

$ ls -la /var/log/container-logfiles/containers/$CID/

And if you tail the access log from that location while hitting http://localhost you should see the hits being logged:

$ tail -F /var/log/container-logfiles/containers/$CID/var/log/nginx/access.log

Now all that’s left is to hook up a Sumo Logic collector to the /var/log/container-logfiles/containers/ directory, and all the logs will come to your Sumo Logic account:

$ docker run -v /var/log/container-logfiles:/var/log/container-logfiles -d \
    --name="sumo-logic-collector" \
    sumologic/collector:latest-powerstrip \
    [Access ID] [Access Key]

This collector is pre-configured to collect all files from /container-logfiles, which by way of the -v volume mapping in the invocation above is mapped to /var/log/container-logfiles/containers, which is where powerstrip-logfiles by default writes the logs for the in-container files. As a Sumo Logic user, it is very easy to generate the required access key by going to the Preferences page. Once the collector is running, you can search for _sourceCategory=collector-container in the Sumo Logic UI and you should see the toy Nginx logs.

Simplify using Docker Compose

And just because we can, here’s how this could all work with Docker Compose.
Docker Compose will allow us to write a single spec file that contains all the details on how the Powerstrip container, powerstrip-logfiles, and the Sumo Logic collector container are to be run. The spec is a simple YAML file:

powerstriplogfiles:
  image: raychaser/powerstrip-logfiles:latest
  ports:
    - 80
  volumes:
    - /var/log/container-logfiles:/var/log/container-logfiles
    - /var/run/docker.sock:/var/run/docker.sock
  environment:
    ROOT: /var/log/container-logfiles
    VERBOSE: true
  entrypoint:
    - node
    - index.js

powerstrip:
  image: clusterhq/powerstrip:latest
  ports:
    - "2375:2375"
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock
    - ~/powerstrip-demo/adapters.yml:/etc/powerstrip/adapters.yml
  links:
    - "powerstriplogfiles:logfiles"

sumologiccollector:
  image: sumologic/collector:latest-powerstrip
  volumes:
    - "/var/log/container-logfiles:/var/log/container-logfiles"
  env_file: .env

You can copy and paste this into a file called docker-compose.yml, or take it from the powerstrip-logfiles Github repo. Since the Sumo Logic Collector will require valid credentials to log into the service, we need to put those somewhere so Docker Compose can wire them into the container. This can be accomplished by putting them into the file .env in the same directory, something like so:

SUMO_ACCESS_ID=[Access ID]
SUMO_ACCESS_KEY=[Access Key]

This is not a great way to deal with credentials. Powerstrip in general is not production-ready, so please keep in mind to try this only outside of a production setup, and make sure to delete the access ID and access key in the Sumo Logic UI. Then simply run, in the same directory as docker-compose.yml, the following:

$ docker-compose up

This will start all three required containers and start streaming logs to Sumo Logic. Have fun!

Blog

New Docker Logging Drivers

Docker Release 1.6 introduces the notion of a logging driver. This is a very cool capability and a huge step forward in creating a comprehensive approach to logging in Docker environments. It is now possible to route container output (stdout and stderr) to syslog. It is also possible to completely suppress the writing of container output to file, which can help in situations where disk space usage is of importance. This post will also show how easy it is to integrate the syslog logging driver with Sumo Logic.

Let’s review for a second. Docker has been supporting logging of a container’s standard output and standard error streams to file for a while. You can see how this works in this quick example:

$ CID=$(docker run -d ubuntu echo "Hello")
$ echo $CID
5594248e11b7d4d40cfec4737c7e4b7577fe1e665cf033439522fbf4f9c4e2d5
$ sudo cat /var/lib/docker/containers/$CID/$CID-json.log
{"log":"Hello\n","stream":"stdout","time":"2015-03-30T00:34:58.782658342Z"}

What happened here? Our container simply outputs Hello. This output will go to the standard output of the container. By default, Docker will write the output, wrapped into JSON, into a specific file named after the container ID, in a directory under /var/lib/docker/containers named after the container ID.

Logging the Container Output to Syslog

With the new logging drivers capability, it is possible to select the logging behavior when running a container. In addition to the default json-file driver, there is now also a syslog driver supported. To see this in action, do this in one terminal window:

$ tail -F /var/log/syslog

Then, in another terminal window, do this:

$ docker run -d --log-driver=syslog ubuntu echo "Hello"

When running the container, you should see something along these lines in the tailed syslog file:

Mar 29 17:39:01 dev1 docker[116314]: 0e5b67244c00: Hello

Cool! Based on the --log-driver flag, which is set to syslog here, syslog received a message from the Docker daemon, which includes the container ID (well, the first 12 characters anyway), plus the actual output of the container. In this case, of course, the output was just a simple message. To generate more messages, something like this will do the trick:

$ docker run -t -d --log-driver=syslog ubuntu \
    /bin/bash -c 'while true; do echo "Hello $(date)"; sleep 1; done'

While still tailing the syslog file, a new log message should appear every second.

Completely Suppressing the Container Output

Notably, when the logging driver is set to syslog, Docker sends the container output only to syslog, and not to file. This helps in managing disk space. Docker’s default behavior of writing container output to file can cause pain in managing disk space on the host. If a lot of containers are running on the host, and logging to standard out and standard error is used (as recommended for containerized apps), then some sort of space management for those files has to be bolted on, or the host eventually runs out of disk space. This is obviously not great.
But now, there is also a none option for the logging driver, which will essentially dev-null the container output:

$ CID=$(docker run -d --log-driver=none ubuntu \
    /bin/bash -c 'while true; do echo "Hello"; sleep 1; done')
$ sudo cat /var/lib/docker/containers/$CID/$CID-json.log
cat: /var/lib/docker/containers/52c646fc0d284c6bbcad48d7b81132cb7ba03c04e9978244fdc4bcfcbf98c6e4/52c646fc0d284c6bbcad48d7b81132cb7ba03c04e9978244fdc4bcfcbf98c6e4-json.log: No such file or directory

However, this will also disable the Logs API, so the docker logs CLI will not work anymore, and neither will the /logs API endpoint. This means that if you are using, for example, Logspout to ship logs off the Docker host, you will still have to use the default json-file option.

Integrating the Sumo Logic Collector With the New Syslog Logging Driver

In a previous blog, we described how to use the Sumo Logic Collector images to get container logs to Sumo Logic. We have prepared an image that extends the framework developed in the previous post. You can get all the logs into Sumo Logic by running with the syslog logging driver and running the Sumo Logic Collector on the host:

$ docker run -v /var/log/syslog:/syslog -d \
    --name="sumo-logic-collector" \
    sumologic/collector:latest-logging-driver-syslog \
    [Access ID] [Access Key]

As a Sumo Logic user, it is very easy to generate the required access key by going to the Preferences page. And that’s it, folks. Select the syslog logging driver, add the Sumo Logic Collector container to your hosts, and all the container logs will go into one place for analysis and troubleshooting.

Blog

An Official Docker Image For The Sumo Logic Collector

Note: This post is now superseded by Update On Logging With Docker.

Learning By Listening, And Doing

Over the last couple of months, we have spent a lot of time learning about Docker, the distributed application delivery platform that is taking the world by storm. We have started looking into how we can best leverage Docker for our own service. And of course, we have spent a lot of time talking to our customers. We have so far learned a lot by listening to them describe how they deal with logging in a containerized environment. We have actually already re-blogged how Caleb, one of our customers, is Adding Sumo Logic To A Dockerized App. Our very own Dwayne Hoover has written about Four Ways to Collect Docker Logs in Sumo Logic.

Along the way, it has become obvious that it makes sense for us to provide an “official” image for the Sumo Logic Collector. Sumo Logic exposes an easy-to-use HTTP API, but the vast majority of our customers are leveraging our Collector software as a trusted, production-grade data collection conduit. We are and will continue to be excited about folks building their own images for their own custom purposes. Yet, the questions we get make it clear that we should release an official Sumo Logic Collector image for use in a containerized world.

Instant Gratification, With Batteries Included

A common way to integrate logging with containers is to use Syslog. This has been discussed before in various places all over the internet. If you can direct all your logs to Syslog, we now have a Sumo Logic Syslog Collector image that will get you up and running immediately:

$ docker run -d -p 514:514 -p 514:514/udp \
    --name="sumo-logic-collector" \
    sumologic/collector:latest-syslog \
    [Access ID] [Access key]

Started this way, the default Syslog port 514 is mapped to the same port on the host. To test whether everything is working well, use telnet on the host:

$ telnet localhost 514

Then type some text, hit return, then CTRL-] to close the connection, and enter quit to exit telnet. After a few moments, what you typed should show up in the Sumo Logic service. Use a search to find the message(s).

To test the UDP listener, on the host, use Netcat, along the lines of:

$ echo "I'm in ur sysloggz" | nc -v -u -w 0 localhost 514

And again, the message should show up on the Sumo Logic end when searched for.

If you want to start a container that is configured to log to syslog and make it automatically latch on to the Collector container’s exposed port, use linking:

$ docker run -it --link sumo-logic-collector:sumo ubuntu /bin/bash

From within the container, you can then talk to the Collector listening on port 514 by using the environment variables populated by the linking:

$ echo "I'm in ur linx" | nc -v -u -w 0 $SUMO_PORT_514_TCP_ADDR $SUMO_PORT_514_TCP_PORT

That’s all there is to it. The image is available from Docker Hub. Setting up an Access ID/Access Key combination is described in our online help.

Composing Collector Images From Our Base Image

Following the instructions above will get you going quickly, but of course it can’t possibly cover all the various logging scenarios that we need to support. To that end, we actually started by first creating a base image. The Syslog image extends this base image. Your future images can easily extend this base image as well. Let’s take a look at what is actually going on!
Here’s the Github repo: https://github.com/SumoLogic/sumologic-collector-docker.

One of the main things we set out to solve was to clarify how to allow creating an image that does not require customer credentials to be baked in. Having credentials in the image itself is obviously a bad idea! Putting them into the Dockerfile is even worse. The trick is to leverage a not-so-well-documented command line switch on the Collector executable to pass the Sumo Logic Access ID and Access Key combination to the Collector. Here’s the meat of the run.sh startup script referenced in the Dockerfile:

/opt/SumoCollector/collector console -- -t -i $access_id -k $access_key -n $collector_name -s $sources_json

The rest is really just grabbing the latest Collector Debian package and installing it on top of a base Ubuntu 14.04 system, invoking the start script, checking arguments, and so on. As part of our continuous delivery pipeline, we are getting ready to update the Docker Hub-hosted image every time a new Collector is released. This will ensure that when you pull the image, the latest and greatest code is available.

How To Add The Batteries Yourself

The base image is intentionally kept very sparse and essentially ships with “batteries not included”. In itself, it will not lead to a working container. This is because the Sumo Logic Collector has a variety of ways to set up the actual log collection. It supports tailing files locally and remotely, as well as pulling Windows event logs locally and remotely. Of course, it can also act as a Syslog sink. And it can do any of this in any combination at the same time. Therefore, the Collector is either configured manually via the Sumo Logic UI, or (and this is almost always the better way) via a configuration file. The configuration file, however, is something that will change from use case to use case and from customer to customer. Baking it into a generic image simply makes no sense.

What we did instead is to provide a set of examples. These can be found in the same Github repository under “example”: https://github.com/SumoLogic/sumologic-collector-docker/tree/master/example. There are a couple of sumo-source.json example files illustrating, respectively, how to set up file collection, and how to set up Syslog UDP and Syslog TCP collection. The idea is to allow you to either take one of the example files verbatim, or use one as a starting point for your own sumo-sources.json. Then, you can build a custom image using our image as a base image. To make this more concrete, create a new folder and put this Dockerfile in there:

FROM sumologic/collector
MAINTAINER Happy Sumo Customer
ADD sumo-sources.json /etc/sumo-sources.json

Then, put a sumo-sources.json into the same folder, groomed to fit your use case. Then build the image and enjoy.

A Full Example

Using this approach, if you want to collect files from various containers, mount a directory on the host to the Sumo Logic Collector container. Then mount the same host directory to all the containers that use file logging. In each container, set up logging to log into a subdirectory of the mounted log directory. Finally, configure the Collector to just pull it all in. The Sumo Logic Collector has for years been used across our customer base in production for pulling logs from files. More often than not, the Collector is pulling from a deep hierarchy of files on some NAS mount or equivalent.
The Collector is quite adept and battle-tested at dealing with file-based collection. Let’s say the logs directory on the host is called /tmp/clogs. Before setting up the source configuration accordingly, make a new directory for the files describing the image. Call it, for example, sumo-file. Into this directory, put this Dockerfile:

FROM sumologic/collector
MAINTAINER Happy Sumo Customer
ADD sumo-sources.json /etc/sumo-sources.json

The Dockerfile extends the base image, as discussed. Next to the Dockerfile, in the same directory, there needs to be a file called sumo-sources.json which contains the configuration:

{
  "api.version": "v1",
  "sources": [
    {
      "sourceType": "LocalFile",
      "name": "localfile-collector-container",
      "pathExpression": "/tmp/clogs/**",
      "multilineProcessingEnabled": false,
      "automaticDateParsing": true,
      "forceTimeZone": false,
      "category": "collector-container"
    }
  ]
}

With this in place, build the image, and run it:

$ docker run -d -v /tmp/clogs:/tmp/clogs --name="sumo-logic-collector" \
    [image name] [your Access ID] [your Access key]

Finally, add -v /tmp/clogs:/tmp/clogs when running other containers that are configured to log to /tmp/clogs in order for the Collector to pick up the files.

Just like the ready-to-go syslog image we described in the beginning, a canonical image for file collection is available. See the source: https://github.com/SumoLogic/sumologic-collector-docker/tree/master/file.

$ docker run -v /tmp/clogs:/tmp/clogs -d --name="sumo-logic-collector" \
    sumologic/collector:latest-file [Access ID] [Access key]

If you want to learn more about using JSON to configure sources to collect logs with the Sumo Logic Collector, there is a help page with all the options spelled out.

That’s all for today. We have more coming. Watch this space. And yes, comments are very welcome.

Blog

Shifting Into Overdrive

How Our Journey Began

Four years ago, my co-founder Kumar and I were just two guys who called coffee shops our office space. We had seen Werner Vogels' AWS vision pitch at Stanford and imagined a world of Yottabyte scale where machine learning algorithms could make sense of it all. We dreamed of becoming the first and only native cloud analytics platform for machine-generated data and next-gen apps, and we dreamed that we would attract and empower customers. We imagined the day when we’d get our 500th customer. After years of troubleshooting scale limitations with on-premises enterprise software deployments, we bet our life savings that multi-tenant cloud apps could scale to the infinite data scales that were just a few years away.

Eclipsing Our First Goal

Just a few weeks ago, we added our 500th enterprise customer in just over two years since Sumo Logic’s inception. As software developers, the most gratifying part of our job is when customers use and love our software. This past month has been the most gratifying part of the journey so far as I’ve travelled around the world meeting with dozens of happy customers. At each city, I’m blown away by the impact that Sumo Logic has on our customers’ mission-critical applications. Our code works, our customers love our software and our business is taking off faster than we could have imagined.

Momentum Is Kicking In

Our gratitude for our customers only grows when we dig through the stats of what we’ve been able to build together with our world-class investors and team of 170+ Sumos. Just last quarter alone, we exceeded expectations with:

- 100%+ quarter-over-quarter ACV growth
- 100+ new customer logos
- 12 new 1 Terabyte/day customers
- 1 quadrillion new logs indexed
- Dozens of new Sumos bringing badass skills from companies like Google, Atlassian, Microsoft, Akamai and even VMware…

Shifting Into Overdrive

It is still early days, and we have a tireless road of building ahead of us. Big data is approaching a $20B per year industry. And, we’re addressing machine data, which is growing 5X faster than any other segment of data. No company has built a platform for machine data that approaches our scale in the cloud:

- 1 million events ingested per second
- 8 petabytes scanned per day
- 1 million queries processed per day

Today, we’re excited to share the news that Ramin Sayar will be joining us to lead Sumo Logic as our new president and CEO. With 20 years of industry experience, he has a proven track record for remarkable leadership, incubating and growing significant new and emerging businesses within leading companies. He comes to us from VMware, where he was Sr. Vice President and General Manager of the Cloud Management Business Unit. In his time at VMware, he developed the product and business strategy and led the fastest growing business unit. He was responsible for the industry-leading Cloud Management business and strategy, R&D, operating P&L, product management, product marketing and field/business operations for VMware’s cloud management offerings.

Our mission remains the same: to enable businesses to harness the power of machine data to improve their operations and deliver outstanding customer experience. With our current momentum and Ramin’s leadership, I am extremely excited about the next chapter in Sumo Logic’s journey. Please know how grateful we are to you, our customers, partners, and investors, for your belief in us and for the privilege to innovate on your behalf every day.

December 2, 2014

Blog

Meatballs And Flying Tacos Don't Make a Cloud

Yes, we are cloud and proud. Puppies, ponies, rainbows, unicorns. We got them all. But the cloud is not a personal choice for us at Sumo Logic. It is an imperative. An imperative to build a better product, for happier customers.

We strongly believe that if designed correctly, there is no need to fragment your product into many different pieces, each with different functional and performance characteristics that confuse decision-makers. We have built the Sumo Logic platform from the very beginning with a mindset of scalability. Sumo Logic is a service that is designed to appeal and adapt to many use cases. This explains why in just three short years we have been successful in a variety of enterprise accounts across three continents: because - first and foremost - our product scales.

On the surface, scale is all about the big numbers. We got Big Data, thank you. So do our customers, and we scale to the level required by enterprise customers. Yet scaling doesn't only mean scaling up by sizes of data sets. Scaling also means being able to scale back, to get out of the way, and to provide value to everyone, including those customers that might not have terabytes of data to deal with. Our Sumo Free offering has proven that our approach to scaling is holistic - one product for everyone. No hard decisions to be made now, and no hard decisions to be made later. Just do it and get value.

Another compelling advantage of our multi-tenant, one-service approach is that we can very finely adjust to the amount of data and processing required by every customer, all the time. Elasticity is key, because it enables agility. Agile is the way of business today. Why would anyone want to get tied into a fixed-price license, and on top of that permanently provision large amounts of compute and storage resources upfront, just to buy insurance for those days of the year when business spikes, or, God forbid, a black swan walks into the lobby? Sumo Logic is the cure for anti-agility in the machine data analytics space. As a customer, you get all the power you need, when you need it, without having to pay for it when you don't.

Finally, Sumo Logic scales insight. With our recently announced anomaly detection capability, you can now rely on the army of squirrels housed in our infrastructure to generate and vet millions of hypotheses about potential problems on your behalf. Only the most highly correlated anomalies survive this rigorous process, meaning you get actionable insight into potential infrastructure issues for free. You will notice repetitive events and be able to annotate them precisely and improve your operational processes. Even better - you will be able to share documented anomalous events with, and consume them back from, the Sumo Logic community. What scales to six billion humans? Sumo Logic does.

One more thing: as a cloud-native company, we have also scaled the product development process, to release more features, more improvements, and yes, more bug fixes than any incumbent vendor. Sumo Logic runs at the speed of now, and new stuff rolls out on a weekly basis. Tired of waiting for a year to get issues addressed? Tired of then having to provision an IT project just to update the monitoring infrastructure? Scared of how that same issue will apply even if the vendor "hosts" the software for you? We can help.

Sumo Logic scales, along all dimensions. You like scale? Come on over. Oh, and thanks for the date, Praveen. I'll let you take the check.

October 2, 2013

Blog

Me at the End of the World