Christian Beedgen

As co-founder and CTO of Sumo Logic, Christian Beedgen brings 18 years of experience creating industry-leading enterprise software products. Since 2010 he has been focused on building Sumo Logic's multi-tenant, cloud-native machine data analytics platform, which is widely used today by more than 1,600 customers and 50,000 users. Prior to Sumo Logic, Christian was an early engineer, engineering director and chief architect at ArcSight, contributing to ArcSight's SIEM and log management solutions.

Posts by Christian Beedgen

Blog

Sumo Logic Recognized as Data Analytics Solution of the Year Showcasing the Power of Continuous Intelligence

Blog

Love In The Time Of Coronavirus

Blog

How We Understand Monitoring

Blog

All The Logs For All The Intelligence

Blog

Service Levels––I Want To Buy A Vowel

Blog

See You in September at Illuminate!

Blog

The Super Bowl of the Cloud

Blog

Platforms All The Way Up & Down

Blog

Microservices for Startups Explained

Blog

Christian's Musings from Web Summit 2017

I was able to attend my third Web Summit last week. This is the second time for me in Lisbon, as I was lucky enough to be invited to talk on the Binate.io stage again after last year. If you are interested, check out my musings on instinct, intuition, experience and data analytics. Web Summit has grown tremendously since I first attended the Dublin incarnation in 2013. This year, the event was sold out at 60,000 attendees (!) - the Portuguese came out in force, but it was very clear that this event, while of course drawing most attendees from all across Europe, is ultimately an international affair as well. With so many people attending, Web Summit can be rather overwhelming. There is a bit of everything, and an incredible crowd of curious people. Lisbon is a fantastically beautiful city, mostly off the beaten path when it comes to tech conferences, so the local folks are really coming out in force to take in the spectacle.

So, what is Web Summit? Originally started in Dublin in 2009, it has over the years become a massive endeavor highlighting every conceivable aspect of technology. There are four massive conference halls with multiple stages for speakers and podium discussions in each hall.

[Photo: Christian Beedgen on the binate.io stage - Web Summit, Lisbon 2017]

Then there is the main arena holding 20,000 people; this is where the most high-profile keynote speakers hit the stage. Web Summit has always brought in government officials and politicians to the show as well in an effort to promote technology. I was actually standing next to Nigel Farage at the speaker cloak room waiting for my coat. There was another guy there as well who was already berating this unfortunate character, so thankfully I didn't have to do it myself.

I managed to catch a couple of the keynotes in the aforementioned large arena. Three of them left an impression. Firstly, it was great to see Max Tegmark speak. I am reading his current book, Life 3.0, right now, and it is always a bit of a trip when the author suddenly appears on a stage and you realize you have to throw away your mental image of that voice in your head that has been speaking to you from the pages of the book and adopt reality. In this case however, this was not a negative, as Max came across as both deeply knowledgeable and quite relaxed. He looked a bit like he plays in the Ramones with his black leather jacket and black jeans; this I didn't see coming. In any case, I highly recommend checking out what he has to say. In light of the current almost bombastically overblown hype around AI, he is taking a very pragmatic view, based on many years of his own research. If you can imagine a future of "beneficial AI", check out his book, Life 3.0, for why and how we have a chance to get there.

I was also impressed by Margrethe Vestager. She is a Danish politician and currently the European Commissioner for Competition. She captured the audience by simply speaking off of a couple of cue cards, not PowerPoint slides at all. Being a politician, she was casting a very official appearance, of course - but she wore some sick sneakers to a conservative dress, which I thought was just awesome. Gotta love the Danish! Her talk centered around the reasoning behind the anti-trust investigation she brought against Google (which eventually led to a $2.7 billion fine!). The details are too complicated to be reasonably summarized here, but they essentially centered around the fact that while nobody in the EU has issues with Google's near-monopoly on search, in the eyes of the competition watchdogs, Google using this position to favor its own products in search results creates intolerable fairness issues for other companies. It is very interesting to see how these views are developing outside of the US.

The third and last memorable session had animated AI robots dialoguing with their inventor, plus Einstein, Artificial General Intelligence, distributed AI and models, and the blockchain. It was by and large only missing Taylor Swift. SingularityNET is a new effort to create an open, free and decentralized marketplace for AI technology, enabled by Smart Contracts. I frankly don't have the slightest clue how that would work, but presenter Ben Goertzel was animatedly excited about the project. The case for needing an AI marketplace for narrow AIs to compose more general intelligences was laid out in a strenuous "discussion" with "lifelike" robots from Hanson Robotics. It is lost on me why everybody thinks they need to co-opt Einstein; first Salesforce calls their machine learning features Einstein, now these robotics guys have an Einstein robot on stage. I guess the path to the future requires still more detours to the past. I guess Einstein can't fight back on this anymore, and at least they are picking an exceptional individual...

Now that I am back in the US for only a day, the techno-optimism that's pervasive at Web Summit feels like a distant memory already.

November 14, 2017

Blog

Machine Data for the Masses

Blog

Sumo Logic CTO Christian Beedgen Talks Unified Logs and Metrics

Blog

Update On Logging With Docker

A Simpler & Better Way

In New Docker Logging Drivers, I previously described how to use the new Syslog logging driver introduced in Docker 1.6 to transport container logs to Sumo Logic. Since then, there have been improvements to the Syslog logging driver, which now allows users to specify the address of the Syslog server to send the logs to. In its initial release the Syslog logging driver simply logged to the local Syslog daemon, but this is now configurable. We can exploit this in conjunction with the Sumo Logic Collector container for Syslog to make logging with Docker and Sumo Logic even easier.

Simply run the Syslog Collector container as previously described:

$ docker run -d -p 514:514 -p 514:514/udp \
    --name="sumo-logic-collector" \
    sumologic/collector:latest-syslog \
    [Access ID] [Access key]

Now you have a collector running, listening for Syslog on both ports 514/tcp and 514/udp. For every container required to run on the same host, you can now add the following to the Docker run command in order to make the container log to your Syslog collector:

--log-driver syslog --log-opt syslog-address=udp://localhost:514

Or, in a complete example:

$ docker run --rm --name test \
    --log-driver syslog --log-opt syslog-address=udp://localhost:514 \
    ubuntu \
    bash -c 'for i in `seq 1 10`; do echo Hello $i; sleep 1; done'

You should now see something along these lines in Sumo Logic. This, of course, works remotely as well. You can run the Sumo Logic Collector on one host, and have containers on all other hosts log to it by setting the syslog address accordingly when running the container.

And Here Is An Errata

In New Docker Logging Drivers, I described the newly added logging drivers in Docker 1.6. At the time, Docker was only able to log to local syslog, and hence our recommendation for integration was as follows:

$ docker run -v /var/log/syslog:/syslog -d \
    --name="sumo-logic-collector" \
    sumologic/collector:latest-logging-driver-syslog \
    [Access ID] [Access Key]

This will basically have the Sumo Logic Collector tail the OS /var/log/syslog file. We discovered in the meantime that this will cause issues if /var/log/syslog is being logrotate'd. The container will hang on to the original file into which Syslog initially wrote the messages, and not pick up the new file after the old file was moved out of the way.

There's a simple solution to the issue: mount the directory into the container, not the file. In other words, please do this:

$ docker pull sumologic/collector:latest-logging-driver-syslog
$ docker run -v /var/log:/syslog -d \
    --name="sumo-logic-collector" \
    sumologic/collector:latest-logging-driver-syslog \
    [Access ID] [Access Key]

Or, of course, switch to the new and improved approach described above!
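
One more convenience worth noting: instead of repeating the --log-driver flags for every single container, newer Docker releases also allow a default logging driver to be set on the daemon itself, so every container on the host logs to the collector unless it explicitly overrides the driver. A minimal sketch, assuming a Docker version that supports a daemon-level default log driver; in practice these flags would of course live in the daemon's init configuration rather than on an interactive command line:

$ docker daemon \
    --log-driver syslog \
    --log-opt syslog-address=udp://localhost:514
# containers started against this daemon now use the syslog driver by default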

Blog

Comprehensive Monitoring For Docker - More Than "Just" Logs

Today I am happy to be able to talk about something that has been spooking around in my head for the last six months or so. I've been thinking about this ever since we started looking into Docker and how it applies to what we are doing here at Sumo. There are many different and totally valid ways to get logs and statistics out of Docker. Options are great, but I have concluded that the ultimate goal should be a solution that doesn't require users to have in-depth knowledge about all the things that are available for monitoring and the various methods to get to them. Instead, I want something that just pulls all the monitoring data out of the containers and Docker daemons with minimal user effort. In my head, I have been calling this "a comprehensive solution".

Let me introduce you to the components that I think need to be part of a comprehensive monitoring solution for Docker:

- Docker events, to track container lifecycles
- Configuration info on containers
- Logs, naturally
- Statistics on the host and the containers
- Other host stuff (daemon logs, host logs, ...)

Events

Let's start with events. The Docker API makes it trivial to subscribe to the event stream. Events contain lots of interesting information. The full list is well described in the Docker API doc, but let's just say you can watch containers come and go, as well as observe containers getting killed, and other interesting stuff, such as out-of-memory situations. Docker has consistently added new events with every version, so this is a gift that will keep on giving in the future. I think of Docker events as nothing but logs. And they are very nicely structured - it's all just JSON. If, for example, I can load this into my log aggregation solution, I can now track which container is running where. I can also track trends - for example, which images are run in the first place, and how often they are being run. Or, why are suddenly 10x more containers started in this period vs. before, and so on. This probably doesn't matter much for personal development, but once you have fleets, this is a super juicy source of insight. Lifecycle tracking for all your containers will matter a lot.

Configurations

Docker events, among other things, allow us to see containers come and go. What if we also wanted to track the configurations of those containers? Maybe we want to track drift of run parameters, such as volume settings, or capabilities and limits. The container image is immutable, but what about the invocation? Having detailed records of container starting configurations is, in my mind, another piece of the puzzle towards total visibility. Orchestration solutions will provide those settings, sure, but who is telling those solutions what to do? From our own experience, we know that deployment configurations are inevitably going to drift, and we have found the root cause of otherwise inscrutable problems there more than once. Docker allows us to use the inspect API to get the container configuration. Again, in my mental model, that's just a log. Send it to your aggregator. Alert on deviations, use the data after the fact for troubleshooting. Docker provides this info in a clean and convenient format.
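
To make the "it's all just logs" point concrete, here is a minimal sketch of pulling the event stream and a container's configuration straight off the Docker remote API with curl against the local socket. This assumes a curl build with --unix-socket support, and "my-container" is just a placeholder name:

# stream lifecycle events (create, start, die, oom, ...) as JSON, one document per event
$ curl -s --unix-socket /var/run/docker.sock http://localhost/events

# dump the full configuration of a container - image, volumes, limits, env - as JSON
$ curl -s --unix-socket /var/run/docker.sock http://localhost/containers/my-container/json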

Logs

Well, obviously, it would be great to have logs, right? Turns out there are many different ways to deal with logs in Docker, and new options are being enabled by the new log driver API. Not everybody is quite there yet in 12-factor land, but then again there are workarounds for when you need fat containers and you need to collect logs from files inside of containers. More and more I see people following the best practice of writing logs to standard out and standard error, and it is pretty straightforward to grab those logs from the logs API and forward them from there. The Logspout approach, for example, is really neat. It uses the event API to watch which containers get started, then turns around and attaches to the log endpoint, and then pumps the logs somewhere. Easy and complete, and you have all the logs in one place for troubleshooting, analytics, and alerting.

Stats

Since the release of Docker 1.5, container-level statistics are exposed via a new API. Now you can alert on the "throttled_data" information, for example - how about that? Again (and at this point, this is getting repetitive, perhaps), this data should be sucked into a centralized system. Ideally, this is the same system that already has the events, the configurations, and the logs! Logs can be correlated with the metrics and events. Now, this is how I think we are getting to a comprehensive solution. There are many pieces to the puzzle, but all of this data can be extracted from Docker pretty easily today already. I am sure as we all keep learning more about this it will get even easier and more efficient.
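
The stats endpoint can be tapped in exactly the same way; a small sketch under the same assumptions as the snippet above (the stream=false switch, where the Docker version supports it, returns a single snapshot instead of a continuous stream):

# CPU, memory, network and blkio counters for one container
$ curl -s --unix-socket /var/run/docker.sock \
    "http://localhost/containers/my-container/stats?stream=false"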

Host Stuff

In all the excitement around APIs for monitoring data, let's not forget that we also need to have host-level visibility. A comprehensive solution should therefore also work hard to get the Docker daemon logs, and provide a way to get any other system-level logs that factor into the way Docker is being put to use on the hosts of the fleet. Add host-level statistics to this, and now performance issues can be understood in a holistic fashion - on a container basis, but also related to how the host is doing. Maybe there's some intricate interplay between containers based on placement that pops up on one host but not the other? Without quick access to the actual data, you will scratch your head all day.

User Experience

What's the desirable user experience for a comprehensive monitoring solution for Docker? I think it needs to be brain-dead easy. Thanks to the API-based approach that allows us to get to all the data either locally or remotely, it should be easy to encapsulate all the monitoring data acquisition and forwarding into a container that can either run remotely, if the Docker daemons support remote access, or as a system container on every host. Depending on how the emerging orchestration solutions approach this, it might not even be too crazy to assume that the collection container could simply attach to a master daemon. It seems Swarm might make this possible. Super simple: just add the URL to the collector config and go. I really like the idea of being able to do all of this through the API, because now I don't need to introduce other requirements on the hosts. Do they have Syslog? JournalD? Those are of course all great tools, but as the levels of abstraction keep rising, we will be less and less able to make assumptions about the hosts. So the API-based access provides decoupling and allows for composition.

All For One

So, to be completely honest, there's a little bit more going on here on our end than just thinking about this solution. We have started to implement almost all of these ideas into a native Sumo Logic collection Source for Docker. We are not ready to make it generally available just yet, but we will be showing it off next week at DockerCon (along with another really cool thing I am not going to talk about here). Email docker@sumologic.com to get access to a beta version of the Sumo Logic collection Source for Docker.

Blog

The Power of 5

Five years, five rounds of financing, five hundred customers already and 500 Sumo employees down the road. And there’s another 5 hidden in this story which you will have to puzzle out yourself. We welcome our new investors Draper Fisher Jurvetson Growth and Institutional Venture Partners, as well as Glynn Capital and Tenaya Capital. And we say thank you for the continued support of the people and the firms that have added so much value while fueling our journey: Greylock, Sutter Hill Ventures, Accel Partners, and Sequoia. It is fair to say that we were confident in the beginning that the hypotheses on which Sumo Logic was founded are fundamentally solid. But living through the last 5 years, and seeing what the people in this company have accomplished to build on top of this foundation is truly breathtaking and fills me with great pride. For us, the last five years have been a time of continuous scaling. And yet we managed to stay true to our vision – to make machine data useful with the best service we can possibly provide. We have become experts at using the power of the scalability that’s on tap in our backend to relentlessly crunch through data. Our customers are telling us that this again and again surfaces the right insights that help them understand their application and security infrastructures. And with our unique machine learning capabilities, we can turn outliers and anomalies into those little “tap tap tap”-on-your-shoulder moments that make the unknown known and that truly turn data into gold. One of the (many) things that is striking to me when looking back over the last 5 years is just how much I appreciate the difference between building software and building a service. They will have to drag me back kicking and screaming to build a product as a bunch of code to be shipped to customers. That’s right, I am a recovering enterprise software developer. We had a hunch that there must be a better way, and boy were we right. Choosing to build Sumo Logic as a service was a very fundamental decision – we never wanted to ever again be in a situation in which we were unable to observe how our product was being used. As a service, we have the ultimate visibility, and seeing and synthesizing what our customers are doing continuously helps to support our growth. At the same time, we have nearly total control over the execution environment in which our product operates. This is enlightening for us as engineers because it removes the guesswork when bug reports are coming in. No longer do I have to silently suspect that maybe it is related to that old version of Solaris that the customer insists on using to run my product on. And no, I don’t want to educate you which RAID level you need to run the database supporting my software on anymore, because if you don’t believe me, we are both going to be in a world of hurt 6 months down the road when everything grinds to a halt. I simply don’t want to talk anymore about you having to learn to run and administer my product. Our commitment and value is simple: let me do it for you, so you can focus on using our service and getting more value. Give us the control to run it right and all will benefit. Obviously, we are not alone in having realized the many benefits of software as a service – SaaS. This is why the trend to convert all software to services has only started. Software is eating software, quite literally. I see it every day when we replace legacy systems. We are ourselves exclusively consuming services at Sumo Logic – we have no data center. 
We literally have just one Linksys router sitting alone and lonely in the corner of our office, tying the wireless access points to some fiber coming out of the floor. That's it. Everything else is a service. We believe this is a better way to live, and we put our money where our mouth is, supporting our fellow product companies that have gone the service route. So in many ways we are all riding the same wave, the big mega trend - a trend that is based on efficiency and a possibility of doing things in a truly better way. And we have the opportunity to both be like and behave like our customers, while actually helping our customers build these great new forward-looking systems. At Sumo Logic, we have created a purpose-built cloud analytics service that supports, and is needed by, every software development shop over the next number of years as more and more products are built on the new extreme architecture. Those who have adopted and are adopting the new way of doing things are on board already, and we are looking forward to supporting the next waves by continuing to provide the best service to monitor, troubleshoot, and proactively maintain the quality of your applications, infrastructure, and ultimately of your service. In addition, with our unique and patented machine learning analytics capabilities we can further deliver on our vision to bring machine intelligence to the masses, whereas this was previously only available to the fortunate few.

As we scale along with the great opportunity that the massive wave of change in IT and software is bringing, we will put the money provided by our investors to the best possible use we can think of. First of all, we will continue to bring more engineers and product development talent on board. The addition of this new tech talent will continue to help us further develop our massive elastic scale platform, which has grown more than 1000X in the past few years in terms of data ingested. In fact, we are already processing 50TB of new data every day, and that number will only go up. Our own production footprint has reached a point where we would literally have to invent a product like Sumo Logic in order to keep up – thankfully, we enjoy eating our dog food, all across the company. Except for the dogs in the office, they'd actually much rather have more human food. In any case, this service is engineering heavy, full of challenges along many dimensions, and scale is just one of them. If you are looking for a hardcore technical challenge, let's talk (talk@sumologic.com). And while we continue to tweak our system and adhere to our SLAs (even for queries!), we will also massively grow the sales, G&A, marketing and customer success side of the company to bring what we believe to be the best purpose-built cloud service for monitoring modern application architectures to more and more people, and to constantly improve on our mission of maniacal customer success. What do you say? Five more years, everybody!!!

June 1, 2015

Blog

Collecting In-Container Log Files

Docker and the use of containers is spreading like wildfire. In a Docker-ized environment, certain legacy practices and approaches are being challenged. Centralized logging is one of them. The most popular way of capturing logs coming from a container is to set up the containerized process such that it logs to stdout. Docker then spools this to disk, from where it can be collected. This is great for many use cases. We have of course blogged about this multiple times already. If the topic fascinates you, also check out a presentation I did in December at the Docker NYC meetup.

At the same time, at Sumo Logic our customers are telling us that the stdout approach doesn't always work. Not all containers are set up to follow the process-per-container model. This is sometimes referred to as "fat" containers. There are tons of opinions about whether this is the right thing to do or not. Pragmatically speaking, it is a reality for some users. Even some programs that are otherwise easily containerized as single processes pose some challenges to the stdout model. For example, popular web servers write at least two log files: access and error logs. There are of course workarounds to map this back to a single stdout stream. But ultimately there's only so much multiplexing that can be done before the demuxing operation becomes too painful.

A Powerstrip for Logfiles

Powerstrip-Logfiles presents a proof of concept towards easily centralizing log files from within a container. Simply setting LOGS=/var/log/nginx in the container environment, for example, will use a bind mount to make the Nginx access and error logs available on the host under /var/log/container-logfiles/containers/[ID of the Nginx container]/var/log/nginx. A file-based log collector can now simply be configured to recursively collect from /var/log/container-logfiles/containers and will pick up logs from any container configured with the LOGS environment variable.
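
For illustration, such a file Source could look roughly like the following sumo-sources.json; this is a sketch only, reusing the configuration format shown further down this page for the official Collector images, and the source name and category are arbitrary placeholders:

$ cat > sumo-sources.json <<EOF
{
  "api.version": "v1",
  "sources": [
    {
      "sourceType": "LocalFile",
      "name": "container-logfiles",
      "pathExpression": "/var/log/container-logfiles/containers/**",
      "category": "container-logfiles"
    }
  ]
}
EOF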
Powerstrip-Logfiles is based on the Powerstrip project by ClusterHQ, which is meant to provide a way to prototype extensions to Docker. Powerstrip is essentially a proxy for the Docker API. Prototypical extensions can hook Docker API calls and do whatever work they need to perform. The idea is to allow extensions to Docker to be composable – for example, to add support for overlay networks such as Weave and for storage managers such as Flocker.

Steps to run Powerstrip-Logfiles

Given that the Powerstrip infrastructure is meant to support prototyping of what one day will hopefully become Docker extensions, there's still a couple of steps required to get this to work. First of all, you need to start a container that contains the powerstrip-logfiles logic:

$ docker run --privileged -it --rm \
    --name powerstrip-logfiles \
    --expose 80 \
    -v /var/log/container-logfiles:/var/log/container-logfiles \
    -v /var/run/docker.sock:/var/run/docker.sock \
    raychaser/powerstrip-logfiles:latest \
    -v --root /var/log/container-logfiles

Next you need to create a Powerstrip configuration file…

$ mkdir -p ~/powerstrip-demo
$ cat > ~/powerstrip-demo/adapters.yml <<EOF
endpoints:
  "POST /*/containers/create":
    pre: [logfiles]
    post: [logfiles]
adapters:
  logfiles: http://logfiles/v1/extension
EOF

…and then you can start the powerstrip container that acts as the Docker API proxy:

$ docker run -d --name powerstrip \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v ~/powerstrip-demo/adapters.yml:/etc/powerstrip/adapters.yml \
    --link powerstrip-logfiles:logfiles \
    -p 2375:2375 \
    clusterhq/powerstrip

Now you can use the normal docker client to run containers. First you must export the DOCKER_HOST variable to point at the powerstrip server:

$ export DOCKER_HOST=tcp://127.0.0.1:2375

Now you can specify as part of the container's environment which paths are supposed to be considered logfile paths. Those paths will be bind-mounted to appear under the location of the --root specified when running the powerstrip-logfiles container.

$ docker run --cidfile=cid.txt --rm -e "LOGS=/x,/y" ubuntu \
    bash -c 'touch /x/foo; ls -la /x; touch /y/bar; ls -la /y'

You should now be able to see the files "foo" and "bar" under the path specified as the --root:

$ CID=$(cat cid.txt)
$ ls /var/log/container-logfiles/containers/$CID/x
$ ls /var/log/container-logfiles/containers/$CID/y

See the example in the next section on how to most easily hook up a Sumo Logic Collector.

Sending Access And Error Logs From An Nginx Container To Sumo Logic

For this example, you can just run Nginx from a toy image off of Docker Hub:

$ CID=$(DOCKER_HOST=localhost:2375 docker run -d --name nginx-example-powerstrip \
    -p 80:80 -e LOGS=/var/log/nginx \
    raychaser/powerstrip-logfiles:latest-nginx-example) && echo $CID

You should now be able to see the Nginx container's /var under the host's /var/log/container-logfiles/containers/$CID/:

$ ls -la /var/log/container-logfiles/containers/$CID/

And if you tail the access log from that location while hitting http://localhost you should see the hits being logged:

$ tail -F /var/log/container-logfiles/containers/$CID/var/log/nginx/access.log

Now all that's left is to hook up a Sumo Logic collector to the /var/log/container-logfiles/containers/ directory, and all the logs will come to your Sumo Logic account:

$ docker run -v /var/log/container-logfiles:/var/log/container-logfiles -d \
    --name="sumo-logic-collector" \
    sumologic/collector:latest-powerstrip \
    [Access ID] [Access Key]

This collector is pre-configured to collect all files from /var/log/container-logfiles, which by way of the -v volume mapping in the invocation above is mapped to the same path on the host, which is where powerstrip-logfiles by default writes the logs for the in-container files. As a Sumo Logic user, it is very easy to generate the required access key by going to the Preferences page. Once the collector is running, you can search for _sourceCategory=collector-container in the Sumo Logic UI and you should see the toy Nginx logs.

Simplify using Docker Compose

And just because we can, here's how this could all work with Docker Compose. Docker Compose will allow us to write a single spec file that contains all the details on how the Powerstrip container, powerstrip-logfiles, and the Sumo Logic collector container are to be run. The spec is a simple YAML file:

powerstriplogfiles:
  image: raychaser/powerstrip-logfiles:latest
  ports:
    - 80
  volumes:
    - /var/log/container-logfiles:/var/log/container-logfiles
    - /var/run/docker.sock:/var/run/docker.sock
  environment:
    ROOT: /var/log/container-logfiles
    VERBOSE: true
  entrypoint:
    - node
    - index.js

powerstrip:
  image: clusterhq/powerstrip:latest
  ports:
    - "2375:2375"
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock
    - ~/powerstrip-demo/adapters.yml:/etc/powerstrip/adapters.yml
  links:
    - "powerstriplogfiles:logfiles"

sumologiccollector:
  image: sumologic/collector:latest-powerstrip
  volumes:
    - "/var/log/container-logfiles:/var/log/container-logfiles"
  env_file: .env

You can copy and paste this into a file called docker-compose.yml, or take it from the powerstrip-logfiles Github repo. Since the Sumo Logic Collector will require valid credentials to log into the service, we need to put those somewhere so Docker Compose can wire them into the container. This can be accomplished by putting them into the file .env in the same directory, something like so:

SUMO_ACCESS_ID=[Access ID]
SUMO_ACCESS_KEY=[Access Key]

This is not a great way to deal with credentials. Powerstrip in general is not production ready, so please keep in mind to try this only outside of a production setup, and make sure to delete the access ID and access key in the Sumo Logic UI afterwards. Then simply run, in the same directory as docker-compose.yml, the following:

$ docker-compose up

This will start all three required containers and start streaming logs to Sumo Logic. Have fun!

Blog

New Docker Logging Drivers

Docker Release 1.6 introduces the notion of a logging driver. This is a very cool capability and a huge step forward in creating a comprehensive approach to logging in Docker environments. It is now possible to route container output (stdout and stderr) to syslog. It is also possible to completely suppress the writing of container output to file, which can help in situations where disk space usage is of importance. This post will also show how easy it is to integrate the syslog logging driver with Sumo Logic.

Let's review for a second. Docker has been supporting logging of a container's standard output and standard error streams to file for a while. You can see how this works in this quick example:

$ CID=$(docker run -d ubuntu echo "Hello")
$ echo $CID
5594248e11b7d4d40cfec4737c7e4b7577fe1e665cf033439522fbf4f9c4e2d5
$ sudo cat /var/lib/docker/containers/$CID/$CID-json.log
{"log":"Hello\n","stream":"stdout","time":"2015-03-30T00:34:58.782658342Z"}

What happened here? Our container simply outputs Hello. This output will go to the standard output of the container. By default, Docker will write the output wrapped into JSON into a specific file named after the container ID, in a directory under /var/lib/docker/containers named after the container ID.

Logging the Container Output to Syslog

With the new logging drivers capability, it is possible to select the logging behavior when running a container. In addition to the default json-file driver, there is now also a syslog driver supported. To see this in action, do this in one terminal window:

$ tail -F /var/log/syslog

Then, in another terminal window, do this:

$ docker run -d --log-driver=syslog ubuntu echo "Hello"

When running the container, you should see something along these lines in the tailed syslog file:

Mar 29 17:39:01 dev1 docker[116314]: 0e5b67244c00: Hello

Cool! Based on the --log-driver flag, which is set to syslog here, syslog received a message from the Docker daemon, which includes the container ID (well, the first 12 characters anyways), plus the actual output of the container. In this case of course, the output was just a simple message. To generate more messages, something like this will do the trick:

$ docker run -t -d --log-driver=syslog ubuntu \
    /bin/bash -c 'while true; do echo "Hello $(date)"; sleep 1; done'

While still tailing the syslog file, a new log message should appear every second.
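
As a quick sanity check, the driver a given container ended up with can be read back from its inspect output; a small sketch, with the template path quoted from memory, so treat it as an assumption:

$ docker inspect -f '{{ .HostConfig.LogConfig.Type }}' [container name or ID]
# prints json-file, syslog, or none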

Completely Suppressing the Container Output

Notably, when the logging driver is set to syslog, Docker sends the container output only to syslog, and not to file. This helps in managing disk space. Docker's default behavior of writing container output to file can cause pain in managing disk space on the host. If a lot of containers are running on the host, and logging to standard out and standard error is used (as recommended for containerized apps), then some sort of space management for those files has to be bolted on, or the host eventually runs out of disk space. This is obviously not great. But now, there is also a none option for the logging driver, which will essentially dev-null the container output:

$ CID=$(docker run -d --log-driver=none ubuntu \
    /bin/bash -c 'while true; do echo "Hello"; sleep 1; done')
$ sudo cat /var/lib/docker/containers/$CID/$CID-json.log
cat: /var/lib/docker/containers/52c646fc0d284c6bbcad48d7b81132cb7ba03c04e9978244fdc4bcfcbf98c6e4/52c646fc0d284c6bbcad48d7b81132cb7ba03c04e9978244fdc4bcfcbf98c6e4-json.log: No such file or directory

However, this will also disable the Logs API, so the docker logs CLI will not work anymore, and neither will the /logs API endpoint. This means that if you are using, for example, Logspout to ship logs off the Docker host, you will still have to use the default json-file option.

Integrating the Sumo Logic Collector With the New Syslog Logging Driver

In a previous blog, we described how to use the Sumo Logic Collector images to get container logs to Sumo Logic. We have prepared an image that extends the framework developed in the previous post. You can get all the logs into Sumo Logic by running with the syslog logging driver and running the Sumo Logic Collector on the host:

$ docker run -v /var/log/syslog:/syslog -d \
    --name="sumo-logic-collector" \
    sumologic/collector:latest-logging-driver-syslog \
    [Access ID] [Access Key]

As a Sumo Logic user, it is very easy to generate the required access key by going to the Preferences page. And that's it, folks. Select the syslog logging driver, add the Sumo Logic Collector container to your hosts, and all the container logs will go into one place for analysis and troubleshooting.

Blog

An Official Docker Image For The Sumo Logic Collector

Note: This post is now superseded by Update On Logging With Docker.

Learning By Listening, And Doing

Over the last couple of months, we have spent a lot of time learning about Docker, the distributed application delivery platform that is taking the world by storm. We have started looking into how we can best leverage Docker for our own service. And of course, we have spent a lot of time talking to our customers. We have so far learned a lot by listening to them describe how they deal with logging in a containerized environment. We actually have already re-blogged how Caleb, one of our customers, is Adding Sumo Logic To A Dockerized App. Our very own Dwayne Hoover has written about Four Ways to Collect Docker Logs in Sumo Logic.

Along the way, it has become obvious that it makes sense for us to provide an "official" image for the Sumo Collector. Sumo Logic exposes an easy-to-use HTTP API, but the vast majority of our customers are leveraging our Collector software as a trusted, production-grade data collection conduit. We are and will continue to be excited about folks building their own images for their own custom purposes. Yet, the questions we get make it clear that we should release an official Sumo Logic Collector image for use in a containerized world.

Instant Gratification, With Batteries Included

A common way to integrate logging with containers is to use Syslog. This has been discussed before in various places all over the internet. If you can direct all your logs to Syslog, we now have a Sumo Logic Syslog Collector image that will get you up and running immediately:

docker run -d -p 514:514 -p 514:514/udp --name="sumo-logic-collector" sumologic/collector:latest-syslog [Access ID] [Access key]

Started this way, the default Syslog port 514 is mapped to the same port on the host. To test whether everything is working well, use telnet on the host:

telnet localhost 514

Then type some text, hit return, and then CTRL-] to close the connection, and enter quit to exit telnet. After a few moments, what you typed should show up in the Sumo Logic service. Use a search to find the message(s). To test the UDP listener, on the host, use Netcat, along the lines of:

echo "I'm in ur sysloggz" | nc -v -u -w 0 localhost 514

And again, the message should show up on the Sumo Logic end when searched for. If you want to start a container that is configured to log to syslog and make it automatically latch on to the Collector container's exposed port, use linking:

docker run -it --link sumo-logic-collector:sumo ubuntu /bin/bash

From within the container, you can then talk to the Collector listening on port 514 by using the environment variables populated by the linking:

echo "I'm in ur linx" | nc -v -u -w 0 $SUMO_PORT_514_TCP_ADDR $SUMO_PORT_514_TCP_PORT

That's all there is to it. The image is available from Docker Hub. Setting up an Access ID/Access Key combination is described in our online help.

Composing Collector Images From Our Base Image

Following the instructions above will get you going quickly, but of course it can't possibly cover all the various logging scenarios that we need to support. To that end, we actually started by first creating a base image. The Syslog image extends this base image. Your future images can easily extend this base image as well. Let's take a look at what is actually going on! Here's the Github repo: https://github.com/SumoLogic/sumologic-collector-docker.

One of the main things we set out to solve was to clarify how to allow creating an image that does not require customer credentials to be baked in. Having credentials in the image itself is obviously a bad idea! Putting them into the Dockerfile is even worse. The trick is to leverage a not-so-well documented command line switch on the Collector executable to pass the Sumo Logic Access ID and Access Key combination to the Collector. Here's the meat of the run.sh startup script referenced in the Dockerfile:

/opt/SumoCollector/collector console -- -t -i $access_id -k $access_key -n $collector_name -s $sources_json

The rest is really just grabbing the latest Collector Debian package and installing it on top of a base Ubuntu 14.04 system, invoking the start script, checking arguments, and so on. As part of our continuous delivery pipeline, we are getting ready to update the Docker Hub-hosted image every time a new Collector is released. This will ensure that when you pull the image, the latest and greatest code is available.

How To Add The Batteries Yourself

The base image is intentionally kept very sparse and essentially ships with "batteries not included". In itself, it will not lead to a working container. This is because the Sumo Logic Collector has a variety of ways to set up the actual log collection. It supports tailing files locally and remotely, as well as pulling Windows event logs locally and remotely. Of course, it can also act as a Syslog sink. And, it can do any of this in any combination at the same time. Therefore, the Collector is either configured manually via the Sumo Logic UI, or (and this is almost always the better way) via a configuration file. The configuration file however is something that will change from use case to use case and from customer to customer. Baking it into a generic image simply makes no sense.

What we did instead is to provide a set of examples. These can be found in the same Github repository under "example": https://github.com/SumoLogic/sumologic-collector-docker/tree/master/example. There's a couple of sumo-sources.json example files illustrating, respectively, how to set up file collection, and how to set up Syslog UDP and Syslog TCP collection. The idea is to allow you to either take one of the example files verbatim, or as a starting point for your own sumo-sources.json. Then, you can build a custom image using our image as a base image. To make this more concrete, create a new folder and put this Dockerfile in there:

FROM sumologic/collector
MAINTAINER Happy Sumo Customer
ADD sumo-sources.json /etc/sumo-sources.json

Then, put a sumo-sources.json into the same folder, groomed to fit your use case. Then build the image and enjoy.

A Full Example

Using this approach, if you want to collect files from various containers, mount a directory on the host to the Sumo Logic Collector container. Then mount the same host directory to all the containers that use file logging. In each container, set up logging to log into a subdirectory of the mounted log directory. Finally, configure the Collector to just pull it all in. The Sumo Logic Collector has for years been used across our customer base in production for pulling logs from files. More often than not, the Collector is pulling from a deep hierarchy of files on some NAS mount or equivalent. The Collector is quite adept and battle-tested at dealing with file-based collection.

Let's say the logs directory on the host is called /tmp/clogs. Before setting up the source configuration accordingly, make a new directory for the files describing the image. Call it for example sumo-file. Into this directory, put this Dockerfile:

FROM sumologic/collector
MAINTAINER Happy Sumo Customer
ADD sumo-sources.json /etc/sumo-sources.json

The Dockerfile extends the base image, as discussed. Next to the Dockerfile, in the same directory, there needs to be a file called sumo-sources.json which contains the configuration:

{
  "api.version": "v1",
  "sources": [
    {
      "sourceType": "LocalFile",
      "name": "localfile-collector-container",
      "pathExpression": "/tmp/clogs/**",
      "multilineProcessingEnabled": false,
      "automaticDateParsing": true,
      "forceTimeZone": false,
      "category": "collector-container"
    }
  ]
}

With this in place, build the image, and run it:

docker run -v /tmp/clogs:/tmp/clogs -d --name="sumo-logic-collector" [image name] [your Access ID] [your Access key]

Finally, add -v /tmp/clogs:/tmp/clogs when running other containers that are configured to log to /tmp/clogs in order for the Collector to pick up the files. Just like the ready-to-go syslog image we described in the beginning, a canonical image for file collection is available. See the source: https://github.com/SumoLogic/sumologic-collector-docker/tree/master/file.

docker run -v /tmp/clogs:/tmp/clogs -d --name="sumo-logic-collector" sumologic/collector:latest-file [Access ID] [Access key]

If you want to learn more about using JSON to configure sources to collect logs with the Sumo Logic Collector, there is a help page with all the options spelled out. That's all for today. We have more coming. Watch this space. And yes, comments are very welcome.

Blog

Shifting Into Overdrive

How Our Journey Began

Four years ago, my co-founder Kumar and I were just two guys who called coffee shops our office space. We had seen Werner Vogels' AWS vision pitch at Stanford and imagined a world of Yottabyte scale where machine learning algorithms could make sense of it all. We dreamed of becoming the first and only native cloud analytics platform for machine generated data and next gen apps, and we dreamed that we would attract and empower customers. We imagined the day when we'd get our 500th customer. After years of troubleshooting scale limitations with on-premises enterprise software deployments, we bet our life savings that multi-tenant cloud apps could scale to the infinite data scales that were just a few years away.

Eclipsing Our First Goal

Just a few weeks ago, we added our 500th enterprise customer in just over two years since Sumo Logic's inception. As software developers, the most gratifying part of our job is when customers use and love our software. This past month has been the most gratifying part of the journey so far as I've travelled around the world meeting with dozens of happy customers. At each city, I'm blown away by the impact that Sumo Logic has on our customers' mission critical applications. Our code works, our customers love our software and our business is taking off faster than we could have imagined.

Momentum Is Kicking In

Our gratitude for our customers only grows when we dig through the stats of what we've been able to build together with our world class investors and team of 170+ Sumos. Just last quarter alone, we exceeded expectations with:

- 100%+ quarter-over-quarter ACV growth
- 100+ new customer logos
- 12 new 1 Terabyte/day customers
- 1 quadrillion new logs indexed
- Dozens of new Sumos bringing badass skills from companies like Google, Atlassian, Microsoft, Akamai and even VMware…

Shifting Into Overdrive

It is still early days, and we have a tireless road of building ahead of us. Big data is approaching a $20B per year industry. And, we're addressing machine data, which is growing 5X faster than any other segment of data. No company has built a platform for machine data that approaches our scale in the cloud:

- 1 million events ingested per second
- 8 petabytes scanned per day
- 1 million queries processed per day

Today, we're excited to share the news that Ramin Sayar will be joining us to lead Sumo Logic as our new president and CEO. With 20 years of industry experience, he has a proven track record of remarkable leadership, incubating and growing significant new and emerging businesses within leading companies. He comes to us from VMware, where he was Sr. Vice President and General Manager of the Cloud Management Business Unit. In his time at VMware, he developed the product and business strategy and led the fastest growing business unit. He was responsible for the industry-leading Cloud Management business and strategy, R&D, operating P&L, product management, product marketing and field/business operations for VMware's Cloud Management offerings.

Our mission remains the same: to enable businesses to harness the power of machine data to improve their operations and deliver outstanding customer experience. With our current momentum and Ramin's leadership, I am extremely excited about the next chapter in Sumo Logic's journey. Please know how grateful we are to you, our customers, partners, and investors, for your belief in us and for the privilege to innovate on your behalf every day.

December 2, 2014

Blog

Meatballs And Flying Tacos Don't Make a Cloud

Yes, we are cloud and proud. Puppies, ponies, rainbows, unicorns. We got them all. But the cloud is not a personal choice for us at Sumo Logic. It is an imperative. An imperative to build a better product, for happier customers. We strongly believe that if designed correctly, there is no need to fragment your product into many different pieces, each with different functional and performance characteristics that confuse decision-makers. We have built the Sumo Logic platform from the very beginning with a mindset of scalability. Sumo Logic is a service that is designed to appeal and adapt to many use cases. This explains why in just three short years we have been successful in a variety of enterprise accounts across three continents: because - first and foremost - our product scales.

On the surface, scale is all about the big numbers. We got Big Data, thank you. So do our customers, and we scale to the level required by enterprise customers. Yet, scaling doesn't mean scaling up by sizes of data sets. Scaling also means being able to scale back, to get out of the way, and provide value to everyone, including those customers that might not have terabytes of data to deal with. Our Sumo Free offering has proven that our approach to scaling is holistic - one product for everyone. No hard decisions to be made now, and no hard decisions to be made later. Just do it and get value.

Another compelling advantage of our multi-tenant, one-service approach is that we can very finely adjust to the amount of data and processing required by every customer, all the time. Elasticity is key, because it enables agility. Agile is the way of business today. Why would anyone want to get themselves tied into a fixed-price license, and on top of that provision large amounts of compute and storage resources permanently upfront, just to buy insurance for those days of the year when business spikes, or, God forbid, a black swan walks into the lobby? Sumo Logic is the cure for anti-agility in the machine data analytics space. As a customer, you get all the power you need, when you need it, without having to pay for it when you don't.

Finally, Sumo Logic scales insight. With our recently announced anomaly detection capability, you can now rely on the army of squirrels housed in our infrastructure to generate and vet millions of hypotheses about potential problems on your behalf. Only the most highly correlated anomalies survive this rigorous process, meaning you get actionable insight into potential infrastructure issues for free. You will notice repetitive events and be able to annotate them precisely and improve your operational processes. Even better - you will be able to share documented anomalous events with and consume them back from the Sumo Logic community. What scales to six billion humans? Sumo Logic does.

One more thing: as a cloud-native company, we have also scaled the product development process, to release more features, more improvements, and yes, more bug fixes than any incumbent vendor. Sumo Logic runs at the time of now, and new stuff rolls out on a weekly basis. Tired of waiting for a year to get issues addressed? Tired of then having to provision an IT project to just update the monitoring infrastructure? Scared of how that same issue will apply even if the vendor "hosts" the software for you? We can help. Sumo Logic scales, along all dimensions. You like scale? Come on over. Oh, and thanks for the date, Praveen. I'll let you take the check.

October 2, 2013

Blog

Me at the End of the World

Blog

The Precursor Legacy

Blog

It's a culture thing - Devopsdays Austin 2012

Stefan and I attended Devopsdays last week in Austin. It was a great event, and I am really glad we went -- it’s always fun to be able to present your company to the public. We are very comfortable with the development and operations crowd, because it is largely at the core of what we are doing ourselves. There’s not a whole lot of abstractions to overcome! Sumo Logic sponsored the event, and so we had a little table set up in the “vendor” area. There, as well as throughout the conference, we had many interesting discussions, about our product, but also about the larger theme of the conference. We gave away a lot of T-Shirts, and it turns out that the little Sumo toys we had initially made for the company birthday two weeks ago are a great giveaway. This is the first time we came equipped with swag, and it came across well. As topical as Log Analytics and Application Management are for the crowd attending, it’s still fun to see them all smile at little toys of big naked men! Maybe my single most favorite moment of the entire conference was when the discussion turned to hiring. We are still struggling with a recovering economy and uncomfortably high unemployment numbers in this country, so it was notable that when the room was asked who’s hiring, pretty much all hands went up. Wow. Then somebody yelled out, “Hey, who needs a job”? And all hands went down. Not a single person in the room was looking for a job. In the words of @wickett on Twitter: “No recession in DevOps world”. One of the things I personally find fascinating is to observe the formation of trends, communities, maybe even cultures. It is not often that one has the luck to be around when something new is getting born. I was personally lucky to be, albeit somewhat from afar, observing the early days of the Ruby On Rails community, having attended the first conference in Chicago (and then some more in the following years). Rails never really mattered in my day job, and I ultimately was just a bystander. But even so, seeing the thought process in the community evolve was extremely interesting. I feel a little bit similar about the Devops development (pun!!). I actually was attending that mythical gathering in Mountain View in 2010. But at the time, I was more worried about getting Sumo Logic company off the ground, so I actually didn’t pay attention :) I was trying to listen in a bit more closely this time. A good overall summary of where Devops has come from -- and what its main motivational forces are today -- is available in a recent post by John Willis. John also presented the keynote kicking off the Austin event. This was a very interesting talk, as it was laying out the basic principles behind Devops as seen through the eyes of one of the main players in the movement. Based on the keynote, here’s Devops in 5 keywords (buzzwords?): Culture - Lean - Automation - Measurement - Sharing. In that order. This leads to the following insight: Devops is a human problem -- it’s a problem of culture, and it’s the cultural aspects that need to be addressed first, before even thinking about the other four principles. In other words, as great as tools such as Puppet, Chef, Github, and yes, Sumo Logic are, they can’t in themselves change a culture that is based on segregation. Or, simply put: as long as you have (process and cultural) walls between development and operations, operations and security, and security and development, you end up with people that say No. And that’s basically the end of agility. 
And this leads to something that surprised me (I guess I am a bit late to the party, but hey): I am sensing that Devops is really about the desire on the side of the operations folks to apply the learnings of Agile Development. I consider this as a good thing. We are building more and more software that runs as a service, and so it’s pretty obvious that Agile needs to extend from the construction process into the deployment process (and along the way destroy the distinction). I do think that the Agile approach has won in the development world. It still needs to be applied properly however (see for example “Flaccid Scrum”), and I am sure overeager managers will cause more than one spectacular failure for Devops projects by misunderstanding the process/tools vs culture priorities. And since we are in 2012, Agile rears its head in one of its newer incarnations in this context: Lean - see above, right after Culture. Given that the name “Devops” is still hotly discussed, maybe we will end up with a new label before too long: LeanOps, anyone? It was also great to see teams within larger companies making the leap - the best example is National Instruments (also the host of the event), who have managed to get more agile by adopting a Devops approach (see also this presentation). So in summary, this event was great fun. A lot of real people with real problems, applying real forward thinking. I felt the crowd was leaning more towards Ops vs Dev, but as I said above, at least in the context of the systems we are building here at Sumo Logic, this distinction has long been jettisoned. And of course, people need tools. In all our discussions, the ability to manage and analyze the logs of production systems has stood out as a key contributor in allowing teams to troubleshoot and find the root causes of issues in the applications faster, and to manage their applications and infrastructure more proactively such that they can find and fix problems before they impact the customer. Finally, in an act of shameless self promotion, here’s yours truly being interviewed by Barton George from Dell during the event.

Blog

Sumo Logic turns 2

Today, we find ourselves in the exceptionally fortunate situation of being able to celebrate the second birthday of Sumo Logic. Companies obviously don’t get created on a single day, but from the beginning, Kumar and I always thought of March 29, 2010 as the real beginning of this company’s life. On that day two years ago, we agreed with Asheem Chandna on the terms under which Greylock Partners would invest in our vision as part of a Series A. This really is the singular point in time at which Sumo Logic became Sumo Logic. Well, technically, it was another couple of days, as we raced to actually incorporate the company as part of closing the financing 🙂 – And yes, we did get the term sheet at the Starbucks on Sand Hill Road, nervously sipping drip. Life really can work that way.

With the company officially having been born, we set out to find an office. There was never any question between Kumar and myself that we had to find something on Castro St. in Mountain View. We both live close by, and during the process of getting the company off the ground we had come to enjoy the coffee culture established by the well-known coffee shops Red Rock and Dana St. We also like to eat, sometimes too much, and again Mountain View’s modest city center provides over 40 restaurants on Castro St. alone. And yes, they have Racer 5 on tap at Xhan’s, at least most of the time, and that’s clearly important as well, at least for yours truly. And finally, there’s a Caltrain station. As silly as that sounds to me in particular, having used nothing but public transport where I grew up, this really is important to enable all the people we want to work with to actually get to work. We found a nice spot on top of the legendary Book Buyer’s store, whose owner greeted us with a bow and a hearty “Namaste” when we first ran into him. I think Kumar still hasn’t digested that one.

When two techies start a company, it is important to counterbalance their natural tendencies with a strong personality on the product side, and we were again very lucky in finding Bruno, who has been our founding VP of Product and Strategy since inception. And Sumo Logic isn’t Bruno’s only baby – days after he started, his wife gave birth to twins. We have no idea whether Bruno has slept since then 🙂 – On the technical side, we quickly ramped up hiring, reaching back to friends we had worked with before, as well as a whole new set of people who have become friends quickly. Domingo, our VP of Engineering, completed the executive team when he joined in March 2011, and quickly started applying his experience towards realizing our goal of agile development and continuous delivery.

After spending many hours talking to prospects and building the foundation of the underlying scalable architecture, we managed to get a couple of potential customers onto a beta version of the Sumo Logic service in May 2011. The feedback from these early folks has been tremendously useful, and with their generous help, we continued pushing towards a public release of the Sumo Logic service, which happened in January 2012. We have since seen the first checks, written by people who are using Sumo Logic in production, where it solves real problems for them! The very first such check is pinned to the wall of the kitchen in our office, along with a calculation of how many bottles of Pliny the Elder we could buy with it. Let me assure you, it’s a borderline unhealthy amount! 🙂

While working on getting the service ready for public launch, two more important things happened. Firstly, we outgrew our initial office and had to go looking for new digs. If you are familiar with the commercial real-estate situation in Mountain View, you will shudder to think of this. But luckily, we found our current building, lovingly dubbed the Sumo Towers, at 605 Castro St., allowing us to stay put in downtown Mountain View. We managed to give the place a whole new coat of paint on the inside, after blowing out every wall we could possibly take away without the whole building collapsing, so it’s a really open and cozy place. The building also has a great retro look from the sixties, with this weird metal shielding applied to the outside of the second floor, apparently to shield the poor engineers living there from the sunlight. I still have a silly grin on my face when walking by on the way back from dinner at Shabu Way, when it’s dark and the second floor of the building is lit, with Sumos hacking away.

The other major company milestone was raising more money as part of a Series B investment. We literally prepared this for months, with major help by way of Asheem’s incredibly useful feedback (pro tip: when you need to build an investor deck, get someone who’s an investor to help you, and _listen_ to them). The result of all the preparation was that we went from first investor pitch to handshake term sheet in only 54 hours. I am still at times scraping parts of my mind off the wall when thinking about this. The interest that our little company got from investors was just unbelievable, and deeply satisfying at the same time. We are extremely blessed to have Mike Speiser from Sutter Hill Ventures lead the Series B and subsequently join our board.

So here we are, two years in. The company that was named after a little rescue dog once found shivering in the night box of the Monterey shelter has entered the public spotlight and is not a puppy anymore by a long shot. As excited as I am about having been part of the story so far, I can barely contain myself when thinking of the stuff we have in the hopper, both as far as the product is concerned and when thinking of the amazing people we are talking to who are planning to join the company soon. Who knows what’s going to happen in the next two years – all I know is that we are going to ride it hard and honest all the way, creating value for our customers and having fun building our company into a successful business with an enviable culture.

March 29, 2012

Blog

Log Data is Big Data

Blog

Log Management Challenges: So Much Pain, Not Enough Gain

True fact: unstructured data not only represents the average enterprise’s largest data set, it is also growing at a mind-boggling rate, which presents significant problems. Unstructured data, almost by definition, is not readily available to be analyzed.

Log management addresses a significant subset of this expanding pile of unstructured data: the diagnostic and run-time log information produced by applications, servers, and devices. Think of these logs as IT’s exhaust. Since these data sets are massive and unwieldy, organizations often opt to avoid them altogether; those who do use them are typically forced to implement and support a costly legacy log management solution. Those who choose not to leverage them are missing out on a significant opportunity to drive their IT and security operations to excellence. After all, it is hard to know what is really going on in an organization without looking at all the evidence, which is precisely the purpose logs serve. No other enterprise data set records the realities across all infrastructure components with the same level of precision and detail.

Over the past ten years, a number of vendors have emerged, each claiming to have finally solved the cumbersome, costly log management conundrum. Yet customers continue to face increasingly high annual licensing fees and hidden costs to manage and configure these systems, along with mounting hardware and storage requirements. An honest TCO calculation should include not just the cost of the annual software license and a realistic assessment of the unavoidable associated maintenance and support charges, but also the inevitable internal staffing required to install and configure these systems, since maintaining a log management system is a complex and personnel-intensive proposition (a rough, purely hypothetical sketch of such a calculation follows at the end of this post).

As we speak with companies, we’re discovering that in most cases, IT is spending innumerable hours just keeping their commercial systems running in the first place. Today’s log management solutions are complex deployments mixing software with hardware appliances. They require hands-on tuning of RDBMS back ends that don’t scale past one server. And finally, they exhibit an extreme hunger for expensive enterprise-class storage. Even so, they often fail to scale to the entire log data set, requiring complex arrangements for splitting the data over multiple systems, which leads to additional management cost.

In the end, the system that is meant to monitor the actual IT systems becomes an un-monitorable behemoth itself. “Hooray for recursion,” we say, biting our tails. For the IT practitioner, however, this is not a pretty picture.
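To make the TCO argument concrete, here is a rough back-of-the-envelope sketch in Python. Every line item and dollar figure is a made-up placeholder for illustration, not real pricing from any vendor; the point is simply that maintenance, hardware, and staffing together can easily exceed the license fee that usually anchors the budgeting discussion.

```python
# Hypothetical back-of-the-envelope TCO sketch for a self-managed log
# management deployment. All line items and dollar figures below are
# made-up placeholders for illustration only; plug in your own numbers.

ANNUAL_LICENSE = 100_000         # assumed annual software license fee
MAINTENANCE_RATE = 0.20          # assumed support/maintenance as a share of the license
HARDWARE_AND_STORAGE = 60_000    # assumed servers plus enterprise-class storage per year
ADMIN_FTE_FRACTION = 0.5         # assumed fraction of an engineer's time spent on care and feeding
FULLY_LOADED_ENGINEER = 180_000  # assumed fully loaded annual cost of that engineer


def annual_tco() -> float:
    """Sum the visible and the hidden annual costs."""
    maintenance = ANNUAL_LICENSE * MAINTENANCE_RATE
    staffing = ADMIN_FTE_FRACTION * FULLY_LOADED_ENGINEER
    return ANNUAL_LICENSE + maintenance + HARDWARE_AND_STORAGE + staffing


if __name__ == "__main__":
    print(f"Estimated annual TCO: ${annual_tco():,.0f}")
    # With the placeholder numbers above, roughly $270,000 per year --
    # of which only the license line tends to show up in the original budget.
```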