At Sumo Logic, we strongly believe in using our own service, a practice sometimes called “dogfooding.” The primary reason is that Sumo Logic is a great fit for our environment. We run a mix of on-premise and cloud appliances, services, and applications for which we need troubleshooting and monitoring capabilities:
- Our service, the Sumo Logic Log Management and Analytics Service, a distributed, complex SaaS application
- A heterogeneous, on-premise office network
- Our development infrastructure, which lives in Amazon Web Services (AWS)
- Our website
In short: We are like many other companies out there, with a mix of needs and use cases for the Sumo Logic service. In this post, I’ll explain how we’ve deployed our Sumo Logic Collectors in our environment. Collectors are small software components that gather logs from various sources and send them to the Sumo Logic Cloud in a reliable and secure fashion. They can be deployed in a variety of ways. This post provides some real-world examples.
The Sumo Logic Service
Our service is deployed across a large number of servers that work in concert to accept logs, store them in NoSQL databases, index them, manage retention, and provide all the search and analytics functionality in the service. Any interaction with our system is almost guaranteed to touch more than one of these machines. As a result, debugging and monitoring based on log files alone would be impractical, bordering on impossible.
The solution to this, of course, is the Sumo Logic service. After deciding that we wanted our own service to monitor itself, we weighed several different deployment options:
- A centralized Collector, pulling logs via SSH
- A centralized Collector, receiving logs via syslog
- Collectors on each machine in the deployment, reading local files
In the end, we went with the third option: both our test and production environments run Sumo Logic Collectors on each machine. The primary motivator for this choice was that it was best for testing the service: running a larger number of Collectors is more likely to surface issues.
This decision made it a priority to build automatic deployment features into the Collector, which is why our Collectors can now be deployed both by hand and in a scripted fashion. Every time we deploy the service, a script installs and registers the Collector. Using the JSON configuration, we configure collection of files from our own application, third-party applications, and system logs.
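To make the scripted setup more concrete, here is a minimal sketch of how a deployment script might generate the JSON configuration for local file sources. The field names (`api.version`, `sourceType`, `pathExpression`, `category`) reflect my understanding of the Collector's JSON source format and should be treated as assumptions to check against the current documentation. The source names, paths, and categories are hypothetical, not our actual setup.

```python
import json


def build_sources_config(sources):
    """Build a Collector sources document as a JSON string.

    `sources` is a list of (name, path_expression, category) tuples.
    The field names used here are assumptions about the Collector's
    JSON source format, not a verified schema.
    """
    document = {
        "api.version": "v1",
        "sources": [
            {
                "sourceType": "LocalFile",
                "name": name,
                "pathExpression": path_expression,
                "category": category,
            }
            for name, path_expression, category in sources
        ],
    }
    return json.dumps(document, indent=2)


# Hypothetical sources covering the three kinds of logs mentioned above:
# our own application, a third-party application, and system logs.
SOURCES = [
    ("service-logs", "/var/log/myservice/*.log", "prod/service"),
    ("third-party-logs", "/var/log/thirdparty/*.log", "prod/thirdparty"),
    ("system-logs", "/var/log/syslog", "prod/os"),
]

config_json = build_sources_config(SOURCES)
print(config_json)
```

A deployment script could write this output to a file and point the Collector installer at it, so every machine in a deployment registers with the same set of sources.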
So there you have it – the Sumo Logic service monitors itself.
Sumo Logic’s main office is located in downtown Mountain View, on Castro Street. As strong proponents of cloud-based technologies, we’ve made an effort to keep the amount of physical infrastructure inside our office to a minimum. But there are some items any office needs, including ours:
- Robust Internet connectivity (we run a load balanced setup with two ISPs)
- Network infrastructure (we have switches, WiFi access points, firewalls)
- Storage for workstation backups (we have a small NAS)
- Phones (we just deployed some IP phones)
- Security devices (we run an IDS with taps into multiple points in our network, and a web proxy)
- DHCP/DNS/Directory services (we run a Mac server with Open Directory)
Some of these devices log to files, others to syslog, while yet others are Windows machines. Whenever a new device is added to the network, we make sure logs are being collected from it. This has been instrumental in debugging tricky WiFi issues (Castro Street is littered with tech companies running WiFi), figuring out login issues, troubleshooting Time Machine problems with our NAS, and many other use cases.
Our bug tracker, CI cluster and other development infrastructure live in AWS. In order to monitor this infrastructure, we run Sumo Logic Collectors on all nodes, picking up system log files, web server logs, and application logs from the various commercial and open source tools we run. We use these logs for troubleshooting and to monitor trends.
The web server on our public-facing website logs lots of interesting information about visitors and how they interact with the site. Of course, we couldn’t resist dropping in a Collector to pick up these log files. We use a scheduled query that runs hourly and tells us who signed up for demo and trial accounts.
We eat our own dog food and derive a lot of value from our own service, using it to troubleshoot and monitor all of our infrastructure, both on-premise and in the cloud. The Collector’s ability to collect from a rich set of common source types (files, syslog, Windows), together with its automatic, scripted installation, makes it very easy to add new logs to the Sumo Logic Cloud.
– Stefan Zier, Cloud Infrastructure Architect