05.29.2013 | Posted by Amanda Saso, Sr. Tech Writer
Have you ever put your cell phone through the wash? Personally, I’ve done it. Twice. What did I learn, finally? To always double-check where I put my iPhone before I turn on the washing machine. It’s a very real and painful threat that I’ve learned to proactively manage by using a process with a low rate of failure. But, from time to time, other foreign objects slip through, like a lipstick, my kids’ crayon, or a blob of Silly Putty: things that are cheaper than an iPhone yet create havoc in the dryer. Clothes are stained, the dryer drum is a mess, and my schedule is thrown completely off while I try to remember my grandmother’s instructions for removing red lipstick from a white shirt.
What do low-tech laundry woes have to do with Sumo Logic’s big data solution? Well, I see LogReduce as a tool that helps fortify your organization against known problems (for which you have processes in place) while guarding against unknown threats that may cause huge headaches and massive clean-ups.
When you think about it, a small but messy threat that you don’t know you need to look for is a nightmare. These days we’re dealing with an unbelievable quantity of machine data that may not be human-readable, meaning that a proverbial ChapStick in the pocket could be lurking right under your nose. LogReduce takes the “noise” out of that data so you can see those hidden threats, problems, or issues that could otherwise take a lot of time to resolve.
Say you’re running a generic search for a broad area of your deployment, such as billing errors, user creations, or logins. Whatever the search may be, it returns thousands and thousands of pages of results. So, you could spend your workday slogging through messages, hoping to find the real problem, or you can simply click LogReduce. Those results are logically sorted into signatures: groups of messages that contain similar or relevant information. Then, you can teach Sumo Logic which messages are more important, and what data you just don’t need to see again. That translates into unknown problems averted.
Of course your team has processes in place to prevent certain events. How do you guard against the unknown? LogReduce can help you catch a blip before it turns into a rogue wave. Oh, and if you ever put Silly Putty through the washer and dryer, a good dose of Goo Gone will do the trick.
05.21.2013 | Posted by Sebastian Mies, Software Engineer
While I was wrapping up my Ph.D. thesis, my girlfriend (now wife) and I decided that we wanted to leave Germany to live and work in a different country. Prior to my Ph.D., I started off in computer gaming (ported “Turrican 2” to the PC when I was a kid). Following that, I did my MSCS and Ph.D. in distributed systems and computer networks in Karlsruhe, Germany.
I have been working as a Software Engineer at Sumo Logic since October 2012. At first I was skeptical about how intellectually engaging and challenging a commercial venture in log management could be. However, after working at Sumo Logic for more than 6 months, I have to admit that I misjudged the academic and engineering challenges of log management.
Why? I underestimated the problem and potential!
In contrast to academia, where algorithms are tested under controlled and reproducible conditions, we face the full force of unexpected behaviors of a live system here at Sumo Logic. When we turn algorithms into reality, we are responsible for the entire development process, including planning, testing, and implementing the finished component in a production environment.
No other company is approaching Big Data-scale log management like Sumo Logic. As a main differentiator, Sumo Logic offers enterprise-class log file processing in the Cloud. Sumo Logic ingests terabytes per day of unstructured log files that need to be processed in real time. In contrast to websites or other content, log files need exact processing; e.g., a needle in the haystack of logs may consist of merely 16 characters (out of the terabytes of data ingested and stored). Thus, there are only a few heuristics we can use to increase efficiency. This makes developing new algorithms to process log data challenging and interesting.
Furthermore, all our databases need to answer queries in a timely manner. Databases with unpredictable latencies on certain queries are not suitable for the problems we are solving. We mix and match open source technologies and in-house customized solutions for that reason.
In addition, our customers trust us with information of vital importance to them. Security concerns influence design decisions across many levels, ranging from operating system level for full hard drive encryption, to application level for role-based access control (RBAC). We have to carefully select algorithms to balance performance (encrypted log files can challenge the efficient use of our cloud resources) while continuing to isolate customers, so that one customer’s demands don’t impact the performance of another.
In summary, I am glad I took the opportunity and joined Sumo Logic to turn my academic research into solutions used by customers to process TBs of their critical data in real time. This experience has brought self-improvement with each challenge, full-stack knowledge, and a sense of engineering not possible in any other environment.
And, by the way, we are hiring.
05.09.2013 | Posted by Joan Pepin, Director of Security
Pharmacy networks, electronic medical records, third-party billing, referrals: the medical establishment in this country runs on shared data. To ensure the safety and proper use of all of this highly sensitive and widely shared information, the US Congress passed the Health Insurance Portability and Accountability Act of 1996 (HIPAA). This law has changed the way healthcare-related businesses operate inside the United States, and has had wide-reaching and expensive effects on every aspect of the healthcare industry.
There is no central certification authority for HIPAA, and the onus is on individual medical providers to ensure they are compliant with all of the appropriate “rules” within the act. HIPAA, while affording important protection, is a complex and cumbersome regulation with potentially severe civil and criminal penalties for violation. As such, compliance with the act is of utmost importance to “covered entities” (largely, billing providers, employer-sponsored health plans, health insurers, and medical service providers, including doctors’ offices and pharmacies), who must ensure that any service provider they do business with is compliant if there is any chance that “Protected Health Information” is involved.
In order to provide our cutting-edge log management and analytics platform to these businesses, we need to assure them that Sumo Logic can be trusted to handle this highly sensitive information in a secure and compliant manner. To accomplish this, Sumo Logic has undergone an extensive examination by a well-respected Certified Public Accounting firm, which determined that Sumo Logic’s information security program “incorporates the essential elements of the HIPAA final security rule, including but not limited to administrative, physical and technical safeguards.”
This report (available to Sumo Logic customers and prospects under NDA) is easily digestible by the compliance office at any medical company and will demonstrate our best-in-class dedication to the security of our customers’ data. Our commitment to data security and privacy makes Sumo Logic the only cloud-based log management solution able to demonstrate the ability to operate in a HIPAA-regulated environment (as well as the only cloud-based log management service to carry a SOC 2 attestation, the replacement for the venerable SAS 70).
And our compliance story is just beginning! We have several other very exciting initiatives on the way over the next 12 months which will continue to prove that our dedication to enterprise-grade information security practices sets us clearly apart from the rest.
05.01.2013 | Posted by Ben Newton, Corporate Sales Engineering Manager
Of all of the new tools spawned by the DevOps movement, I find Etsy’s open-source tool, statsd, the most interesting. The enterprise software market is being shaken to its foundation, and statsd is one of the tools providing the vibrations. Instead of relying on the more generic metrics provided by application performance management (APM) vendors, Etsy, and others like them, are delivering highly specific and highly relevant metrics directly from their code with statsd. With just a few lines of code, developers can measure any part of their application they choose, in the way they choose. This is very similar to the freedom that developers gain with a proper log analysis tool – they can dump any data they want to a log and analyze it later. Freed from the issue of storage, and of the mechanics of log analysis, they can focus on using the data to enhance performance management, troubleshooting, business intelligence, etc.
For current users of statsd, the question might be – why would I want to put this in Sumo Logic, as opposed to using a tool like Graphite for dashboard purposes? First of all, Sumo Logic provides analytics that supplement basic statsd metrics very well. For example, if you are watching your error count skyrocket and your user performance plummet, your next step will usually be to look for specific application errors and do root cause analysis, which is a perfect use case for Sumo Logic. Secondly, there is a lot of value in having both the statsd and Sumo Logic metrics in a “single pane of glass”, where performance metrics can be viewed alongside more complex analytics. Finally, for current users of Sumo Logic, statsd is a simple way to push application performance data straight into Sumo Logic, without filling up log files or worrying about data volumes.
Background for Statsd
First a little background on StatsD. The basis for the project started at Flickr, and was expanded at Etsy. This is appropriate since John Allspaw and his team helped kick-start the DevOps movement at Flickr, before coming over to Etsy. From the technical perspective statsd is, in their own words:
A network daemon that runs on the Node.js platform and listens for statistics, like counters and timers, sent over UDP and sends aggregates to one or more pluggable backend services.
So, statsd modules forward clear-text metrics over UDP. StatsD supports a few different types of metrics, as well as analytics, but for the sake of simplicity, I will only cover two areas here: counting and timing. The counting metric sends the metric name, the amount to increment/decrement, and possibly the sample rate, for example:
site.logins:1|c
The timing metric looks very similar, with a metric name and a value in milliseconds, for example:
web.time:320|ms
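As a rough illustration of the two message types described above, here is a minimal Python sketch (the post itself uses Perl) that builds counter and timer messages in the statsd text format and sends one over UDP. The function names are hypothetical helpers, not part of any statsd client library, and the host and port are assumptions you would replace with your own listener.

```python
import socket


def format_counter(name, delta=1, sample_rate=None):
    """Build a statsd counter message such as 'site.logins:1|c'."""
    message = f"{name}:{delta}|c"
    if sample_rate is not None:
        # The optional sample rate is appended as '|@<rate>'.
        message += f"|@{sample_rate}"
    return message


def format_timer(name, millis):
    """Build a statsd timing message such as 'web.time:320|ms'."""
    return f"{name}:{millis}|ms"


def send_metric(message, host="localhost", port=8125):
    """Send one clear-text metric as a single UDP datagram.

    8125 is the usual statsd port; host and port here are assumptions.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("ascii"), (host, port))


if __name__ == "__main__":
    # Print the messages rather than sending them, so the sketch has no side effects.
    print(format_counter("site.logins"))
    print(format_timer("web.time", 320))
```

Calling send_metric(format_counter("site.logins"), port=514) would point the same message at a syslog-style UDP listener like the one used later in this post.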
Generating the Metrics
To generate the data, I created a simple Perl script using the statsd Perl module Net::Statsd. I then created a Syslog Source on a Linux Collector on the standard syslog port, 514. The Sumo Logic Syslog Source, essentially a listener for text over UDP, can receive the statsd messages just fine. One caveat, though – since the statsd messages do not include a timestamp, Sumo Logic will assign the ingest time as the timestamp. This means that it is essential that you set the timezone setting correctly. I tested this with thousands of events, and there were no issues. To make the metrics interesting and relevant, I added extra logic to my Perl script to create some patterns with the rand() function and some math:
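use strict;
use warnings;
use Net::Statsd;
# Configure where to send events.
# Here that is the Sumo Logic Syslog Source rather than a statsd daemon.
$Net::Statsd::HOST = 'localhost'; # Default
$Net::Statsd::PORT = 514; # Syslog Source port (the statsd default is 8125)
# Initial values
my $basepercent = 0.50;
my $webTime = 50;
my $appTime = 100;
my $dbTime = 150;
my $basecount = 5;
# Infinite loop
while (1) {
    $basepercent = ($basepercent + (rand(100) + 50)/100)/2;
    $webTime = $basepercent*($webTime + 50 + rand(750))/2;
    $appTime = $basepercent*($appTime + 100 + rand(1000))/2;
    $dbTime = $basepercent*($dbTime + 150 + rand(1200))/2;
    # Assumed step (the calls were missing from the listing): one timing per tier
    Net::Statsd::timing('web.time', int($webTime));
    Net::Statsd::timing('app.time', int($appTime));
    Net::Statsd::timing('db.time', int($dbTime));
    my $k = 0;
    $basecount = $basepercent*($basecount + rand(5))/2;
    # Assumed loop body (missing from the listing): one login increment per pass
    while ($k < $basecount) {
        Net::Statsd::increment('site.logins');
        $k++;
    }
    # Pause 5 to 15 seconds before the next round
    sleep(5 + rand(10));
}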
Making sense of the Metrics
Once the metrics were successfully being ingested into Sumo Logic, I needed to create some useful searches and Dashboard Monitors. With the statsd counter function, I simply wanted to extract the data, drop it into 1m buckets, and sum up the number of increments to the counter over each minute. The key-value structure of a statsd message can be easily parsed with our keyvalue operator. Basically, I just told Sumo Logic to look for a lower case key name with “.” in it [a-z\.]+ and a numerical value \d+. I only searched for “site.logins”, but you could use the statement to look for any number of different counters in the same dashboard.
| keyvalue regex "([a-z\.]+?):(\d+?)\|c" "site.logins" as logins
| timeslice by 1m
| sum(logins) by _timeslice
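To make the logic of the counter search concrete, here is a small Python sketch of the same three steps: extract key/value pairs with the regular expression, bucket events into one-minute slices, and sum the increments per slice. This is only an illustration of what the search computes, not how Sumo Logic implements it; the function name and the (timestamp, message) input shape are assumptions made for the example.

```python
import re
from collections import defaultdict

# Same pattern as in the search: a lower-case dotted key, digits, then '|c'.
COUNTER_RE = re.compile(r"([a-z.]+?):(\d+?)\|c")


def sum_counter_per_minute(events, key="site.logins"):
    """Sum counter increments for one key in one-minute buckets.

    events: iterable of (epoch_seconds, raw_message) pairs (hypothetical input shape).
    Returns {bucket_start_epoch_seconds: total_increment}.
    """
    totals = defaultdict(int)
    for timestamp, message in events:
        for name, value in COUNTER_RE.findall(message):
            if name == key:
                bucket = int(timestamp) // 60 * 60  # start of the 1m slice
                totals[bucket] += int(value)
    return dict(sorted(totals.items()))


if __name__ == "__main__":
    # Made-up sample events, for illustration only.
    sample = [
        (0, "site.logins:1|c"),
        (30, "site.logins:2|c"),
        (61, "site.logins:1|c"),
        (62, "web.time:320|ms"),
    ]
    print(sum_counter_per_minute(sample))
```

Passing a different key, or looping over several keys, corresponds to looking for more than one counter with the same statement.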
With the timing metrics, an average over each minute seems most relevant (though other functions like max, min, or standard deviation could be useful here). I pulled out all three timings together by looking for a key that looks like *.time, using (?<tier>[a-z]+).time. Since I named my metrics web.time, app.time, and db.time, I was able to put each of the “tier” metrics on the same graph.
_sourceCategory=*statsd* AND time
| parse regex "(?<tier>[a-z]+).time:(?<test_time>\d+)\|ms"
| timeslice by 1m
| avg(test_time) by _timeslice, tier
| transpose row _timeslice column tier
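The timing search can be sketched in Python along the same lines: parse the tier and the time from each message, average per minute and tier, and then arrange the result with one row per minute and one column per tier, which is roughly what the transpose step does. The function name and input shape are assumptions for the example, and the sketch escapes the dot before "time", which the search pattern leaves unescaped.

```python
import re
from collections import defaultdict

# Named groups mirror the search: <tier> and <test_time>.
TIMING_RE = re.compile(r"(?P<tier>[a-z]+)\.time:(?P<test_time>\d+)\|ms")


def avg_time_per_minute(events):
    """Average timings per one-minute bucket and tier.

    events: iterable of (epoch_seconds, raw_message) pairs (hypothetical input shape).
    Returns {bucket_start: {tier: average_ms}}, one row per minute, one column per tier.
    """
    sums = defaultdict(lambda: [0, 0])  # (bucket, tier) -> [total_ms, count]
    for timestamp, message in events:
        match = TIMING_RE.search(message)
        if match is None:
            continue  # not a timing message
        bucket = int(timestamp) // 60 * 60
        cell = sums[(bucket, match["tier"])]
        cell[0] += int(match["test_time"])
        cell[1] += 1

    table = defaultdict(dict)
    for (bucket, tier), (total, count) in sorted(sums.items()):
        table[bucket][tier] = total / count
    return dict(table)


if __name__ == "__main__":
    # Made-up sample events, for illustration only.
    sample = [
        (0, "web.time:100|ms"),
        (10, "web.time:300|ms"),
        (20, "db.time:500|ms"),
        (65, "app.time:250|ms"),
    ]
    print(avg_time_per_minute(sample))
```

Swapping the average for max, min, or a standard deviation only changes the per-cell aggregation; the bucketing and pivoting stay the same.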
As I ran each of these searches, I clicked the “Add to Dashboard” button on the far right to add them to a newly created StatsD dashboard. I included a screenshot below (the tier metrics are on the left, and the counter is on the right):
You can see from this example how easy it is to analyze data in the statsd format. Once the data is in Sumo Logic, the sky is the limit to what you can do with it. There are other metrics and backend functions that Sumo Logic can support over the long term, but this simple integration provides the majority of the functionality needed. Let us know what you think, and sign up for a free account to try it out yourself.