Keep Tabs on Your Box

12.09.2014 | Posted by Brandon Mensing

You can argue till you’re blue in the face about which cloud storage to use these days. But you can’t argue (at least not with me) about whether you should be auditing what’s going on in those tools. Sure, these services have a decent EULA and plenty of encryption. However, that doesn’t mean your own employees are following your policies. And let’s not forget how easily passwords fail.

The good news is that you can get set up to monitor your Box account with Sumo Logic in a few simple steps. Following our documentation, a simple script talks to the Box service, retrieves audit events, and brings them into Sumo Logic via a Script Source. Then just install the Box app from our service (a few simple clicks) and you’ll be able to quickly inspect how your organization is using Box.
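As a rough illustration (not the actual script from our documentation), the transform half of such a script looks something like this: take a page of Box enterprise events JSON and emit one log line per event for the Script Source to pick up. The payload shape below mirrors Box's enterprise events API, but treat the exact field names as assumptions.

```python
import json

def events_to_log_lines(payload):
    # One JSON log line per audit event, keeping the fields the Box app
    # cares about (who did what, and when).
    lines = []
    for entry in payload.get("entries", []):
        lines.append(json.dumps({
            "timestamp": entry.get("created_at"),
            "event_type": entry.get("event_type"),
            "user": (entry.get("created_by") or {}).get("login"),
        }))
    return lines

# Canned sample standing in for a real API response.
sample = {
    "entries": [
        {"created_at": "2014-12-09T10:00:00-08:00",
         "event_type": "ITEM_SHARED",
         "created_by": {"login": "alice@example.com"}},
    ]
}

for line in events_to_log_lines(sample):
    print(line)
```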

With our off the shelf dashboards, I encourage all of you using Box to explore your deployment. You may be surprised at what you find. Odd behavior patterns can bubble up, showing you that a critically sensitive document is being overshared; or that one of your users has recently shared thousands upon thousands of documents in an apparent attempt to walk out the door with your IP; or that an unexpected user is logged in and accessing documents.

Cloud storage systems promise a lot of security and privacy but ultimately it’s still up to you to identify malicious users, policy violations and security threats. Using this great new app, you’ll find that it’s easy and effective to monitor and investigate activities in your Box account. Give it a try today and use our support portal to send us your feedback!

Certified, bonafide, on your side

11.05.2014 | Posted by Eillen Voellinger

“How do I build trust with a cloud-based service?” This is the most common question Sumo Logic is asked, and we’ve got you covered. We built the service to be not just an effortless choice for enterprise customers but the obvious one, and building trust through a secure architecture was one of the first things we took care of.

Sumo Logic is SOC 2 Type 2 and HIPAA compliant. Sumo Logic also complies with the U.S.–E.U. Safe Harbor framework and will soon be PCI DSS 3.0 compliant. No other cloud-based log analytics service can say this. For your company, this means you can safely get your logs into Sumo Logic – a service you can trust and a service that will protect your data just like you would.

These are no small accomplishments, and it takes an A-team to get it done. It all came together when we hired Joan Pepin, a phreak and a hacker by admission. Joan is our VP of Security and CISO. She was employee number 11 at Sumo Logic and her proficiency has helped shape our secure service.

Our secure architecture is also a perfect match for our “Customer First” policy and agile development culture. We make sure that we are quickly able to meet customer needs and to fix issues in real-time without compromising our secure software development processes. From network security to secure software development practices, we ensured that our developers are writing secure code in a peer-reviewed and process-driven fashion.

Sumo Logic was built from the ground up to be secure, reliable, fast, and compliant. Joan understands what it means to defend a system, keep tabs on it, watch it function live. Joan worked for the Department of Defense. She can’t actually talk about what she did when she was there, but we can confirm that she was there because the Department of Defense, as she puts it, “thought my real world experience would balance off the Ph.Ds.”

Joan learned the craft from Dr. Who, a member of the Legion of Doom. If hacker groups were rock and roll, the Legion of Doom would be Muddy Waters, Chuck Berry, Buddy Holly. They created the idea of a hacker group. They hacked into a number of state 911 systems and stole the documentation on them, distributing it through BBSes across the United States. They were the original famous hacking group. Joan is no Jane-come-lately. She’s got the best resume you can have in this business.

We’re frequently asked about all the security procedures we adopt at Sumo Logic. Security is baked into every component of our service. Other than the various attestations I mentioned earlier, we also encrypt data at rest and in transit. Other security processes that are core to the Sumo Logic service include:

+ Centrally managed, FIPS-140 two-factor authentication devices for operations personnel

+ Biometric access controls

+ Whole-disk encryption

+ Thread-level access controls

+ Whitelisting of individual processes, users, ports and addresses

+ Strong AES-256-CBC encryption

+ Regular penetration tests and vulnerability scans

+ A strong Secure Development Life-Cycle (SDLC)

+ Threat intelligence and managed vulnerability feeds to stay current with the constantly evolving threatscape and security trends

If you’re still curious about the extent to which our teams have gone to keep your data safe, check out our white paper on the topic.

We use our own service to capture our logs, which has helped us earn these security and compliance credentials. We’ve done the legwork so your data is secure and so you can use Sumo Logic to meet your unique security and compliance needs. We’ve been there and done that with the Sumo Logic service, and now it’s your turn.

The Three Questions Customers Invariably Ask Us

10.08.2014 | Posted by Sanjay Sarathy, CMO

For almost all DevOps, App Ops and Security teams, finding that needle in the haystack, that indicator of cause, the unseen effect, and finding it quickly is fundamental to their success. Our central mission is to enable the success of these teams via rapid analysis of their machine data.  During their process of researching and investigating Sumo Logic, customers invariably ask us three questions:

  • How long will it take to get value from Sumo Logic?
  • Everyone provides analytics – what’s different about yours?
  • How secure is my data in the cloud?

Let’s address each of these questions.

Time to Value

A key benefit we deliver revolves around speed and simplicity: no hardware, storage or deployment overhead. Beyond the fact that we’re SaaS, though, the true value revolves around how quickly we can turn data into actionable information.


First, our cloud-based service integrates quickly into any environment (on-premises, cloud, hybrid) that generates machine data. Because we’re data source agnostic, our service can quickly correlate logs across various systems, leading to new and relevant analyses.  For example, one of our engineers has written a post on how we use Sumo Logic internally to track what’s happening with Amazon SES messages and how others can very quickly set this up as well.  

Second, value is generated by how quickly you uncover insights. A Vice President of IT at a financial services firm that is now using Sumo Logic shared with us that incidents that used to take him 2 hours to discover and fix now take him 10 minutes. Why? Because the machine learning that underpins our LogReduce pattern recognition engine surfaces the critical issues that his team can investigate and remediate, without the need to write any rules.

Analytics Unleashed

Sumo Logic was founded on the idea that powerful analytics are critical to making machine data a corporate resource to be valued rather than ignored. Our analytics engine combines the best of machine learning, real-time processing, and pre-built applications to provide rapid value.  

Fuze recently implemented Sumo Logic to gain visibility into its technical infrastructure, and can now address incidents and improvements much more quickly with specific insights. They report a 40% savings in management time and a 5x improvement in “signal-to-noise” ratio.  A critical reason why InsideView chose Sumo Logic was the availability of our applications for AWS Elastic Load Balancing and AWS CloudTrail, which help them monitor their AWS infrastructure and get immediate value from our service.

Security In the Cloud 

Customers are understandably curious about our security processes, policies and infrastructure that would help them mitigate concerns about sending their data to a 3rd party vendor.  Given that our founding roots are in security and that our entire operating model is to securely deliver data insights at scale, we have a deep appreciation for the natural concerns prospects might have.

We’ve crafted a detailed White Paper that outlines how we secure our service, but here are a few noteworthy highlights.

  • Data encryption:  we encrypt log data both in motion and at rest and each customer’s unique keys are rotated daily
  • Certifications:  we’ve spent significant resources on our current attestations and certifications (e.g., HIPAA, SOC 2 Type 2 and others) and are actively adding to this list
  • Security processes: included in this bucket are centrally managed FIPS-140 two-factor authentication devices, biometric controls, whitelists for users, ports, and addresses, and more

Our CISO has discussed the broader principles of managing security in the cloud in an on-demand webinar and of course you can always start investigating our service via Sumo Logic Free to understand for yourself how we answer these three questions.

We are Shellshock Bash Bug Free Here at Sumo Logic, but What about You?

10.01.2014 | Posted by Vera Chen

Be Aware and Be Prepared

I am betting most of you have heard about the recent “Shellshock Bash Bug”.  If not, here is why you should care – this bug affects users of Bash, which is one of the most popular utilities installed on operating systems today.  Discovered in September 2014, this extremely severe bug affects Bash versions dating back to 1.13: Bash processes shell commands appended after function definitions, which exposes systems to security threats.  This vulnerability allows remote attackers to execute arbitrary shell commands and gain access to internal data, publish malicious code, reconfigure environments, and exploit systems in countless ways.

Shellshock Bash Bug Free, Safe and Secure

None of the Sumo Logic service components were impacted due to the innate design of our systems.  However, for those of you out there who might have fallen victim to this bug based on your system architecture, you’ll want to jump in quickly to address potential vulnerabilities. 

What We Can Do for You

If you have been searching for a tool to expedite the process of identifying potential attacks on your systems, you’re in the right place!  I recommend that you consider Sumo Logic, and especially our pattern recognition capability called LogReduce.  Here is how it works: the search feature enables you to search for the well-known “() {“ Shellshock indicator, while a touch of the LogReduce button surfaces potential malicious activity for you to consider.  Take, for instance, a large group of messages that could be a typical series of ping requests: LogReduce separates messages by their distinct signatures, making it easier for you to review those that differ from the norm.  You can easily see instances of scans, attempts and real attacks separated into distinct groups.  This feature streamlines your investigation process to uncover abnormalities and potential attacks.  Give it a try and see for yourself how LogReduce can reveal a broad range of remote attacker activity, from downloads of malicious files to your systems, to internal file dumps staged for external retrieval, and more.
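A toy illustration of the underlying idea (not Sumo Logic's actual clustering algorithm): normalize each log line into a signature by masking the fields that vary per request, then group lines by signature so the rare, Shellshock-bearing ones stand out from the bulk of routine traffic. The log lines are invented.

```python
import re
from collections import Counter

SHELLSHOCK = "() {"  # the telltale Bash function-definition prefix

def signature(line):
    # Mask per-request variability (IPs, numbers) so similar lines
    # collapse into one signature -- a crude stand-in for LogReduce.
    line = re.sub(r"\d+\.\d+\.\d+\.\d+", "<ip>", line)
    line = re.sub(r"\d+", "<n>", line)
    return line

logs = [
    '10.0.0.5 GET /index.html 200',
    '10.0.0.6 GET /index.html 200',
    '203.0.113.9 GET /cgi-bin/status 200 "() { :;}; /bin/ping -c 1 attacker.example"',
]

groups = Counter(signature(l) for l in logs)
suspicious = [sig for sig in groups if SHELLSHOCK in sig]

# The two routine requests collapse into one signature; the attack
# attempt lands in its own small group for review.
for sig in suspicious:
    print(groups[sig], sig)
```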

Witness it Yourself

Check out this video to see how our service enables you to proactively identify suspicious or malicious activity on your systems: Sumo Logic: Finding Shellshock Vulnerabilities

Give Us a Try

For those of you who are completely new to our service, you can sign up for a free 30-day trial here: Sumo Logic Free 30 Day Trial


LogReduce vs Shellshock

09.25.2014 | Posted by Joan Pepin, VP of Security/CISO


Shellshock is the latest catastrophic vulnerability to hit the Internet. Following so closely on the heels of Heartbleed, it serves as a reminder of how dynamic information security can be.

(NOTE: Sumo Logic was not and is not vulnerable to this attack, and all of our Bash binaries were patched on Wednesday night.)


Right now there is a massive land-grab going on, as dozens of criminal hacker groups (and others) are looking to exploit this widespread and serious vulnerability for profit. Patching this vulnerability while simultaneously sifting through massive volumes of data looking for signs of compromise is a daunting task for your security and operations teams. However, Sumo Logic’s patent pending LogReduce technology can make this task much easier, as we demonstrated this morning.

Way Big Data

While working with a customer to develop a query to show possible exploitation of Shellshock, we saw over 10,000 exploitation attempts in a fifteen minute window. It quickly became clear that a majority of the attempts were being made by their internal scanner. By employing LogReduce we were able to very quickly pick out the actual attack attempts from the data-stream, which allowed our customer to focus their resources on the boxes that had been attacked.


Fighting the Hydra

From a technical perspective, the Shellshock attack can be hidden in any HTTP header; we have seen it in the User-Agent, the Referrer, and as part of the GET request itself. Once invoked in this way, it can be used to do anything from sending a ping, to sending an email, to installing a trojan horse or opening a remote shell. All of which we have seen already today. And HTTP isn’t even the only vector, but rather just one of many which may be used, including DHCP.
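Because the payload can ride in any header, a detector has to look at every header value rather than just the request URL. A minimal sketch, with hypothetical request headers:

```python
SHELLSHOCK = "() {"

def shellshock_headers(headers):
    # Return the names of any headers carrying the Shellshock prefix.
    return [name for name, value in headers.items() if SHELLSHOCK in value]

request = {
    "User-Agent": "() { :;}; /bin/cat /etc/passwd",
    "Referer": "http://example.com/",
    "Accept": "*/*",
}
print(shellshock_headers(request))  # ['User-Agent']
```

A real detector would also need to handle the non-HTTP vectors (such as DHCP) mentioned above, which is exactly why no single search is completely reliable.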

So: Shellshock presents a highly flexible attack vector that can be employed in a number of ways to do a large variety of malicious things.  It is so flexible that there is no single way to search for it or alert on it that will be completely reliable.  There is no single silver bullet to slay this monster; however, LogReduce can quickly shine light on the situation and whittle it down to a much less monstrous scale.

We are currently seeing many different varieties of scanning, both innocent and not-so-innocent; as well as a wide variety of malicious behavior, from directly installing trojan malware to opening remote shells for attackers. This vulnerability is actively being exploited in the wild this very second. The Sumo Logic LogReduce functionality can help you mitigate the threat immediately.

The New Era of Security – yeah, it’s that serious!

02.23.2014 | Posted by Bruno Kurtic, Founding Vice President of Product and Strategy

Security is a tricky thing and it means different things to different people.   It is truly in the eye of the beholder.  There is the checkbox kind, there is the “real” kind, there is the checkbox kind that holds up, and there is the “real” kind that is circumvented, and so on.  Don’t kid yourself: the “absolute” kind does not exist. 

I want to talk about security solutions based on log data.  This is the kind of security that kicks in after the perimeter security (firewalls), intrusion detection (IDS/IPS), vulnerability scanners, and dozens of other security technologies have done their thing.  It ties all of these technologies together, correlates their events, reduces false positives and enables forensic investigation.  Sometimes this technology is called Log Management and/or Security Information and Event Management (SIEM).  I used to build these technologies years ago, but it seems like decades ago. 


A typical SIEM product is a hulking appliance: sharp edges, screaming colors – the kind of design that instills confidence and says “Don’t come close, I WILL SHRED YOU! GRRRRRRRRRR”.

Ahhhh, SIEM – makes you feel safe, doesn’t it?  It should not.  I proclaim this at the risk of being yet another one of those guys who wants to rag on SIEM, but I built one, and beat many, so I feel I’ve got some ragging rights.  So, what’s wrong with SIEM?  Where does it fall apart?

SIEM does not scale

It is hard enough to capture a terabyte of daily logs (40,000 Events Per Second, 3 Billion Events per Day) and store them.  It is a couple of orders of magnitude harder to run correlation in real time and alert when something bad happens.  SIEM tools are extraordinarily difficult to run at scales above 100GB of data per day.  This is because they are designed to scale by adding more CPU, memory, and fast spindles to the same box.  The exponential growth of data over the two decades since those SIEM tools were designed has outpaced the ability to add CPU, memory, and fast spindles to the box.
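A quick sanity check on the scale math above: 40,000 events per second really is roughly 3 billion events per day, and at a terabyte per day that works out to a few hundred bytes per event.

```python
eps = 40_000
events_per_day = eps * 60 * 60 * 24        # seconds in a day
bytes_per_event = 1e12 / events_per_day    # assuming ~1 TB of logs/day

print(f"{events_per_day:,} events/day")    # 3,456,000,000 events/day
print(f"~{bytes_per_event:.0f} bytes/event")
```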

Result: Data growth outpaces capacity → Data dropped  from collection → Significant data dropped from correlation → Gap in analysis → Serious gap in security

SIEM normalization can’t keep pace

SIEM tools depend on normalization (shoehorning) of all data into one common schema so that you can write queries across all events.  That worked fifteen years ago when sources were few.  These days sources and infrastructure types are expanding like never before.  One enterprise might have multiple vendors and versions of network gear, many versions of operating systems, open source technologies, workloads running in infrastructure as a service (IaaS), and many custom written applications.  Writing normalizers to keep pace with changing log formats is not possible.
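To see why normalization is so brittle, consider a sketch like the following: every source format needs its own parser, and any line with no matching parser silently falls out of analysis. The two "vendor" formats here are invented for illustration.

```python
import re

# One hand-written parser per log format. Every new device, vendor, or
# format version means writing and maintaining another one of these.
PARSERS = [
    # hypothetical vendor-A firewall: "DENY src=1.2.3.4 dst=5.6.7.8"
    re.compile(r"(?P<action>ALLOW|DENY) src=(?P<src>\S+) dst=(?P<dst>\S+)"),
    # hypothetical vendor-B firewall: "1.2.3.4 -> 5.6.7.8 dropped"
    re.compile(r"(?P<src>\S+) -> (?P<dst>\S+) (?P<action>dropped|passed)"),
]

def normalize(line):
    for pattern in PARSERS:
        m = pattern.search(line)
        if m:
            return m.groupdict()  # the common schema
    return None  # unknown format: the event drops out of correlation

print(normalize("DENY src=10.0.0.1 dst=10.0.0.2"))
print(normalize("10.0.0.1 -> 10.0.0.2 dropped"))
print(normalize("vendor-C says: blocked 10.0.0.1"))  # None -- no parser yet
```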

Result: Too many data types and versions → Falling behind on adding new sources → Reduced source support → Gaps in analysis → Serious gaps in security

SIEM is rule-only based

This is a tough one.  Rules are useful, even required, but not sufficient.  Rules only catch the thing you express in them, the things you know to look for.   To be secure, you must be ahead of new threats.  A million monkeys writing rules in real-time: not possible.

Result: Your rules are stale → You hire a million monkeys → Monkeys eat all your bananas → You analyze only a subset of relevant events → Serious gap in security

SIEM is too complex


It is way too hard to run these things.  I’ve had too many meetings and discussions with my former customers on how to keep the damned things running and too few meetings on how to get value out of the fancy features we provided.  In reality, most customers use only 20% of the features because the rest are not reachable.  It is like putting your best tools on a shelf just out of reach.  You can see them, you could do oh so much with them, but you can’t really use them because they are out of reach.

Result: You spend a lot of money → Your team spends a lot of time running SIEM → They don’t succeed on leveraging the cool capabilities → Value is low → Gaps in analysis → Serious gaps in security   

So, what is an honest, forward-looking security professional who does not want to duct tape a solution to do?  What you need is what we just started: Sumo Logic Enterprise Security Analytics.  No, it is not absolute security, it is not checkbox security, but it is a more real security because it:


Processes terabytes of your data per day in real time.  Evaluates rules regardless of data volume and does not restrict what you collect or analyze.  Furthermore, there is no SIEM-style normalization: just add data, a pinch of savvy, a tablespoon of massively parallel compute, and voila.

Result: you add all relevant data → you analyze it all → you get better security 


It is SaaS, there are no appliances, there are no servers, there is no storage, there is just a browser connected to an elastic cloud.

Result: you don’t have to spend time on running it → you spend time on using it → you get more value → better analysis → better security

Machine Learning

Rules, check.  What about all that other unknown stuff?  Answer: a machine that learns from data.  It detects patterns without human input.  It then figures out baselines and normal behavior across sources.  In real time it compares new data to the baseline and notifies you when things are sideways.  Even if “things” are things you’ve NEVER even thought about and NOBODY in the universe has EVER written a single rule to detect.  Sumo Logic detects those too.
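A toy baseline-and-deviate check (not Sumo Logic's actual models) shows the shape of the idea: learn what "normal" looks like from historical per-minute event counts, then flag new counts that sit far outside it, with no rule ever written for the specific anomaly.

```python
from statistics import mean, stdev

# Historical per-minute event counts stand in for learned baselines.
history = [98, 102, 101, 97, 103, 100, 99, 101]
baseline, spread = mean(history), stdev(history)

def is_anomalous(count, sigmas=3.0):
    # Flag counts more than `sigmas` standard deviations off baseline.
    return abs(count - baseline) > sigmas * spread

print(is_anomalous(100))  # False -- within normal behavior
print(is_anomalous(450))  # True  -- something new is happening
```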

Result: Skynet … nah, benevolent overlord, nah, not yet anyway.   New stuff happens → machines go to work → machines notify you → you provide feedback → machines learn and get smarter → bad things are detected → better security

Read more: Sumo Logic Enterprise Security Analytics

Black Friday, Cyber Monday and Machine Data Intelligence

11.25.2013 | Posted by Vance Loiselle, CEO

The annual craze of getting up at 4am to either stand in line or shop online for the “best” holiday deals is upon us.  I know first-hand, because my daughter and I have participated in this ritual for the last four years (I know – what can I say – I grew up in Maine).  While we are at the stores fighting for product, many Americans will be either watching football, or surfing the web from the comfort of their couch looking for that too-good-to-be-true bargain.  And with data indicating a 50% jump in Black Friday and Cyber Monday deals this year, it’s incumbent on companies to ensure that user experiences are positive.  As a result, the leading companies are realizing the need to obtain visibility end-to-end across their applications and infrastructure, from the origin to the edge.  Insights from machine data (click-stream in the form of log data), generated from these environments, helps retailers of all stripes maximize these two critical days and the longer-term holiday shopping season.  

What are the critical user and application issues that CIOs should be thinking about in the context of these incredibly important shopping days?

  • User Behavior Insights. From an e-commerce perspective, companies can use log data to obtain detailed insights into how their customers are interacting with the application, what pages they visit, how long they stay, and the latency of specific transactions.  This helps companies, for example, correlate user behavior with the effectiveness of specific promotional strategies (coupons, etc) that allow them to rapidly make adjustments before the Holiday season ends.

  • The Elasticity of The Cloud.  If you’re going to have a problem, better it be one of “too much” rather than “too little”.  Too frequently, we hear of retail web sites going down during this critical time.  Why? The inability to handle peak demand – because often they don’t know what that demand will be.   Companies need to understand how to provision for the surge in customer interest on these prime shopping days that in turn deliver an exponential increase in the volume of log data.  The ability to provide the same level of performance at 2, 3 or even 10x usual volumes in a *cost-effective* fashion is a problem few companies have truly solved.  The ability of cloud-based architectures to easily load-balance and provision for customer surges at any time is critical to maintaining that ideal shopping experience while still delivering the operational insights needed to support customer SLAs.

  • Machine Learning for Machine Data. It’s difficult enough for companies to identify the root cause of an issue that they know something about.  Far more challenging for companies is getting insights into application issues that they know nothing about.  However, modern machine learning techniques provide enterprises with a way to proactively uncover the symptoms, all buried within the logs, that lead to these issues.  Moreover, machine learning eliminates the traditional requirement of users writing rules to identify anomalies, which by definition limit the ability to understand *all* the data.  We also believe that the best analytics combine machine learning with human knowledge about the data sets – what we call Machine Data Intelligence – and that helps companies quickly and proactively root out operational issues that limit revenue generation opportunities.

  • Security and Compliance Analytics.  With credit cards streaming across the internet in waves on this day, it’s imperative that you’ve already set up the necessary environment to both secure your site from fraudulent behavior and ensure your brand and reputation remain intact.  As I mentioned in a previous post, the notion of a perimeter has long since vanished which means companies need to understand that user interactions might occur across a variety of devices on a global basis.  The ability to proactively identify what is happening in real-time across your applications and the infrastructure on which they run is critical to your underlying security posture.  All this made possible by your logs and the insights they contain.  
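On the user-behavior point above, a minimal sketch of click-stream sessionization from logs: group page-view events by user, then derive which pages each user visited and how long they stayed. The log fields (timestamp, user, page) are invented for illustration.

```python
from collections import defaultdict
from datetime import datetime

events = [
    ("2013-11-29 06:01:10", "u1", "/home"),
    ("2013-11-29 06:03:45", "u1", "/deals"),
    ("2013-11-29 06:09:30", "u1", "/checkout"),
    ("2013-11-29 06:02:00", "u2", "/home"),
]

# Group events into per-user sessions.
sessions = defaultdict(list)
for ts, user, page in events:
    sessions[user].append((datetime.strptime(ts, "%Y-%m-%d %H:%M:%S"), page))

for user, hits in sessions.items():
    hits.sort()
    dwell = (hits[-1][0] - hits[0][0]).total_seconds()
    print(user, [p for _, p in hits], f"{dwell:.0f}s on site")
```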

Have a memorable shopping season and join me on twitter – @vanceloiselle – to continue the conversation.

What CIOs (and I) Can Learn From HealthCare.gov

11.19.2013 | Posted by Vance Loiselle, CEO

There is little debate that the “Obamacare” rollout has been choppy at best.  Regardless of which side of the political debate you fall, many of us in technology, CIOs and software executives alike, can learn from this highly publicized initiative as we approach the November 30th deadline.  Development of new applications, especially web applications, is no longer just about the myopic focus on Design, Develop, Test, and Rollout.  The successful development and deployment of these applications must have a holistic, information-driven approach, which includes the following four key processes:

  1. Application Quality Analytics – the constant tracking of the errors, exceptions, and problems that are occurring in each new release of the code.
  2. Application Performance Analytics – the real-time measurement of the performance of the application as the users are experiencing it.
  3. Security Analytics – the real-time information required to analyze and conduct forensics on the security of the entire application and the infrastructure on which it runs.
  4. User Analytics – real-time insights on which users are in the application, what pages they are viewing, and the success they’ve had in conducting transactions in the application.

Application Quality Analytics – Is it really necessary that in the year 2013, development of applications still needs to be 4 parts art and 1 part science?  I’m sure that the Secretary of Health and Human Services, Kathleen Sebelius, wished it was more science when she testified in front of Congress about why the site was not ready.  She had no real-time information or metrics at her disposal about the number of defects being fixed each day, the number of errors being encountered by users, the severity of those errors, or the pace at which these errors and defects were being resolved.

These metrics are sitting there in the log files (data exhaust from all applications and software components to track what the software is doing), and are largely untapped by most development organizations.   Moreover, this data could be shared between multiple teams to pinpoint the root cause of problems between the application itself and the network and infrastructure on which it is running.  It was so frustrating to see CGI (the contractor hired to build the application) and Verizon (the contractor hired to host the application in their “cloud”) passing the buck between each other in front of Congress.
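The kind of metric described above can be pulled straight from logs with very little machinery. A sketch, using invented log lines (the severity-token-per-line shape is an assumption, not any particular product's format):

```python
from collections import Counter

logs = [
    "2013-11-19 ERROR signup failed: carrier timeout",
    "2013-11-19 WARN slow response 4200ms",
    "2013-11-19 ERROR signup failed: carrier timeout",
    "2013-11-19 INFO signup ok",
]

# Count events by severity -- the daily defect/error tallies that were
# missing from the congressional testimony.
severity = Counter(line.split()[1] for line in logs)
print(severity["ERROR"], "errors,", severity["WARN"], "warnings")
```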

Application Performance Management – Much has been made about the performance of HealthCare.gov.  The HHS secretary even had the gall to say that the site had not crashed, it was just “performing slowly”, while in the congressional hearing there was a live image on the screen informing users that the site was down.  The site was down AND performing slowly because the site’s developers are stuck in a previous generation of thinking – that you can measure site performance without taking into account user analytics.  It’s not good enough to measure application performance by sampling transaction response times periodically.  Testers and managers need access to real-time information about each user, the session they were running, the performance at each step, and the outcomes (e.g. new plan sign-up created or failed, 5 insurance plans compared, pricing returned from 2 out of 3 carriers, etc.) along the way.  Most monitoring tools look at just the application or just the network and infrastructure it runs on, and have little to no visibility into the outcomes the user is experiencing.  Guess what?  Most, if not all, of this information is sitting in the logs waiting to be harnessed.

Security Analytics – I appreciated when Secretary Sebelius was asked about what steps had been taken to ensure the privacy and security of the data that fellow Americans had submitted on HealthCare.gov.  The reality is that most IT organizations have very bifurcated organizations to address security and application development.  The old-school view is that you put a web application firewall in place and you close down the ports, and your perimeter is safe.  The reality today is that there is no perimeter.  People have mobile phones and tablets and use third-party services to store their documents.  HealthCare.gov itself is dependent on third parties (insurance carriers) to provide and receive private information.

The most effective way today to ensure some level of security is to have a real-time security analytics and forensics solution in place.  These solutions can scan every element of user and system activity, from – you guessed it – the log data, and determine everything from invalid logins and potential breaches to unauthorized changes to firewall rules and user permissions.     

User Analytics – Ok, I get it, the Obama administration did not want to share information for weeks about how many people had signed up on HealthCare.gov.  The argument was made that the accuracy of the data could not be trusted.  Either it was a political maneuver or incompetence, but either reason is sad in the year 2013.  And why do the White House and HHS have the right to keep this information a secret?  The American taxpayers are paying the hefty sum of $200M+ to get this application up and running.  Shouldn’t we know, in real-time, the traction and success of the Affordable Care Act?  It should be posted on the home page of the web site.  I guarantee the information about every enrollee, every signup – successful or failed – every county from which they logged in, every plan that was browsed, every price that was displayed, every carrier that’s providing quotes was AVAILABLE IN THE LOG DATA.

There has been a lot of coverage about President Kennedy recently, and we are reminded that he challenged our government to put people on the moon in the 1960s, and they did – with a very limited set of computer and software tools at their disposal.  I would ask the CIOs and software types out there: let’s learn from the HealthCare.gov rollout, and embrace a modern, information-driven approach to developing and rolling out applications.  And President Obama, if you need some help with this, give me a shout – I’m ready to serve.

Akamai and Sumo Logic integrate for real-time application insights!

10.09.2013 | Posted by Bruno Kurtic, Founding Vice President of Product and Strategy

I’m very pleased to announce our strategic alliance with Akamai. Our integrated solution delivers a unified view of application availability, performance, security, and business analytics based on application log data.  Customers who rely on Akamai’s globally distributed infrastructure now can get the real-time feed of all logs generated by Akamai’s infrastructure into their Sumo Logic account in order to integrate and cross-analyze them with their internally generated application data sets!

What problems does the integrated solution solve?

To date, there have been two machine data sets generated by applications that leverage Akamai:

1. Application logs at the origin data centers, which application owners can usually access.

2. Logs generated by Akamai as an application is distributed globally. Application owners typically have zero or limited access to these logs.

Both of these data sets provide important metrics and insights for delivering highly available, secure applications that also give a detailed view of business results. Until today there was no way to get both data sets into a single tool for real-time analysis, which caused the following issues:

  • No single view of performance. Origin performance could be monitored, but that alone provides little confidence that the app is performant for end users.
  • Difficult to understand user interaction. Without data on how real users interact with an application, it was hard to gauge what content was served, how the app performed for those users, and whether performance had any impact on conversions.
  • Issues impacting customer experience remained hidden. The root cause of end-user issues introduced at the origin stayed hidden, impacting customer experience for long periods of time.
  • Web App Firewall (WAF) security information not readily available. Security teams were unable to detect and respond to attacks in real time and take defensive actions to minimize exposure.

The solution!

Quality of Service

Akamai Cloud Monitor and Sumo Logic provide an integrated approach to solving these problems. Sumo Logic has developed an application specifically crafted for customers to extract insights from their Akamai data, which is sent to Sumo Logic in real time.  The solution has been deployed by joint customers (at terabyte scale) to address the following use cases:

  • Real-time analytics about user behavior.  Combine Akamai real-user monitoring data and internal data sets to gain granular insights into user behavior. For example, learn how users behave across different device types, geographies, or even how Akamai quality of service impacts user behavior and business results.

  • Security information management and forensics. Security incidents and attacks on an application can be investigated by deep-diving into the sessions, IP addresses, and individual URLs that attackers are attempting to exploit.

  • Application performance management from edge to origin. Quickly determine if an application’s performance issue is caused by your origin or by Akamai’s infrastructure, and which regions, user agents, or devices are impacted.

  • Application release and quality management. Receive an alert as soon as Akamai detects that one or more origins have an elevated number of 4xx or 5xx errors that may be caused by new code push, configuration change, or another issue within your origin application infrastructure.

  • Impact of quality of service and operational excellence. Correlate how quality of service impacts conversions or other business metrics to optimize performance and drive better results.
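To make the error-rate alerting use case concrete, here is a minimal sketch of the underlying idea: parse status codes out of access log lines and alert when the 4xx/5xx rate is elevated. The log format, field layout, and threshold are all assumptions for illustration, not the actual Akamai feed or Sumo Logic query.

```python
import re

# Hypothetical access log lines; the real Akamai log format differs.
LOG_LINES = [
    '203.0.113.5 - [09/Oct/2013:10:00:01 +0000] "GET /index.html HTTP/1.1" 200',
    '203.0.113.6 - [09/Oct/2013:10:00:02 +0000] "GET /api/cart HTTP/1.1" 503',
    '203.0.113.7 - [09/Oct/2013:10:00:03 +0000] "POST /checkout HTTP/1.1" 500',
    '203.0.113.8 - [09/Oct/2013:10:00:04 +0000] "GET /missing HTTP/1.1" 404',
]

# In this toy format the HTTP status code is the last field on the line.
STATUS_RE = re.compile(r'" (\d{3})$')

def error_rate(lines):
    """Return the fraction of requests that returned a 4xx or 5xx status."""
    statuses = [int(m.group(1)) for line in lines
                if (m := STATUS_RE.search(line))]
    errors = sum(1 for s in statuses if s >= 400)
    return errors / len(statuses) if statuses else 0.0

# Alert when the error rate crosses a (hypothetical) threshold.
if error_rate(LOG_LINES) > 0.5:
    print("ALERT: elevated 4xx/5xx error rate at origin")
```

In the integrated solution this logic runs as a scheduled search over the live log stream rather than a one-off script, but the shape of the computation is the same.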

I could go on, but I’m sure you have plenty of ideas of your own.

Join us for a free trial here – as always, there is nothing to install, nothing to manage, nothing to run – we do it all for you.  You can also read our announcement here or read more about the Sumo Logic application for Akamai here.  Take a look at the Akamai press release here.

Bruno Kurtic, Founding Vice President of Product and Strategy

Sumo Logic Anomaly Detection is now in Beta!

09.10.2013 | Posted by Bruno Kurtic, Founding Vice President of Product and Strategy

What is “anomaly detection”?

Here is how the peeps on the interweb and Wikipedia define it: anomaly detection (also known as outlier detection) is the search for events which do not conform to an expected pattern. The detected patterns are called anomalies, and they often translate to critical and actionable insights that, depending on the application domain, are referred to as outliers, changes, deviations, surprises, intrusions, etc.
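As a toy illustration of that definition (not Sumo Logic’s engine, which is far more sophisticated), a simple statistical outlier detector flags values that deviate too far from the expected pattern. The data, threshold, and z-score approach here are all assumptions for the sake of the example.

```python
import statistics

def z_score_anomalies(values, threshold=2.0):
    """Flag values more than `threshold` population standard deviations
    from the mean -- a crude stand-in for "does not conform to an
    expected pattern"."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    # With zero variance nothing deviates, so nothing is anomalous.
    return [v for v in values if stdev and abs(v - mean) / stdev > threshold]

# Events per minute from a hypothetical service; the spike is the outlier.
counts = [12, 11, 13, 12, 14, 11, 12, 95, 13, 12]
print(z_score_anomalies(counts))  # -> [95]
```

The hard part, of course, is doing this continuously across thousands of heterogeneous signals without hand-set thresholds, which is exactly the problem described next.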

The domain: Machine Data

Machine data (most frequently referred to as log data) is generated by applications, servers, infrastructure, mobile devices, web servers, and more.  It is the data generated by machines in order to communicate to humans or other machines exactly what they are doing (e.g. activity), what the status of that activity is (e.g. errors, security issues, performance), and the results of their activity (e.g. business metrics).

The problem of unknown unknowns

Most problems with analyzing machine data orbit around the fact that existing operational analytics technologies enable users to find only those things they know to look for.  I repeat, only things they KNOW they need to look for.  Nothing in these technologies helps users proactively discover events they don’t anticipate, events that have not occurred before, events that may have occurred before but are not understood, or complex events that are not easy or even possible to encode into queries and searches.

Our infrastructure and applications are desperately, and constantly, trying to tell us what’s going on through the massive real-time stream of data they relentlessly throw our way.  And instead of listening, we ask a limited set of questions from some playbook. This is as effective as a patient seeking advice about massive chest pain from a doctor who, instead of listening, runs through a checklist containing skin rash, fever, and runny nose, and then sends the patient home with a clean bill of health.

This is not a good place to be; these previously unknown events hurt us by repeatedly causing downtime, performance degradation, poor user experience, security breaches, compliance violations, and more.  Existing monitoring tools would be sufficient if we lived in static, three-system environments where we could enumerate all possible failure conditions and attack vectors.  But we don’t.


We operate in environments where we have thousands of sources across servers, networks, and applications, and the amount of data they generate is growing exponentially.  They come from a variety of vendors, run a variety of versions, are geographically distributed, and, on top of that, are constantly updated, upgraded, and replaced.  How can we then rely on hard-coded rules, queries, and known-condition tools to ensure our applications and infrastructure are healthy and secure?  We can’t – it is a fairy tale.

We believe that three major things are required in order to solve the problem of unknown unknowns at a multi-terabyte scale:

  1. Cloud: enables elastic compute at the massive scale needed to analyze this volume of data in real-time across all vectors

  2. Big Data technologies: enable a holistic approach to analyzing all data without being bound to schemas, volumes, or batch analytics

  3. Machine learning engine: advanced algorithms that analyze and learn from data, as well as from humans, in order to get smarter over time

Sumo Logic Real-Time Anomaly Detection

Today we have announced Beta access to our Anomaly Detection engine, an engine that uses thousands of machines in the cloud and continuously, in real-time, analyzes ALL of your data to proactively detect important changes and events in your infrastructure.  It does this without requiring users to configure or tune the engine, to write queries or rules, to set thresholds, or to write and apply data parsers.  As it detects changes and events, it bubbles them up to users for investigation, so they can add knowledge, classify events, and apply relevance and severity.  It is in fact this combination of a powerful machine learning algorithm and human expert knowledge that is the real power of our Anomaly Detection engine.

So, in essence, Sumo Logic Anomaly Detection continuously turns unknown events into known events.  And that’s what we want: to make events known, because we know how to handle and what to do with known events.  We can alert on them, we can create playbooks and remediation steps, we can prevent them, we can anticipate their impact, and, at least in some cases, we can make them someone else’s problem.


In conclusion

Sumo Logic Anomaly Detection has been more than three years in the making.  During that time, it has had the energy of the whole company and our backers behind it.  Sumo Logic was founded with the belief that this capability is transformational in the face of exponential data growth and infrastructure sprawl.  We developed an architecture and adopted a business model that enable us to implement an analytics engine that can solve the most complex problems of the Big Data decade.

We look forward to learning from the experience of our Beta customers, and soon from all of our customers, about how to continue to grow this game-changing capability.  Read more here and join us.