
Pardon me, have you got data about machine data?

01.31.2013 | Posted by Bruno Kurtic, Founding Vice President of Product and Strategy

I’m glad you asked, I just might.  In fact, we started collecting data about machine data some 9 months ago when we participated in the AWS Big Data conference in Boston.  Since then we have continued collecting the same data at a variety of industry shows and conferences such as VMworld, AWS re:Invent, Velocity, Gluecon, Cloud Slam, Defrag, DataWeek, and others.

The original survey was printed on my home printer, 4 surveys per page, then inexpertly cut with the kitchen scissors the night before the conference – startup style, oh yeah!  The new versions made it onto a shiny new iPad as an iOS app.  The improved method, Apple cachet, and a wider reach gave us more than 300 data points and, incidentally, cost us more than 300 Sumo Logic T-shirts, which we were more than happy to give up in exchange for data.  (btw, if you want one, come to one of our events; the next one coming up is the Strata Conference).

As a data junkie, I’ve been slicing and dicing the responses and thought that the end of our fiscal year could be the right moment to revisit the data and reflect on my first blog post on this data set.

Here is what we asked:

  • Which business problems do you solve by using machine data?
  • Which tools do you use to analyze machine data in order to solve those business problems?
  • What issues do you experience solving those problems with the chosen tools?

The survey was partially designed to help us better understand Sumo Logic’s segment of the IT Operations Management or IT Management markets as defined by Gartner, Forrester, and other analysts.  I think that the sample set is relatively representative.  Responders come from shows with varied audiences such as developers at Velocity and GlueCon, data center operators at VMworld, and folks investigating a move to the cloud at AWS re:Invent and Cloud Slam.  Answers were actually pretty consistent across the different “cohorts”.  We have a statistically significant number of responses, and finally, the responders were not our customers or direct prospects.  So let’s dive in and see what we’ve got, starting at the top:

Which business problems do you solve by using logs and other machine data?

  • Applications management, monitoring, and troubleshooting (46%)
  • IT operations management, monitoring, and troubleshooting (33%)
  • Security management, monitoring, and alerting (21%)

Does anything in there surprise you?  I guess it depends on what your point of reference is.  Let me compare it to the overall “IT Management” or “IT Operations Management” market.  The consensus (if such a thing exists) is that size by segment is:

  • IT Infrastructure (servers, networks, etc) is up to 50-60% of the total market
  • Application (internal, external, etc.) is just north of 30-40%
  • Security is around 10%

Source: Sumo Logic analysis of aggregated data from various industry analysts who cover IT Management space.

There are a few things that could explain why our subsegment leans so much more toward Applications than toward IT Infrastructure.

  • (hypothesis #1) analysts measure total product sold to derive the market size which might not be the same as effort people apply to these use cases.  
  • (hypothesis #2) there is more shelfware in IT Infrastructure, which overrepresents effort.  
  • (hypothesis #3) there are more home-grown solutions in Application management which underrepresents effort.  
  • (hypothesis #4) our data is an indicator or a result of a shift in the market (e.g., when enterprises shift toward the IaaS, they spend less time managing IT Infrastructure and shift more toward the core competency, their applications).  
  • (obnoxious hypothesis #5) intuitively, it’s the software, stupid – nobody buys hardware because they love it; it exists to run software (applications), we care more about applications, and that’s why it is so.

OK, ok, let’s check the data to see which hypotheses our narrow response set can help test or validate.  I don’t think our data can help us validate hypothesis #1 or hypothesis #2.  I’ll try to come up with additional survey questions that will, in the future, help test these two hypotheses.  

Hypothesis #3, on the other hand, might be partially testable.  If we compare responses from users of commercial tools vs. users of home-grown tools, we are left with the following:

Not a significant difference between responders who use commercial tools vs. responders who use home-grown tools.  Hypothesis #3 explains only a couple of percentage points of difference.  
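For readers who want to run the same kind of comparison on their own survey, here is a minimal sketch of the per-group breakdown.  The records, group labels, and the function name `use_case_share` are all hypothetical and invented for illustration; they are not the survey's data and do not reproduce its results.

```python
from collections import Counter, defaultdict

# Hypothetical (tool type, use case) records, invented for illustration only.
# They are NOT the survey responses discussed in this post.
records = [
    ("commercial", "applications"),
    ("commercial", "it_operations"),
    ("commercial", "applications"),
    ("commercial", "security"),
    ("home_grown", "applications"),
    ("home_grown", "applications"),
    ("home_grown", "it_operations"),
    ("home_grown", "security"),
]

def use_case_share(records):
    """Return, per tool type, the percentage of responses for each use case."""
    by_group = defaultdict(Counter)
    for group, use_case in records:
        by_group[group][use_case] += 1
    return {
        group: {uc: 100.0 * n / sum(counts.values()) for uc, n in counts.items()}
        for group, counts in by_group.items()
    }

shares = use_case_share(records)
# Difference, in percentage points, between the two groups for one use case.
gap = shares["commercial"]["applications"] - shares["home_grown"]["applications"]
print(shares)
print(gap)
```

A small gap between the groups would suggest that tool choice does little to explain the skew; a large one would make the hypothesis worth a closer look.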

Hypothesis #4 – I think we can use a proxy to test it.  Let’s assume that responders from VMworld are focused on the internal data center and the private cloud.  In this case they would not be relying as much on IaaS providers for IT Infrastructure Operations.  On the other hand, let’s also assume that attendees of AWS and other cloud conferences are more likely to rely on IaaS for IT Infrastructure Operations.  Data please:

Interesting, seems to explain some shift between security and infrastructure, but not applications.  So, we’re left with:

  • hypothesis #1 – spend vs. reported effort is skewed – perhaps
  • hypothesis #2 – there is more shelfware in IT infrastructure – unlikely
  • obnoxious hypothesis #5 – it’s the software, stupid – getting warmer

That should do it for one blog post.  I’ve barely scratched the surface by stopping with the responses to the first question.  I will work to see if I can test the outstanding hypotheses and, if successful, will write about the findings.  I will also follow-up with another post looking at the rest of the data.  I welcome your comments and thoughts.

While you’re at it, try Sumo Logic for free.

Real-time Enterprise Dashboards, Really

11.14.2012 | Posted by Bruno Kurtic, Founding Vice President of Product and Strategy

Today we shipped a highly anticipated new capability with a novel approach, novel not only to Sumo Logic, but also novel within our space: Real-time Enterprise Dashboards.  Dashboard technologies have been around for many years, but not all dashboard technologies are created equal.  Most existing technologies either leverage precomputed summary data sets or recompute the entire data set every time a dashboard is viewed.  As such, they suffer from long load times, stale information, and an inability to handle the data volume.

Our customers faced a specific challenge: how to take terabytes of machine data per day, crunch it, transform it into information, and render that information in a way that supports making business and IT decisions in real time.  Now they can.

When machine data is used to troubleshoot and monitor today’s production applications or infrastructure, data volume is the enemy.  Large farms of Apache or IIS servers, SaaS and other applications, or data center infrastructure like VMware farms, Cisco networking gear, or Linux or Microsoft Windows server farms generate volumes of data that obey Moore’s Law: the data volume doubles every two years.  It only makes sense that the volume of machine data would follow Moore’s Law – if machine computing capacity doubles, those machines do twice the work, and as a result they generate twice the amount of machine data that describes that work.
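To put numbers on that doubling, here is a small back-of-the-envelope sketch.  The function name `projected_volume` and the starting volume of 1 TB per day are assumptions chosen for illustration; only the two-year doubling period comes from the text above.

```python
def projected_volume(initial_tb_per_day, years, doubling_period_years=2.0):
    """Project daily data volume assuming it doubles every `doubling_period_years`."""
    return initial_tb_per_day * 2 ** (years / doubling_period_years)

# Assumed starting point of 1 TB/day (illustrative, not a measured figure).
for years in (2, 4, 10):
    print(years, projected_volume(1.0, years))
```

Under those assumptions, ten years means five doublings, so a system sized for today's volume would need to handle 32 times as much.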

This exponential growth has put existing dashboarding technologies under an insurmountable strain. Some of us here at Sumo Logic built previous-generation dashboards in our past lives.  From our experience we realized that an entirely new approach is required to enable real-time monitoring and dashboarding and that realization drove development of a new architecture.

First, we adopted the cloud computing paradigm.  That turned a data center into an API with lim(capacity)=∞.  This enabled us to spin up and spin down additional capacity truly on demand with a single API call.  Then we built our Streaming Query Engine, which leverages that capacity in an elastic manner.  It continuously takes data off the wire and computes results before the data ever hits its permanent resting place.  This “one-time” computing is more efficient and less costly than traditional recompute methods.  When you view a Sumo Logic Dashboard, you simply attach to the existing state, which is continuously computed by our Streaming Query Engine in the background.  What you get is the freshest data available, instantly, enabling real-time visibility into your infrastructure or applications.  And the dashboards are beautiful to boot.
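The difference between “one-time” computing and recompute-on-view can be shown with a toy sketch.  This is not Sumo Logic's engine; the class `StreamingCounter`, the function `recompute`, and the sample status codes are hypothetical, and the only thing illustrated is the general idea of updating state as each event arrives so that a viewer reads existing state instead of rescanning stored data.

```python
from collections import Counter

class StreamingCounter:
    """Toy incremental aggregator: state is updated once per incoming event."""

    def __init__(self):
        self.counts = Counter()

    def ingest(self, status_code):
        # Each event is processed exactly once, as it comes off the wire.
        self.counts[status_code] += 1

    def snapshot(self):
        # A viewer attaches to the already-computed state; no rescan needed.
        return dict(self.counts)

def recompute(stored_events):
    # Recompute-on-view approach: scan every stored event on each request.
    return dict(Counter(stored_events))

# Hypothetical HTTP status codes standing in for a stream of log events.
events = [200, 200, 500, 404, 200, 500]

engine = StreamingCounter()
for event in events:
    engine.ingest(event)

print(engine.snapshot())
print(engine.snapshot() == recompute(events))
```

Both paths give the same answer, but the cost of `snapshot` does not grow with the number of stored events, while the cost of `recompute` does on every view.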

Try it for yourself.

Securing the Enterprise Cloud – SOC 2 Compliance

10.16.2012 | Posted by Bruno Kurtic, Founding Vice President of Product and Strategy

In our earlier post, Cloudy Compliance Part 1, we discussed general standards, regulations, and some basic compliance concepts.  In Part 2, we further explored the relevance of current standards and regulations, including brief explanations of the American Institute of Certified Public Accountants (AICPA) and its Service Organization Control (SOC) reports.

Today we officially announced the successful completion of our SOC 2 Type 1 examination. Based on Trust Services Principles and Criteria, SOC 2 relates to enterprise-grade assurance, management and confidentiality capabilities.  It’s a significant validation for Sumo Logic, and further proof of the enterprise readiness of our cloud-based log management and analytics service.

What the announcement means to you
As part of the SOC 2 examination, Sumo Logic’s controls over the confidentiality and integrity of customers’ log data and other machine data were evaluated in the following three key areas:

  • Security – The system is protected against unauthorized access (both physical and logical).
  • Availability – The system is available for operation and use as committed or agreed.
  • Confidentiality – Information designated as confidential is protected as committed or agreed.

… Continue Reading

Sumo Logic at AWS Big Data Boston

05.29.2012 | Posted by Bruno Kurtic, Founding Vice President of Product and Strategy

I recently represented Sumo Logic at the AWS Big Data conference in Boston.  It was a great show, very well-attended.  Sumo Logic was one of the few vendors invited to participate.

During the conference I conducted a survey of the attendees to try to understand how this emerging, early-adopter segment of IT professionals manages log data for their infrastructure and applications.  

Common characteristics of attendees surveyed:

  • They run their apps and infrastructure in the cloud
  • They deal with large data sets
  • They came to learn how to better exploit/leverage big data and cloud technologies

What I asked:

  • Do you use logs to help you in your daily work, and if so, how?
  • What types of tools do you use for log analysis and management?
  • What are the specific pain points associated with your log management solutions?

The findings were interesting.  Taking each one in turn:  

No major surprises here.  Enterprises buy IaaS in order to run applications, either for burst capacity or because they believe it’s the wave of the future.  The fact that someone else manages the infrastructure does not change the fact that you have to manage and monitor your applications, operating systems, and virtual machines.


A bit of a surprise here.  In my previous analysis, some 45% of enterprises use homegrown solutions, but in this segment it’s 70%.  Big difference with the big data and cloud crowd.  A possible explanation for this is that existing commercial solutions are not easy to deploy and run in the cloud and don’t scale to handle big data.  So, the solution = build it yourself.  Hmm.

Yes, yes, I know, it adds up to more than 100%.  That’s because the question was stated as “select as many as apply” and many respondents have more than one problem.  So, nothing terribly interesting in there.  But let me dig a bit deeper into issues associated with homegrown vs. commercial.
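If the “more than 100%” part is not obvious, here is a tiny sketch of why it happens with “select as many as apply” questions.  The responses and option labels are hypothetical and invented for illustration; they are not the survey's answers.

```python
from collections import Counter

# Hypothetical multi-select responses, invented for illustration only.
# Each set holds every pain point one respondent selected.
responses = [
    {"too complex", "too expensive"},
    {"too complex"},
    {"does not scale", "too expensive"},
    {"too complex", "does not scale"},
]

def share_by_option(responses):
    """Percentage of respondents who selected each option."""
    counts = Counter(option for selected in responses for option in selected)
    n = len(responses)
    return {option: 100.0 * c / n for option, c in counts.items()}

shares = share_by_option(responses)
total = sum(shares.values())
print(shares)
print(total)
```

Each percentage is taken over respondents, not over selections, so a respondent who picks two options is counted in two shares and the shares sum past 100%.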

 

This makes a bit more sense.  For the home-grown, it looks like complexity is the biggest pain – which makes sense.  Assembling huge systems to support big volumes of log data is more difficult than many people anticipate.  Hadoop and other similar solutions are not optimized to simply and easily deliver answers.  This then leads to the next pain point:  if it is not easy to use, then you don’t use it = does not deliver enough value.  

The responses on commercial solutions make sense as well.  Today’s commercial products are expensive and hard to operate.  On top of the sticker price, you have to spend precious employee time to perform frequent software upgrades and implement “duct tape” scaling.  If you don’t have expertise internally you buy it from vendors’ professional services at beaucoup $$$$$.  You have to get your own compute and storage, which grow as your data volume grows.  So, commercial “run yourself” solutions = very high CAPEX (upfront capital expenditures) and OPEX (ongoing operational expenditures).  In the end (as the second pain point highlights), commercial solutions are also complex to operate and hard to use, requiring highly skilled and hard to find personnel.

Pretty bleak – what now?
At Sumo Logic, we think we have a solution.  The pain points associated with home-grown and commercial solutions that were architected in the last decade are exactly what we set out to solve. We started this company after building, selling and supporting the previous generation of log management and analysis solutions.  We’ve incorporated our collective experience and customer feedback into Sumo Logic.

Built for the cloud
The Sumo Solution is fundamentally different from anything else out there.  It is built for big data and is “cloud native”.  All of the complexities associated with deploying, managing, upgrading, and scaling are gone – we do all that for you.  Our customers get a simple-to-use web application, and we do all the rest.

Elastic scalability
Our architecture is true cloud, not a “cloud-washed” adaptation of on-premises single-instance software solutions that are trying to pass themselves off as cloud.  Each of our services is separate and can be scaled independently.  It takes us minutes to triple the capacity of our system.

Insights beyond your wildest dreams
Because of our architecture, we are able to build analytics at scale.  Our LogReduce™ and Push Analytics™ uncover things that you didn’t even know you should be paying attention to.  The whole value proposition is turned on its head – instead of having to do all the work yourself, our algorithms do the work for you while you guide them to get better over time.

Come try it out and see for yourself: https://www.sumologic.com/free-trial/

Sumo Logic at RSA: Showcasing data security in cloud-based log management

03.16.2012 | Posted by Bruno Kurtic, Founding Vice President of Product and Strategy

I’m proud to announce that Sumo Logic was one of the top 10 finalists at the 2012 Innovation Sandbox, at last week’s RSA Conference Event.  While I’ve been to the RSA Conference many times, this was my first time at the Innovation Sandbox. This year’s conference showcased three important themes in log analysis today:  Big Data volumes, data privacy, and the need for better analytics. 

… Continue Reading
