
Kumar Saurabh, Co-Founder & VP of Engineering

Machine Data Intelligence – an update on our journey

10.16.2014 | Posted by Kumar Saurabh, Co-Founder & VP of Engineering

In 1965, Dr. Hubert Dreyfus, a professor of philosophy at MIT, later at Berkeley, was hired by RAND Corporation to explore the issue of artificial intelligence.  He wrote a 90-page paper called “Alchemy and Artificial Intelligence” (later expanded into the book What Computers Can’t Do) questioning the computer’s ability to serve as a model for the human brain.  He also asserted that no computer program could defeat even a 10-year-old child at chess.

Two years later, in 1967, several MIT students and professors challenged Dreyfus to play a game of chess against MacHack (a chess program that ran on a PDP-6 computer with only 16K of memory). Dreyfus accepted. Dreyfus found a move that could have captured the enemy queen. The only way the computer could get out of this was to keep Dreyfus in check with its own queen until it could fork his queen and king, and then exchange them. And that’s what the computer did. The computer checkmated Dreyfus in the middle of the board.

I’ve brought up this “man vs. machine” story because I see another domain where a similar change is underway: the field of Machine Data.

Businesses run on IT, and IT infrastructure is getting bigger by the day, yet IT operations still depend on analytics tools with very basic monitoring logic. As the systems become more complex (and more agile), simple monitoring just doesn’t cut it. We cannot support or sustain the necessary speed and agility unless the tools become much more intelligent.

We believed this when we started Sumo Logic and, with the lessons learned from running a large-scale system ourselves, we continue to invest in making operational tooling more intelligent. We knew the market needed a system that complemented human expertise. Humans don’t scale that well – our memory is imperfect, so the ideal tools should pick up on signals that humans cannot, and at a scale that matches the needs of the business and today’s volume of IT data exhaust.

Two years ago we launched our service with a pattern recognition technology called LogReduce and about five months ago we launched Structure Based Anomaly Detection. And the last three months of the journey have been a lot like teaching a chess program new tricks – the game remains the same, just that the system keeps getting better at it and more versatile.

We are now extending our Structure Based Anomaly Detection capabilities with Metric Based Anomaly Detection. A metric can be just that – a time series of numerical values. You can take any log, filter, aggregate and pre-process it however you want – and if you can turn that into a number with a time stamp – we can baseline it, and automatically alert you when the current value of the metric goes outside an expected range based on its history. We developed this new engine in collaboration with the Microsoft Azure Machine Learning team, and they have some really compelling models to detect anomalies in a time series of metric data – you can read more about that here.
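To make the idea of "baseline it and alert when it leaves the expected range" concrete, here is a minimal sketch. It is not the Azure Machine Learning model or the Sumo Logic engine; it assumes a much simpler stand-in technique (a rolling mean plus or minus a multiple of the standard deviation), and the function name, the window size, the threshold and the sample numbers are all hypothetical.

```python
from statistics import mean, stdev


def find_anomalies(series, window=5, k=3.0):
    """Flag points that fall outside a simple rolling baseline.

    series: list of (timestamp, value) pairs in time order.
    window: number of preceding points used as the history (assumed).
    k:      how many standard deviations count as "outside" (assumed).
    """
    anomalies = []
    for i in range(window, len(series)):
        # The baseline is built only from the points before the current one.
        history = [value for _, value in series[i - window:i]]
        mu = mean(history)
        sigma = stdev(history)
        ts, value = series[i]
        # Floor sigma so a perfectly flat history does not divide the
        # world into "identical" and "anomalous".
        if abs(value - mu) > k * max(sigma, 1e-9):
            anomalies.append((ts, value))
    return anomalies


# Hypothetical metric: milliseconds spent in garbage collection per minute,
# as it might look after filtering and aggregating raw log lines.
gc_ms = [(0, 100), (1, 102), (2, 98), (3, 101), (4, 99),
         (5, 100), (6, 500), (7, 101)]

print(find_anomalies(gc_ms))
```

With these made-up numbers only the spike at minute 6 is flagged. A real engine has to cope with seasonality, trends and noisy baselines, which is where more sophisticated models earn their keep.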

The hard part about Anomaly Detection is not detecting anomalies – it is detecting anomalies that are actionable. Making an anomaly actionable begins with making it understandable. Once an analyst or an operator can grok the anomalies, they are much more willing to alert on them, build a playbook around them, or even hook up automated remediation to the alert – the Holy Grail.

And not all Anomaly Detection engines are equal. Like chess programs, there are ones that can beat a 5-year-old and others that can even beat the grandmasters. And we are well on our way to building a comprehensive Anomaly Detection engine that becomes a critical tool in every operations team’s arsenal. The key question to ask is: does the engine tell you something that is insightful, actionable and that you could not have found with standard monitoring tools?

Below is an example of an actual Sumo production use case where some of our nodes were spending a lot of time in garbage collection, impacting dashboard refresh rates for some of our customers.

[Image: anomaly_detection_metric]

If this looks interesting, our Metric Based Anomaly Detection service based on Azure Machine Learning is being offered to select customers in a limited beta release and will be coming soon to machines…err..a browser near you (we are a cloud based service after all).

P.S. If you like stories, here is another one for you. Thirty years after MacHack beat Dreyfus, in 1997, Kasparov (arguably one of the best human chess players) played the Caro-Kann Defence. He then allowed Deep Blue to commit a knight sacrifice, which wrecked his defenses and forced him to resign in fewer than twenty moves. Enough said.

References

[1] http://www.chess.com/article/view/machack-attack

2013: The year of machine data science?

01.10.2013 | Posted by Kumar Saurabh, Co-Founder & VP of Engineering

Ever since I was a kid, I have had a fascination with chess playing programs – until it got to a point where it became impossible for me to beat a good chess program. And years ago, not long after I gave up my personal fight with them, the last man standing lost to the best chess playing program. Clearly for chess, machine intelligence overtook human intelligence that day.

Another area where machine intelligence has evolved to a point where it’s better than human intelligence is the maps program. I used to have to carry a road atlas with me or risk spending a lot of time just finding my way back on track. It got a little better when you could take a printout, but if I missed an exit or wanted to go for a scenic detour – I was again on my own. Not any more. Now I can simply plug in my phone, speak the next destination, and it guides me patiently to that destination – recalculating the route if I miss an exit, heck, even warning me when the route is blocked with traffic. These are just a couple of examples of how technology evolves to a point that would have seemed a sci-fi fantasy 10-15 years ago. And it fundamentally changes how we all go about our lives.

Machine Data Analytics seems like another area desperately in need of a similar evolution. Machine Data Analytics has to evolve into Machine Data Science – and it has to evolve to a point where we depend on it and use it just as I rely on maps and navigation on my cell phone. And Sumo Logic is at the forefront of making that change happen – and there are some fundamental shifts in computing technology – changes which bring that breakthrough within reach.

Cloud has become as mainstream as video streaming. And just like video streaming completely disrupted brick and mortar DVD rental businesses, Cloud has already brought, and continues to bring, fundamentally disruptive technologies to life. So what does Cloud mean for Machine Data? It will be about generating sophisticated insights from the data generated by IT today. Machine data is already one of the biggest sources of “Big Data” in enterprises. It will be about delivering smarter analytics at scale with the simplicity of a service. As the new year begins, I feel proud and satisfied with what we have accomplished in the last two and a half years. And super excited about the journey ahead – a future is waiting to be invented. :)


Do you want to buy a Big Data zoo?

04.12.2012 | Posted by Kumar Saurabh, Co-Founder & VP of Engineering

How exciting can a discussion at 5PM on a Friday be? Very exciting, in fact, if you are talking to industry analyst Vanessa Alvarez (@vanessaalvarez1) from Forrester about Big Data.

Last Friday it turned out that we had tons to talk about together regarding recent developments in the Big Data space. Vanessa has a unique take on Big Data — she thinks “Analytics as a Service” is going to gain a lot of traction soon. And that line of thinking resonates with us a lot.

At Sumo Logic you’ll hear us using terms like Cloud, SaaS, elastic scalability… but the most exciting angle for us has always been the *aaS angle, the fact that our solution is a service. We believe that log analytics should be easy to use, and by lowering the effort it takes to perform log analytics, we can make this kind of technology much more widely accessible. A “Log Analytics as a Service” solution aims to do just that — shorten and democratize the path from data to insights.

So, the real question is not if you are Mac or PC — but rather are you a Mac or Linux guy — when it comes to log management.  The choice is — do you really want to build and tweak and operate and maintain your log management system (the big data zoo in other words), or do you just need a solution that delivers log analytics in the most efficient way possible.

We still find a lot of prospects who think that they need to roll out their own log management system using a lot of new stacks (Hadoop, Cassandra, Solr, Hive…). We use similar technologies under the hood at Sumo, but we handle all the operational overhead that comes with this, and we certainly don’t shy away from fixing and optimizing pieces that don’t work, or don’t deliver the performance we need to deliver.

So, if you do not have extremely specialized requirements, is it worth rolling out your own log management system? Is it worth all the operational overhead? Or would you rather use a service? Curious to hear your thoughts; please feel free to share them in the comments, or shoot me an email at kumar@sumologic.com


What the heck is LogReduce

03.23.2012 | Posted by Kumar Saurabh, Co-Founder & VP of Engineering

As anybody who has worked with log data will tell you, one of the major problems is the sheer volume of this data, and the horsepower required to crunch it. And even if you can process it, you’re faced with a second problem: how to make sense of it all. And while there’s been progress on both fronts in the past ten years, the tools and techniques haven’t kept up with the explosion in data volume.

You can spend hours looking into logs, and still only understand a tiny fraction of them. It’s become such an overwhelming task that IT has generally given up on looking at logs proactively. And on the occasions when they do, it’s because something bad has happened, which means they’re in reactive mode, forced to dive CSI-style into the log forensics in the hope of finding the answer.

… Continue Reading


Sumo Logic Launches

01.31.2012 | Posted by Kumar Saurabh, Co-Founder & VP of Engineering

On behalf of my co-founder Christian Beedgen and the entire Sumo Logic team, I’m proud to be able to launch the Sumo Logic service.  It’s been an intense and productive two years, and we’re extremely proud of what we have accomplished.

Christian and I spent nearly a decade together at ArcSight, the security log management company purchased by HP two years ago. We were amazed, pained and disappointed by how much enterprise software asks of its users. We had seen first hand the song and dance during sales cycles, the weeks and months of professional services before you can get value out of your investment, the documents and brochures meant more to obfuscate and impress than to help, as well as the glacial speed of innovation. As we talked together, we realized we both had the same goal: to develop enterprise software that was easy to deploy, was powerful yet simple, and that delivered on the promise of actionable insights from IT data. To that end, Sumo Logic was founded.

… Continue Reading
