Sanjay Sarathy, CMO

The Oakland A’s, the Enterprise and the Future of Data Innovation

01.23.2014 | Posted by Sanjay Sarathy, CMO

Remember Moneyball? Moneyball is the story of how the performance of the Oakland A’s skyrocketed when they started to vet players based on sabermetrics principles, a data-driven solution that defied conventional wisdom. The team’s success with a metrics-driven approach only came about because GM Billy Beane and one of his assistants, Paul DePodesta, identified the value in player statistics and trusted these insights over what other baseball teams had accepted was true. Any business can learn a significant lesson from Billy Beane and Paul DePodesta, and it is a lesson that speaks volumes about the future of data in business.

If a business wants their data to drive innovation, they need to manage that data like the Oakland A’s did. Data alone does not reveal actionable business insights; experienced analysts and IT professionals must interpret it. Furthermore it’s up to business leaders to put their faith in their data, even if it goes against conventional wisdom.

Of course, the biggest problem companies confront with their data is the astronomical volume. While the A’s had mere buckets of data to pour through, the modern enterprise has to deal with a spewing fire hose of data. This constant influx of data generated by both humans and machines has paralyzed many companies who often never analyze the data available to them or just analyze the data reactively. Reactive data analysis, while useful to interpret what happened in the past, can’t necessarily provide insights into what might occur in the future.  Remember your mutual fund disclaimer?

Innovation in business will stem from companies creating advantages via proactive use of that data. Case in point: Amazon’s new initiative to anticipate customers’ purchases and prepare shipping and logistics “ahead of time.”

The ability to be proactive with machine data won’t be driven simply by technology. It will instead stem from companies implementing their own strategic combination of machine learning and human knowledge. Achieving this balance to generate proactive data insights has been the goal of Sumo Logic since day one. While we have full confidence in our machine data intelligence technologies to do just that, we also know that is not the only solution that companies require. The future of data in the enterprise depends on how companies manage their data. If Billy Beane and Paul DePodesta effectively managed their data to alter the trajectory of the Oakland A’s, there is no reason that modern businesses cannot do the same.

This blog was published in conjunction with ‘Data Innovation Day’

Jim Wilson

Why I Joined Sumo Logic

01.16.2014 | Posted by Jim Wilson

Today I joined Sumo Logic, a cloud-based company that transforms Machine Data into new sources of operations, security, and compliance insights. I left NICE Systems, a market leader and successful organization that had acquired Merced Systems, where I led the Sales Organization for the past 6 years. I had a good position and enjoyed my role, so why leave? And why go to Sumo Logic versus many other options I considered? Many of my friends and colleagues have asked me this, so I wanted to summarize my thinking here.

First, I believe the market that Sumo Logic is trying to disrupt is massive. Sumo Logic, like many companies in Silicon Valley these days, manages Big Data. As Gartner recently noted, the concept of Big Data has now reached the peak of the Hype Cycle. The difference is that Sumo Logic actually does this by generating valuable insights from machine data (primarily log files). As a board member told me, people don’t create Big Data nearly as much as machines do. The emergence in the last 10+ years of cloud solutions, and the proliferation of the Internet and web based technologies in everything we do, in every aspect of business, has created an industry that did not exist 10 years ago. By now it’s a foregone conclusion that cloud technologies and cloud vendors like Amazon Web Services and Workday will ultimately be the solution of choice for all companies, whether they are small mom-and-pop shops or large Global Enterprises. I wanted to join a company that was solving a problem that every company has, and doing it using the most disruptive platform, Software-As-A- Service.

Equally important is my belief that it’s possible to build a better sales team that can make a difference in the traditional Enterprise Sales Process. Sumo Logic competes in a massive market with only one established player, Splunk.  I believe that our capabilities, specifically Machine Data Analytics, are truly differentiated in the market. However, I am also excited to build a sales team that customers and prospects will actually want to work with. Just like technology has evolved (client server, web, cloud) I believe the sales profession needs to as well. Today’s sales organization needs to add value to the sales process, not just get in the way. This means we need to understand more about the product than we describe on the company’s website, be able to explain how our product is different from other choices, and how our service will uniquely solve the complex problems companies face today. I am excited to build an organization that will have a reputation of being knowledgeable about the industry and its ecosystem, will challenge customer thinking while understanding their requirements, and will also be fun to work with. The team at Sumo Logic understands this, and I look forward to delivering on this promise.

Finally, I think Sumo Logic has a great product. I started my sales career at Parametric Technology Corporation (PTC). Selling Pro/ENGINEER was a blast and set the gold standard for great products – everything from watching reactions during demos to hearing loyal customers rave about the innovative work they were doing with the product. I had a similar experience at Groove Networks watching Ray Ozzie and his team build a great product that was ultimately acquired by Microsoft. Sumo Logic seems to be generating that same product buzz. We have some amazing brand names like Netflix, Orange, McGraw-Hill, and Scripps Networks as our customers. These and the other customers we have are generating significant benefits from using our machine data intelligence service. The best measure of a company is the passion of their customer base. The energy and loyalty that our customer base exhibits for the Sumo Logic service is a critical reason why I’m very bullish about the long-term opportunity.

I am fired up to be a part of this organization. The management team and in particular Vance, Mark, and the existing sales team are already off to a great start and have grown sales significantly. I hope to build on their early success, and I will also follow the advice a good friend recently gave me when he heard the news: “You found something good – don’t screw it up!”

I won’t.

Our SOC 2 Report: A Matter of Trust

01.14.2014 | Posted by Joan Pepin, Director of Security

Today we announced that Sumo Logic has successfully completed the Service Organization Controls (SOC) Type 2 examination of the Trust Service Principles; Security, Availability and Confidentiality. Frankly, this is a pretty big deal and something we have been working towards for a while (we achieved our SOC 2 Type 1 in August of 2012) so I’m here to explain a little bit about what that means for you.

In case you’re not familiar with the SOC 2 Type 2 it may help you to know that that the SOC family of reports was implemented by the American Institute of Certified Public Accountants (the AICPA) as a replacement for the venerable old SAS-70 report back in 2011. (So if you’re still asking your vendors for their SAS-70, you’re behind the times a bit- I get this a lot- it’s usually followed by questions about our backup tapes on security assessment paperwork that hasn’t been updated since it was noisily written in Lotus Notes(™) on this bad-boy…)

 

The main purpose of the SOC 2 Type 2 report is to show our customers that an independent third party has evaluated our controls and our adherence to those controls over a period of time. In the words of the AICPA, a SOC 2 report is ideal for:

“A Software-as-a-Service (SaaS) or Cloud Service Organization that offers virtualized computing environments or services for user entities and wishes to assure its customers that the service organization maintains the confidentiality of its customers’ information in a secure manner and that the information will be available when it is needed. A SOC 2 report addressing security, availability and confidentiality provides user entities with a description of the service organization’s system and the controls that help achieve those objectives. A type 2 report also helps user entities perform their evaluation of the effectiveness of controls that may be required by their governance process.”

The major areas of the SOC report are called “Trust Service Principles” because Trust is what this is all about. Once again in the words of the AICPA:

“Trust Services helps differentiate entities from their competitors by demonstrating to stakeholders that the entities are attuned to the risks posed by their environment and equipped with the controls that address those risks. Therefore, the potential beneficiaries of Trust Services assurance reports are consumers, business partners, creditors, bankers and other creditors, regulators, outsourcers and those using outsourced services, and any other stakeholders who in some way rely on electronic commerce (e-commerce) and IT systems.”

You know how you handle your data, but before you hand it over to someone else, you should know a good deal about how they are going to handle it, and because trust is based on openness your data services vendors should be extremely open about that.

Because trust is an important factor in any business relationship, our report lists 263 controls around Security, Availability and Confidentiality put into effect at Sumo Logic and the tests that our examiners (The wonderful people at Brightline CPAs & Associates) performed. This is an extremely thorough overview of what we do to ensure that we deserve your trust, and if you are considering sending us your data, you should ask us for a copy and look it over. And If you are considering any of our competitors, you should also ask to see their third-party assessment. (Hint: They don’t have one.)

Manish Khettry

Sumo Logic Deployment Infrastructure and Practices

01.08.2014 | Posted by Manish Khettry

Introduction

Here at Sumo Logic, we run a log management service that ingests and indexes many terabytes of data a day; our customers then use our service to query and analyze all of this data. Powering this service are a dozen or more separate programs (which I will call assembly from now on), running in the cloud, communicating with one another. For instance the Receiver assembly is responsible for accepting log lines from collectors running on our customer host machines, while the Index assembly creates text indices for the massive amount of data pumping into our system constantly being fed by the Receivers.

We deploy to our production system multiple times each week, while our engineering teams are constantly building new features, fixing bugs, improving performance, and, last but not least, working on infrastructure improvements to help in the care and wellbeing of this complex big-data system. How do we do it? This blog post tries to explain our (semi)-continuous deployment system.

Running through hoops

In any continuous deployment system, you need multiple hoops that your software must pass through, before you deploy it for your users. At Sumo Logic, we have four well defined tiers with clear deployment criteria for each.  A tier is an instance of the entire Sumo Logic service where all the assemblies are running in concert as well as all the monitoring infrastructure (health checks, internal administrative tools, auto-remediation scripts, etc) watching over it.

Night

This is the first step in the sequence of steps that our software goes through. Originally intended as a nightly deploy, we now automatically deploy the latest clean builds of each assembly on our master branch several times every day. A clean build means that all the unit tests for the assemblies pass. In our complex system, however, it is the interaction between assemblies which can break functionality. To test these, we have a number of integration tests running against Night regularly. Any failures in these integration tests are an early warning that something is broken. We also have a dedicated person troubleshooting problems with Night whose responsibility it is, at the very least, to identify and file bugs for problems.

Stage

We cut a release branch once a week and use Stage to test this branch much as we use Night to keep master healthy. The same set of integration tests that run against Night also run against Stage and the goal is to stabilize the branch in readiness for a deployment to production. Our QA team does ad-hoc testing and runs their manual test suites against Stage.

Long

Right before production is the Long tier. We consider this almost as important as our Production tier. The interaction between Long and Production is well described in this webinar given by our founders. Logs from Long are fed to Production and vice versa, so Long is used to monitor and trouble shoot problems with Production.

Deployments to Long are done manually a few days before a scheduled deployment to Production from a build that has passed all automated unit tests as well as integration tests on Stage. While the deployment is manually triggered, the actual  process of upgrading and restarting the entire system is about as close to a one-button-click as you can get (or one command on the CLI)!

Production

After Long has soaked for a few days, we manually deploy the software running on Long to Production, the last hoop our software has to jump through. We aim for a full deployment every week and often times will do smaller upgrades of our software between full deploys.

Being Production, this deployment is closely watched and there are a fair number of safeguards built into the process. Most notably, we have two dedicated engineers who manage this deployment, with one acting as an observer. We also have a tele-conference with screen sharing that anyone can join and observe the deploy process.

Social Practices

Closely associated with the software infrastructure are the social aspects of keeping this system running. These are:

Ownership

We have well defined ownership of these tiers within engineering and devops which rotate weekly. An engineer is designated Primary and is responsible for Long and Production. Similarly we have a designated Jenkins Cop role, to keep our continuous integration system and Night and Stage healthy.

Group decision making and notifications

We have a short standup everyday before lunch, which everyone in engineering attends. The Primary and Jenkins Cop update the team on any problems or issues with these tiers for the previous day.

In addition to a physical meeting, we use Campfire, to discuss on-going problems and notifying others of changes to any of these tiers. If someone wants to change a configuration property on night to test a new feature, the person would update everyone else on campfire. Everyone (and not just the Primary or Jenkins Cop) is in the loop about these tiers and can jump in to troubleshoot problems.

Automate almost everything. A checklist for the rest.

There are certain things that are done or triggered manually. In cases where humans operate something (a deploy to Long or Production for instance), we have a checklist for engineers to follow. For more on checklists, I refer you to an excellent book, The Checklist Manifesto.

Conclusion

This system has been in place since Sumo Logic went live and has served us well. It bears mentioning that the key to all of this is automation, uniformity, and well-delineated responsibilities. For example, spinning up a complete system takes just a couple of commands in our deployment shell. Also, any deployment (even a personal one for development) comes up with everything pre-installed and running, including health checks, monitoring dashboards or auto-remediation scripts. Identifying and fixing a problem on Production is no different from that on Night. In almost every way (except for waking up the Jenkins Cop in the middle of the night and the sizing), these are identical tiers!

While automation is key, it doesn’t take away the fact that people who run and keep things healthy. A deployment to production can be stressful, more so for the Primary than anyone else and having a well defined checklist can take away some of the stress.

Any system like this needs constant improvements and since we are not sitting idle, there are dozens of features, big and small that need to be worked on. Two big ones are:

  • Red-Green deployments, where new releases are rolled out to a small set of instances and once we are confident they work, are pushed to the rest of the fleet.

  • More frequent deployments of smaller parts of the system. Smaller more frequent deployments are less risky.

In other words, there is a lot of work to do. Come join us at Sumo Logic!

Twitter