Black Friday, Cyber Monday and Machine Data Intelligence

11.25.2013 | Posted by Vance Loiselle, CEO

The annual craze of getting up at 4am to either stand in line or shop online for the “best” holiday deals is upon us.  I know first-hand, because my daughter and I have participated in this ritual for the last four years (I know – what can I say – I grew up in Maine).  While we are at the stores fighting for product, many Americans will be either watching football or surfing the web from the comfort of their couch, looking for that too-good-to-be-true bargain.  And with data indicating a 50% jump in Black Friday and Cyber Monday deals this year, it’s incumbent on companies to ensure that user experiences are positive.  As a result, leading companies are realizing the need to obtain end-to-end visibility across their applications and infrastructure, from the origin to the edge.  Insights from machine data (click-stream data in the form of logs) generated by these environments help retailers of all stripes maximize these two critical days and the longer-term holiday shopping season.

What are the critical user and application issues that CIOs should be thinking about in the context of these incredibly important shopping days?

  • User Behavior Insights. From an e-commerce perspective, companies can use log data to obtain detailed insights into how their customers are interacting with the application, what pages they visit, how long they stay, and the latency of specific transactions.  This helps companies, for example, correlate user behavior with the effectiveness of specific promotional strategies (coupons, etc.) and rapidly make adjustments before the holiday season ends (a small sketch of this kind of log analysis follows this list).

  • The Elasticity of The Cloud.  If you’re going to have a problem, better it be one of “too much” rather than “too little”.  Too frequently, we hear of retail web sites going down during this critical time.  Why? The inability to handle peak demand – often because they don’t know what that demand will be.  Companies need to understand how to provision for the surge in customer interest on these prime shopping days, a surge that in turn delivers an exponential increase in the volume of log data.  The ability to provide the same level of performance at 2, 3 or even 10x usual volumes in a *cost-effective* fashion is a problem few companies have truly solved.  The ability of cloud-based architectures to easily load-balance and provision for customer surges at any time is critical to maintaining that ideal shopping experience while still delivering the operational insights needed to support customer SLAs.

  • Machine Learning for Machine Data. It’s difficult enough for companies to identify the root cause of an issue that they know something about.  Far more challenging for companies is getting insights into application issues that they know nothing about.  However, modern machine learning techniques provide enterprises with a way to proactively uncover the symptoms, all buried within the logs, that lead to these issues.  Moreover, machine learning eliminates the traditional requirement of users writing rules to identify anomalies, which by definition limit the ability to understand *all* the data.  We also believe that the best analytics combine machine learning with human knowledge about the data sets – what we call Machine Data Intelligence – and that helps companies quickly and proactively root out operational issues that limit revenue generation opportunities.

  • Security and Compliance Analytics.  With credit cards streaming across the internet in waves on this day, it’s imperative that you’ve already set up the necessary environment to both secure your site from fraudulent behavior and ensure your brand and reputation remain intact.  As I mentioned in a previous post, the notion of a perimeter has long since vanished, which means companies need to understand that user interactions might occur across a variety of devices on a global basis.  The ability to proactively identify what is happening in real-time across your applications and the infrastructure on which they run is critical to your underlying security posture.  All of this is made possible by your logs and the insights they contain.
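
To make the first point concrete, here is a minimal sketch of that kind of clickstream analysis in Python. The log format, file name, and field names are hypothetical – assume a web access log where each line carries a timestamp, a session ID, the page path, a response time in milliseconds, and an optional coupon code:

```python
import re
from collections import defaultdict

# Hypothetical access-log line:
# 2013-11-29T04:01:12Z session=ab12 path=/checkout latency_ms=840 coupon=BF50
LINE = re.compile(
    r'(?P<ts>\S+) session=(?P<session>\S+) path=(?P<path>\S+) '
    r'latency_ms=(?P<latency>\d+)(?: coupon=(?P<coupon>\S+))?'
)

latency_by_path = defaultdict(list)   # page -> response times seen
visits_by_coupon = defaultdict(int)   # promo code -> number of hits

with open("access.log") as f:         # hypothetical log file
    for line in f:
        m = LINE.match(line)
        if not m:
            continue
        latency_by_path[m.group("path")].append(int(m.group("latency")))
        if m.group("coupon"):
            visits_by_coupon[m.group("coupon")] += 1

# Slowest pages first, so the worst user experience surfaces immediately
for path, samples in sorted(latency_by_path.items(),
                            key=lambda kv: -sum(kv[1]) / len(kv[1])):
    print(f"{path}: {sum(samples) / len(samples):.0f} ms avg over {len(samples)} hits")

# How often each promotion actually drove a visit
for coupon, count in sorted(visits_by_coupon.items(), key=lambda kv: -kv[1]):
    print(f"coupon {coupon}: {count} visits")
```

The same few dozen lines of parsing, run continuously over streaming logs rather than a single file, is essentially what a log analytics service does at scale.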

Have a memorable shopping season, and join me on Twitter – @vanceloiselle – to continue the conversation.

What CIOs (and I) Can Learn From Healthcare.gov

11.19.2013 | Posted by Vance Loiselle, CEO

There is little debate that the “Obamacare” rollout has been choppy at best.  Regardless of which side of the political debate you fall on, many of us in technology, CIOs and software executives alike, can learn from this highly publicized initiative as we approach the November 30th deadline.  Development of new applications, especially web applications, is no longer just about a myopic focus on Design, Develop, Test, and Rollout.  The successful development and deployment of these applications requires a holistic, information-driven approach, which includes the following four key processes:

  1. Application Quality Analytics – the constant tracking of the errors, exceptions, and problems that are occurring in each new release of the code.
  2. Application Performance Analytics – the real-time measurement of the performance of the application as the users are experiencing it.
  3. Security Analytics – the real-time information required to analyze and conduct forensics on the security of the entire application and the infrastructure on which it runs.
  4. User Analytics – real-time insights on which users are in the application, what pages they are viewing, and the success they’ve had in conducting transactions in the application.

Application Quality Analytics – Is it really necessary, in the year 2013, that application development still be four parts art and one part science?  I’m sure that the Secretary of Health and Human Services, Kathleen Sebelius, wished it were more science when she testified in front of Congress about why the site was not ready.  She had no real-time information or metrics at her disposal about the number of defects being fixed each day, the number of errors being encountered by users, the severity of those errors, and the pace at which these errors and defects were being resolved.

These metrics are sitting there in the log files (the data exhaust that applications and software components emit to track what the software is doing), and they are largely untapped by most development organizations.  Moreover, this data could be shared between multiple teams to pinpoint the root cause of problems between the application itself and the network and infrastructure on which it is running.  It was so frustrating to see CGI (the contractor hired to build the application) and Verizon (the contractor hired to host the application in their “cloud”) passing the buck back and forth in front of Congress.
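
To illustrate how much of this is already sitting in the logs, here is a minimal sketch that tallies errors by day and by severity. The file name and line format are hypothetical – assume each log line starts with a date and carries a severity token:

```python
import re
from collections import Counter

# Hypothetical application log line:
# 2013-11-19 10:42:07,113 ERROR AccountService - plan comparison failed: timeout
LINE = re.compile(r'^(?P<day>\d{4}-\d{2}-\d{2}) \S+ (?P<severity>WARN|ERROR|FATAL)\b')

errors_per_day = Counter()        # daily error volume - is it trending down?
errors_by_severity = Counter()    # breakdown by severity

with open("application.log") as f:     # hypothetical log file
    for line in f:
        m = LINE.match(line)
        if not m:
            continue
        errors_per_day[m.group("day")] += 1
        errors_by_severity[m.group("severity")] += 1

print("Errors per day:")
for day in sorted(errors_per_day):
    print(f"  {day}: {errors_per_day[day]}")
print("Errors by severity:", dict(errors_by_severity))
```

Those two counters are exactly the kind of real-time metric that could have been cited in front of Congress.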

Application Performance Analytics – Much has been made about the performance of Healthcare.gov.  The HHS secretary even had the gall to say that the site had not crashed, it was just “performing slowly”, while in the congressional hearing there was a live image on the screen informing users that the site was down.  The site was down AND performing slowly because the site’s developers are stuck in a previous generation of thinking – that you can measure site performance without taking into account user analytics.  It’s not good enough to measure application performance by sampling transaction response times periodically.  Testers and managers need access to real-time information about each user, the session they were running, the performance at each step, and the outcomes (e.g. new plan sign-up created or failed, 5 insurance plans compared, pricing returned from 2 out of 3 carriers, etc.) along the way.  Most monitoring tools look at just the application, or just the network and infrastructure it runs on, and have little to no visibility into the outcomes the user is experiencing.  Guess what?  Most, if not all, of this information is sitting in the logs waiting to be harnessed.
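
As a rough sketch of what per-user, per-step visibility looks like when it comes straight from the logs – the session field, step names, and outcomes below are hypothetical:

```python
import re
from collections import defaultdict

# Hypothetical application log line, one event per user step:
# 2013-11-19T10:42:07Z session=u-991 step=compare_plans duration_ms=5400 outcome=ok
LINE = re.compile(
    r'\S+ session=(?P<session>\S+) step=(?P<step>\S+) '
    r'duration_ms=(?P<ms>\d+) outcome=(?P<outcome>\S+)'
)

sessions = defaultdict(list)   # session id -> ordered (step, ms, outcome) events

with open("application.log") as f:      # hypothetical log file
    for line in f:
        m = LINE.match(line)
        if m:
            sessions[m.group("session")].append(
                (m.group("step"), int(m.group("ms")), m.group("outcome")))

for session, steps in sessions.items():
    total_ms = sum(ms for _, ms, _ in steps)
    failed = [step for step, _, outcome in steps if outcome != "ok"]
    status = "FAILED at " + ", ".join(failed) if failed else "completed"
    print(f"{session}: {len(steps)} steps, {total_ms} ms total, {status}")
```

That is the difference between “the site is performing slowly” and knowing which step failed for which users, and how often.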

Security Analytics – I appreciated when Secretary Sebelius was asked about what steps had been taken to ensure the privacy and security of the data that fellow Americans had submitted on Healthcare.gov.  The reality is that most IT organizations are sharply bifurcated between security and application development.  The old-school view is that you put a web application firewall in place, you close down the ports, and your perimeter is safe.  The reality today is that there is no perimeter.  People have mobile phones and tablets and use third-party services to store their documents.  Healthcare.gov itself is dependent on third parties (insurance carriers) to provide and receive private information.

The most effective way today to ensure some level of security is to have a real-time security analytics and forensics solution in place.  These solutions can scan every element of user and system activity, from – you guessed it – the log data, and determine everything from invalid logins and potential breaches to unauthorized changes to firewall rules and user permissions.     
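
A minimal sketch of that kind of scan, assuming a consolidated log where failed logins and firewall changes show up as plain text lines (the file name, patterns, and threshold are all hypothetical):

```python
import re
from collections import Counter

FAILED = re.compile(r'login (failed|invalid).*?from (?P<ip>\d+\.\d+\.\d+\.\d+)', re.I)
FIREWALL = re.compile(r'firewall rule (added|removed|modified)', re.I)

failed_logins = Counter()   # source IP -> failed login attempts
firewall_changes = []       # raw lines recording firewall rule changes

with open("security.log") as f:     # hypothetical consolidated log
    for line in f:
        m = FAILED.search(line)
        if m:
            failed_logins[m.group("ip")] += 1
        if FIREWALL.search(line):
            firewall_changes.append(line.strip())

THRESHOLD = 10   # arbitrary cutoff for this sketch
for ip, count in failed_logins.items():
    if count >= THRESHOLD:
        print(f"possible brute force: {ip} with {count} failed logins")

print(f"{len(firewall_changes)} firewall rule changes to review")
```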

User Analytics – Ok, I get it, the Obama administration did not want to share information about how many people had signed up on Healthcare.gov for weeks.  The argument was made that the accuracy of the data could not be trusted.  Whether it was a political maneuver or incompetence, either reason is sad in the year 2013.  And why do the White House and HHS have the right to keep this information a secret?  The American taxpayers are paying the hefty sum of $200M+ to get this application up and running.  Shouldn’t we know, in real time, the traction and success of the Affordable Care Act?  It should be posted on the home page of the web site.  I guarantee the information about every enrollee, every signup – successful or failed – every county from which they logged in, every plan that was browsed, every price that was displayed, and every carrier that’s providing quotes was AVAILABLE IN THE LOG DATA.

There has been a lot of coverage about President Kennedy recently, and we are reminded that he challenged our government to put people on the moon in the 1960s, and they did – with a very limited set of computer and software tools at their disposal.  I would ask the CIOs and software types out there: let’s learn from the Healthcare.gov rollout and embrace a modern, information-driven approach to developing and rolling out applications.  And President Obama, if you need some help with this, give me a shout – I’m ready to serve.

Sumo Logic Application for AWS CloudTrail

11.13.2013 | Posted by Bruno Kurtic, Founding Vice President of Product and Strategy

Cloud is opaque

One of the biggest adoption barriers for SaaS, PaaS, and IaaS is the opaqueness and lack of visibility into the changes and activities that affect cloud infrastructure.  When running on-premises infrastructure, you have the ability to audit activity; for example, you can easily tell who is starting and stopping VMs in virtualization clusters, see who is creating and deleting users, and watch who is making firewall configuration changes. This lack of visibility has been one of the main roadblocks to adoption, even though the benefits have been compelling enough for many enterprises to adopt the Cloud anyway.

This information is critical to securing infrastructure, applications, and data. It’s critical to proving and maintaining compliance, critical to understanding utilization and cost, and finally, it’s critical for maintaining excellence in operations.

Not all Clouds are opaque any longer

Today, the world’s biggest cloud provider, Amazon Web Services (AWS), announced a new product that, in combination with Sumo Logic, changes the game for cloud infrastructure audit visibility.  AWS CloudTrail is the raw log data feed that tells you exactly who is doing what, on which sets of infrastructure, at what time, from which IP addresses, and more.  Sumo Logic is integrated with AWS CloudTrail, collecting this audit data in real time and enabling SOC- and NOC-style visibility and analytics.

Here are a few examples of what AWS CloudTrail data contains, from network access to user activity (a sketch of a single record follows this list):

  • Network ACL changes.

  • Creation and deletion of network interfaces.

  • Authorized Ingress/Egress across network segments and ports.

  • Changes to privileges, passwords and user profiles.

  • Deletion and creation of security groups.

  • Starting and terminating instances.

  • And much more.
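
To give a sense of what one of these records looks like, here is a short sketch that pulls the “who, what, when, where” out of a single CloudTrail event. The record below is abbreviated and illustrative rather than a verbatim AWS sample, but the field names (eventTime, eventName, eventSource, sourceIPAddress, userIdentity) are the ones CloudTrail uses:

```python
import json

# Abbreviated, illustrative CloudTrail record for a security-group change
raw = """
{
  "eventTime": "2013-11-13T17:02:36Z",
  "eventSource": "ec2.amazonaws.com",
  "eventName": "AuthorizeSecurityGroupIngress",
  "awsRegion": "us-east-1",
  "sourceIPAddress": "203.0.113.42",
  "userIdentity": {"type": "IAMUser", "userName": "alice"},
  "requestParameters": {"groupId": "sg-12345678"}
}
"""

event = json.loads(raw)
who = event["userIdentity"].get("userName", "unknown")
print(f'{event["eventTime"]}: {who} called {event["eventName"]} '
      f'on {event["eventSource"]} from {event["sourceIPAddress"]}')
```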

Sumo Logic Application for AWS CloudTrail

Cloud data comes to life with our Sumo Logic Application for AWS CloudTrail, helping our customers across security and compliance, operational visibility, and cost containment. Sumo Logic Application for AWS CloudTrail delivers:

  • Seamless integration with AWS CloudTrail data feed.

  • SOC-style, real-time Dashboards in order to monitor access and activity.

  • Forensic analysis to understand the “who, what, when, where, and how” of events and logs.

  • Alerts when important activities and events occur.

  • Correlation of AWS CloudTrail data with other security data sets, such as intrusion detection system data, operating system events, application data, and more.

This integration delivers improved security posture and better compliance with internal and external regulations that protect your brand.  It also improves operational analytics that can improve SLAs and customer satisfaction.  Finally, it provides deep visibility into the utilization of AWS resources that can help improve efficiency and reduce cost.

The integration is simple: AWS CloudTrail deposits data in near-real time into your S3 account, and Sumo Logic collects it as soon as it is deposited, using an S3 Source.  Sumo Logic also provides a set of pre-built Dashboards and searches to analyze the CloudTrail data.
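
Sumo Logic’s S3 Source does the collection for you; purely to show the shape of the data, here is a minimal sketch that reads a batch of CloudTrail files straight from S3 with the boto3 SDK (the bucket name and key prefix are hypothetical):

```python
import gzip
import json

import boto3  # AWS SDK for Python

s3 = boto3.client("s3")
BUCKET = "my-cloudtrail-bucket"                                   # hypothetical
PREFIX = "AWSLogs/123456789012/CloudTrail/us-east-1/2013/11/13/"  # hypothetical

# CloudTrail delivers gzipped JSON files, each with a top-level "Records" array
resp = s3.list_objects_v2(Bucket=BUCKET, Prefix=PREFIX)
for obj in resp.get("Contents", []):
    body = s3.get_object(Bucket=BUCKET, Key=obj["Key"])["Body"].read()
    for event in json.loads(gzip.decompress(body))["Records"]:
        print(event["eventTime"], event["eventName"], event.get("sourceIPAddress"))
```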

To learn more, see the application details at http://www.sumologic.com/applications/aws-cloudtrail/ and read the documentation at https://support.sumologic.com/entries/30216746-Sumo-Logic-for-Amazon-CloudTrail-App.
