11.25.2013 | Posted by Vance Loiselle, CEO
The annual craze of getting up at 4am to either stand in line or shop online for the “best” holiday deals is upon us. I know first-hand, because my daughter and I have participated in this ritual for the last four years (I know – what can I say – I grew up in Maine). While we are at the stores fighting for product, many Americans will be either watching football or surfing the web from the comfort of their couch, looking for that too-good-to-be-true bargain. And with data indicating a 50% jump in Black Friday and Cyber Monday deals this year, it’s incumbent on companies to ensure that user experiences are positive. As a result, leading companies are realizing the need to obtain end-to-end visibility across their applications and infrastructure, from the origin to the edge. Insights from machine data (click-streams in the form of log data) generated by these environments help retailers of all stripes maximize these two critical days and the longer-term holiday shopping season.
What are the critical user and application issues that CIOs should be thinking about in the context of these incredibly important shopping days?
User Behavior Insights. From an e-commerce perspective, companies can use log data to obtain detailed insights into how their customers are interacting with the application: what pages they visit, how long they stay, and the latency of specific transactions. This helps companies, for example, correlate user behavior with the effectiveness of specific promotional strategies (coupons, etc.) and rapidly make adjustments before the holiday season ends.
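To make this concrete, here is a minimal sketch of the kind of Sumo Logic query involved – the source category, log format, and field names are illustrative assumptions, not a real deployment:
// hypothetical source category, log format, and field names
_sourceCategory=prod/web "GET /checkout"
| parse "promo=*&" as promo
| parse "response_time=* ms" as latency
| avg(latency) as avg_latency by promo
| sort by avg_latency
Sorted this way, the results show at a glance which promotions correlate with the slowest checkout transactions.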
The Elasticity of The Cloud. If you’re going to have a problem, better that it be one of “too much” rather than “too little”. Too frequently, we hear of retail web sites going down during this critical time. Why? The inability to handle peak demand – often because they don’t know what that demand will be. Companies need to understand how to provision for the surge in customer interest on these prime shopping days, a surge that in turn delivers an exponential increase in the volume of log data. The ability to provide the same level of performance at 2, 3, or even 10x usual volumes in a *cost-effective* fashion is a problem few companies have truly solved. The ability of cloud-based architectures to easily load-balance and provision for customer surges at any time is critical to maintaining that ideal shopping experience while still delivering the operational insights needed to support customer SLAs.
Machine Learning for Machine Data. It’s difficult enough for companies to identify the root cause of an issue that they know something about. Far more challenging for companies is getting insights into application issues that they know nothing about. However, modern machine learning techniques provide enterprises with a way to proactively uncover the symptoms, all buried within the logs, that lead to these issues. Moreover, machine learning eliminates the traditional requirement of users writing rules to identify anomalies, which by definition limit the ability to understand *all* the data. We also believe that the best analytics combine machine learning with human knowledge about the data sets – what we call Machine Data Intelligence – and that helps companies quickly and proactively root out operational issues that limit revenue generation opportunities.
Security and Compliance Analytics. With credit cards streaming across the internet in waves on this day, it’s imperative that you’ve already set up the environment needed to both secure your site from fraudulent behavior and ensure your brand and reputation remain intact. As I mentioned in a previous post, the notion of a perimeter has long since vanished, which means companies need to understand that user interactions might occur across a variety of devices on a global basis. The ability to proactively identify what is happening in real-time across your applications and the infrastructure on which they run is critical to your underlying security posture. All of this is made possible by your logs and the insights they contain.
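As one hedged example – the source category, log format, and threshold below are assumptions for illustration – a sketch of a query that flags sources hammering a login endpoint:
// hypothetical source category and log format
_sourceCategory=prod/auth "authentication failure"
| parse "src_ip=*," as src_ip
| count by src_ip
| where _count > 100
| sort by _count
Scheduled as an alert, a search like this turns fraud detection from a post-mortem exercise into a real-time one.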
Have a memorable shopping season and join me on Twitter – @vanceloiselle – to continue the conversation.
11.19.2013 | Posted by Vance Loiselle, CEO
There is little debate that the “Obamacare” rollout has been choppy at best. Regardless of which side of the political debate you fall on, many of us in technology, CIOs and software executives alike, can learn from this highly publicized initiative as we approach the November 30th deadline. Development of new applications, especially web applications, is no longer just a myopic march through Design, Develop, Test, and Rollout. The successful development and deployment of these applications demands a holistic, information-driven approach, which includes the following four key processes:
- Application Quality Analytics – the constant tracking of the errors, exceptions, and problems that are occurring in each new release of the code.
- Application Performance Analytics – the real-time measurement of the performance of the application as the users are experiencing it.
- Security Analytics – the real-time information required to analyze and conduct forensics on the security of the entire application and the infrastructure on which it runs.
- User Analytics – real-time insights on which users are in the application, what pages they are viewing, and the success they’ve had in conducting transactions in the application.
Application Quality Analytics – Is it really necessary, in the year 2013, that application development still be four parts art and one part science? I’m sure that the Secretary of Health and Human Services, Kathleen Sebelius, wished it were more science when she testified in front of Congress about why the site was not ready. She had no real-time information or metrics at her disposal about the number of defects being fixed each day, the number of errors being encountered by users, the severity of those errors, or the pace at which these errors and defects were being resolved.
These metrics are sitting there in the log files (the data exhaust applications and software components produce to record what the software is doing), and they remain largely untapped by most development organizations. Moreover, this data could be shared between multiple teams to pinpoint whether the root cause of a problem lies in the application itself or in the network and infrastructure on which it is running. It was so frustrating to see CGI (the contractor hired to build the application) and Verizon (the contractor hired to host the application in their “cloud”) passing the buck back and forth in front of Congress.
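For illustration only, here is a sketch of the kind of daily quality-trend query such logs could feed in Sumo Logic – the severity field and its format are assumptions, not Healthcare.gov specifics:
// hypothetical severity field and log format
(error or exception)
| parse "severity=*," as severity
| timeslice 1d
| count by _timeslice, severity
A chart of those counts per day is exactly the defect-trend information the Secretary lacked at her hearing.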
Application Performance Management – Much has been made about the performance of Healthcare.gov. The HHS secretary even had the gall to say that the site had not crashed, it was just “performing slowly”, while in the congressional hearing a live image on the screen informed users that the site was down. The site was down AND performing slowly because the site’s developers are stuck in a previous generation of thinking – that you can measure site performance without taking user analytics into account. It’s not good enough to measure application performance by periodically sampling transaction response times. Testers and managers need access to real-time information about each user, the session they were running, the performance at each step, and the outcomes (e.g. new plan sign-up created or failed, 5 insurance plans compared, pricing returned from 2 out of 3 carriers, etc.) along the way. Most monitoring tools look at just the application or just the network and infrastructure it runs on, and have little to no visibility into the outcomes the user is experiencing. Guess what? Most, if not all, of this information is sitting in the logs waiting to be harnessed.
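As a hedged sketch – the session, outcome, and elapsed-time fields are hypothetical log formats, not the real site’s – a query tying performance to outcomes might look like:
// hypothetical source category and field names
_sourceCategory=prod/app "plan signup"
| parse "session=*," as session_id
| parse "outcome=*," as outcome
| parse "elapsed=*ms" as elapsed
| avg(elapsed) as avg_ms, count by outcome
One result row per outcome, with volume and latency side by side, is the view that pure response-time sampling never gives you.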
Security Analytics – I appreciated when Secretary Sebelius was asked what steps had been taken to ensure the privacy and security of the data that fellow Americans had submitted on Healthcare.gov. The reality is that most IT organizations are sharply bifurcated between security and application development. The old-school view is that you put a web application firewall in place, you close down the ports, and your perimeter is safe. The reality today is that there is no perimeter. People have mobile phones and tablets and use third-party services to store their documents. Healthcare.gov itself is dependent on 3rd parties (insurance carriers) to provide and receive private information.
The most effective way today to ensure some level of security is to have a real-time security analytics and forensics solution in place. These solutions can scan every element of user and system activity, from – you guessed it – the log data, and determine everything from invalid logins and potential breaches to unauthorized changes to firewall rules and user permissions.
User Analytics – Ok, I get it, the Obama administration did not want to share information about how many people had signed up on Healthcare.gov for weeks. The argument was made that the accuracy of the data could not be trusted. It was either a political maneuver or incompetence, and either reason is sad in the year 2013. And why do the White House and HHS have the right to keep this information a secret? The American taxpayers are paying the hefty sum of $200M+ to get this application up and running. Shouldn’t we know, in real-time, the traction and success of the Affordable Care Act? It should be posted on the home page of the web site. I guarantee the information about every enrollee, every signup – successful or failed – every county from which they logged in, every plan that was browsed, every price that was displayed, every carrier that’s providing quotes was AVAILABLE IN THE LOG DATA.
There has been a lot of coverage about President Kennedy recently and we are reminded that he challenged our government to put people on the moon in the 1960s, and they did – with a very limited set of computer and software tools at their disposal. I would ask the CIOs and software types out there, let’s learn from the Healthcare.gov rollout, and embrace a modern, information-driven approach to developing and rolling out applications. And President Obama, if you need some help with this, give me a shout – I’m ready to serve.
11.13.2013 | Posted by Bruno Kurtic, Founding Vice President of Product and Strategy
Cloud is opaque
One of the biggest adoption barriers for SaaS, PaaS, and IaaS is the opaqueness of, and lack of visibility into, the changes and activities that affect cloud infrastructure. Running on-premise infrastructure, you have the ability to audit activity; for example, you can easily tell who is starting and stopping VMs in virtualization clusters, see who is creating and deleting users, and watch who is making firewall configuration changes. In the cloud, that visibility has been missing, and it has remained one of the main roadblocks to adoption even though the benefits have been compelling enough for many enterprises to adopt the cloud anyway.
This information is critical to securing infrastructure, applications, and data. It’s critical to proving and maintaining compliance, critical to understanding utilization and cost, and finally, it’s critical for maintaining excellence in operations.
Not all Clouds are opaque any longer
Today, the world’s biggest cloud provider, Amazon Web Services (AWS), announced a new product that, in combination with Sumo Logic, changes the game for cloud infrastructure audit visibility. AWS CloudTrail is the raw log data feed that tells you exactly who is doing what, on which sets of infrastructure, at what time, from which IP addresses, and more. Sumo Logic is integrated with AWS CloudTrail: we collect this audit data in real-time and enable SOC- and NOC-style visibility and analytics into activities such as:
- Network ACL changes.
- Creation and deletion of network interfaces.
- Authorized ingress/egress across network segments and ports.
- Changes to privileges, passwords, and user profiles.
- Deletion and creation of security groups.
- Starting and terminating instances.
- And much more.
Sumo Logic Application for AWS CloudTrail
Cloud data comes to life with our Sumo Logic Application for AWS CloudTrail, helping our customers across security and compliance, operational visibility, and cost containment. The Sumo Logic Application for AWS CloudTrail delivers:
- Seamless integration with the AWS CloudTrail data feed.
- SOC-style, real-time Dashboards to monitor access and activity.
- Forensic analysis to understand the “who, what, when, where, and how” of events and logs.
- Alerts when important activities and events occur.
- Correlation of AWS CloudTrail data with other security data sets, such as intrusion detection system data, operating system events, application data, and more.
This integration delivers improved security posture and better compliance with internal and external regulations that protect your brand. It also improves operational analytics that can improve SLAs and customer satisfaction. Finally, it provides deep visibility into the utilization of AWS resources that can help improve efficiency and reduce cost.
The integration is simple: AWS CloudTrail deposits data in near-real time into your S3 bucket, and Sumo Logic collects it as soon as it is deposited, using an S3 Source. Sumo Logic also provides a set of pre-built Dashboards and searches to analyze the CloudTrail data.
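As an illustrative sketch of what you can then ask of that data – the source category is whatever you assign to the S3 Source, and the anchor-based field extraction below is an assumption, since CloudTrail events arrive as JSON – here is a query that surfaces who is terminating instances and from where:
// hypothetical source category; anchor parsing on raw JSON text
_sourceCategory=aws/cloudtrail "TerminateInstances"
| parse "\"userName\":\"*\"" as user_name
| parse "\"sourceIPAddress\":\"*\"" as src_ip
| count by user_name, src_ip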
To learn more, see the application page: http://www.sumologic.com/applications/aws-cloudtrail/ and read the documentation: https://support.
10.29.2013 | Posted by Brandon Mensing
The powerful analytics capabilities of the Sumo Logic platform have always provided the greatest insights into your machine data. Recently we added a new operator – Join – bringing the essence of a SQL JOIN to your stream of unstructured data and giving you even more flexibility.
In a standard relational join, the datasets in the tables to be joined are fixed at query time. However, matching up IDs between log messages from different days within your search timeframe likely produces the wrong result because actions performed yesterday should not be associated with a login event that occurred today. For this reason, our Join operator provides for a specified moving timeframe within which to join log messages. In the diagram below, the pink and orange represent two streams of disparate log messages. They both contain a key/value pair that we want to match on and the messages are only joined on that key/value when they both occur within the time window indicated by the black box.
Now let’s put this to use. Suppose an application has both real and machine-controlled users. I’m interested in knowing which users are which, so that I can keep an eye out for any machine-controlled users that are impacting performance. That means I need a way to differentiate between real and machine-controlled users. As it turns out, the human users create requests at a reasonably low rate, while the machine-controlled users (accessing via an API) can generate several requests per second, always immediately after the login event.
In these logs, there are several different messages coming in with varying purposes and values. Using Join, I can query for both the logins and requests and then restrict the time window of the matching logic to combine the two message streams. The two subqueries in my search will look for request/query events and login events respectively. I’ve restricted the match window to just 15 seconds so that I’m finding the volume of requests that occur very close to the login event. Then I’m filtering out users who made fewer than 10 requests in that 15-second time frame following a login. The result is a clear view of the users that are actively issuing a large volume of requests via the API immediately upon logging in. Here is my example query:
(login or (creating query))
| join
(parse "Creating query: '*', auth=User:*:" as query, user) as query,
(parse "Login success for: '*'" as user) as login
on query.user = login.user timewindow 15s
| count by query_user
| where _count > 10
| sort by _count
As you can see, the subqueries are written with the same syntax as regular searches and even support the use of aggregates (count, sum, average, etc.), so that you can join complex results together and achieve the insights you need. And of course, we support joining more than just two streams of logs – combining all your favorite data into one query!
10.15.2013 | Posted by Mark Musselman
This is my first blog post for Sumo Logic. It took 18 months, but I was always a late bloomer, and we have some Hemingway-class bloggers on staff anyway. No doubt I was also shy because my music production partner in MOMU, JD Moyer, is now a prolific blogger with an immense following.
Nonetheless, when I was asked to write about the experience at last week’s Akamai Edge – the worldwide customer and partner conclave – due to my unique position of having worked at Akamai from 2002 to 2005, I jumped at the opportunity. The day I started at Akamai the stock was either at 52 cents or 56 cents – I don’t recall exactly. The day I left it was at $56 – I do remember that. In those three years, I was able to bring onboard and expand Akamai’s presence at companies like eBay, The Gap, RingCentral, Netflix, Walmart.com and E*Trade, all of whom bought into the business value that Akamai delivered. This culminated in being named the top Major Account Executive for the Americas in 2004 – definitely a personal “pinnacle” achievement….
Akamai is an amazing company for way too many reasons to list, but the people and the culture top the list. In fact, when I think about the best places I have worked, from Ritz-Carlton to BladeLogic, the common thread among these favorite employers of mine was and is the people. Smart, aggressive, coachable, creative, daring, fearless and fun people, with amazing founders.
I want to key in on the similarity that I see between Akamai and Sumo Logic. Akamai is the first Cloud Company. REALLY Cloud. My goal when I arrived at Sumo Logic last year was to help build a culture that wove in the best of two great worlds – BladeLogic and Akamai – with a maniacal focus on the Customer Experience. At all of these places a common theme was the “DNA” of the staff. There is magnificent art in taking a cutting-edge, disruptive product and meshing it with an intense sense of urgency and thoughtful execution. Having the opportunity to help build this from scratch at Sumo Logic was too good to pass up. I fell in love with Christian and Kumar’s vision and the innovation around the technology.
There are many more similarities than just the clarity of vision and the incredible focus on execution. The inimitable George Conrades once told a prospect of ours in a meeting how many lines of code Akamai had written – in 2003 – and it was a massive number. We are both software companies at the core. We both rely heavily on algorithms to create customer value and massive differentiation. We both go to market with a recurring revenue model. We both allow for instant elasticity and on-demand usage. We both are totally focused on a great product that helps our customers fight the demands of the digital world with the best tools available. Last but not least, we are both entranced by The Algorithm…
Back to Akamai Edge. It is incredible to see how much of the online world continues to run and thrive through Akamai. 2.2 billion log lines every 60 seconds. Yes, you read that right. Staggering scale. The session on the Dominant Design principle blew me away. With the new announcement of Akamai opening up its platform to developers and partners, Akamai is even more Open. Sumo Logic is thrilled to become a charter Member of the Open Platform Initiative – we already have many joint customers salivating to send the Akamai logs directly to Sumo Logic, where they can “join” them with the rest of their infrastructure logs – all for real-time insights across their entire infrastructure. The beta customers are all happy that we have come so far so quickly together. This is an alliance with legs AND brains.
George, Paul, Tom, Bob, Brad, Doug, John, Tim, Mark, Gary, Jennie, Rick, Kevin, Kris, Alyson, Brian, Mike, Andy, Dave, Ed (and so many more)….it was great to see you and it is GREAT to be working with you again. The new hires I met seem to have the DNA you need to get to the next Scaling Point.
Akamai and Sumo Logic: Faster Forward Together, Moving at the speed of Cloud.
Now stop reading my rant and go sell something, will ya, and check out our new Sumo Logic Application for Akamai.
10.09.2013 | Posted by Bruno Kurtic, Founding Vice President of Product and Strategy
I’m very pleased to announce our strategic alliance with Akamai. Our integrated solution delivers a unified view of application availability, performance, security, and business analytics based on application log data. Customers who rely on Akamai’s globally distributed infrastructure can now get a real-time feed of all logs generated by Akamai’s infrastructure into their Sumo Logic account, in order to integrate and cross-analyze them with their internally generated application data sets!
What problems does the integrated solution solve?
To date, there have been two machine data sets generated by applications that leverage Akamai:
1. Application logs at the origin data centers, which application owners can usually access.
2. Logs generated by Akamai as an application is distributed globally. Application owners typically have zero or limited access to these logs.
Both of these data sets provide important metrics and insights for delivering highly available, secure applications that also provide a detailed view of business results. Until today there was no way to get these data sets into a single tool for real-time analysis, causing the following issues:
- No single view of performance. Origin performance could be monitored, but that provided little confidence that the app was performant for end users.
- Difficult to understand user interaction. Without data on how real users interacted with an application, it was difficult to gauge what content was served, how the app performed for those users, and whether performance had any impact on conversions.
- Issues impacting customer experience remained hidden. The root cause of end-user issues caused at the origin remained hidden, impacting customer experience for long periods of time.
- Web App Firewall (WAF) security information not readily available. Security teams were not able to detect and respond to attacks in real-time and take defensive actions to minimize exposure.
Akamai Cloud Monitor and Sumo Logic provide an integrated approach to solving these problems. Sumo Logic has developed an application specifically crafted for customers to extract insights from their Akamai data, which is sent to Sumo Logic in real time. The solution has been deployed by joint customers (at terabyte scale) to address the following use cases:
- Real-time analytics about user behavior. Combine Akamai real-user monitoring data and internal data sets to gain granular insights into user behavior. For example, learn how users behave across different device types and geographies, or even how Akamai quality of service impacts user behavior and business results.
- Security information management and forensics. Security incidents and attacks on an application can be investigated by deep-diving into sessions, IP addresses, and individual URLs that attackers are attempting to exploit and breach.
- Application performance management from edge to origin. Quickly determine whether an application’s performance issue is caused by your origin or by Akamai’s infrastructure, and which regions, user agents, or devices are impacted.
- Application release and quality management. Receive an alert as soon as Akamai detects that one or more origins have an elevated number of 4xx or 5xx errors that may be caused by a new code push, a configuration change, or another issue within your origin application infrastructure.
- Impact of quality of service and operational excellence. Correlate how quality of service impacts conversions or other business metrics to optimize performance and drive better results.
I could go on, but I’m sure you have plenty of ideas of your own.
Join us for a free trial here – as always, there is nothing to install, nothing to manage, nothing to run – we do it all for you. You can also read our announcement here or read more about the Sumo Logic application for Akamai here. Take a look at the Akamai press release here.
10.02.2013 | Posted by Christian Beedgen, Co-Founder & CTO
Yes, we are cloud and proud. Puppies, ponies, rainbows, unicorns. We got them all. But the cloud is not a personal choice for us at Sumo Logic. It is an imperative. An imperative to build a better product, for happier customers.
We strongly believe that if a product is designed correctly, there is no need to fragment it into many different pieces, each with different functional and performance characteristics that confuse decision-makers. We have built the Sumo Logic platform from the very beginning with a mindset of scalability. Sumo Logic is a service that is designed to appeal and adapt to many use cases. This explains why, in just three short years, we have been successful in a variety of enterprise accounts across three continents: first and foremost, our product scales.
On the surface, scale is all about the big numbers. We got Big Data, thank you. So do our customers, and we scale to the level required by enterprise customers. Yet scaling doesn’t just mean scaling up to ever-larger data sets. Scaling also means being able to scale back, to get out of the way, and to provide value to everyone, including those customers that might not have terabytes of data to deal with. Our Sumo Free offering has proven that our approach to scaling is holistic – one product for everyone. No hard decisions to be made now, and no hard decisions to be made later. Just do it and get value.
Another compelling advantage of our multi-tenant, one-service approach is that we can very finely adjust to the amount of data and processing required by every customer, all the time. Elasticity is key, because it enables agility. Agile is the way of business today. Why would anyone want to get tied into a fixed-price license, and on top of that permanently provision large amounts of compute and storage resources upfront, just to buy insurance for those days of the year when business spikes, or, God forbid, a black swan walks into the lobby? Sumo Logic is the cure for anti-agility in the machine data analytics space. As a customer, you get all the power you need, when you need it, without having to pay for it when you don’t.
Finally, Sumo Logic scales insight. With our recently announced anomaly detection capability, you can now rely on the army of squirrels housed in our infrastructure to generate and vet millions of hypotheses about potential problems on your behalf. Only the most highly correlated anomalies survive this rigorous process, meaning you get actionable insight into potential infrastructure issues for free. You will notice repetitive events and be able to annotate them precisely and improve your operational processes. Even better – you will be able to share documented anomalous events with and consume them back from the Sumo Logic community. What scales to six billion humans? Sumo Logic does.
One more thing: as a cloud-native company, we have also scaled the product development process, to release more features, more improvements, and yes, more bug fixes than any incumbent vendor. Sumo Logic runs at the time of now, and new stuff rolls out on a weekly basis. Tired of waiting for a year to get issues addressed? Tired of then having to provision an IT project to just update the monitoring infrastructure? Scared of how that same issue will apply even if the vendor “hosts” the software for you? We can help.
Sumo Logic scales, along all dimensions. You like scale? Come on over.
Oh, and thanks for the date, Praveen. I’ll let you take the check.
09.10.2013 | Posted by Bruno Kurtic, Founding Vice President of Product and Strategy
What is “anomaly detection”?
Here is how the peeps on the interweb and Wikipedia define it: Anomaly detection (also known as outlier detection) is the search for events which do not conform to an expected pattern. The detected patterns are called anomalies and often translate to critical and actionable insights that, depending on the application domain, are referred to as outliers, changes, deviations, surprises, intrusions, etc.
The domain: Machine Data
Machine data (most frequently referred to as log data) is generated by applications, servers, infrastructure, mobile devices, web servers, and more. It is the data generated by machines in order to communicate to humans or other machines exactly what they are doing (e.g. activity), what the status of that activity is (e.g. errors, security issues, performance), and results of their activity (e.g. business metrics).
The problem of unknown unknowns
Most problems with analyzing machine data orbit around the fact that existing operational analytics technologies enable users to find only those things they know to look for. I repeat, only things they KNOW they need to look for. Nothing in these technologies helps users proactively discover events they don’t anticipate getting, events that have not occurred before, events that may have occurred before but are not understood, or complex events that are not easy or even possible to encode into queries and searches.
Our infrastructure and applications are desperately, and constantly, trying to tell us what’s going on through the massive real-time stream of data they relentlessly throw our way. And instead of listening, we ask a limited set of questions from some playbook. This is as effective as a patient seeking advice about massive chest pain from a doctor who, instead of listening, runs through a checklist containing skin rash, fever, and runny nose, and then sends the patient home with a clean bill of health.
This is not a good place to be; these previously unknown events hurt us by repeatedly causing downtime, performance degradations, poor user experience, security breaches, compliance violations, and more. Existing monitoring tools would be sufficient if we lived in static, three-system environments where we could enumerate all possible failure conditions and attack vectors. But we don’t.
We operate in environments with thousands of sources across servers, networks, and applications, and the amount of data they generate is growing exponentially. They come from a variety of vendors, run a variety of versions, are geographically distributed, and, on top of that, are constantly updated, upgraded, and replaced. How can we then rely on hard-coded rules, queries, and known-condition tools to ensure our applications and infrastructure are healthy and secure? We can’t – it is a fairy tale.
We believe that three major things are required in order to solve the problem of unknown unknowns at a multi-terabyte scale:
- Cloud: enables elastic compute at the massive scale needed to analyze this volume of data in real-time across all vectors.
- Big Data technologies: enable a holistic approach to analyzing all data without being bound to schemas, volumes, or batch analytics.
- Machine learning engine: advanced algorithms that analyze and learn from data, as well as from humans, in order to get smarter over time.
Sumo Logic Real-Time Anomaly Detection
Today we have announced Beta access to our Anomaly Detection engine, an engine that uses thousands of machines in the cloud and continuously and in real-time analyzes ALL of your data to proactively detect important changes and events in your infrastructure. It does this without requiring users to configure or tune the engine, to write queries or rules, to set thresholds, or to write and apply data parsers. As it detects changes and events, it bubbles them up to the users for investigation, to add knowledge, classify events, and to apply relevance and severity. It is in fact this combination of a powerful machine learning algorithm and human expert knowledge that is the real power of our Anomaly Detection engine.
So, in essence, Sumo Logic Anomaly Detection continuously turns unknown events into known events. And that’s what we want: to make events known, because we know how to handle and what to do with known events. We can alert on them, we can create playbooks and remediation steps, we can prevent them, we can anticipate their impact, and, at least in some cases, we can make them someone else’s problem.
Sumo Logic Anomaly Detection has been more than three years in the making. During that time, it has had the energy of the whole company and our backers behind it. Sumo Logic was founded with the belief that this capability is transformational in the face of exponential data growth and infrastructure sprawl. We developed an architecture and adopted a business model that enable us to implement an analytics engine that can solve the most complex problems of the Big Data decade.
06.12.2013 | Posted by Jacek Migdal, Software Developer (Sumo Logic Poland)
As human beings, we share quite a few life events that we keep track of, like birthdays, holidays, anniversaries, and so on. These are structured events that occur on exact dates or during specific times of year.
But how do you keep track of the unique, unexpected events that can be life-changing? The first meeting with someone, an inspiring conversation that sparked a realization—events that may seem common to many, but are so special to you.
Computer systems present the same dilemma. Some events are expected, like adding a new user. Other events look routine, but from time to time they carry crucial, unexpected information. Unfortunately, we most often realize how important those pivotal events were only after we experience a malfunction.
That’s where logs come in.
Virtually every computer program has some append-only structure for logs. Usually, it is as simple as a text file with a new line for each event. Sometimes the messages are saved to a database if the information may be used later. Why does it work that way? Well, it’s very easy to use and implement–usually it’s just one line of code. Don’t let the simplicity fool you. Logs provide a very powerful way of understanding and debugging systems. In many cases, logs are the sole method of figuring out the reason why something has happened.
From time to time, I’ll read about a new log management tool that converts log data into some standardized format. Well, there is limited value in that approach. Extracting data from logs is useful and could answer many business and operational questions. This works well with things that we expect, and things that answer numerical questions, like determining how many users have signed up in a given period of time.
However, during the process of converting logs to a standardized format, valuable data could be lost. For example, it’s interesting that many users couldn’t log in to your service, but the crucial information is why it happened. The unexpected part is usually very important and often even more valuable.
So do logs have a schema? Well, for the expected things, sure. But for analyzing the unexpected events it’s hard to think of a schema at all, beyond perhaps some partial structure.
That’s why at Sumo Logic, we accept any kind of log you throw at us. During log collection we just need to understand the events (e.g. separate lines) and the timestamp format. Everything else can be derived when you run a query.
Our query language lets you find or extract structure, and data can be visualized and/or exported. Sumo Logic’s key advantage is how we handle the unexpected with machine learning algorithms. Our patent-pending LogReduce groups similar events on the fly to find anomalies, enabling our customers to review large sets of events quickly to identify the root cause of unexpected things.
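To illustrate deriving structure at query time, here is a sketch with invented log text and field names – nothing about this schema is declared in advance:
// hypothetical log text and field names
"Failed to connect to database"
| parse "db_host=*, retries=*," as db_host, retries
| count by db_host
| sort by _count
The structure exists only for the lifetime of the query; the raw events remain untouched, ready for the next, different question.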
No one ever intends to create bugs, but with the complexity and fast pace of software development they are inevitable. Well-designed systems should be debuggable. Log management tools, such as Sumo Logic, are here to help you deal with the logs that are a huge part of today’s technology.
“Only those days are important that are still unknown to us;
only those few moments matter, the ones for which we still wait.”
(lyrics from a famous Polish song by Marek Grechuta)
05.29.2013 | Posted by Amanda Saso, Sr. Tech Writer
Have you ever put your cell phone through the wash? Personally, I’ve done it. Twice. What did I learn, finally? To always double-check where I put my iPhone before I turn on the washing machine. It’s a very real and painful threat that I’ve learned to proactively manage by using a process with a low rate of failure. But, from time to time, other foreign objects slip through, like a lipstick, my kid’s crayon, a blob of Silly Putty – things that are cheaper than an iPhone yet create havoc in the dryer. Clothes are stained, the dryer drum is a mess, and my schedule is thrown completely off while I try to remember my grandmother’s instructions for removing red lipstick from a white shirt.
What do low-tech laundry woes have to do with Sumo Logic’s big data solution? Well, I see LogReduce as a tool that helps fortify your organization against known problems (for which you have processes in place) while guarding against unknown threats that may cause huge headaches and massive clean-ups.
When you think about it, a small but messy threat that you don’t know you need to look for is a nightmare. These days we’re dealing with an unbelievable quantity of machine data that may not be human-readable, meaning that a proverbial Chap Stick in the pocket could be lurking right below your nose. LogReduce takes the “noise” out of that data so you can see those hidden threats, problems, or issues that could otherwise take a lot of time to resolve.
Say you’re running a generic search across a broad area of your deployment – billing errors, user creations, or logins. Whatever the search may be, it returns thousands and thousands of pages of results. You could spend your work day slogging through messages, hoping to find the real problem, or you can simply click LogReduce. Those results are logically sorted into signatures – groups of messages that contain similar or relevant information. Then, you can teach Sumo Logic which messages are more important, and which data you just don’t need to see again. That translates into unknown problems averted.
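In query form – the search scope here is an illustrative assumption, and logreduce is the operator name in current Sumo Logic syntax – that one click corresponds to something like:
// hypothetical search scope
_sourceCategory=prod/billing error
| logreduce
The output is the list of signatures with counts, so the one-off oddball messages stand out from the thousands of routine ones.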
Of course your team has processes in place to prevent certain events. How do you guard against the unknown? LogReduce can help you catch a blip before it turns into a rogue wave. Oh, and if you ever put Silly Putty through the washer and dryer, a good dose of Goo Gone will do the trick.