03.19.2014 | Posted by Ariel Smoliar, Senior Product Manager
Last July we launched our Applications webpage, and we have been constantly adding new applications to this list. This week we are excited to announce a major step in delivering a better application user experience. We have now integrated the Sumo Logic Application Library directly into the core service and have made it available to both trial users and paying customers.
The initial Library rollout includes the following applications: Active Directory, Apache, AWS CloudTrail, Cisco, IIS, Linux, Log Analysis QuickStart, Nginx, Palo Alto Networks, VMware and Windows. We updated the Sumo Logic user interface with a new “Apps” tab. You can install applications from the menu for a true self-service experience, without downloading any files. The dashboards for the applications you choose will be visible after following a few simple steps.
Over the coming weeks, we will add the remainder of the Sumo Logic Applications to the Library, including ones for Akamai Cloud Monitor, AWS Elastic Load Balancing, Snort, OSSEC, and more. Until then, we will manually load these applications for our customers.
What’s Next and Feedback
This is just the first phase in the rollout of our Application Library. We will continue to deliver more applications that provide critical insights into your operational and security use cases. In addition, we will continue to enhance the Library itself as a system to share relevant insights across your organization.
We are eager to hear your feedback on this initial phase. Please fill out this form if you would like to meet with us and share your experience using the Sumo Application Library.
03.06.2014 | Posted by Ariel Smoliar, Senior Product Manager
After the successful launch of the Sumo Logic Application for AWS CloudTrail last November, and with numerous customers now using that application, we were really excited to work again on a new logging service from AWS, this time providing analytics for the log files generated by AWS Elastic Load Balancing.
Our integration with AWS CloudTrail targets use cases relevant to security, usage, and operations. With our new application for AWS Elastic Load Balancing, we provide our customers with dashboards that deliver real-time insights into operational data. You can also build additional use cases to suit your requirements by parsing the log entries and visualizing the data with our visualization tools.
Insights from ELB Log Data
Sumo Logic runs natively on the AWS infrastructure and uses AWS load balancers, so we had plenty of raw data to work with while developing the content. You will find 12 fields in the ELB logs covering the entire request/response lifecycle. By adding the request, backend, and response processing times, we can highlight the total latency from when the load balancer starts reading the request headers to when it starts sending the response headers to the client. The Latency Analysis dashboard presents a granular analysis per domain, client IP, and backend (EC2) instance.
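The app does this parsing for you inside Sumo Logic, but as a rough illustration of what the latency math looks like, here is a minimal Python sketch against a classic ELB access log entry (the sample line and its values are illustrative, not real traffic):

```python
import shlex

# One classic ELB access log entry: 12 space-delimited fields, with the
# request quoted so it stays a single field. Sample values are made up.
line = ('2014-03-06T12:00:00.123456Z my-loadbalancer 192.168.131.39:2817 '
        '10.0.0.1:80 0.000073 0.001048 0.000057 200 200 0 29 '
        '"GET http://www.example.com:80/ HTTP/1.1"')

fields = shlex.split(line)  # shlex keeps the quoted request together
(timestamp, elb, client, backend,
 request_time, backend_time, response_time,
 elb_status, backend_status,
 received_bytes, sent_bytes, request) = fields

# Total latency = request + backend + response processing times
latency = float(request_time) + float(backend_time) + float(response_time)
print(round(latency, 6))  # 0.001178
```

The three processing-time fields cover the whole lifecycle, so their sum is the end-to-end latency the dashboard breaks down per domain, client IP, and backend instance.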
The application also provides analysis of the status codes reported by both the ELB and the backend instances. Note that the total counts for the two will usually match, diverging only when something goes wrong, such as the backend failing to respond or the client rejecting the request. Additionally, for ELBs configured with a TCP (layer 4) listener rather than HTTP, the TCP requests are still logged; in this case the request field appears as three dashes and there are no values for the HTTP status codes.
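As a hypothetical sketch (not code from the app), the classification logic described above could be expressed like this; the function name and return labels are my own:

```python
def classify(elb_status, backend_status, request):
    """Classify one ELB log entry by its status-code fields (illustrative)."""
    # A TCP (layer-4) listener logs the request as three dashes,
    # with no HTTP status codes to compare.
    if request == "- - -":
        return "tcp"
    # A dash in the backend status means the backend never responded,
    # e.g., it timed out or refused the connection.
    if backend_status == "-":
        return "no-backend-response"
    # Normally the ELB and instance codes match; a mismatch is worth a look.
    if elb_status != backend_status:
        return "mismatch"
    return "ok"

print(classify("200", "200", "GET / HTTP/1.1"))  # ok
print(classify("504", "-", "GET / HTTP/1.1"))    # no-backend-response
print(classify("-", "-", "- - -"))               # tcp
```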
The topic of scheduled searches and alerting often comes up in my discussions with Sumo Logic users. Based on our work with ELB logs, there is no single threshold we can recommend that covers every use case. The threshold should be based on the application; for example, tiny beacon requests and downloads of huge files produce very different latencies. Sumo Logic gives you the flexibility to set a threshold in a scheduled search, or simply to change a graph's color based on the value range for monitoring purposes.
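One common way to pick an application-specific threshold, sketched here in Python with made-up sample latencies, is to derive it from your own observed distribution (e.g., a high percentile) rather than a fixed number:

```python
def percentile(values, pct):
    """Nearest-rank percentile of a list of numbers (illustrative helper)."""
    ordered = sorted(values)
    idx = max(0, int(round(pct / 100.0 * len(ordered))) - 1)
    return ordered[idx]

# Hypothetical sample of total latencies (seconds) for one application
latencies = [0.012, 0.015, 0.011, 0.250, 0.014, 0.013, 0.016, 0.900, 0.012, 0.015]

# Alert only on the slowest ~10% of requests for *this* workload
threshold = percentile(latencies, 90)
alerts = [v for v in latencies if v > threshold]
print(threshold, alerts)  # 0.25 [0.9]
```

The point is the workflow, not the numbers: a beacon endpoint and a large-file download service would each compute a very different threshold from their own data.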
I want to talk a little bit about machine data visualization. While skiing last week in Steamboat, Colorado, I kept thinking about the connection between the beautiful Rocky Mountain landscape and the somewhat more mundane world of load balancer data visualization. So here is what we did to present the load balancer data in a more compelling way:
You can slice and dice the data using our Transpose operator, as we did in the Latency by Load Balancer monitor, but I would like to focus on a different feature built by our UI team and share how we used it in this application. It combines the number of requests, the total request size, and the client IP address into a single view: the Total Requests and Data Volume monitor.
We first used this visualization approach in our Nginx app (Traffic Volume and Bytes Served monitor). We received very positive feedback and decided it made sense to incorporate this approach into this application as well.
Combining three fields in a single view gives you a faster overview of your environment and also lets you drill down to investigate any activity.
It reminds one of the landscape above, right?
To get this same visualization, click on the gear icon in the Search screen and choose the Change Series option.
For each data series, you can choose how you would like to represent the data. We used a Column Chart for the total requests and a Line Chart for the received and sent data.
I find it beautiful and useful. I hope you plan to use this visualization approach in your dashboards, and please let us know if any help is required.
One more thing…
Please stay tuned and check our posts next week… we can’t wait to share with you where we’re going next in the world of Sumo Logic Applications.
03.05.2014 | Posted by David Andrzejewski, Data Sciences Engineer
A few weeks ago I had the pleasure of hosting the machine data track of talks at Strata Santa Clara. Like “big data”, the phrase “machine data” is associated with multiple (sometimes conflicting) definitions; two prominent ones come from Curt Monash and Daniel Abadi. The focus of the machine data track is on data which is generated and/or collected automatically by machines. This includes software logs and sensor measurements from systems as varied as mobile phones, airplane engines, and data centers. The concept is closely related to the “internet of things”, which refers to the trend of increasing connectivity and instrumentation in existing devices, like home thermostats.
More data, more problems
This data can be useful for the early detection of operational problems or the discovery of opportunities for improved efficiency. However, the decoupling of data generation and collection from human action means that the volume of machine data can grow at machine scales (i.e., Moore’s Law), an issue raised by both Monash and Abadi. This explosive growth rate amplifies existing challenges associated with “big data.” In particular, two common motifs among the talks at Strata were the difficulties around:
- mechanics: the technical details of data collection, storage, and analysis
- semantics: extracting understandable and actionable information from the data deluge
The talks covered applications involving machine data from both physical systems (e.g., cars) and computer systems, and highlighted the growing fuzziness of the distinction between the two categories.
Steven Gustafson and Parag Goradia of GE discussed the “industrial internet” of sensors monitoring heavy equipment such as airplane engines or manufacturing machinery. One anecdotal data point was that a single gas turbine sensor can generate 500 GB of data per day. Because of the physical scale of these applications, using data to drive even small efficiency improvements can have enormous impacts (e.g., in amounts of jet fuel saved).
Moving from energy generation to distribution, Brett Sargent of LumaSense Technologies presented a startling perspective on the state of the power grid in the United States, stating that the average age of an electrical distribution substation in the United States is over 50 years, while its intended lifetime was only 40 years. His talk discussed remote sensing and data analysis for monitoring and troubleshooting this critical infrastructure.
Ian Huston, Alexander Kagoshima, and Noelle Sio from Pivotal presented analyses of traffic data. The talk revealed both common-sense (traffic moves more slowly during rush hour) and counterintuitive (disruptions in London tended to resolve more quickly when it was raining) findings.
My presentation showed how we apply machine learning at Sumo Logic to help users navigate machine log data (e.g., software logs). The talk emphasized the effectiveness of combining human guidance with machine learning algorithms.
Krishna Raj Raja and Balaji Parimi of CloudPhysics discussed how machine data can be applied to problems in data center management. One very interesting idea was to use data and modeling to predict how different configuration changes would affect data center performance.
The amount of data available for analysis is exploding, and we are still in the very early days of discovering how to best make use of it. It was great to hear about different application domains and novel techniques, and to discuss strategies and design patterns for getting the most out of data.