05.31.2012 | Posted by Rishi Divate, Technical Account Manager
In a previous post in this series, I wrote about how you can upload your log data to the Sumo Logic Service. In this second and final post in the series, I am going to discuss how you can extract important pieces of information and gain insight from your log messages. For additional technical details on any of the sections below, please see the online help.
Step 2 – Extract
After some initial exploration of your logs, you can now extract fields that represent important information for further analysis.
For simplicity, let’s assume your web application writes log messages in a readable, parsable format much like the Apache error log:
[<Timestamp>] [level: <log level>] [client: <IP address>] <reason>
When defining the Source, you can configure the timestamp in log messages to be parsed out automatically or you can ignore all timestamps in a message, in which case the time of receipt by the Sumo Logic Service will be used. In either case, you do not need to explicitly extract timestamps yourself.
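Sumo Logic handles timestamp parsing for you, but to illustrate what automatic parsing of an Apache-style timestamp like the one above involves, here is a minimal Python sketch (the sample timestamp value is hypothetical):

```python
from datetime import datetime

# Hypothetical timestamp in the Apache-error-log style shown above
raw = "Thu May 31 09:15:02 2012"

# Parse weekday, month, day, time, and year into a datetime object
ts = datetime.strptime(raw, "%a %b %d %H:%M:%S %Y")
print(ts.isoformat())  # 2012-05-31T09:15:02
```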
For the other pieces of information in your log message such as log level, client IP address and reason, you can write specific search terms to extract their values. At this point, you should refer to the Getting Started with Search section in the online help to understand the Sumo Logic search syntax.
To parse out the log level using the parse operator, you can write the following search term in the Sumo Logic web application:
_sourceCategory=Application/Petstore | parse "[level: *]" as log_level
When you run this search, in addition to seeing the actual log message, you will also see a new field called “log_level” that will display the log level for every message as shown below.
To parse all three fields you could use the following search term:
_sourceCategory=Application/Petstore | parse "[level: *] [client: *] *" as log_level, client, reason
With this search term, you will get three new fields as shown below:
At this point, you may want to select only those messages that are critical or that identify errors. In that case, you can write a search term using the where operator like this:
_sourceCategory=Application/Petstore | parse "[level: *]" as log_level
| where log_level in ("crit", "error")
This search will return only those messages with the “crit” and “error” log levels as shown below:
The search syntax also allows for more complex parse expressions, such as regular expressions. See the Searching Your Data section in the online help for further details. Sumo Logic also provides parser libraries for common log types, such as the Apache web server; see the Automatically Extracting Fields with Parsers section in the online help for how to use them.
At some point, you may want to save the search you just wrote, since you will most likely use it repeatedly to find certain kinds of messages. You may also want to schedule it to run periodically and email you the results only if the number of matching messages exceeds a certain threshold within a given time interval. For example, you may want a scheduled search to alert you only if there are more than 20 critical or error messages in an hour. See the Saving a Search Query and Scheduling a Search sections in the online help for instructions on how to do this.
Step 3 – Analyze
So now that you know how to extract information from log messages and retrieve only the messages you want, the next step is to gain more insight into what your logs are telling you. For example, to find the top 10 client IP addresses generating error messages, you could use the following search term:
_sourceCategory=Application/Petstore | parse "[level: *] [client: *] *" as log_level, client, reason | where log_level="error" | count by client | top 10 client by _count
Given these results, you may want to investigate why some of these IP addresses are causing so many errors and what can be done to prevent them from happening again.
Now, you may want to use the summarize operator to get an overview of all your log messages. To do this, you could use the following search term:
_sourceCategory=Application/Petstore | summarize
As you can see, messages with similar keywords and structures are grouped into clusters. In the first cluster, the URLs in 230 log messages were automatically reduced to $URL so that you can focus on what the actual message says.
You can drill down to the individual log messages simply by selecting any of the clusters and clicking the “View Details” button. For instance, in this case you may feel that the number of “Access to $URL failed” messages is unusually high and warrants further investigation.
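The clustering performed by the summarize operator is Sumo Logic's own; a greatly simplified analogy is to normalize the variable parts of each message (such as URLs) into placeholders and then group identical signatures. A hypothetical Python sketch:

```python
import re
from collections import Counter

# Hypothetical messages; normalizing URL paths to $URL is a rough,
# simplified stand-in for what summarize does automatically
messages = [
    "Access to /petstore/cart failed",
    "Access to /petstore/checkout failed",
    "Database connection lost",
]

# Replace any path-like token with the $URL placeholder, then count signatures
signatures = [re.sub(r"/\S+", "$URL", m) for m in messages]
for signature, count in Counter(signatures).most_common():
    print(count, signature)
```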
For more details on the various capabilities of our search syntax, see the Search Syntax Overview section in the online help.
I hope this series has given you a good idea of how to start collecting and gaining insights from your log data using Sumo Logic. If you have any further questions feel free to contact us.
05.21.2012 | Posted by Rishi Divate, Technical Account Manager
Let’s say you have a web application running in a production environment and, like most applications, it logs its operational data to a log file. In this two-part series, I am going to give you an overview of how you can store and analyze this log data using the Sumo Logic Service with our three-step “Collect, Extract and Analyze” approach. For additional technical details on any of the sections below, please see the online help.
Step 1 – Collect
Once you have an account with Sumo Logic, the first step is to install a Collector on a machine that can access the application log file. A Collector is a lightweight application that securely and reliably feeds your log data to the Sumo Logic Service.
A single Collector has the ability to send log data from various log sources to Sumo Logic as shown in the figure below:
For additional details on Collector configurations see the “Deciding Where to Install the Collectors” section of the online help.
You can then download, install and activate the Collector for your operating system by following the instructions in the Downloading and Installing a Collector section of the online help.
Once the Collector has been activated, the next step is to add a Source for your application log file. A Source identifies which log file is being collected and how it can be accessed, and adds metadata tags that you can use later when analyzing the data. See the Adding a New Source section in the online help for how to do this.
When adding your Source, we recommend setting the following metadata tags to help you identify your log data more accurately later when analyzing it.
- Source Category as “Application/Petstore”. This tags the Source as an application, and more specifically as the Petstore application.
- Source Host as “US/West/Petstore/LinuxMc14”. This will tag the country, region, application name and machine name from which the log file is being collected.
Adding these tags will enable you to easily write searches at various levels.
- The search term "_sourceCategory=Application*" will identify log messages from all applications.
- The search term "_sourceHost=US/West/Petstore*" will identify all log messages from machines running the Petstore application in the western region of the United States.
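The wildcard matching in these search terms behaves much like shell-style glob matching. As a rough, hypothetical analogy in Python (the host values are illustrative):

```python
from fnmatch import fnmatch

# Hypothetical Source Host values following the conventions above
hosts = [
    "US/West/Petstore/LinuxMc14",
    "US/East/Petstore/LinuxMc02",
    "US/West/Billing/LinuxMc07",
]

# Analogous to: _sourceHost=US/West/Petstore*
matches = [h for h in hosts if fnmatch(h, "US/West/Petstore*")]
print(matches)  # ['US/West/Petstore/LinuxMc14']
```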
For more information on suggested Source naming conventions, please look at the Establish Metadata Conventions section in the online help.
Once you have configured your Source, you can do a quick test to search for log messages specific to your application. To do this, click on the Search tab in the Sumo Logic web application and enter the following search term:
_sourceCategory=Application/Petstore
Then select a time range for these messages on the right-hand side of the screen and click the Start button to see how log messages are displayed in Sumo Logic.
As we have seen in this post, Collectors and their Sources provide a flexible framework for uploading and tagging your log data. In a subsequent post, I will explain how to extract important information from your logs and gain further insight.