Structured Logging

What is Structured Logging?

Event log files are the computer-generated record of the events that occur within an application. Logs are among the most valuable tools for application developers during debugging and for IT operations analysts monitoring applications in production. A typical log entry includes:

  • A description of the event
  • The date and time of the event
  • The characteristics of the device where the event originated (IP address, MAC address, etc.)
  • The identity of the user that was accessing the system when the event occurred
  • The application or resource that generated the log
  • The severity level or type associated with the event

Typical log entries are presented in a textual format that is easily readable for a human analyst but difficult for machines to process. Sometimes, we may want to use automated processing to investigate log files, or we may want to use algorithms to categorize, index and search through log files based on specific parameters (by date, user, number, etc.). To support this capability, IT organizations must implement structured logging.

Structured logging is the practice of implementing a consistent, predetermined message format for application logs that allows them to be treated as data sets rather than text. The idea is to take an application log that is delivered as a string of text and convert it into a set of named fields and values that can be searched and analyzed more easily.
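As a minimal sketch of what this can look like at the point where a log is written, the example below uses Python's standard logging module with a custom formatter that emits one JSON object per entry. The class name JsonFormatter, the logger name "checkout", and the extra fields "user" and "source_ip" are assumptions made for illustration, not part of any particular product.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    # Hypothetical extra attributes that callers may attach to a record.
    EXTRA_FIELDS = ("user", "source_ip")

    def format(self, record):
        entry = {
            "timestamp": self.formatTime(record, "%Y-%m-%dT%H:%M:%S%z"),
            "severity": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Copy over only the extra attributes that were actually supplied.
        for field in self.EXTRA_FIELDS:
            if hasattr(record, field):
                entry[field] = getattr(record, field)
        return json.dumps(entry)


def build_logger(stream):
    """Return a logger that writes JSON entries to the given stream."""
    logger = logging.getLogger("checkout")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = build_logger(buffer)
log.info("payment accepted", extra={"user": "dan12345", "source_ip": "10.0.24.123"})

# Because the entry is JSON, a consumer can read fields by name.
entry = json.loads(buffer.getvalue())
print(entry["severity"], entry["user"], entry["message"])
```

Writing to an in-memory buffer keeps the sketch self-contained; in practice the handler would more likely point at a file or standard output.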

Why Use Structured Logging?

Structured logging uses a defined format to add important details to logs and make them easier to work with. The default layout for many types of application logs is plain text, which is easy for humans to read but difficult for machines to process. Structured logging converts plain text application logs into a set of data points that a machine can analyze more easily.

Structured logging addresses three key issues that arise when dealing with log files in the standard "plain text" format:

  1. Log files presented in plain text are formatted arbitrarily. The user must implement a custom parsing algorithm to extract attribute data from each string.
  2. Unstructured log files are not always human-friendly. They can be hard to read, and individual values may be difficult to interpret unless the reader knows how the logs are formatted.
  3. If the format changes, downstream applications that depend on that specific formatting may break or silently return wrong values.
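The third issue can be illustrated with a short sketch. The log lines, field names, and the two helper functions below are hypothetical; the point is only that a parser relying on token position returns the wrong value once a field is inserted, while a lookup by field name is unaffected.

```python
import json


def user_from_text(line):
    """Positional parsing: assumes the user name is always the fourth token."""
    return line.split()[3]


def user_from_json(line):
    """Keyed parsing: looks the user up by name, wherever it appears."""
    return json.loads(line)["user"]


# Plain text entries before and after a host field was added to the format.
old_text = "2017-04-10 09:50:32 -0700 dan12345 GET /checkout"
new_text = "2017-04-10 09:50:32 -0700 web-01 dan12345 GET /checkout"

# The same two entries expressed as structured (JSON) payloads.
old_json = '{"time": "2017-04-10 09:50:32 -0700", "user": "dan12345"}'
new_json = '{"time": "2017-04-10 09:50:32 -0700", "host": "web-01", "user": "dan12345"}'

print(user_from_text(old_text))  # the user name
print(user_from_text(new_text))  # now the host name, with no error raised
print(user_from_json(old_json))  # the user name
print(user_from_json(new_json))  # still the user name
```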

In practice, most developers now implement structured logging to help application users interact with their log files through automated processes. The use of basic or unstructured logs is becoming less widespread as more IT organizations adopt log management tools and processes for security and operational diagnostic purposes.

Structured Logging vs Basic Logging Explained

Structured logging takes the contents of a log and puts them into a structured format. A structured log entry consists of clearly identified attributes and their values, often together with an event identifier for reference and additional contextual data. The contents of a log entry are sometimes referred to as a payload, with a distinction drawn between structured and unstructured payloads. An unstructured payload appears as free text, while a structured payload follows a predetermined standard or custom configuration that identifies attributes and values.

To understand the benefits of structured logging, we might consider how unstructured log messages typically appear for users. Take this example of an unstructured log record generated through the Google Cloud Platform:

<4>Nov 21 2:53:17 192.168.0.1 fluentd[11111]: [error] Syslog test

This log clearly contains a wealth of information, but it may not be immediately obvious what each part of the message refers to, and some important details may have been left out. We can introduce structured logging to clarify the meaning of this log message and make it more readable for machines. One way to do this is to use the JavaScript Object Notation (JSON) format to change the structure of the payload:

jsonPayload: {
 "pri": "4",
 "host": "192.168.0.1",
 "ident": "fluentd",
 "pid": "11111",
 "message": "[error] Syslog test"
}

As you can see, the modified payload contains essentially the same information as the initial message. The key difference is that attributes have been identified, named, and presented as a set of key-value pairs. Now, a data analysis program can use these attributes to filter search results or to detect patterns in the data.
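A conversion like the one above can be sketched with a regular expression. This is an illustrative parser for the single example line shown earlier, not the logic used by Google Cloud Platform or fluentd; the pattern and the function name parse_syslog are assumptions. Unlike the payload above, the sketch also keeps the timestamp as a field named "time".

```python
import json
import re

# Named groups become the attribute names of the structured payload.
SYSLOG_PATTERN = re.compile(
    r"^<(?P<pri>\d+)>"                                    # priority value
    r"(?P<time>\w{3}\s+\d{1,2}\s+\d{1,2}:\d{2}:\d{2})\s+"  # e.g. Nov 21 2:53:17
    r"(?P<host>\S+)\s+"                                   # originating host
    r"(?P<ident>[^\[\s:]+)"                               # program name
    r"(?:\[(?P<pid>\d+)\])?:\s*"                          # optional process id
    r"(?P<message>.*)$"                                   # free-text remainder
)


def parse_syslog(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = SYSLOG_PATTERN.match(line)
    if match is None:
        return None
    return match.groupdict()


raw = "<4>Nov 21 2:53:17 192.168.0.1 fluentd[11111]: [error] Syslog test"
payload = parse_syslog(raw)
print(json.dumps(payload, indent=1))

# Once fields are named, filtering is a simple comparison.
from_host = [p for p in [payload] if p["host"] == "192.168.0.1"]
print(len(from_host))
```

Returning None for lines that do not match lets the caller decide whether to drop the entry or keep it as an unstructured message.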

Implementing Structured Logging with Data Parsing Tools

IT organizations implement structured logging using specialized software tools that parse data from various sources and convert them into a common format. The basic process can be outlined as follows:

  1. The application generates a log entry in response to an application event
  2. The log entry is captured and collected by a log aggregation software tool
  3. The source of the log entry is determined and the correct parsing algorithm is applied to convert the log payload into a structured payload
  4. The log entry can now be combined with other data to support search and analysis functions
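The steps above can be sketched as a small dispatch table that picks a parser based on the source of each entry. The source names ("api", "worker", "legacy"), the two parsers, and the sample entries are all hypothetical; a real log aggregation tool would support far more formats and handle malformed input.

```python
import json


def parse_json_line(raw):
    """Parser for sources that already emit JSON."""
    return json.loads(raw)


def parse_key_value(raw):
    """Parser for sources that emit space-separated key=value pairs."""
    return dict(pair.split("=", 1) for pair in raw.split())


# Step 3: map each known source to the parser for its format.
PARSERS = {
    "api": parse_json_line,
    "worker": parse_key_value,
}


def structure(source, raw):
    """Convert one raw entry into a structured payload tagged with its source."""
    parser = PARSERS.get(source)
    if parser is None:
        # Unknown source: keep the text as-is under a single attribute.
        payload = {"message": raw}
    else:
        payload = parser(raw)
    payload["source"] = source
    return payload


# Steps 1 and 2: entries as they might arrive from a collector.
collected = [
    ("api", '{"level": "error", "message": "timeout"}'),
    ("worker", "level=info message=started"),
    ("legacy", "something happened"),
]

# Step 4: all entries now share one shape and can be searched together.
records = [structure(source, raw) for source, raw in collected]
errors = [r for r in records if r.get("level") == "error"]
print(errors)
```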

Each application generates logs according to the specifications created by its developer. Some applications follow the Syslog standard by default, while others present log entries in an unstructured plain text format. Many applications output structured logs by default, but users may still want to convert data from all of the application logs they collect into a standard format to better facilitate event log search and data analysis.

Sumo Logic Supports Structured Logging Functionality

Effective event logs should stand on their own, providing information in a format that is easily readable for humans and machines without depending on file names, storage locations or automatic metadata tagging from a software tool to provide additional context.

Sumo Logic provides excellent support for structured logging functionality, helping IT organizations make the most of their event logs. With Sumo Logic, you can take a log entry that looks like this:

2017-04-10 09:50:32 -0700 dan12345 10.0.24.123 GET /checkout/flights/ credit.payments.io Success 2 241.98

...and convert it into a structured format like JSON so it looks like this:

{
 "timestamp": "2017-04-10 09:50:32 -0700",
 "username": "dan12345",
 "source_ip": "10.0.24.123",
 "method": "GET",
 "resource": "/checkout/flights/",
 "gateway": "credit.payments.io",
 "audit": "Success",
 "flights_purchased": 2,
 "value": 241.98
}

You can then parse through the data using Sumo Logic's parse operators to convert the log entry into your preferred structure and format for data analysis.
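Independent of any particular tool, the conversion shown above can be sketched in a few lines of Python. This is not Sumo Logic's parse operator syntax; the function name parse_access_line and the assumption that fields are separated by whitespace and always appear in this order are illustrative only.

```python
import json

# Field names for the tokens that follow the three-part timestamp,
# taken from the structured example above.
FIELDS = [
    "username",
    "source_ip",
    "method",
    "resource",
    "gateway",
    "audit",
    "flights_purchased",
    "value",
]


def parse_access_line(line):
    """Split a whitespace-delimited entry into named, typed fields."""
    parts = line.split()
    # The first three tokens (date, time, UTC offset) form the timestamp.
    timestamp = " ".join(parts[:3])
    rest = parts[3:]
    if len(rest) != len(FIELDS):
        raise ValueError("unexpected number of fields: %d" % len(rest))
    record = {"timestamp": timestamp}
    record.update(zip(FIELDS, rest))
    # Convert numeric attributes so they can be summed or compared.
    record["flights_purchased"] = int(record["flights_purchased"])
    record["value"] = float(record["value"])
    return record


line = (
    "2017-04-10 09:50:32 -0700 dan12345 10.0.24.123 GET "
    "/checkout/flights/ credit.payments.io Success 2 241.98"
)
record = parse_access_line(line)
print(json.dumps(record, indent=1))
```

Raising an error when the field count is unexpected is a deliberate choice in this sketch: it surfaces a format change immediately instead of silently mislabeling values.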