DevOps Glossary

Log Analysis

What is Log Analysis?

Log analysis is the process of reviewing, interpreting, and understanding computer-generated records called logs. Logs are generated by a range of programmable technologies, including networking devices, operating systems, applications, and more. A log consists of a series of messages in time sequence that describe activities going on within a system. Log messages may be streamed to a log collector over an active network, or they may be stored in files for later review. Either way, log analysis is the delicate art of reviewing and interpreting these messages to gain insight into the inner workings of the system.

How to Perform Log Analysis

Logs provide visibility into the health and performance of an application and infrastructure stack, enabling developer teams and system administrators to more easily diagnose and rectify issues. Here's our basic five-step process for managing logs with log analysis software:

  1. Instrument and collect - Install a collector to gather data from any part of your stack.
  2. Centralize and index - Integrate data from all log sources into a centralized platform to streamline the search and analysis process. Indexing makes logs searchable, so security and IT personnel can quickly find the information they need.
  3. Search and analyze - Analysis techniques such as pattern recognition, normalization, tagging and correlation analysis can be implemented either manually or using native machine learning.
  4. Monitor and alert - With machine learning and analytics, IT organizations can implement real-time, automated log monitoring that generates alerts when certain conditions are met. Automation can enable the continuous monitoring of large volumes of logs that cover a variety of systems and applications.
  5. Report and dashboard - Streamlined reports and dashboarding are key features of log analysis software. Customized reusable dashboards can also be used to ensure that access to confidential security logs and metrics is provided to employees on a need-to-know basis.
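
The middle steps of this process can be sketched in a few lines of Python. This is only an illustration, not how any particular log analysis product works: the `RECORDS` list stands in for data already gathered by a collector, and the names `build_index`, `search` and `should_alert` are hypothetical helpers invented for this example.

```python
from collections import defaultdict

# Hypothetical records, standing in for data already gathered by a collector (step 1).
RECORDS = [
    {"source": "web", "level": "INFO", "message": "GET /index.html 200"},
    {"source": "web", "level": "ERROR", "message": "GET /checkout 500"},
    {"source": "db", "level": "ERROR", "message": "connection pool exhausted"},
    {"source": "db", "level": "INFO", "message": "checkpoint complete"},
]


def build_index(records):
    """Step 2: map each lowercase message token to the positions of records containing it."""
    index = defaultdict(set)
    for position, record in enumerate(records):
        for token in record["message"].lower().split():
            index[token].add(position)
    return index


def search(index, records, term):
    """Step 3: return the records whose message contains the given token."""
    return [records[i] for i in sorted(index.get(term.lower(), ()))]


def should_alert(records, level="ERROR", threshold=2):
    """Step 4: signal an alert once the number of records at `level` reaches `threshold`."""
    return sum(1 for r in records if r["level"] == level) >= threshold


index = build_index(RECORDS)
hits = search(index, RECORDS, "500")   # records mentioning the token "500"
alert = should_alert(RECORDS)          # True when at least two ERROR records exist
```

A real platform would index far more than whitespace-separated tokens and would evaluate alert conditions continuously, but the division of labor between indexing, searching and alerting is the same.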

Log Analysis Functions and Methods

Log analysis functions manipulate data to help users organize and extract information from the logs. Here are just a few of the most common methodologies for log analysis.

Normalization - normalization is a data management technique wherein parts of a message are converted to the same format. The process of centralizing and indexing log data should include a normalization step where attributes from log entries across applications are standardized and expressed in the same format.
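
As a rough sketch of what a normalization step might look like, the Python below converts timestamps and severity levels from two differently formatted sources into a single representation. The two timestamp formats and the alias table are assumptions chosen for illustration, and `normalize_timestamp` and `normalize_level` are hypothetical helper names.

```python
from datetime import datetime, timezone

# Assumed timestamp formats from two hypothetical applications.
KNOWN_FORMATS = [
    "%Y-%m-%dT%H:%M:%S%z",    # e.g. 2024-03-01T12:00:00+0000
    "%d/%b/%Y:%H:%M:%S %z",   # e.g. 01/Mar/2024:13:00:00 +0100
]

# Assumed severity aliases; anything not listed is simply upper-cased.
LEVEL_ALIASES = {"warn": "WARNING", "err": "ERROR", "crit": "CRITICAL"}


def normalize_timestamp(raw):
    """Parse a timestamp in any known format and return it as ISO 8601 in UTC."""
    for fmt in KNOWN_FORMATS:
        try:
            parsed = datetime.strptime(raw, fmt)
        except ValueError:
            continue  # try the next format
        return parsed.astimezone(timezone.utc).isoformat()
    raise ValueError(f"unrecognized timestamp format: {raw!r}")


def normalize_level(raw):
    """Map severity spellings such as 'warn' or 'Info' to one upper-case vocabulary."""
    key = raw.strip().lower()
    return LEVEL_ALIASES.get(key, key.upper())
```

With both sources expressed in UTC and a shared severity vocabulary, entries from different applications can be sorted and filtered together.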

Pattern Recognition - machine learning applications can now be implemented with log analysis software to compare incoming messages with a pattern book and distinguish between "interesting" and "uninteresting" log messages. Such a system might discard routine log entries, but send an alert when an abnormal entry is detected.
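
The idea of a pattern book can be illustrated without any machine learning at all. The sketch below substitutes a hand-written list of regular expressions for a learned model, so it shows only the filtering behavior described above, not how a commercial product derives its patterns. The patterns and the names `ROUTINE_PATTERNS`, `is_interesting` and `filter_interesting` are assumptions made for this example.

```python
import re

# Hypothetical "pattern book": messages matching any of these count as routine.
ROUTINE_PATTERNS = [
    re.compile(r"^GET \S+ 200$"),
    re.compile(r"^session opened for user \w+$"),
    re.compile(r"^heartbeat ok$"),
]


def is_interesting(message):
    """A message is interesting when it matches none of the routine patterns."""
    return not any(pattern.search(message) for pattern in ROUTINE_PATTERNS)


def filter_interesting(messages):
    """Discard routine entries and keep the ones that may deserve an alert."""
    return [m for m in messages if is_interesting(m)]
```

Given the messages "GET /index.html 200", "heartbeat ok" and "kernel: Out of memory: Killed process 4242", only the last one survives the filter.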

Classification and Tagging - as part of our log analysis, we may want to group together log entries that are the same type. We may want to track all of the errors of a certain type across applications, or we may want to filter the data in different ways.
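
One simple way to approach tagging is with keyword rules, as in the Python sketch below. The rule set and the names `TAG_RULES`, `tag_entry` and `group_by_tag` are hypothetical; a production system would likely use richer classifiers, but the grouping logic would look similar.

```python
import re
from collections import defaultdict

# Hypothetical tagging rules: tag name -> pattern that identifies that type of entry.
TAG_RULES = {
    "auth_failure": re.compile(r"failed password|authentication failure", re.IGNORECASE),
    "timeout": re.compile(r"timed out|timeout", re.IGNORECASE),
    "disk": re.compile(r"no space left on device", re.IGNORECASE),
}


def tag_entry(message):
    """Return every tag whose pattern matches, or ['untagged'] if none do."""
    tags = [tag for tag, pattern in TAG_RULES.items() if pattern.search(message)]
    return tags or ["untagged"]


def group_by_tag(entries):
    """Group (application, message) pairs by tag so one error type can be tracked across apps."""
    groups = defaultdict(list)
    for app, message in entries:
        for tag in tag_entry(message):
            groups[tag].append((app, message))
    return dict(groups)
```

Calling `group_by_tag` on entries from several applications yields one list per tag, which makes it straightforward to see, for example, every authentication failure regardless of which application reported it.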

Correlation Analysis - when an event happens, it is likely to be reflected in logs from several different sources. Correlation analysis is the analytical process of gathering log information from a variety of systems and discovering the log entries from each individual system that connect to the known event.
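
A minimal form of correlation is to collect, from every source, the entries that fall within a short time window around the known event. The Python sketch below does exactly that and nothing more; real correlation engines also join on fields such as user, host or request ID. The function name `correlate` and the 30-second default window are assumptions for illustration.

```python
from datetime import datetime, timedelta


def correlate(event_time, entries, window_seconds=30):
    """Return {source: [messages]} for entries within the window around event_time.

    `entries` is an iterable of (source, timestamp, message) tuples whose
    timestamps are datetime objects in the same timezone as event_time.
    """
    window = timedelta(seconds=window_seconds)
    related = {}
    for source, timestamp, message in entries:
        if abs(timestamp - event_time) <= window:
            related.setdefault(source, []).append(message)
    return related


# Hypothetical entries from three systems around a known event at 12:00:00.
event_time = datetime(2024, 3, 1, 12, 0, 0)
entries = [
    ("firewall", datetime(2024, 3, 1, 11, 59, 50), "DENY tcp 203.0.113.5:4444"),
    ("web", datetime(2024, 3, 1, 12, 0, 5), "GET /admin 403"),
    ("db", datetime(2024, 3, 1, 12, 10, 0), "checkpoint complete"),
]
related = correlate(event_time, entries)
```

Here the firewall and web entries fall inside the window and are returned, while the database entry ten minutes later is left out.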

Log Analysis in Cyber Security

Organizations that wish to enhance their cyber security capabilities must develop log analysis capabilities that help them actively identify and respond to cyber threats. Organizations that effectively monitor their cyber security with log analysis can make their network assets more difficult to attack. Cyber security monitoring can also reduce the frequency and severity of cyber attacks, promote earlier response to threats, and help organizations meet compliance requirements for cyber security, including:

  • ISO/IEC 27002:2013 Information technology -- Security techniques -- Code of practice for information security controls
  • PCI DSS v3.1 (Requirements 10 and 11)
  • NIST SP 800-137 Information Security Continuous Monitoring (ISCM) for Federal Information Systems and Organizations

The first step to an effective cyber security monitoring program is to identify business applications and technical infrastructure where event logging should be enabled. Use this list as a starting point for determining what types of logs your organization should be monitoring:

  • System logs
    • System activity logs
    • Endpoint logs
    • Application logs
    • Authentication logs
    • Physical security logs
  • Networking logs
    • Email logs
    • Firewall logs
    • VPN logs
    • Netflow logs
  • Technical logs
    • HTTP proxy logs
    • DNS, DHCP and FTP logs
    • Appflow logs
    • Web and SQL server logs
  • Cyber security monitoring logs
    • Malware protection software logs
    • Network intrusion detection system (NIDS) logs
    • Network intrusion prevention system (NIPS) logs
    • Data loss prevention (DLP) logs

Event logging for all of these systems and applications can generate a high volume of data, and handling those logs effectively requires significant expense and resources. Cyber security experts should determine the most important logs for consistent monitoring and leverage automated or software-based log analysis methods to save time and resources.

Log Analysis in Linux

The Linux operating system offers several unique features that make it popular among its dedicated user base. In addition to being free to use, thanks to an open source development model with a large and supportive community, Linux automatically generates and saves log files that make it easy for server administrators to monitor important events that take place on the server, in the kernel, or in any of the active services or applications.

Log analysis is a crucial activity for server administrators who value a proactive approach to IT. By tracking and monitoring Linux log files, administrators can keep tabs on server performance, discover errors, detect potential security threats and privacy issues, and even anticipate problems before they occur. Linux keeps four types of logs that system administrators can review and analyze:

  • Application Logs - Linux creates log files that track the behavior of a number of applications. Application logs contain records of events, errors, warnings, and other messages that come from applications.
  • Event Logs - the purpose of an event log is to record events that take place during the execution of a system. Event logs provide an audit trail, enabling system administrators to understand how the system is behaving and diagnose potential problems.
  • Service Logs - Some Linux distributions, traditionally Debian-based ones, keep a log file called /var/log/daemon.log which tracks important background services that have no graphical output. Logging is especially useful for services that lack a user interface, as there are few other methods for users to check the activities and performance of the service.
  • System Logs - System log files contain events that are logged by the operating system components. This includes things like device changes, events, updates to device drivers and other operations. On Debian-based distributions, the file /var/log/syslog contains most of the typical system activity logs (Red Hat-based distributions use /var/log/messages instead). Users can analyze these logs to discover things like non-kernel boot errors, system start-up messages, and application errors.
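
As a starting point for analyzing such files, the Python sketch below parses lines in the traditional BSD-style syslog layout (month, day, time, host, program, optional PID, message) and counts entries per program. The layout is an assumption: some systems are configured to write ISO 8601 timestamps or structured formats instead, in which case the pattern would need adjusting. The names `SYSLOG_LINE`, `parse_syslog_line` and `count_by_program` are hypothetical, and the sample lines are invented.

```python
import re
from collections import Counter

# Assumed traditional layout: "Mar  1 12:00:00 host program[pid]: message"
SYSLOG_LINE = re.compile(
    r"^(?P<timestamp>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<program>[^\[:\s]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?P<message>.*)$"
)


def parse_syslog_line(line):
    """Return a dict of fields for a matching line, or None if the line does not match."""
    match = SYSLOG_LINE.match(line)
    return match.groupdict() if match else None


def count_by_program(lines):
    """Count parsed entries per program name, skipping lines that do not match."""
    counts = Counter()
    for line in lines:
        parsed = parse_syslog_line(line.rstrip("\n"))
        if parsed:
            counts[parsed["program"]] += 1
    return counts


# Invented sample lines in the assumed format.
SAMPLE = [
    "Mar  1 12:00:00 web01 sshd[1234]: Failed password for root from 203.0.113.5 port 22 ssh2",
    "Mar  1 12:00:01 web01 kernel: usb 1-1: new high-speed USB device",
    "Mar  1 12:00:02 web01 sshd[1235]: Accepted publickey for deploy",
    "not a syslog line",
]
counts = count_by_program(SAMPLE)
```

Because `count_by_program` accepts any iterable of lines, an open file object can be passed in place of `SAMPLE`; reading system log files may require elevated privileges.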