Application Performance Monitoring (APM)

What is Application Performance Monitoring?

Modern application management has become increasingly complicated for software developers, due in part to cloud service models like IaaS, PaaS and CaaS, the widespread use of microservices and functions, and software development practices like Agile and DevOps. Many IT organizations are also deploying more applications in hybrid cloud environments, which increases administrative overhead and makes applications harder to manage effectively.

With Application Performance Monitoring (APM), software development teams can track a variety of metrics for applications that are deployed in the cloud. APM is a capability that enables development teams to monitor and manage the performance of their software in the live environment. APM software tools capture application performance data, user behavior, security data and other information that helps developers optimize the user experience and verify the ongoing functionality of the application.

Software analyst firm Gartner describes three main functional dimensions of APM:

  • Application discovery, tracing, and diagnostics (ADTD)
  • Digital experience monitoring (DEM)
  • Artificial intelligence for IT operations (AIOps)

APM has been described as the translation of IT metrics into business meaning. APM tools drive value by capturing data from IT infrastructure, aggregating it into a single database, analyzing the data to detect patterns and trends and presenting actionable insights in a human-readable format.

Application Performance Monitoring: Three Categories of Data

Application performance monitoring is a versatile capability that can be used to measure many different types of data. There are three categories of data that your IT organization should differentiate between when configuring your APM capability:

Metrics offer a wealth of information about application performance. A metric is a quantified measure that conveys the status of a specific process. Metrics are frequently generated by a variety of applications and operating systems and can easily be correlated across different elements of the IT infrastructure. Metrics can be compared to a known baseline to yield information about the status of a system or a process. Changes in metrics can often be viewed as symptoms of an underlying problem.
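As a minimal sketch of comparing a metric to a known baseline, the Python example below flags a reading that falls far outside the spread of baseline samples. The function name, the sample response times and the three-standard-deviation threshold are illustrative assumptions, not part of any particular APM tool.

```python
from statistics import mean, stdev

def deviates_from_baseline(baseline, current, threshold=3.0):
    """Return True when `current` lies more than `threshold` standard
    deviations away from the mean of the baseline samples."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any different value is a deviation.
        return current != mu
    return abs(current - mu) / sigma > threshold

# Hypothetical response times (ms) recorded during normal operation.
baseline_ms = [120, 118, 125, 122, 119, 121, 123]

print(deviates_from_baseline(baseline_ms, 124))  # False
print(deviates_from_baseline(baseline_ms, 310))  # True
```

A reading that trips this check is a symptom only. Finding the underlying problem usually requires the traces and logs described next.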

A trace is a record of the complete processing of a request. The trace illustrates the entire journey of a request as it moves through all of the services and components of the network. A trace is made up of segments, which are operations that take place within an individual service or network component. A trace can contain hundreds of data points that can be used to diagnose errors, identify and isolate network issues and detect security threats. Traces help security analysts or artificial intelligence applications track inter-dependencies between network objects and see how things are connected within the IT infrastructure.
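One way to picture a trace and its segments is as a small data structure. The sketch below is illustrative only: the class names, fields, service names and timings are assumptions, and real tracing systems record far more detail per segment.

```python
from dataclasses import dataclass, field

@dataclass
class Segment:
    """One operation performed inside a single service or component."""
    service: str
    operation: str
    start_ms: float
    end_ms: float
    error: bool = False

    @property
    def duration_ms(self):
        return self.end_ms - self.start_ms

@dataclass
class Trace:
    """The full journey of one request, made up of segments."""
    trace_id: str
    segments: list = field(default_factory=list)

    def total_duration_ms(self):
        # Time from the earliest segment start to the latest segment end.
        starts = [s.start_ms for s in self.segments]
        ends = [s.end_ms for s in self.segments]
        return max(ends) - min(starts)

    def slowest_segment(self):
        return max(self.segments, key=lambda s: s.duration_ms)

    def failed_services(self):
        return sorted({s.service for s in self.segments if s.error})

# Hypothetical request that passes through three services.
trace = Trace("req-001", [
    Segment("gateway", "route_request", 0.0, 10.0),
    Segment("cart", "load_cart", 10.0, 45.0),
    Segment("payments", "authorize", 45.0, 170.0, error=True),
])

print(trace.total_duration_ms())          # 170.0
print(trace.slowest_segment().service)    # payments
print(trace.failed_services())            # ['payments']
```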

The final category of data is log files. A log file is automatically generated by an application or operating system. Each application's log file contains information about events and user behavior that took place on the application. An application may create several log files for recording different types of events - one for application logs, one for security logs, one for system logs, one for directory service logs, etc. Logs are useful for conducting root cause analysis and determining why a metric changed or where an event originated.
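As a rough illustration of how log files support root cause analysis, the sketch below parses log lines and counts events by severity. The line format (timestamp, level, source, message) and the sample entries are hypothetical. Real applications and operating systems each use their own log formats.

```python
import re
from collections import Counter

# Assumed format: "<timestamp> <LEVEL> <source> <message>"
LOG_PATTERN = re.compile(
    r"^(?P<timestamp>\S+) (?P<level>[A-Z]+) (?P<source>\S+) (?P<message>.*)$"
)

def parse_line(line):
    """Return the fields of one log line as a dict, or None if the
    line does not follow the assumed format."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None

def count_by_level(lines):
    """Count parsed log entries per severity level, skipping bad lines."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry is not None:
            counts[entry["level"]] += 1
    return counts

sample_lines = [
    "2024-05-01T12:00:01Z INFO gateway Request received",
    "2024-05-01T12:00:03Z ERROR payments Card authorization timed out",
    "2024-05-01T12:00:04Z WARN cart Retry scheduled",
    "not a structured line",
    "2024-05-01T12:00:09Z ERROR payments Card authorization timed out",
]

counts = count_by_level(sample_lines)
print(counts["ERROR"])  # 2
```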

Application Performance Monitoring: What Should You Measure?

When it comes to application performance monitoring, understanding what specific things you should be tracking, and how those measurements help you diagnose and resolve user issues, is as important as developing the capability itself. There are four general categories of metrics that you should be tracking with your APM tool:

System Performance

An application is only as good as its underlying infrastructure. Monitor system performance for metrics including:

  • Load - understand how many users are accessing the servers, how many requests your servers are dealing with and whether you are overloaded
  • Resource Usage - track usage of IT infrastructure across time and determine when it's time to increase your capacity
  • Input/Output - gain visibility into the movement of data throughout your IT infrastructure and identify bottlenecks that could negatively impact system performance
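Tracking resource usage over time can be sketched as a moving average compared against a capacity limit. The example below is a simplified illustration: the window size, the 80% limit and the utilization samples are assumptions chosen for the example, not recommended values.

```python
def moving_average(samples, window):
    """Return the average of each consecutive run of `window` samples."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        sum(samples[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(samples))
    ]

def needs_more_capacity(utilization, window=3, limit=0.80):
    """Return True when the most recent moving average exceeds `limit`.
    Averaging smooths out short spikes so a single busy sample
    does not trigger a capacity decision on its own."""
    averages = moving_average(utilization, window)
    return bool(averages) and averages[-1] > limit

# Hypothetical CPU utilization samples (fraction of total capacity).
cpu = [0.55, 0.60, 0.72, 0.81, 0.85, 0.88]

print(needs_more_capacity(cpu))  # True
```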

Application Performance

Application metrics can reveal critical data that reflects how users experience the application and whether they return. Monitor application performance to evaluate:

  • Latency - strongly correlated with user satisfaction and a positive user experience, latency measures the time that it takes for a user to complete a transaction on the application
  • Service Uptime - application downtime translates directly into lost revenue. APM solutions can provide real-time insight into application availability, enabling a rapid response to unplanned service interruptions
  • Throughput - throughput measures the rate of data transfer into and out of your application. This can be correlated with user activity or measured against a baseline to verify that the application is functioning correctly.
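The three application metrics above can each be reduced to a simple calculation. The sketch below is one possible formulation: it uses a nearest-rank percentile for latency, the share of passing health checks for uptime and bytes per second for throughput. All function names and sample values are hypothetical.

```python
import math

def percentile(values, pct):
    """Nearest-rank percentile: the smallest value such that at least
    `pct` percent of the samples are less than or equal to it."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]

def uptime_ratio(health_checks):
    """Fraction of health checks that passed (True counts as 1)."""
    return sum(health_checks) / len(health_checks)

def throughput(total_bytes, seconds):
    """Average data transfer rate in bytes per second."""
    return total_bytes / seconds

# Hypothetical transaction times in milliseconds.
latencies_ms = [120, 95, 110, 480, 105, 130, 98, 115, 102, 125]
# Hypothetical health check results: nine passes and one failure.
checks = [True] * 9 + [False]

print(percentile(latencies_ms, 50))   # 110
print(percentile(latencies_ms, 95))   # 480
print(uptime_ratio(checks))           # 0.9
print(throughput(1_500_000, 60))      # 25000.0
```

Percentiles are used here instead of an average because a single slow transaction, such as the 480 ms sample, can be hidden by a mean but shows up clearly at the 95th percentile.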

System and Security Events

Monitoring events means capturing log files from the IT infrastructure and analyzing them to diagnose events on the system. An effective APM tool should capture events such as:

  • System Errors/Failures - System errors or failures can be caused by any number of conditions. Monitoring system events can help to initiate a rapid response that discovers and corrects underlying issues before customers are negatively impacted.
  • System Changes - Changes in the IT infrastructure that supports your application can affect data transfer rates and latency, leading to user dissatisfaction and other issues. System changes should be monitored and evaluated to quantify their impact on the user experience.
  • Code Deploys - If a code deploy contains an unknown issue, it may immediately begin to trigger errors in the application. The ability to track new commits and correlate them to application errors and events can streamline the process of restoring the application after a faulty code deployment.
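Correlating code deploys with application errors can be sketched by comparing error counts in a fixed window before and after each deploy. This is a simplified illustration: the commit identifiers, timestamps (in seconds), window length and the "more than double" rule are all assumptions made for the example.

```python
def errors_in_window(error_times, start, end):
    """Count errors whose timestamp falls in the range [start, end)."""
    return sum(1 for t in error_times if start <= t < end)

def suspect_deploys(deploys, error_times, window=300, factor=2.0):
    """Return commits whose deploy was followed by a jump in errors.

    A deploy is flagged when the error count in the `window` seconds
    after it is more than `factor` times the count in the `window`
    seconds before it. A quiet period before the deploy is treated
    as one error so that a single stray error is not flagged.
    """
    suspects = []
    for commit, deployed_at in deploys:
        before = errors_in_window(error_times, deployed_at - window, deployed_at)
        after = errors_in_window(error_times, deployed_at, deployed_at + window)
        if after > max(before, 1) * factor:
            suspects.append(commit)
    return suspects

# Hypothetical deploys as (commit, timestamp) and error timestamps.
deploys = [("a1b2c3", 1000), ("d4e5f6", 5000)]
error_times = [800, 1100, 5010, 5020, 5100, 5150, 5290]

print(suspect_deploys(deploys, error_times))  # ['d4e5f6']
```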

Application Events

Application event monitoring is achieved by capturing log data from the application itself. These logs contain a range of useful data that can be used to assess and improve application performance:

  • User Actions - Your application performance monitoring capability should capture application event logs that reflect user actions. Tracking how users behave in the application can help you identify opportunities to improve the user experience and funnel users toward preferred or target activities.
  • User Transactions - Trace the pathways that users take when navigating your application to identify and remove bottlenecks or failure points.
  • Success/Failure - Track conversion successes and failures for users to determine when a serious issue could be affecting your bottom line.
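Tracking conversion successes and failures can be as simple as tallying outcomes from application events. The sketch below assumes a hypothetical event shape, a dict with "user" and "outcome" keys, and is meant only to illustrate the idea.

```python
from collections import Counter

def conversion_summary(events):
    """Tally success and failure outcomes and compute the success rate.
    Events with any other outcome value are ignored."""
    outcomes = Counter(event["outcome"] for event in events)
    successes = outcomes["success"]
    failures = outcomes["failure"]
    total = successes + failures
    rate = successes / total if total else 0.0
    return {"success": successes, "failure": failures, "success_rate": rate}

# Hypothetical checkout events captured from application logs.
events = [
    {"user": "u1", "outcome": "success"},
    {"user": "u2", "outcome": "failure"},
    {"user": "u3", "outcome": "success"},
    {"user": "u4", "outcome": "success"},
]

summary = conversion_summary(events)
print(summary["success_rate"])  # 0.75
```

A sudden drop in the success rate relative to its usual level is the kind of signal that may point to a serious issue affecting revenue.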

Application Performance Monitoring with Sumo Logic

Sumo Logic provides a cloud-scale platform for monitoring and managing application performance across your entire hybrid cloud environment. Sumo Logic makes it easy to capture and aggregate event logs and other data from your applications and IT infrastructure and turn it into actionable insights with the help of artificial intelligence and pattern recognition algorithms. Sumo Logic is effective on its own or as a complement to your current Application Performance Monitoring tool suite.