# How to Analyze Apache Response Time

In this guide you’ll learn how to collect, monitor, measure, and analyze logs using a powerful log analysis tool like the Sumo Logic Apache app. With this data at your fingertips, you’ll be able to detect bottlenecks, diagnose user problems, and gain deep insights into your overall Apache environment.

Discover how the Sumo Logic app for Apache Server allows you to monitor and analyze all of your Apache server logs with one tool and download a free trial to get started.

## Initializing Apache Access Logs to Collect Response Time

Apache response time can be analyzed by adding a %D directive to a custom log format. This records a new piece of data in your access logs: the number of microseconds between when the HTTP request was received and when the response was sent back to the client.

By comparing this information to the other fields in an Apache access log, we can uncover performance bottlenecks in a web application. The process is similar to Apache Traffic Analysis, but now we can look at speed in addition to hits and volume.

This article assumes you’ve defined the following LogFormat in your httpd.conf file and told Apache to use it for its access logs with the CustomLog directive:

    LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\" %D" combined_with_response_time
    CustomLog /path/to/apache/logs/access_log combined_with_response_time
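
To make the new field concrete, here is a minimal Python sketch, separate from Sumo Logic, that pulls the %D value out of one line written in this format and converts it to seconds. The regular expression, the `parse_line` helper, and the sample line are illustrative assumptions rather than anything Apache or Sumo Logic provides, and the pattern assumes there are no escaped quotes inside the quoted fields.

```python
import re

# Matches the combined_with_response_time format: the standard combined
# fields followed by %D (microseconds) as the last token on the line.
# Assumption: quoted fields contain no escaped double quotes.
LINE_RE = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)" (?P<micros>\d+)$'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # %D is in microseconds; convert to seconds for readability.
    fields["seconds"] = int(fields["micros"]) / 1_000_000
    return fields


# Made-up sample line, for illustration only.
sample = (
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" '
    '200 2326 "-" "Mozilla/5.0" 95491'
)
record = parse_line(sample)
print(record["request"], record["seconds"])
```

Lines that do not match return `None`, so a caller can count or skip malformed entries instead of failing on them.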

A dedicated log analyzer like Sumo Logic makes it much easier to identify performance problems by aggregating and visualizing log data. To follow along with the examples in this article, you can sign up for a free Sumo Logic account.

## Monitoring Response Time

Let’s start simple by extracting the response time from each access log, averaging it, and graphing the results over time. Try running the following query in Sumo Logic:

    _sourceCategory=Apache/Access
    | parse regex "(?<microseconds>\d+)$"
    | timeslice 1m
    | microseconds/toLong(1000000) as seconds
    | avg(seconds) as response_time by _timeslice

Clicking the Line Chart button in the Aggregates tab makes it easy to see when your servers are having performance problems.

However, this only gives us a high-level view. We can identify if and when slowdowns occurred, but we need to run more diagnostic queries to get to the root cause of the problem.

## Measuring Apache Response Time

If you want to measure how well your application is running or how long it takes to load a page, all you have to do is look at the Apache access log files. But before you can analyze response times, you have to make sure that you are capturing them, so check that the LogFormat directive is properly configured. %T logs response time in seconds; if you want more granular response times, use %D, which logs the time taken in microseconds.

Once you have this logging turned on, sign up for a free Sumo account, download the collectors, and configure them to pick up the log file. As soon as you have logs coming into the system, it only takes a few operators to parse the response times out of the log line and start using them.
For example, if your log format is similar to:

    "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" --%T/%D--" combined

you will start seeing log lines like:

    31/Jan/2008:14:19:07 +0000] "GET / HTTP/1.1" 200 7918 "" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.11) Gecko/20061201 Firefox/2.0.0.11 (Ubuntu-feisty)" --0/95491--

You can parse the times taken by running a parse command like:

    * | parse "--*/*--" as seconds, micro_seconds

Typically, you want to see the 95th-percentile response time. Sumo Logic offers a percentile operator for this, which can be used like:

    | pct(micro_seconds, 95) as response_time_95

You can chart the 95th-percentile response time, or set up alerts on it so that you get paged whenever response time is much higher than you expect.

## Analyzing Response Time and Traffic Volume

Performance problems are often caused by a web application’s inability to scale. To determine if this is the case for your Apache servers, we need to compare response time to traffic volume. The following query includes the total volume served every minute, in kilobytes:

    _sourceCategory=Apache/Access
    | parse regex "HTTP/1.1\"\s+\d+\s+(?<size>\d+)"
    | parse regex "(?<microseconds>\d+)$"
    | timeslice 1m
    | microseconds/toLong(1000000) as seconds
    | (size/1024) as kbytes
    | avg(seconds) as response_time, sum(kbytes) as kbytes by _timeslice
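
If you want to sanity-check the numbers outside Sumo Logic, the same per-minute aggregation can be sketched in Python. This only approximates what the query does and assumes the combined_with_response_time format from earlier in the article. The `per_minute` function, the regular expression, and the sample lines are made up for illustration.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumes the combined_with_response_time format: status and %b follow the
# quoted request, and %D (microseconds) is the last token on the line.
FIELDS_RE = re.compile(
    r'\[(?P<time>[^\]]+)\] "[^"]*" \d{3} (?P<size>\d+|-) .* (?P<micros>\d+)$'
)


def per_minute(lines):
    """Return {minute: (avg_seconds, total_kbytes)} for lines that match."""
    buckets = defaultdict(lambda: [0.0, 0, 0.0])  # seconds sum, count, kbytes
    for line in lines:
        m = FIELDS_RE.search(line.strip())
        if m is None:
            continue  # skip lines that are not in the expected format
        stamp = datetime.strptime(m["time"], "%d/%b/%Y:%H:%M:%S %z")
        minute = stamp.replace(second=0)  # truncate to a 1-minute slice
        size = 0 if m["size"] == "-" else int(m["size"])  # %b logs "-" for 0 bytes
        bucket = buckets[minute]
        bucket[0] += int(m["micros"]) / 1_000_000
        bucket[1] += 1
        bucket[2] += size / 1024
    return {k: (s / n, kb) for k, (s, n, kb) in sorted(buckets.items())}


# Made-up sample lines, for illustration only.
sample_lines = [
    '203.0.113.7 - - [10/Oct/2023:13:55:10 +0000] "GET /a HTTP/1.1" 200 2048 "-" "curl/8.0" 500000',
    '203.0.113.7 - - [10/Oct/2023:13:55:40 +0000] "GET /b HTTP/1.1" 200 1024 "-" "curl/8.0" 1500000',
    '203.0.113.7 - - [10/Oct/2023:13:56:05 +0000] "GET /c HTTP/1.1" 304 - "-" "curl/8.0" 250000',
]
result = per_minute(sample_lines)
for minute, (avg_seconds, kbytes) in result.items():
    print(minute.strftime("%H:%M"), round(avg_seconds, 3), round(kbytes, 1))
```

Unlike the query, this sketch counts responses logged with a size of `-` as zero bytes rather than dropping them, which keeps their response times in the average.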

By visualizing the traffic volume as columns and response time as a line graph, we can quickly determine if there’s a correlation between the two.

You typically want response time to stay relatively constant regardless of how much traffic you’re serving. A large spike in response time whenever traffic increases indicates some kind of scaling problem.
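
One way to put a rough number on this relationship, as a complement to eyeballing the chart, is the Pearson correlation between per-minute volume and per-minute response time. The sketch below is an assumption-laden illustration and not something Sumo Logic computes for you here: the `pearson` helper is hand-written and the two series are invented values.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Invented per-minute values: kilobytes served and average seconds per request.
kbytes_per_minute = [120, 150, 900, 1100, 140]
response_time = [0.08, 0.09, 0.75, 0.92, 0.10]

r = pearson(kbytes_per_minute, response_time)
print(round(r, 3))
```

A coefficient close to 1 means response time rises and falls with traffic, which is the scaling pattern described above. A value near 0 suggests the slowdowns have another cause.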

So, we’ve determined what kind of problem we’re having, but this still isn’t enough information to start resolving the problem. We also need to know where the problem is occurring.

## Slowest URLs by Average Time

This question can be answered by analyzing request URLs. The next query returns the slowest URLs in your web application:

    _sourceCategory=Apache/Access
    | parse regex "\"[A-Z]+ (?<url>.+) HTTP"
    | parse regex "(?<microseconds>\d+)$"
    | microseconds/toLong(1000000) as seconds
    | avg(seconds) as response_time by url
    | sort response_time

Constraining the query’s time frame to the response time spike from the previous section and visualizing the results as a bar chart tells you precisely which URLs need your attention.

Your development or IT team can now figure out what the problem is, fix it, redeploy, and then verify the fix back in Sumo Logic.

It’s important to understand that a log analyzer’s role is primarily monitoring and root cause analysis, not remediation. Sumo Logic won’t fix Apache configuration problems for you, but it will reduce your mean time to resolution by telling you exactly where you should be spending your debugging efforts.

## Average Request Time by Server

If no particular URL is causing performance issues, you can continue your analysis by examining response time by server:

    _sourceCategory=Apache/Access
    | parse regex "\"[A-Z]+ (?<url>.+) HTTP"
    | parse regex "(?<microseconds>\d+)$"
    | timeslice 1m
    | microseconds/toLong(1000000) as seconds
    | avg(seconds) as response_time by _timeslice, _sourceHost
    | transpose row _timeslice column _sourceHost
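
The per-URL and per-server queries are the same operation underneath: group the records by some key and average the %D value. A minimal Python sketch of that idea follows, under the assumption that each log line arrives paired with the name of the host it came from. The host label here is a hypothetical stand-in for Sumo Logic’s _sourceHost metadata, and the helper names and sample data are invented. For brevity it groups over the whole input rather than per one-minute slice.

```python
import re
from collections import defaultdict

# Request URL from the quoted request line, and %D as the last token.
URL_RE = re.compile(r'"[A-Z]+ (?P<url>\S+) HTTP')
MICROS_RE = re.compile(r'(?P<micros>\d+)$')


def average_seconds_by(records, key):
    """Average response time in seconds, grouped by key(host, line).

    Returns (group, avg_seconds) pairs sorted slowest first.
    """
    totals = defaultdict(lambda: [0.0, 0])  # seconds sum, request count
    for host, line in records:
        micros = MICROS_RE.search(line.strip())
        group = key(host, line)
        if micros is None or group is None:
            continue  # skip lines without a usable key or response time
        totals[group][0] += int(micros["micros"]) / 1_000_000
        totals[group][1] += 1
    averages = {group: s / n for group, (s, n) in totals.items()}
    return sorted(averages.items(), key=lambda pair: pair[1], reverse=True)


def url_of(host, line):
    """Group key: the requested URL, or None if the request line is malformed."""
    m = URL_RE.search(line)
    return m["url"] if m else None


def host_of(host, line):
    """Group key: the (hypothetical) host label attached to the line."""
    return host


# Invented (host, line) pairs, for illustration only.
records = [
    ("web-1", '203.0.113.7 - - [10/Oct/2023:13:55:01 +0000] "GET /search HTTP/1.1" 200 512 "-" "curl/8.0" 900000'),
    ("web-2", '203.0.113.8 - - [10/Oct/2023:13:55:02 +0000] "GET /search HTTP/1.1" 200 512 "-" "curl/8.0" 700000'),
    ("web-1", '203.0.113.9 - - [10/Oct/2023:13:55:03 +0000] "GET /index.html HTTP/1.1" 200 2048 "-" "curl/8.0" 100000'),
]
slowest_urls = average_seconds_by(records, url_of)
slowest_hosts = average_seconds_by(records, host_of)
print(slowest_urls)
print(slowest_hosts)
```

Swapping the key function is all it takes to pivot the same data a different way, which is why adding one field to the log format opens up several analyses at once.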

An area chart is a great way to compare server performance. Any response time spikes will be immediately apparent.

Under-performing servers could indicate load-balancing issues, poor server configuration, or even problems in the hardware layer (e.g., an overheating CPU). You can continue digging deeper with more diagnostic queries, but again, remember that an Apache log analyzer is only designed to tell you where to look; it’s up to you to implement a solution.

## Summary

This article walked through a common scenario for many sysadmins: a customer complains that your company’s website is slow, so you need to dig into your Apache logs to figure out why. We started by determining when the slow-downs occurred, then we checked for scaling problems, slow scripts/resources, and under-performing servers.

All we were really doing in this article was pivoting access log values on the %D format string. Every time you add a new piece of data to your Apache log format, it gives you a host of new actionable insights about your server performance.
