The rise of DevOps engineers has changed how IT teams monitor the health of their systems and networks. Rather than having a siloed organization with specialized staff managing specific pieces of equipment, a DevOps team comprises tech generalists who take a more holistic view of the system and prioritize application performance. They need a different set of metrics, along with notifications and alerts, to analyze how business-critical applications are performing. This blog highlights five key metrics for optimizing application performance.
Let’s say you’ve deployed, or are thinking about deploying, an Application Delivery Controller (ADC) solution for load balancing, web acceleration and application firewalling. How do you know if your business applications are running at the best performance possible? The metrics discussed here are crucial for understanding app performance and how it can be improved.
But the metrics alone are not enough. They must be easy to see, understand, and act upon. The data needs to be presented in an easy-to-use format on an intuitive dashboard that is customizable. Your administrators should be able to set performance thresholds, configure alerts and easily take action to avert service deterioration or optimize performance.
1. Latency

The most important metric that is directly linked to application performance and user experience is latency (a.k.a., server response time). This is a measure of the time it takes for your web server to receive a request and generate a response. Another way to think about latency is the time it takes for your web server to generate a page. If latency times are too high, then web applications and your website will run slowly for users.
Slow-running web apps hinder employee productivity and create a poor customer experience. Studies have shown that more than half of users will abandon a website if it takes more than 4 seconds to load.
There will always be some latency in any web application, so the goal is to minimize it. If latency is higher than your acceptable threshold, then it’s time to look for underlying causes. Perhaps the distances between your data centers and servers are too great. For example, the baseline delay for a European user to communicate with a U.S. West Coast website is 150 ms. And that’s before any website objects are retrieved.
Other contributors to latency include having too many network elements in the path, insufficient network capacity and websites that are simply too large. There are many tools available to help you analyze your website, and the results will usually lead you to one of the most important metrics for application performance – object count.
2. Object Count

Object count is the number of individual objects (images, scripts, style sheets and other files) that a browser must fetch to render a page. Every object requires its own request, so pages with high object counts take longer to load; trimming, consolidating or caching objects is often the quickest way to reduce latency.
3. Failed Connections
The number of failed or rejected connections is another important metric. If the failed connection count goes up, it could be an indicator of a serious problem. For one, your system might have reached its capacity limit after being overwhelmed by the number of requests coming into it. This could happen if there is an unusual surge in traffic to your site or because your website isn’t scaled properly to accommodate normal traffic patterns.
If you have too many failed connections, consider adding server capacity or implementing better load balancing across your system. Ideally, you should test your site by adding virtual users to see how the system responds under high traffic loads, rather than waiting to watch the impact of a real traffic surge in real time, when live users could be affected.
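A back-of-the-envelope model can show why capacity testing matters before a surge hits. This sketch (the traffic and capacity figures are made up for illustration) counts how many connections a server tier would reject when arrivals exceed the number of new connections it can accept per second:

```python
# Minimal sketch with illustrative numbers: model a traffic surge
# against a tier that accepts a fixed number of new connections per
# second, and count how many connections would fail.
def count_failed_connections(arrivals_per_sec, capacity_per_sec):
    failed = 0
    for arrivals in arrivals_per_sec:
        # Any arrivals beyond capacity in a given second are rejected.
        failed += max(0, arrivals - capacity_per_sec)
    return failed

# Normal traffic, then a spike well past a 500-connection/sec capacity.
surge = [100, 400, 900, 300]
failed = count_failed_connections(surge, capacity_per_sec=500)
```

Real load-testing tools replay far more realistic traffic, but even this toy model shows how a short spike past capacity translates directly into a burst of failed connections.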
4. HTTP Error Rates
Monitoring HTTP status codes and tracking HTTP error rates will show how well your services are running. By watching the status codes on a per-server basis, you can easily see if a specific server is generating errors. The status codes also identify the problem, such as configuration issues, Layer 1-4 timeouts or connection problems, Layer 7 response errors and protocol errors.
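To make per-server tracking concrete, here is a minimal sketch (the server names and log entries are hypothetical) that tallies status codes by server and computes each server's rate of 5xx errors:

```python
from collections import Counter

# Hypothetical log entries: (server, status_code) pairs as a load
# balancer's analytics might record them.
log = [
    ("web-1", 200), ("web-1", 200), ("web-1", 502),
    ("web-2", 200), ("web-2", 404), ("web-2", 200), ("web-2", 200),
]

def error_rates(entries):
    totals, errors = Counter(), Counter()
    for server, code in entries:
        totals[server] += 1
        if code >= 500:  # count server-side (5xx) errors only
            errors[server] += 1
    return {server: errors[server] / totals[server] for server in totals}

rates = error_rates(log)
```

Breaking the rates out per server, as above, is what lets you spot a single misbehaving backend that an aggregate error rate would hide.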
Regular HTTP health checks are essential for ensuring application performance because they provide a complete and accurate view of system health. This feature should be included in the analytics tools of load balancers. You should also be able to customize the health checks for your operations by setting parameters for frequency and the rise and fall counts (i.e., how many consecutive good or bad responses are required before a server is marked up or down).
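The rise/fall logic itself is simple to sketch. The following is a minimal, illustrative implementation (parameter names are mine, not any particular product's): a server is marked up only after `rise` consecutive good checks, and down only after `fall` consecutive bad ones, so one flaky probe cannot flap the server's state:

```python
# Minimal sketch of rise/fall health-check state tracking.
class HealthState:
    def __init__(self, rise=3, fall=2, up=True):
        self.rise, self.fall = rise, fall
        self.up = up
        self.streak = 0  # consecutive checks opposing the current state

    def record(self, check_ok):
        if check_ok == self.up:
            self.streak = 0  # result agrees with current state; reset
        else:
            self.streak += 1
            # Going down requires `fall` bad checks; up requires `rise` good.
            needed = self.fall if self.up else self.rise
            if self.streak >= needed:
                self.up = check_ok
                self.streak = 0
        return self.up
```

With `fall=2`, two consecutive failed checks take a server out of rotation, while with `rise=3` it must pass three checks in a row before receiving traffic again.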
5. Request Count and Active Connections
The request count measures how many requests are coming into your system, while the number of active connections (a.k.a., flow count) monitors connections between clients and target servers. Both are good metrics for determining how well traffic load is being balanced across your system.
Request counts can be tracked on a per-minute basis or as a total sum across all your servers. The total number of requests shows the normal range for the number of users your system supports; if the total falls outside this range, it could indicate problems with network routing or connections. Per-minute request counts provide a good view of overall load balancing.
Active connections can be displayed as an average or a maximum, and they signal whether your system is scaled to the right level and how load is distributed across it.
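Both calculations are straightforward. This sketch (with made-up timestamps and connection samples) buckets request timestamps into per-minute counts and summarizes active-connection samples as an average and a maximum:

```python
from collections import Counter

# Hypothetical request arrival times (epoch seconds), bucketed into
# per-minute request counts by integer-dividing by 60.
timestamps = [60, 65, 119, 120, 121, 185]
per_minute = Counter(ts // 60 for ts in timestamps)

# Hypothetical active-connection samples taken at regular intervals,
# reduced to the average and maximum figures a dashboard would show.
active_samples = [40, 55, 62, 48]
avg_active = sum(active_samples) / len(active_samples)
max_active = max(active_samples)
```

The per-minute buckets reveal load-balancing patterns over time, while the average and maximum of active connections tell you how close the system runs to its scaling limits.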
Monitoring these key metrics and responding quickly to potential problems will ensure high performance of your business-critical applications. Snapt’s software ADCs are packed with powerful analytics tools that track these metrics and much more. Built for DevOps teams, the performance metrics are presented in easy-to-understand, customizable dashboards. You can configure the dashboards to show only the data you want to see and set thresholds for alarms and notifications so that you can prevent performance degradations from affecting users and customers.
To see Snapt’s ADC and analytics tools in action, start your free trial today!