What is Resource Utilization?
In addition to throughput and response times, another key indicator of an application’s performance is utilization.
Resource utilization is a way to track how busy various resources of a computer system are when running a performance test.
What are some common utilization performance metrics to monitor?
There are tons of metric counters to choose from to help monitor utilization. When I’m running a performance test, however, these are the four key areas that I begin with:
CPU Utilization (Is there a doctor in the house?)
CPU utilization measurements can help determine how effective your test is. They can also be used as a gauge of how any tuning change you’ve made has affected the overall performance of the system. I like to think of CPU as the pulse rate in the overall health of a system.
When the CPU hits 100%, it can’t take on any more work and your throughput flattens. As a rule of thumb, avoid running any processor above 80% utilization for long periods of time.
You’ll want to grab CPU utilization for all servers used in your application under test (AUT) infrastructure. CPU is a quick and easy key metric because it can rapidly help you identify which servers may be causing issues or creating potential bottlenecks. To take a look at CPU utilization:
- On a Windows machine, go to Start > Run and type perfmon.
- On a UNIX machine you could use vmstat 5. On a Mac, vm_stat 5 from the terminal reports virtual memory statistics rather than CPU, so use top to see CPU usage there.
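To make the 80% rule of thumb concrete, here is a minimal Python sketch that scans CPU samples you’ve already collected (from perfmon, vmstat, or your load testing tool) and reports stretches where utilization stayed above a threshold. The function name, the sample values and the `min_consecutive` setting are all made up for illustration; how long counts as “a long period” is something you’d have to decide for your own system.

```python
def sustained_high_cpu(samples, threshold=80.0, min_consecutive=3):
    """Return (start, end) index pairs for runs where CPU% stays above threshold.

    samples: CPU utilization percentages taken at a fixed interval.
    threshold / min_consecutive: illustrative defaults, not recommendations.
    """
    runs = []
    start = None
    for i, pct in enumerate(samples):
        if pct > threshold:
            if start is None:
                start = i  # a new run above the threshold begins here
        else:
            if start is not None and i - start >= min_consecutive:
                runs.append((start, i - 1))
            start = None
    # Close a run that is still open when the samples end.
    if start is not None and len(samples) - start >= min_consecutive:
        runs.append((start, len(samples) - 1))
    return runs


# Hypothetical samples, one every 5 seconds during a test run.
cpu_samples = [35, 62, 85, 91, 97, 88, 70, 83, 45]
print(sustained_high_cpu(cpu_samples))  # [(2, 5)]
```

The single spike to 83% near the end is ignored because it doesn’t last; only the four-sample stretch above 80% is reported.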
Memory is the next area to watch, because memory counters can help find potential memory leaks caused by your application. Based on Microsoft’s recommendations in Performance Testing Microsoft .NET Web Applications, memory leaks can be found by monitoring:
- Memory\Available Bytes
- Process\Private Bytes
- Process\Working Set
Find a Memory Leak formula:
A memory leak will usually show Process\Private Bytes and Process\Working Set increasing over the course of the test, and Memory\Available Bytes decreasing.
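The pattern above can be checked mechanically once the counters have been exported. The sketch below is one rough way to do it, assuming you have each counter as a list of samples taken at a regular interval. It fits a straight line to each series and looks only at the direction of the slope; the function names and the sample numbers are hypothetical, and a real check would also want to account for warm-up, garbage collection and noise.

```python
def slope(values):
    """Least-squares slope of values plotted against their sample index."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def looks_like_leak(private_bytes, working_set, available_bytes):
    """True when the three counters trend the way a memory leak usually does."""
    return (
        slope(private_bytes) > 0
        and slope(working_set) > 0
        and slope(available_bytes) < 0
    )


# Hypothetical samples in MB, one per minute of a soak test.
private = [210, 225, 241, 260, 278]
working = [190, 204, 215, 233, 250]
available = [3100, 3080, 3065, 3040, 3020]
print(looks_like_leak(private, working, available))  # True
```

A positive result is only a hint to dig deeper, not proof of a leak; a cache that is still filling up will show the same shape early in a test.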
Disk comes next. Although disk space is most likely the first thing that comes to mind when you hear the word “Disk”, disk bottlenecks are usually related to time. Some counters to help troubleshoot disk issues are:
- Average Disk Queue Length
- Average Disk Read Queue Length
- Average Disk Write Queue Length
- Average Disk sec/Read
- Average Disk sec/Transfer
- Disk Reads/sec
- Disk Writes/sec
Find a Disk bottleneck formula:
I/Os per Disk = [Reads + (4 x Writes)] / Number of Disks
(To see a step-by-step example of how to use this formula, take a look at pp. 84-85 of the Microsoft book I mentioned earlier.)
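As a quick illustration of the arithmetic (not the book’s worked example), here is the formula as a small Python function. The input numbers are invented, and the factor of 4 is taken as-is from the formula above; it’s exposed as a parameter on the assumption that the right write penalty depends on how your disks are configured.

```python
def ios_per_disk(reads_per_sec, writes_per_sec, disk_count, write_penalty=4):
    """I/Os per Disk = [Reads + (write_penalty x Writes)] / Number of Disks."""
    if disk_count <= 0:
        raise ValueError("disk_count must be positive")
    return (reads_per_sec + write_penalty * writes_per_sec) / disk_count


# Hypothetical readings from Disk Reads/sec and Disk Writes/sec on a 4-disk array.
print(ios_per_disk(reads_per_sec=320, writes_per_sec=100, disk_count=4))  # 180.0
```

You’d then compare the result against what a single disk in your array can actually sustain, a figure that has to come from the vendor or from your own measurements.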
The last resource utilization metric to keep an eye on is the network. This may be even more important nowadays, with the growth of distributed apps running in the cloud. Things to keep an eye on are:
- Network latency – The time it takes to send a data packet across a network connection
- Network Round Trip – A client-server request and response generated by your application
- Data transfer – The amount of data moved between a browser and a web server
Lots of data combined with restricted bandwidth and network latency usually equals stinky performance. It’s kind of like expecting Superman to save the world in a straitjacket with a pocket full of kryptonite. (Well, maybe not quite that bad.)
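To get a feel for how these three factors add up, here is a back-of-the-envelope Python sketch. It’s a deliberately simplified model of my own, not a formula from any tool: it assumes each round trip costs twice the one-way latency, that round trips happen one after another, and that the payload moves at the full stated bandwidth. Real networks add protocol overhead, parallel connections and congestion, so treat the output as a rough lower bound.

```python
def estimated_transfer_seconds(payload_bytes, bandwidth_bits_per_sec,
                               latency_sec, round_trips=1):
    """Rough time to move a payload: round-trip waiting plus transmission time."""
    waiting = round_trips * 2 * latency_sec  # assumes symmetric one-way latency
    transmission = payload_bytes * 8 / bandwidth_bits_per_sec
    return waiting + transmission


# Hypothetical page: 2 MB over a 10 Mbit/s link, 25 ms one-way latency,
# 20 sequential round trips.
seconds = estimated_transfer_seconds(
    payload_bytes=2_000_000,
    bandwidth_bits_per_sec=10_000_000,
    latency_sec=0.025,
    round_trips=20,
)
print(f"{seconds:.2f}")  # 2.60
```

In this made-up case, a full second of the total is spent just waiting on round trips, which is why cutting the number of requests can help as much as shrinking the payload.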
How to collect all this stuff?
Instead of writing performance data to a bunch of text files, most performance tools have the ability to gather most of these metrics in a single location.
Performance monitoring is a big topic. I believe in starting small and then slowly adding more complexity. There are many other performance counters to choose from. Hopefully these four areas (CPU, MEMORY, DISK and NETWORK) will help get you started.