Finding busy disks with iostat

The iostat(1M) utility provides a variety of I/O statistics that are useful for analyzing I/O workloads and troubleshooting performance problems. When I’m investigating an I/O problem, I usually start with the number of reads and writes issued to each device, which are available in iostat’s “r/s” and “w/s” columns:

$ iostat -zxnM 5

                    extended device statistics
    r/s    w/s   Mr/s   Mw/s wait actv wsvc_t asvc_t  %w  %b device
   85.2   22.3   10.6    2.6  7.2  1.4   67.0   13.5  18  89 c0t0d0
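For reference, the iostat options used throughout this article are: “-z”, which suppresses lines for devices with no activity; “-x”, which reports extended device statistics; “-n”, which displays device names in the descriptive cNtNdN format; and “-M”, which reports throughput in megabytes rather than kilobytes. The trailing 5 tells iostat to print a new sample every five seconds.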

Once I know how many reads and writes are being issued, I like to find the number of megabytes read from and written to each device. This information is available in iostat’s “Mr/s” and “Mw/s” columns:

$ iostat -zxnM 5

                    extended device statistics
    r/s    w/s   Mr/s   Mw/s wait actv wsvc_t asvc_t  %w  %b device
   85.2   22.3   10.6    2.6  7.2  1.4   67.0   13.5  18  89 c0t0d0
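A handy number to derive from these columns is the average I/O size, which is just throughput divided by the operation rate (10.6 MB/s divided by 85.2 reads/s works out to roughly 127 KB per read in the sample above). The following awk one-liner is a rough sketch of this calculation; it assumes the exact column layout shown above and skips devices with no reads or writes:

$ iostat -zxnM 5 | awk '$1+0 > 0 && $2+0 > 0 {
      printf "%-8s avg read %.0f KB  avg write %.0f KB\n",
             $11, $3 * 1024 / $1, $4 * 1024 / $2 }'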

After reviewing these items, I like to check iostat’s “wait” column, which shows the average number of operations queued for each device (i.e., the I/O queue depth):

$ iostat -zxnM 5

                    extended device statistics
    r/s    w/s   Mr/s   Mw/s wait actv wsvc_t asvc_t  %w  %b device
   85.2   22.3   10.6    2.6  7.2  1.4   67.0   13.5  18  89 c0t0d0
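When a system has lots of disks, it can be handy to flag only the devices with deep queues. The following one-liner sketches this idea, again assuming the column layout shown above; the threshold of 10 queued operations is an arbitrary value chosen for illustration:

$ iostat -zxnM 5 | awk '$5+0 > 10 {
      printf "%s: %s queued, %s active\n", $11, $5, $6 }'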

To see how these can be applied to a real problem, I captured the following data from device c0t0d0 a week or two back:

$ iostat -zxnM 5

                    extended device statistics
    r/s    w/s   Mr/s   Mw/s wait actv wsvc_t asvc_t  %w  %b device
    0.2   71.2    0.0    7.7 787.3  2.0 11026.8   28.0 100 100 c0t0d0

Device c0t0d0 was overloaded: it had 787 I/O operations waiting to be serviced, and it was causing noticeable application latency (the application in question performed lots of reads and writes, and its files were opened with O_SYNC, so each write had to reach stable storage before it returned). Once iostat returned the statistics above, I used ps(1) to find the processes that were causing the excessive disk activity, and used kill(1) to terminate them!
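Picking the guilty processes out of ps(1) output involves a bit of guesswork. On systems with DTrace (Solaris 10 and newer), the io provider can attribute I/O requests to executables directly; the following one-liner counts I/O requests by executable name, and prints the totals when you press Ctrl-C:

# dtrace -n 'io:::start { @[execname] = count(); }'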

The Solaris iostat utility was used to produce this output. Note that the first sample iostat prints contains averages since the system was booted, and should be ignored when analyzing current activity.

This article was posted by Matty on 2005-06-26 21:53:00 -0400