Finding busy disks with iostat

The iostat(1M) utility reports several I/O statistics that are useful for analyzing I/O workloads and troubleshooting performance problems. When investigating I/O problems, I usually start by checking the number of reads and writes issued to a device each second, which are available in iostat’s “r/s” and “w/s” columns:

$ iostat -zxnM 5

                    extended device statistics              
    r/s    w/s   Mr/s   Mw/s wait actv wsvc_t asvc_t  %w  %b device
   85.2   22.3   10.6    2.6  7.2  1.4   67.0   13.5  18  89 c0t0d0

Once I know how many reads and writes are being issued, I like to find the number of megabytes read from and written to each device per second. This information is available in iostat’s “Mr/s” and “Mw/s” columns:

$ iostat -zxnM 5

                    extended device statistics              
    r/s    w/s   Mr/s   Mw/s wait actv wsvc_t asvc_t  %w  %b device
   85.2   22.3   10.6    2.6  7.2  1.4   67.0   13.5  18  89 c0t0d0
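
Combining the two pairs of columns gives a rough average I/O size, which helps distinguish many small requests from a few large ones. The following is a minimal Python sketch of that arithmetic using the sample row above. The function name avg_io_size_kb is hypothetical, and the conversion assumes that the -M option reports megabytes of 1024 kilobytes:

```python
def avg_io_size_kb(mb_per_sec, ops_per_sec):
    """Approximate average request size in KB for one direction (reads or writes).

    Assumes mb_per_sec uses megabytes of 1024 KB; adjust the factor otherwise.
    """
    if ops_per_sec == 0:
        return 0.0  # no operations in the interval, so there is no meaningful size
    return mb_per_sec * 1024.0 / ops_per_sec


# Values taken from the c0t0d0 sample row: Mr/s and r/s, then Mw/s and w/s.
read_kb = avg_io_size_kb(10.6, 85.2)
write_kb = avg_io_size_kb(2.6, 22.3)
print("average read size:  %.1f KB" % read_kb)
print("average write size: %.1f KB" % write_kb)
```

With the sample numbers, both reads and writes work out to somewhat over 100 KB per request.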

After reviewing these items, I like to check iostat’s “wait” value to see the average number of I/O requests queued and waiting for service on each device:

$ iostat -zxnM 5

                    extended device statistics              
    r/s    w/s   Mr/s   Mw/s wait actv wsvc_t asvc_t  %w  %b device
   85.2   22.3   10.6    2.6  7.2  1.4   67.0   13.5  18  89 c0t0d0

To see how these can be applied to a real problem, I captured the following data from device c0t0d0 a week or two back:

$ iostat -zxnM 5

               extended device statistics              
    r/s    w/s   Mr/s   Mw/s wait actv wsvc_t asvc_t  %w  %b device
    0.2   71.2    0.0    7.7 787.3  2.0 11026.8   28.0 100 100 c0t0d0

Device c0t0d0 was overloaded: it had an average of 787 I/O operations waiting to be serviced, was 100% busy, and was causing application latency (the application in question performed lots of reads and writes, and its files were opened with O_SYNC). Once iostat returned the above statistics, I used ps(1) to find the processes that were causing the excessive disk activity, and used kill(1) to terminate them!
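
If you want to watch for this condition automatically, the check can be sketched in Python. This is only an illustration: the helper names parse_iostat_line and is_busy are hypothetical, the column list assumes the "iostat -xnM" layout shown in the samples above, and the thresholds are arbitrary assumptions that should be tuned for your hardware:

```python
# Column names as printed by "iostat -xnM" in the samples above.
COLUMNS = ["r/s", "w/s", "Mr/s", "Mw/s", "wait", "actv",
           "wsvc_t", "asvc_t", "%w", "%b", "device"]


def parse_iostat_line(line):
    """Return a dict for one data row, or None for headers and blank lines."""
    fields = line.split()
    if len(fields) != len(COLUMNS):
        return None  # banner line, blank line, or an unexpected layout
    try:
        values = [float(f) for f in fields[:-1]]
    except ValueError:
        return None  # the column header row ("r/s w/s ...") is not numeric
    row = dict(zip(COLUMNS[:-1], values))
    row["device"] = fields[-1]
    return row


def is_busy(row, max_wait=1.0, max_busy_pct=80.0):
    """Flag a device whose queue or busy percentage exceeds the thresholds.

    Both thresholds are assumptions chosen for illustration only.
    """
    return row["wait"] > max_wait or row["%b"] > max_busy_pct


sample = "    0.2   71.2    0.0    7.7 787.3  2.0 11026.8   28.0 100 100 c0t0d0"
row = parse_iostat_line(sample)
if row is not None and is_busy(row):
    print("%s looks overloaded: wait=%.1f %%b=%.0f"
          % (row["device"], row["wait"], row["%b"]))
```

The parser returns None for anything that is not an eleven-column numeric row, so the banner and header lines from iostat can be passed through it without special handling.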

* The Solaris iostat utility was used to produce this output.
** The first report iostat prints contains averages since the system was booted, and should be ignored.
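
When post-processing iostat output with a script, the since-boot report can be dropped before any analysis. The sketch below is one possible way to do that in Python. The function name skip_first_report is hypothetical, and it assumes every report begins with the "extended device statistics" banner seen in the samples above:

```python
def skip_first_report(lines, marker="extended device statistics"):
    """Yield only the lines that belong to the second and later reports.

    Assumes each report starts with a banner line containing `marker`.
    """
    reports_seen = 0
    for line in lines:
        if marker in line:
            reports_seen += 1
        if reports_seen >= 2:
            yield line


# Two abbreviated reports: the first holds since-boot averages.
captured = [
    "extended device statistics",
    "r/s w/s device",
    "1.0 2.0 c0t0d0",
    "extended device statistics",
    "r/s w/s device",
    "0.2 71.2 c0t0d0",
]
for line in skip_first_report(captured):
    print(line)
```

Because the function accepts any iterable of lines, it could also be fed sys.stdin when iostat is piped into the script.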

6 Comments

Sheldon  on April 14th, 2006

Thanks. Your iostat writeup is simple & useful.

hafiz  on October 26th, 2007

Simple and easy for a beginner like me to understand. I was just wondering: how do you sort processes in ascending or descending order, from biggest to smallest, using the ps command?

Quote: “I captured the following data from device c0t0d0 a week or two back” > was this done by a script or manually?

hafiz  on October 26th, 2007

Thanks for sharing your thought.

chow  on November 3rd, 2011

Please tell me what the threshold value is for % disk busy on a Solaris T5220 server, and how it is calculated.

Kaluwa  on January 23rd, 2012

Thanks! Very useful.

Zahid Haseeb  on March 16th, 2012

@ chow

(r/s + w/s) * service time/1000 * 100 = %b(in solaris) OR %util (in linux)
