Blog O' Matty


Monitoring interrupts with Solaris

This article was posted by Matty on 2005-07-22 21:34:00 -0400

The intrstat(1m) utility was introduced in Solaris 10 and allows interrupt activity to be monitored on a system:

$ intrstat 5

      device |      cpu0 %tim
-------------+---------------
       glm#0 |       953  2.6
       qfe#0 |       202  1.5
      uata#0 |        91  0.2

      device |      cpu0 %tim
-------------+---------------
       glm#0 |       879  2.6
       qfe#0 |       198  1.5
      uata#0 |        89  0.2

This provides a snapshot of the number of interrupts generated during each interval (five seconds in the example above). To get cumulative interrupt activity over a specific period of time, Brendan Gregg’s intrtime DTrace script can be used:

$ intrtime 60

Interrupt Time(ns) %Time
uata 2869846 0.00
qfe 46331270 0.08
glm 1913715146 3.19
TOTAL(int) 1962916262 3.27
TOTAL(dur) 60008698021 100.00
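
If you don’t happen to have the intrtime script on the box, you can approximate the same idea straight from the command line with the sdt provider’s interrupt probes. This is only a rough sketch, not a replacement for Brendan’s script; it assumes (as the DTraceToolkit scripts do) that arg0 of the interrupt-complete probe points to the interrupted device’s dev_info structure:

$ dtrace -qn '
sdt:::interrupt-start { self->ts = vtimestamp; }

sdt:::interrupt-complete
/self->ts && arg0/
{
    /* sum time spent in interrupt handlers, keyed by driver name */
    @[stringof(((struct dev_info *)arg0)->devi_node_name)] =
        sum(vtimestamp - self->ts);
    self->ts = 0;
}

tick-60s { printa("%-12s %@d ns\n", @); exit(0); }'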

With these two utilities, you can easily see which devices are busy generating interrupts. This information can also be used to ask questions like “which process is causing the activity in the qfe driver,” or “what SCSI devices are busy in the system,” or “HEY! The SCSI disk drives off the Ultra Wide SCSI controller shouldn’t be in use! Who is accessing them?!?”

DTrace is da bomb yizo!

Veritas disk group configuration records

This article was posted by Matty on 2005-07-16 21:42:00 -0400

Veritas uses disk group configuration records to store subdisk, plex, volume, and device configuration data. The configuration records are written to the private region of specific devices in each disk group, and are described in the vxintro(1m) manual page:

A disk group configuration is a small database that contains all volume, plex, subdisk, and disk media records. These configurations are replicated onto some or all disks in the disk group, usually with one copy on each disk. Because these databases are stored within disk groups, record associations cannot span disk groups. Thus, a subdisk defined on a disk in one disk group cannot be associated with a volume in another disk group.

If multiple devices are present in a disk group, Veritas will replicate the configuration records across several of them for redundancy. You can see which devices contain configuration records by invoking vxdg(1m) with the “list” option:

$ vxdg list oof

Group: oof
dgid: 1120604922.22.tigger
import-id: 1024.10
flags: cds
version: 120
alignment: 8192 (bytes)
ssb: on
detach-policy: global
dg-fail-policy: dgdisable
copies: nconfig=2 nlog=default
config: seqno=0.17098 permlen=1280 free=1223 templen=27 loglen=192
config disk c1t1d0s2 copy 1 len=1280 state=clean online
config disk c1t2d0s2 copy 1 len=1280 state=clean online
config disk c1t3d0s2 copy 1 len=1280 disabled
config disk c1t4d0s2 copy 1 len=1280 disabled
config disk c1t5d0s2 copy 1 len=1280 disabled
config disk c1t6d0s2 copy 1 len=1280 disabled
log disk c1t1d0s2 copy 1 len=192
log disk c1t2d0s2 copy 1 len=192
log disk c1t3d0s2 copy 1 len=192
log disk c1t4d0s2 copy 1 len=192
log disk c1t5d0s2 copy 1 len=192

This example shows that only targets 1 and 2 contain active (online) copies of the configuration records. If you are a paranoid person, you will probably want to replicate the configuration records to several devices in the disk group. This can be accomplished with the vxedit(1m) utility:

$ vxedit -g oof set nconfig=6 oof

$ vxdg list oof | grep ^config

config: seqno=0.17137 permlen=1280 free=1265 templen=27 loglen=192
config disk c1t1d0s2 copy 1 len=1280 state=clean online
config disk c1t2d0s2 copy 1 len=1280 state=clean online
config disk c1t3d0s2 copy 1 len=1280 state=clean online
config disk c1t4d0s2 copy 1 len=1280 state=clean online
config disk c1t5d0s2 copy 1 len=1280 state=clean online
config disk c1t6d0s2 copy 1 len=1280 state=clean online
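
If you manage a lot of disk groups, a quick loop makes it easy to review where the configuration copies live in each of them. This is just a rough sketch; it assumes the header line produced by a bare “vxdg list” and the “config” lines shown above:

$ vxdg list | awk 'NR > 1 { print $1 }' | while read dg
do
    echo "### ${dg}"
    vxdg list ${dg} | grep "^config"
done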

Since every configuration change has to be written to each copy, it is poor practice to replicate configuration records to all devices in large disk groups. Veritas will use a sensible number of configuration copies by default, so creating additional copies is seldom required. For further details on configuration records, take a look at the Veritas Volume Manager administrator’s guide and the vxintro(1m) manual page.

Finding out how a file system was created

This article was posted by Matty on 2005-07-12 21:45:00 -0400

Part of being an SA involves creating new file systems as databases and applications expand. The creation process usually requires a bit of detective work, since block sizes, inodes, journal size, and a variety of other file system attributes can boost or hamper performance. Whenever I take over a new server, I like to run mkfs with the “-m” (show how a file system was created) option against all of the existing file systems:

$ /usr/sbin/mkfs -F vxfs -m /dev/vx/rdsk/oradg/oravol01
mkfs -F vxfs -o bsize=8192,version=6,inosize=256,logsize=2048,largefiles /dev/vx/rdsk/oradg/oravol01 20971584

$ /usr/sbin/mkfs -F ufs -m /dev/dsk/c0t0d0s0
mkfs -F ufs -o nsect=255,ntrack=16,bsize=8192,fragsize=1024,cgsize=26,free=1,rps=90,nbpi=8154,opt=t,apc=0,gap=0,nrpos=8,maxcontig=16,mtb=n /dev/dsk/c0t0d0s0 193245120

The “-m” option will print the options passed to mkfs at file system creation time. This information can be invaluable for reverse engineering why something was created (or changed) with a specific option. I am not sure if this option is available on other operating systems, but Solaris definitely supports it.
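
If you want to sweep a whole box at once, the mkfs “-m” calls can be wrapped in a loop over /etc/vfstab. This is just a rough sketch that assumes the standard vfstab layout (raw device in field two, file system type in field four) and skips comment lines:

$ awk '$1 !~ /^#/ && ($4 == "ufs" || $4 == "vxfs") { print $2, $4 }' /etc/vfstab |
while read rawdev fstype
do
    echo "### ${rawdev} (${fstype})"
    mkfs -F ${fstype} -m ${rawdev}
done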

Manually synchronizing Solaris meta devices

This article was posted by Matty on 2005-07-03 21:48:00 -0400

While performing routine maintenance today, I discovered that one of my hot spare drives had kicked in to replace a faulted disk drive. Since the synchronization process had only recently started, I decided to shut down the box to replace the faulted drive. Once I booted the box back up, I noticed that the synchronization process didn’t start automatically:

$ metastat d5

d5: RAID
State: Resyncing
Hot spare pool: hsp001
Interlace: 128 blocks
Size: 106085968 blocks (50 GB)
Original device:
Size: 106086528 blocks (50 GB)
Device Start Block Dbase State Reloc Hot Spare
c1t1d0s0 6002 No Okay Yes
c1t2d0s0 4926 No Resyncing Yes c1t6d0s0
c1t3d0s0 4926 No Okay Yes
c1t4d0s0 4926 No Okay Yes

Under normal operation, a “Resync in progress” line would be listed. To start the synchronization process manually, I ran the metasync(1m) command:

$ metasync -r 2048

Once the command was executed, the synchronization process started:

$ metastat d5

d5: RAID
State: Resyncing
Resync in progress: 0.5% done
Hot spare pool: hsp001
Interlace: 128 blocks
Size: 106085968 blocks (50 GB)
Original device:
Size: 106086528 blocks (50 GB)
Device Start Block Dbase State Reloc Hot Spare
c1t1d0s0 6002 No Okay Yes
c1t2d0s0 4926 No Resyncing Yes c1t6d0s0
c1t3d0s0 4926 No Okay Yes
c1t4d0s0 4926 No Okay Yes

Since this is a software RAID5 meta device, the synchronization process (read data and parity, calculate parity, write data, write parity) will take a looooooong time to complete.
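
If you want to keep an eye on the progress without babysitting the terminal, a trivial loop over metastat will do. A rough sketch (the grep pattern matches the “Resync in progress” line shown above):

$ while true
do
    metastat d5 | grep "Resync in progress"
    sleep 300
done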

Finding busy disks with iostat

This article was posted by Matty on 2005-06-26 21:53:00 -0400

The iostat(1M) utility provides several I/O statistics, which can be useful for analyzing I/O workloads and troubleshooting performance problems. When investigating I/O problems, I usually start by checking the number of reads and writes to each device, which are available in iostat’s “r/s” and “w/s” columns:

$ iostat -zxnM 5

extended device statistics
r/s w/s Mr/s Mw/s wait actv wsvc_t asvc_t %w %b device
85.2 22.3 10.6 2.6 7.2 1.4 67.0 13.5 18 89 c0t0d0

Once I know how many reads and writes are being issued, I like to find the number of megabytes read from and written to each device. This information is available in iostat’s “Mr/s” and “Mw/s” columns:

$ iostat -zxnM 5

extended device statistics
r/s w/s Mr/s Mw/s wait actv wsvc_t asvc_t %w %b device
85.2 22.3 10.6 2.6 7.2 1.4 67.0 13.5 18 89 c0t0d0

After reviewing these items, I like to check iostat’s “wait” value to see the I/O queue depth for each device:

$ iostat -zxnM 5

extended device statistics
r/s w/s Mr/s Mw/s wait actv wsvc_t asvc_t %w %b device
85.2 22.3 10.6 2.6 7.2 1.4 67.0 13.5 18 89 c0t0d0
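
When a box has dozens of LUNs, eyeballing all of this gets old quickly, so I sometimes pipe iostat through a quick awk filter to flag devices that cross a busy threshold. This is just a rough sketch that assumes the column layout shown above (%w in field nine, %b in field ten, device name in field eleven) and a 60% busy threshold:

$ iostat -zxnM 5 | awk '$11 != "" && $11 != "device" && $10 + 0 > 60 {
    printf "%s is busy: %s%% busy, %s%% wait\n", $11, $10, $9
}'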

To see how these can be applied to a real problem, I captured the following data from device c0t0d0 a week or two back:

$ iostat -zxnM 5

extended device statistics
r/s w/s Mr/s Mw/s wait actv wsvc_t asvc_t %w %b device
0.2 71.2 0.0 7.7 787.3 2.0 11026.8 28.0 100 100 c0t0d0

Device c0t0d0 was overloaded, had 787 I/O operations waiting to be serviced, and was causing application latency (since the application in question performed lots of reads and writes, and the files were opened O_SYNC). Once iostat returned the above statistics, I used ps(1) to find the processes that were causing the excessive disk activity, and used kill(1) to terminate them!
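
If ps alone doesn’t make the culprit obvious, DTrace’s io provider is another way to tie the disk activity back to processes. A rough sketch, assuming the io:::start probe and the dev_statname member documented for the Solaris 10 io provider; let it run for a bit and press Ctrl-C to dump the counts:

$ dtrace -n 'io:::start {
    /* count block I/O requests by process and device */
    @[execname, args[1]->dev_statname] = count();
}'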

The Solaris iostat utility was used to produce this output. The first iostat line contains averages since the system was booted, and should be ignored.