After reading through the following UFS tuning information:
http://www.solarisinternals.com/si/reading/fs2/fs2.html
I started playing with the UFS “maxcontig” tunable. This value controls the number of file system blocks that will be read or written in a single operation. Each UFS file system contains a maxcontig value, which can be printed with the Solaris “fstyp” command:
$ fstyp -v /dev/md/dsk/d0 |more
ufs
magic 11954 format dynamic time Fri Jan 14 09:47:19 2005
sblkno 16 cblkno 24 iblkno 32 dblkno 832
sbsize 2048 cgsize 8192 cgoffset 128 cgmask 0xfffffff0
ncg 2191 size 116165760 blocks 114377853
bsize 8192 shift 13 mask 0xffffe000
fsize 1024 shift 10 mask 0xfffffc00
frag 8 shift 3 fsbtodb 1
minfree 1% maxbpg 2048 optim time
maxcontig 16 rotdelay 0ms rps 90
csaddr 832 cssize 35840 shift 9 mask 0xfffffe00
ntrak 16 nsect 255 spc 4080 ncyl 56944
cpg 26 bpg 6630 fpg 53040 ipg 6400
nindir 2048 inopb 64 nspf 2
nbfree 7624085 ndir 12806 nifree 13904878 nffree 90216
cgrotor 1454 fmod 0 ronly 0 logbno 1824
version 0
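Since the fstyp output is a stream of whitespace-separated name/value pairs, the fields that matter here (maxcontig and bsize) can be pulled out with a small filter. This is only a sketch: the fs_field function name is made up for illustration, and it assumes the name/value layout shown above. On Solaris you would likely need nawk or /usr/xpg4/bin/awk, since the legacy /usr/bin/awk does not understand -v.

```shell
#!/bin/sh
# fs_field NAME: read "fstyp -v" output on stdin and print the value that
# follows the first occurrence of NAME.
# Hypothetical helper for illustration, not a Solaris utility.
fs_field() {
    awk -v key="$1" '
        # Walk the fields on each line; the value is the token after the name.
        { for (i = 1; i < NF; i++) if ($i == key) { print $(i + 1); exit } }
    '
}

# Example usage on a live system (device path taken from the text above):
#   fstyp -v /dev/md/dsk/d0 | fs_field maxcontig
#   fstyp -v /dev/md/dsk/d0 | fs_field bsize
```

Matching on the exact field name avoids confusing bsize with sbsize, which a plain grep would not.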
To see if maxcontig needs to be increased, you can run “iostat,” and watch the transfer sizes:
$ iostat -zxn 5
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 63.2 0.0 7782.9 52.3 2.0 827.9 31.4 97 99 c0t0d0
0.2 63.2 1.6 7782.9 52.9 2.0 834.1 31.4 98 100 c0t2d0
0.2 62.4 1.6 7782.5 0.2 54.7 3.6 873.1 23 100 d0
0.0 62.4 0.0 7782.5 0.0 54.1 0.0 866.7 0 99 d1
0.2 62.4 1.6 7782.5 0.0 54.7 0.0 873.0 0 100 d2
If we divide the kilobytes written per second (kw/s) by the number of writes per second (w/s), we can
derive the average size of each physical write in kilobytes:
$ bc
scale=2
7782/63
123.52
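The same division can be wrapped in a small helper so it is easy to repeat for each device in the iostat output. This is a minimal sketch; the avg_kb name is hypothetical, and the two arguments are simply the kw/s and w/s columns (or kr/s and r/s for reads).

```shell
#!/bin/sh
# avg_kb KB_PER_SEC OPS_PER_SEC: print the average transfer size in KB.
# Hypothetical helper; prints 0.00 when there were no operations, to avoid
# dividing by zero on an idle device.
avg_kb() {
    awk -v kb="$1" -v ops="$2" 'BEGIN {
        if (ops > 0) printf "%.2f\n", kb / ops
        else         printf "0.00\n"
    }'
}

# kw/s and w/s for c0t0d0 in the iostat sample above
avg_kb 7782.9 63.2
```

Using the unrounded iostat values gives a slightly different figure than the bc session, but it lands in the same ballpark.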
Give or take a few kilobytes, each write is pushing one maxcontig-sized cluster: 16 blocks of 8 KB each (the bsize value above), or 128 KB. If you have sequential workloads, increasing the value of maxcontig may allow your Solaris box to read or write more data at once (reducing the total number of I/O operations). You can adjust the size of maxcontig with the “tunefs” utility:
$ tunefs -a 128 /dev/md/dsk/d0
maximum contiguous block count changes from 16 to 128
This will cause up to 128 file system blocks (128 x 8 KB = 1 MB) to be read or written with each I/O operation. For this value to take effect, you also need to increase the maximum size of a SCSI/SVM I/O operation to match. This is done by adding the following tunables to /etc/system:
set maxphys=1048576
set md:md_maxphys=1048576
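The relationship between maxcontig, the file system block size, and maxphys is plain multiplication, so it is easy to sanity check before touching /etc/system. The sketch below assumes a POSIX shell (for the $(( )) arithmetic); the cluster_bytes and fits_maxphys names are made up for illustration.

```shell
#!/bin/sh
# cluster_bytes MAXCONTIG BSIZE: bytes moved by one full maxcontig-sized cluster.
# Hypothetical helper for illustration.
cluster_bytes() {
    echo $(( $1 * $2 ))
}

# fits_maxphys MAXCONTIG BSIZE MAXPHYS: succeed if one cluster fits in maxphys.
fits_maxphys() {
    [ "$(cluster_bytes "$1" "$2")" -le "$3" ]
}

# maxcontig=128 with bsize=8192, checked against the maxphys value set above
cluster_bytes 128 8192
if fits_maxphys 128 8192 1048576; then
    echo "maxcontig=128 fits within maxphys"
else
    echo "maxphys is too small for maxcontig=128"
fi
```

The block size comes from the bsize field in the fstyp output, so a file system built with a different block size would need a different maxphys for the same maxcontig.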
ALL tunables should be tested on a development/QE box before implementing on important systems. I tried bumping maxcontig to 128 on my Ultra5, and immediately saw corruption on several metadevices. Digging through sunsolve.sun.com, I learned that maxcontig can only be set as high as “16” on IDE devices, and “128” on SCSI devices:
http://sunsolve.sun.com/search/document.do?assetkey=1-26-23429-1&searchclause=maxcontig
Luckily the Ultra5 was a test system, so recovering was relatively straightforward. Test all tunables before you deploy them :)