Blog O' Matty


Selectively tracing Solaris system calls

This article was posted by Matty on 2004-12-01 00:26:00 -0400

I was debugging an I/O problem last week, and needed to see how an application was using a UFS file system on a Solaris 9 server. Solaris comes with the “truss” utility, which can be used to trace library and system calls. When truss is invoked with the “-t” option, only the system calls passed to “-t” are traced:

$ truss -leaf -t open,close,write,lseek -p 7044

7044/1: psargs: /usr/local/bin/blah -ubpid 17595 -web -wtbhostname w4
7044/1: open("log/httpd_log", O_WRONLY) = 25
7044/1: lseek(25, 0, SEEK_END) = 0x04B33A05
7044/1: write(25, " SOME_ TEXT ".., 133) = 133
7044/1: close(25) = 0

This example is taken from an application I manage (I didn’t write it). The application opens a log file, seeks to the end of the file, writes 133 bytes of data, and then closes the log file. It does this hundreds of times per minute, which is extremely inefficient. To fix this problem, the application needs to be modified to open the log file once and close it once, rather than issuing an open() and close() around every write. This would also allow writes to the log file to be clustered (if the appropriate tunables and open() options are set correctly).
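If you just want to quantify how often this open()/write()/close() cycle occurs, truss’s “-c” option counts system calls instead of printing each one (press CTRL-C to detach from the process and print the summary):

$ truss -c -t open,close,write,lseek -p 7044

The summary lists the number of calls to, and the time spent in, each system call, which makes it easy to see how many redundant open() and close() calls the application issues.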

Checking for OpenLDAP unindexed searches

This article was posted by Matty on 2004-11-18 00:29:00 -0400

I was checking my OpenLDAP log files today, and noticed that the “cn” attribute wasn’t indexed. I found this by searching for the string “index_param failed” in the log files:

$ grep "index_param failed" /var/log/openldap

Dec 25 13:37:19 winnie slapd[730]: [ID 635189 local4.debug] <= bdb_substring_candidates: (cn) index_param failed (18)
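The “cn” attribute may not be the only one missing an index. A quick pipeline like the following (a sketch that assumes the log format shown above) summarizes every attribute that is triggering the error, with the noisiest attributes at the top:

$ grep "index_param failed" /var/log/openldap | sed 's/.*(\(.*\)) index_param failed.*/\1/' | sort | uniq -c | sort -rn

Each output line contains a count and an attribute name, so the attributes that would benefit most from an index float to the top.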

To fix this problem, I added an “index” statement to my slapd.conf:

index cn,mail,sn eq,pres,sub

Once the index was added, I rebuilt the indexes with the “slapindex” utility (slapd should be stopped while the indexes are rebuilt):

$ slapindex -f /usr/local/openldap-common/etc/slapd.conf -b "dc=synackfin,dc=com"
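To verify that the new indexes are actually being used, you can issue a substring search against one of the indexed attributes and check that no new “index_param failed” entries show up in the log (the search filter below is made up for illustration):

$ ldapsearch -x -b "dc=synackfin,dc=com" "(cn=*matty*)" > /dev/null
$ tail /var/log/openldap | grep "index_param failed"

If the grep comes back empty, slapd satisfied the substring search from the new index.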

The OpenLDAP documentation has more info in case you’re interested:

http://www.openldap.org/doc/admin22/

Bash arrays

This article was posted by Matty on 2004-11-14 00:30:00 -0400

I have been trying to get a better grasp of some advanced bash concepts, and have been reading through the bash reference manual. I am pretty familiar with C and Perl arrays, but have never had a need to use arrays in a bash script. The syntax for bash arrays is similar to Perl’s:

array[1]=12
echo ${array[1]}

This assigns the value 12 to the element at index 1 (bash arrays are zero-indexed, so this is actually the second slot in the array). Since bash variables are untyped, we can assign a string to another element of the same array:

array[2]="my string"
echo ${array[2]}

This assigns the string “my string” to the element at index 2. Useful stuff!
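Arrays become even more useful once you combine them with bash’s other expansion operators. Here is a small sketch (the variable names are made up) that assigns several elements at once, checks the array’s size, and iterates over its contents:

#!/bin/bash

# Assign three elements in one shot (indexes 0, 1 and 2).
fruits=(apple banana cherry)

# Append a fourth element by index; the quotes preserve the space.
fruits[3]="dragon fruit"

# ${#fruits[@]} expands to the number of elements (4 here).
echo "The array holds ${#fruits[@]} elements"

# "${fruits[@]}" expands to one word per element.
for fruit in "${fruits[@]}"; do
    echo "${fruit}"
done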

Solaris entropy statistics

This article was posted by Matty on 2004-11-03 00:38:00 -0400

I exchanged an email or two with Andy Tucker regarding Solaris 9 entropy pools, and found out that entropy statistics are available through the “rnd_stats” dcmd in mdb (the modular debugger):

$ uname -a
SunOS winnie 5.9 Generic_117171-14 sun4u sparc SUNW,Ultra-5_10

$ mdb -k

Loading modules: [ unix krtld genunix ip lofs nfs random ptm ]

> ::rnd_stats
Random number generator statistics:
8192 bits of entropy estimate
0 bytes generated for /dev/random
5998456 bytes generated for /dev/urandom
2277764 bits of entropy added to the pool
94006 bits of entropy extracted from the pool
4849216 bytes added to the random pool
240 bytes extracted from the random pool

With Solaris 10, you can use the “swrand_stats” and “rnd_stats” dcmds to get entropy statistics:

$ uname -a
SunOS sparky 5.10 s10_69 i86pc i386 i86pc

$ mdb -k

Loading modules: [ unix krtld genunix specfs dtrace ufs ip sctp uhci usba nca random lofs sppp nfs crypto ptm ]

> ::swrand_stats
Software-based Random number generator statistics:
8192 bits of entropy estimate
861095 bits of entropy added to the pool
8480 bits of entropy extracted from the pool
2318888 bytes added to the random pool
1060 bytes extracted from the random pool

> ::rnd_stats
Random number device statistics:
0 bytes generated for /dev/random
0 bytes read from /dev/random cache
36 bytes generated for /dev/urandom

I wish there were a way to tell if an application blocked because of a depleted entropy pool in Solaris 9 (DTrace may solve this problem in Solaris 10).
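On a Solaris 10 host, a rough DTrace sketch like the following can flag reads that block for a long time (the one-second threshold is arbitrary, and you would still need a tool like pfiles to confirm that the slow file descriptor actually points at /dev/random):

# dtrace -n '
syscall::read:entry
{
        /* Record when the read started and which fd it used. */
        self->ts = timestamp;
        self->fd = arg0;
}

syscall::read:return
/self->ts && timestamp - self->ts > 1000000000/
{
        /* Reads that took longer than one second were probably blocked. */
        printf("%s blocked %d ms in read(fd %d)",
            execname, (timestamp - self->ts) / 1000000, self->fd);
}

syscall::read:return
/self->ts/
{
        self->ts = 0;
        self->fd = 0;
}'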

Solaris's maxcontig setting

This article was posted by Matty on 2004-10-29 00:32:00 -0400

After reading through the following UFS tuning information:

http://www.solarisinternals.com/si/reading/fs2/fs2.html

I started playing with the UFS “maxcontig” tunable. This value controls the maximum number of file system blocks that will be combined into a single read or write operation. Each UFS file system contains a maxcontig value, which can be printed with the Solaris “fstyp” command:

$ fstyp -v /dev/md/dsk/d0 |more

ufs
magic 11954 format dynamic time Fri Jan 14 09:47:19 2005
sblkno 16 cblkno 24 iblkno 32 dblkno 832
sbsize 2048 cgsize 8192 cgoffset 128 cgmask 0xfffffff0
ncg 2191 size 116165760 blocks 114377853
bsize 8192 shift 13 mask 0xffffe000
fsize 1024 shift 10 mask 0xfffffc00
frag 8 shift 3 fsbtodb 1
minfree 1% maxbpg 2048 optim time
maxcontig 16 rotdelay 0ms rps 90
csaddr 832 cssize 35840 shift 9 mask 0xfffffe00
ntrak 16 nsect 255 spc 4080 ncyl 56944
cpg 26 bpg 6630 fpg 53040 ipg 6400
nindir 2048 inopb 64 nspf 2
nbfree 7624085 ndir 12806 nifree 13904878 nffree 90216
cgrotor 1454 fmod 0 ronly 0 logbno 1824
version 0
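Since fstyp prints a few dozen values, grep makes it easy to pull out just the line we care about:

$ fstyp -v /dev/md/dsk/d0 | grep maxcontig
maxcontig 16 rotdelay 0ms rps 90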

To see if maxcontig needs to be increased, you can run “iostat” and watch the transfer sizes:

$ iostat -zxn 5

extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 63.2 0.0 7782.9 52.3 2.0 827.9 31.4 97 99 c0t0d0
0.2 63.2 1.6 7782.9 52.9 2.0 834.1 31.4 98 100 c0t2d0
0.2 62.4 1.6 7782.5 0.2 54.7 3.6 873.1 23 100 d0
0.0 62.4 0.0 7782.5 0.0 54.1 0.0 866.7 0 99 d1
0.2 62.4 1.6 7782.5 0.0 54.7 0.0 873.0 0 100 d2

If we divide the number of kilobytes written per second (kw/s) by the number of writes per second (w/s), we can derive the average size of each physical write in kilobytes:

$ bc

scale=2
7782/63
123.52

Give or take a few kilobytes, we are pushing maxcontig worth of data (16 blocks x 8192 bytes = 128k) during each write operation. If you have sequential workloads, increasing the value of maxcontig may allow your Solaris box to read or write more data at once (reducing the total number of I/O operations). You can adjust the size of maxcontig with the “tunefs” utility:

$ tunefs -a 128 /dev/md/dsk/d0
maximum contiguous block count changes from 16 to 128

This allows up to 128 file system blocks (128 x 8192 bytes = 1 MB) to be read or written with each I/O operation. In order for this value to be effective, you also need to increase the maximum size of a SCSI/SVM I/O operation. This is done by adding the following tunables to /etc/system:

set maxphys=1048576
set md:md_maxphys=1048576
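After a reboot, you can verify that the new maxphys value took effect with mdb (a sketch; on a 64-bit kernel you may need the “/E” format instead of “/D” to read the full value):

$ echo "maxphys/D" | mdb -k
maxphys:
maxphys:        1048576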

ALL tunables should be tested on a development/QE box before they are rolled out to important systems. I tried bumping maxcontig to 128 on my Ultra5, and immediately saw corruption on several metadevices. Digging through sunsolve.sun.com, I learned that maxcontig can only be set as high as “16” on IDE devices, and “128” on SCSI devices:

http://sunsolve.sun.com/search/document.do?assetkey=1-26-23429-1&searchclause=maxcontig

Luckily the Ultra5 was a test system, so recovering was relatively straightforward. Test all tunables before you deploy them :)