I manage several V40Zs running Solaris 10, and these servers utilize the built-in hardware RAID controller. Since the physical spindles are masked off from the operating system, using a tool like smartmontools to check disk health is not an option. Luckily Solaris ships with the raidctl utility, which provides insight into the status of both the controller and the disks that sit behind that controller:
$ raidctl
RAID        Volume  RAID            RAID            Disk
Volume      Type    Status          Disk            Status
------------------------------------------------------
c1t0d0      IM      OK              c1t0d0          OK
                                    c1t1d0          OK
Since raidctl will display a disk fault when a drive fails, I run a shell wrapper from cron every fifteen minutes to check the RAID controller status. If the script detects a problem, it will send an email and generate a syslog entry to let folks know a problem exists. Viva la hardware RAID!
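For the curious, the wrapper boils down to something along the lines of the sketch below (the recipient address, script path and crontab schedule are placeholders rather than a copy of the script I actually run):

#!/bin/sh
# check_raid.sh -- cron wrapper that checks raidctl for unhealthy volumes or disks.
PATH=/usr/sbin:/usr/bin; export PATH

STATUS=`raidctl 2>&1`

# Complain if raidctl reports a volume or member disk as DEGRADED or FAILED.
if echo "${STATUS}" | egrep "DEGRADED|FAILED" > /dev/null 2>&1; then
    logger -p daemon.err "raidctl reports a RAID fault on `hostname`"
    echo "${STATUS}" | mailx -s "RAID fault on `hostname`" admin@example.com
fi

The script gets kicked off every fifteen minutes with a crontab entry similar to this:

0,15,30,45 * * * * /usr/local/bin/check_raid.sh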
I had an application go nuts a week or two ago, and it filled up /tmp on one of my Solaris 10 hosts. Since /tmp is an in-memory file system, you can only imagine the chaos this caused. :( To ensure that this never happens again, I modified the tmpfs entry in /etc/vfstab to limit tmpfs to 1GB in size:
$ grep ^swap /etc/vfstab
swap - /tmp tmpfs - yes size=1024m
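The new cap takes effect the next time /tmp is mounted (a reboot in my case), and a quick df confirms the limit is in place (the usage numbers below are illustrative):

$ df -h /tmp
Filesystem             size   used  avail capacity  Mounted on
swap                   1.0G    40K   1.0G     1%    /tmp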
That will teach that pesky application. :)
While reading through the VxFS administrator's guide last week, I came across a cool mount option that can be used to zero out file system blocks prior to use:
“In environments where performance is more important than absolute data integrity, the preceding situation is not of great concern. However, for environments where data integrity is critical, the VxFS file system provides a mount -o blkclear option that guarantees that uninitialized data does not appear in a file.”
This is pretty cool, and a useful feature for environments that are super concerned about data integrity.
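For example, block clearing could be enabled at mount time with something like the following (the disk group, volume and mount point are placeholders):

$ mount -F vxfs -o blkclear /dev/vx/dsk/oradg/oravol01 /u01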
One cool feature that is built into VxFS is the ability to preallocate files sequentially on disk. This capability can benefit sequential workloads, and will typically result in higher throughput since disk seek times are minimized (LBA addressing, disk drive defect management and storage array abstractions can sometimes obscure this, so the benefit may not always be realized).
To use the VxFS preallocation features, a file first needs to be created:
$ dd if=/dev/zero of=oradata01.dbf count=2097152
2097152+0 records in
2097152+0 records out
In this example, I created a 1GB file (2097152 blocks at 512 bytes per block gives us 1GB) named oradata01.dbf, and double-checked that it was 1GB by running ls with the “-h” option:
$ ls -lh
total 3.1G
-rw-r--r--   1 root root 1.0G Aug 25 09:06 oradata01.dbf
After a file of the correct size has been allocated, the setext utility can be used to reserve blocks for that file, and to create an extent that matches the number of blocks allocated to the file:
$ setext -r 2097152 -e 2097152 oradata01.dbf
To verify the settings that were assigned to the file, the getext utility can be used:
$ getext oradata01.dbf
oradata01.dbf: Bsize 1024 Reserve 2097152 Extent Size 2097152
This is an awesome feature, and yet another reason why VxFS is one of the best file systems available today!
Veritas Cluster Server stores custom agents and its configuration data as a series of files in the /etc, /etc/VRTSvcs/conf/config and /opt/VRTSvcs/bin directories. Since these files are the lifeblood of the cluster engine, it is important to back them up to ensure the cluster can be recovered should disaster hit. VCS comes with the hasnap utility to simplify cluster configuration backups, and when run with the “-backup,” “-n,” “-f <filename>,” and “-m <description>” options, a point-in-time snapshot of the cluster configuration will be written to the file passed to the “-f” option:
$ hasnap -backup -f clusterbackup.zip -n -m "Backup from March 25th 2007"
Starting Configuration Backup for Cluster foo
Dumping the configuration...
Registering snapshot "foo-2006.08.25-1156511358610"
Contacting host lnode1...
Error connecting to the remote host "lnode1"
Starting backup of files on host lnode2
"/etc/VRTSvcs/conf/config/types.cf" ----> 1.0
"/etc/VRTSvcs/conf/config/main.cf" ----> 1.0
"/etc/VRTSvcs/conf/config/vcsApacheTypes.cf" ----> 1.0
"/etc/llthosts" ----> 1.0
"/etc/gabtab" ----> 1.0
"/etc/llttab" ----> 1.0
"/opt/VRTSvcs/bin/vcsenv" ----> 1.0
"/opt/VRTSvcs/bin/LVMVolumeGroup/monitor" ----> 1.0
"/opt/VRTSvcs/bin/LVMVolumeGroup/offline" ----> 1.0
"/opt/VRTSvcs/bin/LVMVolumeGroup/online" ----> 1.0
"/opt/VRTSvcs/bin/LVMVolumeGroup/clean" ----> 1.0
"/opt/VRTSvcs/bin/ScriptAgent" ----> 1.0
"/opt/VRTSvcs/bin/LVMVolumeGroup/LVMVolumeGroup.xml" ----> 1.0
"/opt/VRTSvcs/bin/RVGSnapshot/fdsched" ----> 1.0
"/opt/VRTSvcs/bin/RVGSnapshot/monitor" ----> 1.0
"/opt/VRTSvcs/bin/RVGSnapshot/fdsetup.vxg" ----> 1.0
"/opt/VRTSvcs/bin/RVGSnapshot/open" ----> 1.0
"/opt/VRTSvcs/bin/ScriptAgent" ----> 1.0
"/opt/VRTSvcs/bin/RVGSnapshot/RVGSnapshotAgent.pm" ----> 1.0
"/opt/VRTSvcs/bin/RVGSnapshot/RVGSnapshot.xml" ----> 1.0
"/opt/VRTSvcs/bin/RVGSnapshot/offline" ----> 1.0
"/opt/VRTSvcs/bin/RVGSnapshot/online" ----> 1.0
"/opt/VRTSvcs/bin/RVGSnapshot/attr_changed" ----> 1.0
"/opt/VRTSvcs/bin/RVGSnapshot/clean" ----> 1.0
"/opt/VRTSvcs/bin/RVGPrimary/monitor" ----> 1.0
"/opt/VRTSvcs/bin/RVGPrimary/open" ----> 1.0
"/opt/VRTSvcs/bin/RVGPrimary/RVGPrimary.xml" ----> 1.0
"/opt/VRTSvcs/bin/RVGPrimary/offline" ----> 1.0
"/opt/VRTSvcs/bin/RVGPrimary/online" ----> 1.0
"/opt/VRTSvcs/bin/RVGPrimary/clean" ----> 1.0
"/opt/VRTSvcs/bin/ScriptAgent" ----> 1.0
"/opt/VRTSvcs/bin/RVGPrimary/actions/fbsync" ----> 1.0
"/opt/VRTSvcs/bin/triggers/violation" ----> 1.0
"/opt/VRTSvcs/bin/CampusCluster/monitor" ----> 1.0
"/opt/VRTSvcs/bin/CampusCluster/close" ----> 1.0
"/opt/VRTSvcs/bin/ScriptAgent" ----> 1.0
"/opt/VRTSvcs/bin/CampusCluster/open" ----> 1.0
"/opt/VRTSvcs/bin/CampusCluster/CampusCluster.xml" ----> 1.0
"/opt/VRTSvcs/bin/RVG/monitor" ----> 1.0
"/opt/VRTSvcs/bin/RVG/info" ----> 1.0
"/opt/VRTSvcs/bin/ScriptAgent" ----> 1.0
"/opt/VRTSvcs/bin/RVG/RVG.xml" ----> 1.0
"/opt/VRTSvcs/bin/RVG/offline" ----> 1.0
"/opt/VRTSvcs/bin/RVG/online" ----> 1.0
"/opt/VRTSvcs/bin/RVG/clean" ----> 1.0
"/opt/VRTSvcs/bin/internal_triggers/cpuusage" ----> 1.0
Backup of files on host lnode2 complete
Backup succeeded partially
To check the contents of the snapshot, the unzip utility can be run with the “-t” option:
$ unzip -t clusterbackup.zip |more
Archive: clusterbackup.zip
testing: /cat_vcs.zip OK
testing: /categorylist.xml.zip OK
testing: _repository__data/vcs/foo/lnode2/etc/VRTSvcs/conf/config/types.cf.zip OK
testing: _repository__data/vcs/foo/lnode2/etc/VRTSvcs/conf/config/main.cf.zip OK
testing: _repository__data/vcs/foo/lnode2/etc/VRTSvcs/conf/config/vcsApacheTypes.cf.zip OK
testing: _repository__data/vcs/foo/lnode2/etc/llthosts.zip OK
testing: _repository__data/vcs/foo/lnode2/etc/gabtab.zip OK
testing: _repository__data/vcs/foo/lnode2/etc/llttab.zip OK
testing: _repository__data/vcs/foo/lnode2/opt/VRTSvcs/bin/vcsenv.zip OK
testing: _repository__data/vcs/foo/lnode2/opt/VRTSvcs/bin/LVMVolumeGroup/monitor.zip OK
testing: _repository__data/vcs/foo/lnode2/opt/VRTSvcs/bin/LVMVolumeGroup/offline.zip OK
testing: _repository__data/vcs/foo/lnode2/opt/VRTSvcs/bin/LVMVolumeGroup/online.zip OK
testing: _repository__data/vcs/foo/lnode2/opt/VRTSvcs/bin/LVMVolumeGroup/clean.zip OK
testing: _repository__data/vcs/foo/lnode2/opt/VRTSvcs/bin/LVMVolumeGroup/LVMVolumeGroupAgent.zip OK
testing: _repository__data/vcs/foo/lnode2/opt/VRTSvcs/bin/LVMVolumeGroup/LVMVolumeGroup.xml.zip OK
testing: _repository__data/vcs/foo/lnode2/opt/VRTSvcs/bin/RVGSnapshot/fdsched.zip OK
testing: _repository__data/vcs/foo/lnode2/opt/VRTSvcs/bin/RVGSnapshot/monitor.zip OK
......
Since parts of the cluster configuration can reside in memory and not on disk, it is a good idea to run “haconf -dump -makero” prior to running hasnap. This will ensure that the current configuration is being backed up, and will allow hasnap “-restore” to restore the correct configuration if disaster hits.
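Putting the two together, a nightly backup could be as simple as the following pair of commands (the snapshot file name and description are just examples):

$ haconf -dump -makero
$ hasnap -backup -f clusterbackup.zip -n -m "Nightly VCS configuration backup"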