Using Veritas Volume Manager with RHAS 4.0

I have used Veritas Volume Manager (VxVM) and Veritas File System (VxFS) for as long as I can remember. All of my VxVM and VxFS experience has been on Solaris, so I thought I would install both products on a Linux host to see if anything was different. The Linux installation turned out to be nearly identical to the Solaris installation (the Linux installer installs RPMs rather than SVR4 packages), and the vx* commands are located in the same place on both operating systems. Since I wanted to get my hands dirty and play with VxVM and VxFS on a Linux host, I first ran the vxdisksetup utility to initialize a few devices:

$ /usr/lib/vxvm/bin/vxdisksetup -i hdc
VxVM vxdisksetup ERROR V-5-2-1814 hdc: Invalid disk device for ‘cdsdisk’ format

Gak! After pondering the error for a few minutes, it dawned on me that as of VxVM 4.0 the “cdsdisk” disk format is used by default. The new format allows devices to be transported between different operating systems and hardware architectures (e.g., you can deport a Solaris disk group on a big endian SPARC host, then import and access it* on a little endian x86 Linux host). After sifting through the Veritas support site to see if cdsdisk had any limitations, I came across infodoc 278178. The infodoc states that the cdsdisk format can only be used on SCSI disks, and shows how to use the vxscsi command to see if the cdsdisk format can be used with a device:

$ /usr/lib/vxvm/diag.d/vxscsi -g hdd
Cannot get disk geometry on /dev/vx/rdmp/hdd !

The IDE disk drives I was using with the Linux host don’t fit into the cdsdisk supportability matrix, so I decided to use the “sliced” format, since the devices were used purely for testing:

$ /usr/lib/vxvm/bin/vxdisksetup -fi hdc format=sliced

$ /usr/lib/vxvm/bin/vxdisksetup -fi hdd format=sliced
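With more than a couple of disks, the format choice above can be scripted. The following is only a rough sketch: the helper names pick_vx_format and vx_init_cmd are made up for illustration, and the rule "hd* means IDE, so use sliced" is an assumption based on the device names in this post, not something VxVM provides. The functions print the vxdisksetup command instead of running it, so the output can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: choose a VxVM disk format from the device name.
# Assumption: hd* names are IDE disks, which cannot use the cdsdisk format.
pick_vx_format() {
    case "$1" in
        hd*) echo "sliced" ;;
        *)   echo "cdsdisk" ;;
    esac
}

# Hypothetical helper: print (do not run) the vxdisksetup command for a device,
# using only the invocations shown in this post.
vx_init_cmd() {
    dev=$1
    fmt=$(pick_vx_format "$dev")
    if [ "$fmt" = "sliced" ]; then
        echo "/usr/lib/vxvm/bin/vxdisksetup -fi $dev format=sliced"
    else
        # cdsdisk is the default format as of VxVM 4.0
        echo "/usr/lib/vxvm/bin/vxdisksetup -i $dev"
    fi
}
```

Running `for d in hdc hdd; do vx_init_cmd "$d"; done` prints the two commands shown above.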

Once the disks were initialized, I added them to a disk group, carved up a new volume, and created a VxFS file system on the volume:

$ vxdg init datadg hdc hdd cds=off

$ vxassist -g datadg make vol01 512m layout=concat

$ mkfs -t vxfs -o bsize=8192 /dev/vx/dsk/datadg/vol01

    version 6 layout
    8380416 sectors, 523776 blocks of size 8192, log size 2048 blocks
    largefiles supported

I plan to fire up my Sun multipack this weekend to see if the Solaris to Linux migration works as well as the folks at Veritas say (based on past experiences with the Veritas product suite, I am relatively certain it will work well).

* To deal with endianness issues, you need to use the fscdsconv utility.

Linux udev humor

The udev device management framework is one of the new features that was added to the Linux 2.6 kernel, and allows the /dev namespace to be populated based on hotplug events sent from the kernel to the userspace udevd daemon. While reading through the udev FAQ, I found a good explanation of udev:

Q: How is udev related to devfs?
A: udev works entirely in userspace, using hotplug events the kernel sends
   whenever a device is added or removed from the kernel. Details about
   the devices are exported by the kernel to the sysfs filesystem at /sys.
   All device naming policy, permission control and event handling is done in
   userspace. devfs is operated from within the kernel.

As well as some comedic writing:

Q: But udev will not automatically load a driver if a /dev node is opened
   when it is not present like devfs will do.
A: Right, but Linux is supposed to load a module when a device is discovered
   not to load a module when it's accessed.

Q: Oh come on, pretty please.  It can't be that hard to do.
A: Such a functionality isn't needed on a properly configured system. All
   devices present on the system should generate hotplug events, loading
   the appropriate driver, and udev will notice and create the
   appropriate device node.  If you don't want to keep all drivers for your
   hardware in memory, then use something else to manage your modules
   (scripts, modules.conf, etc.)  This is not a task for udev.

Q: But I love that feature of devfs, please?
A: The devfs approach caused a lot of spurious modprobe attempts as
   programs probed to see if devices were present or not.  Every probe
   attempt created a process to run modprobe, almost all of which were
   spurious.

This made me laugh silly, and I wish more FAQ maintainers used this style of writing. Whoever wrote this, I commend you!

Concert review: Cinderella & Poison

I grew up listening to hard rock, and used to bang my head to the likes of Motley Crue, Cinderella, Guns N’ Roses, Ratt, Warrant, Skid Row and the rest of the bands who made the 1980s special. Most of these bands are still touring in one capacity or another, and I recently had the opportunity to see two of these bands, Cinderella and Poison. Both bands were a huge success in the 1980s, and I was curious to see if they could still jam.

Cinderella was the first band to take the stage, and they began their setlist with what sounded like “Fallin’ Apart at the Seams.” The lead singer (Tom Keifer) was having problems hitting the high notes Cinderella is known for, but the show was amazing nonetheless (Tom told the crowd that his voice was strained, but he chose to go on tour to make all of us Cinderella fans happy! Awesome!). In addition to playing “Fallin’ Apart at the Seams,” the band also played classic hits such as “Don’t Know What You Got,” “Gypsy Road,” “Shake Me,” “Push Push,” “Shelter Me,” “Heartbreak Station,” and “Coming Home.” Even with Tom’s strained voice, the band sounded awesome, and the crowd loved every minute of their performance.

After Cinderella completed their set, the roadies cleared the stage in preparation for Poison. Eventually the lights dimmed, and Bret Michaels, C.C. DeVille, Rikki Rockett and Bobby Dall popped up out of nowhere to perform “Look What the Cat Dragged In.” The guys sounded incredible, and I had a blast watching C.C. DeVille jam on the guitar. The band played most of their hits, including “Unskinny Bop,” “Every Rose Has Its Thorn,” “Your Mama Don’t Dance,” “Nothin’ But a Good Time,” “Talk Dirty to Me,” “Fallen Angel,” and “Something to Believe In.” Each hit sounded just like it did back in the 1980s, and I have to say I enjoyed listening to Poison, even though I was never a huge fan of their music.

Once the show was over and we found our car (we thought we lost it), I met some cool folks and reminisced about the hair band music that made the 1980s the decade of rock. The show was a blast, and I loved every minute of Cinderella’s performance. They have tons of hits, and are well worth seeing live.

Checking devices for bad sectors

I recently had a friend contact me because he was getting an error similar to the following in his Red Hat Linux system log (I didn’t save the error while debugging the problem, so I grabbed this one from the web):

kernel: disk I/O error: dev 08:01, sector 25590410
kernel: SCSI disk error : host 0 channel 0 id 0 lun 0 return code = 28000002

At first glance, I thought the disk drive had failed, and told him to back up all of his data to safe media. Once the data was backed up, I decided to run a full SMART self-test on the disk drive to check the drive’s health:

$ smartctl -t long /dev/hda

smartctl version 5.36 [sparc-sun-solaris2.10] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 84 minutes for test to complete.
Test will complete after Sun Aug 27 19:41:01 2006

Use smartctl -X to abort test.
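Since the reallocated sector count comes up again below, here is a small sketch for pulling it out of the attribute table that `smartctl -A` prints. The function name realloc_count is made up, and the sketch assumes the usual ten-column attribute layout in which the raw value is the last field; attribute names and layout can vary between drives and smartmontools versions.

```shell
#!/bin/sh
# Hypothetical helper: read `smartctl -A` output on stdin and print the raw
# value of the Reallocated_Sector_Ct attribute.
# Assumption: the attribute row has ten columns with RAW_VALUE in column 10.
realloc_count() {
    awk '$2 == "Reallocated_Sector_Ct" { print $10 }'
}

# Usage (as root):
#   smartctl -A /dev/hda | realloc_count
```

Saving the value before and after a reboot makes it easy to spot a change like the one described below.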

The SMART long test completed successfully, but dd was failing when attempting to read sector 25590410 (we weren’t using the continue on error option). Since all modern disk drive controllers contain logic to remap faulty sectors when they are detected, and the number of reallocated sectors reported by smartctl was well below the manufacturer’s failure threshold, I wondered if the sector was “stuck.” To test my theory, I booted from a Linux CD, and ran the Linux badblocks utility on the disk (I didn’t save the badblocks output from his drive, so the following is a sample from another machine):

$ badblocks -sv /dev/hda

Checking blocks 0 to 8192016
Checking for bad blocks (read-only test):    222400/  8192016

Badblocks completed successfully, and an fsck of the file system reported that the file system was clean (We also used the ext3 file system debugger to see if a file was using the block. It wasn’t, so my theory is that the errors occurred when a new file was being created). Next we rebooted the system, and the number of reallocated sectors reported by smartmontools had increased by one. This completely surprised me, and I am still confused why the disk controller didn’t remap the sector when we were booted from the disk drive. I had fun debugging this problem, and learning about how IDE disk drives work.
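To ask the file system debugger about a block, the sector number from the kernel message first has to be turned into a file system block number. The sketch below shows that arithmetic. The function name sector_to_fs_block is made up, and the partition start (63) and block size (4096) in the example are assumptions for illustration rather than values from my friend's system. It also assumes the reported sector is an absolute offset on the disk; if the kernel reports it relative to the partition, pass 0 as the start.

```shell
#!/bin/sh
# Hypothetical helper: convert a 512-byte sector number into a file system
# block number.
#   $1 = sector number from the error message
#   $2 = first sector of the partition (0 if the sector is partition relative)
#   $3 = file system block size in bytes (a multiple of 512)
sector_to_fs_block() {
    sectors_per_block=$(( $3 / 512 ))
    echo $(( ($1 - $2) / sectors_per_block ))
}

# Example with assumed values:
#   sector_to_fs_block 25590410 63 4096
# The result can then be handed to the ext2/ext3 debugger, e.g.
#   debugfs -R "icheck <block>" /dev/hda1
```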

Concert review: Candlebox

I grew up loving rock and roll, and spent a fair amount of time in college listening to music (especially when I was writing code for class). One band that got a fair amount of play time on my home stereo was Candlebox. They had a cool relaxing sound, and could jam with the best of them. So when Ticketmaster informed me that they would be in town playing a show at a local venue, I decided to get tickets. Once the opening acts left the stage, Candlebox played an audio clip, and then blasted into a version of “Arrow.” The band looked like they were having fun on stage, and proceeded to play all of their hits, including “Far Behind,” “You,” “Cover Me,” “Rain,” “Change,” and “Simple Lessons.” The show was incredible, and being four rows back gave me a chance to watch the band jam close up. If you get a chance to see them, check ’em out!

Getting E-mail notifications when MD devices fail

I use the MD (multiple device) logical volume manager to mirror the boot devices on the Linux servers I support. When I first started using MD, the mdadm utility was not available to manage and monitor MD devices. Since disk failures are relatively common in large shops, I used the shell script from my SysAdmin article Monitoring and Managing Linux Software RAID to send E-mail when a device entered the failed state. While reading through the mdadm(8) manual page, I came across the “--monitor” and “--mail” options. These options can be used to monitor the operational state of the MD devices in a server, and generate E-mail notifications if a problem is detected. E-mail notification support can be enabled by running mdadm with the “--monitor” option to monitor devices, the “--daemonise” option to create a daemon process, and the “--mail” option to generate E-mail:

$ /sbin/mdadm --monitor --scan --daemonise --mail=root@localhost

Once mdadm is daemonized, an E-mail similar to the following will be sent each time a failure is detected:

From: mdadm monitoring 
To: root@localhost.localdomain
Subject: Fail event on /dev/md1:biscuit

This is an automatically generated mail message from mdadm
running on biscuit

A Fail event had been detected on md device /dev/md1.

Faithfully yours, etc.

I digs me some mdadm!
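For comparison, a script-based check of the kind I used before mdadm came along can be as simple as scanning /proc/mdstat for members flagged with (F). The following is a minimal sketch and not the script from the article; the function name failed_md_members is made up, and it assumes the common /proc/mdstat layout in which each array line starts with its md name and failed members carry an (F) suffix.

```shell
#!/bin/sh
# Hypothetical helper: read /proc/mdstat on stdin and print one
# "<array> <member>" line for every member marked as failed.
# Assumption: array lines begin with "md" and failed members end in "(F)".
failed_md_members() {
    awk '/^md/ {
        for (i = 3; i <= NF; i++)
            if ($i ~ /\(F\)/) {
                sub(/\[.*/, "", $i)   # strip "[n](F)" to leave the device name
                print $1, $i
            }
    }'
}

# Usage:
#   failed_md_members < /proc/mdstat
```

Any output from the function could be piped to a mail command from cron, which is roughly what the "--monitor" and "--mail" options now handle for you.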