I have used Veritas Volume Manager (VxVM) and Veritas File System (VxFS) for as long as I can remember. All of my VxVM and VxFS experience has been on Solaris, so I thought I would install both products on a Linux host to see if anything was different. The Linux installation turned out to be nearly identical to the Solaris installation (the Linux installer installs RPMs rather than SVR4 packages), and the vx* commands are located in the same place on both operating systems. Since I wanted to get my hands dirty and play with VxVM and VxFS on a Linux host, I first ran the vxdisksetup utility to initialize a few devices:

$ /usr/lib/vxvm/bin/vxdisksetup -i hdc
VxVM vxdisksetup ERROR V-5-2-1814 hdc: Invalid disk device for ‘cdsdisk’ format

Gak! After pondering the error for a few minutes, it dawned on me that as of VxVM 4.0 the “cdsdisk” disk format is used by default. The new format allows devices to be transported between different operating systems and hardware architectures (e.g., you can deport a Solaris disk group on a big-endian SPARC host, then import and access it* on a little-endian x86 Linux host). After sifting through the Veritas support site to see if cdsdisk had any limitations, I came across infodoc 278178. The infodoc states that the cdsdisk format can only be used on SCSI disks, and shows how to use the vxscsi command to see if the cdsdisk format can be used with a device:

$ /usr/lib/vxvm/diag.d/vxscsi -g hdd
Cannot get disk geometry on /dev/vx/rdmp/hdd !

The IDE disk drives I was using with the Linux host don’t fit into the cdsdisk supportability matrix, so I decided to use the “sliced” format, since the devices were used purely for testing:

$ /usr/lib/vxvm/bin/vxdisksetup -fi hdc format=sliced

$ /usr/lib/vxvm/bin/vxdisksetup -fi hdd format=sliced
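The format choice above can be sketched as a small helper. This is only a rough illustration: the function names are made up, and the rule that hd* names are IDE disks is a naming heuristic for Linux hosts of this era, not something VxVM itself reports. The helper prints a vxdisksetup command rather than running it, and leaves out the -f flag used in the commands above.

```shell
# Heuristic only: treat Linux hd* device names as IDE disks (sliced format)
# and everything else as a candidate for the default cdsdisk format.
vx_format_for() {
    case "$1" in
        hd*) echo "sliced" ;;
        *)   echo "cdsdisk" ;;
    esac
}

# Print (do not run) the matching vxdisksetup command for a device.
vx_init_cmd() {
    echo "/usr/lib/vxvm/bin/vxdisksetup -i $1 format=$(vx_format_for "$1")"
}
```

Running vx_init_cmd hdc prints a command with format=sliced, which can be reviewed before it is executed by hand.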

Once the disks were initialized, I added them to a disk group, carved up a new volume, and created a VxFS file system on the volume:

$ vxdg init datadg hdc hdd cds=off

$ vxassist -g datadg make vol01 512m layout=concat

$ mkfs -t vxfs -o bsize=8192 /dev/vx/dsk/datadg/vol01

    version 6 layout
    8380416 sectors, 523776 blocks of size 8192, log size 2048 blocks
    largefiles supported

I plan to fire up my Sun multipack this weekend to see if the Solaris to Linux migration works as well as the folks at Veritas say (based on past experiences with the Veritas product suite, I am relatively certain it will work well).

* To deal with endianness issues, you need to use the fscdsconv utility.

Posted by matty, filed under Veritas Volume Manager. Date: August 30, 2006, 12:00 am | 4 Comments »

29  Aug
Linux udev humor

The udev device management framework is one of the new features that were added to the Linux 2.6 kernel, and allows the /dev namespace to be populated based on hotplug events sent from the kernel to the userspace udevd daemon. While reading through the udev FAQ, I found a good explanation of udev:

Q: How is udev related to devfs?
A: udev works entirely in userspace, using hotplug events the kernel sends
   whenever a device is added or removed from the kernel. Details about
   the devices are exported by the kernel to the sysfs filesystem at /sys
   All device naming policy permission control and event handling is done in
   userspace. devfs is operated from within the kernel.

As well as some comedic writing:

Q: But udev will not automatically load a driver if a /dev node is opened
   when it is not present like devfs will do.
A: Right, but Linux is supposed to load a module when a device is discovered
   not to load a module when it's accessed.

Q: Oh come on, pretty please.  It can't be that hard to do.
A: Such a functionality isn't needed on a properly configured system. All
   devices present on the system should generate hotplug events, loading
   the appropriate driver, and udev will notice and create the
   appropriate device node.  If you don't want to keep all drivers for your
   hardware in memory, then use something else to manage your modules
   (scripts, modules.conf, etc.)  This is not a task for udev.

Q: But I love that feature of devfs, please?
A: The devfs approach caused a lot of spurious modprobe attempts as
   programs probed to see if devices were present or not.  Every probe
   attempt created a process to run modprobe, almost all of which were
   spurious.

This made me laugh silly, and I wish more FAQ maintainers used this style of writing. Whoever wrote this, I commend you!

Posted by matty, filed under Linux Kernel. Date: August 29, 2006, 11:27 pm | 1 Comment »

I grew up listening to hard rock, and used to bang my head to the likes of Motley Crue, Cinderella, Guns N’ Roses, Ratt, Warrant, Skid Row and the rest of the bands who made the 1980s special. Most of these bands are still touring in one capacity or another, and I recently had the opportunity to see two of these bands, Cinderella and Poison. Both bands were a huge success in the 1980s, and I was curious to see if they could still jam.

Cinderella was the first band to take the stage, and they began their setlist with what sounded like “Fallin’ Apart at the Seams.” The lead singer (Tom Keifer) was having problems hitting the high notes Cinderella is known for, but the show was amazing nonetheless (Tom Keifer told the crowd that his voice was strained, but he chose to go on tour to make all of us Cinderella fans happy! Awesome!). In addition to playing “Fallin’ Apart at the Seams,” the band also played classic hits such as “Don’t Know What You Got,” “Gypsy Road,” “Shake Me,” “Push Push,” “Shelter Me,” “Heartbreak Station,” and “Coming Home.” Even with Tom’s strained voice, the band sounded awesome, and the crowd loved every minute of their performance.

After Cinderella completed their set, the roadies cleared the stage in preparation for Poison. Eventually the lights dimmed, and Bret Michaels, CC Deville, Rikki Rockett and Bobby Dall popped up out of nowhere to perform “Look What the Cat Dragged In.” The guys sounded incredible, and I had a blast watching CC Deville jam on the guitar. The band played most of their hits, including “Unskinny Bop,” “Every Rose Has Its Thorn,” “Your Mama Don’t Dance,” “Nothin’ But a Good Time,” “Talk Dirty to Me,” “Fallen Angel,” and “Something To Believe In.” Each hit sounded just like it did back in the 1980s, and I have to say I enjoyed listening to Poison (I was never a huge fan of their music).

Once the show was over and we found our car (we thought we lost it), I met some cool folks and reminisced about the hair band music that made the 1980s the decade of rock. The show was a blast, and I would have to say I loved every minute of Cinderella’s performance. They have tons of hits, and are well worth seeing live.

Posted by matty, filed under Music. Date: August 27, 2006, 11:41 pm | 3 Comments »

I recently had a friend contact me because he was getting an error similar to the following in his Red Hat Linux system log (I didn’t save the error while debugging the problem, so I grabbed this one from the web):

kernel: disk I/O error: dev 08:01, sector 25590410
kernel: SCSI disk error : host 0 channel 0 id 0 lun 0 return code = 28000002

At first glance, I thought the disk drive had failed, and told him to back up all of his data to safe media. Once the data was backed up, I decided to run a full SMART self test on the disk drive to check the drive’s health:

$ smartctl -t long /dev/hda

smartctl version 5.36 [sparc-sun-solaris2.10] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 84 minutes for test to complete.
Test will complete after Sun Aug 27 19:41:01 2006

Use smartctl -X to abort test.

The SMART long test completed successfully, but dd was failing when attempting to read sector 25590410 (we weren’t using the continue on error option). Since all modern disk drive controllers contain logic to remap faulty sectors when they are detected, and the number of reallocated sectors as reported by smartctl was well below the manufacturer’s failure threshold, I wondered if the sector was “stuck.” To test my theory, I booted from a Linux CD, and ran the Linux badblocks utility on the disk partition (I didn’t save the badblocks output from his drive, so the following is a sample from another machine):

$ badblocks -sv /dev/hda

Checking blocks 0 to 8192016
Checking for bad blocks (read-only test):    222400/  8192016

Badblocks completed successfully, and an fsck of the file system reported that the file system was clean (We also used the ext3 file system debugger to see if a file was using the block. It wasn’t, so my theory is that the errors occurred when a new file was being created). Next we rebooted the system, and the number of reallocated sectors reported by smartmontools had increased by one. This completely surprised me, and I am still confused why the disk controller didn’t remap the sector when we were booted from the disk drive. I had fun debugging this problem, and learning about how IDE disk drives work.
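Two of the checks described above are easy to script. The sketch below is illustrative only: the function names are made up, it assumes 512-byte sectors, and it assumes the drive reports an attribute named Reallocated_Sector_Ct in the attribute table printed by smartctl -A (attribute names can vary between drives). The first function only prints a dd command for a single-sector read, and the second reads attribute-table text on standard input and prints the raw reallocated sector count.

```shell
# Print (do not run) a dd command that reads a single 512-byte sector.
# Without conv=noerror, dd stops at the first read error.
sector_read_cmd() {
    dev=$1
    sector=$2
    echo "dd if=$dev of=/dev/null bs=512 skip=$sector count=1"
}

# Read "smartctl -A" style output on stdin and print the raw value
# (tenth column) of the Reallocated_Sector_Ct attribute.
reallocated_count() {
    awk '$2 == "Reallocated_Sector_Ct" { print $10 }'
}
```

Something like smartctl -A /dev/hda | reallocated_count, run before and after a reboot, would show whether the count changed the way it did here.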

Posted by matty, filed under Linux Storage. Date: August 27, 2006, 6:32 pm | 1 Comment »

I grew up loving rock and roll, and spent a fair amount of time in college listening to music (especially when I was writing code for class). One band that got a fair amount of play time on my home stereo was Candlebox. They had a cool relaxing sound, and could jam with the best of them. So when Ticketmaster informed me that they would be in town playing a show at a local venue, I decided to get tickets. Once the opening acts left the stage, Candlebox played an audio clip, and then blasted into a version of “Arrow.” The band looked like they were having fun on stage, and proceeded to play all of their hits, including “Far Behind,” “You,” “Cover Me,” “Rain,” “Change,” and “Simple Lessons.” The show was incredible, and being four rows back gave me a chance to watch the band jam close up. If you get a chance to see them, check ‘em out!

Posted by matty, filed under Music. Date: August 23, 2006, 10:44 pm | No Comments »

I use the MD (multiple device) logical volume manager to mirror the boot devices on the Linux servers I support. When I first started using MD, the mdadm utility was not available to manage and monitor MD devices. Since disk failures are relatively common in large shops, I used the shell script from my SysAdmin article Monitoring and Managing Linux Software RAID to send E-mail when a device entered the failed state. While reading through the mdadm(8) manual page, I came across the “--monitor” and “--mail” options. These options can be used to monitor the operational state of the MD devices in a server, and generate E-mail notifications if a problem is detected. E-mail notification support can be enabled by running mdadm with the “--monitor” option to monitor devices, the “--daemonise” option to create a daemon process, and the “--mail” option to generate E-mail:

$ /sbin/mdadm --monitor --scan --daemonise --mail=root@localhost

Once mdadm is daemonized, an E-mail similar to the following will be sent each time a failure is detected:

From: mdadm monitoring 
To: root@localhost.localdomain
Subject: Fail event on /dev/md1:biscuit

This is an automatically generated mail message from mdadm
running on biscuit

A Fail event had been detected on md device /dev/md1.

Faithfully yours, etc.

I digs me some mdadm!
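On hosts where mdadm’s monitor mode isn’t an option, a script-based check is still workable. The following is a minimal sketch, not the script from the SysAdmin article: the function name is made up, and it assumes the usual /proc/mdstat layout, where each array’s status line ends with a bracketed string such as [UU], and an underscore marks a missing or failed member.

```shell
# Read /proc/mdstat style text on stdin and print the name of every MD
# device whose member status string (e.g. [U_]) contains an underscore.
degraded_md_devices() {
    awk '
        /^md[0-9]+ :/     { dev = $1 }    # remember the current array name
        /\[[U_]*_[U_]*\]/ { print dev }   # status line with a missing member
    '
}
```

A cron job could run degraded_md_devices < /proc/mdstat and send E-mail whenever the output is not empty.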

Posted by matty, filed under Linux LVM. Date: August 20, 2006, 6:30 pm | 1 Comment »

I recently started supporting several DNS servers running BIND 9. To ensure that these servers are up and operational at all times, I wrote a small shell script named dns-check to test the operational state of each server. The script takes a file as an argument, and each line in the file contains the IP address of a DNS server (names will also work), a name to resolve, and the record type that should be requested. If the script is unable to resolve the name for one reason or another (any return code > 0 is a failure), the script will log a message to syslog, and send E-mail to the address listed in the $ADMIN variable, or an address passed to the “-e” option. Here is a sample run:

$ cat dns-check-sites
ns1.fooby.net mail.fooby.net A
ns2.fooby.net mail.fooby.net A

$ dns-check -e dns-admin@prefetch.net -f dns-check-sites

The script is nothing special, but might be useful to folks running DNS servers.
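The script itself isn’t shown here, so the following is only a rough sketch of how such a check loop might look, not the actual dns-check code. The function names and the RESOLVER override are made up for illustration, and it assumes dig is installed. Failures are printed one per line so the caller can hand them to logger and a mail command.

```shell
# Resolve one name against one server with dig. dig exits non-zero on
# errors such as timeouts; an empty answer is also treated as a failure.
resolve_with_dig() {
    answer=$(dig +short +time=3 +tries=1 "@$1" "$2" "$3") || return 1
    [ -n "$answer" ]
}

# Read "server name type" lines from a file and print one FAIL line for
# each lookup that does not succeed. Setting RESOLVER swaps in another
# lookup function, which is handy for testing without a network.
check_sites() {
    sites_file=$1
    resolver=${RESOLVER:-resolve_with_dig}
    while read -r server name rtype; do
        [ -z "$server" ] && continue      # skip blank lines
        if ! "$resolver" "$server" "$name" "$rtype" </dev/null; then
            echo "FAIL: $server could not resolve $name ($rtype)"
        fi
    done < "$sites_file"
}
```

Each line printed by check_sites dns-check-sites could then be passed to logger for syslog and mailed to the address in $ADMIN.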

Posted by matty, filed under DNS & BIND. Date: August 20, 2006, 2:43 pm | No Comments »

I recently had a file system on a Solaris Volume Manager (SVM) metadevice fill up, and I needed to expand it to make room for some additional data. Since the expansion could potentially cause problems, I backed up the file system, and saved a copy of the metastat and df output to my local workstation. Having several backups always gives me a warm fuzzy, since I know I have a way to revert back to the old configuration if something goes awry. Once the configuration was in a safe place and the data backed up, I used the umount command to unmount the /data file system, which lives on metadevice d100:

$ df -h

Filesystem             size   used  avail capacity  Mounted on
/dev/dsk/c1t0d0s0      7.9G   2.1G   5.7G    27%    /
/devices                 0K     0K     0K     0%    /devices
ctfs                     0K     0K     0K     0%    /system/contract
proc                     0K     0K     0K     0%    /proc
mnttab                   0K     0K     0K     0%    /etc/mnttab
swap                   2.3G   600K   2.3G     1%    /etc/svc/volatile
objfs                    0K     0K     0K     0%    /system/object
/usr/lib/libc/libc_hwcap1.so.1
                       7.9G   2.1G   5.7G    27%    /lib/libc.so.1
fd                       0K     0K     0K     0%    /dev/fd
/dev/dsk/c1t0d0s4      4.0G   154M   3.8G     4%    /var
swap                   2.3G    32K   2.3G     1%    /tmp
swap                   2.3G    24K   2.3G     1%    /var/run
/dev/dsk/c1t0d0s3       19G   2.8G    17G    15%    /opt
/dev/md/dsk/d100        35G    35G   120M    99%    /data

$ umount /data

After the file system was unmounted, I had to run the metaclear utility to remove the metadevice from the meta state database:

$ metaclear d100
d100: Concat/Stripe is cleared

Now that the metadevice was removed, I needed to add it back with the desired layout. It is EXTREMELY important to place the device(s) back in the right order, and to ensure that the new layout doesn’t corrupt the data that exists on the device(s) that contain the file system (i.e., don’t create a RAID5 metadevice with the existing devices, since that will wipe your data when the RAID5 metadevice is initialized). In my case, I wanted to concatenate another hardware RAID protected LUN to the metadevice d100. This was accomplished by running metainit with “numstripes” equal to 2 to indicate a two-stripe concatenation, and “width” equal to 1 to indicate that each stripe should have one member:

$ metainit d100 2 1 c1t1d0s0 1 c1t2d0s0
d100: Concat/Stripe is setup
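The numstripes and width arguments are easy to get wrong, so here is a small sketch that builds the argument list for a concatenation of single-slice stripes. The function name is made up, and it only prints the command so the slice order can be double-checked before anything is run.

```shell
# Print (do not run) the metainit command for a concatenation in which
# every stripe has exactly one slice. Slices stay in the order given.
concat_cmd() {
    meta=$1
    shift
    args="$#"                      # numstripes: one stripe per slice
    for slice in "$@"; do
        args="$args 1 $slice"      # width of 1, then the slice name
    done
    echo "metainit $meta $args"
}
```

Calling concat_cmd d100 c1t1d0s0 c1t2d0s0 prints the same metainit command line that was used above.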

Once the new metadevice was created, I ran the mount utility to remount the /data file system, and then executed growfs to expand the file system:

$ mount /dev/md/dsk/d100 /data

$ growfs -M /data /dev/md/rdsk/d100

Warning: 2778 sector(s) in last cylinder unallocated
/dev/md/rdsk/d100:      150721830 sectors in 24532 cylinders of 48 tracks, 128 sectors
        73594.6MB in 1534 cyl groups (16 c/g, 48.00MB/g, 5824 i/g)
super-block backups (for fsck -F ufs -o b=#) at:
 32, 98464, 196896, 295328, 393760, 492192, 590624, 689056, 787488, 885920,
Initializing cylinder groups:
..............................
super-block backups for last 10 cylinder groups at:
 149821984, 149920416, 150018848, 150117280, 150215712, 150314144, 150412576,
 150511008, 150609440, 150707872
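The size in the growfs output follows directly from the sector count. A quick sanity check, assuming 512-byte sectors (so 2048 sectors per megabyte) and using a made-up helper name:

```shell
# Convert a count of 512-byte sectors to whole megabytes, rounding down.
sectors_to_mb() {
    echo $(( $1 / 2048 ))
}
```

sectors_to_mb 150721830 prints 73594, in line with the 73594.6MB that growfs reports.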

After the growfs operation completed, I had some breathing room on the /data file system:

$ df -h

Filesystem             size   used  avail capacity  Mounted on
/dev/dsk/c1t0d0s0      7.9G   2.1G   5.7G    27%    /
/devices                 0K     0K     0K     0%    /devices
ctfs                     0K     0K     0K     0%    /system/contract
proc                     0K     0K     0K     0%    /proc
mnttab                   0K     0K     0K     0%    /etc/mnttab
swap                   2.3G   600K   2.3G     1%    /etc/svc/volatile
objfs                    0K     0K     0K     0%    /system/object
/usr/lib/libc/libc_hwcap1.so.1
                       7.9G   2.1G   5.7G    27%    /lib/libc.so.1
fd                       0K     0K     0K     0%    /dev/fd
/dev/dsk/c1t0d0s4      4.0G   154M   3.8G     4%    /var
swap                   2.3G    32K   2.3G     1%    /tmp
swap                   2.3G    24K   2.3G     1%    /var/run
/dev/dsk/c1t0d0s3       19G   2.8G    17G    15%    /opt
/dev/md/dsk/d100        71G    36G    35G    49%    /data

The fact that you have to unmount the file system to grow a metadevice is somewhat frustrating, since every other LVM package I have used allows volumes and file systems to be expanded on the fly (it’s a good thing ZFS is shipping with Solaris). As with all data migrations, you should test storage expansion operations prior to performing them on production systems.

Posted by matty, filed under Solaris Volume Manager. Date: August 18, 2006, 5:44 pm | 2 Comments »
