Repairing the Solaris /dev and /devices directories

The /devices and /dev directories on one of my Solaris 9 hosts got majorly borked a few weeks back, and the trusty old `devfsadm -Cv’ command wasn’t able to fix the problem. To clean up the device tree, I booted from CDROM into single user mode and manually rebuilt the device hierarchy. Here is what I did to fix my problem (WARNING: This fixed my problem, but there is no guarantee that it will work for you. Please test changes like this on non-production systems prior to adjusting production systems.):

Step 1: Boot from CDROM into single user mode
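
On SPARC systems this can be done from the OpenBoot PROM; for example, assuming you are booting from a local CDROM (adjust the boot device for your platform and media):

ok boot cdrom -s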

Step 2: Mount the “/” partition to your favorite place (if your boot devices are mirrored, you will need to perform the following operations on each half of the mirror):

$ mount /dev/dsk/c0t0d0s0 /a
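
If the root device is mirrored, the second submirror can be mounted alongside it so the remaining steps can be repeated against it (the device c1t0d0s0 and the mount point /b below are just examples; substitute the other half of your mirror):

$ mount /dev/dsk/c1t0d0s0 /b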

Step 3: Move the existing path_to_inst aside:

$ mv /a/etc/path_to_inst /a/etc/08012007.path_to_inst.orig

Step 4: Clean out the /devices and /dev directories:

$ rm -rf /a/devices/*

$ rm -rf /a/dev/*

Step 5: Replicate the /devices and /dev directories that were created during boot:

$ cd /devices; find . | cpio -pmd /a/devices

$ cd /dev; find . | cpio -pmd /a/dev

Step 6: Adjust the vfstab to reflect any device changes
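
For reference, a root file system entry in /a/etc/vfstab looks something like the following (the c0t0d0s0 paths are just examples; use whatever paths your devices end up with):

/dev/dsk/c0t0d0s0       /dev/rdsk/c0t0d0s0      /       ufs     1       no      -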

Step 7: Boot with the “-a”, “-s” and “-r” options to create a new path_to_inst and to add any device entries that weren’t found while booted from single user mode (you can optionally run `devfsadm -C -r /a -p /a/etc/path_to_inst -v’ from single user mode to create the path_to_inst)
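
At the OpenBoot prompt that would look something like the following (again assuming a SPARC system):

ok boot -a -s -r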

Step 8: Grab a soda and enjoy the fruits of your labor! :)

Expanding storage the ZFS way

I had a mirrored ZFS pool fill up on me this week, which meant I needed to add storage to keep my application functioning correctly. Since expanding storage is a trivial process with ZFS, I decided to increase the available pool storage by replacing the 36GB disks in the pool with 72GB disks. Here is the original configuration:

$ df -h netbackup

Filesystem             size   used  avail capacity  Mounted on
netbackup               33G    32G    1G     96%    /opt/openv

$ zpool status -v netbackup

  pool: netbackup
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        netbackup   ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c1t2d0  ONLINE       0     0     0
            c1t3d0  ONLINE       0     0     0

errors: No known data errors

To expand the available storage, I replaced the disk c1t2d0 with a 72GB disk, and then used the zpool “replace” option to replace the old disk with the new one:

$ zpool replace netbackup c1t2d0

Once the pool finished resilvering (you can run `zpool status -v’ to monitor the progress), I replaced the disk c1t3d0 with a 72GB disk and ran another zpool “replace” to swap the old disk for the new one:

$ zpool replace netbackup c1t3d0
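
While each replacement is resilvering, progress can be checked with the status command mentioned above:

$ zpool status -v netbackup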

Once the pool finished resilvering, I had an extra 36GB of disk space available:

$ df -h netbackup

Filesystem             size   used  avail capacity  Mounted on
netbackup               67G   32G    35G    47%    /opt/openv
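
The additional capacity should also be visible in the pool itself, which can be verified with zpool list:

$ zpool list netbackup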

This is pretty powerful, and it’s nice not to have to run another utility to extend volumes and file systems once new storage is available. There is also the added benefit that ZFS resilvers at the object level, and not at the block level, so only live data has to be copied to the new disks. Giddy up!

Solaris to support VRRP

While reading up on the Sitara project on opensolaris.org, I noticed that the project team is planning to add VRRP support to Solaris. They are also working on improving small packet forwarding performance, which will be great for sites that run busy DNS servers and voice solutions. Now if we can just get them to port CARP to OpenSolaris. ;) Niiiiiiiiiiiiice!

Getting orca working on Solaris hosts

I installed the SE Toolkit on several Solaris 10 hosts this week, and noticed that the se process was segfaulting during startup:

$ /etc/rc3.d/S99orcallator start
Writing data into /opt/orca/nbm01/
Starting logging
Sending output to nohup.out

$ tail -1 /var/adm/messages

Aug 10 23:09:27 nbm01 genunix: [ID 603404 kern.notice] NOTICE: core_log: se.sparcv9[17571] core dumped: /var/core/core.se.sparcv9.17571

After filing a bug report on the SE Toolkit website, it dawned on me that the issue wasn’t with the se program, but with the orcallator.se script that accompanied the SE Toolkit. Based on a hunch, I installed the latest version of orcallator.se from the Orca website, and that fixed my issue (and provided a number of additional useful performance graphs!). Hopefully this will help others who bump into this issue.
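
If you want to poke at a core file like this yourself, the Solaris proc tools work on core files as well as live processes (the core file name below comes from the /var/adm/messages entry above):

$ pstack /var/core/core.se.sparcv9.17571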