Adding a disk to a ZFS pool

I needed to expand a ZFS pool from a single disk to a pair of disks today. To grow my pool, named “striped,” I ran zpool with the “add” subcommand, followed by the pool name and the device to add:

$ zpool add striped c1d1

Once the disk was added to the pool, it was immediately available for use:

$ zpool status -v

  pool: striped
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        striped    ONLINE       0     0     0
          c1d0      ONLINE       0     0     0
          c1d1      ONLINE       0     0     0

errors: No known data errors
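
If you add disks from a script, the check above can be automated. Here is a minimal sketch that reads “zpool status” output on standard input and succeeds only if the named device is listed as ONLINE. The function name device_online is made up for this example, and the parsing assumes the column layout shown above:

```shell
#!/bin/sh
# Succeed if DEVICE is listed with state ONLINE in `zpool status` output
# read from standard input. The function name is hypothetical, and the
# parsing assumes the NAME/STATE column layout shown in the post.
device_online() {
    awk -v dev="$1" '
        $1 == dev && $2 == "ONLINE" { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}
```

It could be used as “zpool status striped | device_online c1d1 && echo ok”. Running “zpool add -n striped c1d1” first should print the resulting layout without changing the pool, which is a cheap sanity check before committing the disk.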

I used to think Veritas had the easiest method to expand file systems, but I don’t think that is the case anymore. Now if we can just get Sun to allow us to remove devices from a pool, and to expand the number of columns in a RAIDZ or RAIDZ2 vdev!

Displaying netstat statistics at various intervals

I periodically need to review netstat data to debug network problems, and prefer to view the deltas between two adjacent runs. The Solaris netstat utility can be passed a time interval in seconds. The first report shows the cumulative totals, and each later report shows the change over the preceding interval:

$ netstat -sP tcp 10

TCP     tcpRtoAlgorithm     =     4     tcpRtoMin           =   400
        tcpRtoMax           = 60000     tcpMaxConn          =    -1
        tcpActiveOpens      =    20     tcpPassiveOpens     =    50
        tcpAttemptFails     =    20     tcpEstabResets      =     0
        tcpCurrEstab        =     4     tcpOutSegs          =859323
        tcpOutDataSegs      =847791     tcpOutDataBytes     =1159286456
        tcpRetransSegs      =    19     tcpRetransBytes     = 13388
        tcpOutAck           = 11532     tcpOutAckDelayed    =   620
        tcpOutUrg           =     0     tcpOutWinUpdate     =    96
        tcpOutWinProbe      =     0     tcpOutControl       =   136
        tcpOutRsts          =    20     tcpOutFastRetrans   =     0
        tcpInSegs           =186702
        tcpInAckSegs        =103393     tcpInAckBytes       =1159286502
        tcpInDupAck         =   229     tcpInAckUnsent      =     0
        tcpInInorderSegs    = 83753     tcpInInorderBytes   =79048288
        tcpInUnorderSegs    =     9     tcpInUnorderBytes   =   432
        tcpInDupSegs        =     8     tcpInDupBytes       =  4664
        tcpInPartDupSegs    =     0     tcpInPartDupBytes   =     0
        tcpInPastWinSegs    =     0     tcpInPastWinBytes   =     0
        tcpInWinProbe       =     0     tcpInWinUpdate      =     0
        tcpInClosed         =     0     tcpRttNoUpdate      =     0
        tcpRttUpdate        =103347     tcpTimRetrans       =   109
        tcpTimRetransDrop   =     0     tcpTimKeepalive     =    61
        tcpTimKeepaliveProbe=    15     tcpTimKeepaliveDrop =     0
        tcpListenDrop       =     0     tcpListenDropQ0     =     0
        tcpHalfOpenDrop     =     0     tcpOutSackRetrans   =     7


TCP     tcpRtoAlgorithm     =     0     tcpRtoMin           =   400
        tcpRtoMax           = 60000     tcpMaxConn          =    -1
        tcpActiveOpens      =     0     tcpPassiveOpens     =     0
        tcpAttemptFails     =     0     tcpEstabResets      =     0
        tcpCurrEstab        =     4     tcpOutSegs          =    83
        tcpOutDataSegs      =    83     tcpOutDataBytes     =  6544
        tcpRetransSegs      =     0     tcpRetransBytes     =     0
        tcpOutAck           =     0     tcpOutAckDelayed    =     0
        tcpOutUrg           =     0     tcpOutWinUpdate     =     0
        tcpOutWinProbe      =     0     tcpOutControl       =     0
        tcpOutRsts          =     0     tcpOutFastRetrans   =     0
        tcpInSegs           =    78
        tcpInAckSegs        =    76     tcpInAckBytes       =  6544
        tcpInDupAck         =     0     tcpInAckUnsent      =     0
        tcpInInorderSegs    =     2     tcpInInorderBytes   =    96
        tcpInUnorderSegs    =     0     tcpInUnorderBytes   =     0
        tcpInDupSegs        =     0     tcpInDupBytes       =     0
        tcpInPartDupSegs    =     0     tcpInPartDupBytes   =     0
        tcpInPastWinSegs    =     0     tcpInPastWinBytes   =     0
        tcpInWinProbe       =     0     tcpInWinUpdate      =     0
        tcpInClosed         =     0     tcpRttNoUpdate      =     0
        tcpRttUpdate        =    76     tcpTimRetrans       =     0
        tcpTimRetransDrop   =     0     tcpTimKeepalive     =     0
        tcpTimKeepaliveProbe=     0     tcpTimKeepaliveDrop =     0
        tcpListenDrop       =     0     tcpListenDropQ0     =     0
        tcpHalfOpenDrop     =     0     tcpOutSackRetrans   =     0
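
Most counters in an interval report are zero, so I sometimes want to see only the ones that moved. Here is a rough sketch of a filter that does that. The function name nonzero_counters is made up for this example, and the parsing assumes the “name = value” layout shown above:

```shell
#!/bin/sh
# Print only the counters with a non-zero value from `netstat -s` style
# output on standard input. The function name is hypothetical.
nonzero_counters() {
    awk '{
        # Some values are glued to the "=" (e.g. "tcpOutSegs =859323"),
        # so pad every "=" with spaces before walking the fields.
        gsub(/=/, " = ")
        for (i = 2; i < NF; i++)
            if ($i == "=" && $(i + 1) != 0)
                print $(i - 1), $(i + 1)
    }'
}
```

It could be used as “netstat -sP tcp 10 | nonzero_counters”. Settings and gauges such as tcpRtoMin, tcpRtoMax, tcpMaxConn and tcpCurrEstab are not deltas, so they will still show up in every report.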

Netstat is some good stuff!

Viewing the last time a CentOS Linux user changed their password

I often forget about the CentOS Linux chage utility and its ability to manage the expiration data in /etc/shadow. In addition to managing password policies, chage can be run with the “-l” option to view the policy set for a user and the date when the user’s password was last changed:

$ chage -l matty

Minimum:        0
Maximum:        99999
Warning:        7
Inactive:       -1
Last Change:            Dec 25, 2006
Password Expires:       Never
Password Inactive:      Never
Account Expires:        Never

If you have a security organization, ‘chage -l’ is a great command to allow them to run through sudo.
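
For audit scripts it helps to pull out just the last-change date. Here is a small sketch that reads “chage -l” output on standard input and prints that one field. The function name last_password_change is made up for this example, and the pattern is deliberately loose because the label wording may differ between chage versions:

```shell
#!/bin/sh
# Print the date of the last password change from `chage -l` output on
# standard input. The function name is hypothetical. The pattern accepts
# any label that starts with "Last" and contains "change" before the colon.
last_password_change() {
    sed -n 's/^Last .*[Cc]hange[^:]*:[[:space:]]*//p'
}
```

It could be used as “chage -l matty | last_password_change”.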

Setting up password policies on CentOS Linux hosts

I needed to set up password policies on a few CentOS 4.4 machines last week. The password policy needed to define the minimum length of a password, the number of days a password is valid, the strength of a password, and a warning period to alert individuals that their password is about to expire. Expiration data for each user is stored in their entry in /etc/shadow, and is initially populated based on the password policies in /etc/login.defs. Here is a list of password policies that I typically set in /etc/login.defs:

# Password aging controls:
#
#       PASS_MAX_DAYS   Maximum number of days a password may be used.
#       PASS_MIN_DAYS   Minimum number of days allowed between password changes.
#       PASS_MIN_LEN    Minimum acceptable password length.
#       PASS_WARN_AGE   Number of days warning given before a password expires.
#
PASS_MAX_DAYS   60
PASS_MIN_DAYS   0
PASS_MIN_LEN    8
PASS_WARN_AGE   10

For accounts that were created without a password policy, the chage command can be used to create one. To enforce strong passwords, you need to add the PAM module pam_cracklib.so to the password management group in /etc/pam.conf (or the applicable service definition in /etc/pam.d). Managing passwords is a pain, but it is one of the most important tasks in securing any server platform.
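
Here is a sketch of how the aging values from the policy file could be applied to an existing account. The function names are made up for this example. The second function only prints the chage command instead of running it, so the result can be reviewed first. The -M, -m and -W options set the maximum age, minimum age and warning period:

```shell
#!/bin/sh
# Print the value of KEY from a login.defs style file
# (default: /etc/login.defs). Commented lines are skipped because their
# first field is "#". Function names here are hypothetical.
login_defs_value() {
    awk -v key="$1" '$1 == key { print $2; exit }' "${2:-/etc/login.defs}"
}

# Print (rather than run) the chage command that would apply the aging
# policy from the given file to USER.
chage_command_for() {
    user=$1
    defs=${2:-/etc/login.defs}
    printf 'chage -M %s -m %s -W %s %s\n' \
        "$(login_defs_value PASS_MAX_DAYS "$defs")" \
        "$(login_defs_value PASS_MIN_DAYS "$defs")" \
        "$(login_defs_value PASS_WARN_AGE "$defs")" \
        "$user"
}
```

With the settings listed above, “chage_command_for matty” would print “chage -M 60 -m 0 -W 10 matty”, which can then be run as root.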

Getting ESX server to recognize Clariion devices

While setting up two new ESX 3.0 server nodes, I ran into a bizarre problem where the VI client refused to initialize several Clariion CX700 devices. Since the VI client isn’t the best environment to debug problems, I ssh’ed into the service console, and began my research by running esxcfg-vmhbadevs to list the devices on the system:

$ esxcfg-vmhbadevs -q
vmhba0:0:0 /dev/sda
vmhba1:0:0 /dev/sdc
vmhba1:0:1 /dev/sdd
vmhba1:0:2 /dev/sde
vmhba1:0:3 /dev/sdf
vmhba1:0:4 /dev/sdg
vmhba1:0:5 /dev/sdh
vmhba1:0:6 /dev/sdi
vmhba1:0:7 /dev/sdj
vmhba1:0:8 /dev/sdk
vmhba1:0:9 /dev/sdl

This listed the correct number of devices, so I decided to use dd to read data from one of the devices that was causing problems:

$ dd if=/dev/sdl of=/dev/null
195728+0 records in
195728+0 records out

Dd worked fine, but I was still unable to create a file system with the VI client or vmkfstools:

$ vmkfstools -C vmfs3 /vmfs/devices/disks/vmhba1:0:0:9
Error: Bad file descriptor

To get a better idea of the disk device layout, I decided to run fdisk to list the label type and layout of each device:

$ fdisk -l

< ..... >

Disk /dev/sdl (Sun disk label): 256 heads, 10 sectors, 40958 cylinders
Units = cylinders of 2560 * 512 bytes

   Device Flag    Start       End    Blocks   Id  System
/dev/sdl3  u          0     40958  52426240    5  Whole disk
/dev/sdl4  u          0         1      1280    f  Unknown
/dev/sdl5  u          2     40958  52423680    e  Unknown

< ..... >

Hmmm — each Clariion device contained an SMI label, and I started to wonder if this was causing the problem. To test my hypothesis, I used the fdisk utility to write a DOS label to one of the problematic devices:

$ fdisk /dev/sdl

Command (m for help): o
Building a new DOS disklabel. Changes will remain in memory only,
until you decide to write them. After that, of course, the previous
content won't be recoverable.

The number of cylinders for this disk is set to 40958.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
   (e.g., DOS FDISK, OS/2 FDISK)
Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.

Once I labeled the device with a DOS label, vmkfstools and the VI client allowed me to initialize and use the device. I am not certain why ESX server was having an issue with the SMI label, but writing a DOS label looks to have fixed my problem.
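
With ten LUNs to check, picking the Sun-labeled devices out of the fdisk listing by eye gets tedious. Here is a small sketch that reads “fdisk -l” output on standard input and prints the devices reported with a Sun disk label. The function name sun_labeled_disks is made up for this example, and the pattern assumes the “Disk /dev/sdX (Sun disk label)” header format shown above:

```shell
#!/bin/sh
# Print the device names that `fdisk -l` reports as carrying a Sun disk
# label. Reads fdisk output on standard input. The function name is
# hypothetical, and the pattern assumes the header format from the post.
sun_labeled_disks() {
    sed -n 's/^Disk \([^ ]*\) (Sun disk label).*/\1/p'
}
```

It could be used as “fdisk -l 2>/dev/null | sun_labeled_disks”. Writing a new label destroys the existing partition table, so review the list by hand before relabeling anything.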

Locating the services that will start when Windows boots

My Dell C400 runs Windows XP, and it was taking 5 – 10 minutes to boot. This wasn’t always the case, so I started poking around to see which services were being enabled during startup. Windows maintains two registry keys for startup items, as well as the old trusty “Startup” folder in the start menu. Here are the keys you can check to see which services will start when Windows XP initializes:

Registry key #1
HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Run

Registry key #2
HKCU\SOFTWARE\Microsoft\Windows\CurrentVersion\Run

My problem turned out to be with VMware Server, and removing it made my machine boot quickly again. Nice!