Solaris fault manager overview

One of the coolest features in Solaris 10 is the fault management service. Fault management allows system software to send telemetry data to the fmd(1m) daemon, which diagnoses the problem and takes action based on the type of event received (e.g., offlining a faulty component and logging an error with FMRI/UUID information to syslog). The diagnosis phase is controlled by a set of diagnosis engines, which can be viewed with the fmadm(1m) utility’s “config” option:

$ fmadm config

MODULE                   VERSION STATUS  DESCRIPTION
USII-io-diagnosis        1.0     active  UltraSPARC-II I/O Diagnosis
cpumem-retire            1.0     active  CPU/Memory Retire Agent
eft                      1.12    active  eft diagnosis engine
fmd-self-diagnosis       1.0     active  Fault Manager Self-Diagnosis
io-retire                1.0     active  I/O Retire Agent
syslog-msgs              1.0     active  Syslog Messaging Agent

If the fault manager daemon (fmd) detects a fault, it will log a detailed message to syslog, and update the fault manager error and fault logs. The contents of these logfiles can be viewed with the fmdump(1m) utility:

$ fmdump -v

TIME UUID SUNW-MSG-ID
fmdump: /var/fm/fmd/fltlog is empty

$ fmdump -e -v

TIME                 CLASS                                 ENA
fmdump: /var/fm/fmd/errlog is empty

If a device is diagnosed as faulty, this will be indicated in the fmadm(1m) “faulty” output:

$ fmadm faulty

   STATE RESOURCE / UUID
-------- ----------------------------------------------------------------------

The fault management daemon (fmd) keeps track of service events and numerous pieces of key statistical data. This information can be accessed and printed with the fmstat(1m) utility:

$ fmstat

module             ev_recv ev_acpt wait  svc_t  %w  %b  open solve  memsz  bufsz
USII-io-diagnosis        0       0  0.0    0.0   0   0     0     0      0      0
cpumem-retire            0       0  0.0    0.0   0   0     0     0      0      0
eft                      0       0  0.0    0.0   0   0     0     0   552K      0
fmd-self-diagnosis       0       0  0.0    0.0   0   0     0     0      0      0
io-retire                0       0  0.0    0.0   0   0     0     0      0      0
syslog-msgs              0       0  0.0    0.0   0   0     0     0    32b      0
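
If you want to keep an eye on these counters from a script, the fmstat output is easy to split into columns. Here is a minimal Python sketch that assumes the whitespace-separated layout shown above; the sample text is copied from that output, and the function names (parse_fmstat, busy_modules) are made up for this example:

```python
# Two rows copied from the fmstat output above, plus the header line.
SAMPLE = """\
module             ev_recv ev_acpt wait  svc_t  %w  %b  open solve  memsz  bufsz
eft                      0       0  0.0    0.0   0   0     0     0   552K      0
syslog-msgs              0       0  0.0    0.0   0   0     0     0    32b      0
"""


def parse_fmstat(text):
    """Return one dict per module row, keyed by the header column names."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    rows = []
    for ln in lines[1:]:
        fields = ln.split()
        if len(fields) != len(header):
            continue  # skip anything that doesn't look like a data row
        rows.append(dict(zip(header, fields)))
    return rows


def busy_modules(rows):
    """Names of modules that have received at least one event (ev_recv > 0)."""
    return [r["module"] for r in rows if int(r["ev_recv"]) > 0]


if __name__ == "__main__":
    rows = parse_fmstat(SAMPLE)
    print([r["module"] for r in rows])
    print(busy_modules(rows))
```

On an idle box like the one above, busy_modules returns an empty list. To use it for real, you would feed it the text captured from running fmstat instead of the sample.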

If you are interested in learning more about this amazingly cool technology, you can check out the following resources:

Mike Shapiro’s ACM Fault Management Article

Mike Shapiro’s Fault Management Presentation

Debugging lease problems with Sun DHCP Servers

While debugging some PXE boot problems a few weeks back, I needed to see who DHCP leases were being issued to. Since the box was running Solaris 9, I was able to take advantage of the Sun DHCP server’s “-d” (debug) and “-v” (verbose) options:

$ /usr/lib/inet/in.dhcpd -d -v -i ge0 -b manual

42a07602:  Daemon Version: 3.5
42a07602:  Maximum relay hops: 4
42a07602:  Run mode is: DHCP Server Mode.
42a07602:  Datastore resource: SUNWfiles
42a07602:  Location: /var/dhcp
42a07602:  DHCP offer TTL: 10
42a07602:  BOOTP compatibility enabled.
42a07602:  ICMP validation timeout: 1000 milliseconds, Attempts: 1.
42a07602:  Maximum concurrent clients: 1024
42a07602:  Maximum threads: 256
42a07602:  Read 4 entries from DHCP macro database on Fri Jun  3 11:23:46 2005
42a07602:  Monitor (0003/ge0) started...
42a07602:  Thread Id: 0003 - Monitoring Interface: ge0 *****
42a07602:  MTU: 1500    Type: SOCKET
42a07602:  Broadcast: 10.10.10.255
42a07602:  Netmask: 255.255.255.0
42a07602:  Address: 10.10.10.122
42a07632:  Datagram received on network device: ge0(limited broadcast)
42a07632: (Error 0) No more IP addresses on 10.10.10.0 network
42a07637:  Datagram received on network device: ge0(limited broadcast)
42a07637: (Error 0) No more IP addresses on 10.10.10.0 network
42a0763f:  Datagram received on network device: ge0(limited broadcast)
42a0763f: (Error 0) No more IP addresses on 10.10.10.0 network
42a0764f:  Datagram received on network device: ge0(limited broadcast)
42a0764f: (Error 0) No more IP addresses on 10.10.10.0 network
42a07663:  Datagram received on network device: ge0(limited broadcast)
42a07664:  Sending datagram to broadcast address.
42a07664:  (Added offer: 10.10.10.112
42a0766b:  Datagram received on network device: ge0(limited broadcast)
42a0766b:  Client: 01000D609C8889 maps to IP: 10.10.10.112
42a0766b:  Sending datagram to broadcast address.
42a0766e:  Datagram received on network device: ge0(limited broadcast)
42a0766e:  Unicasting datagram to 10.10.10.112 address.
42a0766e:  Adding ARP entry: 10.10.10.112 == 000D609C8889
42a0766e:  Updated offer: 10.10.10.112
42a0766e:  Datagram received on network device: ge0(limited broadcast)
42a0766e:  Reserved offer: 10.10.10.112
42a0766e:  Database write unnecessary for DHCP client: 01000D609C8889, 10.10.10.112
42a0766e:  Client: 01000D609C8889 maps to IP: 10.10.10.112
42a0766e:  Unicasting datagram to 10.10.10.112 address.
42a0766e:  Adding ARP entry: 10.10.10.112 == 000D609C8889

The debug option keeps the DHCP server in the foreground, where it prints a slew of information to the terminal. This output helped me solve my problem, and also helped me identify several devices that shouldn’t have been using DHCP.
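
When the debug output gets long, a few lines of Python can boil it down. The sketch below assumes the line layout shown in the log above. It also assumes the hex prefix on each line is a Unix timestamp, which seems right: 0x42a07602 works out to 15:23:46 UTC on June 3, 2005, matching the 11:23:46 local time the server printed. The function names are made up for this example:

```python
import re
from datetime import datetime, timezone

# A few lines copied from the debug log above.
SAMPLE = """\
42a07632:  Datagram received on network device: ge0(limited broadcast)
42a07632: (Error 0) No more IP addresses on 10.10.10.0 network
42a0766b:  Client: 01000D609C8889 maps to IP: 10.10.10.112
42a0766e:  Client: 01000D609C8889 maps to IP: 10.10.10.112
"""

LINE_RE = re.compile(r"^([0-9a-fA-F]{8}):\s*(.*)$")
CLIENT_RE = re.compile(r"Client: (\S+) maps to IP: (\S+)")


def parse_line(line):
    """Split a log line into (UTC datetime, message); None if it doesn't match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    # Assumption: the prefix is seconds since the epoch, written in hex.
    when = datetime.fromtimestamp(int(m.group(1), 16), tz=timezone.utc)
    return when, m.group(2)


def summarize(text):
    """Return ({client_id: ip}, count of 'No more IP addresses' errors)."""
    leases = {}
    exhausted = 0
    for line in text.splitlines():
        parsed = parse_line(line)
        if parsed is None:
            continue
        _, msg = parsed
        m = CLIENT_RE.search(msg)
        if m:
            leases[m.group(1)] = m.group(2)
        elif "No more IP addresses" in msg:
            exhausted += 1
    return leases, exhausted


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

The leases dictionary answers the "who is getting an address" question, and the error count shows how often the server ran out of addresses to offer.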

Veritas Volume Manager (VxVM) Hot Spares

When a disk that is part of a redundant volume (e.g., RAID 1, RAID 5) fails, the volume is able to continue handling I/O requests, but becomes susceptible to data loss if additional devices fail (and in the case of RAID 5 volumes, the volume will operate in a degraded state, since parity calculations are required to recreate data).

To remedy the potential impacts associated with device failures, Veritas Volume Manager (VxVM) starts the vxrelocd(1m) failure event detection and subdisk relocation daemon at system boot time. This daemon periodically scans the vxnotify(1m) output, and upon detecting a failure, attempts to relocate data to a working device.

When relocating data, Veritas will first attempt to use a device marked as a spare. If Veritas is unable to find a device marked as a spare, Veritas will attempt to relocate data to a device that contains adequate space and doesn’t have the “nohotuse” flag set. To see if a device contains the nohotuse or spare flag, the vxdisk(1m) utility can be invoked with the list option, and the device to list:

$ vxdisk list c1t1d0

Device:    c1t1d0s2
devicetag: c1t1d0
type:      auto
hostid:    pooh
disk:      name=c1t1d0 id=1123602295.10.pooh
group:     name=oradg id=1123603158.13.pooh
info:      format=cdsdisk,privoffset=256,pubslice=2,privslice=2
flags:     online ready private autoconfig spare autoimport imported
pubpaths:  block=/dev/vx/dmp/c1t1d0s2 char=/dev/vx/rdmp/c1t1d0s2
version:   3.1
iosize:    min=512 (bytes) max=2048 (blocks)
public:    slice=2 offset=2304 len=35365968 disk_offset=0
private:   slice=2 offset=256 len=2048 disk_offset=0
update:    time=1123603160 seqno=0.6
ssb:       actual_seqno=0.0
headers:   0 240
configs:   count=1 len=1280
logs:      count=1 len=192
Defined regions:
 config   priv 000048-000239[000192]: copy=01 offset=000000 enabled
 config   priv 000256-001343[001088]: copy=01 offset=000192 enabled
 log      priv 001344-001535[000192]: copy=01 offset=000000 enabled
 lockrgn  priv 001536-001679[000144]: part=00 offset=000000
Multipathing information:
numpaths:   1
c1t1d0s2        state=enabled
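
If you have a lot of disks to check, the flags line can be pulled out of the vxdisk output with a short script. This is a rough Python sketch that assumes the "flags:" line format shown above; the function names and the three role labels ("spare", "excluded", "eligible") are my own for this example, not Veritas terminology:

```python
# A few lines copied from the vxdisk list output above.
SAMPLE = """\
Device:    c1t1d0s2
devicetag: c1t1d0
flags:     online ready private autoconfig spare autoimport imported
"""


def disk_flags(text):
    """Return the set of words on the 'flags:' line, or an empty set."""
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def relocation_role(flags):
    """Classify a disk by the two flags that matter for hot relocation."""
    if "spare" in flags:
        return "spare"      # preferred relocation target
    if "nohotuse" in flags:
        return "excluded"   # skipped when relocating
    return "eligible"       # free space may be used if no spare is found


if __name__ == "__main__":
    print(relocation_role(disk_flags(SAMPLE)))
```

For the disk shown above, this reports "spare", since the spare flag appears in the flags line.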

To mark a device as a hot spare, the vxedit(1m) utility can be used:

$ vxedit set spare=on c1t6d0

$ vxdisk list c1t6d0

Device:    c1t6d0s2
devicetag: c1t6d0
type:      auto
hostid:    winnie
disk:      name=c1t6d0 id=1127240120.14.winnie
group:     name=oradg id=1127240283.19.winnie
info:      format=cdsdisk,privoffset=256,pubslice=2,privslice=2
flags:     online ready private autoconfig spare autoimport imported
pubpaths:  block=/dev/vx/dmp/c1t6d0s2 char=/dev/vx/rdmp/c1t6d0s2
version:   3.1
iosize:    min=512 (bytes) max=2048 (blocks)
public:    slice=2 offset=2304 len=35521408 disk_offset=0
private:   slice=2 offset=256 len=2048 disk_offset=0
update:    time=1127961735 seqno=0.28
ssb:       actual_seqno=0.0
headers:   0 240
configs:   count=1 len=1280
logs:      count=1 len=192
Defined regions:
 config   priv 000048-000239[000192]: copy=01 offset=000000 disabled
 config   priv 000256-001343[001088]: copy=01 offset=000192 disabled
 log      priv 001344-001535[000192]: copy=01 offset=000000 disabled
 lockrgn  priv 001536-001679[000144]: part=00 offset=000000
Multipathing information:
numpaths:   1
c1t6d0s2        state=enabled

To request that a device not be used for relocation, the “nohotuse” flag can be set. This will cause vxrelocd(1m) to skip the device when making relocation decisions, which ensures that data doesn’t get relocated to free space on busy disks. To set the “nohotuse” flag, the vxedit(1m) utility can be used:

$ vxedit set nohotuse=on c1t6d0

$ vxdisk list c1t6d0

Device:    c1t6d0s2
devicetag: c1t6d0
type:      auto
hostid:    winnie
disk:      name=c1t6d0 id=1127240120.14.winnie
group:     name=oradg id=1127240283.19.winnie
info:      format=cdsdisk,privoffset=256,pubslice=2,privslice=2
flags:     online ready private autoconfig nohotuse autoimport imported
pubpaths:  block=/dev/vx/dmp/c1t6d0s2 char=/dev/vx/rdmp/c1t6d0s2
version:   3.1
iosize:    min=512 (bytes) max=2048 (blocks)
public:    slice=2 offset=2304 len=35521408 disk_offset=0
private:   slice=2 offset=256 len=2048 disk_offset=0
update:    time=1127961735 seqno=0.28
ssb:       actual_seqno=0.0
headers:   0 240
configs:   count=1 len=1280
logs:      count=1 len=192
Defined regions:
 config   priv 000048-000239[000192]: copy=01 offset=000000 disabled
 config   priv 000256-001343[001088]: copy=01 offset=000192 disabled
 log      priv 001344-001535[000192]: copy=01 offset=000000 disabled
 lockrgn  priv 001536-001679[000144]: part=00 offset=000000
Multipathing information:
numpaths:   1
c1t6d0s2        state=enabled

The relocation process can consume considerable amounts of I/O and CPU resources, so it’s often beneficial to explicitly pick the hot spares by hand. This will ensure that when failures occur, data is not relocated to a chunk of free space that resides on the same spindles as your production data.
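
To make the selection order concrete, here is a toy Python model of the policy described in this post: try a disk marked as a spare first, then fall back to any disk with enough free space that doesn’t have nohotuse set. This is only a sketch of that description, not how vxrelocd(1m) is actually implemented; the Disk class, the free_blocks field, and the assumption that a spare also needs enough free space are all mine:

```python
from dataclasses import dataclass, field


@dataclass
class Disk:
    """Hypothetical stand-in for a VxVM disk; not a real Veritas structure."""
    name: str
    free_blocks: int
    flags: set = field(default_factory=set)


def pick_target(disks, needed_blocks):
    """Pick a relocation target: spares first, then non-nohotuse free space."""
    # Assumption: any target, spare or not, must have enough free space.
    fits = [d for d in disks if d.free_blocks >= needed_blocks]
    spares = [d for d in fits if "spare" in d.flags]
    if spares:
        return spares[0]
    fallback = [d for d in fits if "nohotuse" not in d.flags]
    return fallback[0] if fallback else None


if __name__ == "__main__":
    disks = [
        Disk("c1t1d0", 1000, {"nohotuse"}),
        Disk("c1t2d0", 1000),
        Disk("c1t6d0", 1000, {"spare"}),
    ]
    target = pick_target(disks, 500)
    print(target.name if target else "no target available")
```

With a designated spare in the list, the busy production disk (c1t2d0 here) is never touched. Take the spare away and the model falls back to it, which is exactly the situation that picking spares by hand avoids.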

Awesome band: Stereolith

While listening to my favorite online radio station today, I heard a song pop on that caught my attention. I immediately wandered over to the wazee website, and saw that the artist’s name was Stereolith, and the tune that was playing was titled Save Me. These guys are fricking awesome, and have a super cool industrial rock sound. They have four tunes on the wazee website, so check them out and let me know what you think!