Today is a huge day for the OpenSolaris community! Xen is now officially part of Nevada, and should be available in SXCE build #75 (prior to this putback, Xen was available as a separate set of BFU archives)! The putback occurred this afternoon, and I can’t wait to start playing with it!!! Huge congratulations are in order for the Xen team!
I was pleasantly surprised to find out this week that the BrandZ framework is being extended to support Linux 2.6 kernels, as well as binaries that were built to run on Solaris 8 hosts! This has lots and lots of potential, and would be a blessing for one of my previous employers (they have a lot of Solaris 8 hosts). I would like to send kudos out to the folks who are making this happen. ;) Niiiiiiiice!
On more than one occasion now, I have run into problems where the Solaris boot archive wasn’t in a consistent state at boot time. This stops the boot process, and the console recommends booting into failsafe mode to fix it. If you want to do this manually, you can run the bootadm utility with the update-archive subcommand and the location where the root file system is mounted:
$ bootadm update-archive -v -R /a
I am hopeful that the OpenSolaris community will make the boot archive support more fault tolerant, since the current code seems somewhat brittle.
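For anyone who ends up repeating the manual repair, here is a minimal bash sketch of the step above. The function name update_boot_archive and the BOOTADM override variable are made up for illustration (the override just allows a dry run with echo); the only real command involved is bootadm, whose subcommand is spelled update-archive.

```shell
#!/usr/bin/env bash
# Sketch only: update_boot_archive and BOOTADM are hypothetical names.

update_boot_archive() {
    local root=${1:-/a}    # /a is the mount point used in the post

    if [ ! -d "$root" ]; then
        echo "root file system not found at $root" >&2
        return 1
    fi

    # Set BOOTADM=echo to print the command instead of running it.
    "${BOOTADM:-bootadm}" update-archive -v -R "$root"
}

# Example, from the failsafe shell:
#   update_boot_archive /a
```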
I just came across Rick Moen’s Preventing Domain Expiration article. Rick did a great job with the article, and it’s cool to see that he and Ben took my domain-check shell script and implemented it in Perl. The Perl version supports more TLDs, and contains a bit more functionality than the bash implementation. If I get some time in the next few months, I will have to update the domain-check bash script to support the same TLDs as the Perl implementation. Great job Rick and Ben!!
With the introduction of Solaris 10, the Solaris kernel was modified and userland tools were added to detect and report on hardware faults. The fault analysis is handled by the Solaris fault manager, which currently detects and responds to failures in AMD and SPARC CPU modules, PCI and PCIe buses, memory modules and disk drives (the kernel can retire memory pages, CPUs, etc. when it detects faulty hardware). Support for Intel Xeon processors and system sensors (e.g., fan speed, thermal sensors, etc.) is expected to follow.
To see if the fault manager has diagnosed a faulty component on a Solaris 10 or Nevada host, the fmadm utility can be run with the “faulty” subcommand:
$ fmadm faulty
STATE RESOURCE / UUID
-------- ----------------------------------------------------------------------
degraded dev:////pci@8,700000/lpfc@4
a0461e5e-4356-ca7b-ee83-c66816b9caba
-------- ----------------------------------------------------------------------
degraded dev:////pci@8,700000/lpfc@5
a0461e5e-4356-ca7b-ee83-c66816b9caba
-------- ----------------------------------------------------------------------
degraded dev:////pci@8,700000/pci@2
a0461e5e-4356-ca7b-ee83-c66816b9caba
-------- ----------------------------------------------------------------------
degraded dev:////pci@8,700000/pci@3
a0461e5e-4356-ca7b-ee83-c66816b9caba
-------- ----------------------------------------------------------------------
degraded dev:////pci@8,700000/scsi@1
a0461e5e-4356-ca7b-ee83-c66816b9caba
-------- ----------------------------------------------------------------------
degraded mod:///mod-name=emlxs/mod-id=101
a0461e5e-4356-ca7b-ee83-c66816b9caba
-------- ----------------------------------------------------------------------
degraded mod:///mod-name=glm/mod-id=146
a0461e5e-4356-ca7b-ee83-c66816b9caba
-------- ----------------------------------------------------------------------
degraded mod:///mod-name=pci_pci/mod-id=132
a0461e5e-4356-ca7b-ee83-c66816b9caba
-------- ----------------------------------------------------------------------
The fmadm output includes the suspect component, the state of the component and a UUID that identifies the fault. Since hardware faults can lead to system and application outages, I like to configure the FMA SNMP agent to send an SNMP trap with the hardware fault details to my NMS station, and I also like to configure my fmanotifier script to send the hardware fault details to my BlackBerry. The emails that are generated by the fmanotifier script look similar to the following:
From matty@lucky Sat Aug 18 14:58:29 2007
Date: Sat, 18 Aug 2007 14:58:29 -0400 (EDT)
From: matty@lucky
To: root@lucky
Subject: Hardware fault on lucky
The fault manager detected a problem with the system hardware.
The fmadm and fmdump utilities can be run to retrieve additional
details on the faults and recommended next course of action.
Fmadm faulty output:
STATE RESOURCE / UUID
-------- ----------------------------------------------------------------------
degraded dev:////pci@8,700000/lpfc@4
a0461e5e-4356-ca7b-ee83-c66816b9caba
-------- ----------------------------------------------------------------------
degraded dev:////pci@8,700000/lpfc@5
a0461e5e-4356-ca7b-ee83-c66816b9caba
-------- ----------------------------------------------------------------------
degraded dev:////pci@8,700000/pci@2
a0461e5e-4356-ca7b-ee83-c66816b9caba
-------- ----------------------------------------------------------------------
degraded dev:////pci@8,700000/pci@3
a0461e5e-4356-ca7b-ee83-c66816b9caba
-------- ----------------------------------------------------------------------
degraded dev:////pci@8,700000/scsi@1
a0461e5e-4356-ca7b-ee83-c66816b9caba
-------- ----------------------------------------------------------------------
degraded mod:///mod-name=emlxs/mod-id=101
a0461e5e-4356-ca7b-ee83-c66816b9caba
-------- ----------------------------------------------------------------------
degraded mod:///mod-name=glm/mod-id=146
a0461e5e-4356-ca7b-ee83-c66816b9caba
-------- ----------------------------------------------------------------------
degraded mod:///mod-name=pci_pci/mod-id=132
a0461e5e-4356-ca7b-ee83-c66816b9caba
-------- ----------------------------------------------------------------------
I absolutely adore FMA, and wish there were something similar for Linux. Hopefully the Linux kernel engineers will provide similar functionality in the future, since it greatly simplifies the process of identifying broken hardware.
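To give a rough idea of how a notifier along these lines could work, here is a small bash sketch. It is not the fmanotifier script itself: fault_summary is a hypothetical helper, and it assumes that fmadm faulty prints the STATE / RESOURCE / UUID layout shown above (other releases may format the output differently).

```shell
#!/usr/bin/env bash
# Sketch only: fault_summary is a hypothetical helper, not the fmanotifier script.
# It reads `fmadm faulty` output on stdin and prints a short report, or nothing
# at all when no resources are listed.

fault_summary() {
    local host=$1 faults count

    # Keep the "state resource" lines. The header is skipped by name, the
    # separators are runs of dashes, and UUID lines have a single field.
    faults=$(awk 'NF >= 2 && $1 != "STATE" && $1 !~ /^-+$/ { print $1, $2 }')
    [ -z "$faults" ] && return 0

    count=$(printf '%s\n' "$faults" | awk 'END { print NR }')
    printf 'Hardware fault on %s: %s resource(s) affected\n' "$host" "$count"
    printf '%s\n' "$faults"
}

# Example, assuming mailx is available on the host:
#   report=$(fmadm faulty | fault_summary "$(hostname)")
#   [ -n "$report" ] && printf '%s\n' "$report" | mailx -s "Hardware fault on $(hostname)" root
```

Run from cron, something like this would stay silent on a healthy host and send mail only when fmadm lists at least one resource.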
While reviewing the system logs on one of my SAN attached servers last week, I noticed hundreds of entries similar to the following:
Aug 28 13:10:14 foo scsi: [ID 243001 kern.warning] WARNING: /scsi_vhci (scsi_vhci0):
Aug 28 13:10:14 foo /scsi_vhci/ssd@g600a0b80001fcb370000010646d3d207 (ssd21): Command Timeout on path /pci@9,600000/lpfc@1/fp@0,0 (fp3)
Aug 28 13:10:14 foo scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ssd@g600a0b80001fcb370000010646d3d207 (ssd21):
Aug 28 13:10:14 foo SCSI transport failed: reason 'timeout': retrying command
Since the errors were retryable, it looked like MPxIO was doing its job and retrying requests on one of the other paths. To see if all of the paths were up and operational (the host has four paths to disk), I ran the mpathadm utility with the “list” command and the “LU” option:
$ mpathadm list LU
/dev/rdsk/c2t600A0B80001FCB370000010446D3D0C1d0s2
Total Path Count: 4
Operational Path Count: 4
/dev/rdsk/c2t600A0B8000216462000000C746D3DA48d0s2
Total Path Count: 4
Operational Path Count: 4
/dev/rdsk/c2t600A0B8000216462000000CD46D3DC1Cd0s2
Total Path Count: 4
Operational Path Count: 4
/dev/rdsk/c2t600A0B80001FCB370000010646D3D207d0s2
Total Path Count: 4
Operational Path Count: 4
/dev/rdsk/c2t600A0B8000216462000000CA46D3DB88d0s2
Total Path Count: 4
Operational Path Count: 4
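With five LUs the listing is easy to eyeball, but on a host with a lot more storage a filter might be handy. The following is a minimal bash sketch; degraded_lus is a hypothetical helper that assumes the three-line-per-LU layout shown above and prints only the LUs with fewer operational paths than total paths. On Solaris, nawk or /usr/xpg4/bin/awk may be a safer choice than the default awk.

```shell
#!/usr/bin/env bash
# Sketch only: degraded_lus is a hypothetical helper.
# It reads `mpathadm list LU` output on stdin and prints
# "<device> <operational>/<total>" for each LU that has lost a path.

degraded_lus() {
    awk '
        $1 ~ /^\/dev\/rdsk\//     { lu = $1 }       # remember the current LU
        /Total Path Count:/       { total = $NF }
        /Operational Path Count:/ {
            if ($NF + 0 < total + 0)
                print lu, $NF "/" total
        }
    '
}

# Example:
#   mpathadm list LU | degraded_lus
```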
Since all of the paths were available, I started to wonder if a cable was faulty. After running fcinfo on each of the four HBA ports, I came across the following:
$ fcinfo hba-port -l 10000000c94708f2
HBA Port WWN: 10000000c94708f2
OS Device Name: /dev/cfg/c6
Manufacturer: Emulex
Model: LP9002L
Firmware Version: 3.93a0
FCode/BIOS Version: 1.41a4
Type: N-port
State: online
Supported Speeds: 1Gb 2Gb
Current Speed: 2Gb
Node WWN: 20000000c94708f2
Link Error Statistics:
Link Failure Count: 0
Loss of Sync Count: 14
Loss of Signal Count: 0
Primitive Seq Protocol Error Count: 0
Invalid Tx Word Count: 198724
Invalid CRC Count: 63412
Bingo! The CRC errors were continuously increasing, so I knew that either the HBA or the fibre channel cable was faulty (as a side note, I can’t wait for the FMA project to harden the emlxs and qlc drivers!). During one of my storage training courses, the instructor mentioned that CRC errors are typically associated with bad cables. Once I swapped out the cable that was connected to the port with the errors, the CRC error counts stopped increasing, and the scsi_vhci errors went away! Niiiiiiiiiiiiiiiiiiice!
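A single fcinfo snapshot only shows the running totals, so the way to tell whether the errors are still occurring is to sample the counter twice and compare. Here is a minimal bash sketch of that check; crc_count and crc_delta are hypothetical helpers, and the 60 second interval in the example is an arbitrary choice.

```shell
#!/usr/bin/env bash
# Sketch only: crc_count and crc_delta are hypothetical helpers.

# Read `fcinfo hba-port -l <WWN>` output on stdin and print the CRC counter.
crc_count() {
    awk '/Invalid CRC Count:/ { print $NF }'
}

# Print how much the counter grew between two samples.
crc_delta() {
    echo $(( $2 - $1 ))
}

# Example:
#   wwn=10000000c94708f2
#   before=$(fcinfo hba-port -l "$wwn" | crc_count)
#   sleep 60
#   after=$(fcinfo hba-port -l "$wwn" | crc_count)
#   [ "$(crc_delta "$before" "$after")" -gt 0 ] && echo "CRC errors still increasing on $wwn"
```

The same comparison is a quick way to confirm a fix: after a cable swap, the delta should drop to zero.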