Finding/setting nvalias (nvram) OBP settings from a running Solaris O/S

Running eeprom(1M) from within Solaris on SPARC platforms is a useful way to view and set OBP parameters without taking the entire machine offline and down to the ok prompt.

Unfortunately, eeprom does not show nvalias (device alias) definitions. These are most often used to give the root and mirror O/S boot devices readable names, which are then plugged into the boot-device and diag-device OBP variables. (diag-device is the OBP variable used to boot the machine when the physical or virtual keyswitch is set to “diag mode.”)

Luckily, prtconf -vp will give you this information once you do a little bit of digging…

$ prtconf -vp

….

<snip>

….

Node 0xf022d030
ttya-rts-dtr-off: 'false'
ttya-ignore-cd: 'true'
local-mac-address?: 'true'
fcode-debug?: 'false'
scsi-initiator-id: '7'
oem-logo:
oem-logo?: 'false'
oem-banner:
oem-banner?: 'false'
ansi-terminal?: 'true'
screen-#columns: '80'
screen-#rows: '34'
ttya-mode: '9600,8,n,1,-'
output-device: 'virtual-console'
input-device: 'virtual-console'
auto-boot-on-error?: 'false'
load-base: '16384'
auto-boot?: 'true'
network-boot-arguments:
boot-command: 'boot'
boot-file:
boot-device: 'disk net'
use-nvramrc?: 'false'
nvramrc:
security-mode: 'none'
security-password:
security-#badlogins: '0'
verbosity: 'min'
diag-switch?: 'true'
error-reset-recovery: 'boot'
name: 'options'

Node 0xf022d0a8
ttya: '/pci@7c0/pci@0/pci@1/pci@0/isa@2/serial@0,3f8'
nvram: '/virtual-devices/nvram@3'
net3: '/pci@7c0/pci@0/pci@2/network@0,1'
net2: '/pci@7c0/pci@0/pci@2/network@0'
net1: '/pci@780/pci@0/pci@1/network@0,1'
net0: '/pci@780/pci@0/pci@1/network@0'
net: '/pci@780/pci@0/pci@1/network@0'
ide: '/pci@7c0/pci@0/pci@1/pci@0/ide@8'
cdrom: '/pci@7c0/pci@0/pci@1/pci@0/ide@8/cdrom@0,0:f'
disk3: '/pci@780/pci@0/pci@9/scsi@0/disk@3'
disk2: '/pci@780/pci@0/pci@9/scsi@0/disk@2'
disk1: '/pci@780/pci@0/pci@9/scsi@0/disk@1'
disk0: '/pci@780/pci@0/pci@9/scsi@0/disk@0'
disk: '/pci@780/pci@0/pci@9/scsi@0/disk@0'
scsi: '/pci@780/pci@0/pci@9/scsi@0'
virtual-console: '/virtual-devices/console@1'
name: 'aliases'

….

<snip>

….
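Picking the aliases node out of the full listing by eye gets tedious. Here is a minimal sketch of a filter, assuming a POSIX awk (nawk on Solaris) and that every node starts with a "Node 0x..." line as in the listing above. The function name prtconf_node is made up for illustration.

```shell
# prtconf_node NAME: read `prtconf -vp` output on stdin and print only
# the node whose name property equals NAME (e.g. aliases or options).
prtconf_node() {
    awk -v want="$1" -v q="'" '
        # A new "Node 0x..." line closes the previous node.
        /^[ \t]*Node 0x/ { if (hit) printf "%s", block; block = ""; hit = 0 }
        { block = block $0 "\n" }
        # Remember this node if its name property is the one we want.
        $0 ~ ("^[ \t]*name:[ \t]*" q want q) { hit = 1 }
        END { if (hit) printf "%s", block }
    '
}

# Usage on a live system:
#   prtconf -vp | prtconf_node aliases
```

The same function with "options" as its argument should pull out the node holding boot-device and use-nvramrc?.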

The first thing we need to do is set the use-nvramrc? OBP variable to true so our modifications will be used. Viewing parameters with eeprom(1M) is open to regular users; modifying them requires root privileges.

$ eeprom use-nvramrc?
use-nvramrc?=false

Since it’s false..

# eeprom use-nvramrc?=true

To verify..

$ eeprom use-nvramrc?

use-nvramrc?=true

Awesome. Step 1 down.
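If you want to script this step, a check-then-set wrapper might look like the sketch below. The function name and the EEPROM override variable are hypothetical, added only so the logic can be exercised with a stub; by default it calls the real eeprom. Note the quoting: an unquoted ? is a shell wildcard, so the variable name is safer inside single quotes.

```shell
# ensure_nvramrc_enabled: turn use-nvramrc? on only if it is not already.
# EEPROM is a hypothetical override for testing; the default is the real
# eeprom command (setting a value requires root).
ensure_nvramrc_enabled() {
    # Quote the name: an unquoted ? would be treated as a glob character.
    current=$("${EEPROM:-eeprom}" 'use-nvramrc?') || return 1
    case "$current" in
        'use-nvramrc?=true') echo "already enabled" ;;
        *) "${EEPROM:-eeprom}" 'use-nvramrc?=true' && echo "enabled" ;;
    esac
}
```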

Next, we define the device we want to use. First, figure out what your root and mirror devices are.

$ df -h /
Filesystem size used avail capacity Mounted on
/dev/md/dsk/d10 32G 22G 9.4G 70% /

$ metastat -p d10
d10 -m d11 d12 1
d11 1 1 c0t0d0s0
d12 1 1 c0t1d0s0
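
To pull the underlying slices out of that output without reading it by eye, something like the following sketch could work. It assumes the usual cXtYdZsN slice naming; the function name svm_slices is made up for illustration.

```shell
# svm_slices: read `metastat -p` output on stdin and print each
# cXtYdZsN slice name it mentions, one per line.
svm_slices() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /^c[0-9]+t[0-9]+d[0-9]+s[0-9]+$/) print $i
    }'
}

# Usage on a live system:
#   metastat -p d10 | svm_slices
```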

So we’ve got an SVM-mirrored root file system. Let’s find the device paths for c0t0d0 and c0t1d0 under the /devices name space.

$ ls -l /dev/dsk/c0t0d0s0
lrwxrwxrwx 1 root root 49 Feb 6 11:06 /dev/dsk/c0t0d0s0 -> ../../devices/pci@780/pci@0/pci@9/scsi@0/sd@0,0:a

$ ls -l /dev/dsk/c0t1d0s0
lrwxrwxrwx 1 root root 49 Jan 11 2007 /dev/dsk/c0t1d0s0 -> ../../devices/pci@780/pci@0/pci@9/scsi@0/sd@1,0:a

OK, so removing the leading “../../devices” from the path (since that’s just a Solaris name space) and the trailing “:a” gives us the following:

/pci@780/pci@0/pci@9/scsi@0/sd@0,0

/pci@780/pci@0/pci@9/scsi@0/sd@1,0

(Side note: the “a” stands for slice 0. Slice 1 would have a “b”, slice 2 would have a “c”, etc. You can see an example of this by running ls -l /dev/dsk/c0t0d0s1. That’s why you see a trailing “f” on the cdrom alias: by default, it’s looking for bootstrap information on slice 5!)
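
The path trimming and the letter-to-slice mapping can be sketched as two small helpers. Both function names are invented here, and the mapping simply assumes the minor letters run a, b, c, ... in slice order as described above.

```shell
# devices_path TARGET: turn a /dev/dsk symlink target into a physical
# device path by dropping the ../../devices prefix and the :minor suffix.
devices_path() {
    printf '%s\n' "$1" | sed -e 's|^.*/devices/|/|' -e 's|:[^/:]*$||'
}

# minor_to_slice LETTER: map the minor-name letter to a slice number
# (a -> 0, b -> 1, ...).
minor_to_slice() {
    echo $(( $(printf '%d' "'$1") - 97 ))   # 97 is the ASCII code of "a"
}

devices_path '../../devices/pci@780/pci@0/pci@9/scsi@0/sd@0,0:a'
minor_to_slice f
```

The two calls at the end print /pci@780/pci@0/pci@9/scsi@0/sd@0,0 and 5.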

We can confirm this by grepping for the two paths above in /etc/path_to_inst:
$ grep '/pci@780/pci@0/pci@9/scsi@0/sd@[0-1],0' /etc/path_to_inst
"/pci@780/pci@0/pci@9/scsi@0/sd@0,0" 1 "sd"
"/pci@780/pci@0/pci@9/scsi@0/sd@1,0" 3 "sd"

Sweet. Step 2 down.

Next, let’s assign an alias of “rootdisk” to the first device and “rootmirror” to the second. In each path, replace the two characters “sd” with the word “disk”.

# eeprom nvramrc="devalias rootdisk /pci@780/pci@0/pci@9/scsi@0/disk@0,0 devalias rootmirror /pci@780/pci@0/pci@9/scsi@0/disk@1,0"
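
The sd-to-disk substitution is easy to fumble by hand, so here is a minimal sketch that builds one devalias command from an alias name and a physical path. The function name devalias_cmd is hypothetical, and the substitution is only the one used in this article; other controllers may use different node names.

```shell
# devalias_cmd ALIAS PHYSPATH: print a devalias command for PHYSPATH,
# renaming the final sd@ node to disk@ as done by hand above.
devalias_cmd() {
    printf 'devalias %s %s\n' "$1" \
        "$(printf '%s\n' "$2" | sed 's|/sd@\([^/]*\)$|/disk@\1|')"
}

devalias_cmd rootdisk /pci@780/pci@0/pci@9/scsi@0/sd@0,0
```

Keep in mind that eeprom nvramrc=... replaces the whole variable, so anything already stored in nvramrc (it was empty on this box) would need to be carried over.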

While we’re at it, let’s change our boot device to point to rootdisk and rootmirror.

# eeprom boot-device="rootdisk rootmirror"

The OBP variable nvramrc holds a script that OBP evaluates at startup when use-nvramrc? is true, so the devalias commands we stored there do not take effect until the next reset. Let’s see this in action. The whole point of this article is that we don’t have to bounce the machine, but just to prove that this is how it works, let’s do it anyway.

# halt
syncing file systems… done
Program terminated
{0} ok

{0} ok printenv nvramrc
nvramrc = devalias rootdisk /pci@780/pci@0/pci@9/scsi@0/disk@0,0 devalias rootmirror /pci@780/pci@0/pci@9/scsi@0/disk@1,0

So there’s our value that we stuck into nvramrc.

Our modifications haven’t appeared under devalias yet because OBP has not been through a reset, and so has not evaluated nvramrc, since we set it…
{0} ok devalias
ttya /pci@7c0/pci@0/pci@1/pci@0/isa@2/serial@0,3f8
nvram /virtual-devices/nvram@3
net3 /pci@7c0/pci@0/pci@2/network@0,1
net2 /pci@7c0/pci@0/pci@2/network@0
net1 /pci@780/pci@0/pci@1/network@0,1
net0 /pci@780/pci@0/pci@1/network@0
net /pci@780/pci@0/pci@1/network@0
ide /pci@7c0/pci@0/pci@1/pci@0/ide@8
cdrom /pci@7c0/pci@0/pci@1/pci@0/ide@8/cdrom@0,0:f
disk3 /pci@780/pci@0/pci@9/scsi@0/disk@3
disk2 /pci@780/pci@0/pci@9/scsi@0/disk@2
disk1 /pci@780/pci@0/pci@9/scsi@0/disk@1
disk0 /pci@780/pci@0/pci@9/scsi@0/disk@0
disk /pci@780/pci@0/pci@9/scsi@0/disk@0
scsi /pci@780/pci@0/pci@9/scsi@0
virtual-console /virtual-devices/console@1
name aliases

Let’s bounce the box so OBP evaluates the contents of nvramrc.

{0} ok boot rootdisk

SC Alert: Host System has Reset

SC Alert: Host system has shut down.
..

Sun Fire T200, No Keyboard
Copyright 2006 Sun Microsystems, Inc. All rights reserved.
OpenBoot 4.25.0, 32760 MB memory available, Serial #XXXXXXX
Ethernet address 0:14:4f:xx:xx:xx, Host ID: xxxxxxxx

Rebooting with command: boot rootdisk
Boot device: /pci@780/pci@0/pci@9/scsi@0/disk@0,0 File and args:
Loading ufs-file-system package 1.4 04 Aug 1995 13:02:54.
FCode UFS Reader 1.12 00/07/17 15:48:16.
Loading: /platform/SUNW,Sun-Fire-T200/ufsboot
Loading: /platform/sun4v/ufsboot
SunOS Release 5.10 Version Generic_127111-05 64-bit
Copyright 1983-2007 Sun Microsystems, Inc. All rights reserved.

Sure enough, once the box is back online, prtconf -vp shows our new aliases.

$ prtconf -vp

<snip>

Node 0xf022d0a8
rootmirror: '/pci@780/pci@0/pci@9/scsi@0/disk@1'
rootdisk: '/pci@780/pci@0/pci@9/scsi@0/disk@0'

ttya: '/pci@7c0/pci@0/pci@1/pci@0/isa@2/serial@0,3f8'
nvram: '/virtual-devices/nvram@3'
net3: '/pci@7c0/pci@0/pci@2/network@0,1'
net2: '/pci@7c0/pci@0/pci@2/network@0'
net1: '/pci@780/pci@0/pci@1/network@0,1'
net0: '/pci@780/pci@0/pci@1/network@0'
net: '/pci@780/pci@0/pci@1/network@0'
ide: '/pci@7c0/pci@0/pci@1/pci@0/ide@8'
cdrom: '/pci@7c0/pci@0/pci@1/pci@0/ide@8/cdrom@0,0:f'
disk3: '/pci@780/pci@0/pci@9/scsi@0/disk@3'
disk2: '/pci@780/pci@0/pci@9/scsi@0/disk@2'
disk1: '/pci@780/pci@0/pci@9/scsi@0/disk@1'
disk0: '/pci@780/pci@0/pci@9/scsi@0/disk@0'
disk: '/pci@780/pci@0/pci@9/scsi@0/disk@0'
scsi: '/pci@780/pci@0/pci@9/scsi@0'
virtual-console: '/virtual-devices/console@1'
name: 'aliases'
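
To check a single alias without scrolling through the listing, a small lookup like this sketch may help. It assumes a POSIX awk (nawk on Solaris) and the "name: 'value'" layout that prtconf -vp prints; the function name alias_path is invented for illustration.

```shell
# alias_path NAME: read `prtconf -vp` output on stdin and print the
# device path assigned to property NAME, without the surrounding quotes.
alias_path() {
    awk -v name="$1" -v q="'" '
        $1 == (name ":") { gsub(q, "", $2); print $2 }
    '
}

# Usage on a live system:
#   prtconf -vp | alias_path rootdisk
```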

If you follow this procedure closely, you don’t have to bounce the box to make this modification. Keep in mind, though, that changing these OBP variables (especially boot-device) without testing can leave some other poor administrator trying to figure out which disk is your real boot device. I take no responsibility for what you do with your OBP modifications. =)
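
One cheap sanity check before walking away is to make sure every name in boot-device is actually a defined alias. The sketch below is only an illustration: the function name is made up, and it expects you to feed it the alias names yourself (remember that aliases defined only in nvramrc will not show up in prtconf until after a reset). It relies on grep -qxF, which on Solaris means /usr/xpg4/bin/grep.

```shell
# undefined_boot_aliases "BOOT-DEVICE VALUE": read known alias names on
# stdin (one per line) and print every boot-device entry that is neither
# a full device path nor a known alias.
undefined_boot_aliases() {
    known=$(cat)
    for dev in $1; do
        case "$dev" in
            /*) continue ;;    # full device paths need no alias
        esac
        printf '%s\n' "$known" | grep -qxF -- "$dev" || printf '%s\n' "$dev"
    done
}

printf 'disk\nnet\nrootdisk\n' | undefined_boot_aliases 'rootdisk rootmirror'
```

With only disk, net and rootdisk known, the call above prints rootmirror.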

Configuring ZFS to gracefully deal with pool failures

If you are running ZFS in production, you may have experienced a situation where your server panicked and rebooted when a ZFS file system was corrupted. With George Wilson’s recent putback of CR #6322646, this is no longer the case. George’s putback allows the administrator to set the “failmode” property to control what happens when a pool incurs a fault. Here is a description of the new property from the zpool(1M) manual page:

failmode=wait | continue | panic

    Controls the system behavior  in  the  event  of  catas-
    trophic  pool  failure.  This  condition  is typically a
    result of a  loss  of  connectivity  to  the  underlying
    storage device(s) or a failure of all devices within the
    pool. The behavior of such an  event  is  determined  as
    follows:

    wait        Blocks all I/O access until the device  con-
                nectivity  is  recovered  and the errors are
                cleared. This is the default behavior.

    continue    Returns EIO to any new  write  I/O  requests
                but  allows  reads  to  any of the remaining
                healthy devices.  Any  write  requests  that
                have  yet  to  be committed to disk would be
                blocked.

    panic       Prints out a message to the console and gen-
                erates a system crash dump.
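
Changing the property is a plain zpool set. As a minimal sketch, here is a wrapper that rejects anything other than the three documented values before calling zpool. The function name and the ZPOOL override variable are hypothetical and exist only so the logic can be tried with a stub.

```shell
# set_failmode POOL MODE: validate MODE against the three documented
# values before handing it to zpool. ZPOOL is a hypothetical override
# for trying the function with a stub instead of the real command.
set_failmode() {
    case "$2" in
        wait|continue|panic) "${ZPOOL:-zpool}" set "failmode=$2" "$1" ;;
        *) echo "invalid failmode: $2" >&2; return 1 ;;
    esac
}

# Usage on a live system (as root):
#   set_failmode p1 continue     # runs: zpool set failmode=continue p1
```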



To see just how well this feature worked, I decided to test out the new failmode property. To begin my tests, I created a new ZFS pool from two files:

$ cd / && mkfile 1g file1 file2

$ zpool create p1 /file1 /file2

$ zpool status

  pool: p1
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        p1          ONLINE       0     0     0
          /file1    ONLINE       0     0     0
          /file2    ONLINE       0     0     0



After the pool was created, I checked the failmode property:

$ zpool get failmode p1

NAME  PROPERTY  VALUE     SOURCE
p1    failmode  wait      default
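
If a script needs just the value, the VALUE column can be picked out of that table with awk. A small sketch, with an invented function name:

```shell
# pool_failmode: read `zpool get failmode POOL` output on stdin and
# print just the VALUE column.
pool_failmode() {
    awk '$2 == "failmode" { print $3 }'
}

# Usage on a live system:
#   zpool get failmode p1 | pool_failmode
```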



And then began overwriting the start of one of the files with zeros to see what would happen. (Without conv=notrunc, dd also truncates /file1 to the 512 KB it writes.)

$ dd if=/dev/zero of=/file1 bs=512 count=1024

$ zpool scrub p1
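
One detail of the dd command above is worth a closer look: without conv=notrunc, dd truncates its output file before writing, so the backing file shrinks instead of just having its first blocks overwritten. That may have contributed to the device being reported as unusable rather than merely damaged. The sketch below shows the difference on a small temporary file standing in for /file1; the sizes are scaled down for illustration.

```shell
f=$(mktemp)

# Without conv=notrunc, dd truncates the output file before writing.
dd if=/dev/zero of="$f" bs=1024 count=64 2>/dev/null   # 64 KB stand-in for /file1
dd if=/dev/zero of="$f" bs=512 count=8 2>/dev/null     # write 4 KB at the start
size_truncated=$(( $(wc -c < "$f") ))

# With conv=notrunc, only the first 4 KB change and the size is kept.
dd if=/dev/zero of="$f" bs=1024 count=64 2>/dev/null
dd if=/dev/zero of="$f" bs=512 count=8 conv=notrunc 2>/dev/null
size_kept=$(( $(wc -c < "$f") ))

echo "truncated: $size_truncated bytes, kept: $size_kept bytes"
rm -f "$f"
```

This prints "truncated: 4096 bytes, kept: 65536 bytes".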

I was overjoyed to find that the box was still running, even though the pool showed up as faulted:

$ zpool status

  pool: p1
 state: FAULTED
status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-4J
 scrub: scrub completed after 0h0m with 0 errors on Tue Feb 19 13:57:41 2008
config:

        NAME        STATE     READ WRITE CKSUM
        p1          FAULTED      0     0     0  insufficient replicas
          /file1    UNAVAIL      0     0     0  corrupted data
          /file2    ONLINE       0     0     0

errors: No known data errors
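
For monitoring, the config table in that output can be reduced to just the entries that are not ONLINE. This is a sketch for simple pools like this one (it does not try to handle spares, logs or cache sections), and the function name unhealthy_vdevs is made up.

```shell
# unhealthy_vdevs: read `zpool status` output on stdin and print the
# name and state of every config entry that is not ONLINE.
unhealthy_vdevs() {
    awk '
        $1 == "NAME" && $2 == "STATE" { in_cfg = 1; next }  # config header
        in_cfg && NF == 0 { in_cfg = 0 }                     # blank line ends it
        in_cfg && NF >= 2 && $2 != "ONLINE" { print $1, $2 }
    '
}

# Usage on a live system:
#   zpool status p1 | unhealthy_vdevs
```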



But my joy didn’t last long, since the box became unresponsive after a few minutes and panicked with the following string:

Feb 19 13:57:47 nevadadev genunix: [ID 603766 kern.notice] assertion failed: vdev_config_sync(rvd->vdev_child, rvd->vdev_children, txg) == 0 (0x5 == 0x0), file: ../../common/fs/zfs/spa.c, line: 4130
Feb 19 13:57:47 nevadadev unix: [ID 100000 kern.notice] 
Feb 19 13:57:47 nevadadev genunix: [ID 655072 kern.notice] ffffff0001feab30 genunix:assfail3+b9 ()
Feb 19 13:57:47 nevadadev genunix: [ID 655072 kern.notice] ffffff0001feabd0 zfs:spa_sync+5d2 ()
Feb 19 13:57:47 nevadadev genunix: [ID 655072 kern.notice] ffffff0001feac60 zfs:txg_sync_thread+19a ()
Feb 19 13:57:47 nevadadev genunix: [ID 655072 kern.notice] ffffff0001feac70 unix:thread_start+8 ()



Since the manual page states that the failmode property “controls the system behavior in the event of catastrophic pool failure,” it appears the box should have stayed up and operational when the pool became unusable. I filed a bug on the OpenSolaris website, so hopefully the ZFS team will address this issue in the future.