If you are running ZFS in production, you may have experienced a situation where your server panicked and rebooted when a ZFS file system was corrupted. With George Wilson’s recent putback of CR #6322646, this is no longer the case. George’s putback allows the file system administrator to set the “failmode” property to control what happens when a pool incurs a fault. Here is a description of the new property from the zpool(1m) manual page:
failmode=wait | continue | panic
Controls the system behavior in the event of catastrophic pool failure. This condition is typically a result of a loss of connectivity to the underlying storage device(s) or a failure of all devices within the pool. The behavior of such an event is determined as follows:
wait Blocks all I/O access until the device connectivity is recovered and the errors are cleared. This is the default behavior.
continue Returns EIO to any new write I/O requests but allows reads to any of the remaining healthy devices. Any write requests that have yet to be committed to disk would be blocked.
panic Prints out a message to the console and generates a system crash dump.
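Based on the description above, changing the failure policy is a matter of setting the property with zpool set. A minimal sketch, assuming a pool named p1 (the same name used in the test below):

```shell
# Switch the pool's failure policy from the default ("wait") to
# "continue", so reads to healthy devices keep working after a
# catastrophic fault while new writes return EIO.
zpool set failmode=continue p1

# Verify the new setting; the SOURCE column should now read
# "local" instead of "default".
zpool get failmode p1
```

Note that these commands require a live ZFS pool, so the output shown in the comments is what the zpool(1m) manual page leads you to expect rather than captured output.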
To see just how well this feature worked, I decided to test out the new failmode property. To begin my tests, I created a new ZFS pool from two files:
cd / && mkfile 1g file1 file2
zpool create p1 /file1 /file2
zpool status p1
  pool: p1
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        p1          ONLINE       0     0     0
          /file1    ONLINE       0     0     0
          /file2    ONLINE       0     0     0
After the pool was created, I checked the failmode property:
zpool get failmode p1
NAME  PROPERTY  VALUE     SOURCE
p1    failmode  wait      default
And then began writing garbage to one of the files to see what would happen:
dd if=/dev/zero of=/file1 bs=512 count=1024
zpool scrub p1
I was overjoyed to find that the box was still running, even though the pool showed up as faulted:
  pool: p1
 state: FAULTED
status: One or more devices could not be used because the label is
        missing or invalid. Sufficient replicas exist for the pool to
        continue functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-4J
 scrub: scrub completed after 0h0m with 0 errors on Tue Feb 19 13:57:41 2008
config:

        NAME        STATE     READ WRITE CKSUM
        p1          FAULTED      0     0     0  insufficient replicas
          /file1    UNAVAIL      0     0     0  corrupted data
          /file2    ONLINE       0     0     0

errors: No known data errors
But my joy didn’t last long, since the box became unresponsive after a few minutes, and panicked with the following string:
Feb 19 13:57:47 nevadadev genunix: [ID 603766 kern.notice] assertion failed: vdev_config_sync(rvd->vdev_child, rvd->vdev_children, txg) == 0 (0x5 == 0x0), file: ../../common/fs/zfs/spa.c, line: 4130
Feb 19 13:57:47 nevadadev unix: [ID 100000 kern.notice]
Feb 19 13:57:47 nevadadev genunix: [ID 655072 kern.notice] ffffff0001feab30 genunix:assfail3+b9 ()
Feb 19 13:57:47 nevadadev genunix: [ID 655072 kern.notice] ffffff0001feabd0 zfs:spa_sync+5d2 ()
Feb 19 13:57:47 nevadadev genunix: [ID 655072 kern.notice] ffffff0001feac60 zfs:txg_sync_thread+19a ()
Feb 19 13:57:47 nevadadev genunix: [ID 655072 kern.notice] ffffff0001feac70 unix:thread_start+8 ()
Since the manual page states that the failmode property “controls the system behavior in the event of catastrophic pool failure,” the box should have stayed up and operational when the pool became unusable. I filed a bug on the opensolaris website, so hopefully the ZFS team will get this issue addressed in the future.