Replacing failed disk drives in a ZFS pool


I had a disk drive fail in one of my ZFS pools over the weekend and needed to swap it out to restore the pool to an optimal state. To begin, I ran zpool status to see which drive was faulty:

$ zpool status -v

  pool: rz2pool
 state: DEGRADED
status: One or more devices could not be opened. Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-D3
 scrub: resilver completed with 0 errors on Tue Feb 13 14:12:37 2007
config:

        NAME          STATE     READ WRITE CKSUM
        rz2pool       DEGRADED     0     0     0
          raidz2      DEGRADED     0     0     0
            c1t9d0    ONLINE       0     0     0
            c1t10d0   ONLINE       0     0     0
            c1t12d0   ONLINE       0     0     0
            c2t1d0    ONLINE       0     0     0
            spare     DEGRADED     0     0     0
              c2t2d0  UNAVAIL      0     0     0  cannot open
              c2t3d0  ONLINE       0     0     0
        spares
          c2t3d0      INUSE     currently in use
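
On a pool with many members it can help to filter the status output down to the devices that need attention. The following is a minimal sketch, not part of my original procedure: the function name unhealthy_devices is made up for illustration, and it assumes the NAME/STATE/READ/WRITE/CKSUM column layout shown above.

```shell
# Read `zpool status` text on stdin and print "device state" for every
# member of the config table that is in a failed or missing state.
# unhealthy_devices is a hypothetical helper name, not a ZFS command.
unhealthy_devices() {
    awk '
        # The header row marks the start of the config table.
        $1 == "NAME" && $2 == "STATE" { in_config = 1; next }
        # The spares list and the errors line come after the table.
        $1 == "spares" || $1 == "errors:" { in_config = 0 }
        # Report only states that mean the device itself is unusable.
        in_config && $2 ~ /^(UNAVAIL|FAULTED|OFFLINE|REMOVED)$/ { print $1, $2 }
    '
}
```

It would be used as "zpool status -v rz2pool | unhealthy_devices". DEGRADED is deliberately left out of the match, since in the output above it marks the containers (the pool, the raidz2 group, the spare) rather than the drive that actually failed.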

Once I located the faulty device, I used cfgadm to remove the old disk drive from the system and add the new one, and then ran zpool with the “replace” option to replace the failed drive in my pool:

$ zpool replace rz2pool c2t2d0 c2t2d0
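
For reference, the whole swap can be written down as a short checklist. This is only a sketch under assumptions: swap_plan is a made-up helper that prints the commands without running them, and the attachment point ID passed to cfgadm is a placeholder, since the real ID depends on the controller and should be read from the output of "cfgadm -al".

```shell
# Print the swap steps for one disk as a checklist; nothing is executed.
# swap_plan is a hypothetical helper. The attachment point ID (third
# argument) is a placeholder: look up the real one with `cfgadm -al`.
swap_plan() {
    pool=$1 disk=$2 ap_id=$3
    printf '%s\n' \
        "cfgadm -c unconfigure $ap_id" \
        "# physically swap the drive here" \
        "cfgadm -c configure $ap_id" \
        "zpool replace $pool $disk $disk"
}
```

Running "swap_plan rz2pool c2t2d0 c2::dsk/c2t2d0" prints the four steps in order. Printing rather than executing is intentional, because the physical swap has to happen between the two cfgadm calls.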

Once the replace command returned, I used zpool status to monitor the resilvering of the replacement drive:

$ zpool status -v

  pool: rz2pool
 state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress, 0.10% done, 0h31m to go

config:

        NAME                STATE     READ WRITE CKSUM
        rz2pool             DEGRADED     0     0     0
          raidz2            DEGRADED     0     0     0
            c1t9d0          ONLINE       0     0     0
            c1t10d0         ONLINE       0     0     0
            c1t12d0         ONLINE       0     0     0
            c2t1d0          ONLINE       0     0     0
            spare           DEGRADED     0     0     0
              replacing     DEGRADED     0     0     0
                c2t2d0s0/o  UNAVAIL      0     0     0  cannot open
                c2t2d0      ONLINE       0     0     0
              c2t3d0        ONLINE       0     0     0
        spares
          c2t3d0            INUSE     currently in use

errors: No known data errors
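
To keep an eye on the resilver without rereading the full status each time, the progress line can be pulled out on its own. This is a small sketch, assuming the "scrub: resilver in progress" wording shown above (other ZFS releases may word this line differently); the function name resilver_progress is made up for illustration.

```shell
# Read `zpool status` text on stdin and print the resilver progress,
# or nothing at all when no resilver is running.
# resilver_progress is a hypothetical helper name, not a ZFS command.
resilver_progress() {
    sed -n 's/^ *scrub: resilver in progress, \([0-9.]*%\) done, \(.*\) to go.*$/\1 done, \2 remaining/p'
}
```

Something like "zpool status rz2pool | resilver_progress" can then be rerun by hand or from a sleep loop until it prints nothing, at which point a full "zpool status -v" confirms whether the resilver finished cleanly.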

All of this was done online, and with minimal interruption to the applications running on the host.

This article was posted by Matty on 2007-02-14 00:30:00 -0400