Better ZFS pool fault handling coming to an OpenSolaris release near you!


I just saw the following ARC case fly by, and it will be a welcome addition to the ZFS file system:

OVERVIEW:

Uncooperative or deceptive hardware, combined with power failures or sudden lack of access to devices, can result in zpools without redundancy being non-importable. ZFS’ copy-on-write and Merkle tree properties will sometimes allow us to recover from these problems. Only ad-hoc means currently exist to take advantage of this recoverability. This proposal aims to rectify that shortcoming.

PROPOSED SOLUTION:

This fast-track proposes two new command line flags each for the ‘zpool clear’ and ‘zpool import’ sub-commands.

Both sub-commands will now accept a ‘-F’ recovery mode flag. When specified, a determination is made whether discarding the last few transactions performed in an unopenable or non-importable pool will return the pool to a usable state. If so, the transactions are irreversibly discarded, and the pool imported. If the pool is usable or already imported and this flag is specified, the flag is ignored and no transactions are discarded.

Both sub-commands will now also accept a ‘-n’ flag. This flag is only meaningful in conjunction with the ‘-F’ flag. When specified, an attempt is made to see if discarding transactions will return the pool to a usable state, but no transactions are actually discarded.
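
To make the two flags concrete, here is a rough Python sketch of how I would expect to use them: check first with ‘-F -n’, then run ‘-F’ for real only if the check looks good. The pool name "tank" and the helper names recovery_command and recover_pool are made up for illustration, and treating a zero exit status from the ‘-n’ run as "recovery is possible" is my assumption; the case text quoted above does not say how the result is reported.

```python
import subprocess


def recovery_command(subcommand, pool, dry_run=True):
    """Build the argv for a recovery-mode 'zpool import' or 'zpool clear'.

    Hypothetical helper: it only assembles the command line described in
    the proposal and does not touch any pool.
    """
    if subcommand not in ("import", "clear"):
        raise ValueError("recovery mode applies only to 'import' and 'clear'")
    argv = ["zpool", subcommand, "-F"]
    if dry_run:
        # -n is only meaningful together with -F: test whether discarding
        # transactions would help, but do not actually discard anything.
        argv.append("-n")
    argv.append(pool)
    return argv


def recover_pool(pool, subcommand="import"):
    """Dry-run the recovery first, then perform it if the dry run succeeds.

    ASSUMPTION: a zero exit status from the -n run means the pool can be
    returned to a usable state. Verify this against the zpool man page on
    your release before relying on it.
    """
    check = subprocess.run(recovery_command(subcommand, pool, dry_run=True))
    if check.returncode != 0:
        return False
    # This step is irreversible: the last few transactions are discarded.
    result = subprocess.run(recovery_command(subcommand, pool, dry_run=False))
    return result.returncode == 0
```

With those helpers, recovery_command("import", "tank") builds the equivalent of ‘zpool import -F -n tank’, and passing dry_run=False drops the ‘-n’. Keeping the dry run as the default seems like the safer choice, since the real ‘-F’ run cannot be undone.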

I have encountered errors where this feature would have been handy, and I will be stoked when it is available in Solaris 10 / Solaris next.

This article was posted by Matty on 2009-09-11 18:46:00 -0400 EDT