An MD device that went too far


Recently I was approached to help debug a problem with a Linux MD device that wouldn’t start. When I ran raidstart to start the device, it spit out a number of errors on the console, and messages similar to the following were written to the system log:

ide: failed opcode was: unknown
hda: read_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
hda: read_intr: error=0x10 { SectorIdNotFound }, LBAsect=120101895, sector=120101895

After pondering the error for a bit, I began to suspect that the partition table was fubar. My theory proved correct: partition seven (the one associated with the md device that wouldn’t start) had an ending cylinder (7476) that was greater than the number of physical cylinders (7294) on the drive:

$ fdisk -l /dev/hda

Disk /dev/hda: 60.0 GB, 60000000000 bytes
255 heads, 63 sectors/track, 7294 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/hda1               1          32      257008+  fd  Linux raid autodetect
/dev/hda2              33          65      265072+  fd  Linux raid autodetect
/dev/hda3              66         327     2104515   fd  Linux raid autodetect
/dev/hda4             328        7476    57424342+   5  Extended
/dev/hda5             328         458     1052226   fd  Linux raid autodetect
/dev/hda6             459         720     2104483+  fd  Linux raid autodetect
/dev/hda7             721        7476    54267538+  fd  Linux raid autodetect
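
The numbers in the log line up with this: a quick sanity check with the figures from the fdisk output above shows that the failed read sits inside the range the bogus partition claims, but past the last sector the drive actually has.

$ echo $((60000000000 / 512))   # total sectors on the drive
117187500
$ echo $((7476 * 16065))        # last sector implied by an end cylinder of 7476
120101940

The kernel was asked for sector 120101895, which falls inside /dev/hda7 as the partition table describes it, but almost three million sectors beyond the physical end of the disk, hence the SectorIdNotFound errors.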

Once I corrected the end cylinder and recreated the file system, everything worked as expected. I still have no idea how the system got into this state (I didn’t build it), since the installers I tested display an error when you specify an ending cylinder that is larger than the maximum number of cylinders available.
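
For anyone who runs into something similar, the repair itself is quick. The following is a rough sketch rather than a transcript of the actual session; the partition number comes from the fdisk output above, but the md device, the second RAID member, and the choice of ext3 are only examples:

# Delete /dev/hda7 and recreate it with an end cylinder no larger than 7294,
# set its type back to fd (Linux raid autodetect), and write the table out
# (the extended partition /dev/hda4 ends at 7476 too and needs the same fix)
$ fdisk /dev/hda

# Re-initialize the array and lay down a fresh file system
# (/dev/md3 and the second member /dev/hdc7 are placeholders)
$ mdadm --create /dev/md3 --level=1 --raid-devices=2 /dev/hda7 /dev/hdc7
$ mkfs.ext3 /dev/md3

Recreating the array (rather than just assembling it) is deliberate: the version 0.90 superblock lives near the end of the member device, so once the partition boundary moves the old metadata is no longer where the kernel expects to find it.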

This article was posted by Matty on 2007-05-16 19:46:00 -0400