Recently I was approached to help debug a problem with a Linux MD device that wouldn’t start. When I ran raidstart to start the device, it spit out a number of errors on the console, and messages similar to the following were written to the system log:

ide: failed opcode was: unknown
hda: read_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
hda: read_intr: error=0x10 { SectorIdNotFound }, LBAsect=120101895, sector=120101895

After pondering the error for a bit, it dawned on me that the partition table might be fubar. My theory proved correct, since partition seven (the one associated with the md device that wouldn’t start) had an ending cylinder (7476) that was greater than the number of physical cylinders (7294) on the drive:

$ fdisk -l /dev/hda

Disk /dev/hda: 60.0 GB, 60000000000 bytes
255 heads, 63 sectors/track, 7294 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/hda1   *           1          32      257008+  fd  Linux raid autodetect
/dev/hda2              33          65      265072+  fd  Linux raid autodetect
/dev/hda3              66         327     2104515   fd  Linux raid autodetect
/dev/hda4             328        7476    57424342+   5  Extended
/dev/hda5             328         458     1052226   fd  Linux raid autodetect
/dev/hda6             459         720     2104483+  fd  Linux raid autodetect
/dev/hda7             721        7476    54267538+  fd  Linux raid autodetect

Once I corrected the end cylinder and recreated the file system, everything worked as expected. I have no idea how the system got into this state (I didn’t build the system), since the installers I tested display errors when you specify an ending cylinder that is larger than the maximum number of cylinders available.
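
This kind of check can be scripted. What follows is a minimal sketch rather than anything used in the original debugging session: the function name check_partition_bounds is hypothetical, and it assumes the older cylinder-based fdisk -l layout shown above (newer fdisk releases report sectors instead).

```shell
# Read `fdisk -l` output on stdin and report partitions whose end
# cylinder exceeds the cylinder count from the geometry line.
check_partition_bounds() {
    awk '
        # Geometry line, e.g. "255 heads, 63 sectors/track, 7294 cylinders"
        /heads/ && /cylinders$/ { max = $(NF - 1) }
        # Partition lines; a "*" boot flag shifts the columns by one
        /^\/dev\// {
            end = ($2 == "*") ? $4 : $3
            if (max + 0 > 0 && end + 0 > max + 0)
                printf "%s ends at cylinder %d, but the disk has only %d\n", $1, end, max
        }
    '
}
```

It would be run as "fdisk -l /dev/hda | check_partition_bounds". Fed the table above, it flags every partition that ends at cylinder 7476, which includes the extended partition.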

Posted by matty, filed under Linux Storage. Date: May 16, 2007, 7:46 pm

I recently ran out of swap space on one of my production application servers, and needed to add some additional swap on the fly. Since I didn’t have a spare slice free on the server, I created a 1GB file on my / file system with dd, and then used the mkswap and swapon utilities to create a swap device out of that file:

$ dd if=/dev/zero of=/swap1.swp bs=1024 count=1024K

$ mkswap /swap1.swp

$ swapon /swap1.swp

To verify the new swap device was available, I dumped /proc/swaps:

$ cat /proc/swaps

Filename                                Type            Size    Used    Priority
/dev/hda2                               partition       522104  160     -1
/swap1.swp                              file            1048568 0       -2
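
The arithmetic behind the dd arguments is easy to get wrong, so here is a small sketch of it. The helper names dd_block_count and swap_total_kb are made up for illustration and are not standard utilities; the second one assumes the /proc/swaps column layout shown above.

```shell
# Print the dd "count" needed to write SIZE_MB mebibytes using
# BS-byte blocks (BS defaults to 1024).
dd_block_count() {
    echo $(( $1 * 1024 * 1024 / ${2:-1024} ))
}

# Sum the Size column (in KB) of /proc/swaps-style input on stdin,
# skipping the header line.
swap_total_kb() {
    awk 'NR > 1 { total += $3 } END { print total + 0 }'
}
```

For a 1 GB file written in 1024-byte blocks, "dd_block_count 1024" prints 1048576 (that is, 1024K blocks). Running "swap_total_kb < /proc/swaps" gives the combined size of all active swap devices.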

Sizing swap is easy to do up front, but when a server changes roles, the original swap estimates no longer apply. I am planning to kickstart the server with a different disk layout, which will allow me to allocate a block device of the right size to swap. In the interim, the swap file meets our needs.

Posted by matty, filed under Linux Storage, Linux Utilities. Date: April 12, 2007, 3:14 am

One super useful utility that ships with CentOS 4.4 is the watch utility. Watch allows you to monitor the output from a command at a specific interval, which is especially useful for monitoring array rebuilds. To use watch, you need to run it with a command to watch, and an optional interval to control how often the output from that command is displayed:

$ watch --interval=10 cat mdstat

Every 2.0s: cat mdstat                                                                   Mon Mar  5 22:30:58 2007

Personalities : [raid1] [raid6] [raid5] [raid4]
md1 : active raid1 sdb2[1] sda2[0]
      8385856 blocks [2/2] [UU]

md2 : active raid5 sdg1[5] sdf1[3] sde1[2] sdd1[1] sdc1[0]
      976751616 blocks level 5, 64k chunk, algorithm 2 [5/4] [UUUU_]
      [=>...................]  recovery =  9.8% (24068292/244187904) finish=161.1min speed=22764K/sec

md0 : active raid1 sdb1[1] sda1[0]
      235793920 blocks [2/2] [UU]

unused devices: <none>
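
If only the rebuild progress is of interest, the mdstat text can be filtered down to one line per rebuilding array. This is a rough sketch under the assumption that the recovery line looks like the one above; md_rebuild_progress is a hypothetical helper name.

```shell
# Read /proc/mdstat-style text on stdin and print
# "<array> <percent done> <estimated time left>" for each rebuilding array.
md_rebuild_progress() {
    awk '
        # Remember which array the following detail lines belong to
        /^md[0-9]+ :/ { dev = $1 }
        # Progress line, e.g. "recovery =  9.8% (...) finish=161.1min ..."
        /(recovery|resync) *=/ {
            pct = ""; eta = ""
            for (i = 1; i <= NF; i++) {
                if ($i ~ /%$/) pct = $i
                if ($i ~ /^finish=/) eta = substr($i, 8)
            }
            print dev, pct, eta
        }
    '
}
```

Combined with watch, this could be run as: watch --interval=10 'cat /proc/mdstat' for the full view, or "md_rebuild_progress < /proc/mdstat" from a script that only needs the numbers.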

Posted by matty, filed under Linux Storage, Linux Utilities. Date: March 11, 2007, 12:26 pm

I am running CentOS 4.4 on some old servers, and each of these servers has multiple internal disk drives. Since system availability concerns me more than the amount of storage that is available, I decided to add a hot spare to the md device that stores my data (md2). To add the hot spare, I ran the mdadm utility with the “--add” option, the md device to add the spare to, and the spare device to use:

$ /sbin/mdadm --add /dev/md2 /dev/sdh1
mdadm: added /dev/sdh1

After the spare was added, the device showed up in the /proc/mdstat output with the “(S)” string to indicate that it’s a hot spare:

$ cat /proc/mdstat

Personalities : [raid1] [raid6] [raid5] [raid4]
md1 : active raid1 sdb2[1] sda2[0]
      8385856 blocks [2/2] [UU]
      bitmap: 0/128 pages [0KB], 32KB chunk

md2 : active raid5 sdh1[5](S) sdg1[4] sdf1[3] sde1[2] sdd1[1] sdc1[0]
      976751616 blocks level 5, 64k chunk, algorithm 2 [5/5] [UUUUU]
      bitmap: 3/233 pages [12KB], 512KB chunk

md0 : active raid1 sdb1[1] sda1[0]
      235793920 blocks [2/2] [UU]
      bitmap: 7/225 pages [28KB], 512KB chunk

unused devices: <none>
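
Spotting the “(S)” marker by eye gets tedious with several arrays, so a small filter can list the spares instead. This is only a sketch that assumes the mdstat layout shown above; md_list_spares is a hypothetical name.

```shell
# Read /proc/mdstat-style text on stdin and print "<array> <device>"
# for every member flagged with (S), i.e. every hot spare.
md_list_spares() {
    awk '
        /^md[0-9]+ :/ {
            for (i = 3; i <= NF; i++)
                if ($i ~ /\(S\)$/) {
                    member = $i
                    sub(/\[.*$/, "", member)   # "sdh1[5](S)" becomes "sdh1"
                    print $1, member
                }
        }
    '
}
```

It would be run as "md_list_spares < /proc/mdstat"; an array with no spare produces no output.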

Posted by matty, filed under Linux Storage. Date: March 11, 2007, 12:14 pm

While upgrading my desktop this weekend to Fedora Core 6, I received the following error while attempting to start one of my md arrays:

$ /sbin/mdadm -A /dev/md3 /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh
mdadm: error opening /dev/md3: No such file or directory

To fix the issue, I had to cd into /dev and add some additional md entries with the MAKEDEV executable:

$ cd /dev && ./MAKEDEV md

Once I ran MAKEDEV, mdadm was able to start up the array:

$ /sbin/mdadm -A /dev/md3 /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh
mdadm: /dev/md3 has been started with 6 drives.

*** UPDATE ***

Instead of going through the hassle of running MAKEDEV, it looks like you can also use the mdadm “-a” option:

-a, --auto{=no,yes,md,mdp,part,p}{NN}
Instruct mdadm to create the device file if needed, possibly allocating an unused minor
number. “md” causes a non-partitionable array to be used. “mdp”, “part” or “p” causes
a partitionable array (2.6 and later) to be used. “yes” requires the named md device
to have a ‘standard’ format, and the type and minor number will be determined from
this. See DEVICE NAMES below.
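
Going by the manual text above, the assemble command would gain an “--auto=yes” argument when the device node is missing. The sketch below only prints the command it would run, so it is safe to try. The helper name md_assemble_cmd is hypothetical, and the placement of “--auto=yes” is an assumption that was not tested on the Fedora Core 6 system.

```shell
# Print (rather than run) an mdadm assemble command for MD_DEVICE and its
# member devices, adding --auto=yes only when the device node is missing.
# The body runs in a subshell so the md variable does not leak.
md_assemble_cmd() (
    md=$1
    shift
    if [ -e "$md" ]; then
        echo "/sbin/mdadm --assemble $md $*"
    else
        echo "/sbin/mdadm --assemble --auto=yes $md $*"
    fi
)
```

For example, "md_assemble_cmd /dev/md3 /dev/sdc /dev/sdd" prints the command with “--auto=yes” only if /dev/md3 does not exist yet.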

Posted by matty, filed under Linux Storage. Date: February 11, 2007, 3:13 pm

I manage a fair number of Dell and Sun servers that use LSI Logic RAID controllers. To ensure that a disk failure in one of our servers is quickly located and fixed, I started poking around the web this week to locate a tool that was capable of monitoring our RAID controllers and disk drives. My searches led me to the mpt-status utility, which is an open source tool for monitoring LSI Logic RAID controllers.

Mpt-status is a relatively simple utility, and can be run without any options to report the status of all LSI Logic RAID controllers and the disk drives that live behind those controllers:

$ mpt-status

ioc0 vol_id 0 type IM, 2 phy, 136 GB, state OPTIMAL, flags ENABLED
ioc0 phy 0 scsi_id 0 SEAGATE  ST3146707LC      D704, 136 GB, state ONLINE, flags NONE
ioc0 phy 1 scsi_id 1 SEAGATE  ST3146707LC      D704, 136 GB, state ONLINE, flags NONE

This will print the status of the controller and each disk drive, along with the drive manufacturer, the size of each disk drive, and the SCSI target number. To get similar information in a parseable format, the mpt-status “-s” option can be used:

$ mpt-status -s

log_id 0 OPTIMAL
phys_id 0 ONLINE
phys_id 1 ONLINE

The servers I plan to use mpt-status on run Red Hat Linux, so I created an RPM specification file to assist with building and deploying the package. I also incorporated a RAID controller monitoring script into the RPM, which will install itself into /etc/cron.daily/checklsi.sh, and run daily to check the status of the controllers and disk drives. Viva la monitoring!
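
The checklsi.sh script itself is not reproduced here. As an illustration of the idea only, a daily check could filter the “mpt-status -s” output for anything other than the OPTIMAL and ONLINE states shown above. The function name mpt_find_problems is hypothetical and this is not the script from the RPM.

```shell
# Read `mpt-status -s` output on stdin, print every volume that is not
# OPTIMAL and every disk that is not ONLINE, and return non-zero if any
# such line was found.
mpt_find_problems() {
    awk '
        $1 == "log_id"  && $3 != "OPTIMAL" { print; bad = 1 }
        $1 == "phys_id" && $3 != "ONLINE"  { print; bad = 1 }
        END { exit bad + 0 }
    '
}
```

A cron job could then run "mpt-status -s | mpt_find_problems" and send mail when the exit status is non-zero.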

Posted by matty, filed under Linux Storage, Linux Utilities. Date: February 3, 2007, 9:43 am

EXT3, along with most other file systems, can incur file level fragmentation over time. To see how fragmented a file on an EXT3 file system is, the filefrag utility can be run with the “-v” (verbose) option and the name of a file to check for fragmentation:

$ filefrag -v ick

Checking ick
Filesystem type is: ef53
Filesystem cylinder groups is approximately 3832
Blocksize of file ick is 4096
File size of ick is 115910586 (28299 blocks)
Discontinuity: Block 858 is at 788667 (was 787807)
Discontinuity: Block 2716 is at 790536 (was 790527)
Discontinuity: Block 4754 is at 792592 (was 792575)
Discontinuity: Block 6784 is at 794632 (was 794623)
Discontinuity: Block 23144 is at 820195 (was 811007)
Discontinuity: Block 23149 is at 821452 (was 820199)
ick: 7 extents found, perfection would be 1 extent
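
To survey many files at once, the summary lines that filefrag prints can be filtered by extent count. This is a minimal sketch that assumes the “N extents found” summary format shown above; fragmented_files is a hypothetical helper name.

```shell
# Read filefrag summary lines on stdin and print "<extents> <file>" for
# every file with at least MIN extents (MIN defaults to 2).
fragmented_files() {
    awk -v min="${1:-2}" '
        / extents? found/ {
            line = $0
            sub(/ extents? found.*$/, "", line)   # leaves "name: count"
            count = line
            sub(/^.*: /, "", count)               # keep the trailing number
            name = line
            sub(/: [0-9]+$/, "", name)            # keep the file name
            if (count + 0 >= min + 0) print count, name
        }
    '
}
```

For example, "filefrag /var/log/* | fragmented_files 5" would list only the files with five or more extents (the path is just an illustration).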

The easiest way I have found to reduce file fragmentation is to copy a fragmented file to a new location in the file system hierarchy (e.g., a new directory), and use the copy instead of the original. I really wish there were a tool similar to VxFS’s fsadm utility to defragment files without having to copy them, since copying is impractical for file systems that store lots of data.

Posted by matty, filed under Linux Storage. Date: January 21, 2007, 10:53 am

Most modern drives support DMA, and the Linux IDE driver will use DMA if a device supports it. To check if a device is using DMA on a Linux host, you can cat /proc/ide/piix:

$ cat /proc/ide/piix

Controller: 0

                                Intel PIIX4 Ultra 100 Chipset.
--------------- Primary Channel ---------------- Secondary Channel -------------
                 enabled                          enabled
--------------- drive0 --------- drive1 -------- drive0 ---------- drive1 ------
DMA enabled:    yes              yes             yes               yes
UDMA enabled:   yes              yes             yes               yes
UDMA enabled:   5                5               5                 5
UDMA
DMA
PIO

The output contains a “yes” or “no” to indicate if DMA is enabled, and a line to indicate which DMA mode is in use. If for some reason DMA isn’t being used with a device, you can use hdparm to enable it.
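
The “DMA enabled” row can also be checked from a script. The sketch below assumes the four-column layout shown above (primary drive0 and drive1, then secondary drive0 and drive1); dma_disabled_drives is a hypothetical helper name.

```shell
# Read /proc/ide/piix-style text on stdin and print the channel/drive
# slot of every drive whose "DMA enabled" column is not "yes".
dma_disabled_drives() {
    awk '
        /^DMA enabled:/ {
            split("primary/drive0 primary/drive1 secondary/drive0 secondary/drive1", slot, " ")
            for (i = 1; i <= 4; i++)
                if ($(i + 2) != "yes") print slot[i]
        }
    '
}
```

It would be run as "dma_disabled_drives < /proc/ide/piix". For a slot it reports, something like "hdparm -d1 /dev/hdb" should turn DMA on, where the device name is only an example and depends on which drive was flagged.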

Posted by matty, filed under Linux Storage. Date: January 14, 2007, 9:01 pm
