Using smartd on Solaris systems to proactively find disk drive problems


In my article Out S.M.A.R.T your hard drive, I talked about how smartctl can be used to monitor the SMART attributes in modern disk drives. One item I didn’t cover was smartd. This nifty little daemon can automatically run drive self-tests, check for SMART attributes that have changed, and report errors recorded in the drive’s self-test and error logs. If the daemon finds a problem, it will log a message to syslog and send an e-mail if configured to do so.

To use smartd to monitor SCSI and ATA disks on a Solaris system, you will need to create a configuration file similar to the following (if a configuration file is not present, smartd will attempt to find all local disk and tape devices by scanning /dev/rdsk and /dev/rmt):

$ cat smartd.conf

/dev/rdsk/c0t0d0s0 -o on -d ata -S on -a -m admin@prefetch.net
/dev/rdsk/c0t2d0s0 -o on -d ata -S on -a -m admin@prefetch.net

The configuration file contains one line per device, and the directives on each line control how that device is monitored. In this example, “-o on” enables the drive’s automatic offline testing, “-d ata” tells smartd that the device is an ATA disk, “-S on” enables attribute autosave, “-a” causes smartd to monitor the device’s SMART health status, attribute changes, and new entries in the self-test and error logs, and “-m” gives the e-mail address that alerts are sent to when a problem is detected.
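When several disks share the same directives, the lines can be generated rather than typed by hand. The following is a minimal sketch, not part of smartmontools: make_smartd_conf is a hypothetical helper name, and it assumes every disk is an ATA device under /dev/rdsk that should get the same options used above.

```shell
# Hypothetical helper: print one smartd.conf line per disk.
# Usage: make_smartd_conf EMAIL DISK [DISK ...]
# Assumes every disk is an ATA device under /dev/rdsk.
make_smartd_conf() {
    email=$1
    shift
    for disk in "$@"; do
        printf '/dev/rdsk/%s -o on -d ata -S on -a -m %s\n' "$disk" "$email"
    done
}

# Reproduces the two lines from the example configuration file.
make_smartd_conf admin@prefetch.net c0t0d0s0 c0t2d0s0
```

Redirecting the output to smartd.conf gives the same file shown earlier. Disks that need different directives are better edited by hand.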

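To verify that the configuration file is formatted correctly and error free, smartd can be run with the “-d” (debug) option, which keeps smartd in the foreground and prints its status messages to the terminal instead of sending them to syslog: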

$ smartd -d

smartd version 5.33 [sparc-sun-solaris2.10] Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

Opened configuration file /usr/local/smartmontools-5.33/etc/smartd.conf
Configuration file /usr/local/smartmontools-5.33/etc/smartd.conf parsed.
Device: /dev/rdsk/c0t0d0s0, opened
Device: /dev/rdsk/c0t0d0s0, found in smartd database.
Device: /dev/rdsk/c0t0d0s0, enabled SMART Attribute Autosave.
Device: /dev/rdsk/c0t0d0s0, is SMART capable. Adding to "monitor" list.
Device: /dev/rdsk/c0t2d0s0, opened
Device: /dev/rdsk/c0t2d0s0, found in smartd database.
Device: /dev/rdsk/c0t2d0s0, enabled SMART Attribute Autosave.
Device: /dev/rdsk/c0t2d0s0, is SMART capable. Adding to "monitor" list.
Monitoring 2 ATA and 0 SCSI devices
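If you want to script this check, the final summary line is the easiest thing to test. Here is a small sketch that assumes the debug output has been captured to a file or pipe and that the summary line has exactly the format shown above. count_monitored is a hypothetical name, not a smartmontools command.

```shell
# Hypothetical helper: read smartd debug output on stdin and print
# "<ATA count> <SCSI count>" taken from the "Monitoring ..." summary line.
# Prints nothing if no summary line is present.
count_monitored() {
    sed -n 's/^Monitoring \([0-9][0-9]*\) ATA and \([0-9][0-9]*\) SCSI devices.*/\1 \2/p'
}

# Example using the summary line from the output above.
echo 'Monitoring 2 ATA and 0 SCSI devices' | count_monitored   # prints "2 0"
```

Comparing the printed counts with the number of device lines in smartd.conf is a quick way to spot a disk that smartd refused to monitor.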

If everything checks out, you can start smartd from a terminal window (ideally this should go into an rc script or an SMF manifest):

$ smartd -i 86400 &

This will cause smartd to become a daemon, and the “-i 86400” will cause smartd to check the devices’ SMART health every 24 hours (86400 seconds). If smartd detects an error, it will e-mail the address passed to the “-m” option and log a message to syslog using the daemon facility. For additional smartd options, you can wander over to the smartmontools website. smartd is the cat’s meow!
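For the rc script route mentioned above, something along these lines could serve as a starting point. It is only a sketch: the smartd path is an assumption inferred from the configuration file location in the debug output, smartd_rc is a made-up function name, and the stop action relies on the pkill utility that ships with Solaris.

```shell
#!/bin/sh
# Sketch of an init script for smartd.
# The default path below is an assumption based on where the debug output
# found smartd.conf; adjust it to match your installation.
SMARTD=${SMARTD:-/usr/local/smartmontools-5.33/sbin/smartd}
INTERVAL=86400   # seconds between checks (24 hours)

smartd_rc() {
    case "$1" in
    start)
        if [ -x "$SMARTD" ]; then
            # smartd puts itself in the background, so no "&" is needed.
            "$SMARTD" -i "$INTERVAL"
        else
            echo "smartd not found at $SMARTD" >&2
            return 1
        fi
        ;;
    stop)
        pkill -x smartd
        ;;
    *)
        echo "Usage: smartd_rc {start|stop}" >&2
        return 1
        ;;
    esac
}

# An installed script would end by dispatching on its first argument:
# smartd_rc "$1"
```

On Solaris 10 an SMF manifest is the more native option, but a script like this is enough to get smartd started at boot.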

This article was posted by Matty on 2006-01-05 14:04:00 -0400