Smartmontools saves the day!

While booting up my x86 laptop this week, I noticed the following errors on the console:

Feb 26 18:16:54 zebox smartd[492]: Device: /dev/ad0, 1 Currently unreadable (pending) sectors
Feb 26 18:16:54 zebox smartd[492]: Device: /dev/ad0, 1 Offline uncorrectable sectors
Feb 26 18:46:55 zebox smartd[492]: Device: /dev/ad0, 1 Currently unreadable (pending) sectors
Feb 26 18:46:55 zebox smartd[492]: Device: /dev/ad0, 1 Offline uncorrectable sectors

Eeeeep, it looks like the disk drive is going bad. To verify this, I ran smartctl against the device:

$ smartctl -a /dev/ad0

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   253   006    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0003   095   095   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       92
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   068   062   030    Pre-fail  Always       -       6508058
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       329
 10 Spin_Retry_Count        0x0013   100   100   034    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       96
187 Unknown_Attribute       0x0032   094   094   000    Old_age   Always       -       6
189 Unknown_Attribute       0x003a   100   100   000    Old_age   Always       -       0
190 Unknown_Attribute       0x0022   065   055   045    Old_age   Always       -       622067747
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       56
193 Load_Cycle_Count        0x0032   090   090   000    Old_age   Always       -       21143
194 Temperature_Celsius     0x0022   035   045   000    Old_age   Always       -       35 (Lifetime Min/Max 0/15)
195 Hardware_ECC_Recovered  0x001a   078   054   000    Old_age   Always       -       177096143
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       1
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always       -       0

Hmmm, the raw value of Seek_Error_Rate looks extremely high, so I ran smartctl a second time to see whether it was climbing:

$ smartctl -a /dev/ad0

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   253   006    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0003   095   095   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       92
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   068   062   030    Pre-fail  Always       -       6508123
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       329
 10 Spin_Retry_Count        0x0013   100   100   034    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       96
187 Unknown_Attribute       0x0032   094   094   000    Old_age   Always       -       6
189 Unknown_Attribute       0x003a   100   100   000    Old_age   Always       -       0
190 Unknown_Attribute       0x0022   065   055   045    Old_age   Always       -       622067747
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       56
193 Load_Cycle_Count        0x0032   090   090   000    Old_age   Always       -       21146
194 Temperature_Celsius     0x0022   035   045   000    Old_age   Always       -       35 (Lifetime Min/Max 0/15)
195 Hardware_ECC_Recovered  0x001a   078   054   000    Old_age   Always       -       177096157
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       1
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always       -       0

Sure enough, the raw value was climbing: it went from 6508058 to 6508123 between the two runs! Since I had just purchased the drive from NewEgg, I gave them a call, and they are going to send me a replacement. Viva la smartmontools!
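
Comparing two smartctl runs by eye is error-prone, so here is a minimal Python sketch of how the two snapshots above could be diffed. It is not something I ran at the time; the function names (`parse_attributes`, `raw_deltas`) are made up for illustration, and it assumes the ten-column attribute table layout shown above, with a plain integer at the start of RAW_VALUE. The sample rows are copied from the two runs.

```python
def parse_attributes(text):
    """Map attribute ID -> (name, raw value) for smartctl-style attribute rows."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # this skips the header row and any surrounding log lines.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        raw = fields[9]  # first token of RAW_VALUE (assumed to be an integer)
        if raw.isdigit():
            attrs[int(fields[0])] = (fields[1], int(raw))
    return attrs


def raw_deltas(before, after):
    """Return {id: (name, delta)} for attributes whose raw value changed."""
    old, new = parse_attributes(before), parse_attributes(after)
    return {
        attr_id: (name, raw - old[attr_id][1])
        for attr_id, (name, raw) in new.items()
        if attr_id in old and raw != old[attr_id][1]
    }


first_run = """\
  7 Seek_Error_Rate         0x000f   068   062   030    Pre-fail  Always       -       6508058
193 Load_Cycle_Count        0x0032   090   090   000    Old_age   Always       -       21143
195 Hardware_ECC_Recovered  0x001a   078   054   000    Old_age   Always       -       177096143
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       1
"""

second_run = """\
  7 Seek_Error_Rate         0x000f   068   062   030    Pre-fail  Always       -       6508123
193 Load_Cycle_Count        0x0032   090   090   000    Old_age   Always       -       21146
195 Hardware_ECC_Recovered  0x001a   078   054   000    Old_age   Always       -       177096157
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       1
"""

for attr_id, (name, delta) in sorted(raw_deltas(first_run, second_run).items()):
    print(f"{attr_id:3d} {name:24s} +{delta}")
```

For these rows, the sketch reports a change of 65 for Seek_Error_Rate, 3 for Load_Cycle_Count, and 14 for Hardware_ECC_Recovered; Current_Pending_Sector stays at 1 and is left out.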

4 thoughts on “Smartmontools saves the day!”

  1. You’re not doing too bad. I just got this. Didn’t even know it could hold numbers that big:

    7 Seek_Error_Rate 0x000f 046 045 030 Pre-fail Always - 10222122622046

  2. Did you notice that your normalised value (68) isn’t even close to the threshold (30)? I think your disk was fine. The man page says if normalised drops below threshold you’re in trouble.

  3. It’s better than a drive in my drive array (which has been running for two months). The 3ware controller lists it as having failed its SMART test (which is true; it needs to be replaced, and I need to write a script):

    7 Seek_Error_Rate 0x000f 025 025 030 Pre-fail Always FAILING_NOW 211960947363771

  4. Doesn’t look too bad to me; those numbers are always climbing.
    My 4-disk array has:
    7 Seek_Error_Rate 0x000f 075 060 030 Pre-fail Always - 4333399490
    7 Seek_Error_Rate 0x000f 075 060 030 Pre-fail Always - 41515339
    7 Seek_Error_Rate 0x000f 075 060 030 Pre-fail Always - 4335866028
    7 Seek_Error_Rate 0x000f 076 060 030 Pre-fail Always - 44854505

    Those disks are all fine. I just purchased a 5th today, and this one is not so great:

    7 Seek_Error_Rate 0x000f 022 022 030 Pre-fail Always FAILING_NOW 55954836087656
    It will be going back to the shop.
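
The comments above turn on one rule: an attribute counts as failing when its normalized VALUE is less than or equal to THRESH, regardless of how large RAW_VALUE is. A minimal Python sketch of that check follows, under the same assumptions about the table layout as the output shown in the post. The helper name `failing_attributes` is hypothetical, and the sample rows are taken from the post and from comments 3 and 4.

```python
def failing_attributes(text):
    """Return (name, value, thresh) for rows whose normalized VALUE <= THRESH."""
    failing = []
    for line in text.splitlines():
        fields = line.split()
        # Expect: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        if not (fields[3].isdigit() and fields[5].isdigit()):
            continue
        value, thresh = int(fields[3]), int(fields[5])
        if value <= thresh:
            failing.append((fields[1], value, thresh))
    return failing


rows = """\
  7 Seek_Error_Rate 0x000f 068 062 030 Pre-fail Always - 6508123
  7 Seek_Error_Rate 0x000f 025 025 030 Pre-fail Always FAILING_NOW 211960947363771
  7 Seek_Error_Rate 0x000f 022 022 030 Pre-fail Always FAILING_NOW 55954836087656
"""

for name, value, thresh in failing_attributes(rows):
    print(f"{name}: normalized value {value} is at or below threshold {thresh}")
```

Only the two rows that smartctl itself marked FAILING_NOW are reported. The row from the post (068 against a threshold of 030) passes this particular check, which is the point comment 2 makes. The check says nothing about the pending and uncorrectable sector counts that smartd logged.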
