[Tech] SMART disk failure

Marco A. Calamari marcoc1@dada.it
Sat 23 Aug 2008 10:37:04 CEST


The 2.5-inch (laptop) disk of a server that has been up for 3 years
 no longer passes the long SMART test and gives the messages below.

Is this a recoverable error, perhaps with a forced write to the
 sector, or is it definitely better to replace the disk?

Any replies are rather urgent.
In the meantime I have saved a Unix dump of the partition
 I care about. For the record, a dd of the same
 partition fails.

Many thanks.   Marco

==============================================

xxx:/mnt/hdb5# smartctl -a /dev/hdb
smartctl version 5.32 Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model:     ST94019A
Serial Number:    3KW5P55X
Firmware Version: 3.05
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   6
ATA Standard is:  ATA/ATAPI-6 T13 1410D revision 2
Local Time is:    Sat Aug 23 10:29:11 2008 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      ( 120) The previous self-test completed having
                                        the read element of the test failed.
Total time to complete Offline
data collection:                 ( 426) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        No General Purpose Logging support.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        (  31) minutes.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   052   047   034    Pre-fail  Always       -       57957539
  3 Spin_Up_Time            0x0003   100   099   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       11
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   085   060   030    Pre-fail  Always       -       352844695
  9 Power_On_Hours          0x0032   070   070   000    Old_age   Always       -       26783
 10 Spin_Retry_Count        0x0013   100   100   034    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       307
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       1328
193 Load_Cycle_Count        0x0032   001   001   000    Old_age   Always       -       1480555
194 Temperature_Celsius     0x0022   047   059   000    Old_age   Always       -       47
195 Hardware_ECC_Recovered  0x001a   052   047   000    Old_age   Always       -       57957539
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       1
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always       -       0
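
A minimal Python sketch of how a few rows of the attribute table above might be read. The choice of which attributes to watch (reallocated, pending and offline-uncorrectable sector counts) is an assumption of this sketch, not something smartctl prescribes, and the helper name parse_rows is hypothetical. The rows are copied from the output above, joined onto single lines.

```python
# Rows copied from the smartctl attribute table in this post (one row per line).
ROWS = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   070   070   000    Old_age   Always       -       26783
193 Load_Cycle_Count        0x0032   001   001   000    Old_age   Always       -       1480555
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       1
"""

# Assumption of this sketch: these raw counters are worth flagging when non-zero.
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")


def parse_rows(text):
    """Hypothetical helper: map attribute name -> normalized and raw values."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # A data row has 10 columns and starts with the numeric attribute ID.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attrs[fields[1]] = {
            "id": int(fields[0]),
            "value": int(fields[3]),
            "worst": int(fields[4]),
            "thresh": int(fields[5]),
            "raw": int(fields[9]),
        }
    return attrs


attrs = parse_rows(ROWS)

# Power-on time expressed as whole days plus leftover hours.
hours = attrs["Power_On_Hours"]["raw"]
days, rem_hours = divmod(hours, 24)

# Watched counters with a non-zero raw value.
nonzero = [name for name in WATCHED if attrs[name]["raw"] > 0]

# Average head load/unload cycles per power-on hour.
load_cycles_per_hour = attrs["Load_Cycle_Count"]["raw"] / hours

print(days, rem_hours, nonzero, round(load_cycles_per_hour, 1))
```

With the values in this post, the raw Power_On_Hours of 26783 works out to 1115 days and 23 hours, consistent with the server having been up for about three years, and the two non-zero counters are Current_Pending_Sector and Offline_Uncorrectable.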

SMART Error Log Version: 1
ATA Error Count: 52 (device log contains only the most recent five errors)
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 52 occurred at disk power-on lifetime: 26777 hours (1115 days + 17 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 d5 5b 77 df f0  Error: UNC 213 sectors at LBA = 0x00df775b = 14645083

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 d8 58 77 df f0 00      02:54:33.682  READ DMA EXT
  25 00 e0 50 77 df f0 00      02:54:33.626  READ DMA EXT
  25 00 e8 48 77 df f0 00      02:54:33.567  READ DMA EXT
  25 00 f0 40 77 df f0 00      02:54:33.511  READ DMA EXT
  25 00 f8 38 77 df f0 00      02:54:33.457  READ DMA EXT

Error 51 occurred at disk power-on lifetime: 26777 hours (1115 days + 17 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 d5 5b 77 df f0  Error: UNC 213 sectors at LBA = 0x00df775b = 14645083

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 e0 50 77 df f0 00      02:54:33.682  READ DMA EXT
  25 00 e8 48 77 df f0 00      02:54:33.626  READ DMA EXT
  25 00 f0 40 77 df f0 00      02:54:33.567  READ DMA EXT
  25 00 f8 38 77 df f0 00      02:54:33.511  READ DMA EXT
  25 00 00 30 77 df f0 00      02:54:33.457  READ DMA EXT

Error 50 occurred at disk power-on lifetime: 26777 hours (1115 days + 17 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 d5 5b 77 df f0  Error: UNC 213 sectors at LBA = 0x00df775b = 14645083

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 e8 48 77 df f0 00      02:54:33.682  READ DMA EXT
  25 00 f0 40 77 df f0 00      02:54:33.626  READ DMA EXT
  25 00 f8 38 77 df f0 00      02:54:33.567  READ DMA EXT
  25 00 00 30 77 df f0 00      02:54:33.511  READ DMA EXT
  25 00 00 30 76 df f0 00      02:54:33.457  READ DMA EXT

Error 49 occurred at disk power-on lifetime: 26777 hours (1115 days + 17 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 d5 5b 77 df f0  Error: UNC 213 sectors at LBA = 0x00df775b = 14645083

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 f0 40 77 df f0 00      02:54:33.682  READ DMA EXT
  25 00 f8 38 77 df f0 00      02:54:33.626  READ DMA EXT
  25 00 00 30 77 df f0 00      02:54:33.567  READ DMA EXT
  25 00 00 30 76 df f0 00      02:54:33.511  READ DMA EXT
  25 00 00 30 75 df f0 00      02:54:33.457  READ DMA EXT

Error 48 occurred at disk power-on lifetime: 26777 hours (1115 days + 17 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 d5 5b 77 df f0  Error: UNC 213 sectors at LBA = 0x00df775b = 14645083

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 f8 38 77 df f0 00      02:54:33.682  READ DMA EXT
  25 00 00 30 77 df f0 00      02:54:33.626  READ DMA EXT
  25 00 00 30 76 df f0 00      02:54:33.567  READ DMA EXT
  25 00 00 30 75 df f0 00      02:54:33.511  READ DMA EXT
  25 00 00 30 74 df f0 00      02:54:33.457  READ DMA EXT
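
The register dumps above can be turned back into sector addresses with a little arithmetic. The Python sketch below assumes the 28-bit layout that these log entries show (SN, CL and CH carry LBA bits 0-23, the low nibble of DH carries bits 24-27) and that any higher LBA bytes are zero; the function name is hypothetical. The register values are taken from Error 52 and from the first command listed under it.

```python
def lba28_from_registers(sn, cl, ch, dh):
    """Combine the logged task-file registers into a 28-bit LBA.

    Assumed layout: SN = bits 0-7, CL = bits 8-15, CH = bits 16-23,
    low nibble of DH = bits 24-27.
    """
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


# Registers after the failed command (Error 52): SN CL CH DH = 5b 77 df f0
error_lba = lba28_from_registers(0x5B, 0x77, 0xDF, 0xF0)

# First command listed under Error 52: SC=d8, SN CL CH DH = 58 77 df f0
start_lba = lba28_from_registers(0x58, 0x77, 0xDF, 0xF0)
sectors_requested = 0xD8

# Sectors read before the drive stopped, and sectors still outstanding.
sectors_done = error_lba - start_lba
sectors_left = sectors_requested - sectors_done

print(error_lba, start_lba, sectors_done, sectors_left)
```

For Error 52 this gives 14645083, the LBA that smartctl prints, and 213 sectors left, which matches the SC register value d5 and the "UNC 213 sectors" text.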

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       80%     26774         14645082
# 2  Extended offline    Completed: read failure       80%     26774         14645076
# 3  Short offline       Completed without error       00%     26774         -
# 4  Extended offline    Completed without error       00%     22661         -
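
To relate the failing LBAs in the self-test log to a partition or filesystem, the usual arithmetic is to subtract the partition's starting sector and scale by the sector and block sizes. The Python sketch below assumes 512-byte sectors. The start sector of /dev/hdb5 and the filesystem block size are not given in this post, so part_start_lba and fs_block_size are hypothetical parameters and the values used in the call are placeholders only.

```python
SECTOR_SIZE = 512  # assumption: 512-byte logical sectors


def byte_offset(lba, sector_size=SECTOR_SIZE):
    """Byte offset of a sector from the start of the whole disk."""
    return lba * sector_size


def fs_block(lba, part_start_lba, fs_block_size, sector_size=SECTOR_SIZE):
    """Filesystem block (counted from the partition start) containing the LBA.

    part_start_lba and fs_block_size are hypothetical inputs; the real values
    would have to come from the partition table and the filesystem.
    """
    if lba < part_start_lba:
        raise ValueError("LBA lies before the start of the partition")
    return (lba - part_start_lba) * sector_size // fs_block_size


# LBA_of_first_error values from the self-test log, plus the error-log LBA.
reported_lbas = [14645082, 14645076, 14645083]

# Number of consecutive sectors that cover every reported LBA.
span = max(reported_lbas) - min(reported_lbas) + 1

offsets = [byte_offset(lba) for lba in reported_lbas]

# Placeholder call: partition start 63 and 4096-byte blocks are NOT from the post.
example_block = fs_block(reported_lbas[0], part_start_lba=63, fs_block_size=4096)

print(span, offsets, example_block)
```

The three addresses reported in this post fall within 8 consecutive sectors, which fits the single pending sector in the attribute table and a few neighbouring reads that fail along with it.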

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.



-- 

+--------------- http://www.winstonsmith.info ---------------+
| il Progetto Winston Smith: scolleghiamo il Grande Fratello |
| the Winston Smith Project: unplug the Big Brother          |
| Marco A. Calamari marcoc@marcoc.it  http://www.marcoc.it   |
| DSS/DH:  8F3E 5BAE 906F B416 9242 1C10 8661 24A9 BFCE 822B |
+ PGP RSA: ED84 3839 6C4D 3FFE 389F 209E 3128 5698 ----------+
