[Tech] SMART disk failure
Marco A. Calamari
marcoc1@dada.it
Sat 23 Aug 2008 10:37:04 CEST
The 2.5-inch (laptop) disk of a server that has been up for 3 years
no longer passes the long SMART test and gives the messages below.
Is this a recoverable error, perhaps with a forced write to the
sector, or is it definitely better to replace the disk?
Any replies are rather urgent.
In the meantime I have saved a Unix dump of the partition
I care about. For the record, a dd of the same
partition fails.
Many thanks. Marco
==============================================
xxx:/mnt/hdb5# smartctl -a /dev/hdb
smartctl version 5.32 Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF INFORMATION SECTION ===
Device Model: ST94019A
Serial Number: 3KW5P55X
Firmware Version: 3.05
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 6
ATA Standard is: ATA/ATAPI-6 T13 1410D revision 2
Local Time is: Sat Aug 23 10:29:11 2008 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      ( 120) The previous self-test completed having
                                        the read element of the test failed.
Total time to complete Offline
data collection:                 ( 426) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        No General Purpose Logging support.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        (  31) minutes.
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   052   047   034    Pre-fail  Always   -           57957539
  3 Spin_Up_Time            0x0003   100   099   000    Pre-fail  Always   -           0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always   -           11
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always   -           0
  7 Seek_Error_Rate         0x000f   085   060   030    Pre-fail  Always   -           352844695
  9 Power_On_Hours          0x0032   070   070   000    Old_age   Always   -           26783
 10 Spin_Retry_Count        0x0013   100   100   034    Pre-fail  Always   -           0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always   -           307
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always   -           1328
193 Load_Cycle_Count        0x0032   001   001   000    Old_age   Always   -           1480555
194 Temperature_Celsius     0x0022   047   059   000    Old_age   Always   -           47
195 Hardware_ECC_Recovered  0x001a   052   047   000    Old_age   Always   -           57957539
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always   -           1
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline  -           1
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always   -           0
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline  -           0
202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always   -           0
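A note on reading the attribute table: the overall-health result is PASSED even though two counters report a bad sector. The Python sketch below illustrates one way to read it. It assumes the usual convention that an attribute counts as failed only when its normalized VALUE falls to or below a non-zero THRESH, and that the raw values of attributes 5, 197 and 198 are plain sector counts. The rows are copied from the table; `review` and `SECTOR_COUNTERS` are illustrative names, not part of smartmontools.

```python
# Selected rows from the attribute table: (id, name, value, worst, thresh, raw)
ATTRIBUTES = [
    (1, "Raw_Read_Error_Rate", 52, 47, 34, 57957539),
    (5, "Reallocated_Sector_Ct", 100, 100, 36, 0),
    (193, "Load_Cycle_Count", 1, 1, 0, 1480555),
    (197, "Current_Pending_Sector", 100, 100, 0, 1),
    (198, "Offline_Uncorrectable", 100, 100, 0, 1),
]

# Assumption: for these IDs the raw value is a count of sectors.
SECTOR_COUNTERS = {5, 197, 198}


def review(attributes):
    """Return (name, reason) pairs for attributes worth a closer look."""
    findings = []
    for attr_id, name, value, worst, thresh, raw in attributes:
        if thresh > 0 and value <= thresh:
            # Normalized value crossed the vendor threshold.
            findings.append((name, "normalized value at or below threshold"))
        elif attr_id in SECTOR_COUNTERS and raw > 0:
            # No threshold crossed, but a sector counter is non-zero.
            findings.append((name, f"raw count is {raw}"))
    return findings


for name, reason in review(ATTRIBUTES):
    print(f"{name}: {reason}")
```

Under these assumptions no attribute has crossed its threshold, which is consistent with the PASSED result, while the pending and offline-uncorrectable counters each point at a single unreadable sector.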
SMART Error Log Version: 1
ATA Error Count: 52 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
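The 49.710-day wrap mentioned in the legend is what a 32-bit millisecond counter would give. The Python sketch below checks that arithmetic and formats a millisecond count in the DDd+hh:mm:SS.sss style described. The 32-bit width is an assumption inferred from the wrap period, `format_powered_up` is a hypothetical helper, and omitting the day prefix when it is zero is a guess based on the timestamps shown in this log.

```python
WRAP_MS = 2 ** 32  # assumed width of the counter: 32 bits, in milliseconds

MS_PER_DAY = 86_400_000
MS_PER_HOUR = 3_600_000
MS_PER_MINUTE = 60_000


def format_powered_up(ms):
    """Format a millisecond count as [DDd+]hh:mm:SS.sss, wrapping at 2**32 ms."""
    ms %= WRAP_MS
    days, rem = divmod(ms, MS_PER_DAY)
    hours, rem = divmod(rem, MS_PER_HOUR)
    minutes, rem = divmod(rem, MS_PER_MINUTE)
    seconds, millis = divmod(rem, 1000)
    prefix = f"{days}d+" if days else ""
    return f"{prefix}{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


# Wrap period in days, rounded to three decimals as in the legend.
print(round(WRAP_MS / MS_PER_DAY, 3))

# 2 h 54 min 33.682 s expressed in milliseconds.
print(format_powered_up(10_473_682))
```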
Error 52 occurred at disk power-on lifetime: 26777 hours (1115 days + 17 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 d5 5b 77 df f0 Error: UNC 213 sectors at LBA = 0x00df775b = 14645083
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 d8 58 77 df f0 00 02:54:33.682 READ DMA EXT
25 00 e0 50 77 df f0 00 02:54:33.626 READ DMA EXT
25 00 e8 48 77 df f0 00 02:54:33.567 READ DMA EXT
25 00 f0 40 77 df f0 00 02:54:33.511 READ DMA EXT
25 00 f8 38 77 df f0 00 02:54:33.457 READ DMA EXT
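The LBA printed on the error line can be recomputed from the register dump. The Python sketch below uses the 28-bit layout (low nibble of DH, then CH, CL, SN), which reproduces the value smartctl printed for Error 52; `lba28_from_registers` is an illustrative helper name. It also checks that the failing sector lies inside the last READ DMA EXT command in the list above.

```python
def lba28_from_registers(sn, cl, ch, dh):
    """Combine ATA task-file registers into a 28-bit LBA."""
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


# Registers after the failed command (Error 52): SC SN CL CH DH = d5 5b 77 df f0
error_lba = lba28_from_registers(sn=0x5B, cl=0x77, ch=0xDF, dh=0xF0)
sectors_left = 0xD5  # SC register: sectors not yet transferred

# Registers of the command that failed: SC SN CL CH DH = d8 58 77 df f0
start_lba = lba28_from_registers(sn=0x58, cl=0x77, ch=0xDF, dh=0xF0)
requested = 0xD8  # SC register: sectors requested

print(hex(error_lba), error_lba)
print("sectors read before the error:", error_lba - start_lba)
print("consistent with SC:", requested - (error_lba - start_lba) == sectors_left)
```

The result is within a few sectors of the LBA_of_first_error values in the self-test log, which suggests a single small damaged area rather than errors scattered across the disk.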
Error 51 occurred at disk power-on lifetime: 26777 hours (1115 days + 17 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 d5 5b 77 df f0 Error: UNC 213 sectors at LBA = 0x00df775b = 14645083
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 e0 50 77 df f0 00 02:54:33.682 READ DMA EXT
25 00 e8 48 77 df f0 00 02:54:33.626 READ DMA EXT
25 00 f0 40 77 df f0 00 02:54:33.567 READ DMA EXT
25 00 f8 38 77 df f0 00 02:54:33.511 READ DMA EXT
25 00 00 30 77 df f0 00 02:54:33.457 READ DMA EXT
Error 50 occurred at disk power-on lifetime: 26777 hours (1115 days + 17 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 d5 5b 77 df f0 Error: UNC 213 sectors at LBA = 0x00df775b = 14645083
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 e8 48 77 df f0 00 02:54:33.682 READ DMA EXT
25 00 f0 40 77 df f0 00 02:54:33.626 READ DMA EXT
25 00 f8 38 77 df f0 00 02:54:33.567 READ DMA EXT
25 00 00 30 77 df f0 00 02:54:33.511 READ DMA EXT
25 00 00 30 76 df f0 00 02:54:33.457 READ DMA EXT
Error 49 occurred at disk power-on lifetime: 26777 hours (1115 days + 17 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 d5 5b 77 df f0 Error: UNC 213 sectors at LBA = 0x00df775b = 14645083
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 f0 40 77 df f0 00 02:54:33.682 READ DMA EXT
25 00 f8 38 77 df f0 00 02:54:33.626 READ DMA EXT
25 00 00 30 77 df f0 00 02:54:33.567 READ DMA EXT
25 00 00 30 76 df f0 00 02:54:33.511 READ DMA EXT
25 00 00 30 75 df f0 00 02:54:33.457 READ DMA EXT
Error 48 occurred at disk power-on lifetime: 26777 hours (1115 days + 17 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 d5 5b 77 df f0 Error: UNC 213 sectors at LBA = 0x00df775b = 14645083
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 f8 38 77 df f0 00 02:54:33.682 READ DMA EXT
25 00 00 30 77 df f0 00 02:54:33.626 READ DMA EXT
25 00 00 30 76 df f0 00 02:54:33.567 READ DMA EXT
25 00 00 30 75 df f0 00 02:54:33.511 READ DMA EXT
25 00 00 30 74 df f0 00 02:54:33.457 READ DMA EXT
SMART Self-test log structure revision number 1
Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure  80%        26774            14645082
# 2  Extended offline    Completed: read failure  80%        26774            14645076
# 3  Short offline       Completed without error  00%        26774            -
# 4  Extended offline    Completed without error  00%        22661            -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
--
+--------------- http://www.winstonsmith.info ---------------+
| il Progetto Winston Smith: scolleghiamo il Grande Fratello |
| the Winston Smith Project: unplug the Big Brother |
| Marco A. Calamari marcoc@marcoc.it http://www.marcoc.it |
| DSS/DH: 8F3E 5BAE 906F B416 9242 1C10 8661 24A9 BFCE 822B |
+ PGP RSA: ED84 3839 6C4D 3FFE 389F 209E 3128 5698 ----------+