How To Protect Yourself From Harddisk Crash & Failures

By Angsuman Chakraborty, Gaea News Network
Monday, January 15, 2007

Most modern hard disks have S.M.A.R.T. (Self-Monitoring, Analysis and Reporting Technology) built in which, if enabled, allows you to query the hard drive about its health and performance. Let’s look at some of the critical attributes and how you can use them to determine the health of your hard disk.

Mechanical failures, which are usually predictable, account for 60 percent of drive failures. The purpose of S.M.A.R.T. is to warn a user or system administrator of an impending mechanical drive failure while time remains to take preventive action, such as copying the data to a replacement device or taking regular backups. Approximately 30 percent of failures can be predicted by S.M.A.R.T.

Note: Most modern drives support S.M.A.R.T. However, drives connected via SCSI or hardware RAID will not work. Drives connected via SATA (Serial ATA) are supported, as are drives configured as software RAID (dynamic disks) via Windows Disk Management.

Each drive manufacturer defines a set of attributes and selects threshold values below which those attributes should not fall under normal operation. Attribute values can range from 1 to 253 (1 representing the worst case and 253 representing the best). Depending on the manufacturer, a value of 100 or 200 will often be chosen as the “normal” value.
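The comparison between a normalized value and its threshold can be sketched in a few lines of Python. This is only an illustration: the `SmartAttribute` class, the `status` function and the `margin` of 10 are hypothetical names and numbers chosen for the example, and treating a value equal to the threshold as failing is a conservative assumption rather than something every manufacturer specifies.

```python
from dataclasses import dataclass


@dataclass
class SmartAttribute:
    attr_id: int
    name: str
    value: int      # current normalized value (1 = worst, 253 = best)
    threshold: int  # manufacturer-defined limit the value should stay above


def status(attr: SmartAttribute, margin: int = 10) -> str:
    """Classify an attribute by how close its value is to the threshold.

    `margin` is an arbitrary early-warning band picked for this sketch.
    """
    if not 1 <= attr.value <= 253:
        raise ValueError(f"normalized value out of range: {attr.value}")
    if attr.value <= attr.threshold:
        return "FAILING"
    if attr.value - attr.threshold <= margin:
        return "WARNING"
    return "OK"


if __name__ == "__main__":
    # Made-up numbers, not readings from a real drive.
    sample = SmartAttribute(5, "Reallocated Sectors Count", value=100, threshold=36)
    print(sample.name, status(sample))
```

A real monitoring tool would read the value and threshold from the drive itself; the numbers here are made up.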

S.M.A.R.T. is supported by the majority of hard disk manufacturers including, but not limited to, Samsung, Seagate, IBM (Hitachi), Fujitsu, Maxtor and Western Digital.

They do not necessarily agree on precise attribute definitions and measurement units; therefore the following list of critical attributes should be regarded as a general reference only.

Overview of critical S.M.A.R.T. attributes and their descriptions

ID	Hex	Attribute name	Description
01	01	Read Error Rate	Indicates the rate of hardware read errors that occurred when reading data from a disk surface. Lower values indicate a problem with either disk surface or read/write heads.
05	05	Reallocated Sectors Count	Count of reallocated sectors. When the hard drive finds a read/write/verification error, it marks this sector as “reallocated” and transfers data to a special reserved area (spare area). This process is also known as remapping and “reallocated” sectors are called remaps. This is why, on modern hard disks, you cannot see “bad blocks” while testing the surface: all bad blocks are hidden in reallocated sectors. However, the more sectors that are reallocated, the more read/write speed will decrease.
06	06	Read Channel Margin	Margin of a channel while reading data. The function of this attribute is not specified.
196	C4	Reallocation Event Count	Count of remap operations. The raw value of this attribute shows the total number of attempts to transfer data from reallocated sectors to a spare area. Both successful and unsuccessful attempts are counted.
197	C5	Current Pending Sector Count	Number of “unstable” sectors (waiting to be remapped). If the unstable sector is subsequently written or read successfully, this value is decreased and the sector is not remapped. Read errors on the sector will not remap it; it will only be remapped on a failed write attempt. This can be problematic to test because cached writes will not remap the sector; only direct I/O writes to the disk will.
198	C6	Uncorrectable Sector Count	The total number of uncorrectable errors when reading/writing a sector. A rise in the value of this attribute indicates defects of the disk surface and/or problems in the mechanical subsystem.
220	DC	Disk Shift	Distance the disk has shifted relative to the spindle (usually due to shock). Unit of measure is unknown.
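As a rough illustration of how the table above might be used in a script, the sketch below keys the attribute names by decimal ID, derives the hex column, and lists the counter-style attributes whose raw value is non-zero. The names `CRITICAL_ATTRIBUTES`, `COUNT_ATTRIBUTES`, `label` and `attributes_to_watch` are hypothetical, and the rule "any non-zero raw count deserves a look" is an assumption made for the example, not a manufacturer guideline.

```python
# Critical attributes from the table above, keyed by decimal ID.
CRITICAL_ATTRIBUTES = {
    1: "Read Error Rate",
    5: "Reallocated Sectors Count",
    6: "Read Channel Margin",
    196: "Reallocation Event Count",
    197: "Current Pending Sector Count",
    198: "Uncorrectable Sector Count",
    220: "Disk Shift",
}

# Assumption for this sketch: these raw values are plain counters,
# so anything above zero is worth a closer look.
COUNT_ATTRIBUTES = {5, 196, 197, 198}


def label(attr_id: int) -> str:
    """Format an attribute as 'ID (0xHEX) name', mirroring the table columns."""
    name = CRITICAL_ATTRIBUTES.get(attr_id, "Unknown attribute")
    return f"{attr_id:02d} (0x{attr_id:02X}) {name}"


def attributes_to_watch(raw_values: dict) -> list:
    """Return labels of counter attributes whose raw value is non-zero."""
    return [
        label(attr_id)
        for attr_id in sorted(raw_values)
        if attr_id in COUNT_ATTRIBUTES and raw_values[attr_id] > 0
    ]


if __name__ == "__main__":
    # Made-up raw values, not readings from a real drive.
    for line in attributes_to_watch({1: 0, 5: 8, 197: 0, 198: 3}):
        print(line)
```

Because manufacturers do not agree on raw-value units, a check like this is best treated as a prompt to investigate rather than a verdict.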

There are several free and commercial tools available to determine the health of your hard disk.

I prefer HDD Health from Panterasoft. It provides a detailed listing of detected S.M.A.R.T. attributes, including ones it couldn’t decipher. You can use the table above to understand what each parameter means for your hard disks. It can also notify you by email, network message, popup or sound before an impending hard disk failure.

A safe corporate strategy is to use S.M.A.R.T. to manage the hard disks across all your machines: use a S.M.A.R.T.-aware tool to get centrally notified of impending failures and prepare for contingencies. Monitoring the server machines is of critical importance. For Linux servers I would recommend smartmontools and utilities based on it, such as those that provide a web view.
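On a Linux server with smartmontools installed, the attribute table printed by `smartctl -A` can be collected by a script and forwarded to whatever central notification you use. The sketch below is one possible approach, not an official smartmontools interface: the function names are hypothetical, the sample text is a hand-written imitation of the usual column layout, and the parser assumes every attribute row has the ten standard columns with numeric VALUE, WORST and THRESH fields.

```python
import subprocess

# Hand-written sample in the usual `smartctl -A` column layout (not real output).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   100   006    Pre-fail  Always       -       0
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       8
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       2
"""


def parse_smartctl_attributes(text: str) -> dict:
    """Parse attribute rows into {id: {name, value, worst, thresh, raw}}."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit():
            attrs[int(fields[0])] = {
                "name": fields[1],
                "value": int(fields[3]),
                "worst": int(fields[4]),
                "thresh": int(fields[5]),
                "raw": " ".join(fields[9:]),
            }
    return attrs


def read_attributes(device: str) -> dict:
    """Run `smartctl -A` on a device (usually needs root) and parse the result."""
    result = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True, check=False
    )
    return parse_smartctl_attributes(result.stdout)


if __name__ == "__main__":
    for attr_id, info in parse_smartctl_attributes(SAMPLE).items():
        print(attr_id, info["name"], info["value"], info["thresh"], info["raw"])
```

To query a real drive you would call something like `read_attributes("/dev/sda")`; the device path is an example and depends on your system.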
