Typical hard drives of the mid-1990s.
A hard disk (or "hard disc" or "hard drive" or "hard disk drive") is a computer storage device.
A hard disk uses rigid rotating platters. It stores and retrieves digital data from a planar magnetic surface. Information is written to the disk by transmitting an electromagnetic flux through an antenna or write head that is very close to a magnetic material, which in turn changes its polarization due to the flux. The information can be read back in a reverse manner, as the magnetic fields cause electrical change in the coil or read head that passes over it.
A typical hard disk drive design consists of a central axis or spindle upon which the platters spin at a constant speed. Moving along and between the platters on a common armature are the read-write heads, with one head for each platter face. The armature moves the heads radially across the platters as they spin, allowing each head access to the entirety of the platter.
The associated electronics control the movement of the read-write armature and the rotation of the disk, and perform reads and writes on demand from the disk controller. Modern drive electronics are capable of scheduling reads and writes efficiently across the disk, and of remapping sectors of the disk which have failed.
Also, most major hard drive and motherboard vendors now support S.M.A.R.T. technology, by which impending failures can often be predicted, allowing the user to be alerted in time to prevent data loss.
The (mostly) sealed enclosure protects the drive internals from dust, condensation, and other sources of contamination. The hard disk's read-write heads fly on an air bearing (a cushion of air) only nanometers above the disk surface. The disk surface and the drive's internal environment must therefore be kept immaculately clean, as fingerprints, hair, dust, and even smoke particles have mountain-sized dimensions when compared to the submicroscopic gap that the heads maintain.
Some people believe a disk drive contains a vacuum — this is incorrect, as the system relies on air pressure inside the drive to support the heads at their proper flying height while the disk is in motion. Another common misconception is that a hard drive is totally sealed. A hard disk drive requires a certain range of air pressures in order to operate properly. If the air pressure is too low, the air will not exert enough force on the flying head, the head will not be at the proper height, and there is a risk of head crashes and data loss. (Specially manufactured sealed and pressurized drives are needed for reliable high-altitude operation, above about 10,000 feet. Please note this does not apply to pressurized enclosures, like an airplane cabin.) Some modern drives include flying height sensors to detect if the pressure is too low, and temperature sensors to alert the system to overheating problems.
The inside of a hard disk with the platter removed. To the left is the read-write arm. In the middle the electromagnets of the platter's motor can be seen.
Hard disk drives are not airtight. They have a permeable filter (a breather filter) between the top cover and inside of the drive, to allow the pressure inside and outside the drive to equalize while keeping out dust and dirt. The filter also allows moisture in the air to enter the drive. Very high humidity year-round will cause accelerated wear of the drive's heads (by increasing stiction, or the tendency for the heads to stick to the disk surface, which causes physical damage to the disk and spindle motor). You can see these breather holes on all drives -- they usually have a warning sticker next to them, informing the user not to cover the holes. The air inside the operating drive is constantly moving too, being swept in motion by friction with the spinning disk platters. This air passes through an internal filter to remove any leftover contaminants from manufacture, any particles that may have somehow entered the drive, and any particles generated by head crash.
Due to the extremely close spacing of the heads and disk surface, any contamination of the read-write heads or disk platters can lead to a head crash — a failure of the disk in which the head scrapes across the platter surface, often grinding away the thin magnetic film. For GMR heads in particular, a minor head crash from contamination (that does not remove the magnetic surface of the disk) will still result in the head temporarily overheating, due to friction with the disk surface, and renders the disk unreadable until the head temperature stabilizes. Head crashes can be caused by electronic failure, a sudden power failure, physical shock, wear and tear, or poorly manufactured disks. Normally, when powering down, a hard disk moves its heads to a safe area of the disk, where no data is ever kept (the landing zone). However, especially in old models, sudden power interruptions or a power supply failure can result in the drive shutting down with the heads in the data zone, which increases the risk of data loss. Newer drives are designed such that the rotational inertia in the platters is used to safely park the heads in the case of unexpected power loss. In recent years, IBM pioneered drives with "head unloading" technology, where the heads are lifted off the platters onto "ramps" instead of having them rest on the platters. Other manufacturers have begun using this technology as well.
Spring tension from the head mounting constantly pushes the heads towards the disk. While the disk is spinning, the heads are supported by an air bearing, and experience no physical contact wear. The sliders (the part of the head that is closest to the disk and contains the pickup coil itself) are designed to reliably survive a number of landings and takeoffs from the disk surface, though wear and tear on these microscopic components eventually takes its toll. Most manufacturers design the sliders to survive 50,000 contact cycles before the chance of damage on startup rises above 50%. However, the decay rate is not linear — when a drive is younger and has fewer start/stop cycles, it has a better chance of surviving the next startup than an older, higher-mileage drive (literally, as the head drags along the drive surface until the air bearing is established). For the Maxtor DiamondMax series of drives, for instance, the drive typically has a 0.02% chance of failing after 4,500 cycles, a 0.05% chance after 7,500 cycles, with the chance of failure rising geometrically to 50% after 50,000 cycles, and increasing ever after.
Using rigid platters and sealing the unit allows much tighter tolerances than in a floppy disk. Consequently, hard disks can store much more data than floppy disks, and can access and transmit it faster. In 2004, a typical workstation hard disk might store between 80 GB and 400 GB of data, rotate at 5,400 to 10,000 rpm, and have an average transfer rate of over 30 MB/s. The fastest workstation hard drives spin at 15,000 rpm. Notebook hard drives, which are physically smaller than their desktop counterparts, tend to be slower and have less capacity. Most spin at only 4,200 rpm or 5,400 rpm, though the newest top models spin at 7,200 rpm.
There are three primary factors that determine hard drive performance: seek time, latency and internal data transfer rate:
- Seek time is a measure of the speed with which the drive can position its read/write heads over any particular data track. Because neither the starting position of the head nor the distance from there to the desired track is fixed, seek time varies greatly, and it is almost always measured as an average seek time, though full-track (the longest possible) and track-to-track (the shortest possible) seeks are also quoted sometimes. The standard way to measure seek time is to time a large number of disk accesses to random locations, subtract the latency (see below) and take the mean. Note, however, that two different drives with identical average seek times can display quite different performance characteristics. Seek time is always measured in milliseconds (ms), and often regarded as the single most important determinant of drive performance, though this claim is debated.
- All drives have rotational latency: the time that elapses between the moment when the read/write head settles over the desired data track and the moment when the first byte of the required data appears under the head. For any individual read or write operation, latency is random between zero (if the first data sector happens to be directly under the head at the exact moment that the head is ready to begin reading or writing) and the full rotational period of the drive (for a typical 7200 rpm drive, just under 8.4 ms). However, on average, latency is always equal to one half of the rotational period. Thus, all 5400 rpm drives of any make or model have 5.56 ms latency; all 7200 rpm drives, 4.17 ms; all 10,000 rpm drives, 3.0 ms; and all 15,000 rpm drives have 2.0 ms latency. Like seek time, latency is a critical performance factor and is always measured in milliseconds.
- The internal data rate is the speed with which the drive's internal read channel can transfer data from the magnetic media. (Or, less commonly, in the reverse direction.) Previously a very important factor in drive performance, it remains significant but less so than in prior years, as all modern drives have very high internal data rates. Internal data rates are normally measured in Megabits per second (Mbit/s).
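The latency figures quoted above follow directly from the spindle speed. The short sketch below reproduces them; the function names are illustrative only and are not taken from any drive specification.

```python
def rotational_period_ms(rpm: float) -> float:
    """Time for one full revolution of the platters, in milliseconds."""
    return 60_000.0 / rpm


def average_latency_ms(rpm: float) -> float:
    """Average rotational latency: half of one revolution."""
    return rotational_period_ms(rpm) / 2.0


if __name__ == "__main__":
    for rpm in (5400, 7200, 10_000, 15_000):
        print(f"{rpm:>6} rpm: period {rotational_period_ms(rpm):.2f} ms, "
              f"average latency {average_latency_ms(rpm):.2f} ms")
```

Because the result depends only on the rotation speed, two drives from different makers that spin at the same rate have the same average latency.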
Subsidiary performance factors include:
- Access time is simply the sum of the seek time and the latency. It is important not to mistake seek time figures for access time figures! The access time is by far the most important performance benchmark of a modern HDD. It almost alone defines how fast the disk performs in a typical system. However, people tend to pay much more attention to the data rates, which rarely make any significant difference in typical systems. Of course, in some usage scenarios it may be vice versa, so you need to know your system before buying an HDD.
- The external data rate is the speed with which the drive can transfer data from its buffer to the host computer system. Although in theory this is vital, in practice it is usually a non-issue. It is a relatively trivial matter to design an electronic interface capable of outpacing any possible mechanical read/write mechanism, and it is routine for computer makers to include a hard drive controller interface that is significantly faster than the drive it will be attached to. As a general rule, modern ATA and SCSI interfaces are capable of dealing with at least twice as much data as any single drive can deliver; they are, after all, designed to handle two or more drives per bus even though a desktop computer usually mounts only one. For a single-drive computer, the difference between ATA-100 and ATA-133, for example, is largely one of marketing rather than performance. No drive yet manufactured can utilise the full bandwidth of an ATA-100 interface, and few are able to send more data than an ATA-66 interface can accept. The external data rate is usually measured in Megabytes per second. (MB/s — note the upper-case "B".)
- Command overhead is the time it takes the drive electronics to interpret instructions from the host computer and issue commands to the read/write mechanism. In modern drives it is negligible.
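As a rough illustration of how access time is put together, the sketch below adds an average seek time to the average rotational latency. The 8.5 ms seek figure is a hypothetical value chosen for the example, not a specification of any real drive, and command overhead is ignored because it is described above as negligible.

```python
def average_latency_ms(rpm: float) -> float:
    """Average rotational latency: half of one revolution, in milliseconds."""
    return 60_000.0 / rpm / 2.0


def access_time_ms(avg_seek_ms: float, rpm: float) -> float:
    """Average access time = average seek time + average rotational latency."""
    return avg_seek_ms + average_latency_ms(rpm)


def random_accesses_per_second(avg_seek_ms: float, rpm: float) -> float:
    """Upper bound on random accesses per second, ignoring transfer time."""
    return 1000.0 / access_time_ms(avg_seek_ms, rpm)


if __name__ == "__main__":
    seek_ms = 8.5   # hypothetical average seek time
    rpm = 7200
    print(f"access time: {access_time_ms(seek_ms, rpm):.2f} ms")
    print(f"random accesses/s: {random_accesses_per_second(seek_ms, rpm):.0f}")
```

With these assumed numbers the drive completes fewer than a hundred random accesses per second, which is why access time tends to dominate workloads made up of many small, scattered reads.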
Access and interfaces
A hard disk is generally accessed over one of a number of bus types, including ATA (IDE, EIDE), SCSI, FireWire/IEEE 1394, USB, and Fibre Channel. In late 2002 Serial ATA was introduced.
Back in the days of the ST-506 interface, the data encoding scheme was also important. The first ST-506 disks used Modified Frequency Modulation (MFM) encoding (which is still used on the common "1.44 MB" (1.4 MiB) 3.5-inch floppy), and ran at a data rate of 5 megabits per second. Later on, controllers using 2,7 RLL (or just "RLL") encoding increased this by half, to 7.5 megabits per second; it also increased drive capacity by half.
Many ST-506 interface drives were only certified by the manufacturer to run at the lower MFM data rate, while other models (usually more expensive versions of the same basic drive) were certified to run at the higher RLL data rate. In some cases, the drive was overengineered just enough to allow the MFM-certified model to run at the faster data rate; however, this was often unreliable and was not recommended. (An RLL-certified drive could run on an MFM controller, but with 1/3 less data capacity and speed.)
ESDI also supported multiple data rates (ESDI drives always used 2,7 RLL, but at 10, 15 or 20 megabits per second), but this was usually negotiated automatically by the drive and controller; most of the time, however, 15 or 20 megabit ESDI drives weren't downward compatible (i.e. a 15 or 20 megabit drive wouldn't run on a 10 megabit controller). ESDI drives typically also had jumpers to set the number of sectors per track and (in some cases) sector size.
SCSI originally had just one speed, 5 MHz (for a maximum data rate of 5 megabytes per second), but this was increased dramatically later. The SCSI bus speed had no bearing on the drive's internal speed because of buffering between the SCSI bus and the drive's internal data bus; however, many early drives had very small buffers, and thus had to be reformatted to a different interleave (just like ST-506 drives) when used on slow computers, such as early IBM PC compatibles and Apple Macintoshes.
ATA drives have typically had no problems with interleave or data rate, due to their controller design, but many early models were incompatible with each other and couldn't run in a master/slave setup (two drives on the same cable). This was mostly remedied by the mid-1990s, when ATA's specification was standardised and the details began to be cleaned up, but it still causes problems occasionally (especially with CD-ROM and DVD-ROM drives, and when mixing Ultra DMA and non-UDMA devices). Serial ATA does away with master/slave setups entirely, placing each drive on its own channel (with its own set of I/O ports) instead.
Other characteristics of a hard disk include:
- capacity (measured in gigabytes)
- MTBF (mean time between failures)
- power used (especially important in battery-powered laptops)
- audible noise (in dBA)
- G-shock rating (surprisingly high in modern drives)
There are two modes of addressing the data blocks on more recent hard disks. The older one is CHS (cylinder-head-sector) addressing, used on old ST-506 and ATA drives and internally by the PC BIOS; the more recent one is LBA (logical block addressing), used by SCSI drives and newer ATA drives (ATA drives power up in CHS mode for historical reasons).
CHS describes the disk space in terms of its physical dimensions, data-wise; this is the traditional way of accessing a disk on IBM PC compatible hardware, and while it works well for floppies (for which it was originally designed) and small hard disks, it caused problems when disks started to exceed the design limits of the PC's CHS implementation. The traditional CHS limit was 1024 cylinders, 16 heads and 63 sectors; on a drive with 512-byte sectors, this comes to 504 MiB (528 megabytes). The origin of the CHS limit lies in a combination of the limitations of IBM's BIOS interface (which allowed 1024 cylinders, 256 heads and 64 sectors; sectors were counted from 1, reducing that number to 63, giving an addressing limit of 8064 MiB or just under 8 GiB), and a hardware limitation of the AT's hard disk controller (which allowed up to 65536 cylinders and 256 sectors, but only 16 heads, putting its addressing limit at 2^28 sectors or 128 GiB).
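The capacity limits in this paragraph are simply products of the cylinder, head and sector counts and the sector size. The sketch below works them out, assuming 512-byte sectors throughout; the function and variable names are illustrative only.

```python
SECTOR_BYTES = 512  # assumed sector size, as in the text
MIB = 2**20
GIB = 2**30


def chs_capacity_bytes(cylinders: int, heads: int, sectors_per_track: int,
                       sector_bytes: int = SECTOR_BYTES) -> int:
    """Total addressable bytes for a given CHS geometry."""
    return cylinders * heads * sectors_per_track * sector_bytes


# Intersection of the BIOS and AT controller limits: 1024 x 16 x 63
combined_limit = chs_capacity_bytes(1024, 16, 63)
# BIOS interface alone: 1024 x 256 x 63 (sectors are numbered from 1)
bios_limit = chs_capacity_bytes(1024, 256, 63)
# AT controller alone: 65536 x 16 x 256, i.e. 2**28 sectors
at_controller_limit = chs_capacity_bytes(65536, 16, 256)

if __name__ == "__main__":
    print(f"combined:      {combined_limit // MIB} MiB "
          f"({combined_limit / 10**6:.0f} decimal megabytes)")
    print(f"BIOS only:     {bios_limit // MIB} MiB")
    print(f"AT controller: {at_controller_limit // GIB} GiB")
```

The combined limit takes the smaller value of each field from the two interfaces, which is why it is so much lower than either limit on its own.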
When drives larger than 504 MiB began to appear in the mid-1990s, many system BIOSes had problems communicating with them, requiring LBA BIOS upgrades or special driver software to work correctly. Even after the introduction of LBA, similar limitations reappeared several times over the following years: at 2.1, 4.2, 8.4, 32, and 128 GiB. The 2.1, 4.2 and 32 GiB limits are hard limits: fitting a drive larger than the limit results in a PC that refuses to boot, unless the drive includes special jumpers to make it appear as a smaller capacity. The 8.4 and 128 GiB limits are soft limits: the PC simply ignores the extra capacity and reports a drive of the maximum size it is able to communicate with.
SCSI drives, however, have always used LBA addressing, which describes the disk as a linear, sequentially-numbered set of blocks. SCSI mode page commands can be used to get the physical specifications of the disk, but this is not used to read or write data; this is an artifact of the early days of SCSI, circa 1986, when a disk attached to a SCSI bus could just as well be an ST-506 or ESDI drive attached through a bridge (and therefore having a CHS configuration that was subject to change) as it could a native SCSI device. Because PCs use CHS addressing internally, the BIOS code on PC SCSI host adapters does CHS-to-LBA translation, and provides a set of CHS drive parameters that tries to match the total number of LBA blocks as closely as possible.
ATA drives can either use their native CHS parameters (only on very early drives; hard drives made since the early 1990s use multiple-zone recording, and thus don't have a set number of sectors per track), use a "translated" CHS profile (similar to what SCSI host adapters provide), or run in ATA LBA mode, as specified by ATA-2. To maintain some degree of compatibility with older computers, LBA mode generally has to be requested explicitly by the host computer. ATA drives larger than 8 GiB are always accessed by LBA, due to the 8 GiB limit described above.
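The text does not spell out how a CHS address corresponds to an LBA block number. The sketch below uses the conventional mapping, in which cylinders and heads count from 0 and sectors count from 1. The 16-head, 63-sector geometry is only an example of a translated profile, and the function names are hypothetical rather than taken from any BIOS or drive firmware.

```python
def chs_to_lba(cylinder: int, head: int, sector: int,
               heads: int, sectors_per_track: int) -> int:
    """Convert a CHS address to a linear block number (sectors start at 1)."""
    if cylinder < 0 or not 0 <= head < heads \
            or not 1 <= sector <= sectors_per_track:
        raise ValueError("CHS address outside the given geometry")
    return (cylinder * heads + head) * sectors_per_track + (sector - 1)


def lba_to_chs(lba: int, heads: int, sectors_per_track: int) -> tuple[int, int, int]:
    """Convert a linear block number back to (cylinder, head, sector)."""
    if lba < 0:
        raise ValueError("LBA must be non-negative")
    cylinder, remainder = divmod(lba, heads * sectors_per_track)
    head, sector_index = divmod(remainder, sectors_per_track)
    return cylinder, head, sector_index + 1


if __name__ == "__main__":
    HEADS, SPT = 16, 63  # example translated geometry
    lba = chs_to_lba(2, 5, 10, HEADS, SPT)
    print(lba, lba_to_chs(lba, HEADS, SPT))
```

A host adapter or BIOS that presents translated CHS parameters performs this kind of arithmetic on every request, so the geometry it reports need not resemble the physical layout of the drive.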
See also: hard disk drive partitioning, master boot record, file system, drive letter assignment, boot sector.
Most of the world's hard disks are now manufactured by just a handful of large firms: Seagate, Maxtor, Western Digital, Samsung, and the former drive manufacturing division of IBM, now sold to Hitachi. Fujitsu continues to make specialist notebook and SCSI drives but exited the mass market in 2001. Toshiba is a major manufacturer of 2.5-inch notebook drives.
Firms that have come and gone
Dozens of former hard drive manufacturers have gone out of business, merged, or closed their hard drive divisions; as capacities and demand for products increased, profits became hard to find, and there were shakeouts in the late 1980s and late 1990s. The first notable casualty of the business in the PC era was Computer Memories International or CMI; after the 1985 incident with the faulty 20MB AT drives, CMI's reputation never recovered, and they exited the hard drive business in 1987. Another notable failure was MiniScribe, who went bankrupt in 1990 after it was found that they had "cooked the books" and inflated sales numbers for several years. Many other smaller companies (like Kalok, Microscience, LaPine, Areal, Priam and PrairieTek) also did not survive the shakeout, and had disappeared by 1993; Micropolis was able to hold on until 1997, and JTS, a relative latecomer to the scene, lasted only a few years and was gone by 1999. Rodime was also an important manufacturer during the 1980s, but stopped making drives in the early 1990s amid the shakeout and now concentrates on technology licensing; they hold a number of patents related to 3.5-inch form factor hard drives.
There have also been a number of notable mergers in the hard disk industry:
- Tandon sold its disk manufacturing division to Western Digital (which was then a controller maker and ASIC house) in 1988; by the early 1990s Western Digital disks were among the top sellers.
- Quantum bought DEC's storage division in 1994, and later (2000) sold the hard disk division to Maxtor to concentrate on tape drives.
- In 1995, Conner Peripherals announced a merger with Seagate (who had earlier bought Imprimis from CDC), which completed in early 1996.
- JTS infamously merged with Atari in 1996, giving it the capital it needed to bring its drive range into production.
- In 2003, following the controversy over the mass failures of the Deskstar 75GXP range (which resulted in lost sales of its follow-ons), hard disk pioneer IBM sold the majority of its disk division to Hitachi, who renamed it Hitachi Global Storage Technologies.
"Marketing" capacity versus true capacity
It is important to note that hard drive manufacturers often use the decimal definition of a gigabyte or megabyte. As a result, after the drive is installed it appears that a few gigabytes or megabytes have disappeared. In reality, computers operate on the binary numeral system and usually report capacity in binary units. A decimal gigabyte (10^9 bytes) is about 6.9% smaller than a binary gigabyte (2^30 bytes). The term "1.44 MB" often used to describe 1440 KB floppies (actually 1.47 MB or 1.4 MiB) introduced an anomalous definition of "megabyte" as 1 x 10^3 x 2^10 bytes (1 KKiB).
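The size of the gap between decimal and binary units can be checked with a little arithmetic. The sketch below converts a marketed decimal capacity into binary gigabytes and works through the "1.44 MB" floppy case; the 80 GB drive is just an example and the names are illustrative.

```python
DECIMAL_GB = 10**9   # gigabyte as used on drive labels
BINARY_GB = 2**30    # gigabyte as reported by many operating systems (GiB)


def reported_binary_gb(marketed_gb: float) -> float:
    """Capacity in binary gigabytes for a drive sold as `marketed_gb` decimal GB."""
    return marketed_gb * DECIMAL_GB / BINARY_GB


# Fraction by which a decimal gigabyte falls short of a binary one
shortfall = 1 - DECIMAL_GB / BINARY_GB

# The "1.44 MB" floppy: 1440 units of 1024 bytes
floppy_bytes = 1440 * 1024

if __name__ == "__main__":
    print(f"80 GB drive shows as {reported_binary_gb(80):.1f} binary GB")
    print(f"decimal GB is {shortfall:.1%} smaller than binary GB")
    print(f"floppy: {floppy_bytes / 10**6:.2f} MB, {floppy_bytes / 2**20:.2f} MiB")
```

The shortfall grows with each unit prefix, so it is smaller for megabytes and larger for terabytes than for gigabytes.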
Hard disk usage
Hard disks were originally used singly, one to a computer; techniques such as the redundant array of independent disks (RAID) were later developed to guard against hard disk failure. Hard disks are also found in network attached storage devices, but large volumes of data are most efficiently handled in a storage area network.
The first computer with a hard disk drive as standard was the IBM 305 RAMAC, introduced in 1956 with the IBM 350 Disk File. This drive had fifty 24-inch platters, with a total capacity of five million characters. In 1952, an IBM engineer named Reynold Johnson developed a massive hard disk consisting of fifty platters, each two feet wide, that rotated on a spindle at 1200 rpm with read/write heads for the first database running RCA's Bizmac computer.
In 1973, IBM introduced the 3340 "Winchester" disk system (its 30 MB capacity and 30 millisecond access time led the project to be named after the Winchester 30-30 rifle), the first to use a sealed head/disk assembly (HDA). Almost all modern disk drives now use this technology, and the term "Winchester" became a common description for all hard disks.
For many years, hard disks were large, cumbersome devices, more suited to use in the protected environment of a data center or large office than in a harsh industrial environment (due to their delicacy), or small office or home (due to their size and power consumption). Before the early 1980s, most hard disks had 8-inch or 14-inch platters, required an equipment rack or a large amount of floor space (especially the large removable-media drives, which were often referred to as "washing machines"), and in many cases needed special power hookups for the large motors they used. Because of this, hard disks were not commonly used with microcomputers until after 1980, when Seagate Technology introduced the ST-506, the first 5.25-inch hard drive, with a capacity of 5 megabytes. In fact, in its factory configuration the original IBM PC (IBM 5150) was not equipped with a hard drive.
Most microcomputer hard disk drives in the early 1980s were not sold under their manufacturer's names, but by OEMs as part of larger peripherals (such as the Corvus Disk System and the Apple ProFile). The IBM PC/XT had an internal hard disk, however, and this started a trend toward buying "bare" drives (often by mail order) and installing them directly into a system. Hard disk makers started marketing to end users as well as OEMs, and by the mid-1990s, hard disks had become available on retail store shelves.
While internal drives became the system of choice on PCs, external hard drives remained popular for much longer on the Apple Macintosh and other platforms. Every Mac made between 1986 and 1998 has a SCSI port on the back, making external expansion easy; also, "toaster" Macs did not have easily accessible hard drive bays (or, in the case of the Mac Plus, any hard drive bay at all), so on those models, external SCSI disks were the only reasonable option. External SCSI drives were also popular with older microcomputers such as the Apple II series and the Commodore 64, and were also used extensively in servers, a usage which is still popular today. The appearance in the late 1990s of high-speed external interfaces such as USB and IEEE 1394 (FireWire) has made external disk systems popular among regular users once again, especially for users that move large amounts of data between two or more locations, and most hard disk makers now make their disks available in external cases.
The capacity of hard drives has grown exponentially over time. With early personal computers, a drive with a 20 megabyte capacity was considered large. In the latter half of the 1990s, hard drives with capacities of 1 gigabyte and greater became available. As of early 2005, the "smallest" desktop hard disk in production has a capacity of 40 gigabytes, while the largest-capacity drives approach one half terabyte (500 gigabytes), and are expected to exceed that mark by year's end.