Backup

In information technology, backup refers to making copies of data so that these additional copies may be used to restore the original after a data loss event. These additional copies are typically called "backups." Backups are useful primarily for two purposes. The first is to restore a computer to an operational state following a disaster (called disaster recovery). The second is to restore small numbers of files after they have been accidentally deleted or corrupted.[1] Backups are typically the last line of defense against data loss, and consequently the least granular and the least convenient to use.[2]


Since a backup system contains at least one copy of all data worth saving, the data storage requirements are considerable. Organizing this storage space and managing the backup process is a complicated undertaking. A data repository model can be used to provide structure to the storage. In the modern era of computing there are many different types of data storage devices that are useful for making backups. There are also many different ways in which these devices can be arranged to provide geographic redundancy, data security, and portability.


Before data is ever sent to its storage location, it is selected, extracted, and manipulated. Many different techniques have been developed to optimize the backup procedure. These include optimizations for dealing with open files and live data sources as well as compression, encryption, and de-duplication, among others. Many organizations and individuals would also like to have some confidence that the process is working as expected and work to define measurements and validation techniques. It is also important to recognize the limitations and human factors involved in any backup scheme.


Due to a considerable overlap in technology, backups and backup systems are frequently confused with archives and fault-tolerant systems. Backups differ from archives in the sense that archives are the primary copy of data and backups are a secondary copy of data. Backup systems differ from fault-tolerant systems in the sense that backup systems assume that a fault will cause a data loss event and fault-tolerant systems assume a fault will not.

Storage, the base of a backup system

Data repository models

Any backup strategy starts with a concept of a data repository. The backup data needs to be stored somehow and probably should be organized to a degree. It can be as simple as a sheet of paper with a list of all backup tapes and the dates they were written or a more sophisticated setup with a computerized index, catalog, or relational database. Different repository models have different advantages. This is closely related to choosing a backup rotation scheme.

Unstructured 
An unstructured repository may simply be a stack of floppy disks or CD-R media with minimal information about what was backed up and when. This is the easiest to implement, but probably the least likely to achieve a high level of recoverability.
Full + Incrementals 
A Full + Incremental repository aims to make storing several copies of the source data more feasible. At first, a full backup (of all files) is taken. After that, an incremental backup (of only the files that have changed since the previous full or incremental backup) can be taken. Restoring a whole system to a certain point in time requires locating the last full backup taken before that time and all the incremental backups taken between that full backup and the point in time to which the system is to be restored. This model offers a high level of assurance that something can be restored and can be used with removable media such as tapes and optical discs. The downside is dealing with a long series of incrementals and the high storage requirements.[3]
Full + Differential 
A full + differential backup differs from a full + incremental in that after the full backup is taken, each partial backup captures all files created or changed since the full backup, even though some may have been included in a previous partial backup. Its advantage is that a restore involves recovering only the last full backup and then overlaying it with the last differential backup.[4]
Mirror + Reverse Incrementals
A Mirror + Reverse Incrementals repository is similar to a Full + Incrementals repository. The difference is that instead of an aging full backup followed by a series of incrementals, this model offers a mirror that reflects the system state as of the last backup and a history of reverse incrementals. One benefit is that it requires only an initial full backup. Each incremental backup is immediately applied to the mirror, and the files it replaces are moved to a reverse incremental. This model is not suited to removable media, since every backup must be done in comparison to the mirror.
Continuous data protection 
This model takes it a step further: instead of scheduling periodic backups, the system immediately logs every change on the host system. This is generally done by saving byte-level or block-level differences rather than file-level differences.[5] It differs from simple disk mirroring in that the log can be rolled back, which allows an older image of the data to be restored.
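
The difference between restoring from incrementals and restoring from differentials can be made concrete with a small sketch. The code below is only an illustration under stated assumptions: the `Backup` record and the `restore_chain` function are hypothetical names, times are plain integers, and a repository is assumed to use either incrementals or differentials after a full backup, not a mix of both.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    kind: str   # "full", "incremental", or "differential"
    time: int   # when the backup was taken (any increasing number)


def restore_chain(backups, target_time):
    """Return the backups to apply, oldest first, to restore to target_time."""
    usable = sorted((b for b in backups if b.time <= target_time),
                    key=lambda b: b.time)
    fulls = [b for b in usable if b.kind == "full"]
    if not fulls:
        raise ValueError("no full backup exists at or before the target time")

    base = fulls[-1]                       # most recent full backup
    later = [b for b in usable if b.time > base.time]

    chain = [base]
    differentials = [b for b in later if b.kind == "differential"]
    if differentials:
        # A differential holds everything changed since the full backup,
        # so only the most recent one is needed.
        chain.append(differentials[-1])
    else:
        # Each incremental holds only the changes since the previous backup,
        # so every one of them must be applied in order.
        chain.extend(b for b in later if b.kind == "incremental")
    return chain
```

With one full backup followed by three incrementals, a restore to the time of the third incremental needs all four backups; with three differentials instead, it needs only the full backup and the last differential.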


Storage media

Regardless of the repository model that is used, the data has to be stored on some data storage medium somewhere.

Magnetic tape 
Magnetic tape has long been the most commonly used medium for bulk data storage, backup, archiving, and interchange. Tape has typically had an order of magnitude better capacity/price ratio when compared to hard disk, but recently the ratios for tape and hard disk have become a lot closer.[6] There are myriad formats, many of which are proprietary or specific to certain markets like mainframes or a particular brand of personal computer. Tape is a sequential access medium, so even though access times may be poor, the rate of continuously writing or reading data can actually be very fast. Some new tape drives are even faster than modern hard disks.
Hard disk 
The capacity/price ratio of hard disk has been rapidly improving for many years. This is making it more competitive with magnetic tape as a bulk storage medium. The main advantages of hard disk storage are low access times, availability, capacity and ease of use.[7] External disks can be connected via local interfaces like SCSI, USB or FireWire, or via longer distance technologies like Ethernet, iSCSI, or Fibre Channel. Some disk-based backup systems, such as Virtual Tape Libraries, support data de-duplication which can dramatically reduce the amount of disk storage capacity consumed by daily and weekly backup data.
Optical disc 
A recordable CD can be used as a backup device. One advantage of CDs is that they can be read on any machine with a CD-ROM drive, and recordable CDs are relatively cheap. Another common format is recordable DVD. Many optical disc formats are WORM (write once, read many) type, which makes them useful for archival purposes since the data can't be changed. Rewritable formats such as CD-RW or DVD-RAM can also be used. The newer HD DVD and Blu-ray Disc formats dramatically increase the amount of data possible on a single optical disc, though, as yet, the hardware may be cost-prohibitive for many people.
Floppy disk 
During the 1980s and early 1990s, many personal/home computer users associated backup mostly with copying floppy disks. The low data capacity of a floppy disk makes it an unpopular and obsolete choice today.[8]
Solid state storage 
Also known as flash memory, thumb drives, USB flash drives, CompactFlash, SmartMedia, Memory Stick, Secure Digital cards, etc., these devices are relatively costly for their low capacity, but offer excellent portability and ease-of-use.
Remote backup service 
As broadband internet access becomes more widespread, remote backup services are gaining in popularity. Backing up via the internet to a remote location can protect against some worst-case scenarios, such as a fire, flood or earthquake that destroys any local backups along with everything else. A drawback of a remote backup service is that an internet connection is usually substantially slower than local data storage devices, which can be a problem for people with large amounts of data. It also carries the risk associated with putting control of personal or sensitive data in the hands of a third party.
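
The data de-duplication mentioned above for disk-based backup systems can be illustrated with a toy content-addressed store. This is a minimal sketch, not a description of how any particular product works: the `DedupStore` class is a hypothetical name, and fixed-size chunks identified by their SHA-256 digest are an assumption made for simplicity.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: identical chunks are kept only once."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}   # SHA-256 hex digest -> chunk bytes

    def put(self, data: bytes):
        """Store data and return the list of chunk digests needed to rebuild it."""
        recipe = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            # A chunk already seen in an earlier backup is not stored again.
            self.chunks.setdefault(digest, chunk)
            recipe.append(digest)
        return recipe

    def get(self, recipe):
        """Rebuild the original data from its list of chunk digests."""
        return b"".join(self.chunks[digest] for digest in recipe)

    def stored_bytes(self):
        """Total size of the unique chunks actually held."""
        return sum(len(chunk) for chunk in self.chunks.values())
```

If two daily backups differ in only one chunk, the second backup adds just that one chunk to the store, while both days can still be restored in full from their digest lists.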


Managing the data repository

Regardless of the data repository model or data storage media used for backups, a balance needs to be struck between accessibility, security and cost.

On-line 
On-line backup storage is typically the most accessible type of data storage and can begin a restore within milliseconds. A good example would be an internal hard disk or a disk array (possibly connected to a SAN). This type of storage is very convenient and speedy, but is relatively expensive. On-line storage is vulnerable to being deleted or overwritten, either by accident or in the wake of a data-deleting virus payload.
Near-line 
Near-line storage is typically less accessible and less expensive than on-line storage, but still useful for backup data storage. A good example would be a tape library with restore times ranging from seconds to a few minutes. A mechanical device is usually involved in moving media units from storage into a drive where the data can be read or written.
Off-line 
Off-line storage is similar to near-line, except it requires human interaction to make storage media available. This can be as simple as storing backup tapes in a file cabinet. Media access time is more than an hour.
Off-site vault 
To protect against a disaster or other site-specific problem, many people choose to send backup media to an off-site vault. The vault can be as simple as the system administrator's home office or as sophisticated as a disaster-hardened, temperature-controlled, high-security bunker that has facilities for backup media storage.
Backup site, Disaster Recovery Center or DR Center
In the event of a disaster, the data on backup media will not be sufficient to recover. Computer systems onto which the data can be restored and properly configured networks are necessary too. Some organizations have their own data recovery centers that are equipped for this scenario. Other organizations contract this out to a third-party recovery center. Because a DR site is itself a huge investment, backup is very rarely considered the preferred method of moving data to it. A more typical approach is remote disk mirroring, which keeps the DR data as up to date as possible.


Selection, access, and manipulation of data

Approaches to backing up files

Deciding what to back up at any given time is a harder process than it seems. By backing up too much redundant data, the data repository will fill up too quickly. If we don't back up enough data, critical information can get lost. The key concept is to only back up files that have changed.
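
One common way to find the files that have changed, assuming the filesystem records a modification time for each file, is to compare that time with the time of the last backup. The sketch below is only an illustration: `files_changed_since` is a hypothetical helper, and it trusts modification times, which can be wrong if clocks are adjusted or a program resets them.

```python
import os


def files_changed_since(root, last_backup_time):
    """Yield paths under root modified after last_backup_time (seconds since the epoch)."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.stat(path).st_mtime
            except OSError:
                continue  # file vanished or is unreadable; skipped in this sketch
            if mtime > last_backup_time:
                yield path
```

A backup run would record its own start time and pass it as `last_backup_time` on the next run, so that files modified while the backup was in progress are picked up again.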

Copying files 
Copy the files to be backed up to another location using the OS specific copy utility.
Filesystem dump 
Copy the filesystem that holds the files in question to another location. This usually involves unmounting the filesystem and running a program like dump. This is also known as a raw partition backup. This type of backup has the possibility of running faster than a backup that simply copies files. A feature of some dump software is the ability to restore specific files from the dump image.
Identification of changes 
Some filesystems have an archive bit for each file that says it was recently changed. Some backup software looks at the date of the file and compares it with the last backup, to determine whether the file was changed.
Block Level Incremental 
A more sophisticated method of backing up changes to files is to only back up the blocks within the file that changed. This requires a higher level of integration between the filesystem and the backup software.
Versioning file system 
A versioning filesystem keeps track of all changes to a file and makes those changes accessible to the user. Generally this gives access to any previous version, all the way back to the file's creation time. An example of this is Wayback for the Linux OS.[9]
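
The block-level incremental idea described above can be sketched without filesystem integration by comparing per-block hashes against those recorded at the last backup. This is an assumption made for illustration: it requires reading the whole file, which real block-level products avoid by tracking changes inside the filesystem. The names `block_digests`, `changed_blocks`, `apply_delta` and the 4096-byte block size are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for this sketch


def block_digests(data: bytes, block_size: int = BLOCK_SIZE):
    """Return one SHA-256 digest per fixed-size block of data."""
    return [hashlib.sha256(data[i:i + block_size]).digest()
            for i in range(0, len(data), block_size)]


def changed_blocks(old_digests, new_data, block_size=BLOCK_SIZE):
    """Return {block_index: block_bytes} for blocks that differ from the last backup."""
    delta = {}
    for index, digest in enumerate(block_digests(new_data, block_size)):
        if index >= len(old_digests) or old_digests[index] != digest:
            start = index * block_size
            delta[index] = new_data[start:start + block_size]
    return delta


def apply_delta(old_data, delta, new_length, block_size=BLOCK_SIZE):
    """Rebuild the new file from the old copy plus the changed blocks."""
    # Start from the old contents, truncated or zero-padded to the new length.
    buf = bytearray(old_data[:new_length].ljust(new_length, b"\0"))
    for index, block in delta.items():
        start = index * block_size
        buf[start:start + len(block)] = block
    return bytes(buf)
```

Only the blocks in `delta`, plus the new file length, need to be written to the backup medium; the restore side combines them with the previous copy.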


Approaches to backing up live data

If a computer system is in use while it is being backed up, the possibility of files being open for reading or writing is real. If a file is open, the contents on disk may not correctly represent what the owner of the file intends. This is especially true for database files of all kinds.


When attempting to understand the logistics of backing up open files, one must consider that the backup process could take several minutes to back up a large file such as a database. In order to back up a file that is in use, it is vital that the entire backup represent a single-moment snapshot of the file, rather than a simple copy of a read-through. This represents a challenge when backing up a file that is constantly changing. Either the database file must be locked to prevent changes, or a method must be implemented to ensure that the original snapshot is preserved long enough to be copied, all while changes are being preserved. If a file is backed up while it is being changed, so that the first part of the backup holds data from before a change and later parts hold data from after it, the result is a corrupted, unusable file, because most large files contain internal references between their various parts that must remain consistent throughout the file.
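
The problem can be shown with a toy model. In the sketch below, which is purely illustrative, a hypothetical `ToyDatabase` has a header holding a record count and a list of records, and the two must agree. A read-through copy that is interrupted by a write ends up with a header from before the change and records from after it, while a copy taken from a frozen image stays consistent. The `deepcopy` call merely stands in for a lock or a storage-level snapshot.

```python
import copy


class ToyDatabase:
    """A 'file' with two parts that must agree: a record count and the records."""

    def __init__(self):
        self.header = {"count": 0}
        self.records = []

    def insert(self, record):
        self.records.append(record)
        self.header["count"] = len(self.records)


def is_consistent(header, records):
    """The internal reference (count) must match the data it describes."""
    return header["count"] == len(records)


def read_through_copy(db, write_during_copy):
    """Copy part by part while the database keeps changing."""
    header = dict(db.header)      # first part is copied before the change
    write_during_copy(db)         # a user writes in the middle of the backup
    records = list(db.records)    # later part is copied after the change
    return header, records


def snapshot_copy(db, write_during_copy):
    """Freeze a single-moment image first, then copy from the frozen image."""
    frozen = copy.deepcopy(db)    # stands in for a lock or a storage snapshot
    write_during_copy(db)         # later writes do not affect the frozen image
    return dict(frozen.header), list(frozen.records)
```

Calling `read_through_copy` with a function that inserts a record yields a header and record list that fail `is_consistent`, whereas `snapshot_copy` under the same write yields a pair that passes.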

Snapshot backup 
A snapshot is an instantaneous function of some storage systems that presents a copy of the filesystem as if it were frozen at a specific point in time, often by a copy-on-write mechanism. An effective way to back up live data is to temporarily quiesce it (e.g. close all files), take a snapshot, and then resume live operations. At this point the snapshot can be backed up through normal methods.[10] While a snapshot is very handy for viewing a filesystem at a different point in time, it is hardly an effective backup mechanism by itself.
Open file backup - file locking 
Many backup software packages feature the ability to back up open files. Some simply check whether a file is open and try again later.
Cold database backup 
During a cold backup the database is closed or locked and not available to users. All files of the database are copied (image copy). The datafiles do not change during the copy so the database is in sync upon restore. [11]
Hot database backup 
Some database management%
The Wikipedia article included on this page is licensed under the GFDL.