I. Executive Summary
There is no question that we live in the age of information. The focus of the world economy has shifted from physical production to data: statistics, facts, figures, numbers and records are highly valued in the business world. As this shift continues, the importance of information to a business escalates. Nearly all of a professional organization's data exists in an electronic format, and as the value and volume of data increase, so does the demand for adequate storage space to house it.
The solution? Data archiving.
Companies have large amounts of data, and a large percentage of it needs to be retained. Only 25% of the data within an organization is freshly created; the rest is redundant data, or data that was created in the past and must be preserved for future reuse.
This situation has created a high demand for information storage, a demand that carries both monetary and logistic concerns. Data archiving allows organizations to efficiently retain this mass of redundant data, often for very long periods of time, so that it can be accessed when necessary.
II. Why Archive?
To put it simply, an organization's electronic data is a valuable asset and needs to remain available over time. In today's fast-paced world, data archives are no longer a luxury. Effective, reliable and affordable archiving technology is quickly becoming a necessity for many content-intensive applications.
Data archives help professional organizations achieve three main goals:
• Protect and retain data for the future
• Meet the world's increasing data storage needs
• Fulfill the world's compliance and legislation requirements
Protect and retain data for the future
From contracts to medical records, employee documents to client communications, organizations expend a great deal of time and energy on creating, managing and maintaining electronic documentation.
Because so much electronic data flows into an organization, and because so much depends on it, a large share of a company's data holds significant value.
Once this electronic data and documentation has been created, it must be managed and protected. In order to safeguard important data, technologies and practices must be in place that enable an organization to a) defend against data loss, destruction and failures, and b) proceed logically and cost effectively in the event that information is exposed to detrimental effects.
Many organizations already have backup systems in place that allow them to save and protect data. So what is the difference between a data archive and a data backup?
A data archive is a storage device that is used to house data for long-term preservation. Data archives store and protect historic data that is not needed on a day-to-day basis, but is important and necessary for future reference, such as information that must be retained for regulatory compliance. Data archives are indexed for search, so files can be located and retrieved quickly and easily. The technologies commonly used for data archives are tape drives, hard disk drives and optical Blu-ray media.
A data backup is a large repository that is used to store multiple copies of data. The backup is used to restore data in case it is corrupted or destroyed. Typically, data backups are used more for storage and less for retrieval. In contrast to data archives, backups are not indexed or designed for swift data location and recovery, and the data has a shorter retention period. The technologies commonly used for data backups are tape drives and hard disk drives.
The two key differences between a data backup and a data archive are the length of data retention and the ability to retrieve the data. Data archives not only offload content from primary, short-term storage to safe, long-term storage, but they also retain information for future use and for frequent recall, all at a low cost. This ensures that important data remains safe, retrievable and easily accessible for decades. Furthermore, as archives streamline data storage and recovery processes, productivity increases.
Meet the world's increasing data storage needs
In the last five years, the amount of electronic data that needs to be retained has outgrown online storage capacities. Currently, there is a gap in data storage capabilities. Data that is stored on local or personal hard drives or online tapes remains static and unprotected.
One significant reason for the lack of adequate storage space is that IT departments have not been able to keep up with the sustained rapid growth of data. Consequently, they have had limited resources to work with in order to efficiently and cost effectively manage current storage needs, as well as create new capacities. As a result, many organizations began meeting their data storage requirements by purchasing extra hard disk space and adding servers. Although effective in the short term, these methods tend to clog up networks and waste valuable online space. In addition, duplicate data is often stored and backed up many times over, eating up storage room that could be better and more efficiently utilized.
The growth of electronic data shows no sign of slowing, and space continues to run short. Hence, the need for additional offline storage becomes crucial for both cost and efficiency reasons. Data archives can fill the gap that this data race has created, allowing companies to increase efficiency, manage old and current data more effectively and reduce storage costs.
The data-storage gap results in:
• Inactive and active data being managed and maintained in one ineffective storage bundle
• Stored data that cannot be recalled or recovered quickly and easily
Data archives can meet storage needs by:
• Moving data from online storage space to offline storage capacities, thereby liberating space for new, more current data
• Reducing storage management costs by implementing affordable, logistical and automated data management systems
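As an illustration of the first point, the following minimal sketch shows how inactive files might be identified as candidates for moving to offline storage. It assumes that "inactive" simply means "not modified within a chosen number of days"; the function name `find_archive_candidates` and the one-year default are hypothetical choices, not part of any particular archiving product.

```python
import os
import time


def find_archive_candidates(root, max_idle_days=365, now=None):
    """Return sorted paths under `root` not modified within `max_idle_days`.

    Assumption: last-modified time is used as a stand-in for "inactive".
    A real archiving policy would likely weigh other criteria as well.
    """
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * 86400  # 86400 seconds per day
    candidates = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) < cutoff:
                candidates.append(path)
    return sorted(candidates)
```

Files returned by such a scan could then be copied to the archive and removed from online storage, freeing space for newer, more current data.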
Fulfill the world's compliance and legislation requirements
Thousands of regulations exist today that require record retention, and many mandatory compliance standards are in place to ensure that original records can be reproduced unaltered. From FDA regulations to Sarbanes-Oxley standards, industries world-wide find themselves under a mountain of highly specific, strict and standardized compliance and legislation requirements. In addition to compliance standards, companies that face tough litigation issues must often produce documents and other information to be used as evidence in courts of law.
Clearly, electronic documents and records play an increasingly important role in these areas: contracts, medical records, e-mails, financial records and images are only a few examples of information that organizations may need to recall under compliance and legislation requirements. As a result, innumerable professional organizations world-wide must have good archiving architecture in place to store and manage their electronic data, and must be able to produce original, authentic information at random and unpredictable times. By nature of their technology, data archives can fulfill this need.
Compliance and legislation require companies to:
• Retain vital information for extended periods of time
• Store and manage original records so that they can be recalled in their initial and unaltered form, and so that unauthorized persons are denied access to such records
• Maintain data so that it can be easily navigated or retrieved
Effective data archives ensure that data is:
• Available for recall for years or decades
• Secure from unauthorized access and modification
• Easily recallable with random access
III. The Technologies of Archival Storage
The three technologies that are currently available for data archiving are:
• Hard disk drives (HDDs)
• Tape drives
• Optical media
Hard disk drives - successes and failures
Hard disk drives are routinely used in online or near-line storage archives, which have a role in nearly every aspect of the digital world as we know it today. People and organizations keep significant amounts of data on various versions of HDDs. MP3 players, cell phones, personal business computers, web applications and corporate storage systems are some examples of where disk drives are used around the world. HDDs are widely used because they are affordable magnetic drives with large individual capacities. These capacities are used to store significant amounts of space-hogging data, such as music, records and video. Although hard disk drives can be an effective means of data storage, they are not the perfect archive. When it comes to HDD archiving, there is good news and bad news.
First, the good news:
• Accessibility. Data on HDD archives can be rapidly accessed. Whether a hard disk drive is being used as primary or secondary storage, data can be retrieved in a matter of seconds.
• Affordability. Over the years, the cost of near-line HDDs has been driven down, making them more affordable in terms of acquisition than higher performance HDDs used in primary storage.
• Compliance. Hard disk drives have the basic capabilities needed to meet compliance and regulatory requirements, ensuring that data can be recalled in its original, intended format.
Now for the bad news:
• Long-term retention? Not likely. It's true that HDDs have the basic archiving capabilities to meet storage needs. However, HDDs are normally only designed for a 3-5 year life span. For organizations that need to reproduce data ten or fifteen years down the road due to compliance and legal regulations, this does not bode well.
• Out with the old. Hard disk drives are not designed for unpowered shelf storage. HDDs are designed to heat up only when powered on, and they tend to fail more rapidly when they are sitting unpowered on a shelf. Therefore, offline management of old information is simply not possible with this form of archive technology.
• What goes up stays up. Power and air conditioning consumption are significant contributors to HDD operating costs. Furthermore, as a result of their short lifespan, migration to new disk drives is necessary every 3-5 years. Throw in the time and money spent on reliability issues and what happens? The organizational and environmental costs go up, just when they need to go down.
• To err is the nature of HDD. Mistakes happen, which is why humans often rely on technology to perform certain automated tasks with both added speed and reliability. So what happens when technology fails? When it comes to HDD archives, this is an important question. While hard disk archives can be made readily accessible, they are also subject to two types of readability errors: operational and latent failures.
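To make the migration burden described above concrete, the following minimal sketch counts how many times an archive would have to be copied onto fresh media during a retention period. It assumes, for simplicity, that each generation of media is used for its full rated life before being replaced; the function name `migrations_needed` is illustrative only.

```python
import math


def migrations_needed(retention_years, media_life_years):
    """Number of media replacements needed to cover a retention period.

    Assumption: each generation of media is kept in service for exactly
    `media_life_years`, and the first generation requires no migration.
    """
    if retention_years <= 0 or media_life_years <= 0:
        raise ValueError("both arguments must be positive")
    generations = math.ceil(retention_years / media_life_years)
    return generations - 1


print(migrations_needed(15, 5))  # 2
print(migrations_needed(15, 3))  # 4
```

Under these assumptions, a 15-year retention requirement implies two to four HDD migrations, depending on whether the drives last five years or three.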
Hard disk drive readability failures
HDDs have a tendency to fail as a result of readability errors, which are either operational or latent in nature. Both types hinder the ability of HDDs to reliably archive data, yet the two failures behave differently.
Operational failures often occur when data cannot be written to the disk drive because the HDD itself has stopped working. Latent failures occur when the disk drive works and data can be written to the HDD, but electronic or mechanical errors prevent the content from being retrieved. Latent errors are the dominant source of errors in HDD archiving.
Latent failures are a little more complex than operational failures, simply because there are several factors that can cause them to occur, and these factors are often lurking, unseen and undetected, until it's too late.
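One general way to surface latent failures before the data is actually needed is to record a checksum when data is written and compare it on a later read. The sketch below illustrates that idea only; it is not a description of how any particular drive or product works. The names `sha256_of` and `find_latent_errors` are hypothetical, and in-memory dictionaries stand in for real storage.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()


def find_latent_errors(stored_blocks, recorded_digests):
    """Return ids of blocks whose contents no longer match their digest.

    `stored_blocks` maps block id -> bytes as currently read back.
    `recorded_digests` maps block id -> digest recorded at write time.
    """
    return sorted(
        block_id
        for block_id, data in stored_blocks.items()
        if sha256_of(data) != recorded_digests[block_id]
    )
```

A block that was written successfully but now reads back differently is reported, which corresponds to the "write worked, retrieval did not" pattern of a latent failure.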
The causes of HDD latent failures
Latent failures cause hard disk drives to be unreadable and unstable.
Causes of latent failures include:
• Thermal instability and self-demagnetization: Even at room temperature, thermal energy slowly disorders the magnetized bits stored on the drive. As a result, thermal decay occurs. Unpowered hard disk drives are more susceptible to areas of data loss due to thermal decay.
• Corrosion: The internal components of the disk drives are subject to corrosion, including the media, motor parts and connectors. The most severe type of corrosion occurs on the media itself. If corrosive sites develop on the disk platter, data loss could result.
• Particulate infiltration or contamination: It is simpler than it sounds. Airborne contamination settles on the disk surface, often rendering it unreadable. This phenomenon can either create a site for possible corrosion or cause data loss directly.
• Out-gassing: Out-gassing usually refers to release of detrimental vapor from the HDD cartridge's internal parts or hard case over time. Out-gassing can deposit detrimental films upon the disk platter, which leads to a loss of space or a severe chemical reaction. This process ultimately results in data loss.
• Adhesive breakdown: Some HDD components, such as the filter and desiccant inside each disk, are mounted with adhesives. This adhesive might break down due to time, temperature or humidity, causing the filter to loosen. In turn, this can cause the internal components to rub or make contact with the disk, resulting in areas of non-recoverable data.
To add insult to injury, there is significant research that shows HDD latent errors increase over time. Near-line HDDs (the most common form of archive HDDs) are more likely to develop latent errors than Enterprise HDDs. In one study, 3.45% of 1.53 million disks developed latent errors, and this percentage increased super-linearly for near-line disk drives. Furthermore, drives that have experienced errors are more likely to develop additional errors in the future.
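To put the study's percentage in absolute terms, the short calculation below converts it into a disk count. It uses only the two figures quoted above; the function name `affected_disks` is illustrative.

```python
def affected_disks(total_disks, error_rate_percent):
    """Approximate number of disks affected, given a percentage rate."""
    return round(total_disks * error_rate_percent / 100)


print(affected_disks(1_530_000, 3.45))  # 52785
```

In other words, roughly 52,785 of the 1.53 million disks in that study developed latent errors, or about one disk in every 29.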
So the question remains, what good is your data if it can't be read? The negative consequences to an organization's time, data and finances as a result of HDD readability issues are plentiful.
When it comes to archiving and hard disk drives, the good does not always outweigh the bad. Due to the increased volumes of electronic data, the ever-growing demands of data archives are beginning to outpace the industry's ability to create adequate HDD storage capacities. The growing amount of digital data stored on the internet, as well as the growing amounts of information being stored on personal computers and servers, has created a demand for additional offline storage capabilities that prevents HDDs from becoming an effective archiving tool. In addition, readability failures, as well as high operational costs and the inability of HDDs to retain data for the long term, significantly compound the issue. The long-term storage deficiencies of hard disk drives make this technology a less than optimal archiving option.
Tape drives - good for backup, bad for archive
A tape drive is a data storage device that uses magnetic tape to read and write data. The tape itself is primarily packaged in a cassette or cartridge, which is then loaded into the drive. Individual drives can be connected to computers via cable connections, such as SATA, USB or FireWire, while multiple tape drives are often housed in autoloaders or large tape libraries. These devices often include built-in barcode readers that identify the tapes and an automated system that loads the tapes into drives, so no human intervention is necessary.
The greatest benefit of tape drives is that they are able to store tremendous amounts of data. Tape drive capacities can range from a few MB to over 100 GB, well exceeding the storage capabilities of hard drives and network storage. However, this benefit comes with one large drawback: access is slow. Tape typically offers sequential data access (versus the random access of disk drives), and access to data on tape can take anywhere from a few seconds to two or three minutes. Despite their bulk and slow retrieval times, tape drives are capable of transferring linear streams of large amounts of data at once. It is for this reason that tape drives are most commonly used for data back-up.
Although the data storage capacity is there, tape drives are not a reliable choice for data archiving needs, as they are not designed to read or write individual files. In addition, tape is fundamentally rewritable, a huge drawback when considering any regulatory and compliance requirements. Compatibility is also not a tape drive's strong suit. Tape standards tend to change every decade, sometimes more frequently than that. This means that almost every tape format is proprietary and not backwards compatible. When efficiency is paramount, updating drives to current standards or adjusting technologies according to different manufacturers and suppliers is not an effective means to an archiving solution.
Here is the good and bad news about tape archives.
The good news:
• Low incremental cost. Most organizations already have some form of a tape drive in place for data back-up purposes, which means little additional cost if the same drive were to be used in place of primary storage. In addition, the power consumption of tape is low, resulting in low operating costs.
• Long-term retention. Old information can be removed and stored off-line, and the average life span of tape is 7-10 years.
The bad news:
• Incompatibility. Because tape technology changes every decade or sooner, tapes that store data now may not be accessible or readable when the next technology turn comes around. The result is more out-of-pocket costs to keep up with tape's demanding equipment and software upgrades.
• Rewritable technology. By design, tape is inherently rewritable. Additional technology is required to force tape drives to be alteration-resistant or WORM (Write Once Read Many). Tape's decade-long lifespan ensures that data is stored for a significant period of time. However, a long storage life does not guarantee that the original data can be recalled unaltered in the future. When it comes to compliance and regulatory issues, forget about it. Tape simply would not stand a chance in a court of law.
• Slow access times. For all intents and purposes, data stored on tape drives is accessible. However, tape is best suited to sequential access. Due to tape's need to spool, data retrieval times lag. Ultimately, slow retrieval time results in an unmanageable and unpredictable archive technology.
• Vulnerability. Although old data can be removed from the drive and stored off-line, tape is vulnerable to electro-magnetic radiation, and it requires regular maintenance to prevent tapes from adhering together. As a result, support staff needs to be on-hand in order to condition the tapes on a consistent basis.
As an archiving technology, the discussion of tape drives is nearly a moot point. Over the years, tape has been a reliable data backup source, one with little user intervention. In addition, tape system speeds have advanced and capacities have grown. However, in the face of recent Internet developments and the current global economy, tape backup and restoration times remain too slow, inhibiting this technology from keeping pace with the growing demand for effective and efficient archives. Furthermore, this technology has been outmatched by the lowered cost and increased availability of hard drive storage. Long story short, the shoe does not fit.
Optical discs - a reliable data archive choice
Most people are familiar with Blu-ray discs as a high-quality way to store video, games and other interactive content. However, this form of optical media is also a superior choice when it comes to data archiving. Optical archives use Blu-ray disc technology to record a wide array of data and store it in near-line or offline capacities. While HDD storage is a good option for short-term information storage, and tape drives are an efficient means of backing up data, these technologies do not consistently meet the most critical needs and requirements of data archives: safe, long-term data retention and easy accessibility. Not only is optical media designed for the demands of today's video industries, it also serves as a durable, reliable and sustainable data archiving solution now and into the future.
The primary benefits of optical media archiving include:
• Reduced risk of data loss
• Reduced storage costs
• Long-term data retention, durability and compatibility
• Low cost, low power and minimal carbon footprint over time
Reduced risk of data loss
Blu-ray media is 100% WORM, meaning that while the original data stored on the disc cannot be altered, it can always be accessed. This is great news when it comes to compliance and legislation requirements. Optical technology's standard, built-in features ensure that information cannot be erased, altered or accessed without proper authorization. The original file will be there when you need it, with its integrity intact, guaranteed.
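The write-once, read-many behavior can be pictured with a small software model. The sketch below is a toy illustration of WORM semantics only; it does not describe how Blu-ray media enforces them physically, and the class name `WormStore` is hypothetical.

```python
class WormStore:
    """Toy write-once, read-many store.

    Each key may be written exactly once; later writes are refused,
    while reads are always allowed.
    """

    def __init__(self):
        self._records = {}

    def write(self, key, data: bytes):
        if key in self._records:
            raise PermissionError(
                f"record {key!r} is already written and cannot be altered"
            )
        self._records[key] = bytes(data)  # store an immutable copy

    def read(self, key) -> bytes:
        return self._records[key]


store = WormStore()
store.write("contract-001", b"original terms")
try:
    store.write("contract-001", b"altered terms")
except PermissionError:
    pass  # the attempted alteration is rejected
print(store.read("contract-001"))  # b'original terms'
```

The property that matters for compliance is visible in the last line: after an attempted overwrite, the record that comes back is still the original.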
Reduced storage costs
A Blu-ray disc is the same physical size as a standard CD or DVD. The differentiating factor between the three types of optical media is that Blu-ray disc technology uses a blue-violet laser, which has a shorter wavelength than the standard red laser used with CDs and DVDs. This enables a standard Blu-ray disc to contain up to 50 GB of data, and disc manufacturers have announced plans to create 100 GB discs in the near future.
In addition, Blu-ray technology has built-in support for multi-layer/extended format features, meaning that as storage capacity requirements grow, so can a single piece of Blu-ray media. In the not-too-distant future, 4-layer Blu-ray storage capacities are expected to accommodate as much as 200 GB of data.
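The effect of per-disc capacity on an archive's physical size can be estimated with simple arithmetic. The sketch below assumes decimal units (1 TB = 1,000 GB) and ignores file-system overhead; the function name `discs_needed` and the 10 TB example archive are illustrative assumptions.

```python
import math


def discs_needed(archive_gb, disc_capacity_gb=50):
    """Minimum number of discs required to hold `archive_gb` gigabytes."""
    if archive_gb < 0 or disc_capacity_gb <= 0:
        raise ValueError("sizes must be non-negative and capacity positive")
    return math.ceil(archive_gb / disc_capacity_gb)


print(discs_needed(10_000))       # 200 discs at 50 GB each
print(discs_needed(10_000, 100))  # 100 discs at 100 GB each
print(discs_needed(10_000, 200))  # 50 discs at 200 GB each
```

For a hypothetical 10 TB archive, each doubling of disc capacity halves the number of discs that must be stored and managed.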
Long-term data retention, durability and compatibility
The average lifespan of a Blu-ray dual-layer 50 GB disc is 20 to 100 years. Two physical factors contribute to the superiority and dependability of optical media.
First, Blu-ray discs contain a protective hard-coating on the outside surface, making scratches and fingerprints a non-issue. In addition to the built-in WORM support, which ensures that the data stored on the disc will not be harmed by internal or external forces, Blu-ray's advanced coating technology helps to protect the disc itself from physical damage. These features create a sort of double-whammy protection effect, shielding data from most dangers.
Second, optical media is created in industry standard formats: ISO 9660 and UDF, both of which are backwards compatible. These formats are supported in all major operating systems: Windows, Linux, UNIX and Mac OS. In other words, optical media is highly compatible, assuring that both now and years ahead, Blu-ray discs can be utilized and read by a standard PC. (Further proof: CDs from 25 years ago are still readable today by a standard PC.) There is no need to worry about the proprietary issues that are associated with tape drives; optical media will not outgrow the technology and standards surrounding it, and vice versa. The long lifespan and highly standardized compatibility of Blu-ray media guarantee that original data can be recalled from storage, and files on the disc can be randomly accessed at any time, without having to spend resources on software and technology upgrades.
Low cost, low power and minimal carbon footprint over time
Over time, Blu-ray discs are more economical than hard disk drives. Blu-ray ownership costs are low because:
• Blu-ray discs can hold a significant amount of data, and have inherent technology that allows additional layers and content to be added at a low cost.
• Blu-ray media has the longest shelf life of all data archive solutions.
• Blu-ray's wide capabilities and platform support drastically reduce the need to migrate to new technology.
The cost benefits of optical media ultimately translate into environmental benefits. Optical media is the data archiving technology with the lowest energy consumption, fulfilling another major requirement of today's data storage needs: environmental sustainability. Here's why:
• Optical media is a passive storage device, requiring no energy over decades of storage. It does not consume any power when it is not being utilized to access data and information.
• The energy consumption required to operate and run Blu-ray media is extremely low due to the use of shared resources.
• Blu-ray media generates very little heat, which in turn means that little to no energy is spent on cooling capabilities.
When it comes to data archiving, optical media storage is a robust and reliable option. First, Blu-ray discs are specially formatted for the wide demands of the ever-growing video industry, a clear advantage over disk drives and tape drives. Second, optical archiving technology is dependable, durable and will be available for the long haul. Furthermore, the low operating costs of optical media ensure that an organization's data archive leaves a small carbon footprint and more money in the bank.
As a result of its robust nature, optical media fully supports the increasing demands of compliance and legislation requirements. These requirements include:
• Long-term record retention
• Reproduction of original, unaltered records
• Quality archiving architecture in place for compliance and legislation
IV. Conclusion
In our digital world, electronic data permeates and dominates business industries across the globe. As our world economy continues to be more information-based, mass amounts of electronic data continue to accrue, and this growth is by no means slowing down. As a result, the world's need for data archiving is ever-growing, placing more and more demand on professional organizations to ensure reliable archiving technology is in place to save and protect their electronic data. Data archives are not a luxury; rather they are a necessity in today's fast-paced society.
The technologies that are currently available to perform archiving tasks are hard disk drives, tape drives and optical Blu-ray media. The question is: Which archiving technology provides the financial and organizational benefits in one trustworthy package? The answer is optical media.
HDDs are effective in the short term, but are prone to operational and latent failures that prove this technology to be an unreliable and unsafe data archiving tool. Tape drives have been exceedingly effective in the data back-up industry, but their high maintenance and technology turnover do not make them an effective data archiving option. Hard disk drive and tape drive technologies do not meet the most critical need: long-term retention. Blu-ray discs are durable, specially formatted for the popular media industry and have a 20 to 100-year life span. Optical media meets all data archival needs while also fulfilling cost and efficiency demands, surpassing the capacities of both tape and disk drives.
© 2010 Rimage Corporation. All rights reserved.
This document is provided for information purposes only and the contents hereof are subject to change without notice. This document is not warranted to be error-free, nor subject to any other warranties or conditions, whether expressed orally or implied in law, including implied warranties and conditions of merchantability or fitness for a particular purpose. We specifically disclaim any liability with respect to this document and no contractual obligations are formed either directly or indirectly by this document. This document may not be reproduced or transmitted in any form or by any means, electronic or mechanical, for any purpose, without our prior written permission. Rimage is a registered trademark of the Rimage Corporation. All other brand or product names are trademarks of their respective owners and are used without intention of infringement.