FIELD GUIDE / DIGITAL ARCHIVING FOR INDEPENDENT FILMMAKERS
last update: July 8 2016
2 HARD DISK DRIVE
3 DISK ON A SHELF
4 GUIDE TO ARCHIVING WITH DISKS
5 SPINNING DISK
6 TAPE
7 LTO-TAPE FORMAT
8 PRICE/CAPACITY OVERVIEW
9 GUIDE FOR ARCHIVING WITH LTO-TAPE
10 LITERATURE/FURTHER READING
The file-based independent film is in danger of being lost after only a few years. Assets are generally stored on external hard disk drives alone, and the details of their limited reliability are not sufficiently communicated by the manufacturers.
Compared to the size of their businesses, filmmakers probably produce more data than any other field. How to maintain data was widely unknown before film moved to purely file-based workflows. The Academy commented on the issue of digital film preservation in 2007 under the unfortunate headline "digital dilemma", claiming that "there is no digital archival master format or process with longevity characteristics equivalent to that of film." Yet it is obvious that a radically different technology has different properties and needs different preservation techniques. The IT sector has known how to preserve data for more than 60 years. IT professionals use two technologies to store large data sets, often deployed in combination: spinning-disk systems for instant access and tape for cold, infrequently used data. As argued later in this article, LTO tape is the best choice for individual filmmakers and offers orders of magnitude higher data protection than the external disk on the shelf.
Chapter DATA MANAGEMENT describes the practical details.
2 HARD DISK DRIVE
The life span of an individual hard disk drive varies greatly depending on manufacturer and model, but also on usage patterns. Contrary to what common practice suggests, HDDs are not designed to sit on a shelf, and their complex electrical and mechanical nature presents various points of failure. Commercial data centers usually do not publish statistics on disk failure rates; publicly available are only the dated Google and CMU studies from 2007 and the regularly updated Backblaze statistics. Backup provider Backblaze found that, averaged over all drive models they use, a drive's failure rate more or less follows the bathtub curve known from engineering: elevated infant mortality, a relatively constant phase of random failures, and a significantly increased failure rate towards the end of a drive's life due to wear and tear.
Backblaze's statistics show: for the first 1.5 years, drives fail at 5.1% per year. For the next 1.5 years, drives fail less, at about 1.4% per year. After 3 years, however, failure rates skyrocket to 11.8% per year. Backblaze also states that HGST drives last longer and do not really follow the bathtub curve, but fail at a low constant rate of 1-2% per year, which confirms the anecdotal knowledge about the reliability of HGST (formerly Hitachi, formerly IBM).
The Google study provides similar failure rates: 1.7% for the first year, rising to over 8.6% for three-year-old drives.
The equally dated CMU study reports annual failure rates from 0.5% to as high as 13.5%, averaging 3% over five years. Overall, the study observed a rather constant wear-out across the drive populations.
The MTBF (mean time between failures) values manufacturers claim for a model are supposed to give a rough estimate of the average life span of a drive. In reality, drives last less than half as long: Google and the CMU study calculated MTBF values 50-70% lower than claimed by the vendors.
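An MTBF figure translates into an annualized failure rate (AFR) by simple division, a close approximation as long as the MTBF is far above one year. A minimal sketch:

```python
# Convert a spec-sheet MTBF into the annualized failure rate it
# actually implies for a large drive population. The division is an
# approximation of AFR = 1 - exp(-hours_per_year / MTBF), accurate
# when MTBF is much larger than one year.
HOURS_PER_YEAR = 8766  # 365.25 days

def afr_from_mtbf(mtbf_hours):
    """Approximate annualized failure rate for a given MTBF in hours."""
    return HOURS_PER_YEAR / mtbf_hours

# A claimed MTBF of 1,000,000 hours sounds like "114 years per drive",
# but it really means: expect roughly 0.9% of a large fleet to fail
# per year - before applying the 50-70% discount the studies found.
print(f"{afr_from_mtbf(1_000_000):.2%}")
```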
After collecting sufficient data, Backblaze calculated in early 2016 a median life of 6 years for their drive population, meaning that after 6 years 50% of the drives had died. This roughly coincides with the common knowledge of an average HDD life of 5 years, and with the longest warranties found on the market, also 5 years. It also means that drives in use need to be replaced much earlier; often suggested is the 3-year mark.
Note: like most independent filmmakers, Backblaze and Google actually use primarily consumer drives. All three studies found that enterprise models offer only slightly higher reliability, at least for their specific applications. Their proprietary software can deal with consumer drives; a hardware RAID for an editing station, on the other hand, depends on at least entry-level SATA enterprise drives to work smoothly.
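The phase rates above can be turned into a rough survival curve. A minimal sketch, assuming the annual failure rate is constant within each bathtub phase (a simplification: Backblaze's observed median life of 6 years suggests the real wear-out rate keeps climbing past 11.8%):

```python
# Rough survival model from Backblaze's piecewise annual failure
# rates: 5.1%/yr for years 0-1.5, 1.4%/yr for years 1.5-3,
# 11.8%/yr thereafter. Phases beyond the data are extrapolated.
def survival(years,
             rates=((1.5, 0.051), (1.5, 0.014), (float("inf"), 0.118))):
    """Fraction of drives still alive after `years`, assuming a
    constant annual failure rate within each (duration, afr) phase."""
    alive = 1.0
    for duration, afr in rates:
        t = min(years, duration)
        if t <= 0:
            break
        alive *= (1.0 - afr) ** t
        years -= t
    return alive

for y in (1, 3, 5, 6):
    print(f"after {y} year(s): {survival(y):.1%} of drives still alive")
```

This simple model still leaves well over half the drives alive at year 6, so the published bathtub rates are best read as phase averages, not a complete life model.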
3 DISK ON A SHELF
How do the results of the studies translate from a data center to the drive on a shelf?
Unfortunately, no reliability statistics exist for drives with very little on-time, since this usage is not practiced in enterprise environments. This alone should disqualify the strategy, but for data sets of only several TB it is the cheapest and sometimes the only technically or financially feasible solution. Some facts nevertheless give an indication of the reliability of data on off-line drives and provide hints on how to optimize it:
The best argument for archiving on individual hard drives is that an off-line drive experiences no wear and tear. Drive failure occurs more often in the mechanical parts than in the electronics, so less on-time might mean a longer life. Google found little correlation between workload and failure rate, but this likely does not extend to off-line drives with no workload at all.
4 GUIDE TO ARCHIVING WITH DISKS
1 Drives should have passed their infant mortality phase; the drive(s) used for editing might be ideal. Large organizations do a burn-in to filter out drives with significant manufacturing defects. For new drives, as a minimum, do a full format that scans all sectors (Windows: deselect quick format; OSX: Disk Utility, Erase tab).
2 Create at least 3 copies on different drive models, or on one model bought from different shops, to avoid consecutive serial numbers and a bad batch.
3 Create one checksum file for the entire disk/project directory.
4 Verify all data transfers with checksums. Without checksum verification, transfer errors and data decay remain undetected.
5 Separate copies geographically.
6 Data on magnetic media like HDDs degrades over time, mostly due to magnetic thermal relaxation: the magnetized zones that hold the individual bits lose signal strength. Modern HDDs might start to lose information after 1 year. Verifying the checksums at an interval of less than a year alerts the drive's firmware to sectors with a low signal and refreshes them. If a bit is already corrupted, HDDs offer a limited internal error correction. Advanced file systems like btrfs can provide additional data protection.
7 Mechanical parts of a drive might fail if it is never switched on at all. Regular verification keeps the bearings moving, until eventually the bearing oil evaporates.
8 Migrate data at least every 5 years or before the warranty runs out. Migrate file formats if they threaten to become obsolete.
9 External HDDs add interface electronics and a primitive power supply to the points of failure. It is better to use bare drives with an internal drive bay or, if that is not possible, a docking station. Bare drives also save storage space.
10 Which drive is built into an external case is not always clear. LaCie is owned by Seagate and therefore likely uses Seagate HDDs, which are less reliable than HGST drives in the Backblaze statistics. G-Tech, a brand of HGST (which is owned by WD), has declared that they, at least in part, use HGST drives. HGST offers drives in external cases as well.
11 Store master or at least a DCP at a public film archive. For German filmmakers:
Bundesarchiv-Filmarchiv (publicly funded productions are bound to do so)
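Steps 3 and 4 above can be sketched in a few lines of Python using only the standard library. The file layout (one `checksums.md5` per project directory, in md5sum style) and the directory names are illustrative assumptions, not a prescribed format:

```python
import hashlib
from pathlib import Path

def hash_file(path, algo="md5", chunk=1 << 20):
    """Stream the file in 1 MiB chunks so large DPX sequences never
    have to fit in memory."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

def write_checksums(project_dir, out_name="checksums.md5"):
    """One checksum file for the entire project directory (step 3)."""
    root = Path(project_dir)
    lines = [f"{hash_file(p)}  {p.relative_to(root)}"
             for p in sorted(root.rglob("*"))
             if p.is_file() and p.name != out_name]
    (root / out_name).write_text("\n".join(lines) + "\n")

def verify_checksums(project_dir, out_name="checksums.md5"):
    """Re-hash every listed file and return the mismatches (step 4).
    An empty list means the copy is intact."""
    root = Path(project_dir)
    bad = []
    for line in (root / out_name).read_text().splitlines():
        if not line:
            continue
        digest, rel = line.split(maxsplit=1)
        if hash_file(root / rel) != digest:
            bad.append(rel)
    return bad
```

Running `verify_checksums` against every copy at least once a year implements the refresh interval of step 6; dedicated tools (md5deep, MHL-based utilities) do the same job with more conveniences.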
Consider that the entire value of a production goes into data stored on pieces of 100-Euro consumer electronics, which are likely designed to be just reliable enough to produce optimal revenue for the manufacturer.
5 SPINNING DISK
Reliable storage with hard-drive technology exists only in redundant arrays of individual disks spinning 24/7, where several disks can fail without loss of data. A spinning-disk system requires the constant presence of IT professionals who monitor it and feed in replacement parts on failures. It is an on-line storage system that provides instant access.
A production might use a typical 8-drive SAS RAID for an editing station. RAIDs offer higher productivity through improved availability and performance over individual drives, but they do not replace additional backup copies, let alone function as an archive. The CMU study confirmed that once one drive in a RAID fails, the probability of another drive failing explodes. An unattended simple RAID-5 system will likely experience complete data loss; for about 10 years now, RAID-5 has no longer been a recommended configuration.
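The RAID-5 warning can be made concrete with the usual back-of-envelope calculation around unrecoverable read errors (URE). The URE spec of one error per 10^14 bits is the common consumer-drive figure, and the array size is an illustrative assumption:

```python
# Sketch of why large RAID-5 arrays are risky: after one drive
# fails, rebuilding requires reading every remaining drive end to
# end, and on a simple controller a single unrecoverable read error
# (URE) aborts the rebuild. 1e-14 per bit is the typical
# consumer-drive URE spec; enterprise drives quote 1e-15.
def rebuild_failure_probability(data_to_read_tb, ure_per_bit=1e-14):
    bits = data_to_read_tb * 1e12 * 8      # TB -> bits
    p_clean = (1.0 - ure_per_bit) ** bits  # every bit must read back cleanly
    return 1.0 - p_clean

# Illustrative 8-drive RAID-5 of 2 TB drives: a rebuild reads the
# 7 surviving drives, i.e. 14 TB - roughly a 2-in-3 chance of hitting
# a URE before the rebuild completes.
print(f"{rebuild_failure_probability(14):.0%}")
```

The arithmetic ignores correlated failures and controller-specific recovery behavior, so real-world odds vary; the point is only that the risk grows with array size, which is why RAID-6 or better is recommended instead.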
6 TAPE
Magnetic tape was the first mass-storage technology and has been, since the 1950s, the standard technology for storing data that is archived or infrequently accessed. Tape is an off-line storage medium, disconnected from electrical, software or human errors, and offers the highest data protection for large data sets.
Today tape comes in the form of hard-plastic cartridges: a simple mechanical construction consisting of a spool of magnetic tape, which makes it much less prone to failure than an HDD. It is designed to sit on a shelf. Compared to a disk drive, a tape offers orders of magnitude higher reliability, needs no electricity and takes less space. In terms of cost per capacity, tape is also more economical than spinning-disk storage at almost any scale.
At CERN (Geneva), only a few hundred megabytes of the roughly 100 petabytes held on tape are lost on average every year. Of the 50 petabytes of data held on hard disk, however, CERN loses a few hundred terabytes in the same period. (Alberto Pace, head of data and storage at CERN, in the Economist)
Tape is a linear storage medium and not designed for random access, but it offers a higher sequential read/write speed than an individual disk drive.
Tape technology and its reliability are well understood; other technologies play no role in the market. IT professionals must think conservatively and would not deploy a technology without a proven record of reliability. Tape can fail as well, but by far the highest risk to a cartridge is human error: dropping it. If a tape fails, recovery is usually simpler than with a disk: a physically damaged section can be cut out and the tape spliced back together.
7 LTO-TAPE FORMAT
The Linear Tape Open format dominates the tape market with an over-85% share, from small-business to enterprise applications, including filmmakers, studios and post-production companies. Its success can be attributed to the facts that it is an open standard, that it offers a compromise between reliability and cost, and to the commitment of the major players IBM, Hewlett-Packard and Quantum to the format. CERN in Geneva, for example, uses more reliable enterprise tape formats to store its experiments' data, but the costs of these systems are too high for filmmakers and most other applications.
The LTO consortium pledges that a drive is always backwards-compatible for two generations, and that old drives will remain in production to allow access to previous tape generations. About every 3 years, a new LTO generation offers roughly twice the capacity of the previous one.
The LTFS file system, introduced in 2010 for the LTO format, allows access to a tape like any other drive, without any 3rd-party software.
LTO drives come in internal and external versions, both with a SAS interface, compatible with Linux, Windows and OSX. The internal version has the size of a 5¼-inch half-height drive, like a Blu-ray drive, and fits in any workstation case, except the latest Mac Pro. External drives with the less reliable consumer interfaces USB 3 and Thunderbolt are also available.
The SAS interface is found only in workstations, but simple SAS controllers can be installed in most computers, again except the new Mac Pro. Note that not all SAS controllers support tape drives.
One characteristic of the professional SAS interface is that its connectors have a latch or screw that prevents connection errors and accidental disconnects, in contrast to consumer interfaces like USB, Thunderbolt, FireWire or eSATA. For mobile use of an external SAS drive, adapters to Thunderbolt or the ExpressCard slot are available.
8 PRICE/CAPACITY OVERVIEW
[Table: net prices (€) of LTO cartridges and drives (internal HH SAS) compared with external HGST HDDs; figures not recoverable]
Creating a tape, including checksum verification, costs time; post-houses might charge tens of cents per GB to produce one tape, so owning a drive pays for itself quickly. LTO-6 is currently (5/2016) the recommended option. Since LTO-6, cartridges use BaFe-based magnetic material, which, unlike previous generations, protects the tape from oxidation and thereby increases data stability. LTO-7 becomes interesting at the end of 2016, when possible early issues should be resolved and media prices might have dropped.
Apple users: the current Mac Pro cannot be upgraded with any internal drive or SAS controller. External LTO-7 Thunderbolt or USB 3 solutions can cost over 5000 dollars. Additional software is also needed, since the Finder does not work well with tape drives.
9 GUIDE FOR ARCHIVING WITH LTO-TAPE
1 Use LTFS (uncompressed).
2 Create 2-4 copies with a running serial number for a catalog.
3 Archive at least the master (10-bit DPX or 16-bit TIFF sequences, DCDM).
4 Archive additional material: stills, posters, scripts, ...
5 Verify all data transfers with checksums (one checksum file per tape).
6 Verify data integrity regularly.
7 Separate copies geographically.
8 Store master or at least a DCP at a public film archive. For German filmmakers:
Bundesarchiv-Filmarchiv (publicly funded productions are bound to do so).
9 Migrate data at least every 10 years or every 3rd (LTO) generation. Migrate file formats if they threaten to become obsolete.
10 Store cartridges vertically, in a stable climate, ideally at 16-25 °C and 20-50% humidity (LTO-5 specification); monitor temperature and humidity with data loggers if possible. Protect cartridges from magnetic fields, shock, vibration and direct sunlight.
@1 Media files are often already compressed, and recovery of compressed data after an error is much harder. Activate drive compression only for uncompressed files (DPX, TIFF), or compress manually with the RAR archiver and add a "recovery record" to compensate for the greater impact of possible data loss.
@5 When using LTFS, the drive's built-in write verification is not enabled; verification has to be done manually with checksums, like any other file transfer. 3rd-party software is needed for OSX, but can also simplify the workflow on Windows and access by any post-production facility.
@6 Data degradation, and determining the optimal verification interval, is the biggest problem of digital preservation and has significant financial implications. Enterprises seem to verify tape anywhere between every 1 and 12 months; there is no published data backing these intervals. Also, the physical attributes of every new tape generation change. LTO manufacturers even claim a shelf life of 30 years for their media without specifically addressing the issue all magnetic media suffer from: thermal relaxation, the technology-inherent physical cause of the decay of magnetically stored information. The archival temperature largely determines the magnetic stability of the data.
More frequent spot checks on newly archived tapes, with the verification interval lengthening over time, seem a plausible approach to the problem. Also, more copies decrease the risk of data loss at a later verification date.
@10 Less-than-ideal storage conditions can also be compensated for with more copies. As mentioned, LTO-6 and later generations offer more resilience in non-ideal climates.
@9 Migration after some years is a necessity, to prevent obsolescence of media and file formats. Upgrading every 3rd LTO generation still allows access to all tapes even if the old drive fails. As tape capacity improves, the number of tapes needed is reduced by a factor of 8 after 3 generations (about 9 years); the costs of archiving old material will therefore decrease over time.
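The arithmetic behind annotations @6, @10 and @9 can be sketched in a few lines. The per-interval failure probability is an illustrative assumption, not an LTO specification; the 2.5 TB figure is LTO-6 native capacity:

```python
# @6/@10: if each copy independently has probability p of failing
# before the next verification pass, the archive is lost only if
# every copy fails. p = 1% per interval is an illustrative
# assumption to show the scaling, not a measured tape figure.
def p_all_copies_lost(p_single=0.01, copies=3):
    return p_single ** copies

for n in (1, 2, 3, 4):
    print(f"{n} cop{'y' if n == 1 else 'ies'}: "
          f"{p_all_copies_lost(copies=n):.0e} chance of total loss")

# @9: each LTO generation roughly doubles capacity, so migrating
# after 3 generations needs 2**3 = 8x fewer cartridges.
def cartridges_needed(archive_tb, native_tb_per_tape=2.5,
                      generations_later=0):
    capacity = native_tb_per_tape * 2 ** generations_later
    return int(-(-archive_tb // capacity))  # ceiling division

print(cartridges_needed(40))                       # on LTO-6 today
print(cartridges_needed(40, generations_later=3))  # 3 generations on
```

The independence assumption is optimistic (a flood or a shared bad batch hits copies together), which is exactly why the guide also demands different models and geographic separation.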
10 LITERATURE/FURTHER READING
Studie zum Stand und Aufgaben der Filmarchivierung des Kinematheksverbundes [study on the state and tasks of film archiving by the Kinematheksverbund]
Academy of Motion Picture Arts & Sciences
Verband technischer Betriebe f. Film u. Fernsehen [association of technical companies for film and television]
Bundesarchiv Filmarchiv: "Abgabe von Filmen" [deposit of films]
Google Inc., Eduardo Pinheiro, Wolf-Dietrich Weber and Luiz Andre Barroso, Failure Trends in a Large Disk Drive Population
Carnegie Mellon University, Computer Science Department, Bianca Schroeder, Garth A. Gibson, Disk failures in the real world: What does an MTTF of 1,000,000 hours mean to you?
Alberto Pace, head of data and storage at CERN, Economist Nov 30th 2013
Flash Reliability in Production: The Expected and the Unexpected, Bianca Schroeder, University of Toronto; Raghav Lagisetty and Arif Merchant, Google, Inc.
Association des Cinémathèques Européennes, "Challenges of the Digital Era For Film Heritage Institutions" and "A Digital Agenda For Film Archives"
From Grain to Pixel: The Archival Life of Film in Transition, by Giovanna Fossati
EYE Film Institute Netherlands
International Federation of Film Archives
Martin Scorsese's Film Foundation