
Output to DAT - Rooms of Tapes to Shoe-Boxes of Tapes

We chose uncompressed DDS-format 4mm DAT tape for archival output. We have no strong justification for preferring this medium over any other (we also considered a CD-ROM writer), and we may change if we find reason to do so. We were unable to find any studies suggesting that one medium was significantly more reliable or better suited to this purpose than the others. We chose an uncompressed format because we have observed interchange problems between drive brands with compressed formats, whereas uncompressed tapes have interchanged correctly.

The tapes are written using the TCFS standard, which leaves a number of parameters unspecified. For instance, TCFS does not specify a blocking factor for the output. We arbitrarily chose 10-kilobyte blocks, since performance concerns over blocking factor are largely irrelevant in the DDS format. We also chose to write an end-of-file (EOF) mark after each old tape's worth of data. This, again, was an arbitrary decision; we hope the EOF marks will make searching the tape easier, but it is unclear whether they will provide any performance benefit. The standard can leave these issues open because TCFS is completely self-delimiting and therefore insensitive to record length and EOF marks.
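The following sketch illustrates why a self-delimiting stream is insensitive to the blocking factor. The framing used here (a 4-byte length before each payload and a sentinel length at the end) is a hypothetical stand-in chosen for illustration; it is not the TCFS layout, and the function names are assumptions.

```python
import io
import struct

BLOCK_SIZE = 10 * 1024   # the 10-kilobyte blocking factor mentioned above
END = 0xFFFFFFFF         # hypothetical end-of-stream sentinel (not part of TCFS)


def frame(payloads):
    """Encode payloads as a self-delimiting stream: 4-byte length, then data."""
    body = b"".join(struct.pack(">I", len(p)) + p for p in payloads)
    return body + struct.pack(">I", END)


def to_blocks(stream, block_size=BLOCK_SIZE):
    """Cut a byte stream into fixed-size blocks, zero-padding the last one."""
    return [
        stream[start:start + block_size].ljust(block_size, b"\0")
        for start in range(0, len(stream), block_size)
    ]


def unframe(blocks):
    """Recover the payloads; block boundaries and trailing padding are ignored."""
    data = io.BytesIO(b"".join(blocks))
    payloads = []
    while True:
        (length,) = struct.unpack(">I", data.read(4))
        if length == END:
            return payloads
        payloads.append(data.read(length))
```

Because the reader relies only on the lengths embedded in the stream, the same payloads come back whether the blocks are 512 bytes, 10 kilobytes, or any other size.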

To reduce the risk of further data loss, we write two copies of each tape simultaneously on output. Neither copy will be publicly available; any publicly available tape would be a third copy, made once the privacy issues are settled. One set will be stored at the lab, for any of the purposes mentioned in the introduction. The other set will be placed in off-site, environmentally controlled storage for posterity.
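Writing two copies in a single pass might be sketched as follows. This is a minimal illustration, not the procedure actually used: the two destinations are any writable binary file objects standing in for tape drives, and the SHA-256 digest is an added assumption that gives a later way to check either copy.

```python
import hashlib
import io


def write_duplicate(blocks, first, second):
    """Write every block to both destinations and return a digest of the data.

    `first` and `second` are hypothetical stand-ins for the two output drives.
    """
    digest = hashlib.sha256()
    for block in blocks:
        first.write(block)
        second.write(block)
        digest.update(block)
    return digest.hexdigest()


# Usage with in-memory buffers in place of real devices.
lab_copy, offsite_copy = io.BytesIO(), io.BytesIO()
checksum = write_duplicate([b"a" * 10, b"b" * 10], lab_copy, offsite_copy)
```

Feeding both outputs from the same pass over the input means the old tape is read only once, which matters when the source media are fragile.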

The original media will not be destroyed outright, but we will be less stringent in the care and handling of these "dead" tapes. They will probably be sent to off-site storage, but their eventual fate is unknown and will depend on financial constraints. There is no good reason for us to dispose of the old tapes, and no expedient way for us to recycle them. It may also turn out that some file we were unable to reconstruct from these tapes during this project will be of such extraordinary interest to future generations as to justify the heroic efforts required to rescue the data.

Some have suggested that a continuous path for file migration from disk to robotically-accessed media (media jukeboxes) to off-line media is the ideal. The TCFS format does not preclude such a structure; it just specifies a more durable format for files to withstand the ages. The tapes we are currently migrating have been unavailable for access for years, so we cannot estimate future patterns of access. Some files may be popular enough to require putting them on traditional hard disks or robotically-accessed media. In that event, we can think of these other storage media as a cache for the primary storage of files on off-line media.
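The cache view described above can be sketched as a small read-through cache in front of the off-line archive. Everything here is illustrative: the class name, the `fetch_from_tape` callback, and the least-recently-used eviction policy are assumptions, since no access patterns or policies have been decided.

```python
from collections import OrderedDict


class CachedArchive:
    """Fast storage acting as a cache; off-line media remain the primary store."""

    def __init__(self, fetch_from_tape, capacity):
        self._fetch = fetch_from_tape   # hypothetical slow lookup on off-line media
        self._capacity = capacity       # number of files the fast medium can hold
        self._cache = OrderedDict()

    def read(self, name):
        if name in self._cache:
            # Cache hit: mark the file as most recently used.
            self._cache.move_to_end(name)
            return self._cache[name]
        # Cache miss: fall back to the archive, then keep a copy on fast storage.
        data = self._fetch(name)
        self._cache[name] = data
        if len(self._cache) > self._capacity:
            # Evict the least recently used file; the archive still holds it.
            self._cache.popitem(last=False)
        return data
```

Eviction never loses data in this arrangement, because the off-line copy is authoritative and the faster media hold only duplicates of popular files.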






boogles@martigny.ai.mit.edu