The table of contents, designed by Dr. Bawden, describes not only the files on the destination tape but also the source tapes and the destination tape itself. It was designed to account for the fact that many old tapes are often concatenated together onto new media of larger capacity.
The information stored for each file includes the file name, the creation date, and the approximate file size. The exact size is not recorded; instead, a logarithmic encoding of the size is used to conserve space in the table of contents. The size is included mainly as a hint to a human reader of this information; the exact size can be determined by retrieving the file itself.
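To illustrate how a logarithmic size field can save space, the following is a minimal sketch and not the actual encoding used in the table of contents, whose details are not given here. It assumes base 2 and a single small integer code per file; the function names `encode_size` and `decode_size` are hypothetical.

```python
def encode_size(size_bytes: int) -> int:
    """Map a file size to a small code: the number of bits needed to write it.

    Assumption: base-2 buckets. A size of 0 maps to code 0.
    """
    if size_bytes < 0:
        raise ValueError("size must be non-negative")
    return size_bytes.bit_length()


def decode_size(code: int) -> int:
    """Return the lower bound of the bucket named by the code.

    The true size s satisfies decode_size(code) <= s < 2 * decode_size(code)
    for every non-empty file.
    """
    return 0 if code == 0 else 1 << (code - 1)


if __name__ == "__main__":
    for size in (0, 1, 1000, 5_000_000):
        code = encode_size(size)
        print(size, "-> code", code, "-> at least", decode_size(code), "bytes")
```

Under these assumptions any size below 2^255 fits in one byte, and the decoded value is within a factor of two of the true size, which is enough for a human judging whether a file is small or large.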
The format uses delta encoding to reduce the amount of space used by each entry. As an illustration, consider storing all the words in a dictionary in alphabetical order. In a sorted list of words, the difference between a given word and the next word is generally very small. Delta encoding records this difference compactly in a few bytes, and it is this difference that we store in the table of contents.
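The dictionary illustration can be made concrete with front coding, one common form of delta encoding for sorted strings. This is a sketch only: the byte layout of the real table of contents is not described here, and the names `delta_encode` and `delta_decode`, as well as the sample file names, are hypothetical.

```python
def delta_encode(sorted_names):
    """Encode each name as (length of prefix shared with previous name, remaining suffix)."""
    prev = ""
    entries = []
    for name in sorted_names:
        shared = 0
        limit = min(len(prev), len(name))
        while shared < limit and prev[shared] == name[shared]:
            shared += 1
        entries.append((shared, name[shared:]))
        prev = name
    return entries


def delta_decode(entries):
    """Rebuild the names; each one depends on the name decoded just before it."""
    prev = ""
    names = []
    for shared, suffix in entries:
        name = prev[:shared] + suffix
        names.append(name)
        prev = name
    return names


if __name__ == "__main__":
    names = ["EMACS", "EMACS.DOC", "EMACS.INIT", "TECO"]
    encoded = delta_encode(names)
    print(encoded)
    # [(0, 'EMACS'), (5, '.DOC'), (6, 'INIT'), (0, 'TECO')]
    assert delta_decode(encoded) == names
```

In this sketch the third entry stores only four new characters instead of ten, because the first six are inherited from its predecessor.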
As an example of how the table of contents is used, suppose a Lab member is looking for all files with the characters ``EMACS'' in their names. He would first locate the table of contents, which should be compact enough to be kept on-line in one place rather than on a tape itself. He can then use one of our tools for decoding the table of contents format to peruse the listings. The search would be most effective using a combination of our tools and system tools for finding the character string ``EMACS''. The output of this search would be a list of matching files with their locations, approximate sizes, and creation dates. Unfortunately, the user must decode every table of contents file to be sure the search covers all our TCFS files. Because delta encoding stores each entry as a difference from the one before it, this decoding overhead cannot be avoided for any tape that is searched.
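The search described above can be sketched as follows. This is an illustration under assumptions, not one of our actual tools: entries are taken to be front-coded as (shared prefix length, suffix, location) tuples, and the function name `search_toc`, the tuple layout, and the sample data are all hypothetical.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical entry layout: (shared prefix length, suffix, location on tape)
Entry = Tuple[int, str, str]


def search_toc(entries: Iterable[Entry], needle: str) -> Iterator[Tuple[str, str]]:
    """Yield (file name, location) for every decoded name containing needle.

    Every entry is decoded in order, including non-matching ones, because
    each name can only be rebuilt from the name before it.
    """
    prev = ""
    for shared, suffix, location in entries:
        name = prev[:shared] + suffix
        if needle in name:
            yield name, location
        prev = name


if __name__ == "__main__":
    toc = [
        (0, "EMACS.DOC", "file 1"),
        (6, "INIT", "file 2"),      # decodes to EMACS.INIT
        (0, "LISP.DOC", "file 3"),
        (0, "TECO.EMACS", "file 4"),
    ]
    for name, location in search_toc(toc, "EMACS"):
        print(name, location)
```

The second entry shows why searching the encoded form directly is not enough: its stored suffix is only ``INIT'', so the string ``EMACS'' appears in it only after decoding.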