Communications of the ACM

Trust in the Preservation of Digital Information


We live in a digital world. With an increasing amount of information being created, stored, and distributed in digital formats, preservation of digital information is a central concern. A recent wave of literature addressing the subject has focused on the fragility of digital media, technological obsolescence, and standards, but little attention has been given to the most critical barrier in the preservation of digital information: the potential conflicts between the new reality of digital information and the expectations of people. We address this issue here, describing the results of our survey of individuals with intensive experience in handling digital information and applying a concept derived from our analysis of monetary currency, called "institutional guarantee," to the development of trusted systems for the preservation of digital information.

A major finding of our survey (see sidebar) of 110 individuals, including office workers, students, teachers, scientists, and administrators, was that while individuals recognize the new opportunities offered by digital media, they do not yet embrace them wholeheartedly. For example, when we asked participants if they would discard important paper documents if they also had an electronic copy of the same documents, 86% of the respondents said they would not discard the paper documents, citing the following five factors for their lack of trust in digital preservation:

Inaccessibility. Lack of accessibility is the most commonly cited reason for the lack of trust in preserving digital documents. Paper has traditionally been the dominant medium for document recording, storage, distribution, and utilization. There has been no compelling need to distinguish between the format of a document and the medium in which it is embodied, since there is only one dominant choice of medium [6]. The dependence on equipment and software for preserving digital information potentially leads to inconvenience in accessing information. One respondent reported: "I preserved electronic copies of financial documents and then had a hard disk crash. My fault for not having a backup." Another noted: "Paper documents are more reliable. I can get to them in a power outage."

Still other respondents expressed distaste for having a barrier between them and their data in the form of reading devices. This reflects the common tendency to cherish convenience and ignore any technology with characteristics that limit convenience [1]. The respondents also expressed concern about the fact that many reading devices are rapidly becoming obsolescent. "The electronic copies might disappear or fail to migrate to future technology," noted one.


Lack of tangibility. "For the first time in 3,500 years of archival activity we produce records that do not exist to the human eye—unlike Babylonian clay tablets, Egyptian papyrus, Roman and medieval parchment, modern paper, even microfilm" [2]. After generations of a print culture in which "seeing is believing," people's confidence and trust in digital information suffer from its intangibility. The question of why anyone should trust something that is "invisible electromagnetic bumps on the plastic disk" clearly exemplifies this issue. One respondent noted: "I usually feel more comfortable having a paper copy of documents. I feel more real."

Fluidity. Fixity is an inherent feature of paper documents, while fluidity is associated with digital documents. One respondent mentioned an important reason why he did not want to discard paper copies of important documents: "A paper copy is always the exact version you sent somewhere. An electronic copy can be altered." Preserving something that can be altered easily clearly conflicts with our traditional perception of preservation. Conway noted that the term "archival" traditionally meant "permanent" and that preservation focused on infinity [1].

Short preservation period. In the evolution of document media, the physical life of the media has tended to become shorter, with a handful of exceptions such as microfilm and optical media. Clay tablets and stone are not so easily destroyed; however, destroying digital information can be done with a few keystrokes. The loss of information on highly acidic paper is gradual and partial; in contrast, the loss of information on digital media is always sudden and total. Traditionally, people tend to think of preservation periods in terms of centuries instead of a few years. This shift has a far-reaching effect on preservation decisions. One respondent pointed out: "Why should I preserve something that cannot be used within a decade. It is a waste of effort."

Privacy and security. There is concern about privacy and security in preserving digital information. The growing worry about privacy and security is also confirmed by a recent report stressing "remote storage is an advantage, security worries are a drawback" [7].

Many people have low confidence and trust in digital preservation. For example, few individuals today would discard paper documents even if they have digital versions of the same information, and fewer still would scan paper documents into a digital storage system, then discard the paper. Our survey results, however, suggest many consider electronic preservation important in concert with other storage methods. Among respondents, 68% said they wished to have an electronic copy of important documents for reasons including ease of distribution, ease of reuse and reorganization, ease of locating files, ease of remote access, and backup purposes. Only 23% felt an electronic version of a paper document was unnecessary, and 9% gave no definite answer. One respondent's willingness to make an electronic copy "depends on the energy and effort involved in preserving the digital copy."

Paper, Currency, and Their Implications

The creation of an institutional guarantee for trusted digital preservation is instrumental in increasing people's confidence and trust in digital media, since there are no precursors for preserving documents of this nature. Unless people have sufficient confidence in digital documents, as they have in paper and monetary currency, they will not trust electronic document management appliances. The relationship between confidence and an institutional guarantee can be derived from the following analysis.

The conventional view of the role of paper. Discussions of the merits of information on paper conventionally focus on the physical attributes that give paper its credibility: its light weight and lack of battery dependence make it easily portable, and its high resolution and contrast make it easier to read than most screens. Suppose an ideal display medium could be developed that shares or approaches paper's physical advantages. Would that lead to the replacement of paper by digital devices? For many purposes and in many contexts, no doubt it would. If the physical limitations of today's display devices could be overcome, then the many advantages of digital systems, including the ability to search, edit, annotate, and transmit information, would be widely enjoyed. But what about the storage of information? Both paper and digital devices can store information as well as display it. If we were inclined to replace paper with digital devices because of a future ideal digital display medium, then would we also be replacing the storage function of paper with a digital storage function? Based on our survey results, the answer is no. In other words, an ideal digital display medium is not by itself enough to result in "the end of paper."

How is a paper document like currency? An economist's classical definition of currency is twofold: currency is both a store of value and a medium of exchange. A store of value means currency has intrinsic value that is preserved over extended time periods with no effort on the part of its owner. A medium of exchange means that the exchange of currency is the way economic transactions are performed.

Like currency, paper also acts as a store of information and as a medium of direct exchange of information with a person. A "store of information" means the information is preserved over an extended period with no effort on the part of its owner. A "medium of direct exchange" means that a producer of paper-based information can provide that information directly to a consumer of information without using any type of device or intermediary. To "consume" information certainly means reading it, but we can also include operations like annotation if we agree that pencils are trivial "devices."

How is digital information not like paper (or currency)? Information in digital systems—even in digital systems with an ideal display medium—differs markedly from paper-based information. Digital information cannot be consumed without the use of a non-trivial device and is not preserved over extended periods without effort on the part of its owner.

A key point is the meaning of "extended period of time." What does "extended" mean? It is possible to read Babylonian clay tablets and Egyptian papyrus created millennia ago, if you understand the language or symbols, or read English documents printed several centuries ago. But who can read electronic documents stored on digital media only a few decades ago? Do you still have your eight-inch floppy drive? Will the mainframe tapes melt or ignite when spun on drives 10 times faster than they were only 20 years ago? And assuming the physical media can be read, is your software compatible with the old file formats?

People do not trust digital information, but why do they trust electronic representations of currency? Initially, currency had intrinsic value because of the material from which it was composed. Whether the seashells of coastal people passed inland from hand to hand or the gold pieces of mercantilist Europeans, currency per se had perceived value. Later, paper currency gained acceptance as a surrogate for metallic currency (some may remember the silver certificate dollar bills once issued by the U.S. Treasury), though there have been moments in history when paper currency came to have little or no value. And finally, electronic representations of currency gained wide acceptance; we accept the bank balance printed out for us by the automatic teller machine without needing to see real currency.

What lies behind this willingness to accept surrogates in place of the real thing? Clearly, institutional guarantees are central to the process. Whether it was paper currency issued by a private bank in the era of Andrew Jackson or the Federal Reserve today, most of us have to a greater or lesser extent been willing to accept that a trusted institution stood behind the value of the surrogate (though some untrusting souls still keep metallic coins under their mattresses). So, is it possible to have institutional guarantees for the preservation of digital information? The answer could and should be yes.

Digital Media and Trust

Trust is a fundamental concept pervading every aspect of our daily lives. It is also fragile. As trust declines, people are increasingly unwilling to take risks. They demand greater protections against possible failures and may be slower to adopt new technologies. The lack of confidence and the need for trust may form a vicious circle [3, 4]. Weakened trust may result in failure to gain the critical mass needed to sustain a new business. The more pervasive the digital information, the more critical trust becomes. Trust is instrumental for the preservation of digital media. General characteristics of trust include the following:

  • Increases with familiarity. One of the major reasons people are not confident in digital preservation is that it is a new practice and they are not familiar with it. Confidence and trust may increase through use [4]. With the continued penetration of digital media and development of related technologies, people will be more confident in digital preservation.
  • Is linked to a given condition. Trusting behavior changes as the document medium changes. The transition from the preservation of paper media to digital media may have the same consequences.
  • Requires accountability and tangibility. People may not trust things that are intangible and unaccountable.
  • Is often associated with scale. People may not trust a digital preservation system operated by small institutions that could go bankrupt and disappear. There are also questions about trusting the intermediary, both in the long-term sense of whether the intermediary itself will be around in the future and in the sense of whether one would trust it with confidential, proprietary, or personal information. If there is to be such an intermediary, a governmental entity might be a more "trusted" institution, or at least some kind of nonprofit organization created for this purpose, possibly a consortium of many industries and organizations.

Generic Trust Requirements for Digital Preservation

Trust became a hot topic in computer systems in the 1970s, when software and security issues came to the forefront. Trust needs to be embraced in every component and process in the preservation of digital information—in people (skills and behaviors), process and procedure, media, hardware and software, networks, management (policy and regulations), and insurance mechanisms.

Preservation of digital information is an extremely complex issue. No single organization can deal effectively with it. Instrumental in building trust in digital preservation is an institutional guarantee mechanism through the participation of multiple players, including: content producers; the hardware, software and telecommunications industries; document companies; the insurance industry; public certified agencies; and professional organizations. For example, digital documents, such as banking and insurance records, hold crucial financial and legal information. Unlike paper-based documents, information in digital documents can be easily altered. In order to cope with the fluidity of digital information, a "witness," notary public, or an independent certified agency is critical to an institutional guarantee mechanism of the integrity of digital documents [5, 8]. Since preservation of digital information is still maturing, the participation of professional organizations would help provide guidelines regarding digital preservation.
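The role of a "witness" in guarding against silent alteration can be illustrated with a small sketch. It is not a mechanism described by the authors or in [5, 8]; it assumes, purely for illustration, that the witness records a cryptographic digest (SHA-256) of each document at deposit time and later compares it with the digest of whatever copy is presented. The `WitnessRegistry` class, its methods, and the sample document are hypothetical names invented for this example.

```python
import hashlib
from dataclasses import dataclass, field


def fingerprint(document: bytes) -> str:
    """Return the SHA-256 digest of a document as a hex string."""
    return hashlib.sha256(document).hexdigest()


@dataclass
class WitnessRegistry:
    """Hypothetical stand-in for an independent certifying agency.

    It stores only digests, not the documents themselves.
    """
    _records: dict = field(default_factory=dict)

    def register(self, doc_id: str, document: bytes) -> str:
        # Record the digest of the document as deposited.
        digest = fingerprint(document)
        self._records[doc_id] = digest
        return digest

    def verify(self, doc_id: str, document: bytes) -> bool:
        # True only if a digest was recorded and the presented copy matches it.
        recorded = self._records.get(doc_id)
        return recorded is not None and recorded == fingerprint(document)


# Illustrative data only (assumption, not from the article).
registry = WitnessRegistry()
original = b"Account 1234: balance 500.00"
registry.register("statement-001", original)

unchanged_ok = registry.verify("statement-001", original)
altered_ok = registry.verify("statement-001", b"Account 1234: balance 900.00")
print(unchanged_ok, altered_ok)
```

In this sketch the registry never holds the document itself, so the witness could attest to integrity without being entrusted with confidential content. Whether a real institutional guarantee would work this way is an open design question.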

Kodak used to offer a combined camera and development service with the slogan: "You press the button, we do the rest." The camera was not user-serviceable; it came sealed with the film already installed, and after exposing the roll the entire unit was shipped to Kodak for developing and printing. We can approach digital preservation in the same way, perhaps with the slogan, "You send us the sealed CD/RW drive, we do the rest" (only if you have no Internet connection, of course). But if we took this idea seriously, we might conclude no single vendor has enough credibility. Instead, it might ultimately be necessary to form a kind of industry consortium backing the document-integrity guarantee. Would it be in the commercial interests of today's leading companies to enter such an alliance, or would it be too severe a threat to existing paper-based businesses? In the latter case, is it possible that a new industry might some day arise that provides such a warranty? Are there parallels to these ideas of jointly guaranteeing some results in, for example, the insurance industry?

Conclusion

The cost of electronically storing documents is dipping below the cost of storing paper documents containing the same information. As of this writing, the marginal cost of disk space for storing a document page image is approaching 1/150 of the cost of the paper the document is printed on, a huge gap that continues to widen. The cost and convenience advantages of digital archiving are compelling, but realizing the full benefits requires solutions to an issue as old as humanity and as new as the "next next thing": trust.

References

1. Conway, P. Preservation in the Digital World. Commission on Preservation and Access, Washington, DC, 1996.

2. Cook, T. It's 10 O'clock: Do you know where your data are? (1995); see web.mit.edu/erm/tcook.tr1995.html.

3. Fukuyama, F. Trust: The Social Virtues and the Creation of Prosperity. Free Press, NY, 1995.

4. Gambetta, D. Trust: Making and Breaking Cooperative Relations. Basil Blackwell, NY, 1988.

5. Lynch, C.A. The integrity of digital information: Mechanics and definitional issues. Journal of the American Society for Information Science 45 (1994), 737–744.

6. Lynn, M.S. Preservation and Access Technology. Commission on Preservation and Access, Washington DC, 1990.

7. Robinson, P. Online data backup has its pros, cons. San Jose Mercury News (Aug. 31, 1997).

8. Stornetta, W.S. Preserving the integrity of business records in the age of electronic commerce. Computer Technology Review 10 (1995), special supplement.

Authors

Peter E. Hart is Chairman and President of Ricoh Innovations, Inc., Menlo Park, CA, and Senior Vice President of Ricoh Company, Ltd., Tokyo, Japan.

Ziming Liu is an assistant professor at the School of Library and Information Science, San Jose State University, San Jose, CA.

©2003 ACM  0002-0782/03/0600  $5.00

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
