Communications of the ACM

ACM TechNews

Digital Archives Meant to Be Permanent Seem to Be Lost on the Web


The Internet can be an ephemeral place, which is problematic for digital Web archives designed to permanently preserve the content of Web pages.

Credit: Bill Hinton/Getty Images

Michael Nelson of Old Dominion University (ODU) and colleagues found that supposedly permanent digital Web archives can themselves be lost.

Between November 2017 and January 2019, the team ran a Web crawler to access 16,627 pages preserved by 17 archiving services, some focused on the U.S. or Europe and others serving the whole Internet.

The uniform resource identifiers (URIs) of four of the archives changed during that period, impairing the crawler's ability to find the archived pages.

The four archives stored 1,981 Web pages, of which 537 were affected, including 20 that could not be retrieved at all.

ODU's Michael Nelson said, "Being able to provide access to archives and demonstrate the integrity and authenticity of those archives are indeed issues that are very important to us and our members, and Web archives are no exception."

From New Scientist

Abstracts Copyright © 2021 SmithBucklin, Washington, DC, USA


 
