Communications of the ACM

ACM TechNews

Old Dominion U. Professor Is Trying to Save Internet History



Old Dominion University professor Michael Nelson, who found that about 19 percent of Web pages have been archived.

Photo courtesy of Old Dominion University

Researchers at Old Dominion University and Los Alamos National Laboratory (LANL) have developed Memento, browser-based software that can find a Web site as it appeared on a specific date in the past.

The program is part of an effort to study how much of the Internet is being saved. The average life span of an Internet page is about 100 days, meaning that much of what has been published in the approximately 20-year history of the Internet has been disposed of.

The Internet "was conceived without the notion of time and without the notion of archiving at its core," says LANL computer scientist Herbert Van de Sompel.

In 1996, Brewster Kahle launched the Internet Archive, a nonprofit digital library that conducts bi-monthly crawls through the Web, storing every site it finds. Today, the archive contains three petabytes of information, and is one in a network of archives around the world.

The Old Dominion team, led by professor Michael Nelson, found that about 19 percent of Web pages have been archived, and that most of the pages sampled from the Open Directory Project, a public index of Web sites, had been saved for posterity. "We're sort of stuck in this perpetual now," Nelson says. "Figuring out what was on the Web an hour ago, a day ago, a week ago, we're really bad at that."

From Washington Post

Abstracts Copyright © 2011 Information Inc., Bethesda, Maryland, USA


 
