When Web Pages Die (Yes. They Die.) and How To Save Them


I came across a great article today in the online journal First Monday. This journal is always the source of a good, albeit scholarly, read. I’ve been reading it for many years.

One of the articles in the current edition, Learning from failure: The case of the disappearing Web site, by Francine Barone, David Zeitlyn, and Viktor Mayer-Schönberger, caught my web-nerd eye. It also sparked some memories of an Internet research project I managed way back in 1996-97, but more about that later.

Although it seems to be a universal truth that what you put on the web stays on the web, that’s probably a valid conclusion only for social media (Facebook, Twitter, and their ilk). Studies have shown that links do die, and that many more of them die than previously thought. Sometimes even the data owners don’t know that their links are broken.

The First Monday article discusses the “Gone Dark Project” at Oxford University, which addresses dead URLs (Uniform Resource Locators) and the resulting “link rot.” The case studies discussed in the article can also “inform practical recommendations that might be considered in order to improve the preservation of online content.”

“We wanted to examine what has happened to Web sites, valuable archives and online resources that have disappeared, been shut down, or otherwise no longer exist publicly on the Internet.” (From the Introduction)

This article seems like the polar opposite of the project I managed way back in 1996 and 1997. Try to remember the Internet as it existed back then …

  • The World Wide Web was invented in 1989 by Tim Berners-Lee.
  • The first graphical browser, Mosaic, was released in 1993.
  • There was no Firefox. (Version 1.0 of Firefox was released in 2004.)
  • Google was barely on the radar at that time. (First funding for Google was in 1998.)

(For more information, see Hobbes’ Internet Timeline. This decidedly old-school web page has been a favorite of mine since the mid-1990s, after I met the author at a work function. It’s still my favorite.)

So you see, it was truly the Dark Ages. The contractors doing the work were using Yahoo!, Hotwire, and other tools available at that time to locate and catalog Internet resources. The pool of information at that time was likely at least an order of magnitude smaller than what is available now. Sites (or documents) didn’t go dark then so much as they never saw the light of day. It wasn’t that items were private, per se; usually it was merely that a unique URL had not been assigned to them.

The two studies are

At the time of these studies, very little had been published on the topics they covered. I was very proud that, because of that fact, both of my contractors’ studies were accepted by peer-reviewed journals, in print and online.

But 17 years later, I’m learning that things have come full circle. In 1996, we were looking to discover what was new. In 2014, the “Gone Dark Project” was looking for what has disappeared.

Plus ça change …
