CERN Archive | HTTP only

CERN Archive

CERN, the European Organization for Nuclear Research, was the center of the original work that sparked the World Wide Web. In 2013 this site was recovered and hosted again. The site itself has a history page (/archive/cern/25) which says that the project started in 1989. It does not state directly when the first web site became available on the internet, but it does list "General release of www on central CERN machines" on May 17, 1991. Wikipedia and other sources, however, give August 6, 1991 as the date on which the first web site went live.

Anyway, there is lots of interesting information here about the early web efforts.

The site itself has a fair number of issues. Working through them is actually one of the fun challenges of capturing, fixing, and presenting this site.

Some issues include:

  • unterminated tags
  • incorrectly terminated tags
  • incorrect links
  • obsolete tags
  • typos
  • incomplete information
  • links to sites which don't exist
  • links to FTP repositories which are no longer supported
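
To give a feel for how the tag problems in this list can be spotted, here is a minimal sketch in Python. It is not the actual crawling script. The names `TagChecker`, `check_html`, `VOID_TAGS`, and `OPTIONAL_CLOSE` are made up for illustration, and the two tag sets are assumed, partial lists. It uses the standard library's `html.parser` to keep a stack of open tags and reports tags that are never closed, or closing tags that match nothing.

```python
from html.parser import HTMLParser

# Assumed, partial list of elements that never take a closing tag.
VOID_TAGS = {"br", "hr", "img", "meta", "link", "input", "base", "isindex", "nextid"}
# Assumed list of elements whose closing tag is optional in old HTML,
# so leaving them open is not reported as a problem.
OPTIONAL_CLOSE = {"p", "li", "dt", "dd"}


class TagChecker(HTMLParser):
    """Tracks open tags on a stack and records mismatches."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of currently open tag names
        self.problems = []   # list of (kind, tag) tuples

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if tag not in self.open_tags:
            # A closing tag with no matching opening tag.
            self.problems.append(("stray_close", tag))
            return
        # Pop back to the matching opening tag; anything skipped was left open.
        while self.open_tags:
            top = self.open_tags.pop()
            if top == tag:
                break
            if top not in OPTIONAL_CLOSE:
                self.problems.append(("unterminated", top))


def check_html(text):
    """Return a list of (kind, tag) problems found in an HTML string."""
    checker = TagChecker()
    checker.feed(text)
    checker.close()
    # Whatever is still open at the end of the page was never closed.
    for tag in reversed(checker.open_tags):
        if tag not in OPTIONAL_CLOSE:
            checker.problems.append(("unterminated", tag))
    return checker.problems
```

For example, `check_html("<h1>Title<p>Some <b>bold text</p>")` reports `b` and `h1` as unterminated. A report like this only points at suspicious spots; deciding how to repair each one is still a judgment call.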

In the current version of the archived site, I've managed to fix enough problems that the site is generally navigable and not distractingly badly formatted. There is still a way to go.

Some things I still want to address:

  • capturing (archiving) inaccessible documents (broken links to individual documents)
  • indexing the site for easy lookup of terms
  • addressing remaining formatting issues

And as far as the site-as-archive:

  • provide annotations; e.g. note where technical fixes are made, add historical notes
  • provide a navigation trail
  • capture publication dates; these might be based on the resurrection date in 2013 and later, or perhaps it is possible to get the historical creation dates of the documents from the CERN archivists

It is currently implemented as a set of crawling scripts that fetch and correct the pages. The results are provided as HTML pages plus a CSV file which serves as an index of all links starting from the home page. I will probably refactor this into a single SQLite file, which would unlock additional features such as search and a site map.
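
As a rough idea of what the SQLite version could look like, here is a small Python sketch using the standard `sqlite3` and `csv` modules. Everything specific in it is an assumption rather than the current implementation: the table names `pages` and `links`, the CSV column names `source` and `target`, and the function names are all hypothetical. Search is a plain `LIKE` match on the page body, and the site map is a recursive query over the link table.

```python
import csv
import sqlite3


def build_archive(conn, csv_lines, pages):
    """Load pages and the link index into SQLite.

    csv_lines: iterable of CSV text lines with an assumed header "source,target".
    pages: dict mapping a page URL to its corrected HTML body.
    """
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS pages (url TEXT PRIMARY KEY, body TEXT NOT NULL);
        CREATE TABLE IF NOT EXISTS links (source TEXT NOT NULL, target TEXT NOT NULL);
    """)
    conn.executemany(
        "INSERT OR REPLACE INTO pages (url, body) VALUES (?, ?)", pages.items()
    )
    reader = csv.DictReader(csv_lines)
    conn.executemany(
        "INSERT INTO links (source, target) VALUES (?, ?)",
        ((row["source"], row["target"]) for row in reader),
    )
    conn.commit()


def search(conn, term):
    """Return URLs of pages whose body contains the term (simple LIKE match)."""
    rows = conn.execute(
        "SELECT url FROM pages WHERE body LIKE ? ORDER BY url", (f"%{term}%",)
    )
    return [url for (url,) in rows]


def site_map(conn, start):
    """Return every URL reachable from start by following links."""
    rows = conn.execute(
        """
        WITH RECURSIVE reachable(url) AS (
            SELECT ?
            UNION
            SELECT links.target
            FROM links JOIN reachable ON links.source = reachable.url
        )
        SELECT url FROM reachable ORDER BY url
        """,
        (start,),
    )
    return [url for (url,) in rows]
```

The `UNION` in the recursive query drops duplicates, so pages that link back to each other do not cause an endless loop. A real version would probably want proper full-text indexing instead of `LIKE`, but this keeps the sketch dependent only on what ships with Python.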

Also, remember that it is embedded in the "http-only" project site, in which we use HTML compatible with ancient browsers, so the site's appearance and behavior will be old school and will not meet the expectations of the modern web audience (at least those without an interest in the history of the web or the low-tech web).

That's all for now!