Laptop showing message from Wayback Machine

Navigate around dead links with the Wayback Machine browser extension

Do you think most of the people around you know at least a few HTTP status codes? Unfortunately, if they know any, they’re likely to be familiar with at least 404 Not Found and 500 Server Error. If you spend enough time on the web, you may encounter these errors quite often, if not daily. The Internet Archive’s new Wayback Machine extension for Firefox and Chrome can help you get to the page you were looking for even after it has been removed from the web.

The Internet Archive have been archiving copies of web pages for almost 15 years. They don’t have every web page ever created in their collection, but if you’re looking for a page that was publicly available for some time and was of any note to anyone, there is a fair chance that the page will be part of their collection.

The whole collection of web pages is available to anyone through the Wayback Machine, where you can look up any page by its URL. I do this at least a dozen times per month when I encounter pages or whole websites that have disappeared from the public web.
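Looking up a page by its URL comes down to putting the Wayback Machine's address in front of the page's own address. Here is a minimal Python sketch of that pattern. The function name `wayback_url` is my own invention for illustration. The optional timestamp (in `YYYYMMDDhhmmss` form, or a prefix of it) and the `*` wildcard for the capture overview reflect how the public site behaves as far as I know, so treat the details as assumptions rather than a specification.

```python
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(page_url: str, timestamp: str = "") -> str:
    """Build a Wayback Machine address for page_url (hypothetical helper).

    timestamp is optional. Assumed behavior of the public site:
      ""      -> the archive picks a capture (usually the most recent)
      "2017"  -> the capture closest to that (partial) timestamp
      "*"     -> an overview of all captures of the page
    """
    parts = [WAYBACK_PREFIX]
    if timestamp:
        parts.append(timestamp)
    # The page address is appended as-is. A "#fragment" in it never
    # reaches the server, so it is effectively ignored.
    parts.append(page_url)
    return "/".join(parts)


if __name__ == "__main__":
    print(wayback_url("http://example.com/page"))
    print(wayback_url("http://example.com/page", "2017"))
    print(wayback_url("http://example.com/page", "*"))
```

Pasting any of the printed addresses into a browser should take you to an archived copy, provided one exists.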

When links stop working and pages go off the web, we call it “link rot”. A large portion of the blame for link rot falls on webmasters who don’t properly care for their old links when moving, restructuring, or changing publishing platforms. Server decay, an aging online population, and marketers who advise their clients to “delete all old pages!” or attempt to rewrite their company’s history all contribute as well.

One thing is for sure: link rot is here to stay, and it’s only going to get worse over time.

Dead link revival built into the web browser

The Internet Archive and Mozilla started working together on integrating the Wayback Machine into Firefox. Mozilla was evaluating whether they could deliver a better web experience, without link rot and with access to long-dead pages, by partnering with the Internet Archive and baking the feature directly into Firefox. In dreadful institutional language:

“The Internet Archive is interested in promoting the development, and support the operation of, browser functionality to show archived versions of web pages that are no longer available or otherwise return an error, e.g. a 404.”

The result of this cooperation was the No More 404s experiment, part of Firefox’s Test Pilot program. Users who have opted in to the experiment see a dialog offering an archived version of the current page if the page is unavailable on the web but a copy exists in the Internet Archive’s collection.

The original plugin version only detected 404 Not Found error pages and offered archived versions for those. However, I contributed a small patch that expanded its coverage to include common server-side, cache-proxy, and temporary error situations as well. This greatly increased the usefulness of the plugin, as it now covers more of the reasons a page can become unavailable. The experiment’s original name lost its meaning in the process, however.
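To illustrate the difference between the two behaviors, here is a small Python sketch of the decision "should an archived copy be offered for this response?". The status lists are my own illustrative picks for the server-side, cache-proxy, and temporary categories. They are assumptions, not the plugin's actual list, and the names `NOT_FOUND_ONLY`, `EXPANDED`, and `should_offer_archive` are hypothetical.

```python
from http import HTTPStatus

# Original behavior: only 404 Not Found triggers a lookup.
NOT_FOUND_ONLY = frozenset({HTTPStatus.NOT_FOUND})

# Expanded behavior. ASSUMPTION: an illustrative selection, not the
# plugin's real list of status codes.
EXPANDED = NOT_FOUND_ONLY | {
    HTTPStatus.REQUEST_TIMEOUT,        # 408, temporary
    HTTPStatus.GONE,                   # 410, page deliberately removed
    HTTPStatus.INTERNAL_SERVER_ERROR,  # 500, server-side
    HTTPStatus.BAD_GATEWAY,            # 502, proxy in front of the site
    HTTPStatus.SERVICE_UNAVAILABLE,    # 503, temporary
    HTTPStatus.GATEWAY_TIMEOUT,        # 504, proxy in front of the site
}


def should_offer_archive(status_code: int, statuses=EXPANDED) -> bool:
    """Return True if status_code is one we would look up in the archive."""
    return status_code in statuses


if __name__ == "__main__":
    for code in (200, 404, 503):
        print(code, should_offer_archive(code),
              should_offer_archive(code, NOT_FOUND_ONLY))
```

A 503 from an overloaded server is ignored under the 404-only rule but triggers a lookup under the expanded one, which is exactly the kind of case the patch was meant to catch.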

The Test Pilot experiment is still being evaluated in Firefox, but a new version of the plugin (which runs practically the same WebExtension code as the Firefox experiment) was renamed Wayback Machine and released as an extension for Google Chrome. For Firefox, the plugin is only offered through Test Pilot and is not available as a stand-alone extension through the Add-ons Catalog.

In just one month, and without any marketing push from the browser vendor, the Chrome extension gained 22% as many users as the Firefox extension had after 5.5 months. If Mozilla decides to bake Wayback Machine lookups into Firefox, then we might see a lot more interest in the feature from Firefox users and privacy researchers.

You can get the Wayback Machine extension in the Chrome Web Store or by participating in the Test Pilot experiment for Firefox.

Using the plugin has some implications for your browsing privacy. The address of any broken page, as identified by its HTTP status code, will be sent to the Internet Archive so that they can determine whether they have an archived copy of the page. The Internet Archive have a long-winded and archaic privacy policy that covers access to their collections. The plugin uses HTTPS to communicate with the archive, and all archived pages are retrieved over HTTPS.
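To make it concrete what leaves your browser, here is a Python sketch built around the Internet Archive's public availability API at `https://archive.org/wayback/available`. Whether the plugin talks to exactly this endpoint is an assumption on my part. The helper names `availability_query` and `closest_snapshot` are hypothetical, and the sample response below is hand-written to follow the documented response shape rather than captured from a real request.

```python
import json
from typing import Optional
from urllib.parse import urlencode

# Public availability API. ASSUMPTION: used here for illustration only;
# the plugin may query the archive differently.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(page_url: str) -> str:
    """Return the HTTPS request address that asks about page_url."""
    return f"{AVAILABILITY_ENDPOINT}?{urlencode({'url': page_url})}"


def closest_snapshot(response_text: str) -> Optional[str]:
    """Extract the archived copy's address from a JSON response, if any."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    url = closest["url"]
    # Prefer HTTPS for the archived page itself.
    if url.startswith("http://web.archive.org/"):
        url = "https://" + url[len("http://"):]
    return url


if __name__ == "__main__":
    print(availability_query("http://example.com/gone?id=1"))
    sample = json.dumps({  # hand-written sample, not real data
        "archived_snapshots": {"closest": {
            "available": True, "status": "200",
            "timestamp": "20130919044612",
            "url": "http://web.archive.org/web/20130919044612/http://example.com/",
        }}
    })
    print(closest_snapshot(sample))
    print(closest_snapshot('{"archived_snapshots": {}}'))
```

Note that the complete address of the broken page, query string included, ends up in the request. HTTPS keeps it from eavesdroppers along the way, but the Internet Archive itself still sees it.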