We take the Internet and its wealth of knowledge for granted. Virtually everything is readily accessible 24 hours a day, 7 days a week at the click of a button. That is — until it’s not. Websites can go belly up without a moment’s notice, their content gone forever.
It is important to archive content that appears on the Internet for various reasons. Saving websites is a way of preserving human culture, much in the same way we protect and curate books or works of art. Curiosity is a big driver — after all, kids today couldn’t imagine an old Geocities web page in their wildest dreams. Aside from general curiosity, saving websites can allow us to refer back to important information.
It is super convenient to reference info found on the Web. But what happens when that link just points to a 404 error message? In 2013 a Harvard study found that 49% of the websites referenced in Supreme Court decisions in the US were now dead ends. How can we prevent vital information like this from disappearing into the virtual ether?
Luckily, the folks at the Internet Archive have developed a tool that can index and archive websites. They call it the Wayback Machine. The Internet Archive has been collecting web pages since 1996, and the Wayback Machine has made them publicly browsable since 2001. To date, the Wayback Machine has saved over 304 billion web pages.
There are a number of reasons one would want to archive a website, and the Wayback Machine makes it easy. Here are the ways in which you can use the Wayback Machine for all your webpage archiving needs.
Which Sites Are Cataloged?
Many popular websites are automatically archived by the Wayback Machine. However, you can use the Wayback Machine to manually archive virtually any page. Websites are often abandoned or changed completely, so the Wayback Machine acts as a way to preserve the culture of the Internet by keeping a digital “hard copy” of a website. Be aware that text and images are generally left intact; however, some outbound links and embedded items (e.g. videos) may not be preserved.
It is important to note that the Wayback Machine only scans and archives public sites. This means that password-protected sites or ones located on private servers cannot be archived. In addition, if a website prohibits search engines from including it in search results, the Wayback Machine will not be able to archive it.
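One common way for a site to turn crawlers away is a robots.txt file. The article does not say exactly which signals the Wayback Machine honors, so treat the following as a rough sketch under that assumption rather than a description of the Internet Archive's actual rules. It uses Python's standard urllib.robotparser to check whether a given set of robots.txt rules would block a generic crawler from a page. The function name is_crawl_allowed and the sample rules are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def is_crawl_allowed(robots_txt: str, page_url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch page_url.

    Assumption: robots.txt is the mechanism in play. The real crawler may
    use a different user agent or apply additional rules.
    """
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, page_url)


# Hypothetical rules: everything under /private/ is off limits to all crawlers.
SAMPLE_RULES = "User-agent: *\nDisallow: /private/\n"

if __name__ == "__main__":
    print(is_crawl_allowed(SAMPLE_RULES, "https://example.com/private/notes"))
    print(is_crawl_allowed(SAMPLE_RULES, "https://example.com/blog/post"))
```

If a check like this comes back False for a page you care about, saving it by hand may not work either.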
How to Use the Wayback Machine
There are two methods you can use to start archiving websites. Fortunately, both of them are easy and don’t require any special know-how. For the first method, open the page you want to save and place your cursor in front of the URL in your browser’s address bar. Type web.archive.org/save/ and hit Enter. A dialog box should appear on your screen informing you that the Wayback Machine is saving the page.
The second way to archive a webpage is to use the Wayback Machine archive website. First, navigate to a webpage you want to save and copy the URL. With that done, head to the Wayback Machine archive website. On the right side of this page you will see a header that reads “Save Page Now.” Paste the URL of the webpage you want to save into the text box and click the “Save Page” button.
Regardless of which method you use, the result is the same. Be aware that saving the page can take a while, so be patient and let it do its thing.
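If you have a handful of pages to save, the address bar trick can be scripted. The first method boils down to putting web.archive.org/save/ in front of a page's address, so a small helper can build those addresses for you. This is a minimal sketch: the function name build_save_url is made up for illustration, and the https:// scheme in front of the prefix is an assumption, since the article only gives the bare web.archive.org/save/ prefix.

```python
# Prefix from the article, with an assumed https:// scheme in front of it.
SAVE_PREFIX = "https://web.archive.org/save/"


def build_save_url(page_url: str) -> str:
    """Return the address that asks the Wayback Machine to save page_url."""
    page_url = page_url.strip()
    if not page_url:
        raise ValueError("page_url must not be empty")
    # The page's own address is appended as-is, just like typing the
    # prefix in front of it in the browser's address bar.
    return SAVE_PREFIX + page_url


if __name__ == "__main__":
    for page in ["https://example.com/", "https://example.com/about"]:
        print(build_save_url(page))
```

Opening each printed address in your browser has the same effect as typing the prefix by hand.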
Wayback Machine Browser Extension
The Wayback Machine also has an official browser extension for Google Chrome. Using it to archive web pages is super easy. Simply navigate to a page you want to archive, click on the Wayback Machine icon in your toolbar and click “Save Page Now.”
In addition to making it even easier to save pages, the browser extension has another nifty trick up its sleeve. Have you ever clicked on a link only to be confronted by a vague 404 error message? Whether it is a valuable source for your research paper or a really good recipe, it can be incredibly frustrating. With the Wayback Machine extension installed, that frustration could turn into a sigh of relief. When your browser runs into a dead end, the extension will search the archive to see if there is a saved copy on the Wayback Machine. If there is, it will ask you if you would like to visit that page.
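You can run a similar "is there a saved copy?" check yourself. The Internet Archive offers a public availability lookup at archive.org/wayback/availability that returns JSON describing the closest snapshot of a page. The sketch below is not necessarily how the extension works internally; it is just one way to ask the same question. The function names are made up for illustration, and the response shape (an archived_snapshots object with a closest entry) is written from memory of that lookup, so check it against the current documentation before relying on it.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/availability"


def build_availability_url(page_url: str) -> str:
    """Build the lookup address for page_url."""
    return AVAILABILITY_ENDPOINT + "?" + urlencode({"url": page_url})


def closest_snapshot(response: dict) -> Optional[str]:
    """Pull the snapshot address out of a decoded response, or return None."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def find_archived_copy(page_url: str, timeout: float = 10.0) -> Optional[str]:
    """Ask the archive over the network whether a saved copy exists."""
    with urlopen(build_availability_url(page_url), timeout=timeout) as reply:
        return closest_snapshot(json.load(reply))
```

Calling find_archived_copy with the address of a dead link returns the address of a saved copy if one exists, or None if the archive has nothing for that page.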
If you don’t use Chrome, don’t fret. There is a Wayback Machine extension available for Firefox; however, it is still a work in progress. Additionally, there are plans to develop an extension for Safari users as well.
Do you or your organization have a website that needs to be indexed and archived frequently? If so, manually archiving each individual web page using the methods above can be incredibly tedious and costly. Fortunately, the Internet Archive provides a service called Archive-It that can automate the archiving process for you.
This service is not free; however, it can be ideal for those who want to back up their content with a “set it and forget it” mentality. Just stipulate which pages you would like to save and how often. This paid subscription is perfect for those who wish to save their web content on a regular basis.
Do you use the Wayback Machine? If so, do you visit it purely for fun or do you find it a useful tool? Are there other ways to back up content on the Web? Let us know in the comments!