How to Use Wget to Download Websites to Your PC

Ever wonder if there was a way to download a website without a web browser? You’re in luck. With the power of the Linux command line, almost anything is possible. There are several ways to do this, but this article focuses on wget.

What Is wget?

wget is a GNU command-line utility for retrieving content from web servers. As a downloader, wget is very powerful in its own right. wget is capable of working with multiple protocols, such as HTTP, HTTPS and FTP. Other capabilities of the wget utility include:

runs silently or in the background
integrates with Linux scripts and cron jobs
downloads multiple files in a single command
downloads files that require a password

Why wget?

While many tools can retrieve web content, wget covers a broad range of tasks. It lets you work without a web browser by:

downloading a full copy of a website
downloading a specific file from a website
automating the retrieval of a file on demand
obtaining a document from an authentication portal

wget also comes preinstalled on many Linux distros, so it is often available right from the start with no further installation. If it is missing, you can install it through your distro's package manager.

wget Basics

Getting started with wget is fairly simple. First, open a Linux Terminal.

Once a terminal window is open, you can run wget as shown below:

wget URL

Replace “URL” with the exact URL of the website. By default, this saves only the page at that address, not the entire site.
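
To copy an entire site for offline browsing rather than a single page, wget's recursive options come into play. The following is a minimal sketch, not a definitive recipe: the function name mirror_site and the example.com address are placeholders chosen for illustration, and the right set of flags may differ from site to site.

```shell
# mirror_site is a hypothetical helper name; pass it the site's start URL.
mirror_site() {
    # --mirror            recursive download with timestamping
    # --convert-links     rewrite links so the local copy works offline
    # --page-requisites   also fetch the images, CSS, and scripts each page needs
    # --adjust-extension  save pages with matching extensions such as .html
    # --no-parent         never climb above the starting directory
    wget --mirror --convert-links --page-requisites \
         --adjust-extension --no-parent "$1"
}

# Example call (example.com is a placeholder):
# mirror_site https://example.com/
```

wget saves the copy in a directory named after the host, inside the current directory.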

To resume a partially downloaded file, use a -c switch in your command as follows:

wget -c URL

To make your wget download silent, add the -q switch to your initial wget command:

wget -q URL

If you are not sure which options to use or how to use them, list them all with:

wget --help

Besides web pages, wget can also download individual files. For example:

wget https://example.com/file.zip

This grabs the file and saves it to the current directory.

To save it under a different filename or in a different location, use the -O flag:

wget https://example.com/file.zip -O ~/Documents/my_downloaded_file.zip
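
The switches covered so far can be combined in a single command. As a rough sketch, the helper below resumes a download quietly and saves it into a chosen directory. The name fetch_to_dir and the paths are assumptions for illustration. It uses wget's -P (directory prefix) option, which is not covered above, in place of -O so that the file keeps its remote name.

```shell
# fetch_to_dir is a hypothetical helper: fetch_to_dir URL DIRECTORY
fetch_to_dir() {
    # Make sure the target directory exists.
    mkdir -p "$2"
    # -c resumes a partial download, -q suppresses output,
    # -P sets the directory the file is saved into.
    wget -c -q -P "$2" "$1"
}

# Example call (URL and directory are placeholders):
# fetch_to_dir https://example.com/file.zip "$HOME/Documents"
```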

FTP Options

As noted earlier, wget supports FTP as well. If you just specify an FTP site:

wget ftp://ftp.example.com

wget will assume you want an anonymous login. Alternatively, you can manually specify things like username and password with the following flags:

--ftp-user=USER: specifies the username for login
--ftp-password=PASS: specifies password
--no-passive-ftp: disables passive transfer mode
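
Put together, an authenticated FTP download might look like the sketch below. The helper name ftp_fetch, the FTP_PASS environment variable, and the server path are all assumptions made for this example, not anything wget defines.

```shell
# ftp_fetch is a hypothetical helper: ftp_fetch USER FTP_URL
# The password is read from FTP_PASS (an assumed variable name), so it does
# not have to be typed into the command itself.
ftp_fetch() {
    wget --ftp-user="$1" --ftp-password="$FTP_PASS" "$2"
}

# Example call (user, password, and path are placeholders):
# FTP_PASS='secret' ftp_fetch alice ftp://ftp.example.com/pub/report.pdf
```

If the server has trouble with passive mode, --no-passive-ftp can be added to the wget line.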

Timeouts, Retries, and Failed Downloads

wget also comes with several options for handling server connection problems and timeouts. Not all failures can be dealt with, of course, but the following flags are intended to help with server issues:

--tries=NUMBER: specifies the number of times to attempt the download
--retry-connrefused: retries the download even if the server refuses the connection
--timeout=SECONDS: global setting for how long to wait before timing out
--wait=SECONDS: how long to wait between retrievals when downloading more than one file
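
These flags can be used together. The sketch below wraps them in a helper; the name robust_fetch and the specific numbers are illustrative assumptions rather than recommended values.

```shell
# robust_fetch is a hypothetical helper; it accepts one or more URLs.
robust_fetch() {
    # Try each download up to 5 times, retry even when the connection is
    # refused, give up on a stalled network operation after 30 seconds,
    # and pause 2 seconds between retrievals.
    wget --tries=5 --retry-connrefused --timeout=30 --wait=2 "$@"
}

# Example call (placeholder URL):
# robust_fetch https://example.com/file.zip
```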

Who Would Use wget?

In reading this post, you may be thinking, “This sounds complicated and far more difficult than using a web browser,” but anyone can find a use for this utility, whether as a systems admin or a programmer. Below are two examples of how I use this command throughout my day, with my role sometimes changing.

It makes my work as a security researcher easier because I can schedule this command to download multiple websites in one run. I do this by creating a text file (using any text editor) that lists the URLs, one per line. When run with the -i switch, as in the command below, wget downloads each URL in the list.

wget -i download_file_name
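
As a small sketch of that workflow, the snippet below creates a URL list and defines a helper that hands it to wget. The file name urls.txt, the helper name download_list, the addresses, and the path in the cron line are all placeholders.

```shell
# Create the list: one URL per line (placeholder addresses).
cat > urls.txt <<'EOF'
https://example.com/
https://example.org/
EOF

# download_list is a hypothetical helper that feeds a list file to wget.
download_list() {
    wget -i "$1"
}

# Example call:
# download_list urls.txt
#
# A crontab entry along these lines would run the download quietly every
# night at 02:00 (the path is an assumption):
# 0 2 * * * wget -q -i /home/user/urls.txt
```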

As a systems administrator, I can obtain documents from password-protected locations with ease. This is less useful offline, but wget lets you pass credentials to a site directly on the command line:

wget --user=user_id --password=user_password URL
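
A password typed on the command line can end up in your shell history. One alternative, sketched below, is wget's --ask-password option, which prompts for the password at run time and cannot be combined with --password. The helper name fetch_protected and the example values are hypothetical.

```shell
# fetch_protected is a hypothetical helper: fetch_protected USER URL
fetch_protected() {
    # --ask-password makes wget prompt for the password instead of
    # reading it from the command line.
    wget --user="$1" --ask-password "$2"
}

# Example call (user and URL are placeholders):
# fetch_protected user_id https://example.com/private/report.pdf
```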

There you have it! Was it as difficult as you thought? Automating your downloads with wget will save you time and let you work offline as well. What do you have to lose?

Leave a comment below and let us know whether you found this useful.