Fetching Files with wget

Perhaps your X server has crashed. Or you’re working on a shell script. Or you’re SSHed into a headless server. For one reason or another, if you need to fetch a file and a web browser’s not an option, you might want to look into wget. If you’ve used the Linux shell much and worked with scripts or package installers, there’s a good chance you’ve seen wget in action. At the simplest level, it does just what the name implies and gets a file from the web (or FTP). Underneath that, though, is some clever functionality.

As noted above, wget is most commonly used to quickly grab a file from somewhere on the web.

wget http://mydomain.com/file.zip

This simply grabs the file and saves it to the current directory under its original name.

If you want to save to a different filename or location, use the -O flag to specify the output path.

#Remember it's a capital "O" not zero or small "o"
wget http://download.maketecheasier.com/Firefox_shortcut_keys.pdf -O Documents/ffkeys.pdf

You may be downloading multiple files, in which case you may want to specify a location for all downloads. Just use the -P flag (or --directory-prefix=LOCATION) to specify where they go.
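For example, the following (using a placeholder URL and directory name) drops the file into a Downloads directory, creating it if necessary:

```shell
# -P sets the directory prefix; wget creates the directory if it doesn't exist.
# The URL and directory here are placeholders for this example.
wget -P Downloads/ http://mydomain.com/file.zip
```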

Running wget normally produces a lot of progress output. Let’s try the -q option to silence it, making it more suitable for scripts.

wget -q http://mydomain.com/file.zip

Similarly, you can use the -nv (no-verbose) option for just a little output, but not as much as the default.
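With the same placeholder URL as above, -nv reduces the output to a single summary line per file:

```shell
# -nv (no-verbose) prints one line per download instead of a full progress bar.
wget -nv http://mydomain.com/file.zip
```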

What if your download failed, and you want to resume? What if the file already exists, and you don’t want to overwrite it? There are options to handle those as well as several other situations.

To resume a broken download, use the -c flag (or --continue).
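Using the placeholder URL from earlier, resuming a partial download looks like this:

```shell
# -c picks up where a partially downloaded file.zip left off
# instead of starting the download over from scratch.
wget -c http://mydomain.com/file.zip
```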

If you need to make sure your command (or script) doesn’t overwrite any existing files, use the -nc (no-clobber) option.
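For example (again with a placeholder URL):

```shell
# -nc skips the download entirely if file.zip already exists locally.
wget -nc http://mydomain.com/file.zip
```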

There are times when you can’t be certain if filenames will be case-sensitive on both ends, but the --ignore-case flag will negate that problem.

To limit the download rate, use the --limit-rate=RATE option, where RATE is in bytes per second (the k and m suffixes are also accepted), as demonstrated below.

wget --limit-rate=20000 http://ftp.us.debian.org/debian-cd/5.0.7/amd64/iso-cd/debian-507-amd64-netinst.iso

As noted earlier, wget supports FTP as well. If you just specify an FTP site, like

wget ftp://ftp.us.debian.org/debian-cd/5.0.7/amd64/iso-cd/debian-507-amd64-netinst.iso

wget will assume you want an anonymous login. If that’s not the case, you can manually specify things like username and password with the following flags:

  • --ftp-user=USER Specifies the username for login.
  • --ftp-password=PASS Specifies the password.
  • --no-passive-ftp Disables passive transfer mode.

There are a few other FTP-related flags for more advanced use cases as well.
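An authenticated FTP download might look like the following; the server address, username, and password here are all placeholders:

```shell
# Supply explicit FTP credentials instead of the anonymous default.
# Server, user, and password are placeholder values for this example.
wget --ftp-user=myuser --ftp-password=mypass ftp://ftp.example.com/pub/file.zip
```

Note that credentials passed on the command line are visible to other users via the process list, so this is best reserved for scripts on machines you control.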

Finally, wget comes with several options relating to server connection problems and timeouts. Not all failures can be dealt with, of course, but the following flags are all intended to help deal with server issues:

  • --tries=NUMBER Specifies the number of times to retry a download.
  • --retry-connrefused Retries the download even if the connection is refused by the server.
  • --continue Resumes an incomplete download, as -c above.
  • --timeout=SECONDS Sets all network timeouts (DNS, connect, and read) at once.
  • --wait=SECONDS Specifies how long to wait between successive downloads (if repeating).
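Putting several of these together, a more fault-tolerant download (with a placeholder URL) might look like:

```shell
# Retry up to 5 times, wait 10 seconds between retrievals, give up on a
# stalled connection after 30 seconds, and resume any partial file.
wget --tries=5 --wait=10 --timeout=30 --continue http://mydomain.com/file.zip
```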

For such a simple, ubiquitous command-line utility, wget has a surprising amount to offer. Next time you find yourself writing an internet-aware shell script, or needing to get that missing driver file on a broken computer, give wget a shot. If you’ve got any interesting stories about how wget has got you out of a jam, let us know in the comments below.