How do you know for sure that the 4 GB file you just downloaded has been transferred without error? One way is to use a hash algorithm that produces a “fingerprint” or “message digest” of the downloaded file. Like a human fingerprint, the resulting character string is meant to be unique: for practical purposes, only that file should produce that fingerprint. Sites offering large downloads, say a Linux distribution like Fedora, will also publish a list of the hashes for the files. All you need to do is check the hash of the file you have against the published hash. If they are the same, the file has been downloaded correctly.
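The comparison described above can be sketched in a few lines of shell. This is only an illustration, assuming a GNU/Linux system with sha256sum installed. The file download.bin is a hypothetical stand-in for a real download, and the "published" value here is the SHA256 hash of the line "hello" rather than a hash taken from any real site.

```shell
# Hypothetical stand-in for a downloaded file (a single line: "hello").
workdir=$(mktemp -d)
printf 'hello\n' > "$workdir/download.bin"

# In real use, paste the hash published by the download site here.
published="5891b5b522d5df086d0ff0b110fbd9d21bb4fc7163af34d08286a2e846f6be03"

# Hash the local file and keep only the fingerprint column.
actual=$(sha256sum "$workdir/download.bin" | cut -d ' ' -f 1)

# Compare the two strings.
if [ "$actual" = "$published" ]; then
    verdict="match"
else
    verdict="MISMATCH"
fi
echo "$verdict"
```

Comparing the strings in a script rather than by eye avoids overlooking a single differing character in a long fingerprint.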
In the past, the preferred hashing algorithm was MD5 and although it is still widely used (for example the Ubuntu project still provides MD5 hashes), it is slowly being replaced by the SHA family of hashes. The problem with MD5 is that it is possible to deliberately create multiple files with the same fingerprint. In one case famous among cryptographers, security researchers claimed to know who would win the presidential election. They had created a file with the result in it and issued its MD5 hash, promising to release the file after the election and prove that their prediction was right. In fact, what they had done was create several files covering all the possible winners and manipulate the files in such a way that they all had the same MD5 fingerprint!
There are several different Secure Hash Algorithms (SHA) including SHA1, SHA256 and SHA512. Technically SHA256 and SHA512 share essentially the same design, but process the data in different sized units: SHA256 works on 32-bit words and SHA512 on 64-bit words.
SHA1 is similar to MD5 and, as with MD5, there are concerns about the uniqueness of the resulting hashes. It has not been approved for many cryptographic uses since 2010. However, if you find a site which publishes SHA1 hashes, you can check them by running the sha1sum command with the name of the downloaded file as its argument.
The output is the fingerprint, 40 hexadecimal characters long, followed by the name of the file.
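As a small, self-contained illustration (not the original article's example), the following hashes a one-line text file instead of a real download. The file name hello.txt and the variable names are hypothetical.

```shell
# Create a tiny sample file containing the single line "hello".
workdir=$(mktemp -d)
printf 'hello\n' > "$workdir/hello.txt"

# Run sha1sum from inside the directory so only the bare file name is printed.
sha1_line=$(cd "$workdir" && sha1sum hello.txt)
echo "$sha1_line"
# prints: f572d396fae9206628714fb2ce00f72e94f2258f  hello.txt
```

The first column is the fingerprint to compare with the published value. The second column is the name of the file that was hashed.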
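SHA256 hashes are generated in the same way, using the sha256sum command.
The output is similar, except that the fingerprint string is much longer, at 64 hexadecimal characters.
Likewise for SHA512, using the sha512sum command.
The resulting fingerprint is even longer, at 128 hexadecimal characters.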
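To see the difference in fingerprint length without scrolling through long strings, a short sketch like the following can help. It assumes the GNU coreutils sha1sum, sha256sum and sha512sum commands are installed. The helper digest_len and the sample file are hypothetical names introduced for illustration.

```shell
# Sample file to hash; any file would do.
workdir=$(mktemp -d)
printf 'hello\n' > "$workdir/hello.txt"

# Print the number of hex characters in the digest produced by command $1 for file $2.
digest_len() {
    digest=$("$1" "$2" | cut -d ' ' -f 1)
    echo "${#digest}"
}

for cmd in sha1sum sha256sum sha512sum; do
    echo "$cmd: $(digest_len "$cmd" "$workdir/hello.txt") hex characters"
done
# prints:
# sha1sum: 40 hex characters
# sha256sum: 64 hex characters
# sha512sum: 128 hex characters
```

Each hexadecimal character encodes 4 bits, so these lengths correspond to 160-bit, 256-bit and 512-bit digests.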
Rather than just publishing the fingerprint string in isolation, some sites offer a checksum file which contains all the hash information in a machine-readable form that the various SHA commands (sha1sum, sha256sum, sha512sum) can use to verify files automatically. In a checksum file such as the one for the net install 32-bit Intel version of Fedora 19, each fingerprint appears on a line of its own, followed by the name of the file it belongs to.
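The real Fedora checksum file is not reproduced here. As a hedged sketch of the general idea, a file in the same basic format can be generated by redirecting the output of sha256sum. The names a.txt, b.txt and SAMPLE-CHECKSUM are hypothetical.

```shell
# Two small sample files standing in for downloadable images.
workdir=$(mktemp -d)
printf 'first file\n' > "$workdir/a.txt"
printf 'second file\n' > "$workdir/b.txt"

# Write one "fingerprint  filename" line per file into a checksum file.
(cd "$workdir" && sha256sum a.txt b.txt > SAMPLE-CHECKSUM)

# Show the result: two lines, each a 64-character hash followed by a file name.
cat "$workdir/SAMPLE-CHECKSUM"
```

Checksum files published by a project may carry extra material, such as comments or a cryptographic signature, around these lines.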
To check this, use the “-c” parameter like this:
sha256sum -c Fedora-19-i386-CHECKSUM
Fedora-19-i386-CHECKSUM is the name of the file containing the fingerprint information as shown above.
If the fingerprints match, then the output will look like this:
Fedora-19-i386-netinst.iso: OK
If there is an error in the downloaded file, the output will be:
Fedora-19-i386-netinst.iso: FAILED
sha256sum: WARNING: 1 computed checksum did NOT match
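Both outcomes can be reproduced safely on a throwaway file. The sketch below assumes GNU sha256sum. The file sample.iso and the checksum file SAMPLE-CHECKSUM are hypothetical stand-ins, and the "corruption" is simulated by overwriting the file.

```shell
# Create a sample file and a checksum file that describes it.
workdir=$(mktemp -d)
printf 'original contents\n' > "$workdir/sample.iso"
(cd "$workdir" && sha256sum sample.iso > SAMPLE-CHECKSUM)

# Case 1: the file is intact, so verification succeeds.
ok_output=$(cd "$workdir" && sha256sum -c SAMPLE-CHECKSUM)
echo "$ok_output"    # prints: sample.iso: OK

# Case 2: simulate a damaged download by changing the file's contents.
printf 'corrupted contents\n' > "$workdir/sample.iso"
check_status=0
failed_output=$(cd "$workdir" && sha256sum -c SAMPLE-CHECKSUM 2>&1) || check_status=$?
echo "$failed_output"            # reports sample.iso: FAILED plus a warning
echo "exit status: $check_status"  # non-zero when any checksum fails
```

The non-zero exit status is what makes the “-c” option convenient in scripts, since a failed check can stop an installation step without anyone having to read the output.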
Your Linux distribution likely also contains the sha224sum and sha384sum commands. These two hash algorithms are truncated versions of SHA256 and SHA512 respectively. They can be used in exactly the same way as the sha256sum and sha512sum commands. Try producing hashes using them and notice the differences in the output.
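One way to try this, sketched under the assumption that the GNU coreutils sha224sum and sha384sum commands are present on your system (the sample file and variable names are hypothetical):

```shell
# Sample file to hash.
workdir=$(mktemp -d)
printf 'hello\n' > "$workdir/hello.txt"

# Keep only the fingerprint column from each command's output.
sha224_digest=$(sha224sum "$workdir/hello.txt" | cut -d ' ' -f 1)
sha384_digest=$(sha384sum "$workdir/hello.txt" | cut -d ' ' -f 1)

echo "SHA224: ${#sha224_digest} hex characters"   # prints: SHA224: 56 hex characters
echo "SHA384: ${#sha384_digest} hex characters"   # prints: SHA384: 96 hex characters
```

The shorter lengths reflect the truncation. A SHA224 fingerprint is not simply the first 56 characters of the SHA256 fingerprint of the same file, however, because the two algorithms start from different initial values.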