An In-Depth Look at Linux’s Archiving and Compression Commands

The need to pack and compress files together into a single archive has been around since computers first got hard drives, and that need remains today. Everything from documents and photos to software and device drivers is uploaded and downloaded as archives every day. Most computer users are familiar with .zip files, but there is more to archives than just the humble .zip. In this tutorial, we will show you Linux’s various archiving and compression commands and the proper way to use them.

Historically, the default archive tool on Linux has been the tar command. Originally it stood for “Tape Archive”, but that was back when tapes were the primary media for moving data around. The tar command is very flexible: it can create, compress, update, extract and test archive files. The default extension for an uncompressed tar archive (sometimes called a tar file or tarball) is .tar, while compressed tar archives most often use .tar.gz or the shorter .tgz extension (meaning a tar file compressed with GNU zip, i.e. gzip). Tar offers several different compression methods, including gzip, bzip2, LZW (compress) and LZMA (xz).

To create an uncompressed tarball of all the files in a directory use the following command:

tar cvf somefiles.tar *

c means create, v stands for verbose (meaning the tar command will list the files it is archiving) and f tells tar that the next parameter is the name of the archive file, in this case “somefiles.tar”. The wildcard * is expanded by the shell, as it would be for most Linux commands, to every file and subdirectory in the current directory whose name does not start with a dot.
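Because the shell, not tar, expands the wildcard, hidden files are easy to miss. The following is a minimal sketch of that behaviour, assuming a default shell configuration; the scratch directory and the file names (demo, notes.txt, .hidden) are made up for illustration. It also uses the t option, which lists an archive without extracting it.

```shell
# Scratch directory with illustrative file names (not from the article).
workdir=$(mktemp -d)
cd "$workdir"
mkdir demo
echo "hello" > demo/notes.txt
echo "secret" > demo/.hidden

cd demo
# The shell expands * before tar runs, and * skips names starting with a dot.
tar cf ../star.tar *
# Archiving "." instead picks up every entry, dotfiles included.
tar cf ../dot.tar .
cd ..

# t lists an archive's contents without extracting anything.
star_listing=$(tar tf star.tar)
dot_listing=$(tar tf dot.tar)
echo "Archived with *:"
echo "$star_listing"
echo "Archived with .:"
echo "$dot_listing"
```

If hidden files matter for your backup, archiving the directory itself (or ".") is usually the safer choice.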

The tarball “somefiles.tar” is created in the local directory. This can now be compressed with a tool like gzip, zip, compress or bzip2. For example:

gzip somefiles.tar

gzip will compress the tarball and add the .gz extension. Now in the local directory there is a somefiles.tar.gz file instead of the somefiles.tar file.
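As a small sketch of this replace-in-place behaviour, the commands below run the two steps in a scratch directory and then reverse the compression. The file name report.txt is only a placeholder.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"
echo "some text" > report.txt
tar cf somefiles.tar report.txt

# gzip replaces somefiles.tar with somefiles.tar.gz.
gzip somefiles.tar
after_gzip=$(ls)
echo "$after_gzip"

# -t checks the integrity of the compressed file without unpacking it.
gzip -t somefiles.tar.gz

# -d decompresses, which brings back the plain somefiles.tar.
gzip -d somefiles.tar.gz
after_gunzip=$(ls)
echo "$after_gunzip"
```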

This two-step process of creating the tarball and then compressing it can be reduced to one step using tar’s built-in compression:

tar cvzf somefiles.tgz *

This will create a gzip-compressed tarball called somefiles.tgz. The additional z option causes tar to compress the tarball. Instead of z you could use j, J or Z, which tell tar to use bzip2, xz and LZW (compress) compression respectively. xz implements LZMA2 compression, the same algorithm used by the popular Windows 7-Zip program.
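To get a feel for how the options compare, here is a rough sketch that archives the same sample data with z, j and J and lists the resulting sizes. The directory and file names are invented for the example, and the j and J steps are skipped if bzip2 or xz is not installed, since that varies between systems.

```shell
workdir=$(mktemp -d)
cd "$workdir"
mkdir data
# Repetitive sample text so each compressor has something to squeeze.
for i in 1 2 3; do
  for n in $(seq 1 2000); do
    echo "sample line $i"
  done > "data/file$i.txt"
done

tar cf plain.tar data          # no compression, for comparison
tar czf data.tar.gz data       # z: gzip
# Only try bzip2 and xz when the programs are available.
if command -v bzip2 >/dev/null 2>&1; then tar cjf data.tar.bz2 data; fi
if command -v xz >/dev/null 2>&1; then tar cJf data.tar.xz data; fi

# Compare the sizes of the plain and compressed archives.
ls -l plain.tar data.tar.*
```

The exact sizes depend on the data; highly repetitive text like this compresses far better than photos or already-compressed files would.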

It is possible to create a 7-Zip compatible file using the 7zr command. To create a .7z archive use:

7zr a somefiles.7z *

The a command means add, i.e. add all the local files to the archive somefiles.7z. This file can then be sent to a Windows user, who will be able to extract the contents without any problems.

Using 7zr on its own to perform backups on Linux isn’t recommended, as 7-Zip does not store the owner/group information of the files it archives. It is, however, possible to use 7zr to compress a tarball (which does store the owner information). You can do this using a Unix pipe as follows:

tar cvf - * | 7zr a -si somefiles.tar.7z

The hyphen after the f option tells tar to send its output to the Unix stdout and not to a file. Since we are using a pipe, the output from tar will be fed into 7zr which is waiting for input from stdin due to the -si option. 7zr will then create an archive called somefiles.tar.7z which will contain the tarball which in turn contains the files. If you are unfamiliar with using pipes, you could create a standard tarball and then compress it with 7zr using two steps:

tar -cvf somefiles.tar *
7zr a somefiles.tar.7z somefiles.tar
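The “f -” convention is not specific to 7zr; it works with any program that reads standard input. As an illustration, the sketch below uses gzip as a stand-in for 7zr (an assumption made only because gzip is installed almost everywhere) and also shows the reverse direction, where tar reads the archive from stdin. The file names are placeholders.

```shell
workdir=$(mktemp -d)
cd "$workdir"
mkdir src
echo "alpha" > src/a.txt
echo "beta" > src/b.txt

# "f -" sends the tarball to stdout; gzip reads it from stdin.
tar cf - src | gzip > piped.tar.gz

# The reverse: gzip -dc writes the tarball to stdout, and "f -" makes
# tar read the archive from stdin. The t option only lists the contents.
piped_listing=$(gzip -dc piped.tar.gz | tar tf -)
echo "$piped_listing"
```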

Extracting the files from these different archives is also very simple. Here is a quick cheat sheet for extracting the different files created above.

To extract a simple tarball use:

tar xvf somefiles.tar

Where x means extract.

To extract a compressed tarball use:

tar xvzf somefiles.tgz

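The z option tells tar that gzip was used to compress the original archive. Instead of z you could use j, J or Z depending on the compression algorithm used when the file was created. Recent versions of GNU tar also detect the compression automatically when extracting, so tar xvf somefiles.tgz works as well.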
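It is often worth checking what an archive contains before unpacking it, and extracting into a separate directory rather than on top of existing files. The sketch below assumes a tar that supports the -C option (GNU tar and bsdtar both do); the directory names project and restored are illustrative only.

```shell
workdir=$(mktemp -d)
cd "$workdir"
mkdir project restored
echo "first" > project/one.txt
echo "second" > project/two.txt
tar czf somefiles.tgz project

# t lists the contents without writing anything to disk.
tar tzf somefiles.tgz

# x extracts; -C makes tar change to the target directory first.
tar xzf somefiles.tgz -C restored

# diff -r prints nothing and exits with 0 when both trees are identical.
diff -r project restored/project && echo "restored copy matches"
```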

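To extract the files from a 7-Zip file use the e command. Note that e places every file in the current directory and discards any stored directory structure; use x instead of e to keep the full paths: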

7zr e somefiles.tar.7z
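Because somefiles.tar.7z wraps a tarball, extracting it yields somefiles.tar, which still has to be unpacked with tar. The two steps can be combined with a pipe. This is a sketch that assumes 7zr (from the p7zip package) is installed and supports the -so switch for writing to stdout; the guard skips the 7zr part otherwise. The directory names src and out are placeholders.

```shell
workdir=$(mktemp -d)
cd "$workdir"
mkdir src out
echo "alpha" > src/a.txt
tar cf somefiles.tar src

if command -v 7zr >/dev/null 2>&1; then
  7zr a somefiles.tar.7z somefiles.tar >/dev/null
  # -so writes the extracted tarball to stdout; tar reads it from
  # stdin ("f -") and unpacks it into the out directory.
  7zr x -so somefiles.tar.7z | tar xf - -C out
  result=$(cat out/src/a.txt)
else
  result="7zr not installed"
fi
echo "$result"
```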

Linux offers a wide range of archiving and compression commands. Try experimenting with the zip and xz commands; they work in a very similar way to the other tools mentioned here. If you get stuck, try reading the man page (e.g. man xz) for extra help.