The need to pack and compress files together into a single archive has been around since computers first got hard drives, and it remains today. Everything from documents and photos to software and device drivers is uploaded and downloaded as archives every day. Most computer users are familiar with .zip files, but there is more to archives than just the humble .zip. In this tutorial, we will show you Linux’s various archiving and compression commands and the proper way to use them.
Tar and gzip
Historically, the default archive tool on Linux has been the tar command. The name originally stood for “Tape Archive”, from the days when tapes were the primary medium for moving data around. The tar command is very flexible: it can create, compress, update, extract and test archive files. The default extension for an uncompressed tar archive (sometimes called a tar file or tarball) is .tar, while compressed tar archives most often use .tar.gz or .tgz (meaning a tar file compressed with GNU zip). Tar offers several different compression methods, including gzip, bzip2, LZW (compress) and LZMA (xz).
To create an uncompressed tarball of all the files in a directory use the following command:
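A minimal sketch of that command, run in a scratch directory so that no real files are touched. The sample file names notes.txt and todo.txt are assumptions made up for illustration:

```shell
# Work in a throwaway directory.
cd "$(mktemp -d)"
# Two sample files (hypothetical names, for illustration only).
echo "first file" > notes.txt
echo "second file" > todo.txt

# c = create, v = verbose, f = write to the named archive file.
tar cvf somefiles.tar *
```

Running tar tf somefiles.tar afterwards lists the archive's contents without extracting anything.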
Here c means create, v stands for verbose (meaning tar will list the files as it archives them) and f tells tar that the next parameter is the name of the archive file, in this case “somefiles.tar”. The wildcard * means all the files in the directory (hidden files whose names start with a dot are not matched), as it would for most Linux shell commands.
The tarball “somefiles.tar” is created in the local directory. This can now be compressed with a tool like gzip, zip, compress or bzip2. For example:
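As a sketch of the gzip case, again in a scratch directory with a hypothetical sample file:

```shell
cd "$(mktemp -d)"
echo "first file" > notes.txt      # sample file, name is illustrative
tar cf somefiles.tar notes.txt     # build the uncompressed tarball first

# gzip compresses the tarball in place and renames it to somefiles.tar.gz.
gzip somefiles.tar
ls
```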
gzip will compress the tarball and add the .gz extension. The local directory now contains a somefiles.tar.gz file instead of the original somefiles.tar.
This two-step process, creating the tarball and then compressing it, can be reduced to one step using tar’s built-in compression:
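A sketch of the one-step form, assuming the same kind of scratch directory and made-up sample files as before:

```shell
cd "$(mktemp -d)"
echo "first file" > notes.txt      # sample files, names are illustrative
echo "second file" > todo.txt

# The extra z makes tar pipe the archive through gzip as it is written.
tar cvzf somefiles.tgz *
```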
This will create a gzip-compressed tarball called somefiles.tgz. The additional z option causes tar to compress the tarball. Instead of z you could use j, J or Z, which tell tar to use bzip2, xz and LZW compression respectively. xz implements LZMA2 compression, the same algorithm used by the popular Windows program 7-Zip.
It is possible to create a 7-Zip compatible file using the 7zr command. To create a .7z archive use:
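A sketch of that step, assuming 7zr (shipped in the p7zip package on many distributions) is installed. The guard and the sample file names are additions for illustration:

```shell
cd "$(mktemp -d)"
echo "first file" > notes.txt      # sample files, names are illustrative
echo "second file" > todo.txt

if command -v 7zr >/dev/null 2>&1; then
    # a = add the listed files to the archive (created if it does not exist).
    7zr a somefiles.7z *
else
    echo "7zr not found; install p7zip first" >&2
fi
```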
The a option means add, i.e. add all the local files to the archive somefiles.7z. This file can then be sent to a Windows user and they will be able to extract the contents without any problems.
Using 7zr for backups on Linux isn’t recommended, because 7-Zip does not store the owner/group information of the files it archives. It is possible, however, to use 7zr to compress a tarball (which does store the owner information). You can do this using a Unix pipe as follows:
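A sketch of the pipe, under the same assumptions (7zr installed, made-up sample files in a scratch directory):

```shell
cd "$(mktemp -d)"
echo "first file" > notes.txt      # sample files, names are illustrative
echo "second file" > todo.txt

if command -v 7zr >/dev/null 2>&1; then
    # tar writes the tarball to stdout (f -); 7zr reads it from stdin (-si).
    tar cvf - * | 7zr a -si somefiles.tar.7z
else
    echo "7zr not found; install p7zip first" >&2
fi
```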
The hyphen after the f option tells tar to send its output to stdout rather than to a file. Since we are using a pipe, the output from tar is fed into 7zr, which reads from stdin because of the -si option. 7zr then creates an archive called somefiles.tar.7z containing the tarball, which in turn contains the files. If you are unfamiliar with pipes, you could create a standard tarball and then compress it with 7zr in two steps:
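The two-step variant might look like this, with the same assumptions as before (7zr available, illustrative file names):

```shell
cd "$(mktemp -d)"
echo "first file" > notes.txt      # sample files, names are illustrative
echo "second file" > todo.txt

# Step 1: create an ordinary uncompressed tarball.
tar cvf somefiles.tar *

# Step 2: compress the tarball into a .7z archive.
if command -v 7zr >/dev/null 2>&1; then
    7zr a somefiles.tar.7z somefiles.tar
else
    echo "7zr not found; install p7zip first" >&2
fi
```

Unlike gzip, 7zr leaves the original somefiles.tar in place, so it can be deleted once the .7z archive has been created.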
Extracting the files from these different archives is also very simple. Here is a quick cheat sheet for extracting the different files created above.
To extract a simple tarball use:
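A sketch of the extraction. The setup lines build a small tarball first so the example is self-contained, and the sample file name is an assumption:

```shell
cd "$(mktemp -d)"
echo "first file" > notes.txt      # sample file, name is illustrative
tar cf somefiles.tar notes.txt
rm notes.txt                       # remove the original so extraction is visible

# x = extract, v = verbose, f = read from the named archive file.
tar xvf somefiles.tar
cat notes.txt
```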
x means extract.
To extract a compressed tarball use:
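A sketch for the gzip-compressed case, again with a made-up sample file created just for the demonstration:

```shell
cd "$(mktemp -d)"
echo "first file" > notes.txt      # sample file, name is illustrative
tar czf somefiles.tgz notes.txt
rm notes.txt                       # remove the original so extraction is visible

# z tells tar to decompress with gzip while extracting.
tar xvzf somefiles.tgz
cat notes.txt
```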
The z option tells tar that gzip was used to compress the original archive. Instead of z you could use j, J or Z, depending on the compression algorithm used when the file was created.
To extract the files from a 7-Zip file use:
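One way to do this, sketched under the assumption that 7zr is installed. The x command used here keeps directory paths, while 7zr's e command would extract everything into a single flat directory. The sample file is made up for illustration:

```shell
cd "$(mktemp -d)"
echo "first file" > notes.txt      # sample file, name is illustrative

if command -v 7zr >/dev/null 2>&1; then
    7zr a somefiles.7z notes.txt   # build a small archive to extract
    rm notes.txt
    # x = extract with full paths.
    7zr x somefiles.7z
else
    echo "7zr not found; install p7zip first" >&2
fi
```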
Linux offers a wide range of archiving and compression commands. Try experimenting with the zip and xz commands; they work in a very similar way to the other tools mentioned here. If you get stuck, try reading the man page, e.g. man xz, for extra help.