MTE Explains: Linux Disk Structure And Why It Doesn’t Need Defragmentation

Is Linux so great that its data never gets fragmented? FALSE. Linux is great (that’s true), but even in your wildest dreams there is no way to completely eradicate disk fragmentation. I see you coming: “I have never defragmented my Linux system, in fact I have never seen a tool for that, and my computer has never shown any problem. Are you sure you know what you are talking about?” Well, you are right. In principle, you do not have to defragment your disk on Linux. Data can be fragmented, but in theory you do not have to worry about it, and here’s why.

We’ve all heard of defragmentation. But first, what is data fragmentation? To explain the concept, one of the best examples you can find is the one Roberto Di Cosmo used at a conference in 1998:

Your hard drive (or any other storage device) is like a shelf divided into boxes. All the boxes are the same size and you use the shelf to store folders and files. When the shelf is empty, it is easy to put a folder in a box. If the folder is too large to fit in one box, you divide it and store the excess in the box next to it. You can do that as long as you have enough space left. However, when you are dealing with data on a computer, especially data used by programs, the size varies a lot. Files grow, get deleted or get moved. So really quickly, your shelf becomes a mess. Some boxes are half-empty, others cannot hold a growing folder. There are no free boxes at the bottom of the shelf (you started at the top) but you still need to store a new folder. Therefore you look for free space in the earlier boxes. In the end, your folder is divided and stored alongside parts of other folders. You can imagine how difficult it is going to be to fetch the entire folder from the shelf. Even if you wrote down where you stored the different parts, you still have to search in different boxes to gather all the pieces.
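To make the shelf picture concrete, here is a small toy model in Python. It is only a sketch of the analogy, not of how any real file system works: the shelf is a list, each box holds one piece of one folder, and the helper names (`store`, `remove`, `fragments`) are made up for this illustration. New folders simply go into the first free boxes, starting from the top.

```python
def store(shelf, name, boxes_needed):
    """Put `boxes_needed` pieces of folder `name` into the first free boxes."""
    if shelf.count(None) < boxes_needed:
        raise RuntimeError("not enough free boxes on the shelf")
    placed = []
    for i, box in enumerate(shelf):
        if box is None:
            shelf[i] = name
            placed.append(i)
            if len(placed) == boxes_needed:
                break
    return placed


def remove(shelf, name):
    """Empty every box that holds a piece of folder `name`."""
    for i, box in enumerate(shelf):
        if box == name:
            shelf[i] = None


def fragments(shelf, name):
    """Count the separate runs of neighboring boxes that hold `name`."""
    runs = 0
    previous = None
    for box in shelf:
        if box == name and previous != name:
            runs += 1
        previous = box
    return runs


shelf = [None] * 10        # an empty shelf with ten boxes
store(shelf, "A", 3)       # boxes 0-2
store(shelf, "B", 2)       # boxes 3-4
store(shelf, "C", 3)       # boxes 5-7
remove(shelf, "B")         # leaves a two-box hole in the middle
store(shelf, "D", 4)       # fills the hole, then spills to the end

print(shelf)
print("Folder D is stored in", fragments(shelf, "D"), "separate pieces")
```

Folder D needs four boxes, but the first free ones are the two-box hole left by B, so half of it lands there and the other half at the bottom of the shelf. That split is exactly what fragmentation means.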


You can now imagine the pain of your computer searching for a file when the disk is really fragmented. Compared to your processor’s speed, the time your hard drive needs to find a fragmented folder is a little eternity. So in order to stop the suffering and the delays, we use the defragmentation process. It basically does what it sounds like: it takes everything out and tries to put all the folders back in order, getting rid of the wasted space and storing the divided parts back next to each other.

Linux does not face the problem of the shelf. At least not to that extent. This is due to the type of file system created specifically for Linux: ext4. Ext4, like other file systems, manages the data and the space on a hard drive, but it also does its best to prevent fragmentation. Going back to the shelf concept, when you store a folder in a box, ext4 automatically books the neighboring boxes. It tries to anticipate the folder’s growth, and actually does it quite well. That way, folders rarely need to be divided and the shelf stays ordered.
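The booking idea can be sketched with the same kind of toy shelf. Keep in mind that this is an illustration of the analogy only: ext4’s real allocator is far more involved, and the function names and the `extra` parameter below are assumptions made up for this example.

```python
def store_with_booking(shelf, booked, name, size, extra):
    """Store `size` pieces of `name` and book `extra` neighboring boxes.

    Looks for the first run of size + extra boxes that are both empty
    and not booked by another folder. Returns True on success.
    """
    need = size + extra
    run_start, run_len = 0, 0
    for i in range(len(shelf)):
        if shelf[i] is None and i not in booked:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == need:
                for j in range(run_start, run_start + size):
                    shelf[j] = name
                for j in range(run_start + size, run_start + need):
                    booked[j] = name
                return True
        else:
            run_len = 0
    return False


def grow(shelf, booked, name):
    """Grow folder `name` by one box, using its nearest booked box."""
    mine = sorted(i for i, owner in booked.items() if owner == name)
    if not mine:
        return False  # nothing booked: fall back to filling holes
    box = mine[0]
    del booked[box]
    shelf[box] = name
    return True


shelf = [None] * 12
booked = {}  # box index -> folder that booked it

store_with_booking(shelf, booked, "A", size=2, extra=2)  # A in 0-1, books 2-3
store_with_booking(shelf, booked, "B", size=2, extra=2)  # B in 4-5, books 6-7
grow(shelf, booked, "A")  # A expands into box 2
grow(shelf, booked, "A")  # A expands into box 3

print(shelf)
```

Because B was placed after A’s booked boxes rather than right next to A, folder A can grow twice and still sit in one unbroken run. The price is visible too: booked boxes are not available to other folders, which is why this approach needs spare room on the shelf.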

The downside is that the method requires a lot of free space. If there are no free boxes left on the shelf, ext4 has no choice but to go back to the old method of filling the holes. This can happen if you have less than 20% free space left on your hard drive. So in general, your hard drive is not fragmented, or if it is, the fragmented part is frequently less than 3% of the disk.
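If you want to keep an eye on that 20% figure, a few lines of Python are enough. This is a minimal sketch using the standard library’s `shutil.disk_usage`; the function names and the default threshold are choices made for this example, and the path is an assumption you should adapt to the partition you care about.

```python
import shutil


def free_fraction(path="/"):
    """Return the share of the file system at `path` that is still free (0.0 to 1.0)."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return usage.free / usage.total


def has_headroom(path="/", threshold=0.20):
    """True if more than `threshold` of the file system at `path` is free."""
    return free_fraction(path) > threshold


share = free_fraction("/")
print(f"Free space on /: {share:.0%}")
if not has_headroom("/"):
    print("Less than 20% free: time to clean up or add storage.")
```

Note that the value reported is the space available to an ordinary user, so it may differ slightly from what other tools show.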

YES, there can be fragmentation on Linux, but NO, you do not have to do anything about it. The only advice I will give you is to manage your hard drive well, use LVM if you can and leave more than 20% of free space at all times. If for some reason you suspect heavy fragmentation, the simplest solution is to move everything to a separate device and transfer it back. Ext4 should do the rest.

Do you have another tip against fragmentation? Or another question about the subject? Please let us know in the comments.

Image credit: Storage by BigStockPhoto