MTE Explains: How File Recovery Works on a PC

It’s happened to all of us. You’ve mistakenly deleted a file or folder and emptied the recycle bin. Or maybe Windows refused to boot, and when you connected your hard drive to another PC, you were unable to read your data. Will you ever get your data back? This article will focus on understanding the concepts behind file recovery on a PC.

[Image: logical structure of a hard drive]

First off, let’s go through the logical structure of a hard drive. Typically, a hard drive is composed of a boot sector, an index, and data. The boot sector sits at the beginning of the drive; it is normally used to start the operating system and includes details about the drive’s partitions. The index contains information about the files and folders that exist on the drive. On a Windows system using NTFS, that is the MFT, or Master File Table. The data section of the drive holds the actual files and their contents.
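To make the boot sector idea concrete, here is a minimal Python sketch that reads the partition table out of a classic MBR-style boot sector. It assumes the legacy MBR layout (four 16-byte partition entries starting at byte 446 and a 0x55 0xAA signature in the last two bytes); newer GPT disks are laid out differently. The `parse_mbr` function and the synthetic `demo` sector are made up for illustration and do not come from any recovery tool.

```python
import struct

SECTOR_SIZE = 512
PARTITION_TABLE_OFFSET = 446   # four 16-byte entries start here in a classic MBR
BOOT_SIGNATURE = b"\x55\xaa"   # last two bytes of a valid MBR sector


def parse_mbr(sector: bytes) -> list:
    """Return the non-empty partition entries found in a 512-byte MBR sector."""
    if len(sector) != SECTOR_SIZE or sector[510:512] != BOOT_SIGNATURE:
        raise ValueError("not a valid MBR boot sector")
    partitions = []
    for i in range(4):
        start = PARTITION_TABLE_OFFSET + 16 * i
        entry = sector[start:start + 16]
        part_type = entry[4]                                      # 0 means "unused slot"
        first_lba, sectors = struct.unpack_from("<II", entry, 8)  # little-endian uint32 pair
        if part_type != 0:
            partitions.append(
                {"index": i, "type": part_type, "first_lba": first_lba, "sectors": sectors}
            )
    return partitions


# Synthetic sector for demonstration: one bootable partition of type 0x07
# (the type commonly used for NTFS) starting at sector 2048.
demo = bytearray(SECTOR_SIZE)
demo[510:512] = BOOT_SIGNATURE
demo[446:462] = struct.pack("<B3sB3sII", 0x80, b"\x00" * 3, 0x07, b"\x00" * 3, 2048, 1_000_000)
print(parse_mbr(bytes(demo)))
```

On a real drive you would read the first 512 bytes of the raw device (which needs administrator rights) instead of building the sector by hand.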

[Image: emptying the Recycle Bin]

What happens when you delete a file and empty the Recycle Bin? Is it gone for good? The contents are not really disposed of permanently. The index entry that points to the file is marked as deleted, and the file’s content area is marked as available for future use. This means that the physical data is still around until it has been overwritten by another file. Therefore, if you hope to recover files, stop using the drive right away: turn off your computer and attempt the recovery with the hard drive connected to a different computer. Otherwise, any activity on the drive may overwrite the data of your deleted files.
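The behaviour described above can be sketched with a toy model in Python. This is not how NTFS is implemented; `ToyDisk` is a hypothetical class with one block per file, invented only to show that deleting flips a flag in the index while the data stays put until something else is written over it.

```python
class ToyDisk:
    """A deliberately tiny 'drive': an index plus a list of data blocks."""

    def __init__(self, num_blocks):
        self.blocks = [b""] * num_blocks   # the data area
        self.index = {}                    # name -> {"block": int, "deleted": bool}

    def _first_free_block(self):
        # Blocks that belong to deleted files count as free space.
        in_use = {e["block"] for e in self.index.values() if not e["deleted"]}
        for i in range(len(self.blocks)):
            if i not in in_use:
                return i
        raise OSError("disk full")

    def write(self, name, data):
        block = self._first_free_block()
        self.blocks[block] = data
        self.index[name] = {"block": block, "deleted": False}

    def delete(self, name):
        # Only the index entry changes; the block contents are left alone.
        self.index[name]["deleted"] = True

    def recover(self, name):
        # Follow the (deleted) index entry back to whatever is in its block now.
        return self.blocks[self.index[name]["block"]]


disk = ToyDisk(num_blocks=2)
disk.write("report.txt", b"quarterly numbers")
disk.delete("report.txt")
recovered_early = disk.recover("report.txt")   # data is still there

disk.write("movie.bin", b"new data")           # reuses the freed block
recovered_late = disk.recover("report.txt")    # now returns the wrong content

print(recovered_early, recovered_late)
```

The second recovery attempt returns the contents of `movie.bin`, which is the toy equivalent of ending up with a file full of junk.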

In the case of data corruption, there are many possible causes. The loss of power at an inopportune moment or a random computer crash/reboot could corrupt a segment of a file, the filesystem more broadly, or the MFT itself. Similar to the deleted files scenario, you should turn the system off and attempt recovery on a second computer.

In either scenario, deleted files or data corruption, there is a good chance of recovering the data. The success rate depends on how long ago the files were deleted or the corruption occurred, and on how much the computer has been used since then.

There are a large number of programs available that can facilitate data recovery. In my experience, TestDisk has worked extremely well.

Most data recovery applications have some sort of quick scan feature. This is usually only used for deleted files and requires that the logical drive is visible to the operating system, i.e. that the partition is not corrupt, is mounted correctly, and can actually be browsed in Explorer. A quick scan reads the file table (the MFT discussed earlier) and looks for files that have been marked as deleted.

[Image: quick scan for deleted files]
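As a rough illustration of what a quick scan looks for, the Python sketch below walks a block of MFT-style records and reports the ones not flagged as in use. It assumes the commonly documented NTFS record layout (a "FILE" signature at the start, a 16-bit flags field at offset 0x16 where bit 0x0001 means "in use" and 0x0002 means "directory") and a 1024-byte record size. Treat those details as assumptions to check against NTFS documentation. The helper names are hypothetical, and the records are synthetic rather than read from a real drive.

```python
import struct

MFT_RECORD_SIZE = 1024     # typical record size; assumed here
FLAGS_OFFSET = 0x16        # assumed offset of the 16-bit flags field
FLAG_IN_USE = 0x0001
FLAG_DIRECTORY = 0x0002


def classify_record(record: bytes) -> str:
    """Describe one record as in use/deleted and file/directory."""
    if record[:4] != b"FILE":
        return "invalid"
    (flags,) = struct.unpack_from("<H", record, FLAGS_OFFSET)
    kind = "directory" if flags & FLAG_DIRECTORY else "file"
    state = "in use" if flags & FLAG_IN_USE else "deleted"
    return f"{state} {kind}"


def find_deleted(mft: bytes) -> list:
    """Return the record numbers whose 'in use' flag is cleared."""
    deleted = []
    for number in range(len(mft) // MFT_RECORD_SIZE):
        record = mft[number * MFT_RECORD_SIZE:(number + 1) * MFT_RECORD_SIZE]
        if classify_record(record).startswith("deleted"):
            deleted.append(number)
    return deleted


def make_record(flags: int) -> bytes:
    """Build a synthetic record for the demo (not a complete NTFS record)."""
    record = bytearray(MFT_RECORD_SIZE)
    record[:4] = b"FILE"
    struct.pack_into("<H", record, FLAGS_OFFSET, flags)
    return bytes(record)


# Record 0: live file, 1: deleted file, 2: live directory, 3: blank/unusable.
mft = make_record(0x0001) + make_record(0x0000) + make_record(0x0003) + bytes(MFT_RECORD_SIZE)
print(find_deleted(mft))
```

A real tool would go on to read each deleted record's attributes to find the file name and the location of its data, which this sketch leaves out.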

The file table records the location of the files on the drive and thus permits their recovery. However, if the space they occupy on the drive has been overwritten, the recovery will not work as you had hoped and you will get a file full of junk. Most data recovery applications have a built-in file previewer which lets you take a sneak peek at the file contents. However, this may not be very useful if you don’t know what you’re looking for. Plain-text files are easy to understand, and Word documents generally have their contents as cleartext somewhere within a garbled mess of hex, but other media files will be more difficult. As you can see, the batch file below is clearly in good condition:

[Image: file preview of a recovered batch file]

Another complication of deleted file recovery is that a file’s original location is not always known. You may need to go trawling through a long list of randomly named directories in order to find the files you wish to recover. This is because the file table may no longer be linked to the file’s directory location information. As you can see in the image below, the directory list on the left is made up of random characters. The filenames themselves, however, should still be intact, and your data recovery application should have a search option available to make the task of locating the files easier.

Once you have ascertained which files are to be recovered, running the recovery itself is the last step. Just remember to pick a destination drive that is not the same as the drive you are recovering from. Otherwise you risk the data recovery process writing over the very files you are trying to recover!
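If you script any part of a recovery yourself, the "different destination drive" rule can be checked in code. The Python sketch below compares the device identifier that `os.stat` reports for two paths. It assumes `st_dev` distinguishes the volumes on your system, and exactly what it represents varies by platform and Python version. `on_same_device` and `check_destination` are hypothetical helpers, not part of any recovery application.

```python
import os


def on_same_device(source_path, destination_path):
    """True if both paths live on the same device/volume according to os.stat."""
    return os.stat(source_path).st_dev == os.stat(destination_path).st_dev


def check_destination(source_path, destination_path):
    """Refuse a destination that sits on the drive being recovered."""
    if on_same_device(source_path, destination_path):
        raise ValueError("destination is on the drive being recovered; pick another drive")
    return destination_path


print(on_same_device(".", "."))  # True: a path is always on the same device as itself
```

In practice you would pass a path on the drive being recovered and the folder you plan to save to, and only proceed if `check_destination` does not raise.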

Corrupt data recovery is a little more complicated. Various aspects of the file system may be corrupted: the file table, a segment of the data, or some combination of these. Recovering a formatted drive falls into this category too. In some scenarios, the data recovery application is able to read segments of your MFT to locate a significant portion of files. NTFS also keeps a partial mirror of the MFT covering its first few records, so the data recovery application may be able to combine the mirror and the normal copy to locate more of your data.

If the “faster” version of corrupt data recovery fails, the fallback approach is to scan the raw drive data for signatures of the particular file types that you wish to recover. Files such as JPEGs, MS Word documents, and Excel files have a specific “signature”: usually a characteristic beginning and ending that identifies the file type. The data recovery application scans the drive for these byte sequences in order to locate lost files.
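Signature scanning can be sketched in a few lines of Python. The example below searches a byte buffer for the JPEG start marker (bytes FF D8 FF) and the JPEG end marker (FF D9) and cuts out whatever lies between them. It is a simplified illustration, not the method of any particular recovery tool: `carve_jpegs` is a hypothetical function, the "drive" is a small synthetic buffer, and taking the first end marker after each start is a naive assumption.

```python
JPEG_START = b"\xff\xd8\xff"   # start-of-image marker plus the first byte of the next marker
JPEG_END = b"\xff\xd9"         # end-of-image marker


def carve_jpegs(raw: bytes) -> list:
    """Return every byte run that starts and ends with the JPEG markers."""
    found = []
    position = 0
    while True:
        start = raw.find(JPEG_START, position)
        if start == -1:
            break
        end = raw.find(JPEG_END, start + len(JPEG_START))
        if end == -1:
            break                      # start marker with no end: give up on it
        end += len(JPEG_END)           # include the end marker itself
        found.append(raw[start:end])
        position = end                 # keep scanning after this candidate
    return found


# Synthetic "raw drive": two fake JPEGs surrounded by unrelated bytes.
raw_drive = (
    b"\x00" * 100
    + JPEG_START + b"\xe0first fake image" + JPEG_END
    + b"unrelated junk"
    + JPEG_START + b"\xe1second fake image" + JPEG_END
    + b"\x00" * 50
)
carved = carve_jpegs(raw_drive)
print(len(carved), [len(c) for c in carved])
```

Stopping at the first end marker is the weak point: the same two bytes can appear earlier than the true end of a file (for example, at the end of an embedded thumbnail), so a real tool needs smarter rules for deciding where a file stops.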

However, this process is nowhere near perfect. The main issue is that it is sometimes difficult to work out where a given file ends, which can cause multiple files to be grouped together. The software cannot determine this from the limited information available, so it makes an educated guess. Another limitation is that files not stored in contiguous space (fragmented files) will not be recovered correctly, as the recovery software has no knowledge of the file’s fragment locations without a file table. The final issue with this approach is that it can be a slow process. (The recovery pictured is actually on a physically damaged drive; a normal raw scan would take a few hours, not 3 weeks!)
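The fragmentation problem is easy to see in a toy example. In the Python sketch below, two files are interleaved on a four-block "drive". Reading with the file table gives back the right data, while reading a contiguous run from the file’s first block, which is all a signature scan can do, mixes in another file’s block. The block contents, `file_table`, and both helper functions are invented for illustration.

```python
# Four data blocks; a.bin and b.bin are fragmented and interleaved.
disk_blocks = [b"AAA1", b"BBB1", b"AAA2", b"BBB2"]
file_table = {"a.bin": [0, 2], "b.bin": [1, 3]}   # file name -> ordered block numbers


def read_with_table(name):
    """Reassemble a file by following the block list in the file table."""
    return b"".join(disk_blocks[i] for i in file_table[name])


def read_contiguous(start_block, num_blocks):
    """What a raw scan sees: consecutive blocks starting where the file begins."""
    return b"".join(disk_blocks[start_block:start_block + num_blocks])


print(read_with_table("a.bin"))   # b'AAA1AAA2'
print(read_contiguous(0, 2))      # b'AAA1BBB1'
```

The second result starts correctly but ends with data belonging to `b.bin`, which is why a carved fragmented file often opens partially and then turns to garbage.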

It is important to note that the chance of recovering a file depends on how long ago it was deleted or the drive was formatted. For example, if the file was erased only an hour ago, it should still be relatively intact and not yet overwritten by the system. Unfortunately, if the file was deleted weeks or months ago, the chances of getting it back may be significantly reduced. For the best chance of recovery, stop saving new files (or doing anything else) on the computer and start restoring the files you need immediately.

Image credit: Broken Hdd Data Loss by BigStockPhoto