Beginner’s Guide to Sed [Linux]

After grep, the next logical step is a tutorial on sed. The name sed comes from Stream EDitor and, as the name indicates, it works on streams of text. However, while sed is one of the most powerful commands in Unix, its manual page is also among the most enigmatic. In this article I will summarize the most basic usage of sed and then give you a few examples of advanced scripts.

The general command for sed is something like:

sed [option] '{script}' [text file]

Sed will perform the operations you ask for on the text file and display the result on standard output. If you want the result in a text file, you can either redirect it in the usual way:

sed [option] '{script}' [text file] > [edited text file]

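Or use the “-i” option, which edits the input file directly. Keep “-i” separate from the other options: with GNU sed, any text attached to it is taken as a backup suffix, not as another option.

sed -i [option] '{script}' [text file]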
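To make the difference between redirection and “-i” concrete, here is a small sketch. It assumes GNU sed as shipped on most Linux systems; the scratch directory and the three-line test.txt are made up for the illustration.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$workdir/test.txt"

# Redirection: test.txt stays intact, the result goes to a new file.
sed '2 d' "$workdir/test.txt" > "$workdir/edited.txt"

# In-place editing with a backup suffix: test.txt itself is rewritten,
# and the untouched original is kept as test.txt.bak.
sed -i.bak '2 d' "$workdir/test.txt"

cat "$workdir/test.txt"
```

Keeping a backup with a suffix such as “.bak” is a cheap safety net while you are still experimenting with a script.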

Now let’s begin working on the script. The most obvious first step is the null script:

sed '' test.txt

will just display the text in test.txt.

A common use of sed is deletion. Let’s practice through examples.

sed '2,4 d' test.txt

will delete the lines 2 to 4 of test.txt.

You can guess that the syntax for the script is the following, with a comma between the two line numbers:

sed '[first line to delete],[last line to delete] d' test.txt
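A quick way to check line-number deletion without touching a real file is to feed sed a few numbered lines through a pipe. This is only a sketch; the variable names and the five sample lines are invented for the example.

```shell
# Five numbered lines stand in for test.txt.
sample=$(printf 'line 1\nline 2\nline 3\nline 4\nline 5')

# Delete lines 2 through 4: only "line 1" and "line 5" remain.
range_deleted=$(printf '%s\n' "$sample" | sed '2,4 d')

# A single address deletes just that line; "$" stands for the last line.
last_deleted=$(printf '%s\n' "$sample" | sed '$ d')

printf '%s\n' "$range_deleted"
```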

But the fancy part comes when you use regular expressions, or regex, to select the lines to delete. For example,

sed '/^#/ d' test.txt

will delete every line that begins with “#” (in other words, in a shell script or a configuration file, it removes every comment line that starts in the first column).

The general syntax is

sed '/regex/ d' test.txt

for deleting every line that matches the regex.

sed '/regex1/,/regex2/ d' test.txt

for deleting the interval from the first line matching regex1 through the next line matching regex2.

The special character “^” that I used in the first example indicates the beginning of the line.
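The two regex forms can be tried on a small made-up configuration snippet. The sample text, the BEGIN and END markers and the variable names below are all hypothetical; the point is to show what “^” does and how a regex range behaves.

```shell
# A made-up config file: two comments (one indented) and a BEGIN/END block.
config=$(printf '# comment\nname=demo\n  # indented comment\nBEGIN\nsecret\nEND\nport=80')

# Only lines that START with "#" are deleted; the indented comment survives
# because "^" anchors the match to the beginning of the line.
no_comments=$(printf '%s\n' "$config" | sed '/^#/ d')

# Delete everything from the BEGIN line through the END line, inclusive.
no_block=$(printf '%s\n' "$config" | sed '/^BEGIN$/,/^END$/ d')

printf '%s\n' "$no_block"
```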

The second basic usage that comes to mind is substitution. The general syntax is:

sed -re 's/regex/replacement/' test.txt

On each line, sed searches for the first match of the regex and replaces it with the replacement text, then goes to the next line and repeats until the end of the input stream. The “-r” option enables extended regular expressions, and “-e” introduces the script.

A good example is:

sed -re 's/^# *//' test.txt

It removes the “#” symbol at the beginning of a line, along with any spaces that follow it. In other words, it uncomments the text file. The symbol “*” is a meta-character meaning zero or more occurrences of the preceding character, which here is a space.
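Here is a small sketch of substitution on invented input. It also shows that, without the “g” flag, only the first match on each line is replaced. The sample strings and variable names are assumptions made for the example.

```shell
# Two commented lines and one plain line.
text=$(printf '# first # note\n#   second\nplain')

# Strip a leading "#" and the spaces after it. The second "#" on the
# first line is untouched because the regex is anchored with "^".
uncommented=$(printf '%s\n' "$text" | sed 's/^# *//')

# Without "g" only the first "-" on the line is replaced; with "g" all are.
first_only=$(printf 'a-b-c\n' | sed 's/-/+/')
every_match=$(printf 'a-b-c\n' | sed 's/-/+/g')

printf '%s\n' "$uncommented" "$first_only" "$every_match"
```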

You can do some pretty fancy stuff with sed, but you will reach its limits pretty fast if you don’t pay attention to its basic behavior. Sed processes its input linearly: it treats a text file one line at a time. If you want an edit to span several lines, or to be repeated until nothing is left to change, you have to use labels and multi-line commands. All of this can become very complex, very quickly. I will now show you a few advanced examples and explain them to you. If you want more, I am sure that you can search by yourself and build on the basics I gave you.

If you want to delete the empty lines of a file, you can use the command

sed -re '/^$/ {N; D}' test.txt

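The meta-character “$” means the end of the line, so “^$” matches an empty line. On such a line, “N” appends the next input line to the one being processed, and “D” deletes everything up to the first newline and restarts the script on what remains, so the empty line disappears. “{N;D}” is a rather complex way of saying delete that line: the plain '/^$/ d' script removes empty lines as well.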
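For comparison, here is a sketch of empty-line removal using the plain “d” command on invented input. The second command is an assumption about what you may also want: it treats lines containing only spaces or tabs as empty, using the POSIX class [[:space:]].

```shell
# Text with single and double empty lines between the words.
text=$(printf 'alpha\n\nbeta\n\n\ngamma')

# "^$" matches a line with nothing between its start and its end.
compact=$(printf '%s\n' "$text" | sed '/^$/ d')

# Variant: also drop lines that contain only whitespace.
spaced=$(printf 'alpha\n   \nbeta\n\t\ngamma')
compact_ws=$(printf '%s\n' "$spaced" | sed '/^[[:space:]]*$/ d')

printf '%s\n' "$compact"
```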

If you want to delete every tag in an HTML file, this is the command for you:

sed -re ':start s/<[^>]*>//g; /</ {N; b start}' test.txt

The “:start” is called a label. It marks a point in the script that we can jump back to later in order to process the same text again. Sed searches for anything of the form “<XXX>” (the regex <[^>]*>) and replaces it with nothing; thanks to the “g” flag, every complete tag on the line is deleted, not only the first. Then, before going on, it checks whether a “<” is still present, which means that a tag was opened but not closed on this line. If there is one, “N” appends the next line and “b start” jumps back to the label “:start” to re-apply the substitution, so tags that span several lines are removed too.
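The tag-stripping loop can be tried on a two-line sample in which one tag is split across lines. In this sketch the script is cut into separate “-e” pieces so that the label and the branch each end at a line boundary; that spelling is my own choice, not the form used above, and the HTML sample is invented.

```shell
# The <a ...> tag is split over two lines on purpose.
html=$(printf '<p>Hello <b>world</b></p>\n<a\n href="x">link</a>')

stripped=$(printf '%s\n' "$html" | sed \
  -e ':start' \
  -e 's/<[^>]*>//g' \
  -e '/</{' \
  -e 'N' \
  -e 'b start' \
  -e '}')

# The first line loses all four tags at once thanks to "g". The second
# line still contains "<", so N pulls in the third line and the
# substitution runs again on the joined text.
printf '%s\n' "$stripped"
```

Keep in mind that this is a line-oriented trick and not an HTML parser: a “<” that appears in ordinary text would also make sed keep appending lines.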

You are now ready to study sed in more depth, or just to use it for simple modifications. It is a command that I find particularly useful in scripts in general, but it took me some time to understand its syntax. I hope it will be much faster for you.

Do you know another basic command for sed? Or do you use another advanced script involving sed that you want to share? Please let us know in the comments.