sed, the Stream EDitor : Practical examples



(excerpt from man sed)

sed is a stream editor. A stream editor is used to perform basic text transformations on an input stream (a file or input from a pipeline). While in some ways similar to an editor which permits scripted edits (such as vi), sed works by making only one pass over the input(s), and is consequently more efficient. But it is sed’s ability to filter text in a pipeline which particularly distinguishes it from other types of editors.
(excerpt from wikipedia)
It reads input line by line (sequentially), applying the operation which has been specified via the command line (or a sed script), and then outputs the line.


1 One liners

1.1 Substitution

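Substitute _every instance_ of <pattern1> with <pattern2> in original_file and write the result to new_file (the > redirection creates new_file, or overwrites it if it already exists; original_file itself is left untouched)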

sed "s/<pattern1>/<pattern2>/g" original_file > new_file
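To make this concrete, here is a minimal sketch on a throwaway file. The file name fruits.txt, its contents and the two patterns are made up for illustration.

```shell
# Hypothetical sample data, created in a temporary directory.
tmpdir=$(mktemp -d)
printf 'apple apple pear\napple\n' > "$tmpdir/fruits.txt"

# Replace every "apple" with "kiwi" and write the result to a new file.
sed "s/apple/kiwi/g" "$tmpdir/fruits.txt" > "$tmpdir/fruits_new.txt"

result=$(cat "$tmpdir/fruits_new.txt")     # content of the new file
original=$(cat "$tmpdir/fruits.txt")       # the source file is unchanged
printf '%s\n' "$result"
rm -rf "$tmpdir"
```

Both occurrences on the first line are replaced because of the g flag; without it, only the first occurrence on each line would be.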

Same as above, but instead of writing the result to new_file, we edit original_file in place. The .old suffix given to -i tells sed to keep a backup: the untouched original is saved as original_file.old, and original_file itself receives the changes.

sed -i.old "s/<pattern1>/<pattern2>/g" original_file
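A quick way to convince yourself of what ends up where is to try it on a scratch file. This is only a sketch; notes.txt and its contents are hypothetical.

```shell
# Hypothetical scratch file in a temporary directory.
tmpdir=$(mktemp -d)
printf 'foo bar foo\n' > "$tmpdir/notes.txt"

# The suffix is glued to -i, with no space in between.
sed -i.old "s/foo/baz/g" "$tmpdir/notes.txt"

edited=$(cat "$tmpdir/notes.txt")        # the file now holds the changes
backup=$(cat "$tmpdir/notes.txt.old")    # the backup holds the original text
printf 'edited: %s\nbackup: %s\n' "$edited" "$backup"
rm -rf "$tmpdir"
```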

Same as the first example, but this time we substitute only the third instance of <pattern1> on each line (sed reads its input line by line, so "nth instance" always means the nth instance within each line)

sed "s/<pattern1>/<pattern2>/3" original_file > new_file
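A small sketch on a single made-up line shows which occurrence is hit:

```shell
# Hypothetical input: five occurrences of "a" on one line.
line='a-a-a-a-a'

# Only the third occurrence is replaced; the others are left alone.
third=$(printf '%s\n' "$line" | sed "s/a/X/3")
printf '%s\n' "$third"
```

This should print a-a-X-a-a.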

It is also possible to specify a range running from the nth occurrence to the end of each line by combining the global flag g with an occurrence number (as seen in the example above). Let’s apply the substitution to the 4th occurrence and every later one (on each line, as usual). Note that this combination is GNU sed behaviour; POSIX leaves the result of mixing g with a number unspecified.

sed "s/<pattern1>/<pattern2>/4g" original_file > new_file
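Here is a sketch of the same idea on made-up input, assuming GNU sed (other implementations may treat 4g differently). The second line has fewer than four occurrences, so it should come out unchanged.

```shell
# Hypothetical two-line input; assumes GNU sed for the "4g" combination.
from_fourth=$(printf 'a a a a a a\na a\n' | sed 's/a/X/4g')
printf '%s\n' "$from_fourth"
```

With GNU sed the first line becomes "a a a X X X" and the second stays "a a".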

1.2 Deletion

Delete the nth line from <file>

sed -i '<n>d' <file>

Same as above, but we now keep a backup of the original file with a .old suffix

sed -i.old '<n>d' <file>

Delete every line matching <pattern> from <file>

sed -i.old '/<pattern>/d' <file>
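Before running a deletion with -i, it can be worth previewing it on STDOUT. The sketch below does that on a hypothetical log file (app.log and the DEBUG pattern are made up).

```shell
# Hypothetical log file in a temporary directory.
tmpdir=$(mktemp -d)
printf 'keep 1\nDEBUG drop me\nkeep 2\nDEBUG me too\n' > "$tmpdir/app.log"

# No -i here: the file is not modified, the result only goes to STDOUT.
remaining=$(sed '/DEBUG/d' "$tmpdir/app.log")
printf '%s\n' "$remaining"
rm -rf "$tmpdir"
```

Note that every matching line is deleted, not just the first one.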

Delete a line from <file> using its number (line numbers can be displayed in vi interactive mode with :se nu, or the longer :set number)

sed -i.old '<number>d' <file>
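If opening vi just to read a line number is inconvenient, sed can report it itself with the = command. This is a sketch on a hypothetical file (list.txt); it assumes the pattern matches exactly one line, otherwise n would hold several numbers.

```shell
# Hypothetical three-line file in a temporary directory.
tmpdir=$(mktemp -d)
printf 'alpha\nbeta\ngamma\n' > "$tmpdir/list.txt"

# "=" prints the number of each matching line; -n suppresses the normal output.
n=$(sed -n '/beta/=' "$tmpdir/list.txt")

# Delete that line number (preview on STDOUT, the file is not modified).
after=$(sed "${n}d" "$tmpdir/list.txt")
printf 'line %s removed:\n%s\n' "$n" "$after"
rm -rf "$tmpdir"
```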


1.3 Commenting / uncommenting

1.3.1 Commenting

Commenting amounts to “inserting a “#” (hash) at the beginning of every line matching a <pattern>” in <file>

sed '/<pattern>/ s/^/# /' <file>
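As an illustration, here is the same command applied to a made-up configuration snippet (the keys and the debug pattern are hypothetical).

```shell
# Hypothetical configuration content.
config=$(printf 'port=8080\ndebug=true\nhost=localhost\n')

# Prefix "# " to every line containing "debug"; other lines pass through.
commented=$(printf '%s\n' "$config" | sed '/debug/ s/^/# /')
printf '%s\n' "$commented"
```

Only the debug=true line should come out prefixed with "# ".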

1.3.2 Uncommenting

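Strictly speaking, uncommenting means removing the leading hash character (#) from a line. The command below does something more radical: it deletes every line starting with a hash, which strips the comments from the file altogether. Let’s do it :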

sed '/^#/d' <file>
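Removing only the comment marker, rather than the whole line, can be sketched as the mirror image of the commenting command. The sample lines and the debug pattern below are made up for illustration.

```shell
# Hypothetical commented configuration content.
commented=$(printf '# debug=true\nhost=localhost\n#port=8080\n')

# Strip a leading "#" and any blanks after it, only on lines matching "debug".
one=$(printf '%s\n' "$commented" | sed '/debug/ s/^#[[:space:]]*//')

# Without the address, the same substitution applies to every line.
all=$(printf '%s\n' "$commented" | sed 's/^#[[:space:]]*//')

printf '%s\n---\n%s\n' "$one" "$all"
```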

Remember that sed, by default, only prints its modified version of the file to STDOUT, so the above command is handy for reading a large or heavily commented file. If you want to save the stripped-down version of such a file, you just need to add the -i flag.

1.4 Multi-line / new-line (AKA end-of-line) jobs

1.4.1 Removing / replacing a new-line character

This kind of job is not “natural” for sed.

Remember that sed processes a file line by line, and each line is stored in the “pattern space” stripped of its terminating new-line character! There are nevertheless some good recipes to work around that; here is one that lets you work with end-of-line characters.

sed -r ':a;N;$!ba;s/\n//g' < input_file > output_file

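This first example removes every new-line character from input_file! It works because :a defines a label, N appends the next input line (with its new-line) to the pattern space, and $!ba jumps back to the label on every line except the last one. The whole file therefore ends up in the pattern space, where s/\n//g can finally see the new-lines.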
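Here is a small sketch of the recipe on three made-up lines. The one-liner with semicolons after the label relies on GNU sed; the second form, which splits the script into separate -e expressions, is the spelling usually recommended for other sed implementations.

```shell
# GNU sed form: label, append next line, loop until the last line, then substitute.
joined=$(printf 'one\ntwo\nthree\n' | sed ':a;N;$!ba;s/\n//g')

# Same script with each command in its own -e expression.
joined_e=$(printf 'one\ntwo\nthree\n' | sed -e ':a' -e 'N' -e '$!ba' -e 's/\n//g')

printf '%s\n%s\n' "$joined" "$joined_e"
```

Both commands should print onetwothree. Keep in mind that this approach loads the whole input into memory, which is worth remembering for very large files.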


1.4.2 Working on multi-lines

Let’s see another useful example. We are going to use this structure (:a;N;$!ba;s/\n) as part of a larger pattern.

Let’s say we want to remove every new-line followed by 5 space characters, an “&” (ampersand) and then an arbitrary number of spaces. Here is what the file might look like :

THAGCEL(1839)*THTGAS(1839)*THDGAS(1839) +
     &                (1.-THAGCEL(1839))*THTLIQ(1839)*THDLIQ(1839) ) /
        THDLX(10,1839) = 0.0
THHGAS(1839)*THAGCEL(1839)*THDGAS(1839) +

And here is what we’d like it to be :

THAGCEL(1839)*THTGAS(1839)*THDGAS(1839) +(1.-THAGCEL(1839))*THTLIQ(1839)*THDLIQ(1839) ) /
        THDLX(10,1839) = 0.0
THHGAS(1839)*THAGCEL(1839)*THDGAS(1839) +

As you may have noticed, the difference between those 2 excerpts is that we replaced this part :

THAGCEL(1839)*THTGAS(1839)*THDGAS(1839) +
     &                (1.-THAGCEL(1839))*THTLIQ(1839)*THDLIQ(1839) ) /
        THDLX(10,1839) = 0.0

with this :

THAGCEL(1839)*THTGAS(1839)*THDGAS(1839) +(1.-THAGCEL(1839))*THTLIQ(1839)*THDLIQ(1839) ) /
        THDLX(10,1839) = 0.0

I agree that this file is not the best example ever, but it’s the only one I have for now, and it is actually this kind of file which led me to find out how to do this kind of job with sed (except that the real one was more like 10k lines…)

Here is the command I used to achieve this :

sed -r ':a;N;$!ba;s/\n[[:space:]]+&[[:space:]]+//g' < input_file > output_file
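To check the command end to end, the sketch below recreates the three sample lines in a temporary directory and runs it on them. It assumes GNU sed (for -r and for the semicolon-separated label); the temporary paths are incidental.

```shell
# Recreate the sample excerpt from above in a temporary input_file.
tmpdir=$(mktemp -d)
printf '%s\n' \
  'THAGCEL(1839)*THTGAS(1839)*THDGAS(1839) +' \
  '     &                (1.-THAGCEL(1839))*THTLIQ(1839)*THDLIQ(1839) ) /' \
  '        THDLX(10,1839) = 0.0' > "$tmpdir/input_file"

# Join each continuation line (new-line, blanks, "&", blanks) onto the previous line.
sed -r ':a;N;$!ba;s/\n[[:space:]]+&[[:space:]]+//g' \
  < "$tmpdir/input_file" > "$tmpdir/output_file"

line_count=$(wc -l < "$tmpdir/output_file")
first_line=$(sed -n '1p' "$tmpdir/output_file")
printf '%s lines, first line:\n%s\n' "$line_count" "$first_line"
rm -rf "$tmpdir"
```

The output file should contain two lines instead of three, the first one being the joined expression. Note that [[:space:]]+ is looser than "exactly 5 spaces": it accepts any run of whitespace before the ampersand.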


