Using sed for lazy multi-line search and replace

Question


I am trying to use sed to remove blocks of HTML from a file. The block to be deleted appears several times in the file and spans several lines. Note also that the contents of each block differ, but the beginning and end of each block are clearly delimited.

I have tried several approaches, but I keep running into problems with lazy (non-greedy) matching in sed and with matching across lines.

Here is an example of what I'm trying to do:

    good stuff a
    good stuff same line START bad stuff 1.0
    bad stuff 1.1
    END
    good stuff b
    good stuff b
    good stuff same line START bad stuff 2.0
    bad stuff 2.0
    END
    good stuff c

becomes:

    good stuff a
    good stuff same line
    good stuff b
    good stuff b
    good stuff same line
    good stuff c

Here are a few approaches I've tried so far.

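    sed -n '1h;1!H;${;g;s/START.*END//mg;p;}' < test > test2

This gets matching across lines to work, but the match is greedy.

    sed -n 's/START[^END]*END//g' < test > test2

The bracket expression [^END] only excludes the single characters E, N, and D, not the word END.

    sed -n 's/START.*?END//g' < test > test2

sed does not support the lazy quantifier .*?, so this does not match lazily.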

Thanks.

+4

sed

Alex Unger Feb 01 '13 at 20:01


5 answers

sed is not suitable for multi-line input. Use awk instead. Match each line against a regular expression, turn printing off when the line begins a "bad" block, and turn it back on after the line that ends the block. Here is an example run on your file:

    $ awk '
    BEGIN { pr = 1; }
    /^START/ { pr = 0; }
    { if (pr) print; }
    /^END/ { pr = 1; }
    ' < yourfile
    good stuff a
    good stuff b
    good stuff b
    good stuff c
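A minimal sketch of the same flag idea, run against the sample from the question. It assumes the sample is fed in through printf (the variable name `out` is illustrative), and it drops the `^` anchors because in that sample START appears in the middle of a line. Like the program above, it discards the whole line containing START, including any good text before it.

```shell
# Feed the sample from the question to awk and capture the result.
out=$(printf '%s\n' \
  'good stuff a' \
  'good stuff same line START bad stuff 1.0' \
  'bad stuff 1.1' \
  'END' \
  'good stuff b' \
  'good stuff b' \
  'good stuff same line START bad stuff 2.0' \
  'bad stuff 2.0' \
  'END' \
  'good stuff c' |
awk '
  # Start with printing enabled.
  BEGIN   { pr = 1 }
  # A block opens on this line: stop printing.
  /START/ { pr = 0 }
  # Print only while outside a block.
  pr      { print }
  # A block closes on this line: resume with the next line.
  /END/   { pr = 1 }
')

printf '%s\n' "$out"
```

This should print the four lines "good stuff a", "good stuff b", "good stuff b", and "good stuff c".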

+2

Lev iserovich Feb 01 '13 at 20:22


What about:

    $ sed '/START/,/END/d' file.txt
    good stuff a
    good stuff b
    good stuff b
    good stuff c

More about ranges here
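As a sketch of how the address range behaves, here is a run on an illustrative input (an assumption, not the file from the question). The range deletes whole lines, and sed starts looking for END only on the line after the one that matched START, so a block that opens and closes on a single line keeps the range open until a later END.

```shell
# Illustrative input: a block that opens and closes on one line,
# followed by an ordinary multi-line block.
out=$(printf '%s\n' \
  'keep 1' \
  'START one-line block END' \
  'keep 2' \
  'START' \
  'drop' \
  'END' \
  'keep 3' |
sed '/START/,/END/d')

printf '%s\n' "$out"
```

Only "keep 1" and "keep 3" should survive here; "keep 2" is swallowed because the first range does not close until the standalone END line.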

+1

Fredrik pihl Feb 01 '13 at 20:11


This may work for you (GNU sed):

 sed '/START/!b;:a;/END/bb;$!{N;ba};:b;s/START.*END//' file
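For readers unfamiliar with sed branching, the one-liner above can be spelled out as a multi-line script. This is a sketch of the same commands, one per line, with comments; the shortened sample input and the variable name `out` are illustrative.

```shell
# Run an expanded, commented form of the script on part of the sample.
out=$(printf '%s\n' \
  'good stuff a' \
  'good stuff same line START bad stuff 1.0' \
  'bad stuff 1.1' \
  'END' \
  'good stuff b' |
sed '
# Lines without START are printed unchanged.
/START/!b
:a
# Once the pattern space holds END, jump to the substitution.
/END/bb
# Otherwise append the next input line (unless at the last line) and loop.
$!{
N
ba
}
:b
# Delete from START through END; "." also matches the embedded newlines.
s/START.*END//
')

printf '%s\n' "$out"
```

The text before START is kept, so the second output line should read "good stuff same line " with the space that preceded START still attached.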

+1

potong Feb 01 '13 at 23:06


sed is a great tool for simple substitutions on a single line; use awk for anything else:

    $ awk 'sub(/START.*|.*END/,""){f=!f;if(NF)print;next} !f' file
    good stuff a
    good stuff same line
    good stuff b
    good stuff b
    good stuff same line
    good stuff c
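The one-liner above is dense, so here is a sketch of the same program laid out with comments. The shortened sample input and the variable name `out` are illustrative; the awk logic is unchanged.

```shell
# Expanded form of the awk one-liner, run on part of the sample.
out=$(printf '%s\n' \
  'good stuff a' \
  'good stuff same line START bad stuff 1.0' \
  'bad stuff 1.1' \
  'END' \
  'good stuff b' |
awk '
  # sub() strips the tail of a START line or the head of an END line
  # and returns 1 when it changed something.
  sub(/START.*|.*END/, "") {
    # Toggle the inside-a-block flag.
    f = !f
    # Keep whatever good text is left on the line.
    if (NF) print
    next
  }
  # Outside a block: print the line unchanged.
  !f
')

printf '%s\n' "$out"
```

Because sub() edits the line in place, the good text before START is printed, with the space that preceded START still attached.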

0

Ed morton Feb 02 '13 at 14:46


aragaer · Accepted Answer · 2013-02-01T21:40:09+0000

This can be hard to do with a single sed. Two sed invocations make it trivial:

sed 's/START/\nSTART\n/g' | sed '/START/,/END/d'
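A minimal sketch of the two-stage pipeline on part of the sample from the question. It assumes GNU sed, since `\n` in the replacement text is a GNU extension; the variable name `out` is illustrative.

```shell
# Stage 1 puts every START on a line of its own (GNU sed: \n in replacement).
# Stage 2 deletes whole lines from each START line through the next END line.
out=$(printf '%s\n' \
  'good stuff a' \
  'good stuff same line START bad stuff 1.0' \
  'bad stuff 1.1' \
  'END' \
  'good stuff b' |
sed 's/START/\nSTART\n/g' |
sed '/START/,/END/d')

printf '%s\n' "$out"
```

The good text before START survives on its own line, with a trailing space left over from the split. Any text that follows END on the same line would still be deleted, because the second stage removes whole lines.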
