Using sed for lazy multi-line search and replace

I am trying to use sed to remove blocks of HTML from a file. The block to be deleted appears several times in the file and spans several lines. Note also that the contents of each block differ, but the beginning and the end are clearly delimited.

I have tried several approaches, but I run into problems getting lazy matching to work in sed and matching across lines.

Here is an example of what I'm trying to do:

    good stuff a
    good stuff same line START bad stuff 1.0
    bad stuff 1.1
    END
    good stuff b
    good stuff b
    good stuff same line START bad stuff 2.0
    bad stuff 2.0
    END
    good stuff c

becomes:

    good stuff a
    good stuff same line
    good stuff b
    good stuff b
    good stuff same line
    good stuff c

Here are a few approaches I've tried so far.

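    sed -n '1h;1!H;${;g;s/START.*END//mg;p;}' < test > test2

Gets multi-line matching to work.

    sed -n 's/START[^END]*END//g' < test > test2

The bracket expression [^END] only excludes the individual characters E, N, and D, not the string END.

    sed -n 's/START.*?END//g' < test > test2

sed has no lazy quantifiers, so .*? does not match lazily.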

Thanks.

5 answers

This is hard to do with a single sed invocation. Two invocations make it trivial:

sed 's/START/\nSTART\n/g' | sed '/START/,/END/d'
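A quick way to try the two-pass pipeline above is to run it on a small sample. This is only a sketch: the helper names `sample` and `two_pass` and the sample text are made up for illustration, and it assumes a sed that accepts `\n` in the replacement part of `s` (a GNU sed feature).

```shell
# Hypothetical sample input, modeled on the example in the question.
sample() {
  printf '%s\n' \
    'good stuff a' \
    'good stuff same line START bad stuff 1.0' \
    'bad stuff 1.1' \
    'END' \
    'good stuff b' \
    'good stuff same line START bad stuff 2.0' \
    'bad stuff 2.1' \
    'END' \
    'good stuff c'
}

# Hypothetical wrapper around the two sed passes from the answer.
two_pass() {
  # Pass 1: move every START onto a line of its own.
  # Pass 2: delete each range of lines from a START line through the next END line.
  sed 's/START/\nSTART\n/g' | sed '/START/,/END/d'
}

sample | two_pass
```

Because the second pass deletes whole lines, any text that follows END on the same line is removed as well.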


sed is not well suited to multi-line input. Use awk instead. You want to test each line against a regular expression and turn off printing when the line begins your "bad" block. Here is an example run on your file:

    $ awk '
        BEGIN { pr = 1; }
        /^START/ { pr = 0; }
        { if (pr) print; }
        /^END/ { pr = 1; }
    ' < yourfile
    good stuff a
    good stuff b
    good stuff b
    good stuff c

What about:

    $ sed '/START/,/END/d' file.txt
    good stuff a
    good stuff b
    good stuff b
    good stuff c

See the sed documentation on address ranges for more details.


This may work for you (GNU sed):

 sed '/START/!b;:a;/END/bb;$!{N;ba};:b;s/START.*END//' file 
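The one-liner above is dense, so here is the same script spread over several lines with comments. It is a sketch that assumes GNU sed, as the answer does; the function name `strip_blocks_sed` and the sample text are made up for illustration.

```shell
# Hypothetical wrapper: the same sed commands as the one-liner, one per line.
strip_blocks_sed() {
  sed '
    # Lines without START are printed unchanged.
    /START/!b
    :a
    # Once END is in the pattern space, jump to the substitution.
    /END/bb
    # Otherwise, unless this is the last line, append the next line and retry.
    $!{
      N
      ba
    }
    :b
    # The dot also matches the embedded newlines, so the block goes in one step.
    s/START.*END//
  ' "$@"
}

# Usage on illustrative input read from standard input.
printf '%s\n' \
  'good stuff a' \
  'good stuff same line START bad stuff 1.0' \
  'bad stuff 1.1' \
  'END' \
  'good stuff b' | strip_blocks_sed
```

Since `.*` is greedy, a pattern space that happens to hold two complete blocks loses everything from the first START to the last END, including any good text between them.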

sed is a great tool for simple single-line replacements; use awk for anything else:

    $ awk 'sub(/START.*|.*END/,""){f=!f;if(NF)print;next} !f' file
    good stuff a
    good stuff same line
    good stuff b
    good stuff b
    good stuff same line
    good stuff c
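To make the flag logic in the awk one-liner easier to follow, the same program can be written out with comments. This is a sketch, not a different method; the function name `strip_blocks_awk` and the sample text are made up for illustration, and it assumes START and END never appear on the same line.

```shell
# Hypothetical wrapper: the awk one-liner from the answer, expanded.
strip_blocks_awk() {
  awk '
    # sub() returns 1 when it removed a START... tail or a ...END head.
    sub(/START.*|.*END/, "") {
      f = !f           # toggle the inside-a-block flag
      if (NF) print    # keep whatever text survives on this line
      next
    }
    !f                 # print ordinary lines only while outside a block
  ' "$@"
}

# Usage on illustrative input read from standard input.
printf '%s\n' \
  'good stuff a' \
  'good stuff same line START bad stuff 1.0' \
  'bad stuff 1.1' \
  'END' \
  'good stuff b' | strip_blocks_awk
```

If a block opens and closes on one line, the alternation matches only once and the flag is left in the wrong state, so such input would need a separate rule.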