I am trying to use sed to remove blocks of HTML from a file. The block to be deleted appears several times in the file and spans several lines. Also note that each block has different contents, but the beginning and the end are clearly delimited.
I have tried several approaches, and I keep running into two problems: getting lazy (non-greedy) matching to work in sed, and matching across lines.
Here is an example of what I'm trying to do:
good stuff a
good stuff same line START bad stuff 1.0
bad stuff 1.1 END
good stuff b
good stuff b
good stuff same line START bad stuff 2.0
bad stuff 2.0 END
good stuff c
becomes:
good stuff a
good stuff same line
good stuff b
good stuff b
good stuff same line
good stuff c
Here are a few approaches I've tried so far.
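sed -n '1h;1!H;${;g;s/START.*END//mg;p;}' < test > test2
This gets matching across lines to work, but .* is greedy, so it removes everything from the first START to the last END.
sed -n 's/START[^END]*END//g' < test > test2
[^END] is a character class, so it only excludes the single characters E, N, or D rather than the word END.
sed -n 's/START.*?END//g' < test > test2
sed does not treat .*? as a lazy quantifier, so this does not work either.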
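One possible workaround for the missing lazy quantifier, as a rough sketch rather than a definitive answer: read the whole file into the pattern space, swap every END for a single placeholder byte, and then use a negated character class on that one byte, which is something sed can express. The function name strip_blocks and the choice of byte 0x01 as the placeholder are assumptions made for illustration. The sketch also assumes that 0x01 never occurs in the input and that END never appears as part of a longer word that should be kept.

```shell
#!/bin/sh
# strip_blocks (hypothetical name): delete every START...END block,
# including blocks that span several lines.
# Assumption: the byte 0x01 never occurs in the input.
strip_blocks() {
  sep=$(printf '\001')            # placeholder byte standing in for END
  sed -e ':a' -e '$!{N;ba' -e '}' \
      -e "s/END/$sep/g; s/START[^$sep]*$sep//g; s/$sep/END/g" "$@"
  # 1. :a ... $!{N;ba}     collect the whole input into the pattern space
  # 2. s/END/<sep>/g       turn each END into the one-byte placeholder
  # 3. s/START[^<sep>]*<sep>//g
  #                        match from START up to the nearest placeholder
  # 4. s/<sep>/END/g       restore any END that had no START before it
}
```

Usage would be something like strip_blocks < test > test2. Whitespace around the delimiters is left untouched, so a space that sat before START stays at the end of its line.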
Thanks.