Replying to emails: how to condense several "empty" lines (not quite empty: lines consisting only of ">") into one?
I'm trying to do something like this, but for quoted emails, so that this

    On 2014-07-11 at 03:36 PM, <ilovespaces@email.com> wrote:
    >Hi Everyone,
    >
    >
    >
    >I love spaces.
    >
    >
    >
    >That all.

would become this:

    On 2014-07-11 at 03:36 PM, <ilovespaces@email.com> wrote:
    >Hi Everyone,
    >
    >I love spaces.
    >
    >That all.

Thanks.
Assuming each visual line is a valid logical line (a line of characters ending in \n), you can do without the other tools and just run uniq(1) on the input.
Example.
    % cat tst
    >Hi Everyone,
    >
    >
    >
    >I love spaces.
    >
    >
    >
    >That all.
    % uniq tst
    >Hi Everyone,
    >
    >I love spaces.
    >
    >That all.
    %

Try the following:
    sed -r '/^>\s*$/{N;/^>\s*\n>\s*$/D}'

Here is an explanation:
Commands used:
- N: Append the next line of input to the pattern space.
- D: Delete up to the first embedded newline in the pattern space. Start the next cycle, but skip reading input if there is still data in the pattern space.

Patterns used:
- /^>\s*$/ matches a line consisting of '>' followed by zero or more whitespace characters.
- /^>\s*\n>\s*$/ matches two consecutive lines, each consisting of '>' followed by zero or more whitespace characters; it is meant to be used after N.
So the workflow of the sed command above is:
1. Read a line into the pattern space (if the end of the file is reached, exit).
2. If the pattern space contains only '>', go to step 4; otherwise go to step 3.
3. Print the content of the pattern space and go to step 1.
4. Append '\n' and the next line to the pattern space. If the pattern space now contains only '>\n>' (which means we have met two consecutive '>' lines), go to step 5; otherwise go to step 3.
5. Remove the content up to and including the '\n', then go to step 2.
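The steps above can be tried on a small sample. This is only a sketch: it assumes GNU sed (the -r flag and \s are GNU extensions), and the function name squeeze_quoted and the sample text are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper wrapping the sed command from this answer.
# Assumes GNU sed; reads the quoted mail on standard input.
squeeze_quoted() {
    sed -r '/^>\s*$/{N;/^>\s*\n>\s*$/D}'
}

# Sample quoted mail with two runs of three ">" lines.
sample='>Hi Everyone,
>
>
>
>I love spaces.
>
>
>
>That all.'

# Each run of ">" lines is reduced to a single ">" line.
printf '%s\n' "$sample" | squeeze_quoted
```

Note that the sed script deliberately has no `-n`: the line pulled in by N is printed together with the surviving '>' at the end of the cycle.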
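    sed '/^>\s\s*$/d;$b;/^[^>]/b;a>' input

Meaning:
- /^>\s\s*$/d : delete every line that consists of > followed by one or more whitespace characters.
- $b;/^[^>]/b : print the last line, and any line not starting with >, unchanged and skip the rest of the script.
- a> : append a > line after all other lines.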
gives:
    On 2014-07-11 at 03:36 PM, <ilovespaces@email.com> wrote:
    >Hi Everyone,
    >
    >I love spaces.
    >
    >That all.

awk way
This actually takes spaces into account, unlike the other answers (except perreal's :)). It also does not just insert > after each line that has more than > on it, which means that if there were several consecutive lines of text, no empty lines would be inserted between them.
    awk 'a=/^>[ ]*$/{x=$1}!a&&x{print x;x=0}!a' file

Explanation
- a=/^>[ ]*$/ : sets a to whether the line matches the pattern. The pattern begins with > and then has only spaces until the end of the line.
- {x=$1} : sets x to $1.
- !a&&x : true when the line does not match the pattern and x is set (non-zero).
- {print x;x=0} : prints x (>) and sets x to zero.
- !a : if the line does not match the pattern, prints the line.

How it works: it sets x to > when it finds a line containing only > and spaces. It then carries on until it finds a line that does not match, prints > and then prints the line. x is set again every time it finds the pattern.
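To see both properties at once (trailing spaces are handled, and nothing is inserted between adjacent text lines), here is a small sketch. The function name squeeze_awk and the sample lines are made up for illustration; the awk program is the one above, reading standard input instead of a file.

```shell
#!/bin/sh
# Hypothetical helper wrapping the awk one-liner from this answer.
squeeze_awk() {
    awk 'a=/^>[ ]*$/{x=$1}!a&&x{print x;x=0}!a'
}

# Sample: "empty" quote lines with and without trailing spaces,
# plus two adjacent text lines (">I love spaces." and ">Really.").
make_sample() {
    printf '%s\n' '>Hi Everyone,' '> ' '>' '>  ' \
        '>I love spaces.' '>Really.' '>' '>That all.'
}

# The run of three "empty" lines becomes one bare ">", and no ">"
# is inserted between the two adjacent text lines.
make_sample | squeeze_awk
```

Because x is taken from $1, the separator that gets printed is a bare > even when the original "empty" lines carried trailing spaces. Any run of "empty" lines at the very end of the input is dropped, since x is only printed when a later text line arrives.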
Hope this helps :)