Answering emails: how to condense several "empty" lines (not quite empty: lines consisting only of ">") into one?

Question


I'm trying to do something like this, but for quoted email text, so that this

    On 2014-07-11 at 03:36 PM, <ilovespaces@email.com> wrote:
    >Hi Everyone,
    >
    >
    >
    >I love spaces.
    >
    >
    >
    >That all.

would become this:

    On 2014-07-11 at 03:36 PM, <ilovespaces@email.com> wrote:
    >Hi Everyone,
    >
    >I love spaces.
    >
    >That all.

Thanks

+8

unix sed

user3843237 Jul 16 '14 at 2:59


4 answers

Noufal ibrahim · Answer 1 · 2014-07-16T08:14:13+0000

Assuming each visual line is a valid logical line (a sequence of characters ending in \n), you can do without the other tools and just run uniq(1) on the input.

Example.

    % cat tst
    >Hi Everyone,
    >
    >
    >
    >I love spaces.
    >
    >
    >
    >That all.
    % uniq tst
    >Hi Everyone,
    >
    >I love spaces.
    >
    >That all.
    %
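One caveat worth illustrating: uniq only merges adjacent lines that are exactly identical, so a ">" followed by trailing spaces would not be merged with a bare ">". A minimal sketch that normalizes such lines first (the function name squeeze_with_uniq is made up for this example, and the trailing-whitespace case is an assumption about the input, not something shown in the question):

```shell
# Hypothetical helper: reads quoted mail text on stdin, writes the result to stdout.
squeeze_with_uniq() {
  # Step 1: turn every line that is ">" plus optional whitespace into a bare ">".
  # Step 2: let uniq merge runs of adjacent identical lines.
  sed 's/^>[[:space:]]*$/>/' | uniq
}
```

Note that uniq will also merge any other adjacent lines that happen to be identical, not only the ">" lines.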

Wkplus · Answer 2 · 2014-07-16T05:23:40+0000

Try the following:

 sed -r '/^>\s*$/{N;/^>\s*\n>\s*$/D}'

Here is an explanation:

Commands used:

N    Append the next line of input to the pattern space.
D    Delete up to the first embedded newline in the pattern space. Start the next cycle, but skip reading input if there is still data in the pattern space.

Patterns used:

/^>\s*$/    matches a line consisting of '>' followed by zero or more whitespace characters
/^>\s*\n>\s*$/    matches two consecutive lines, each consisting of '>' followed by zero or more whitespace characters; it is meant to be used after N

So the sed command above works as follows:

1. Read a line into the pattern space (if the end of the file is reached, exit).
2. If the pattern space contains only '>', go to step 4; otherwise go to step 3.
3. Print the contents of the pattern space and go to step 1.
4. Append '\n' and the next line to the pattern space. If the pattern space now contains only '>\n>' (which means we have met two consecutive such lines), go to step 5; otherwise go to step 3.
5. Delete everything up to and including the '\n', then go to step 2.
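The same N/D loop can be sketched with POSIX character classes in place of the GNU-specific \s. This is only a restatement of the command above under that assumption; the function name squeeze_with_sed is invented for the example:

```shell
# Hypothetical helper: reads quoted mail text on stdin, writes the result to stdout.
squeeze_with_sed() {
  sed -E '/^>[[:space:]]*$/{
    # Pull the next input line into the pattern space.
    N
    # Two blank quote lines in a row: drop the first one and
    # restart the cycle with the second still in the pattern space.
    /^>[[:space:]]*\n>[[:space:]]*$/D
  }'
}
```

Usage would be something like `squeeze_with_sed < mail.txt`, where mail.txt is a placeholder file name. What happens when the very last input line is a blank quote line depends on how the sed implementation treats N at end of input.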

perreal · Answer 3 · 2014-07-16T06:12:31+0000

 sed '/^>\s\s*$/d;$b;/^[^>]/b;a>' input

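Explanation:

/^>\s\s*$/d : delete every line consisting of a > followed by one or more whitespace characters.

$b;/^[^>]/b : print the last line, and any line that does not start with >, unchanged and skip the rest of the script.

a> : append a > line after every other line.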

gives:

    On 2014-07-11 at 03:36 PM, <ilovespaces@email.com> wrote:
    >Hi Everyone,
    >
    >I love spaces.
    >
    >That all.
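To make the behaviour easier to experiment with, here is a rough portable restatement of the same script, using POSIX character classes and the two-line form of the a command. It assumes, as the command above does, that the blank quote lines carry at least one whitespace character after the >. The function name respace_quotes is made up for this sketch:

```shell
# Hypothetical helper: reads quoted mail text on stdin, writes the result to stdout.
respace_quotes() {
  # 1st expression: delete lines that are ">" plus at least one whitespace character.
  # 2nd expression: print the last line as is and stop processing it.
  # 3rd expression: print lines that do not start with ">" as is.
  # 4th expression: after every remaining line, append a line holding ">".
  sed -e '/^>[[:space:]][[:space:]]*$/d' \
      -e '$b' \
      -e '/^[^>]/b' \
      -e 'a\
>'
}
```

Because a > is appended after every quoted line except the last, two adjacent quoted text lines also end up separated by a > line, even if the input had none between them.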

user3442743 · Answer 4 · 2014-07-16T08:07:56+0000

awk way

This actually takes spaces into account, unlike the other answers (except perreal's :)). It also does not simply insert a > after every line that has more than a > on it, which means that if there are several consecutive lines of text, no empty quote lines are inserted between them.

 awk 'a=/^>[ ]*$/{x=$1}!a&&x{print x;x=0}!a' file

Explanation

    a=/^>[ ]*$/      Sets a to whether the line matches the pattern: it begins with > and then has only spaces until the end
    {x=$1}           Sets x to $1
    !a&&x            If the line does not match the pattern and x is set
    {print x;x=0}    Print x (>) and set x to zero
    !a               If the line does not match the pattern, print the line

How it works: it sets x to > when it finds a line containing only > and spaces.
It then carries on until it finds a line that does not match, prints > and then prints that line. x is set again every time the pattern is found.
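The same idea can be spelled out with a named flag instead of the compact a and x variables. This is a sketch of equivalent logic, not the one-liner itself; the function name squeeze_with_awk and the variable name pending are invented for the example:

```shell
# Hypothetical helper: reads quoted mail text on stdin, writes the result to stdout.
squeeze_with_awk() {
  awk '
    # A line that is ">" plus optional spaces: remember it, print nothing yet.
    /^>[ ]*$/ { pending = 1; next }
    # First real line after one or more such lines: emit a single ">" first.
    pending   { print ">"; pending = 0 }
    # Every real line is printed unchanged.
              { print }
  '
}
```

As with the one-liner, a run of ">" lines at the very end of the input is dropped, because no following line triggers the print.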

Hope this helps :)
