
Replying to emails: how to condense several "empty" lines (not truly empty, consisting only of ">") into one?

I'm trying to do something like this, but for quoted emails, so that this

    On 2014-07-11 at 03:36 PM, <ilovespaces@email.com> wrote:
    >Hi Everyone,
    >
    >
    >
    >I love spaces.
    >
    >
    >
    >That all.

would become this:

    On 2014-07-11 at 03:36 PM, <ilovespaces@email.com> wrote:
    >Hi Everyone,
    >
    >I love spaces.
    >
    >That all.

thanks

+8
unix sed
4 answers

Assuming each visual line is a valid logical line (a sequence of characters ending in \n), you can do without the other tools and just run uniq(1) on the input.

Example.

    % cat tst
    >Hi Everyone,
    >
    >
    >
    >I love spaces.
    >
    >
    >
    >That all.
    % uniq tst
    >Hi Everyone,
    >
    >I love spaces.
    >
    >That all.
    %
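One caveat worth keeping in mind: uniq only merges adjacent lines that are exactly identical, so `>` and `> ` (with a trailing space) would not be merged, and any other repeated adjacent line would be. A minimal sketch, assuming the quote-only lines may differ in trailing whitespace, is to normalize them before calling uniq. The function name `squeeze_quoted_blanks` is made up for this example.

```shell
# Hypothetical helper: reduce every line that is ">" plus optional
# whitespace to a bare ">", then let uniq merge the identical neighbours.
squeeze_quoted_blanks() {
  sed 's/^>[[:space:]]*$/>/' | uniq
}

printf '%s\n' '>Hi Everyone,' '>' '> ' '>  ' '>I love spaces.' | squeeze_quoted_blanks
```

This still inherits uniq's behaviour of collapsing any identical adjacent lines, not only the quote-only ones.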
+11

Try the following:

 sed -r '/^>\s*$/{N;/^>\s*\n>\s*$/D}' 

Here is an explanation:

Commands Used:

  • N Append the next line of input to the pattern space.
  • D Delete up to the first embedded newline in the pattern space. Start the next cycle, but skip reading from the input if there is still data in the pattern space.
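To get a feel for these two commands in isolation, here is a small sketch on a four-line input. The function names `join_pairs` and `last_two` are made up for the example; the input deliberately has an even number of lines, because the behaviour of N on the last line differs between sed implementations.

```shell
# N appends the next line, so the embedded newline becomes visible to s///.
join_pairs() {
  sed 'N;s/\n/+/'
}

# $!N appends the next line; $!D drops everything up to the first newline
# and restarts the cycle without reading input. Only the last two lines
# are left in the pattern space when the input ends.
last_two() {
  sed '$!N;$!D'
}

printf '%s\n' a b c d | join_pairs   # prints a+b, then c+d
printf '%s\n' a b c d | last_two     # prints c, then d
```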

Patterns used:

  • /^>\s*$/ matches a line consisting of '>' followed by zero or more whitespace characters
  • /^>\s*\n>\s*$/ matches two consecutive lines, each consisting of '>' followed by zero or more whitespace characters; it is meant to be used after N

So the workflow of the sed command above is:

  1. Read a line into the pattern space (if the end of the file is reached, exit).
  2. If the pattern space contains only '>', go to step 4; otherwise go to step 3.
  3. Print the content of the pattern space and go to step 1.
  4. Append '\n' and the next line to the pattern space. If the pattern space now contains only '>\n>' (which means we have met two consecutive such lines), go to step 5; otherwise go to step 3.
  5. Delete the content up to and including the '\n', then go to step 2.
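The steps above can be checked against the sample from the question. This is only a sketch and assumes GNU sed, since `\s` is a GNU extension; the wrapper name `condense` is made up for the example.

```shell
# Hypothetical wrapper around the command from this answer (GNU sed assumed).
condense() {
  sed -r '/^>\s*$/{N;/^>\s*\n>\s*$/D}'
}

printf '%s\n' \
  '>Hi Everyone,' '>' '>' '>' \
  '>I love spaces.' '>' '>' '>' \
  '>That all.' | condense
```

Each run of quote-only lines should come out as a single line, and because D keeps the later line of each pair, the one that survives is the last of the run.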
+2
 sed '/^>\s\s*$/d;$b;/^[^>]/b;a>' input 

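Meaning:

/^>\s\s*$/d : delete every line consisting of > followed by one or more whitespace characters.

$b;/^[^>]/b : for the last line and for lines that do not start with >, skip the rest of the script and just print the line.

a> : append a line containing > after every other line.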

gives:

    On 2014-07-11 at 03:36 PM, <ilovespaces@email.com> wrote:
    >Hi Everyone,
    >
    >I love spaces.
    >
    >That all.
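A sketch for reproducing this result, assuming GNU sed (for `\s` and the one-line `a>` form) and assuming the quote-only lines in the input carry at least one trailing space, which is what the first pattern requires. The wrapper name `requote` is made up for the example.

```shell
# Hypothetical wrapper around the command from this answer (GNU sed assumed).
requote() {
  sed '/^>\s\s*$/d;$b;/^[^>]/b;a>'
}

printf '%s\n' \
  'On 2014-07-11 at 03:36 PM, <ilovespaces@email.com> wrote:' \
  '>Hi Everyone,' '> ' '> ' '> ' \
  '>I love spaces.' '> ' '> ' '> ' \
  '>That all.' | requote
```

Since the command deletes the separators and then appends a fresh > after every quoted line except the last, two consecutive lines of quoted text would also get a > inserted between them.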
+2

awk way

This actually takes whitespace into account, unlike the other answers (except perreal's :)). It also does not simply insert a > after every line that has more than just > on it, which means that if there are several consecutive lines of text, no empty quoted lines are inserted between them.

 awk 'a=/^>[ ]*$/{x=$1}!a&&x{print x;x=0}!a' file 

Explanation

    a=/^>[ ]*$/     Sets a to the result of the pattern match: 1 if the line begins with > and has only spaces up to the end, 0 otherwise
    {x=$1}          When the pattern matches, sets x to $1 (the >)
    !a&&x           When the line does not match the pattern and x is set
    {print x;x=0}   Print x (the >) and reset x to zero
    !a              If the line does not match the pattern, print it

How it works: it sets x to > when it finds a line containing only > and spaces.
It then keeps reading until it finds a line that does not match, prints the > and then prints that line. x is set again every time the pattern matches again.
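For readers who find the one-liner dense, the same idea can be sketched in a longer form. This is a paraphrase rather than the original command: the names `condense_awk`, `blank` and `pending` are made up for the example, and it prints a literal ">" instead of the saved $1.

```shell
# Hypothetical, more verbose variant of the awk one-liner above.
condense_awk() {
  awk '
    # blank is 1 for a line made of ">" plus optional spaces, else 0
    { blank = ($0 ~ /^>[ ]*$/) }
    # inside a run of such lines: remember it and print nothing yet
    blank { pending = 1; next }
    # first ordinary line after a run: emit one ">" for the whole run
    pending { print ">"; pending = 0 }
    # ordinary lines are always printed
    { print }
  '
}

printf '%s\n' \
  'On 2014-07-11 at 03:36 PM, <ilovespaces@email.com> wrote:' \
  '>Hi Everyone,' '>' '>  ' '>' \
  '>I love spaces.' '>' '>' '>' \
  '>That all.' | condense_awk
```

As with the one-liner, the > is only printed once an ordinary line follows, so a run of quote-only lines at the very end of the input is dropped.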

Hope this helps :)

0