Need help understanding this regular expression in sed

I posted this question and someone answered with this

sed '/^void.*{$/!b;:a;/\n}$/bb;$!{N;ba};:b;s/\n/&test1&/;s/\(.*\n\)\(.*\n\)/\1test2\n\2/' file

I am new to sed and regex and cannot understand what each part does.

I will try to explain what I understood, and you guys can fill in the missing pieces. I will go through it symbol by symbol.

  • ^void.*{$ - this means everything that starts with void and ends with {
  • /!b; I did not understand what this does. I know b is for branching, but what does / do there?
  • :a; is supposed to create a label
  • /\n Again, I did not understand the / there
  • }$ , which ends in }
  • /bb I don't understand this
  • $! means if not the end of the file
  • {N; I didn't understand what this means. N means appending the next line to the buffer, but I don't get the {
  • :b I do not understand. b is for branching, but I don't know what it does there.
  • s/\n/&test1&/ I think this replaces \n with \ntest1\n , but I'm not sure
  • s/\(.*\n\)\(.*\n\)/\1test2\n\2/ I don't get this either.
2 answers

You can chain multiple sed commands with the ; character. Let's look at each one separately.

The first expression, /^void.*{$/!b , has a regular expression between the / delimiters. It matches:

^ - start of line

void - followed by the characters "void"

.* - followed by any characters (zero or more)

{ - followed by a left curly brace

$ - followed by the end of the line

The !b after this first expression means that if the line does not match, sed branches to the end of the script: the remaining commands are skipped for that line, which is then printed unchanged.
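To see the effect of !b in isolation, here is a minimal sketch, assuming GNU sed (which accepts ; right after b). The input lines and the MATCH: prefix are made up for illustration; only the address /^void.*{$/!b comes from the original command.

```shell
# Lines that do not match /^void.*{$/ take the b branch and are printed untouched;
# only matching lines reach the s command that follows.
out=$(printf 'int x;\nvoid f() {\n' | sed '/^void.*{$/!b;s/^/MATCH: /')
printf '%s\n' "$out"
# int x;
# MATCH: void f() {
```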

The expression :a is a label. It is used with sed's goto-like feature, called branching. We will see how labels are used in the following expression.

The expression /\n}$/bb matches:

\n - new line

} - followed by a right curly brace

$ - followed by the end of the line

The bb part means that if a match is found, branch to label b. Label b is defined in a later expression as :b .

The expression $!{N;ba} should be read as a unit, although it has a ; in the middle. The curly braces in this case group a sequence of commands that must be executed together.

$! - if this is not the end of the input

{ - run a group of commands (in this case there are two)

N - append the next input line to the buffer, separated by a newline, without printing anything

ba - branch to label a

} - end of the group of commands
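The loop formed by :a and $!{N;ba} can be tried on its own. The sketch below assumes GNU sed; the three input lines and the comma substitution are only there to make the collected buffer visible.

```shell
# :a marks the top of the loop. While the current line is not the last one,
# N appends the next line to the buffer and ba jumps back to :a.
# When the loop ends, the whole input sits in the buffer, joined by newlines.
out=$(printf 'a\nb\nc\n' | sed ':a;$!{N;ba};s/\n/,/g')
printf '%s\n' "$out"
# a,b,c
```

In the original command, the extra /\n}$/bb check inside the loop stops the collecting earlier, as soon as a closing } line has been appended.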

Next is the label :b , which we reach through the expression /\n}$/bb once the most recently appended line consists of a single } .

Finally, there are two substitution commands that use fairly standard regex. The s command has the form s/find_this/replace_it_with_this/ . In the case of s/\n/&test1&/ we have:

\n - find a new line

/ - and replace it with

& - the text that was matched by the search pattern (in this case, the newline)

test1 - word test1

& - and again the thing that was matched

So basically s/\n/&test1&/ replaces the first \n in the buffer with \ntest1\n .
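A small sketch of this substitution by itself, using two made-up input lines. N is only there to get a newline into the buffer so that the pattern has something to match.

```shell
# N joins "one" and "two" into a single buffer: one\ntwo
# Each & in the replacement stands for the matched newline.
out=$(printf 'one\ntwo\n' | sed 'N;s/\n/&test1&/')
printf '%s\n' "$out"
# one
# test1
# two
```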

The last expression is similar, but introduces something called captures. Captures let you still match everything, but keep whatever is between \( and \) for use in the replacement part of an expression. For example, s/a\(b\)c\(d\)e/\1 \2/ outputs b d if the input string is abcde . In this example, \1 and \2 are replaced by the things captured in the escaped parenthesis pairs, b and d , respectively.
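The capture example can be run directly to check what it produces:

```shell
# \(b\) is stored as \1 and \(d\) as \2; the replacement keeps only those two,
# separated by the space written between \1 and \2.
out=$(printf 'abcde\n' | sed 's/a\(b\)c\(d\)e/\1 \2/')
printf '%s\n' "$out"
# b d
```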

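The last s command breaks down as follows:

/ - find

\( - and start capturing into the replacement variable \1

. - any character

* - repeated any number of times, as many as possible

\n - followed by a newline

\) - (end of capture for \1 )

\( - and start capturing into the replacement variable \2

. - any character

* - repeated any number of times, as many as possible

\n - followed by a newline

\) - (end of capture for \2 )

/ - and replace it with

\1 - the first thing that was captured,

test2\n - the word test2 and a newline,

\2 - and the second thing that was captured.

Because .* is greedy, \1 takes as many lines as it can while still leaving one complete line for \2 . The effect is that test2 is inserted before the last line that precedes the closing } .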
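Where test2 lands is easier to see on a tiny input. This is a sketch assuming GNU sed (it relies on \n in the replacement and on ; after labels); the four letters are placeholder lines, with d playing the role of the closing } .

```shell
# The loop first collects everything into the buffer: a\nb\nc\nd
# \1 greedily takes "a\nb\n", which leaves "c\n" for \2; "d" has no trailing
# newline in the buffer, so it is not part of the match.
out=$(printf 'a\nb\nc\nd\n' | sed ':a;$!{N;ba};s/\(.*\n\)\(.*\n\)/\1test2\n\2/')
printf '%s\n' "$out"
# a
# b
# test2
# c
# d
```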


This term:

 /^void.*{$/!b 

means: match ^void.*{$ , where the slashes are the delimiters surrounding the regular expression. So you get /^void.*{$/ . If an exclamation point follows the match expression, as in /regex/! , then the command that follows runs only if the regex does not match. The command that follows here is b , which is a branch. Without a label name, it branches to the end of the script. Overall, this expression tries to match ^void.*{$ (i.e., a line starting with void and ending with { ) and skips ( b ) the rest of the script if the match fails ( ! ).

This thing:

 :a;/\n}$/bb;$!{N;ba}; 

defines the label a ( :a ) and tries to match \n}$ (a newline followed by a line consisting of a single } ), which is again enclosed in /regex/ . On a match, it branches ( b ) to label b (hence /regex/bb ). Otherwise, if this is not the end of the input ( $! ), it reads the next line ( N ) and jumps back to label a ( ba ). Here, the pair of curly braces (i.e. {commands} ) creates a block. This block is executed as a whole if $! is true, that is, if there is more input. So $!{N;ba} just means:

 If not end of input:
 begin
   read next line
   jump to label a
 end
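Putting the pieces together, the whole command can also be written with one sed command per line and a comment on each step. This is only a sketch, assuming GNU sed; the sample C-like input is invented for illustration and is piped in instead of being read from file .

```shell
out=$(printf 'int g;\nvoid foo() {\n  int x = 1;\n  return;\n}\n' | sed '
# Not a "void ... {" line: skip the rest of the script (the line is still printed).
/^void.*{$/!b
:a
# The buffer ends with a newline and a lone }: the function is complete, go to b.
/\n}$/bb
# Otherwise, if there is more input, append the next line and loop.
$!{
N
ba
}
:b
# Insert test1 right after the first line of the buffer.
s/\n/&test1&/
# Insert test2 before the last line that precedes the closing brace.
s/\(.*\n\)\(.*\n\)/\1test2\n\2/
')
printf '%s\n' "$out"
# int g;
# void foo() {
# test1
#   int x = 1;
# test2
#   return;
# }
```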