Regex to match everything except block comments

The problem is that I want to match all the text on either side of a comment and exclude the comment itself.

There are many posts about regex and comments, but most of them are for other languages (I am using Notepad++, which as far as I know is POSIX ERE; please, no discussion of languages or tools), and most of them focus on finding the comments, which I have already done.

This finds the text I want (but it includes the inner block comment in the match):

(^)rule ((.|\n|\r)*?)(^)end 

The above finds anything between "rule" and "end", inclusive. Good.

This finds a block comment:

 (?:/\*(?:(?:[^*]|\*(?!/))*)\*/) 

The above finds anything between /* and */, inclusive. Good. I don't care about the case where a */ appears inside a comment; that is not a problem in my case.

Now the question: how do I put the block-comment pattern, negated, in the middle of the positive rule pattern, so that it matches everything between RULE and END except the comment?

Bonus points if your answer also excludes single-line // comments.
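For readers who want to try the two patterns outside Notepad++, here is a minimal sketch using Python's `re` module as a stand-in engine (an assumption; Notepad++'s own engine may behave differently). The sample text and the variable names are hypothetical, and `re.MULTILINE` is added so that `^` matches at the start of each line.

```python
import re

# Hypothetical sample input; the exact "rule ... end" layout is an assumption.
SAMPLE = "rule foo\n  a = 1\n  /* inner\n     comment */\n  b = 2\nend\n"

# First pattern from the question; MULTILINE lets ^ match at line starts.
RULE = re.compile(r"(^)rule ((.|\n|\r)*?)(^)end", re.MULTILINE)

# Block-comment pattern from the question, unchanged.
BLOCK_COMMENT = re.compile(r"(?:/\*(?:(?:[^*]|\*(?!/))*)\*/)")

rule_match = RULE.search(SAMPLE)
comment_match = BLOCK_COMMENT.search(SAMPLE)

# The rule match spans "rule" through "end" and still contains the comment.
print(repr(rule_match.group(0)))
# The comment match spans "/*" through "*/".
print(repr(comment_match.group(0)))
```

This only reproduces the starting point described above: the first pattern still swallows the comment, which is exactly what the question is trying to avoid.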

+4
2 answers

Let me start by saying that regex is not made for this!

But it is not impossible: it can be done with a "recursive" regular expression, built by repeating the same piece of pattern:

  • Match everything from "rule" up to "end" OR up to the start of a comment block, after which again match everything up to "end" OR up to the start of a comment block, after which again match everything up to "end" OR etc.

capturing only the "everything" parts, of course.

Which translates to:

 ^rule((?:.|\r|\n)*?)(?:^end|(?:(?://$|/\*(?:(?:[^*]|\*(?!/))*)\*/)))

put the cursor at the end of that pattern and insert

 ((?:.|\r|\n)*?)(?:^end|(?:(?://$|/\*(?:(?:[^*]|\*(?!/))*)\*/)))

or end with

 (?:\r?\n^end)

Then replace with

$1$2$3$4$..

where the number of backreferences in the replacement must match the number of recursions.

To test the limits of Notepad++, I created this script:

http://jsfiddle.net/lovinglobo/wPKjb/

Notepad++ breaks down at more than 29 recursions, simply reporting "invalid regex".
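The repetition described above can also be generated programmatically. The following is a rough sketch in Python, not the linked script: the helper name `build_pattern` and the sample text are hypothetical, and Python's `re` stands in for Notepad++'s engine. It assumes the number of segments equals the number of comments in the rule plus one, since each segment has to end at either a comment or "end".

```python
import re

# The answer's comment alternative and one repeatable segment, copied as-is.
COMMENT = r"(?://$|/\*(?:(?:[^*]|\*(?!/))*)\*/)"
SEGMENT = r"((?:.|\r|\n)*?)(?:^end|" + COMMENT + ")"


def build_pattern(segments):
    """Return ^rule followed by `segments` copies of the segment (hypothetical helper)."""
    return re.compile("^rule" + SEGMENT * segments, re.MULTILINE)


# Hypothetical sample with exactly one block comment, so two segments are needed.
SAMPLE = "rule foo\n  a = 1\n  /* inner\n     comment */\n  b = 2\nend\n"

match = build_pattern(2).search(SAMPLE)

# Equivalent of replacing with $1$2: keep only the captured text.
kept = "".join(match.groups())
print(repr(kept))
```

Note that, as in the replacement above, "rule" and "end" themselves sit outside the capture groups, so only the text between them (minus the comment) is kept.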

0

If you can flip your requirement and instead remove all comments from the source first, you can use this pattern to match the comments (both block and line):

 /(\/\*).*?(\*\/)|(\/\/).*?(\n)/s 
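As a sketch of that two-step idea, assuming Python's `re` as the engine (the `/.../s` flag is expressed as `re.DOTALL`) and a hypothetical sample text: first delete the comments, then run the question's original rule pattern on what is left.

```python
import re

# The pattern above, with the /s flag expressed as re.DOTALL.
COMMENTS = re.compile(r"(\/\*).*?(\*\/)|(\/\/).*?(\n)", re.DOTALL)

# Rule pattern from the question; MULTILINE lets ^ match at line starts.
RULE = re.compile(r"(^)rule ((.|\n|\r)*?)(^)end", re.MULTILINE)

# Hypothetical sample with one line comment and one block comment.
SOURCE = (
    "rule foo\n"
    "  a = 1 // note\n"
    "  /* inner\n"
    "     comment */\n"
    "  b = 2\n"
    "end\n"
)

# Step 1: delete every comment. The line-comment branch also consumes the
# trailing newline, so the following line is joined onto the current one.
cleaned = COMMENTS.sub("", SOURCE)

# Step 2: match the rule body in the comment-free text.
body = RULE.search(cleaned).group(2)
print(repr(body))
```

One caveat worth checking in your own data: this pattern has no notion of string literals, so a `//` or `/*` inside a quoted string would also be treated as a comment.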
0