How do conditional expressions on capture groups work in .NET regex?

While playing with regular expressions, especially the balanced matching of the .NET flavor, I came to the conclusion that I do not understand how the engine works internally as well as I thought. I would appreciate any insight into why my patterns behave the way they do! But first ...

Disclaimer: This question is purely theoretical, and any result obtained here will never be used, modified, or otherwise employed in production code for parsing HTML. Ever. I promise. I'm scared of the pony. =)

Now to my problem. I am trying to match the letter A if it is not preceded by a # anywhere earlier in the string. To demonstrate, I always use the string ..A..#..A.. Here only the first A should be matched. Of course, this is a fairly simple task using "A(?<!^.*#.*)", but I want to use conditionals here, as they can be used for balanced matching and other interesting things.

What I tried

 "A(?<=^(#(?<q>)|[^#])*(?(q)(?!)))" 

The way I interpret this: when the engine encounters an "A", it goes back to the beginning of the string, and for each character it adds an empty capture to the group q if the character is a #. Then it must fail if q contains a capture. I do not understand why this expression matches both As in my sample string.

When I just remove the lookbehind and match the string from the start, this works:

 "^(#(?<q>)|[^#])*(?(q)(?!))A" 

It matches the string up to the first A, even though the quantifier on the first group is greedy. Inserting a "#" at the beginning also makes the match fail (as expected).
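A minimal C# sketch to reproduce the forward (no lookbehind) case, assuming the sample string from above. The class and method names are only illustrative and not part of any library:

```csharp
using System;
using System.Text.RegularExpressions;

public static class ConditionalDemo
{
    // Pattern copied from the question: the loop pushes an empty capture
    // onto q for every '#', and (?(q)(?!)) forces a failure when q is set.
    private static readonly Regex Forward =
        new Regex(@"^(#(?<q>)|[^#])*(?(q)(?!))A");

    // Hypothetical helper: runs the pattern once against the input.
    public static Match Run(string input)
    {
        return Forward.Match(input);
    }

    public static void Main()
    {
        Match first = Run("..A..#..A..");
        // Backtracking undoes the capture on q, so the match ends at the first A.
        Console.WriteLine(first.Success ? first.Value : "(no match)");   // ..A

        Match second = Run("#..A..#..A..");
        // Every path to an A passes the leading '#', so q is always set.
        Console.WriteLine(second.Success ? second.Value : "(no match)"); // (no match)
    }
}
```

The point of the sketch is that the capture on q is dropped again when the engine backtracks out of the iteration that consumed the #, which is why the greedy loop still ends up at the first A.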

So: how do lookaround groups, named capture groups inside them, and conditional expressions work together?

Thanks!

Edit: This problem is easier to see in "(?<=(?<q>)(?(q)(?!)))." , which should not match any character, but matches all of them.

1 answer

Conditionals are actually not that useful in balanced matching, or anywhere else for that matter ;) Balanced matching is performed by using a named capture group as a stack: every time the group matches something, the matched text is pushed onto the stack. There is also a special syntax for popping the stack. Here is a good introduction:

http://blog.stevenlevithan.com/archives/balancing-groups
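To make the stack idea concrete, here is a small C# sketch (my own illustration, not taken from the linked article) that pushes on every "(" and pops on every ")", then inspects what is left on the stack. The group name open and the class and method names are assumptions chosen for the example:

```csharp
using System;
using System.Text.RegularExpressions;

public static class BalancingDemo
{
    // (?<open>\()   pushes a capture onto the "open" stack for every '('
    // (?<-open>\))  pops one capture for every ')'; it cannot match
    //               when the stack is empty
    private static readonly Regex Parens =
        new Regex(@"^(?:(?<open>\()|(?<-open>\))|[^()])*$");

    // Returns how many '(' are still on the stack after a successful
    // match, or -1 when the pattern does not match at all.
    public static int UnclosedCount(string input)
    {
        Match m = Parens.Match(input);
        return m.Success ? m.Groups["open"].Captures.Count : -1;
    }

    public static void Main()
    {
        Console.WriteLine(UnclosedCount("(a)"));   // 0: every push was popped
        Console.WriteLine(UnclosedCount("((a)"));  // 1: one '(' never closed
        Console.WriteLine(UnclosedCount("a)"));    // -1: pop on an empty stack fails
    }
}
```

Here the check for leftover captures is done in code after the match rather than inside the pattern, so the sketch only exercises the push and pop syntax.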
