How does the non-backtracking subexpression "(?>exp)" work?

I am trying to get better at regular expressions, and I find it hard to understand what (?>expression) means. Where can I find more information about non-backtracking subexpressions? One description I found says:

Greedy subexpression, also known as a non-backtracking subexpression. This is matched only once and then does not participate in backtracking.

This other link, http://msdn.microsoft.com/en-us/library/bs2twtah(v=vs.71).aspx, also has a definition of a non-backtracking subexpression, but I still find it hard to understand what it means. On top of that, I cannot come up with an example where I would use (?>exp).

+8
c# regex
3 answers

As always, regular-expressions.info is a good place to start.

Use an atomic group if you want to make sure that whatever has been matched once stays part of the match.

For example, to match several "words" that may or may not be separated by spaces, followed by a colon, a user tried this regular expression:

 (?:[A-Za-z0-9_.&,-]+\s*)+: 

When there was a match, everything was fine. But when there wasn't, his computer would become unresponsive at 100% CPU load because of catastrophic backtracking: the regex engine tried, in vain, to find a combination of words that would allow the following colon to match. The same run of characters can be split between iterations of the group in exponentially many ways, and the engine tries every one of them. A match was, of course, impossible.

Using an atomic group, this could be prevented:

 (?>[A-Za-z0-9_.&,-]+\s*)+: 

Now whatever has been matched stays matched: the engine cannot backtrack into the group, so a failed match is reported quickly.

+9

The Regex tutorial has a page on it: http://www.regular-expressions.info/atomic.html

Basically, what it does is discard backtracking information, which means that a(?>bc|b)c matches abcc but not abc.

The reason it does not match the second string is that it finds a match with bc and then discards the backtracking information for the alternation bc|b. It essentially forgets the |b part. As a result, there is no c left after bc, and the match fails.

The most useful application of atomic groups, as they are called, is optimizing slow regular expressions. You can find more information on the page linked above.

+8

Read up on possessive quantifiers. A possessive quantifier such as [a-z]*+ makes the backtracking engine remember only the previous step that matched, not all of the previous steps that matched.

This is useful when many acceptable steps are likely, since they would eat up memory if every step were stored for possible backtracking.

Possessive quantifiers are shorthand for atomic groups: [a-z]*+ behaves like (?>[a-z]*).

+1