What is the difference between [0-9]+ and [0-9]++?

Can someone explain the difference between [0-9]+ and [0-9]++?

2 answers

The PCRE engine that PHP uses for regular expressions supports "possessive quantifiers":

Quantifiers followed by + are "possessive". They eat as many characters as possible and don't return to match the rest of the pattern. Thus .*abc matches "aabc" but .*+abc doesn't, because .*+ eats the whole string. Possessive quantifiers can be used to speed up processing.

and

If the PCRE_UNGREEDY option is set (an option which is not available in Perl), then the quantifiers are not greedy by default, but individual ones can be made greedy by following them with a question mark. In other words, it inverts the default behavior.

The difference is as follows:

 /[0-9]+/  - one or more digits; greediness depends on the PCRE_UNGREEDY option
 /[0-9]+?/ - one or more digits, as few as possible (non-greedy) in the default mode
 /[0-9]++/ - one or more digits, as many as possible and without backtracking (possessive)

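This snippet visualizes the difference in the default greedy mode. Note that, used on its own, the first pattern matches the same text as the last, since a plain + is already greedy by default; the possessive form behaves differently only when a later part of the pattern would need the quantifier to give characters back.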

This snippet visualizes the difference when PCRE_UNGREEDY is applied (quantifiers are ungreedy by default). See how the default behavior is inverted.


++ (and ?+, *+ and {n,m}+) are called possessive quantifiers.

Both [0-9]+ and [0-9]++ match one or more ASCII digits, but the second does not allow the regex engine to backtrack into the match, even if that is necessary for the overall regex to succeed.

Example:

 [0-9]+0 

matches the string 00, while [0-9]++0 does not.

In the first case, [0-9]+ first matches 00, but then gives back one character so that the following 0 can match. In the second case, the ++ prevents this, so the overall match fails.

Not all regular expression flavors support this syntax; some implement atomic groups instead (and some support both).
