Writing a Better Regular Expression to Avoid Using the Lazy Repeat Quantifier

I have a regex:

(<select([^>]*>))(.*?)(</select\s*>)

Because it uses a lazy quantifier, on longer inputs (a select with more than 500 options) it backtracks more than 100,000 times and fails. Please help me find a better regex that doesn't use a lazy quantifier.

+5
2 answers
<select[^>]*>[^<]*(?:<(?!/select>)[^<]*)*</select>

... or in readable form:

<select[^>]*>    # start tag
[^<]*            # anything except opening bracket
(?:              # if you find an open bracket
  <(?!/select>)  #   match it if it's not part of the end tag
  [^<]*          #   consume any more non-brackets
)*               # repeat as needed
</select>        # end tag
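
As a rough check that the unrolled pattern matches the same text as a lazy one, here is a minimal Python sketch. The sample markup (a select with 500 options) and the variable names are made up for illustration, and Python's re module is only a stand-in for whatever engine the question is using.

```python
import re

# The unrolled-loop pattern from the answer, written with re.VERBOSE
# so the comments can stay inline.
UNROLLED = re.compile(r"""
    <select[^>]*>      # start tag
    [^<]*              # anything except an opening bracket
    (?:                # if there is an opening bracket...
      <(?!/select>)    #   match it unless it starts the end tag
      [^<]*            #   then consume any more non-brackets
    )*                 # repeat as needed
    </select>          # end tag
""", re.VERBOSE)

# A lazy-quantifier pattern for comparison (DOTALL lets "." cross newlines).
LAZY = re.compile(r"<select[^>]*>.*?</select>", re.DOTALL)

# Hypothetical sample input: 500 options, surrounded by other markup.
options = "".join(f'<option value="{i}">Item {i}</option>\n' for i in range(500))
html = f'<p>before</p><select name="demo">\n{options}</select><p>after</p>'

unrolled_match = UNROLLED.search(html)
lazy_match = LAZY.search(html)

# Both patterns should pick out exactly the same select element.
print(unrolled_match.group(0) == lazy_match.group(0))
print(unrolled_match.group(0).count("<option"))
```

The difference between the two is in how much work the engine does, not in what they match: the unrolled form consumes runs of non-bracket characters in one go instead of testing for the end tag after every character.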

This is an example of the "unrolled loop" technique that Friedl develops in his book, Mastering Regular Expressions. I did a quick test in RegexBuddy, first with a pattern based on a reluctant quantifier:

(?s)<select[^>]*>.*?</select>

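... and it took about 6000 steps to find a match. The unrolled-loop pattern took only about 500. And when I removed the closing bracket from the end tag (</select), making a match impossible, it needed only about 800 steps to report failure.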

If your regex flavor supports possessive quantifiers, use them too:

<select[^>]*+>[^<]*+(?:<(?!/select>)[^<]*+)*+</select>

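It takes about the same number of steps to find a match, but it can use much less memory along the way. And if no match is possible, it fails even faster; in my tests it took about 500 steps, the same number it took to find a match.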
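
The possessive pattern can be tried out with a short Python sketch like the one below. It rests on two assumptions: that Python's re module accepts possessive quantifiers from version 3.11 on (on older versions the sketch falls back to the plain unrolled pattern), and that the made-up test data (500 bare options) is representative. It does not measure step counts; it only shows the match and the clean failure when the end tag is broken.

```python
import re
import sys

# Assumption: possessive quantifiers are available in re from Python 3.11 on;
# older versions fall back to the non-possessive unrolled-loop pattern.
POSSESSIVE = r"<select[^>]*+>[^<]*+(?:<(?!/select>)[^<]*+)*+</select>"
UNROLLED = r"<select[^>]*>[^<]*(?:<(?!/select>)[^<]*)*</select>"
pattern = re.compile(POSSESSIVE if sys.version_info >= (3, 11) else UNROLLED)

# Hypothetical test data: 500 options, once well-formed and once with the
# closing bracket of the end tag removed so that no match is possible.
options = "".join(f"<option>{i}</option>" for i in range(500))
good = f"<select>{options}</select>"
broken = f"<select>{options}</select"

found = pattern.search(good)      # matches the whole element
missing = pattern.search(broken)  # no end tag, so no match

print(found is not None and found.group(0) == good)
print(missing)
```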

+2

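If your engine supports possessive quantifiers, you can use them here. Note that a bare .*+ in the third group would swallow the rest of the input and never give back the end tag, so the body has to stop at the end tag on its own:

(<select([^>]*+>))((?:[^<]++|<(?!/select\s*>))*+)(</select\s*>)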

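From the Perl regular expression documentation:

By default, when a quantified subpattern does not allow the rest of the overall pattern to match, Perl will backtrack. However, this behaviour is sometimes undesirable. Thus Perl provides the "possessive" quantifier form as well.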

       *+     Match 0 or more times and give nothing back
       ++     Match 1 or more times and give nothing back
       ?+     Match 0 or 1 time and give nothing back
       {n}+   Match exactly n times and give nothing back (redundant)
       {n,}+  Match at least n times and give nothing back
       {n,m}+ Match at least n but not more than m times and give nothing back

For instance,

      'aaaa' =~ /a++a/

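will never match, as the "a++" will gobble up all the "a"s in the string and won't leave any for the remaining part of the pattern. This feature can be extremely useful to give perl hints about where it shouldn't backtrack. For instance, the typical "match a double-quoted string" problem can be most efficiently performed when written as: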

      /"(?:[^"\\]++|\\.)*+"/
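
The two Perl examples can be reproduced in Python as a rough sketch, assuming a Python whose re module accepts possessive quantifiers (3.11 or later, as far as I know). The helper supports_possessive and the sample sentence are made up for illustration; on older versions the possessive parts are simply skipped.

```python
import re

def supports_possessive() -> bool:
    """Return True if this Python's re module accepts possessive quantifiers."""
    try:
        re.compile(r"a++")
        return True
    except re.error:
        return False

# The greedy a+ gives one "a" back so the trailing "a" can match.
greedy = re.search(r"a+a", "aaaa")

possessive = None
quoted = None
if supports_possessive():
    # The possessive a++ keeps every "a", so the trailing "a" can never match.
    possessive = re.search(r"a++a", "aaaa")
    # The double-quoted-string pattern from the Perl documentation.
    quoted = re.search(r'"(?:[^"\\]++|\\.)*+"', r'say "a \"quoted\" word" now')

print(greedy.group(0))
print(possessive)
if quoted is not None:
    print(quoted.group(0))
```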
+1
