The difference between ".+" and ".+?"

Can someone explain the difference between ".+" and ".+?"?

I have the string: "extend cup end table"

  • The pattern e.+d finds: extend cup end
  • The pattern e.+?d finds: extend and end

I know that + means one or more and ? means zero or one, but I can’t understand how they work together here.

2 answers

Both will match any sequence of one or more characters. The difference is that:

  • .+ is greedy and consumes as many characters as it can.
  • .+? is reluctant (also called lazy or non-greedy) and consumes as few characters as it can.

See "Differences Among Greedy, Reluctant, and Possessive Quantifiers" in the Java tutorial.

So:

  • e.+d finds the longest substring that starts with e and ends with d (with at least one character in between). In your example, it finds extend cup end.
  • e.+?d finds the shortest such substring. In your example, extend and end are two such non-overlapping matches, so it finds both.
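
To check this on the example string, here is a minimal sketch using Python's re module (my choice for illustration; the question does not name a language). Python's documentation calls the reluctant form "non-greedy", but for these two patterns the behavior is the same as described above.

```python
import re

text = "extend cup end table"

# Greedy: .+ takes as much as it can, then gives characters back
# until the trailing 'd' can match.
greedy = re.findall(r"e.+d", text)

# Reluctant (non-greedy): .+? takes as little as it can and grows
# only until the trailing 'd' can match.
reluctant = re.findall(r"e.+?d", text)

print(greedy)     # ['extend cup end']
print(reluctant)  # ['extend', 'end']
```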

The regular expression e.+?d matches 'e', then tries to match as few characters as possible (lazy, or reluctant), followed by 'd'. That's why the following 2 substrings are matched:

 extend cup end table
 ^^^^^^     ^^^
   1         2
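
The marked spans can be reproduced from the match positions. This is a small sketch in Python; underline is a hypothetical helper written for this illustration, not a library function.

```python
import re

text = "extend cup end table"

def underline(pattern, text):
    """Return a line of carets under every character covered by a match."""
    marks = [" "] * len(text)
    for m in re.finditer(pattern, text):
        for i in range(m.start(), m.end()):
            marks[i] = "^"
    return "".join(marks).rstrip()

print(text)
print(underline(r"e.+?d", text))  # two separate spans: "extend" and "end"
```

Passing r"e.+d" instead marks a single span covering "extend cup end".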

The regular expression e.+d matches 'e', then tries to match as many characters as possible (greedy), followed by 'd'. The engine finds the first 'e', and then .+ matches as much as it can (to the end of the line or input):

 extend cup end table ^^^^^^^^^^^^^^^^^^^^ 

The regex engine has now reached the end of the line (or input) and cannot match the 'd' in the pattern. So it backtracks, giving up one character at a time, until it reaches the last 'd'. This is why only one match is found:

 extend cup end table
 ^^^^^^^^^^^^^^<----- backtrack
       1
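
The two search orders can be sketched by hand. The code below is a simplified illustration in Python, hard-wired to the patterns e.+d and e.+?d and assuming single-line input. It is not how a real regex engine is implemented, and match_at and find_all are hypothetical names.

```python
def match_at(text, start, greedy):
    """Try to match e.+d (greedy) or e.+?d (reluctant) beginning at start."""
    if not text.startswith("e", start):
        return None
    # .+ must cover at least one character, so the closing 'd'
    # can sit at index start + 2 at the earliest.
    candidates = range(start + 2, len(text))
    if greedy:
        # Greedy: begin at the end of the input and back off leftwards.
        candidates = reversed(candidates)
    # Reluctant: begin with the shortest option and extend rightwards.
    for d_pos in candidates:
        if text[d_pos] == "d":
            return text[start:d_pos + 1]
    return None

def find_all(text, greedy):
    """Collect non-overlapping matches, scanning left to right."""
    matches, pos = [], 0
    while pos < len(text):
        m = match_at(text, pos, greedy)
        if m is None:
            pos += 1
        else:
            matches.append(m)
            pos += len(m)
    return matches

print(find_all("extend cup end table", greedy=True))   # ['extend cup end']
print(find_all("extend cup end table", greedy=False))  # ['extend', 'end']
```

Both variants accept the same set of candidate substrings; only the order in which they try them differs, and that order decides which match is reported first.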