Ruby Regexp: + vs *. special behavior?
Using Ruby regexps, I get the following results:
>> 'foobar'[/o+/]
=> "oo"
>> 'foobar'[/o*/]
=> ""
But:
>> 'foobar'[/fo+/]
=> "foo"
>> 'foobar'[/fo*/]
=> "foo"
The documentation says:
*: zero or more repetitions of the previous
+: one or more repetitions of the previous
So I expected 'foobar'[/o*/] to return the same result as 'foobar'[/o+/].
Does anyone have an explanation for this?
'foobar'[/o*/] matches the zero o's that appear before the f, at position 0. 'foobar'[/o+/] cannot match there, because it needs at least one o, so it matches all the o's starting at position 1.
Specifically, the matches you are seeing are:
'foobar'[/o*/] => '<>foobar'
'foobar'[/o+/] => 'f<oo>bar'
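To make the positions concrete, here is a small sketch that asks Ruby where each pattern matched. It only uses String#match and MatchData#begin / MatchData#end; the variable names are just for illustration.

```ruby
# Report where each pattern first matches in 'foobar'.
str = 'foobar'

star = str.match(/o*/)   # succeeds immediately with an empty match
plus = str.match(/o+/)   # has to move on to the first real "o"

# Matched text, start offset, end offset.
star_info = [star[0], star.begin(0), star.end(0)]
plus_info = [plus[0], plus.begin(0), plus.end(0)]

p star_info  # => ["", 0, 0]
p plus_info  # => ["oo", 1, 3]
```

The empty match at offset 0 is a real, successful match, which is why 'foobar'[/o*/] returns "" rather than nil.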
This is a common misunderstanding of how regular expressions work.
Although * is greedy and the pattern is not anchored to the beginning of the string, the regexp engine still starts looking at the beginning of the string. In the case of /o+/, it does not match at position 0 (the "f"), but since + means one or more, the engine has to keep trying later positions (this has nothing to do with greediness) until it finds a match or every position has been tried.
However, in the case of /o*/, which as you know means zero or more times, finding no o at position 0 still counts as a successful (empty) match, so the regex engine happily stops right there (as it should, because o* just means the o is optional). There is also a performance argument: since the "o" is optional, why spend more time looking for it?
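The left-to-right search described above can be imitated in a few lines. This is only a rough sketch of the idea, not how Ruby's engine is actually implemented: first_match_by_position is a hypothetical helper that tries the pattern anchored (with \A) at each start position in turn and returns the first position where it succeeds.

```ruby
# Hypothetical helper: imitate the engine's left-to-right search by trying
# the pattern anchored at each start position, from 0 up to the string end.
def first_match_by_position(str, pattern)
  anchored = /\A(?:#{pattern.source})/
  (0..str.length).each do |pos|
    m = anchored.match(str[pos..-1])
    return [pos, m[0]] if m   # first position that matches wins
  end
  nil                         # every position was tried without success
end

p first_match_by_position('foobar', /o*/)   # => [0, ""]
p first_match_by_position('foobar', /o+/)   # => [1, "oo"]
p first_match_by_position('foobar', /fo*/)  # => [0, "foo"]
p first_match_by_position('foobar', /x+/)   # => nil
```

With /fo*/ the leading f forces a non-empty match at position 0, which is why /fo*/ and /fo+/ give the same answer while /o*/ and /o+/ do not.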