The reason is not that the lookbehind is greedy. It is that the regex engine tries to match the pattern at every position in the string, from left to right, until one attempt succeeds.
It advances through the text such and such as of 29-5-11 . At each of the early positions (?<!as of ) succeeds, but \d{1,2} does not match.
Then the engine reaches the position such and such as of !29-5-11 (marked with ! ). Here \d{1,2} could match, but (?<!as of ) fails, because the text just before this position is as of .
So it moves on to the next position: such and such as of 2!9-5-11 . There (?<!as of ) succeeds, and so do \d{1,2} and the rest of the pattern, which is why the match is 9-5-11 .
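The walk-through above can be checked with a small sketch. It assumes the full pattern in question is (?<!as of )\d{1,2}-\d{1,2}-\d{2} (the pieces quoted above put together); the variable names are only for illustration.

```python
import re

text = "such and such as of 29-5-11"
# Assumed pattern: a date that is not directly preceded by "as of ".
pattern = re.compile(r"(?<!as of )\d{1,2}-\d{1,2}-\d{2}")

m = pattern.search(text)
matched = m.group(0)  # the text that was matched
start = m.start()     # index at which the successful attempt began

print(matched, start)   # 9-5-11 21
print(text[start - 1])  # 2  (the digit that was skipped)
```

The match starts one character late: the attempt at the 2 is rejected by the lookbehind, and the attempt at the 9 is accepted.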
How can you avoid it?
The general solution is to make the pattern as explicit as possible.
In this case, I would require the date to be preceded by whitespace or by the beginning of the line.
(?<!as of)(?:^|\s+)(\d{1,2}-\d{1,2}-\d{2})
Mark Byers's solution is also very good.
I think it's very important to understand why the regex engine behaves this way and produces the undesired result.
By the way, the solution I gave above does not work if there are 2 or more spaces. It fails because, with the above pattern, the first position that matches is here: such and such as of ! 29-5-11 (after the first space, where the preceding text is no longer exactly as of ).
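A quick sketch of that failure, using the pattern given above on the same sentence with one space and with two spaces (the variable names are made up for the example):

```python
import re

pattern = re.compile(r"(?<!as of)(?:^|\s+)(\d{1,2}-\d{1,2}-\d{2})")

# One space after "as of": the date is rejected, as intended.
one_space = pattern.search("such and such as of 29-5-11")
# Two spaces: the attempt that starts after the first space succeeds.
two_spaces = pattern.search("such and such as of  29-5-11")

print(one_space)            # None
print(two_spaces.group(1))  # 29-5-11
print(two_spaces.start())   # 20
```

Index 20 is the second space, so the lookbehind sees "s of " there instead of "as of" and lets the match through.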
What can be done to avoid this?
Unfortunately, lookbehind in Python's re engine must be fixed-width, so it does not support the + or * quantifiers.
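To see the restriction for yourself, here is a minimal sketch; compiles is a hypothetical helper written for this example, not part of the re module.

```python
import re

def compiles(pattern: str) -> bool:
    """Return True if the re module accepts the pattern, False otherwise."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        # Raised for invalid patterns, including variable-width lookbehind.
        return False

print(compiles(r"(?<!as of )\d{1,2}"))     # fixed-width lookbehind
print(compiles(r"(?<!as of\s+)\d{1,2}"))   # uses + inside the lookbehind
print(compiles(r"(?<!as of\s*)\d{1,2}"))   # uses * inside the lookbehind
```

Only the first pattern compiles; the other two are rejected before any matching takes place.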
I think the simplest solution is to make sure that there is no whitespace before (?:^|\s+) . That way (?:^|\s+) has to consume all of the whitespace right after some non-whitespace text. If that text is as of , the attempt fails, and every later starting position inside the run of spaces is rejected as well, because it is preceded by whitespace. The engine then simply continues the search at the following positions in the text.
re.search(r'(?<!as of)(?<!\s)(?:^|\s+)(\d{1,2}-\d{1,2}-\d{2})','such and such as of 29-5-11')  # returns None: the date after "as of" is rejected
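Here is a small sketch that tries this last pattern on a few inputs. The find_date helper and the sample sentences are made up for the example; the helper returns None rather than calling .group(1) on a failed search.

```python
import re

DATE = re.compile(r"(?<!as of)(?<!\s)(?:^|\s+)(\d{1,2}-\d{1,2}-\d{2})")

def find_date(text):
    """Return the first accepted date in text, or None if there is none."""
    m = DATE.search(text)
    return m.group(1) if m else None

print(find_date("such and such as of 29-5-11"))    # None
print(find_date("such and such as of   29-5-11"))  # None (three spaces)
print(find_date("delivered on  29-5-11"))          # 29-5-11
print(find_date("29-5-11 was the deadline"))       # 29-5-11
```

The date is now rejected after as of no matter how many spaces follow it, while dates after other text or at the start of the line are still found.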