Python regex speed - greedy vs. non-greedy

I'm running a regular expression in Python, line by line,

\w\s+\w 

over many large documents. Obviously, making the regex non-greedy (with ?) will not change what it matches (since \w != \s), but will it run faster? In other words, with a non-greedy regex, does Python work forward from the first matched character rather than backward from the end of the document to that character, or is the search naive?
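To make the premise concrete, here is a minimal sketch that checks the greedy and non-greedy forms find the same spans. The sample line and the variable names are made up for illustration; only `re.compile` and `finditer` come from the standard library.

```python
import re

greedy = re.compile(r'\w\s+\w')
lazy = re.compile(r'\w\s+?\w')

# Hypothetical sample line; any text with mixed whitespace would do.
line = 'some  text with \t spaces   between'

# Collect the (start, end) span of every match for each pattern.
greedy_spans = [m.span() for m in greedy.finditer(line)]
lazy_spans = [m.span() for m in lazy.finditer(line)]

# \s and \w never match the same character, so the whitespace run
# has to be consumed in full before the trailing \w can match.
same_matches = greedy_spans == lazy_spans
print(same_matches)
```

If the spans are identical, any difference between the two forms is purely one of speed, which is what the question is about.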

1 answer

Is this the pattern you mean?

    In [15]: s = 'some text with \tspaces between'

    In [16]: timeit re.sub(r'(\w)(\s+)(\w)', '\\1 \\3', s)
    10000 loops, best of 3: 30.5 us per loop

    In [17]: timeit re.sub(r'(\w)(\s+?)(\w)', '\\1 \\3', s)
    10000 loops, best of 3: 24.9 us per loop

The difference seems pretty slight: the non-greedy version is only about 5 microseconds faster.

Using 500 words of lorem ipsum, with a few mixed whitespace characters between each word, I get a difference of 8 ms.
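For anyone who wants to repeat the comparison outside IPython, a rough sketch with the standard `timeit` module might look like this. The test text, the `normalize` helper, and the run counts are assumptions for illustration, not the exact setup measured above, and the timings will vary by machine and Python version.

```python
import re
import timeit

greedy = re.compile(r'(\w)(\s+)(\w)')
lazy = re.compile(r'(\w)(\s+?)(\w)')

# Hypothetical stand-in for the lorem-ipsum text: 500 words
# separated by a mix of spaces and tabs.
words = ['lorem', 'ipsum', 'dolor', 'sit', 'amet'] * 100
text = ' \t '.join(words)

def normalize(pattern, s):
    """Collapse each whitespace run between word characters to one space."""
    return pattern.sub(r'\1 \3', s)

# Take the best of three repeats, 20 substitutions per repeat.
greedy_time = min(timeit.repeat(lambda: normalize(greedy, text), number=20, repeat=3))
lazy_time = min(timeit.repeat(lambda: normalize(lazy, text), number=20, repeat=3))

print(f'greedy: {greedy_time:.6f} s, non-greedy: {lazy_time:.6f} s per 20 runs')
```

Precompiling both patterns keeps compilation cost out of the measurement, so the numbers reflect only the matching and substitution work.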
