
Understanding what makes this regex so slow

I have a regex:

import re

regexp = re.compile(r'^(?P<parts>(?:[\w-]+/?)+)/$')

It matches a string like foo/bar/baz/ and puts foo/bar/baz in a group named parts (the /? in combination with the /$ makes this work).

This works fine until you match a string that does not end with a slash. Then the match gets slower, apparently exponentially, with each character you add to the string.

Example

# This is instant (trailing slash)
regexp.match('this-will-take-no-time-at-all/')

# This is slow
regexp.match('this-takes-about-5-seconds')

# This will not finish
regexp.match('this-probably-will-not-finish-until-the-day-star-turns-black')
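The growth can be measured on short inputs before it becomes impractical. The following is a minimal sketch, not part of the original question: the helper name time_failed_match and the chosen lengths are assumptions, and the absolute timings depend on the machine and Python version.

```python
import re
import time

# The pattern from the question, unchanged.
slow = re.compile(r'^(?P<parts>(?:[\w-]+/?)+)/$')

def time_failed_match(length):
    """Time a failing match on `length` word characters with no trailing slash."""
    subject = 'a' * length
    start = time.perf_counter()
    result = slow.match(subject)
    return result, time.perf_counter() - start

# Short lengths only: each extra character roughly doubles the work.
for length in range(10, 19, 2):
    result, elapsed = time_failed_match(length)
    print(length, result, f'{elapsed:.4f}s')
```

Every result is None, since none of the inputs end in a slash; only the time to reach that answer changes.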



A pattern that avoids the problem:

^(?P<parts>(?:[\w-]+/)*[\w-]+)/$
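As a quick check, the rewritten pattern can be compared with the original on inputs where both finish quickly. This is a sketch under the assumption that the inputs of interest look like the ones in the question; the helper name parts is made up for the comparison.

```python
import re

original = re.compile(r'^(?P<parts>(?:[\w-]+/?)+)/$')
rewritten = re.compile(r'^(?P<parts>(?:[\w-]+/)*[\w-]+)/$')

def parts(pattern, subject):
    """Return the 'parts' group, or None when the pattern does not match."""
    m = pattern.match(subject)
    return m.group('parts') if m else None

# Both patterns agree on well-formed input with a trailing slash.
for subject in ['foo/', 'foo/bar/baz/', 'this-will-take-no-time-at-all/']:
    assert parts(original, subject) == parts(rewritten, subject)

# The input that never finished with the original pattern fails fast here.
long_subject = 'this-probably-will-not-finish-until-the-day-star-turns-black'
print(parts(rewritten, long_subject))  # None
```

The two patterns are not identical in every case: the original accepts a doubled trailing slash such as foo// (capturing foo/), while the rewritten one rejects it.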



See Wiktor Stribizew's answer. Take the original pattern:

'^(?P<parts>(?:[\w-]+/?)+)/$'

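Because the / is optional, the repeated group can split a run of word characters in many different ways.

For example, arm/ can be matched in any of these ways: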

(arm)/
(ar)(m)/
(a)(rm)/
(a)(r)(m)/
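The splits listed above can be generated mechanically, which makes the growth visible. This is a sketch: the function name splits is an assumption, and it models only how the repeated group can divide a run of word characters, not the regex engine itself.

```python
def splits(word):
    """Return every way to cut `word` into consecutive non-empty chunks."""
    if not word:
        return [[]]
    result = []
    for i in range(1, len(word) + 1):
        head = word[:i]
        for rest in splits(word[i:]):
            result.append([head] + rest)
    return result

# Print the candidate groupings for 'arm' in the notation used above.
for chunks in splits('arm'):
    print(''.join(f'({c})' for c in chunks) + '/')

# A run of n characters has 2 ** (n - 1) such splits.
for n in (3, 10, 26):
    print(n, 2 ** (n - 1))
```

Each of the n - 1 gaps between characters is either a cut or not, hence 2 ** (n - 1) splits. A 26-character input such as this-takes-about-5-seconds therefore has over 33 million of them to try.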

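When the final /$ cannot match, the engine tries every one of these splits before giving up. Making the slash mandatory between parts removes the ambiguity: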

'^(?P<parts>(?:[\w-]+/)*[\w-]+)/$'
