This question came up in the context of Django URL patterns, but the problem seems to be more general.
I want to map URLs constructed as follows:
1,2,3,4,5,6/10,11,12/
I use this regex:
^(?P<apples>([0123456789]+,?)+)/(?P<oranges>([0123456789]+,?)+)/$
When I try to match it against a "valid" URL (i.e. one that matches), I get an instant match:
In [11]: print datetime.datetime.now(); re.compile(r"^(?P<apples>([0123456789]+,?)+)/(?P<oranges>([0123456789]+,?)+)/$").search("114,414,415,416,417,418,419,420,113,410,411,412,413/1/"); print datetime.datetime.now()
2011-10-18 14:27:42.087883
Out[11]: <_sre.SRE_Match object at 0x2ab0960>
2011-10-18 14:27:42.088145
However, when I try to match an "invalid" URL (i.e. one that does not match), the regular expression takes a long time (about 40 seconds here) to return nothing:
In [12]: print datetime.datetime.now(); re.compile(r"^(?P<apples>([0123456789]+,?)+)/(?P<oranges>([0123456789]+,?)+)/").search("114,414,415,416,417,418,419,420,113,410,411,412,413/"); print datetime.datetime.now()
2011-10-18 14:29:21.011342
2011-10-18 14:30:00.573270
I guess something in the regex engine slows down badly when multiple groups have to be matched and the match fails. Is there a workaround for this? Or should my regex be fixed?
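One rewrite I am considering, as a minimal sketch rather than a confirmed fix: since the comma in `([0123456789]+,?)+` is optional, a run of digits can be split between repetitions in many different ways, and I suspect the engine tries all of them before giving up. The pattern below makes the comma mandatory between numbers, so each list can only be read one way. The names `NUMBER_LIST`, `URL_PATTERN` and `parse_url` are hypothetical, and note that this version no longer accepts a trailing comma such as `1,2,3,/`, which the original pattern allowed.

```python
import re

# Hypothetical rewrite: a list is digits, then zero or more ",digits" groups.
# The comma is required between numbers, so a run of digits cannot be split
# across repetitions. Unlike the original, a trailing comma is rejected.
NUMBER_LIST = r"[0-9]+(?:,[0-9]+)*"
URL_PATTERN = re.compile(
    r"^(?P<apples>" + NUMBER_LIST + r")/(?P<oranges>" + NUMBER_LIST + r")/$"
)


def parse_url(path):
    """Return (apples, oranges) as lists of ints, or None if path does not match."""
    match = URL_PATTERN.search(path)
    if match is None:
        return None
    apples = [int(n) for n in match.group("apples").split(",")]
    oranges = [int(n) for n in match.group("oranges").split(",")]
    return apples, oranges


print(parse_url("1,2,3,4,5,6/10,11,12/"))
print(parse_url("114,414,415,416,417,418,419,420,113,410,411,412,413/"))
```

With this pattern the second call, the same "invalid" URL as in my test, should return None without the long delay. Whether dropping the trailing comma is acceptable depends on the URLs that actually need to be routed.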