Why does using an attribute instead of a method give such a significant speedup in Python?

Question


I experimented with a class that matches patterns. My class looks something like this:

    class Matcher(object):
        def __init__(self, pattern):
            self._re = re.compile(pattern)

        def match(self, value):
            return self._re.match(value)

All told, my script takes ~ 45 seconds to run. As an experiment, I changed my code to:

    class Matcher(object):
        def __init__(self, pattern):
            self._re = re.compile(pattern)
            self.match = self._re.match

Running this script took 37 seconds. No matter how many times I repeat this process, I see the same significant improvement in performance. Running it through cProfile shows something like this:

      ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    46100979   14.356    0.000   14.356    0.000 {method 'match' of '_sre.SRE_Pattern' objects}
    44839409    9.287    0.000   20.031    0.000 matcher.py:266(match)

Why does the match method add 9.2 seconds to the runtime? ~~The most frustrating part is that I tried to recreate a simple case and could not.~~ ~~What am I missing here?~~ My simple test case had a bug! Now it mimics the behavior that I see:

    import re
    import sys
    import time

    class X(object):
        def __init__(self):
            self._re = re.compile('.*a')

        def match(self, value):
            return self._re.match(value)

    class Y(object):
        def __init__(self):
            self._re = re.compile('ba')
            self.match = self._re.match

    inp = 'ba'
    x = X()
    y = Y()

    sys.stdout.write("Testing with a method...")
    sys.stdout.flush()
    start = time.time()
    for i in range(100000000):
        m = x.match(inp)
    end = time.time()
    sys.stdout.write("Done: "+str(end-start)+"\n")

    sys.stdout.write("Testing with an attribute...")
    sys.stdout.flush()
    start = time.time()
    for i in range(100000000):
        m = y.match(inp)
    end = time.time()
    sys.stdout.write("Done: "+str(end-start)+"\n")

Output:

    $ python speedtest.py
    Testing with a method...Done: 50.6646981239
    Testing with an attribute...Done: 35.5526258945

For reference, both are much faster with PyPy, but still show a significant gain when using the attribute instead of the method:

    $ pypy speedtest.py
    Testing with a method...Done: 6.15996003151
    Testing with an attribute...Done: 3.57215714455


performance python regex

dave mankoff Sep 7 '12 at 19:31


2 answers

The first version makes an additional function call every time, which adds some overhead.


desimusxvii Sep 7 '12 at 19:33


Brenbarn · Accepted Answer · 2012-09-07T19:35:18+0000

This is probably mainly the overhead of an extra function call. Calling a Python function is relatively costly due to the need to create an additional stack frame, etc. Here is a bare-bones example showing similar performance:

    >>> timeit.timeit("f()", "g = (lambda: 1); f = lambda: g()")
    0.2858083918486847
    >>> timeit.timeit("f()", "f = lambda: 1")
    0.13749289364989004
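The same comparison can be sketched with classes shaped like the ones in the question. This is a rough illustration rather than a rigorous benchmark: the class names `ViaMethod` and `ViaAttribute` and the helper `time_call` are made up for this example, it is written for Python 3, and both classes compile the same pattern so that only the call path differs (the question's test script uses `'.*a'` for one class and `'ba'` for the other). Absolute timings will depend on the machine and interpreter version.

```python
import re
import timeit


class ViaMethod:
    """Hypothetical class: wraps the pattern's match() in a Python method."""

    def __init__(self, pattern):
        self._re = re.compile(pattern)

    def match(self, value):
        # Every call runs this Python-level function before reaching
        # the compiled pattern's own match().
        return self._re.match(value)


class ViaAttribute:
    """Hypothetical class: stores the pattern's bound match() on the instance."""

    def __init__(self, pattern):
        self._re = re.compile(pattern)
        # obj.match now refers directly to the built-in method,
        # so no Python-level wrapper runs per call.
        self.match = self._re.match


def time_call(obj, number=200_000):
    """Return the seconds taken by `number` evaluations of obj.match(inp)."""
    return timeit.timeit(
        "obj.match(inp)",
        globals={"obj": obj, "inp": "ba"},
        number=number,
    )


for cls in (ViaMethod, ViaAttribute):
    print(cls.__name__, time_call(cls("ba")))
```

Both objects return equivalent match results; only the route taken to reach the compiled pattern's `match` differs.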

There is also the added cost of two additional attribute lookups inside your method: looking up _re on self, and then looking up match on that _re object. However, this is most likely a small component, as dictionary lookups are pretty fast in Python. (My timeit example shows a fairly large slowdown even though there is only one additional name lookup in the two-call version.)
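One way to see those two lookups is to inspect the bytecode of the wrapper method with the standard `dis` module. The sketch below assumes CPython 3; the helper `attribute_loads` is a name invented for this example, and opcode names are an implementation detail that varies between CPython versions (which is why it checks for both `LOAD_ATTR` and `LOAD_METHOD`).

```python
import dis
import re


class Matcher:
    def __init__(self, pattern):
        self._re = re.compile(pattern)

    def match(self, value):
        return self._re.match(value)


def attribute_loads(func):
    """List the attribute names that func's bytecode loads, in order.

    Hypothetical helper for illustration; it relies on CPython opcode
    names, which are not guaranteed to stay stable across versions.
    """
    return [
        ins.argval
        for ins in dis.get_instructions(func)
        if ins.opname in ("LOAD_ATTR", "LOAD_METHOD")
    ]


# Names looked up on each call of the wrapper method.
print(attribute_loads(Matcher.match))

# Full disassembly, including the call instruction that follows the loads.
dis.dis(Matcher.match)
```

With the attribute version from the question, none of this bytecode runs per call, because `self.match` already refers to the compiled pattern's built-in method.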
