Performance discrepancy: obj.__setitem__(x, y) vs. obj[x] = y?

Question

I was writing a simple dict subclass with attribute access, and I came across something that seemed strange while optimizing. I originally wrote the __getattr__ and __setattr__ methods as simple aliases for self[key] and so on, but then I thought it would be faster to call self.__getitem__ and self.__setitem__ directly, since those are presumably what the [key] notation calls under the hood. Out of curiosity, I timed both implementations and found some surprises.

Below are the two implementations; as you can see, there is not much to them.

    # brackets
    class AttrDict(dict):
        def __getattr__(self, key):
            return self[key]
        def __setattr__(self, key, val):
            self[key] = val

    # methods
    class AttrDict(dict):
        def __getattr__(self, key):
            return self.__getitem__(key)
        def __setattr__(self, key, val):
            self.__setitem__(key, val)

Intuitively, I expected the second implementation to be a little faster, since it seems to skip the step of translating the bracket notation into a function call. However, that is not quite what my timeit results showed.

    >>> methods = '''\
    ... class AttrDict(dict):
    ...     def __getattr__(self, key):
    ...         return self.__getitem__(key)
    ...     def __setattr__(self, key, val):
    ...         self.__setitem__(key, val)
    ... o = AttrDict()
    ... o.att = 1
    ... '''
    >>> brackets = '''\
    ... class AttrDict(dict):
    ...     def __getattr__(self, key):
    ...         return self[key]
    ...     def __setattr__(self, key, val):
    ...         self[key] = val
    ...
    ... o = AttrDict()
    ... o.att = 1
    ... '''
    >>> getting = 'foo = o.att'
    >>> setting = 'o.att = 1'

The code above is just setup. Here are the tests:

    >>> import timeit
    >>> for op in (getting, setting):
    ...     print('GET' if op == getting else 'SET')
    ...     for setup in (brackets, methods):
    ...         s = 'Brackets:' if setup == brackets else 'Methods:'
    ...         print(s, min(timeit.repeat(op, setup, number=1000000, repeat=20)))
    ...
    GET
    Brackets: 1.109725879526195
    Methods: 1.050940903987339
    SET
    Brackets: 0.44571820606051915
    Methods: 0.7166479863124096
    >>>

As you can see, using self.__getitem__ is very slightly faster than self[key] (about 5%), but self.__setitem__ is much slower than self[key] = val (it takes about 60% more time). This seems very strange: I know that function call overhead can be large, but if that were the problem, I would expect the bracket notation to win in both cases, which is not what happens here.

I looked into it a little further; here are the results from dis:

    >>> import dis
    >>> exec(brackets)
    >>> dis.dis(AttrDict.__getattr__)
      3           0 LOAD_FAST                0 (self)
                  3 LOAD_FAST                1 (key)
                  6 BINARY_SUBSCR
                  7 RETURN_VALUE
    >>> dis.dis(AttrDict.__setattr__)
      5           0 LOAD_FAST                2 (val)
                  3 LOAD_FAST                0 (self)
                  6 LOAD_FAST                1 (key)
                  9 STORE_SUBSCR
                 10 LOAD_CONST               0 (None)
                 13 RETURN_VALUE
    >>> exec(methods)
    >>> dis.dis(AttrDict.__getattr__)
      3           0 LOAD_FAST                0 (self)
                  3 LOAD_ATTR                0 (__getitem__)
                  6 LOAD_FAST                1 (key)
                  9 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
                 12 RETURN_VALUE
    >>> dis.dis(AttrDict.__setattr__)
      5           0 LOAD_FAST                0 (self)
                  3 LOAD_ATTR                0 (__setitem__)
                  6 LOAD_FAST                1 (key)
                  9 LOAD_FAST                2 (val)
                 12 CALL_FUNCTION            2 (2 positional, 0 keyword pair)
                 15 POP_TOP
                 16 LOAD_CONST               0 (None)
                 19 RETURN_VALUE

The only thing I can think of is that perhaps the POP_TOP instruction has significant overhead compared to the rest of the call, but can it really be that much? It is the only thing that stands out here... Can anyone see what makes __setitem__ so much slower than its bracket counterpart, relative to __getitem__?
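The listings above come from CPython 3.3, and opcode names and offsets differ between CPython versions, so the same comparison may look different elsewhere. As a minimal sketch for repeating it on whatever interpreter is at hand, the snippet below collects just the opcode names for each method. The class names BracketsAttrDict and MethodsAttrDict and the helper opnames are made up for this illustration; they are not part of the original code.

```python
import dis


class BracketsAttrDict(dict):
    # Hypothetical name: the "brackets" implementation from the question.
    def __getattr__(self, key):
        return self[key]

    def __setattr__(self, key, val):
        self[key] = val


class MethodsAttrDict(dict):
    # Hypothetical name: the "methods" implementation from the question.
    def __getattr__(self, key):
        return self.__getitem__(key)

    def __setattr__(self, key, val):
        self.__setitem__(key, val)


def opnames(func):
    """Return the opcode names of func, in order, for the running interpreter."""
    return [ins.opname for ins in dis.get_instructions(func)]


if __name__ == "__main__":
    for cls in (BracketsAttrDict, MethodsAttrDict):
        for name in ("__getattr__", "__setattr__"):
            # Looking the hook up on the class returns the plain function.
            print(cls.__name__, name, opnames(getattr(cls, name)))
```

This only shows which instructions are emitted; it says nothing about how long each one takes, so it complements the timings rather than explaining them.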

Potentially relevant information:

CPython 3.3.2 32-bit on win32

performance python

Henry Keiter Feb 12 '14 at 17:05

1 answer

Claudiu · Answer 1 · 2014-02-12T17:09:31+0000

Hmm, this is interesting. If I run a stripped-down version of your code:

    setup="""
    def getbrack(a, b):
        return a[b]
    def getitem(a, b):
        return a.__getitem__(b)
    def setbrack(a, b, c):
        a[b] = c
    def setitem(a, b, c):
        return a.__setitem__(b, c)
    a = {2: 3}
    """

Here setitem and getitem are both slower than their respective counterparts, setbrack and getbrack:

    >>> import timeit
    >>> timeit.timeit("getbrack(a, 2)", setup, number=10000000)
    1.1424450874328613
    >>> timeit.timeit("getitem(a, 2)", setup, number=10000000)
    1.5957350730895996
    >>> timeit.timeit("setbrack(a, 2, 3)", setup, number=10000000)
    1.4236340522766113
    >>> timeit.timeit("setitem(a, 2, 3)", setup, number=10000000)
    2.402789831161499

However, if I run your test exactly as written, I get the same results as you: GET 'Brackets' is slower than GET 'Methods'.

This means it has something to do with the classes you are using, rather than with bracket notation versus __setitem__ as such.

Now, if I modify the test so that it no longer refers to self...

    brackets = '''d = {}
    class AttrDict2(dict):
        def __getattr__(self, key):
            return d[key]
        def __setattr__(self, key, val):
            d[key] = val
    o = AttrDict2()
    o.att = 1'''

    methods = '''d = {}
    class AttrDict2(dict):
        def __getattr__(self, key):
            return d.__getitem__(key)
        def __setattr__(self, key, val):
            d.__setitem__(key, val)
    o = AttrDict2()
    o.att = 1'''

Then I again see brackets being faster than methods in every case. Perhaps this has something to do with how self[] works in a dict subclass?
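One way to probe that hypothesis might be to time the four operations on a plain dict and on a bare dict subclass, leaving the __getattr__ and __setattr__ hooks out entirely. The following is only a sketch under those assumptions: Sub, SETUP_TEMPLATE, STATEMENTS and compare are names invented here, and the numbers will depend on the interpreter version and machine.

```python
import timeit

# Setup code for each run; {factory} is filled with the object to test.
SETUP_TEMPLATE = """
class Sub(dict):
    pass
a = {factory}
a[2] = 3
"""

# The four operations being compared (labels are illustrative).
STATEMENTS = {
    "get brackets": "a[2]",
    "get method": "a.__getitem__(2)",
    "set brackets": "a[2] = 3",
    "set method": "a.__setitem__(2, 3)",
}


def compare(number=1000000, repeat=5):
    """Return the best time per (target, operation) pair, in seconds."""
    results = {}
    for label, factory in (("dict", "dict()"), ("subclass", "Sub()")):
        setup = SETUP_TEMPLATE.format(factory=factory)
        for name, stmt in STATEMENTS.items():
            timings = timeit.repeat(stmt, setup, number=number, repeat=repeat)
            results[(label, name)] = min(timings)
    return results


if __name__ == "__main__":
    for (label, name), seconds in sorted(compare().items()):
        print(label, name, seconds)
```

If the ordering of brackets and methods is the same for both targets, subclassing alone is probably not the cause, and the attribute hooks would be the next thing to isolate. Either way this narrows the search rather than settling it.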
