Short answer:
- In Python 2: your guess is correct — key in dict.keys() is slower.
- In Python 3: your guess is wrong — key in dict.keys() performs like key in dict.
Details for py2 and py3 are given below.
Answer in Python 2.7:
Your assumption is true for the following reasons:
- dict.keys() involves an extra function call (stack overhead).
- dict.keys() returns a list containing all the keys in memory (as opposed to a lazy view object), so it must allocate that memory.
- key in dict can use a hashed lookup, which is an indexed search; key in dict.keys() is a linear search through a list.
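The asymptotic gap described above can be reproduced even on modern Python by materializing the keys into a list first, mimicking what Python 2's dict.keys() did. This is a sketch, not the original script; the dict size, iteration count, and key choice are arbitrary assumptions:

```python
import timeit

d = {i: i for i in range(100_000)}

# Simulate Python 2's dict.keys() by materializing the keys as a list.
keys_list = list(d.keys())

# Hash-based membership: O(1) on average, independent of dict size.
t_dict = timeit.timeit("99_999 in d", globals=globals(), number=200)

# Linear scan: O(n) per test -- 99_999 sits at the end of the list.
t_list = timeit.timeit("99_999 in keys_list", globals=globals(), number=200)

print("in dict: %1.6fs" % t_dict)
print("in list: %1.6fs" % t_list)
```

The list version should come out orders of magnitude slower, since every membership test walks the whole list.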
I created a small script to demonstrate my point:
#!/usr/bin/python2.7

import datetime as dt
import random

dict_size = 1000000
num_iterations = 100

d = {i: i for i in xrange(dict_size)}

def f():
    k = random.randint(0, dict_size - 1)
    return (k in d)

def g():
    k = random.randint(0, dict_size - 1)
    return (k in d.keys())

def test(func):
    t = dt.datetime.utcnow()
    for i in xrange(num_iterations):
        func()
    print "%s --> %1.6fs" % (func, (dt.datetime.utcnow() - t).total_seconds())

test(f)
test(g)
Output (Python 2.7.6, Ubuntu 14.04):
<function f at 0x7ff2e0126d70> --> 0.000598s
<function g at 0x7ff2e0126de8> --> 5.191553s
I also tested with the number of iterations and the dict size swapped (a dict of only 100 elements, 1M iterations):
<function f at 0x7f94cb5e6d70> --> 3.614162s
<function g at 0x7f94cb5e6de8> --> 7.007922s
Here the results are much closer to each other.
So the performance difference really grows with the dict size.
Answer in Python 3:
I adapted the script for python 3:
- xrange is gone, use range (not in the performance-critical inner loop of the test functions, so limited performance impact).
- Use parentheses with print.
- Change the shebang line to #!/usr/bin/python3.
And tested with python 3.4.3 on the same computer.
dict_size = 1000000; num_iterations = 100
f --> 0.000590s
g --> 0.000565s

dict_size = 100; num_iterations = 1000000
f --> 4.525487s
g --> 4.837232s
So in Python 3 the performance difference is gone: dict.keys() now returns a lightweight view object rather than a list, and membership tests on it use the dict's hash table.
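A quick sketch of how Python 3's keys view behaves (the dict contents here are just an example):

```python
d = {"a": 1, "b": 2}

# dict.keys() returns a view, not a list.
view = d.keys()
print(type(view))   # <class 'dict_keys'>

# Membership tests on the view use the dict's hash table (O(1) on average).
print("a" in view)  # True

# The view is live: it reflects later changes to the dict.
d["c"] = 3
print("c" in view)  # True
print(len(view))    # 3
```

Because the view is backed directly by the dict, no key list is ever allocated and no linear scan happens.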
Freek wiekmeijer