Performance of the itertools library compared to plain Python code

As a follow-up to my question about finding the position up to which two lists are equal, I got the hint to use the C-implemented itertools library to speed things up.

To check this, I wrote the following test and profiled it with cProfile:

from itertools import takewhile, izip

def match_iter(self, other):
    return sum(1 for x in takewhile(lambda x: x[0] == x[1],
                                    izip(self, other)))

def match_loop(self, other):
    element = -1
    for element in range(min(len(self), len(other))):
        if self[element] != other[element]:
            element -= 1
            break
    return element + 1

def test():
    a = [0, 1, 2, 3, 4]
    b = [0, 1, 2, 3, 4, 0]
    print("match_loop a=%s, b=%s, result=%s" % (a, b, match_loop(a, b)))
    print("match_iter a=%s, b=%s, result=%s" % (a, b, match_iter(a, b)))
    i = 10000
    while i > 0:
        i -= 1
        match_loop(a, b)
        match_iter(a, b)

def profile_test():
    import cProfile
    cProfile.run('test()')

if __name__ == '__main__':
    profile_test()

The match_iter() function uses itertools, and match_loop() is the version I had implemented before in plain Python.

The test() function defines two lists and prints them along with the results of both functions, to check that they work. Both results should be 5, the length of the matching prefix of the two lists. It then runs both functions 10,000 times each.

Finally, all of this is profiled via profile_test().

I found out that izip no longer exists in Python 3's itertools, at least not on the Debian Wheezy system that I use. So I ran the test with Python 2.7.
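For reference, a Python 3 port of the two functions from the question is straightforward, since the built-in zip is already lazy there and simply replaces izip (this is my own sketch, not part of the original test script):

```python
from itertools import takewhile

def match_iter(self, other):
    # Count pairs from the start while both elements are equal.
    return sum(1 for x in takewhile(lambda x: x[0] == x[1], zip(self, other)))

def match_loop(self, other):
    element = -1
    for element in range(min(len(self), len(other))):
        if self[element] != other[element]:
            element -= 1
            break
    return element + 1

a = [0, 1, 2, 3, 4]
b = [0, 1, 2, 3, 4, 0]
print(match_iter(a, b))  # 5
print(match_loop(a, b))  # 5
```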

Here are the results:

 $ python2.7 match_test.py
 match_loop a=[0, 1, 2, 3, 4], b=[0, 1, 2, 3, 4, 0], result=5
 match_iter a=[0, 1, 2, 3, 4], b=[0, 1, 2, 3, 4, 0], result=5
          180021 function calls in 0.636 seconds

    Ordered by: standard name

    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
         1    0.000    0.000    0.636    0.636 <string>:1(<module>)
         1    0.039    0.039    0.636    0.636 match_test.py:15(test)
     10001    0.048    0.000    0.434    0.000 match_test.py:3(match_iter)
     60006    0.188    0.000    0.275    0.000 match_test.py:4(<genexpr>)
     50005    0.087    0.000    0.087    0.000 match_test.py:4(<lambda>)
     10001    0.099    0.000    0.162    0.000 match_test.py:7(match_loop)
     20002    0.028    0.000    0.028    0.000 {len}
         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
     10001    0.018    0.000    0.018    0.000 {min}
     10001    0.018    0.000    0.018    0.000 {range}
     10001    0.111    0.000    0.387    0.000 {sum}

Looking at the cumtime values makes me wonder: my plain Python version takes 0.162 seconds for 10,000 runs, while the match_iter version takes 0.434 seconds.

On the one hand, Python is fast enough here that I don't really need to worry. But can it be right that the C library takes more than twice as long as plain Python code to do the same work? Or am I making some fundamental mistake?

To verify, I also ran the test with Python 2.6, which appears to be even faster, but shows the same gap between the loop and the itertools version.

Can anyone with more experience explain this?

2 answers

I suspect the problem is that your test lists are tiny, which means that any difference is likely to be minimal, and the cost of setting up the iterators outweighs the benefit they provide.

In larger tests (where performance is likely to actually matter), the version using sum() is more likely to outperform the other version.
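The size effect can be checked directly. Here is a sketch of my own (Python 3, using the built-in zip in place of izip; timings will vary by machine) that runs both approaches on a long shared prefix and prints the times, after first asserting that they agree on the result:

```python
import timeit
from itertools import takewhile

def match_iter(a, b):
    # itertools-based version: count equal pairs from the start.
    return sum(1 for x in takewhile(lambda x: x[0] == x[1], zip(a, b)))

def match_loop(a, b):
    # Index-based version from the question.
    element = -1
    for element in range(min(len(a), len(b))):
        if a[element] != b[element]:
            element -= 1
            break
    return element + 1

big_a = list(range(10000))
big_b = list(range(10000)) + [0]

# Both must agree on the length of the matching prefix.
assert match_iter(big_a, big_b) == match_loop(big_a, big_b) == 10000

for fn in (match_loop, match_iter):
    t = timeit.timeit(lambda: fn(big_a, big_b), number=100)
    print("%s: %.3fs for 100 runs" % (fn.__name__, t))
```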

There is also a question of style: the manual version is longer and relies on iterating by index, which makes it less flexible.

I would say the most readable solution is something like this:

 def while_equal(seq, other):
     for this, that in zip(seq, other):
         if this != that:
             return
         yield this

 def match(seq, other):
     return sum(1 for _ in while_equal(seq, other))

Interestingly, on my system a slightly modified version of this:

 def while_equal(seq, other):
     for this, that in zip(seq, other):
         if this != that:
             return
         yield 1

 def match(seq, other):
     return sum(while_equal(seq, other))

runs faster than the pure loop version:

 a = [0, 1, 2, 3, 4]
 b = [0, 1, 2, 3, 4, 0]

 import timeit

 print(timeit.timeit('match_loop(a,b)', 'from __main__ import a, b, match_loop'))
 print(timeit.timeit('match(a,b)', 'from __main__ import match, a, b'))

Giving:

 1.3171300539979711
 1.291257290984504

However, if we rewrite the pure loop version to be more Pythonic:

 def match_loop(seq, other):
     count = 0
     for this, that in zip(seq, other):
         if this != that:
             return count
         count += 1
     return count

This times (using the same method as above) at 0.8548871780512854, much faster for me than any of the other methods, while still being readable. This is probably due to the index-based looping in the original version, which is usually very slow. Still, I would go with the first version in this post, as I consider it the most readable.
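The claim that index-based looping is the slow part can be checked in isolation. A small sketch of my own (Python 3, hypothetical helper names) comparing the two loop styles on the same data, asserting they agree before timing them:

```python
import timeit

def match_by_index(a, b):
    # Loop over indices and subscript both lists each iteration.
    count = 0
    for i in range(min(len(a), len(b))):
        if a[i] != b[i]:
            return count
        count += 1
    return count

def match_by_zip(a, b):
    # Loop directly over paired elements; no subscripting needed.
    count = 0
    for this, that in zip(a, b):
        if this != that:
            return count
        count += 1
    return count

a = list(range(1000))
b = list(range(1000)) + [0]
assert match_by_index(a, b) == match_by_zip(a, b) == 1000

print("by index:", timeit.timeit(lambda: match_by_index(a, b), number=1000))
print("by zip:  ", timeit.timeit(lambda: match_by_zip(a, b), number=1000))
```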

  • The first point is that your loop version actually does the job, and does it faster here.
  • Secondly, readability usually matters more than raw speed. If your code runs three times faster but takes weeks longer to debug, it is not worth your time.
  • Third, you can also use timeit to time small bits of code. I find that approach a little easier than using profile (profile is good for finding bottlenecks, though).
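To illustrate the last point, here is a minimal sketch of how one of the question's functions could be timed with timeit instead of cProfile (the setup string and call count are my own choices):

```python
import timeit

# Everything the timed statement needs goes into the setup string,
# which timeit executes once before the timing loop.
setup = """
def match_loop(a, b):
    element = -1
    for element in range(min(len(a), len(b))):
        if a[element] != b[element]:
            element -= 1
            break
    return element + 1

a = [0, 1, 2, 3, 4]
b = [0, 1, 2, 3, 4, 0]
"""

# Run the statement 100,000 times; timeit returns total seconds.
elapsed = timeit.timeit('match_loop(a, b)', setup=setup, number=100000)
print('match_loop: %.3f seconds for 100000 calls' % elapsed)
```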

itertools is, in general, pretty fast. In this particular case, however, your takewhile slows things down because itertools has to call a Python function for each element of the input. Every function call in Python carries significant overhead, so this can slow you down a bit (there is also the one-time cost of creating the lambda in the first place). Note that sum over a generator expression also adds a bit of overhead. Ultimately, however, it seems that in this situation the plain loop wins every time.

 from itertools import takewhile, izip

 def match_iter(self, other):
     return sum(1 for x in takewhile(lambda x: x[0] == x[1],
                                     izip(self, other)))

 def match_loop(self, other):
     element = 0
     for element in range(min(len(self), len(other))):
         if self[element] == other[element]:
             element += 1
         else:
             break
     return element

 def match_loop_lambda(self, other):
     cmp = lambda x1, x2: x1 == x2
     element = 0
     for element in range(min(len(self), len(other))):
         if cmp(self[element], other[element]):
             element += 1
         else:
             break
     return element

 def match_iter_nosum(self, other):
     element = 0
     for _ in takewhile(lambda x: x[0] == x[1], izip(self, other)):
         element += 1
     return element

 def match_iter_izip(self, other):
     element = 0
     for x1, x2 in izip(self, other):
         if x1 == x2:
             element += 1
         else:
             break
     return element

 a = [0, 1, 2, 3, 4]
 b = [0, 1, 2, 3, 4, 0]

 import timeit
 print timeit.timeit('match_iter(a,b)', 'from __main__ import a,b,match_iter')
 print timeit.timeit('match_loop(a,b)', 'from __main__ import a,b,match_loop')
 print timeit.timeit('match_loop_lambda(a,b)', 'from __main__ import a,b,match_loop_lambda')
 print timeit.timeit('match_iter_nosum(a,b)', 'from __main__ import a,b,match_iter_nosum')
 print timeit.timeit('match_iter_izip(a,b)', 'from __main__ import a,b,match_iter_izip')

Please note that the fastest version is a hybrid of a plain loop and itertools (match_iter_izip). This explicit loop over izip also turns out to be easier to read, in my opinion. So from this we can conclude that takewhile is the slow part, not necessarily itertools in general.

