Why is the list sum (sometimes) faster than itertools.chain?

I answered a few questions here by using this to "flatten" a list of lists:

    >>> l = [[1,2,3],[4,5,6],[7,8,9]]
    >>> sum(l, [])

It works fine and gives:

 [1, 2, 3, 4, 5, 6, 7, 8, 9] 

although I was told that sum performs a = a + b under the hood, which is not as efficient as itertools.chain.
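
As I understand it (a rough sketch of the idea, not the actual CPython code), sum(l, []) amounts to repeated concatenation, where every a = a + b builds a brand-new list:

    # Rough equivalent of sum(l, []) for lists (a sketch, not CPython's code):
    def sum_lists(list_of_lists, start=None):
        a = [] if start is None else start
        for b in list_of_lists:
            a = a + b   # creates a new list and copies all of a each time
        return a

    print(sum_lists([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
    # [1, 2, 3, 4, 5, 6, 7, 8, 9]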

My planned question was "why is this possible on lists when it is prevented for strings", but first I did a quick test on my machine comparing sum and itertools.chain.from_iterable on the same data:

    import itertools, timeit

    # itertools must be available in the timed statement's namespace,
    # so it is imported in the setup string as well.
    print(timeit.timeit("sum(l,[])",
                        setup='l = [[1,2,3],[4,5,6],[7,8,9]]'))
    print(timeit.timeit("list(itertools.chain.from_iterable(l))",
                        setup='import itertools; l = [[1,2,3],[4,5,6],[7,8,9]]'))

I have done this several times, and I always get roughly the same numbers as shown below:

    0.7155522836070246
    0.9883352857722025

To my surprise, chain, which everyone in the few comments to my answers recommended over sum for lists, is much slower.

chain is still interesting when you iterate over it in a for loop, because it does not actually create a list, but when a list is the goal, sum wins.
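
To show what I mean by that (a small illustration of my own), iterating over chain never materializes the flattened list:

    import itertools

    l = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

    # Elements are produced lazily, one at a time; no intermediate
    # flattened list is ever built.
    for x in itertools.chain.from_iterable(l):
        print(x)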

So, should we discard itertools.chain and use sum when the expected result is a list?

EDIT: thanks to some comments, I did another test, increasing the number of lists:

    s = 'import itertools; l = [[4,5,6] for _ in range(20)]'
    print(timeit.timeit("sum(l,[])", setup=s))
    print(timeit.timeit("list(itertools.chain.from_iterable(l))", setup=s))

now I get the opposite:

    6.479897810702537
    3.793455760814343
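
If it helps, here is the kind of scaling test I mean (a quick sketch of my own; absolute numbers will differ from machine to machine):

    import timeit

    # Sketch: scale the number of sublists to find where chain overtakes sum.
    # The exact crossover point depends on the interpreter and the machine.
    for n in (3, 10, 20, 50, 100):
        setup = 'import itertools; l = [[4,5,6] for _ in range(%d)]' % n
        t_sum = timeit.timeit("sum(l, [])", setup=setup, number=10000)
        t_chain = timeit.timeit("list(itertools.chain.from_iterable(l))",
                                setup=setup, number=10000)
        print(n, round(t_sum, 4), round(t_chain, 4))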

2 Answers

Your test inputs are tiny. At these scales, the terrible O(n^2) asymptotic running time of the sum version is not visible. Constant factors dominate the timings, and sum has the smaller constant factor, since it does not have to work through iterators.

With large lists, it becomes clear that sum is not at all intended for this kind of thing:

    >>> timeit.timeit('list(itertools.chain.from_iterable(l))',
    ...               'l = [[i] for i in xrange(5000)]; import itertools',
    ...               number=1000)
    0.20425895931668947
    >>> timeit.timeit('sum(l, [])',
    ...               'l = [[i] for i in xrange(5000)]',
    ...               number=1000)
    49.55303902059097

For your first question, "To my surprise, chain, which everyone in the few comments to my answers recommended over sum for lists, is much slower", there are two reasons for the timings you observed:

  • For small inputs, the timings are dominated by function call overhead. Calling both list and chain.from_iterable is more expensive than calling sum alone. The actual work of concatenating small inputs is cheaper than the work of making the function and method calls.

  • For large inputs, the expected quadratic behavior of the a = a + b logic dominates (see the sketch just after this list).
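
A rough back-of-the-envelope sketch (my own illustration) of where the quadratic term comes from: summing k one-element lists copies about k**2 / 2 elements in total, because each a = a + b recopies the whole accumulator:

    # Count how many elements get copied when summing k one-element lists.
    def elements_copied(k):
        copied = 0
        acc_len = 0
        for _ in range(k):
            copied += acc_len + 1   # copy the accumulator plus the new element
            acc_len += 1
        return copied

    for k in (10, 100, 1000):
        print(k, elements_copied(k))   # 55, 5050, 500500 -> quadratic growth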

For your other question, "why is this possible for lists where it is prevented for strings", the answer is that we cannot detect and report every quadratic case, so we just report the one a user is most likely to stumble upon by accident.
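
For reference, the guarded case looks like this in CPython (the exact wording of the message may vary between versions):

    >>> sum(['a', 'b', 'c'], '')
    Traceback (most recent call last):
      ...
    TypeError: sum() can't sum strings [use ''.join(seq) instead]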

Also, the ''.join(list_of_strings) alternative is harder to figure out if you don't already know about it. In contrast, the more performant alternative for lists is much easier to find: t=[]; for s in list_of_lists: t+=s

Using the non-itertools alternative, you can get reasonable performance with simple in-place list extensions:

    result = []
    for seq in list_of_lists:
        result += seq

The loop runs at "Python speed" instead of "C speed", but there is no function call overhead and no extra layer of iteration, and, more importantly, the list concatenation can exploit the known length of the input, so it can preallocate the space needed for the result (this is called __length_hint__).
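
That hint machinery is visible from Python through operator.length_hint (a minimal demo):

    >>> import operator
    >>> operator.length_hint([4, 5, 6])    # lists know their exact length
    3
    >>> operator.length_hint(iter(range(10)))
    10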

As a final thought, you should never trust timings that involve incrementally growing lists. The internal logic uses realloc() to resize the list as it grows. In timing suites, the environment is favorable, and realloc can often extend the block in place because no other data is in the way. However, the same logic used in real code can perform much worse, because more fragmented memory forces realloc to copy all of the data into a larger empty space. In other words, the timings may not be at all indicative of the actual performance of the real code you care about.

In any case, the main reason sum() is restricted is that Guido van Rossum and Alex Martelli thought it was best for the language.
