This is somewhat of a follow-up to this question.
You will notice that you cannot use sum on a list of strings to concatenate them: Python tells you to use str.join instead, and that is good advice, because however you use + on strings, the performance is bad.
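For reference, a minimal sketch of that restriction. The variable names (words, sum_error, joined) are only illustrative; the point is that sum refuses a string start value while str.join does the job.

```python
words = ["a", "b", "c"]

# sum() rejects a str start value with a TypeError that points to join.
try:
    sum(words, "")
    sum_error = None
except TypeError as exc:
    sum_error = str(exc)

# The recommended way to concatenate strings.
joined = "".join(words)

print(sum_error)
print(joined)
```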
The "cannot use sum" restriction does not apply to list, though, even if itertools.chain.from_iterable is the preferred way to flatten a list of lists.
But sum(x, []), where x is a list of lists, is definitely bad.
But should it stay that way?
I compared 3 approaches:
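    import time
    import itertools

    a = [list(range(1, 1000)) for _ in range(1000)]

    # 1. sum() with an empty list as the start value
    start = time.time()
    sum(a, [])
    print(time.time() - start)

    # 2. itertools.chain.from_iterable
    start = time.time()
    list(itertools.chain.from_iterable(a))
    print(time.time() - start)

    # 3. explicit loop using in-place addition
    start = time.time()
    z = []
    for s in a:
        z += s
    print(time.time() - start)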
Results:
- sum() on the list of lists: 10.46647310256958. OK, we knew that.
- itertools.chain: 0.07705187797546387
- custom accumulated sum using in-place addition: 0.057044029235839844 (can even be faster than itertools.chain, as you can see)
So sum lags far behind because it does result = result + b instead of result += b.
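To make the difference concrete, here is a small sketch that roughly counts how many list elements each style has to write. The names (parts, copied_by_rebinding, copied_in_place) are made up for the illustration, and the count ignores how the list type over-allocates internally.

```python
parts = [[1, 2], [3], [4, 5, 6]]

# Style 1: result = result + b builds a brand-new list at every step,
# copying everything accumulated so far plus the new chunk.
result = []
fresh_lists = 0
copied_by_rebinding = 0
for b in parts:
    previous = result
    result = result + b
    if result is not previous:
        fresh_lists += 1
    copied_by_rebinding += len(result)

# Style 2: acc += b extends the same list object, so only the
# new chunk is written at each step.
acc = []
original = acc
copied_in_place = 0
for b in parts:
    acc += b
    copied_in_place += len(b)

print(fresh_lists, copied_by_rebinding)   # one new list per step
print(acc is original, copied_in_place)   # same object throughout
```

With n sublists of similar size, the first count grows quadratically with n while the second grows linearly, which matches the gap in the timings above.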
So now my question is:
Why can't sum use this accumulative approach when it is available?
(This would be transparent for existing applications and would make it possible to use the sum built-in to flatten lists efficiently.)
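What I have in mind would look roughly like the sketch below. inplace_sum is a hypothetical helper written for this question, not an existing function and not how the built-in is implemented. It assumes the start value can be shallow-copied, so that the caller's object is not modified by the in-place additions.

```python
import copy


def inplace_sum(iterable, start=0):
    """Hypothetical sum() variant that accumulates with += instead of +."""
    # Copy the start value first; otherwise a mutable start (such as a
    # list passed by the caller) would be modified in place.
    result = copy.copy(start)
    for item in iterable:
        # For lists this extends result in place; for immutable types
        # such as int it falls back to ordinary addition.
        result += item
    return result


start = []
flat = inplace_sum([[1, 2], [3], [4, 5, 6]], start)
total = inplace_sum([1, 2, 3])

print(flat)
print(start)
print(total)
```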
Jean-François Fabre