Iterate over a single iterator in parallel in Python

The goal is to perform calculations on a single iter in parallel, using the builtin sum & map functions concurrently. Perhaps using (for example) itertools instead of classic for loops to analyze (LARGE) data that arrives through an iterator...

In one simple example, I want to calculate ilen, sum_x & sum_x_sq:

ilen,sum_x,sum_x_sq=iterlen(iter),sum(iter),sum(map(lambda x:x*x, iter))

But without converting the (large) iter to a list (as with iter=list(iter))

n.b. Do this with sum & map and without for loops, perhaps using the itertools and/or threading modules?

import random

def example_large_data(n=100000000, mean=0, std_dev=1):
  for i in range(n): yield random.gauss(mean,std_dev)

-- edit --

To be VERY specific: I took a good look at itertools, hoping that there was a dual function like map that could do this. For example: len_x, sum_x, sum_x_sq = itertools.iterfork(iter_x, iterlen, sum, sum_sq)

Put even more specifically: I am looking for Python source code for an "iterfork" function.


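You can use itertools.tee to turn your single iterator into three iterators, which you can then pass to your three functions.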

import itertools
iter0, iter1, iter2 = itertools.tee(input_iter, 3)
# count() stands for an iterator-friendly len, like iterlen in the question
ilen, sum_x, sum_x_sq = count(iter0), sum(iter1), sum(map(lambda x: x*x, iter2))

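That will work, but the builtin sum (and map in Python 2) is not implemented in a way that supports parallel iteration. The first function you call consumes its iterator completely, then the second consumes the second iterator, and then the third consumes the third. Since tee has to store every value that one of its output iterators has seen but the others have not, this is essentially the same as building a list from the iterator and passing it to each function.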
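A small sketch of why tee combined with sum does not iterate in parallel. The logged_source helper and the events list are hypothetical names introduced only for this illustration; they record the order in which values are pulled from the underlying source.

```python
import itertools

events = []  # records the order in which things happen

def logged_source(n):
    """Hypothetical helper: yields 0..n-1 and logs each pull from the source."""
    for i in range(n):
        events.append(("pull", i))
        yield i

iter0, iter1 = itertools.tee(logged_source(3), 2)

total = sum(iter0)  # drains the source completely; tee buffers values for iter1
events.append(("first sum done", total))

total_sq = sum(map(lambda x: x * x, iter1))  # replays the values buffered by tee
events.append(("second sum done", total_sq))
```

Every pull is logged before the first sum finishes, so by the time the second sum starts, tee is holding the whole input in memory.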

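Now, if you use generator functions that consume only a single value from their input for each value they output, you may be able to make parallel iteration work using zip. In Python 3, map and zip are both lazy iterators. The question is how to turn sum into a generator.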

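I think you can get pretty much what you want with itertools.accumulate (added in Python 3.2). It is a generator that yields a running sum of its input. Here is how it could work for your problem (I am assuming your count function was meant to be an iterator-friendly version of len):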

import itertools

iter0, iter1, iter2 = itertools.tee(input_iter, 3)

len_gen = itertools.accumulate(map(lambda x: 1, iter0))       # running count
sum_gen = itertools.accumulate(iter1)                         # running sum
sum_sq_gen = itertools.accumulate(map(lambda x: x*x, iter2))  # running sum of squares

parallel_gen = zip(len_gen, sum_gen, sum_sq_gen)  # zip is lazy in Python 3

for ilen, sum_x, sum_x_sq in parallel_gen:
    pass    # the generators do all the work, so there is nothing for us to do here

# ilen, sum_x, sum_x_sq have the right values here (if input_iter was not empty)!
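Since the question asks to avoid for loops, here is a hedged variant of the same idea that swaps the empty loop for collections.deque with maxlen=1, which consumes an iterator while keeping only its last item. The name parallel_stats and the (0, 0, 0) result for empty input are assumptions made for this sketch, not something the question specifies.

```python
import collections
import itertools

def parallel_stats(iterable):
    """Hypothetical helper: (count, sum, sum of squares) in a single pass."""
    iter0, iter1, iter2 = itertools.tee(iterable, 3)
    running = zip(
        itertools.accumulate(map(lambda x: 1, iter0)),      # running count
        itertools.accumulate(iter1),                        # running sum
        itertools.accumulate(map(lambda x: x * x, iter2)),  # running sum of squares
    )
    # A deque with maxlen=1 drains the iterator but keeps only the last tuple.
    last = collections.deque(running, maxlen=1)
    return last[0] if last else (0, 0, 0)  # empty-input result is an assumption

stats = parallel_stats(iter([1, 2, 3, 4]))
```

Because zip advances the three tee outputs in lockstep, tee only ever has to buffer the most recent value rather than the whole input.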

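If you are using Python 2 rather than 3, you will have to write your own accumulate generator function (the itertools documentation shows a roughly equivalent pure-Python implementation) and use itertools.imap and itertools.izip instead of the builtin map and zip.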
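A minimal sketch of such a hand-written accumulate, assuming only running sums are needed (the real itertools.accumulate also accepts an optional binary function). It is written so that it should run unchanged on Python 2.6+ and Python 3.

```python
def accumulate(iterable):
    """Yield running sums of iterable (addition only)."""
    it = iter(iterable)
    try:
        total = next(it)
    except StopIteration:
        return  # empty input: yield nothing
    yield total
    for element in it:
        total = total + element
        yield total

running = list(accumulate([1, 2, 3, 4]))
```

On Python 2 this generator would be combined with itertools.imap and itertools.izip in place of map and zip, so that all three running totals stay lazy.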
