You are right, they are executed sequentially in your example.
p.join()
blocks the calling process until p finishes. You either want to join your processes separately, outside the for loop (for example, by storing them in a list and then iterating over it), or use something like multiprocessing.Pool
with apply_async
and a callback. That also lets you add each value directly into your result, rather than keeping the worker objects around yourself.
For example:
```python
from multiprocessing import Pool

import numpy as np

def f(i):
    return i * np.identity(4)

if __name__ == '__main__':
    result = np.zeros((4, 4))

    def adder(value):
        global result
        result += value

    p = Pool(5)
    for i in range(30):
        p.apply_async(f, args=(i,), callback=adder)
    p.close()
    p.join()
    print(result)
```
Closing and then joining the pool at the end ensures that the pool's worker processes finish and that the result
object holds the completed sum. You could also explore using Pool.imap
as a solution to your problem. For this particular problem it would look something like this:
```python
# Reuses f and the imports from the example above.
if __name__ == '__main__':
    p = Pool(5)
    result = np.zeros((4, 4))
    for x in p.imap_unordered(f, range(30), chunksize=5):
        result += x
    p.close()
    p.join()
    print(result)
```
This is cleaner for your specific situation, but it may not suit whatever you end up doing with it eventually.
As for saving all of your varied results: if I understand your question, you can simply add them into the result in the callback method (as above) or item-at-a-time using imap
/ imap_unordered
(which still holds the results internally, but they are cleared as you consume them). Then nothing needs to be stored longer than it takes to add it into the result.
David H. Clements