Python multiprocessing pool: how can I find out when all the workers in the pool have finished?

I am running a multiprocessing pool in Python, where ~2000 tasks are mapped to 24 workers in the pool. Each task creates a file based on some data analysis and web services.

I want to start a new task once all tasks in the pool have completed. How can I find out when all processes in the pool have finished?

1 answer

You want to use the join method, which stops the main thread of the process from moving forward until all subprocesses are complete:

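join() blocks the calling thread until the process whose join() method is called terminates, or until the optional timeout elapses.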

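from multiprocessing import Process

def f(name):
    print('hello', name)

if __name__ == '__main__':
    processes = []
    for i in range(10):
        p = Process(target=f, args=('bob',))
        processes.append(p)

    # Start every process first so they run concurrently.
    for p in processes:
        p.start()

    # Then wait for each one; joining inside the start loop
    # would run the processes one at a time.
    for p in processes:
        p.join()

    # only get here once all processes have finished.
    print('finished!')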

EDIT:

To use join with a pool, call close() first and then join():

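from multiprocessing import Pool

if __name__ == '__main__':
    pool = Pool(processes=4)  # start 4 worker processes
    result = pool.apply_async(f, (10,))  # submit one task; f is the function defined above
    pool.close()  # no more tasks can be submitted after this
    pool.join()  # block at this line until all submitted tasks are done
    print("completed")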
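
For the situation in the question (many tasks, a fixed number of workers, then a follow-up step), the same close()/join() pattern can be combined with map_async. The following is only a minimal sketch: analyze and run_all are hypothetical names, and squaring a number stands in for the real per-task work of analysing data and writing a file.

```python
from multiprocessing import Pool


def analyze(task_id):
    # Hypothetical stand-in for the real per-task work
    # (data analysis, web service calls, writing a file).
    return task_id * task_id


def run_all(task_ids, workers=4):
    """Run every task in a pool and return only after all of them finished."""
    pool = Pool(processes=workers)
    # map_async returns immediately with an AsyncResult handle.
    async_result = pool.map_async(analyze, task_ids)
    pool.close()  # no further tasks will be submitted
    pool.join()   # blocks until every task is done and the workers have exited
    # After join(), get() returns the results without waiting.
    return async_result.get()


if __name__ == '__main__':
    # 2000 tasks and 24 workers, as in the question.
    results = run_all(range(2000), workers=24)
    print('all', len(results), 'tasks finished; start the follow-up task here')
```

If the main process has nothing else to do while the workers run, pool.map(analyze, task_ids) is a simpler option, since it blocks until all results are available.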