Do I have to create a new pool object each time, or can I reuse one?

I am trying to understand best practices for Python's multiprocessing.Pool.

In my program, I use Pool.imap very often. Each time I run tasks in parallel, I create a new pool object and then close it after completion.
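A minimal sketch of the pattern described above, with a hypothetical `square` task standing in for the real work:

```python
from multiprocessing import Pool

def square(x):
    # Toy task standing in for the real parallel work.
    return x * x

def run_batch(items):
    # One pool per batch: created, used via imap, then closed
    # automatically by the context manager when the block exits.
    with Pool(processes=4) as pool:
        return list(pool.imap(square, items))

if __name__ == "__main__":
    print(run_batch([1, 2, 3]))  # [1, 4, 9]
```

The context manager calls terminate() on exit, so each batch pays the full cost of spawning and tearing down a fresh set of worker processes.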

I recently ran into a hang when the number of tasks submitted to the pool was smaller than the number of processes. Strangely, this only happened in my test pipeline, which runs many other things beforehand; running the test standalone did not trigger the hang. I suspect this is related to creating multiple pools.

I would really like to find some resources that will help me understand the best practices for using Python multiprocessing. In particular, I'm now trying to understand the implications of creating multiple pool objects versus using only one.

1 answer

When you create a pool of workers, new processes are spawned from the parent. This is a fairly fast operation, but it has its cost.

Therefore, unless you have a very good reason, for example the pool breaking because one worker died unexpectedly, it is best to keep reusing the same pool instance.
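A sketch of the reuse approach, assuming hypothetical `double`/`triple` tasks: one long-lived pool serves several rounds of imap calls, so the process-spawn cost is paid only once.

```python
from multiprocessing import Pool

def double(x):
    return 2 * x

def triple(x):
    return 3 * x

def run_rounds():
    # One long-lived pool reused for several rounds of work,
    # instead of creating and tearing down a pool per round.
    with Pool(processes=4) as pool:
        doubled = list(pool.imap(double, range(5)))
        tripled = list(pool.imap(triple, range(5)))
    return doubled, tripled

if __name__ == "__main__":
    print(run_rounds())
```

In a real program the pool would typically be created once at startup and passed to, or shared with, the code that needs it.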

As for the hang, it is hard to say why it happens without inspecting your code. You may not have cleaned up the previous instances properly (call close()/stop(), then join()). You may also have sent too much data through the pool's channels, which usually leads to deadlocks, and so on.
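Sloppy teardown is a common source of such hangs. A minimal sketch of explicit cleanup, using a hypothetical `work` task:

```python
from multiprocessing import Pool

def work(x):
    return x + 1

def run_and_clean(items):
    # Explicit teardown: close() stops new task submission,
    # join() waits for the worker processes to exit.
    # Skipping join() can leave workers behind and cause
    # intermittent hangs in later runs.
    pool = Pool(processes=2)
    try:
        results = list(pool.imap(work, items))
        pool.close()
    except Exception:
        pool.terminate()  # on error, kill the workers immediately
        raise
    finally:
        pool.join()  # always wait for the workers to go away
    return results
```

Note that join() raises a ValueError unless close() or terminate() has been called first, which is why both branches above settle the pool's state before the finally block runs.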

