I am developing a tool that analyzes huge files. To speed it up, I introduced multiprocessing, and everything seems to be working fine. I use multiprocessing.Pool, creating N worker processes that handle the various pieces of work I created beforehand.
    pool = Pool(processes=params.nthreads)
    for chunk in chunk_list:
        pool.apply_async(__parallel_quant, [filelist, chunk, outfilename])
    pool.close()
    pool.join()
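For reference, here is a minimal self-contained version of the same pattern with a placeholder worker (the real __parallel_quant is part of my tool, so the worker body and the inputs here are just stand-ins):

```python
from multiprocessing import Pool

def parallel_quant(filelist, chunk, outfilename):
    # Placeholder worker: the real function processes one chunk of the files.
    return (chunk, len(filelist))

if __name__ == "__main__":
    filelist = ["a.dat", "b.dat"]   # stand-in inputs
    chunk_list = [0, 1, 2]
    outfilename = "out.txt"

    pool = Pool(processes=4)
    # Keep the AsyncResult handles so we can retrieve return values
    # (and any exceptions raised in the children) after join().
    results = [pool.apply_async(parallel_quant, (filelist, chunk, outfilename))
               for chunk in chunk_list]
    pool.close()
    pool.join()
    print([r.get() for r in results])
```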
As you can see, this is standard Pool usage, nothing special.
Recently I hit a problem when running a really large amount of data. A standard run takes about 2 hours with 16 workers, but I have a special case that takes about 8 hours because of its really large number of files and their size.
The problem is that when I execute this case, the run proceeds fine until the very end: most of the children finish properly, but one gets stuck in
<built-in method recv of _multiprocessing.Connection object at remote 0x3698db0>
Since this child never finishes, the parent never wakes up and the whole execution hangs.
This only happens when the input files are very large, so I was wondering whether there is some default timeout that could be causing the problem.
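If there is no built-in timeout, I assume I could impose one myself by keeping the AsyncResult objects and calling get() with an explicit timeout, so a stuck child raises TimeoutError in the parent instead of blocking forever in recv. A sketch of that idea, with a dummy worker standing in for my real function (I am not sure this addresses the root cause, only the symptom):

```python
from multiprocessing import Pool, TimeoutError

def worker(chunk):
    # Stand-in for the real __parallel_quant worker.
    return chunk * 2

if __name__ == "__main__":
    pool = Pool(processes=4)
    results = [pool.apply_async(worker, (c,)) for c in range(3)]
    pool.close()

    out = []
    for r in results:
        try:
            # get(timeout=...) raises TimeoutError instead of
            # blocking indefinitely on Connection.recv.
            out.append(r.get(timeout=600))
        except TimeoutError:
            out.append(None)

    pool.join()
    print(out)  # [0, 2, 4]
```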
I am using Python 2.7 with multiprocessing 0.70a1, and my machine runs CentOS 7 (32 cores, 64 GB of RAM).
Thank you in advance for your help.
Jordi