I have a list of image paths that I want to split between processes or threads, so that each worker handles part of the list. Processing involves loading an image from disk, performing some calculations, and returning the result. I am using multiprocessing.Pool on Python 2.7.
This is how I create the pool of workers:
    import glob
    import multiprocessing

    def ProcessParallel(classifier, path):
        files = glob.glob(path + "\*.png")
        # Sort by the numeric token between the '--' separators in the file name
        files_sorted = sorted(files, key=lambda file_name: int(file_name.split('--')[1]))
        # Four worker processes, each initialized once with the shared classifier
        p = multiprocessing.Pool(processes=4, initializer=Initializer, initargs=(classifier,))
        data = p.map(LoadAndClassify, files_sorted)
        return data
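LoadAndClassify itself is not important for the question; as a rough sketch only (the open() call stands in for the real image loading, and classify() is a hypothetical method of my classifier):

    def LoadAndClassify(file_name):
        # indexing_classifier is the module-level global that Initializer
        # sets in every worker (see EDIT 1 below)
        with open(file_name, 'rb') as f:   # stands in for the real image loading
            image_bytes = f.read()
        # stands in for the real calculations; classify() is hypothetical
        return indexing_classifier.classify(image_bytes)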
The problem I encountered: when I log the initialization time in my Initializer function, I can see that the workers are not initialized in parallel; each worker starts roughly 5 seconds after the previous one. Here are the logs for reference:
    2016-08-08 12:38:32,043 - custom_logging - INFO - Worker started
    2016-08-08 12:38:37,647 - custom_logging - INFO - Worker started
    2016-08-08 12:38:43,187 - custom_logging - INFO - Worker started
    2016-08-08 12:38:48,634 - custom_logging - INFO - Worker started
I tried using multiprocessing.pool.ThreadPool instead, and it does launch all workers at the same time.
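For reference, switching to the thread pool only requires swapping the pool class; a minimal sketch, assuming the same Initializer and LoadAndClassify as above:

    from multiprocessing.pool import ThreadPool

    # Worker threads share the interpreter process, so there is no per-worker
    # process spawn and all four initializers run right away.
    p = ThreadPool(processes=4, initializer=Initializer, initargs=(classifier,))
    data = p.map(LoadAndClassify, files_sorted)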
I know how multiprocessing works on Windows: we need an if __name__ == '__main__' guard to protect our code from spawning processes endlessly. The catch in my case is that the script is hosted in IIS via FastCGI, so it is not the main script; it is executed by the FastCGI process (a wfastcgi.py script is responsible for this). Now, wfastcgi.py does have a main guard, and my logs show that I am not creating an infinite number of processes.
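For comparison, this is the guard a stand-alone Windows script would need; in my setup that role is played by wfastcgi.py rather than by my own module (minimal self-contained sketch):

    import multiprocessing

    def main():
        # Pool creation must happen only inside the guarded block below
        p = multiprocessing.Pool(processes=4)
        print(p.map(abs, [-1, -2, -3]))

    if __name__ == '__main__':
        # On Windows there is no fork(): each child re-imports this module,
        # so an unguarded Pool() at import time would spawn workers recursively.
        main()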
Now I want to know whether this is the reason the multiprocessing Pool does not create its workers at the same time. I would really appreciate any help.
EDIT 1: Here is my Initializer function:

    def Initializer(classifier):
        global indexing_classifier
        logger.info('Worker started')
        indexing_classifier = classifier