Python multiprocessing Pool vs. ThreadPool

I have a list of image paths that I want to split between processes or threads so that each worker handles part of the list. Processing includes loading an image from disk, performing some calculations, and returning the result. I am using Python 2.7's multiprocessing.Pool.

This is how I create the pool of workers:

    import glob
    import multiprocessing

    def ProcessParallel(classifier, path):
        files = glob.glob(path + "\*.png")
        files_sorted = sorted(files, key=lambda file_name: int(file_name.split('--')[1]))
        p = multiprocessing.Pool(processes=4, initializer=Initializer, initargs=(classifier,))
        data = p.map(LoadAndClassify, files_sorted)
        return data

The problem I encountered: when I log the initialization time in my Initializer function, I found that the workers are not initialized in parallel; each worker is initialized at an interval of about 5 seconds. Here are the logs for reference.

    2016-08-08 12:38:32,043 - custom_logging - INFO - Worker started
    2016-08-08 12:38:37,647 - custom_logging - INFO - Worker started
    2016-08-08 12:38:43,187 - custom_logging - INFO - Worker started
    2016-08-08 12:38:48,634 - custom_logging - INFO - Worker started

I tried using multiprocessing.pool.ThreadPool instead, and it launches the workers at the same time.
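For reference, the ThreadPool variant is a drop-in change (a minimal sketch; ProcessParallelThreads is just an illustrative name, and Initializer/LoadAndClassify are the same functions as above):

    import glob
    from multiprocessing.pool import ThreadPool

    def ProcessParallelThreads(classifier, path):
        files = glob.glob(path + "\*.png")
        files_sorted = sorted(files, key=lambda file_name: int(file_name.split('--')[1]))
        # ThreadPool exposes the same interface as multiprocessing.Pool,
        # but the workers are threads inside the current process.
        p = ThreadPool(processes=4, initializer=Initializer, initargs=(classifier,))
        data = p.map(LoadAndClassify, files_sorted)
        return data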
I know how multiprocessing works on Windows: we need to place a __main__ guard to protect our code from spawning endless processes. The problem in my case is that my script is hosted in IIS via FastCGI, so it is not the main script; it is executed by the FastCGI process (there is a wfastcgi.py script responsible for this). wfastcgi.py does have a main guard, and the logs show that I am not creating an infinite number of processes.
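For clarity, this is the kind of guard I mean (a generic sketch; main() is a placeholder, not wfastcgi.py's actual entry point):

    if __name__ == '__main__':
        # On Windows, child processes re-import the main module, so any
        # code that creates a Pool must sit behind this guard to avoid
        # spawning processes recursively.
        main()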

Now I want to know whether this is the reason the multiprocessing Pool does not create its workers at the same time. I would really appreciate any help.

EDIT 1: Here is my initializer function:

    def Initializer(classifier):
        global indexing_classifier
        logger.info('Worker started')
        indexing_classifier = classifier
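For context, the workers then read that global; a simplified sketch of what LoadAndClassify does (the real image loading and calculations are omitted, and classify() is a stand-in):

    def LoadAndClassify(file_name):
        # Runs inside a worker; indexing_classifier was set by Initializer.
        with open(file_name, 'rb') as f:
            image_bytes = f.read()
        # classify() is a stand-in for the actual calculations.
        return indexing_classifier.classify(image_bytes)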
1 answer

I had a lot of problems trying to do multiprocessing under CGI/WSGI; it works fine locally, but not on real web servers. Ultimately, it's just incompatible. If you need to do this kind of work, send asynchronous jobs to something like Celery.
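A minimal sketch of that approach (assuming a Redis broker; classify_image and run_classifier are illustrative names, not from the question):

    from celery import Celery

    app = Celery('tasks', broker='redis://localhost:6379/0')

    @app.task
    def classify_image(file_name):
        # The heavy image work runs in a Celery worker process,
        # entirely outside the FastCGI request handler.
        return run_classifier(file_name)  # stand-in for the real work

The web request then just calls classify_image.delay(file_name) and returns immediately; a separately started worker (celery -A tasks worker) does the actual processing.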

