I have a Python script that executes URL requests using urllib2. I use a pool of 5 processes that run asynchronously and each perform a function. This function makes the URL calls, fetches the data, parses it into the required format, performs calculations, and inserts the data. The amount of data varies with each request URL.
I run this script every 5 minutes using a cron job. Sometimes when I do ps -ef | grep python, I see stuck processes. Is there a way, using the multiprocessing module, to keep track of the processes and their state, i.e. whether each one has completed, is stuck, or has died? Here is a code snippet:
This is how I submit the asynchronous tasks:
from multiprocessing import Pool

pool = Pool(processes=5)
pool.apply_async(getData)
And here is the part of getData that executes the urllib2 requests:
import sys
import urllib2
from urllib2 import HTTPError, URLError

try:
    Url = "http://gotodatasite.com"
    data = urllib2.urlopen(Url).read().split('\n')
except HTTPError, e:
    print "Error:", e.code      # HTTP status code returned by the server
    sys.exit(0)
except URLError, e:
    print "Error:", e.reason    # connection-level failure, no status code
    sys.exit(0)
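One idea I had, though I have not tried it yet, is to pass a timeout to urlopen so that a hung read raises an error instead of blocking the worker forever; the 30-second value below is an arbitrary guess, and I am not sure this covers every way a process can get stuck:

# fail the request instead of hanging if the server stops responding (30s is arbitrary)
data = urllib2.urlopen(Url, timeout=30).read().split('\n')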
Is there a way to detect stuck processes and re-run them?
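To make that concrete, what I am imagining is something like the sketch below: it keeps the AsyncResult handles that apply_async returns and uses get() with a timeout to classify each task. The 120-second deadline, the number of tasks, and the labels are all placeholders, and I do not know if this is the right approach:

from multiprocessing import Pool, TimeoutError

pool = Pool(processes=5)

# keep the handle that apply_async returns for every submitted task
handles = [pool.apply_async(getData) for _ in range(5)]
pool.close()

for i, handle in enumerate(handles):
    try:
        handle.get(timeout=120)           # wait at most 120 seconds (placeholder)
        print "task", i, "finished"
    except TimeoutError:
        print "task", i, "still running"  # presumably stuck; could resubmit it here
    except Exception, e:
        print "task", i, "died:", e       # the worker raised an exception

Is something along these lines the standard way to do this, or does multiprocessing offer a better mechanism for tracking worker state and retrying stuck tasks?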