How do I know when a subprocess has ended after using os.kill()?

I have a Python program (specifically, a Django application) that starts a subprocess using subprocess.Popen. Due to architectural limitations of my application, I cannot keep a reference to the Popen object in a variable, so I cannot use Popen.terminate() to end the subprocess or Popen.poll() to check whether it has finished.

Instead, I write the process ID (pid) to a pidfile when the subprocess starts. When I want to stop the subprocess, I open this pidfile and use os.kill(pid, signal.SIGTERM) to stop it.
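In outline, the start/stop flow looks like this (a sketch using sleep as a stand-in for the actual crawler command; the pidfile name matches the code below):

```python
import os
import signal
import subprocess

# On "start": launch the subprocess and record its pid in a pidfile.
child = subprocess.Popen(['sleep', '1000'])
with open('scrapy_crawler_process.pid', 'w') as pidfile:
    pidfile.write(str(child.pid))

# Later, on "stop" (possibly from a different request): read the pid
# back from the pidfile and send SIGTERM to that process.
with open('scrapy_crawler_process.pid') as pidfile:
    pid = int(pidfile.read().strip())
os.kill(pid, signal.SIGTERM)
```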

My question is: how can I find out when the subprocess has really finished? With signal.SIGTERM it takes about 1-2 minutes until the process finally exits after calling os.kill(). At first I thought os.waitpid() would be the right tool for this, but when I call it after os.kill() it gives me OSError: [Errno 10] No child processes.

By the way, I start and stop the subprocess from an HTML template with two forms, and the program logic sits in a Django view. The exception is shown in my browser when my application runs in debug mode. It is probably also important to know that the subprocess I call in my view (python manage.py crawlwebpages) itself starts another subprocess, namely an instance of the Scrapy crawler. I write the pid of this Scrapy instance to the pidfile, and that is the process I want to terminate.

Here is the relevant code:

    def process_main_page_forms(request):
        if request.method == 'POST':
            if request.POST['form-type'] == u'webpage-crawler-form':
                template_context = _crawl_webpage(request)
            elif request.POST['form-type'] == u'stop-crawler-form':
                template_context = _stop_crawler(request)
        else:
            template_context = {
                'webpage_crawler_form': WebPageCrawlerForm(),
                'stop_crawler_form': StopCrawlerForm()}
        return render(request, 'main.html', template_context)

    def _crawl_webpage(request):
        webpage_crawler_form = WebPageCrawlerForm(request.POST)
        if webpage_crawler_form.is_valid():
            url_to_crawl = webpage_crawler_form.cleaned_data['url_to_crawl']
            maximum_pages_to_crawl = webpage_crawler_form.cleaned_data['maximum_pages_to_crawl']
            program = 'python manage.py crawlwebpages' + ' -n ' + str(maximum_pages_to_crawl) + ' ' + url_to_crawl
            p = subprocess.Popen(program.split())
        template_context = {
            'webpage_crawler_form': webpage_crawler_form,
            'stop_crawler_form': StopCrawlerForm()}
        return template_context

    def _stop_crawler(request):
        stop_crawler_form = StopCrawlerForm(request.POST)
        if stop_crawler_form.is_valid():
            with open('scrapy_crawler_process.pid', 'rb') as pidfile:
                process_id = int(pidfile.read().strip())
            print 'PROCESS ID:', process_id
            os.kill(process_id, signal.SIGTERM)
            os.waitpid(process_id, os.WNOHANG)  # This gives me the OSError
            print 'Crawler process terminated!'
        template_context = {
            'webpage_crawler_form': WebPageCrawlerForm(),
            'stop_crawler_form': stop_crawler_form}
        return template_context

What can I do? Thank you very much!

EDIT:

Following the excellent answer given by Jacek Konieczny, I could solve my problem by changing the code of my _stop_crawler(request) function to the following:

    def _stop_crawler(request):
        stop_crawler_form = StopCrawlerForm(request.POST)
        if stop_crawler_form.is_valid():
            with open('scrapy_crawler_process.pid', 'rb') as pidfile:
                process_id = int(pidfile.read().strip())
            # These are the essential lines
            os.kill(process_id, signal.SIGTERM)
            while True:
                try:
                    time.sleep(10)
                    os.kill(process_id, 0)
                except OSError:
                    break
            print 'Crawler process terminated!'
        template_context = {
            'webpage_crawler_form': WebPageCrawlerForm(),
            'stop_crawler_form': stop_crawler_form}
        return template_context
2 answers

The usual way to check whether a process is still running is to kill() it with signal 0. That signal does nothing to a running process, but kill() raises an OSError exception with errno=ESRCH if the process does not exist.

    [jajcus@lolek ~]$ sleep 1000 &
    [1] 2405
    [jajcus@lolek ~]$ python
    Python 2.7.3 (default, May 11 2012, 11:57:22)
    [GCC 4.6.3 20120315 (release)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import os
    >>> os.kill(2405, 0)
    >>> os.kill(2405, 15)
    >>> os.kill(2405, 0)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    OSError: [Errno 3] No such process
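The same check can be wrapped in a small helper (a sketch; checking errno distinguishes "no such process" from "operation not permitted", since EPERM still means the process exists):

```python
import errno
import os

def process_exists(pid):
    """Return True if a process with the given pid currently exists."""
    try:
        os.kill(pid, 0)  # signal 0 performs error checking only
    except OSError as e:
        if e.errno == errno.ESRCH:   # no such process
            return False
        if e.errno == errno.EPERM:   # exists, but owned by another user
            return True
        raise
    return True
```

The polling loop in the question's EDIT is exactly this check, repeated with a sleep in between until the OSError is raised.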

But if possible, the caller should remain the parent of the called process and use the wait() family of functions to handle its termination. That is what the Popen object does for you.
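For comparison, the parent-process approach looks roughly like this (a sketch assuming the Popen object can be kept in a variable, which is exactly what the asker cannot do):

```python
import signal
import subprocess

# The parent keeps the Popen object, so it can reap the child itself.
p = subprocess.Popen(['sleep', '1000'])

p.terminate()          # sends SIGTERM to the child
returncode = p.wait()  # blocks until the child has actually exited

# On POSIX a negative return code means "killed by that signal number",
# so SIGTERM shows up here as -15.
assert returncode == -signal.SIGTERM
```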


My solution would be to introduce an intermediate process that controls the subprocess.

Your web requests (which seem to be handled in different processes, due to parallelization?) would tell the control process to launch the program and watch it; whenever needed, they would ask it for the status.

In the simplest case this control process opens a UNIX domain socket (or a TCP/IP socket) and listens on it. A "web process" connects to it, sends a start request and receives a unique identifier back. Afterwards it can use this identifier for further requests about the new process.

Alternatively, the web process could supply the identifier itself (or no identifier at all, if there can be only one crawler process), so that it does not need to store an identifier anywhere.
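A minimal sketch of such a control process, with an invented line-based protocol ("start &lt;command...&gt;" and "status &lt;identifier&gt;"); the socket path and message format are illustration only, and the server is shown as a thread for compactness:

```python
import os
import socket
import subprocess
import threading

class ControlServer(threading.Thread):
    """Intermediate process (sketched here as a thread) that owns the
    subprocesses it starts, so it can use poll()/wait() on them."""

    def __init__(self, socket_path):
        super().__init__(daemon=True)
        self.jobs = {}    # identifier -> Popen object
        self.next_id = 0
        if os.path.exists(socket_path):
            os.unlink(socket_path)
        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.sock.bind(socket_path)
        self.sock.listen(1)

    def run(self):
        while True:
            conn, _ = self.sock.accept()
            with conn:
                words = conn.recv(1024).decode().split()
                if words[0] == 'start':     # "start <command...>"
                    self.next_id += 1
                    self.jobs[self.next_id] = subprocess.Popen(words[1:])
                    conn.sendall(str(self.next_id).encode())
                elif words[0] == 'status':  # "status <identifier>"
                    job = self.jobs[int(words[1])]
                    state = 'running' if job.poll() is None else 'finished'
                    conn.sendall(state.encode())

def ask(socket_path, message):
    """What a 'web process' would do: connect, send a request, read the reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(socket_path)
        client.sendall(message.encode())
        return client.recv(1024).decode()
```

A web process would then call something like ask(path, 'start python manage.py crawlwebpages ...') once, keep the returned identifier, and later call ask(path, 'status ' + job_id) from any request that needs to know whether the crawler is still running.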

