I have a Python program (specifically, a Django application) that starts a subprocess using subprocess.Popen. Due to architectural constraints of my application, I cannot use Popen.terminate() to end the subprocess or Popen.poll() to check whether it has finished, because I cannot keep a reference to the Popen object in a variable.
Instead, I write the process ID (pid) to a pidfile when the subprocess starts. When I want to stop the subprocess, I open the pidfile and call os.kill(pid, signal.SIGTERM).
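In outline, the idea looks like this (a simplified sketch with made-up arguments; as I explain below, in my real setup the pid in the pidfile is written by the crawler itself):

    import os
    import signal
    import subprocess

    # Start: spawn the subprocess and record a pid in a pidfile.
    p = subprocess.Popen(['python', 'manage.py', 'crawlwebpages',
                          '-n', '20', 'http://example.com'])  # example arguments
    with open('scrapy_crawler_process.pid', 'wb') as pidfile:
        pidfile.write(str(p.pid))

    # Stop (in a later request, where 'p' is no longer available):
    with open('scrapy_crawler_process.pid', 'rb') as pidfile:
        pid = int(pidfile.read().strip())
    os.kill(pid, signal.SIGTERM)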
My question is: how can I find out when the subprocess has really finished? With signal.SIGTERM it takes about 1-2 minutes after the os.kill() call before the process actually terminates. At first I thought os.waitpid() would be the right tool for this, but when I call it after os.kill() it gives me OSError: [Errno 10] No child processes.
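From what I can tell, os.waitpid() only works for direct children of the calling process, which may be the source of the error. A minimal sketch of the difference (assuming a Unix system; 'sleep' stands in for an arbitrary child process):

    import os
    import subprocess

    p = subprocess.Popen(['sleep', '5'])

    # Works: 'sleep' is a direct child of this process.
    pid, status = os.waitpid(p.pid, 0)

    # Raises OSError: [Errno 10] No child processes, because pid 1
    # (init) is not a child of this process.
    os.waitpid(1, os.WNOHANG)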
By the way, I start and stop the subprocess from an HTML template using two forms, and the program logic lives in a Django view. The exception is shown in my browser because my application runs in debug mode. It is probably also important to know that the subprocess I call in my view (python manage.py crawlwebpages) itself starts another subprocess, namely an instance of the Scrapy crawler. I write the pid of this Scrapy instance to the pidfile, and this is the process I want to terminate.
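For illustration, the pid-writing step looks roughly like this from inside the crawler process (a hypothetical sketch, not my actual crawlwebpages code):

    # Hypothetical sketch, not my actual code. The process tree is:
    #   Django view -> python manage.py crawlwebpages -> Scrapy crawler
    # The Scrapy process records its own pid, so from the Django view's
    # perspective it is a grandchild, not a direct child.
    import os

    with open('scrapy_crawler_process.pid', 'wb') as pidfile:
        pidfile.write(str(os.getpid()))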
Here is the relevant code:
    def process_main_page_forms(request):
        if request.method == 'POST':
            if request.POST['form-type'] == u'webpage-crawler-form':
                template_context = _crawl_webpage(request)
            elif request.POST['form-type'] == u'stop-crawler-form':
                template_context = _stop_crawler(request)
        else:
            template_context = {
                'webpage_crawler_form': WebPageCrawlerForm(),
                'stop_crawler_form': StopCrawlerForm()}

        return render(request, 'main.html', template_context)

    def _crawl_webpage(request):
        webpage_crawler_form = WebPageCrawlerForm(request.POST)

        if webpage_crawler_form.is_valid():
            url_to_crawl = webpage_crawler_form.cleaned_data['url_to_crawl']
            maximum_pages_to_crawl = webpage_crawler_form.cleaned_data['maximum_pages_to_crawl']

            program = 'python manage.py crawlwebpages' + ' -n ' + str(maximum_pages_to_crawl) + ' ' + url_to_crawl
            p = subprocess.Popen(program.split())

        template_context = {
            'webpage_crawler_form': webpage_crawler_form,
            'stop_crawler_form': StopCrawlerForm()}

        return template_context

    def _stop_crawler(request):
        stop_crawler_form = StopCrawlerForm(request.POST)

        if stop_crawler_form.is_valid():
            with open('scrapy_crawler_process.pid', 'rb') as pidfile:
                process_id = int(pidfile.read().strip())
                print 'PROCESS ID:', process_id

            os.kill(process_id, signal.SIGTERM)
            os.waitpid(process_id, os.WNOHANG)  # this line raises the OSError
What can I do? Thank you very much!
EDIT:
Thanks to the excellent answer given by Jacek Konieczny, I was able to solve my problem by changing the code of my _stop_crawler(request) function to the following:
    def _stop_crawler(request):
        stop_crawler_form = StopCrawlerForm(request.POST)

        if stop_crawler_form.is_valid():
            with open('scrapy_crawler_process.pid', 'rb') as pidfile:
                process_id = int(pidfile.read().strip())

            # Send SIGTERM, then poll with signal 0 until the process
            # no longer exists (needs 'import time' at module level).
            os.kill(process_id, signal.SIGTERM)
            while True:
                try:
                    time.sleep(10)
                    os.kill(process_id, 0)
                except OSError:
                    break

            print 'Crawler process terminated!'
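As I understand it, signal 0 does not actually deliver a signal: os.kill(process_id, 0) only performs the existence and permission checks, so it raises OSError as soon as the process is gone. Unlike os.waitpid(), this also works for processes that are not direct children.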