I need to start a pdftk process while serving a web request in Django and wait for it to finish. My current pdftk code is as follows:
proc = subprocess.Popen(["/usr/bin/pdftk", "/tmp/infile1.pdf", "/tmp/infile2.pdf", "cat", "output", "/tmp/outfile.pdf"])
proc.communicate()
This works fine under the dev server (running as the www-data user). But as soon as I switch to mod_wsgi, changing nothing else, the code hangs on proc.communicate(), and "outfile.pdf" is left behind as an open file descriptor.
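To keep a stalled pdftk from blocking the request forever while investigating, the wait can be bounded. The following is only a minimal sketch under stated assumptions: `run_with_deadline`, `timeout`, and `poll_interval` are hypothetical names, the 30-second default is arbitrary, and the loop polls instead of using a `timeout` argument so that it also runs on Python 2.6/2.7. It does not address why the process stalls under mod_wsgi; it only limits the damage and captures whatever the child printed.

```python
import os
import subprocess
import sys
import tempfile
import time


def run_with_deadline(args, timeout=30.0, poll_interval=0.1):
    """Run args without a shell and wait at most `timeout` seconds.

    Returns (returncode, output). returncode is None if the child had
    to be killed because the deadline passed.
    """
    log = tempfile.TemporaryFile()      # collects stdout and stderr
    devnull = open(os.devnull, "rb")    # child must not inherit the server's stdin
    try:
        proc = subprocess.Popen(
            args, stdin=devnull, stdout=log, stderr=subprocess.STDOUT
        )
        deadline = time.time() + timeout
        timed_out = False
        while proc.poll() is None:      # None means "still running"
            if time.time() >= deadline:
                proc.kill()             # SIGKILL on POSIX
                proc.wait()             # reap the child to avoid a zombie
                timed_out = True
                break
            time.sleep(poll_interval)
        log.seek(0)
        output = log.read()
    finally:
        devnull.close()
        log.close()
    return (None if timed_out else proc.returncode), output


# Stand-in command for illustration; the real call would pass the pdftk
# argument list from the question instead.
code, output = run_with_deadline([sys.executable, "-c", "print('done')"])
```

Writing the child's output to a temporary file rather than a pipe avoids the case where the child blocks on a full pipe buffer while the parent is only polling.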
I tried several ways of calling the subprocess (including plain old os.system). Setting stdin / stdout / stderr to PIPE or to various file descriptors changes nothing. Using "shell=True" stops proc.communicate() from hanging, but then pdftk fails to create the output file under both the dev server and mod_wsgi. This discussion seems to indicate that there might be some deeper voodoo involving OS signals and pdftk that I don't understand.
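One possible explanation for the "shell=True" result, assuming the argument list was passed unchanged: on POSIX, when `args` is a list and `shell=True`, only the first item is run as the command and the remaining items become arguments to the shell itself. The program would then start with no arguments at all, which would account for an immediate return and no output file. The sketch below shows the effect using `echo` as a stand-in for pdftk (an assumption made purely for illustration; `run` is a hypothetical helper).

```python
import subprocess


def run(args, **kwargs):
    """Run a command and return (returncode, stdout, stderr)."""
    proc = subprocess.Popen(
        args, stdout=subprocess.PIPE, stderr=subprocess.PIPE, **kwargs
    )
    out, err = proc.communicate()
    return proc.returncode, out, err


# List without a shell: every item reaches the program.
direct = run(["echo", "hello"])

# List with shell=True on POSIX: this executes `/bin/sh -c echo hello`,
# so "hello" becomes the shell's $0 and echo receives no arguments.
via_shell = run(["echo", "hello"], shell=True)

# Single string with shell=True: the shell parses the whole command line.
via_string = run("echo hello", shell=True)
```

Here `direct` and `via_string` capture "hello", while `via_shell` captures only an empty line. Capturing stderr and the return code in the same way for the real pdftk call would show whether it is complaining about missing arguments.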
Are there any workarounds to get the subprocess call to work correctly under WSGI? I am avoiding PyPDF for merging the PDF files because I need to merge a fairly large number of files (several hundred), and it runs out of memory (PyPDF has to keep every source PDF file open in memory while combining them).
I am doing this on a recent Ubuntu, with Python 2.6 and 2.7.
python django subprocess pdftk mod-wsgi
user85461 Sep 25 '11 at 3:26