Why does communicate deadlock when used with multiple Popen subprocesses?

Question

The following issue does not occur in Python 2.7.3. However, it occurs with both Python 2.7.1 and Python 2.6 on my machine (64-bit Mac OS X 10.7.3). This is code that I will eventually distribute, so I would like to know whether there is a way to accomplish this task that does not depend so heavily on the Python version.

I need to open several subprocesses in parallel and write STDIN data to each of them. I usually do this with the Popen.communicate method. However, communicate deadlocks whenever I have multiple processes open at the same time.

    import subprocess

    cmd = ["grep", "hello"]

    processes = [subprocess.Popen(cmd, stdin=subprocess.PIPE,
                                  stdout=subprocess.PIPE, stderr=subprocess.PIPE)
                 for _ in range(2)]

    for p in processes:
        print p.communicate("hello world\ngoodbye world\n")

If I change the number of processes to one (for _ in range(1)), the output is just as expected:

    ('hello world\n', '')

However, when there are two processes (for _ in range(2)), the script blocks indefinitely. I tried the alternative of writing to stdin manually:

    for p in processes:
        p.stdin.write("hello world\ngoodbye world\n")

But then any attempt to read from the processes (p.stdout.read(), for example) still blocks.

At first this seemed to be related, but it states that the problem occurs when multiple threads are used, and that the deadlock happens only very rarely (whereas here it always happens). Is there a way to get this to work with versions of Python prior to 2.7.3?

David robinson Jan 30 '13 at 10:57

1 answer

John hazen · Accepted Answer · 2013-01-31T01:32:18+0000

I had to investigate this a bit. (I once ran into a similar problem, so I thought I knew the answer, but I was wrong.)

The problem (and the fix for 2.7.3) is described here:

http://bugs.python.org/issue12786

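The problem is that the PIPEs are inherited by subprocesses created later. The second child holds a copy of the write end of the first child's stdin pipe, so the first child never sees EOF when the parent closes its own end, and communicate waits forever. The answer is to pass close_fds=True in your Popen call.

    processes = [subprocess.Popen(cmd, stdin=subprocess.PIPE,
                                  stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                                  close_fds=True)
                 for _ in range(2)]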
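For reference, here is a minimal self-contained sketch of the close_fds=True approach. It makes a few assumptions that are not in the original post: FILTER_CMD is a hypothetical stand-in for ["grep", "hello"] (a small Python filter, so the sketch does not depend on grep being installed), run_all is an illustrative helper name, the input is passed as bytes so the same code runs on Python 2 and Python 3, and a POSIX system is assumed.

```python
import subprocess
import sys

# Hypothetical stand-in for ["grep", "hello"]: echoes lines containing "hello".
FILTER_CMD = [
    sys.executable, "-c",
    "import sys\n"
    "for line in sys.stdin:\n"
    "    if 'hello' in line:\n"
    "        sys.stdout.write(line)\n",
]


def run_all(cmd, data, count=2):
    """Start `count` copies of cmd, feed each one `data`, return (stdout, stderr) pairs."""
    processes = [
        subprocess.Popen(
            cmd,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            # Keep later children from inheriting the pipe ends
            # that belong to earlier children.
            close_fds=True,
        )
        for _ in range(count)
    ]
    # Creation order is fine here because no child holds another child's pipes.
    return [p.communicate(data) for p in processes]


results = run_all(FILTER_CMD, b"hello world\ngoodbye world\n")
for out, err in results:
    print(out)
```

Each child should receive EOF on stdin as soon as the parent closes its end, so communicate can return for every process in turn.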

If that causes problems with other file descriptors that you want to keep open (in case this was a simplified example), it turns out that you can wait()/communicate() with the subprocesses in the reverse of the order in which they were created, and it seems to work.

i.e. instead of:

    for p in processes:
        print p.communicate("hello world\ngoodbye world\n")

use:

    while processes:
        print processes.pop().communicate("hello world\ngoodbye world\n")

(Or, I suppose, just call processes.reverse() before your existing loop.)
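The reverse-order workaround can be sketched as a small helper that still returns results in creation order. This is only an illustration under assumptions: communicate_in_reverse is a hypothetical helper name, FILTER_CMD is a stand-in for ["grep", "hello"], and the input is bytes so the code runs on both Python 2 and Python 3. The idea is that the most recently created child has no younger sibling holding copies of its pipes, so it can finish first; once it exits, its inherited copies of the older children's pipes are closed too.

```python
import subprocess
import sys

# Hypothetical stand-in for ["grep", "hello"]: echoes lines containing "hello".
FILTER_CMD = [
    sys.executable, "-c",
    "import sys\n"
    "for line in sys.stdin:\n"
    "    if 'hello' in line:\n"
    "        sys.stdout.write(line)\n",
]


def communicate_in_reverse(processes, data):
    """Call communicate() newest-first; return results in creation order."""
    results = [None] * len(processes)
    for index in reversed(range(len(processes))):
        results[index] = processes[index].communicate(data)
    return results


# No close_fds argument here, mirroring the code in the question.
processes = [
    subprocess.Popen(
        FILTER_CMD,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    for _ in range(2)
]

outputs = communicate_in_reverse(processes, b"hello world\ngoodbye world\n")
for out, err in outputs:
    print(out)
```

Unlike processes.pop(), this leaves the original list intact, which may be convenient if the return codes are needed afterwards.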
