I am trying to use Python's subprocess module to communicate with a process that reads standard input and writes standard output in a streaming fashion. I want the subprocess to read its input lines from an iterator that produces them, and I want to read the output lines back from the subprocess. There may not be a one-to-one correspondence between input and output lines. How can I feed a subprocess from an arbitrary iterator that returns strings?
Here is some example code that gives a simple test case, along with the approaches I have tried that do not work for one reason or another:
#!/usr/bin/python
from subprocess import *

# A really big iterator
input_iterator = ("hello %s\n" % x for x in xrange(100000000))

# I thought that stdin could be any iterable, but it actually wants a
# filehandle, so this fails with an error.
subproc = Popen("cat", stdin=input_iterator, stdout=PIPE)

# This works, but it first sends *all* the input at once, then returns
# *all* the output as a string, rather than giving me an iterator over
# the output. This uses up all my memory, because the input is several
# hundred million lines.
subproc = Popen("cat", stdin=PIPE, stdout=PIPE)
output, error = subproc.communicate("".join(input_iterator))
output_lines = output.split("\n")
So, how can I have my subprocess read from the iterator line by line while I read its stdout in turn?
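For illustration, one possible shape of what I am after is sketched below. It is not a confirmed solution, only a minimal sketch under several assumptions: it targets Python 3.7+ (text-mode pipes via `text=True`, `range` in place of `xrange`), and the helper name `stream_through` is hypothetical. The idea is to write the input from a background thread so that the main thread can iterate over the subprocess's stdout lazily without the two pipes deadlocking.

```python
import subprocess
import threading


def stream_through(cmd, input_lines):
    """Lazily yield the output lines of cmd while feeding it input_lines.

    Hypothetical helper: input_lines is any iterable of newline-terminated
    strings; cmd is an argument list such as ["cat"].
    """
    proc = subprocess.Popen(
        cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
    )

    def feed():
        # Runs in a background thread: push the input one line at a time,
        # then close stdin so the child sees end-of-file.
        try:
            for line in input_lines:
                proc.stdin.write(line)
        except BrokenPipeError:
            pass  # the child exited before consuming all of the input
        finally:
            try:
                proc.stdin.close()
            except BrokenPipeError:
                pass

    feeder = threading.Thread(target=feed, daemon=True)
    feeder.start()
    try:
        # The main thread reads output as it arrives, never holding it all.
        for line in proc.stdout:
            yield line
    finally:
        proc.stdout.close()
        feeder.join()
        proc.wait()
```

With this sketch, something like `for line in stream_through(["cat"], input_iterator): ...` would consume the output one line at a time. Because reading and writing happen concurrently, the number of output lines does not have to match the number of input lines.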
python subprocess io
Ryan Thompson