How to make Python scripts pipeable both in bash and within Python

Question


Summary: I would like to write Python scripts that act like bash scripts on the command line, but I would also like to pipe them together easily from within Python. Where I am having trouble is the glue needed for the latter.

So imagine that I wrote two scripts, script1.py and script2.py, and I can pipe them together like this:

 echo input_string | ./script1.py -a -b | ./script2.py -c -d

How do I get this behavior from within another Python file? Here is the way I know, but I don't like it:

    import subprocess

    arg_string_1 = convert_to_args(param_1, param_2)
    arg_string_2 = convert_to_args(param_3, param_4)
    # A shell pipeline passed as a single string needs shell=True.
    output_string = subprocess.check_output(
        "echo " + input_string + " | ./script1.py " + arg_string_1
        + " | ./script2.py " + arg_string_2,
        shell=True)

If I did not want to use multithreading, I could do something like this (?):

    from io import StringIO

    input1 = StringIO(input_string)
    output1 = StringIO()
    script1.main(param_1, param_2, input1, output1)
    input2 = StringIO(output1.getvalue())  # the StringIO method is getvalue()
    output2 = StringIO()
    script2.main(param_3, param_4, input2, output2)

Here is the approach I tried, but I am stuck writing the glue. I would appreciate either learning how to finish my approach below, or suggestions for a better design or approach!

My approach: I wrote script1.py and script2.py to look like this:

    #!/usr/bin/python3
    ... # import sys and define "parse_args"

    def main(param_1, param_2, input, output):
        for line in input:
            ...
            print(stuff, file=output)

    if __name__ == "__main__":
        parameter_1, parameter_2 = parse_args(sys.argv)
        main(parameter_1, parameter_2, sys.stdin, sys.stdout)

Then I wanted to write something like this, but I don’t know how to finish:

    pipe_out, pipe_in = ????
    output = StringIO()
    thread_1 = Thread(target=script1.main, args=(param_1, param_2, StringIO(input_string), pipe_out))
    thread_2 = Thread(target=script2.main, args=(param_3, param_4, pipe_in, output))
    thread_1.start()
    thread_2.start()
    thread_1.join()
    thread_2.join()
    output_str = output.getvalue()


python python-multithreading pipeline

usul Dec 25 '15 at 1:19


3 answers

Sci prog · Answer 1 · 2016-09-16T02:00:26+0000

For the "pipe in" side, sys.stdin is used with the readlines() method. (Calling read() with no argument returns all of the input as one string, and iterating over that string yields one character at a time.)

To pass information from one thread to another, you can use a Queue. You must define a way to signal the end of the data. In my example, since all the data passed between threads is str, I simply use the None object to signal the end of the data (since it cannot appear in the transmitted data).
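The difference between line-wise and character-wise reading can be seen in a small sketch. It assumes an io.StringIO object as a stand-in for sys.stdin so that it runs without piped input; the variable names are only illustrative.

```python
import io

# io.StringIO stands in for sys.stdin (an assumption made for this sketch).
stream = io.StringIO("first line\nsecond line\n")

# Iterating over a text stream (like readlines()) yields whole lines.
lines = [line for line in stream]

stream.seek(0)
whole = stream.read()          # no argument: everything up to EOF, as one str
first_chars = list(whole)[:5]  # iterating over a str yields single characters

stream.seek(0)
one_char = stream.read(1)      # read(1) returns at most one character
```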

You can also use more threads or use different functions in the threads.

In my example, I did not include sys.argv to keep it simple. Modifying it to get parameters ( parameter1 , ...) should be easy.

    import sys
    from threading import Thread
    from Queue import Queue  # Python 2 module name; in Python 3 it is "queue"

    def stdin_to_queue(output_queue):
        for inp_line in sys.stdin.readlines():  # input one line at a time
            output_queue.put(inp_line, True, None)  # blocking, no timeout
        output_queue.put(None, True, None)  # signal the end of data

    def main1(input_queue, output_queue, arg1, arg2):
        do_loop = True
        while do_loop:
            inp_data = input_queue.get(True)
            if inp_data is None:
                do_loop = False
                output_queue.put(None, True, None)  # signal end of data
            else:
                out_data = arg1 + inp_data.strip('\r\n').upper() + arg2  # or whatever transformation...
                output_queue.put(out_data, True, None)

    def queue_to_stdout(input_queue):
        do_loop = True
        while do_loop:
            inp_data = input_queue.get(True)
            if inp_data is None:
                do_loop = False
            else:
                # main1 stripped the line ending, so add it back here
                sys.stdout.write(inp_data + '\n')

    def main():
        q12 = Queue()
        q23 = Queue()
        q34 = Queue()
        t1 = Thread(target=stdin_to_queue, args=(q12,))
        t2 = Thread(target=main1, args=(q12, q23, '(', ')'))
        t3 = Thread(target=main1, args=(q23, q34, '[', ']'))
        t4 = Thread(target=queue_to_stdout, args=(q34,))
        t1.start()
        t2.start()
        t3.start()
        t4.start()

    main()

Finally, I tested this program (python2) with a text file.

 head sometextfile.txt | python script.py
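If the stages keep the file-like signature from the question, main(param_1, param_2, input, output), an OS-level pipe can play the role of the queue. The following is only a sketch under assumptions: upper_main and bracket_main are hypothetical stand-ins for script1.main and script2.main, and os.pipe() wrapped with os.fdopen() is one possible way to connect them, not something proposed in this answer.

```python
import io
import os
from threading import Thread

# Hypothetical stand-ins for script1.main and script2.main from the question.
def upper_main(prefix, suffix, inp, out):
    for line in inp:
        print(prefix + line.rstrip("\n").upper() + suffix, file=out)

def bracket_main(left, right, inp, out):
    for line in inp:
        print(left + line.rstrip("\n") + right, file=out)

def run_threaded(input_string):
    # os.pipe() returns (read_fd, write_fd); wrap both ends as text files.
    read_fd, write_fd = os.pipe()
    pipe_reader = os.fdopen(read_fd, "r")
    pipe_writer = os.fdopen(write_fd, "w")
    output = io.StringIO()

    def stage1():
        # Closing the write end is what delivers EOF to the second stage.
        with pipe_writer:
            upper_main("(", ")", io.StringIO(input_string), pipe_writer)

    def stage2():
        with pipe_reader:
            bracket_main("[", "]", pipe_reader, output)

    threads = [Thread(target=stage1), Thread(target=stage2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return output.getvalue()
```

The design point to watch is the close of the write end: without it, the reading thread never sees end of file and join() blocks forever.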

David Heyman · Answer 2 · 2016-09-20T17:26:24+0000

Redirect the return value to stdout depending on whether the script is being executed from the command line:

    #!/usr/bin/python3
    import sys

    # Example function
    def main(input):
        # Do something with input producing stuff
        ...
        return multipipe(stuff)

    if __name__ == '__main__':
        def multipipe(data):
            print(data)

        input = parse_args(sys.argv)
        main(input)
    else:
        def multipipe(data):
            return data

Every other script will have the same two multipipe definitions. Now use multipipe for output.

If you call all the scripts together from the command line, $ ./script1.py | ./script2.py, each one will have __name__ == '__main__', and therefore multipipe will print everything to stdout, to be read by the next script (and it will return None, so each function returns None, but you are not looking at the return values in that case anyway).

If you call them from within some other Python script, each function will return whatever you passed to multipipe.

In effect, you can keep your existing functions and just replace print(stuff, file=output) with return multipipe(stuff). Nice and easy.

To use this with multithreading or multiprocessing, set the functions up so that each one returns a single thing, and plug them into a simple function that adds the data to a multithreading queue. For an example of such a queueing system, see the sample at the bottom of the Queue documentation. With that example, just make sure that each step in the pipeline puts None (or another sentinel value of your choice; I like ... for that, since you would very rarely pass the Ellipsis object for any reason other than as a marker for its singleton-ness) in the queue for the next step to signal that it is done.
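A minimal sketch of that idea, written for Python 3 and assuming each step is a plain function that returns one value per input item. The names DONE, stage and run_pipeline are made up for illustration; Ellipsis is used as the sentinel, as suggested above.

```python
from queue import Queue
from threading import Thread

DONE = ...  # the Ellipsis singleton, used only as an end-of-data marker

def stage(func, in_q, out_q):
    """Apply func to every item from in_q and forward the sentinel at the end."""
    while True:
        item = in_q.get()
        if item is DONE:
            out_q.put(DONE)  # tell the next step that no more data is coming
            return
        out_q.put(func(item))

def run_pipeline(items, funcs):
    # One queue in front of each step, plus one for the final results.
    queues = [Queue() for _ in range(len(funcs) + 1)]
    threads = [
        Thread(target=stage, args=(func, queues[i], queues[i + 1]))
        for i, func in enumerate(funcs)
    ]
    for t in threads:
        t.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(DONE)

    results = []
    while True:
        item = queues[-1].get()
        if item is DONE:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results
```

Because each step runs in a single thread and the queues are FIFO, the results come out in the same order as the inputs.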

Yoav kleinberger · Answer 3 · 2016-09-22T20:44:37+0000

There is a very simple solution using the standard Popen class.

Here is an example:

    # this is the master python program
    import subprocess
    import sys

    # note the use of stdin and stdout arguments here
    process1 = subprocess.Popen(['./script1.py'], stdin=sys.stdin, stdout=subprocess.PIPE)
    process2 = subprocess.Popen(['./script2.py'], stdin=process1.stdout)
    process1.stdout.close()  # lets process1 receive SIGPIPE if process2 exits first
    process1.wait()
    process2.wait()

The two scripts. Here is the first:

    #!/usr/bin/env python
    # script1.py
    import sys

    for line in sys.stdin:
        print(line.strip().upper())

Here is the second:

    #!/usr/bin/env python
    # script2.py
    import sys

    for line in sys.stdin:
        print("<{}>".format(line.strip()))
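When the input is a string held in Python and the output is wanted back as a string, as in the question, subprocess.run can feed and capture each stage. This is a rough sketch rather than part of the answer above: it runs the stages one after another instead of concurrently, and it uses inline -c programs (STAGE1 and STAGE2, hypothetical names) as stand-ins for ./script1.py and ./script2.py so that it is self-contained.

```python
import subprocess
import sys

# Inline stand-ins for script1.py and script2.py (assumption for this sketch).
STAGE1 = "import sys\nfor line in sys.stdin: print(line.strip().upper())"
STAGE2 = "import sys\nfor line in sys.stdin: print('<{}>'.format(line.strip()))"

def run_stage(source, input_string):
    """Run one stage, sending input_string to its stdin and returning its stdout."""
    completed = subprocess.run(
        [sys.executable, "-c", source],
        input=input_string,
        stdout=subprocess.PIPE,
        text=True,    # exchange str instead of bytes (Python 3.7+)
        check=True,   # raise CalledProcessError on a non-zero exit status
    )
    return completed.stdout

def run_pipeline(input_string):
    # Stages run sequentially: the second starts after the first has finished.
    return run_stage(STAGE2, run_stage(STAGE1, input_string))
```

For real scripts, the [sys.executable, "-c", source] list would be replaced by something like ['./script1.py', '-a', '-b'].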
