Executing and capturing Java execution output from Python

I wrote a program in Java using the Hadoop API. The output of building that Java code is a jar, say foo.jar.

To run this jar on Hadoop, I do:

hadoop jar foo.jar org.foo.bar.MainClass input output

And this launches a long-running Hadoop task (say, a few minutes).

While the job runs, Hadoop gives me progress reports, sort of:

Map 0%, Reduce 0%
Map 20%, Reduce 0%
....

and so on. After the job finishes, Hadoop spills out a bunch of statistics (for example, input size, splits, bytes written, etc.). All of this happens on the command line.

Now, what I'm trying to do is call this program from Python (using a simple system-style execution).

But while I run this code from Python, I also want to show some of those statistics, though not all of them.

In other words, I want a way to get the output of the jar from Hadoop into my Python code. Say Hadoop prints:

Map 0%, Reduce 0%
Map 20%, Reduce 0%

and so on. Then I might have a function like:

def progress_function(map, reduce):
    return sum([map, reduce]) / 2.0

and it would print on my screen:

progress so far:0
progress so far:10

and so on.

In short: I'm running Java code (the jar) on Hadoop from Python, capturing the Java output in Python, processing it in Python, and printing my own Python-side stats.
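A minimal sketch of how this could be wired up with Python's subprocess module, assuming the progress lines contain a "map X% reduce Y%" pattern (the exact wording varies by Hadoop version, so the regex here is an assumption to adapt):

import re
import subprocess

# Assumed progress pattern; adjust to your Hadoop version's output.
PROGRESS_RE = re.compile(r"map\s+(\d+)%,?\s*reduce\s+(\d+)%", re.IGNORECASE)

def progress_function(map_pct, reduce_pct):
    return sum([map_pct, reduce_pct]) / 2.0

proc = subprocess.Popen(
    ["hadoop", "jar", "foo.jar", "org.foo.bar.MainClass", "input", "output"],
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,  # Hadoop reports progress on stderr, so merge it in
    universal_newlines=True,
)

for line in proc.stdout:
    match = PROGRESS_RE.search(line)
    if match:
        pct = progress_function(int(match.group(1)), int(match.group(2)))
        print("progress so far:", pct)
    # Non-matching lines (e.g. the final statistics dump) are ignored here;
    # add more patterns for the statistics you do want to show.

proc.wait()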

If you're calling the program from Python anyway and want to capture its output, the simplest approach is piping it into a Python script:

hadoop jar foo.jar org.foo.bar.MainClass input output 2>&1 | python myscript.py

Then myscript.py receives Hadoop's output on stdin and can print whatever subset of it you want.

Note that 2>&1 redirects stderr into stdout: the pipe only carries stdout, and Hadoop writes its progress to stderr.
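Along those lines, a minimal myscript.py might look like the sketch below; the "map X% reduce Y%" pattern is again an assumption about the log format:

import re
import sys

# Assumed progress pattern; adjust to your Hadoop version's log format.
PROGRESS_RE = re.compile(r"map\s+(\d+)%,?\s*reduce\s+(\d+)%", re.IGNORECASE)

for line in sys.stdin:
    match = PROGRESS_RE.search(line)
    if match:
        map_pct, reduce_pct = int(match.group(1)), int(match.group(2))
        print("progress so far:", (map_pct + reduce_pct) / 2.0)
    # Everything else, including the final statistics, is dropped; add
    # extra checks here for the statistics lines you want to keep.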
