Python does not get the original binary output from subprocess.check_call

How can I get subprocess.check_call to give me the original binary output of the command? It seems to be encoded incorrectly somewhere.

More details:

I have a command that returns text as follows:

some output text "quote" ... 

(These quotes are Unicode characters: UTF-8 bytes e2 80 9d, i.e. U+201D.)

This is how I invoke the command:

 import subprocess
 from tempfile import SpooledTemporaryFile

 f_output = SpooledTemporaryFile()
 subprocess.check_call(cmd, shell=True, stdout=f_output)
 f_output.seek(0)
 output = f_output.read()

The problem is that I get the following:

 >>> repr(output)
 some output text ?quote? ...
 >>> type(output)
 <str>

(And if I call ord() on the '?', I get 63.) I'm on Python 2.7 on Linux.

Note: running the same code on OS X works correctly for me. The problem appears only when I run it on a Linux server.

+6
2 answers

Wow, that was the weirdest problem, but I fixed it!

It turns out that the program being called (a Java program) produced output in a different encoding depending on where it was called from!

On the dev OS X machine, it returns the characters perfectly. On the Linux server, from the command line, it returns them fine. Called from a Django application, nope: they turn into "?"s.

To fix this, I added this argument to the command:

 -Dfile.encoding=utf-8 

I got this idea here, and it seems to work. There is also a way to change the encoding from inside the Java program itself.
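As a rough sketch of how that flag could be passed from Python: JVM options such as -Dfile.encoding have to come before -jar or the main class, so the helper below inserts the flag right after the java executable. The helper name add_utf8_flag, the jar name tool.jar, and its argument are hypothetical placeholders, not the actual command from the question.

```python
def add_utf8_flag(java_cmd):
    """Return a copy of java_cmd with -Dfile.encoding=utf-8 placed right
    after the java executable, ahead of -jar or the main class."""
    flag = '-Dfile.encoding=utf-8'
    if flag in java_cmd:
        return list(java_cmd)
    return [java_cmd[0], flag] + list(java_cmd[1:])


# 'tool.jar' and 'input.txt' are hypothetical placeholders.
cmd = add_utf8_flag(['java', '-jar', 'tool.jar', 'input.txt'])
print(cmd)
# The resulting list can then be handed to subprocess.check_output(cmd).
```

Building the command as a list also avoids the need for shell=True.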

And to think I blamed Python! You guys had the right idea.

+1

Redirection (stdout=file) happens at the file descriptor level. Python has nothing to do with what is written to the file if you see ? instead of " in the file itself (not in the REPL).

If it works on OS X and doesn't work on a Linux server, then the likely difference is in the environment. Check the LC_ALL, LC_CTYPE, and LANG environment variables: python, /bin/sh (due to shell=True), and cmd may all use your locale encoding, which is ASCII if the environment is not set (C, POSIX locale).
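One way to compare the two environments is to print the locale variables from inside the process that launches the command (for example, inside the Django application). The sketch below does that and also builds an environment with a UTF-8 locale forced for the child. The helper names locale_report and utf8_env are made up for illustration, and the locale name en_US.UTF-8 is an assumption: it has to be installed on the server for this to have any effect.

```python
import locale
import os


def locale_report(environ=None):
    """Collect the locale-related variables a child process would inherit."""
    environ = os.environ if environ is None else environ
    names = ('LC_ALL', 'LC_CTYPE', 'LANG')
    return dict((name, environ.get(name)) for name in names)


def utf8_env(environ=None, locale_name='en_US.UTF-8'):
    """Copy the environment and force a UTF-8 locale for the child.

    locale_name is an assumption; it must exist on the target machine.
    """
    env = dict(os.environ if environ is None else environ)
    env['LC_ALL'] = locale_name
    return env


print(locale_report())                 # None means the variable is unset
print(locale.getpreferredencoding())   # encoding Python derives from the locale
# Possible use: subprocess.check_call(cmd, shell=True, stdout=f_output, env=utf8_env())
```

If the report shows all three variables unset under Django but set in an interactive shell, that would explain why the same command behaves differently in the two places.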

To get the "raw binary" from a subprocess:

 #!/usr/bin/env python
 import subprocess

 raw_binary = subprocess.check_output(['cmd', 'arg 1', 'arg 2'])
 print(repr(raw_binary))
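To illustrate that the bytes come through untouched, here is a small self-contained sketch. Since the real command is not available, it uses a stand-in child process (an assumption) that writes the three UTF-8 bytes e2 80 9d directly to file descriptor 1. Decoding is then an explicit, separate step on the Python side.

```python
import subprocess
import sys

# Stand-in for the real command (an assumption): a child Python process
# that writes the raw bytes e2 80 9d straight to file descriptor 1.
child_code = "import os; os.write(1, b'\\xe2\\x80\\x9d')"
raw_binary = subprocess.check_output([sys.executable, '-c', child_code])

# The bytes arrive exactly as the child wrote them; decode explicitly.
text = raw_binary.decode('utf-8')
print(repr(raw_binary))
print(len(raw_binary), len(text))  # three bytes, one character
```

If the raw bytes already contain 3f (the ? character) at this point, the replacement happened inside the child process, not in Python.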

Note:

  • no shell=True: do not use it unless necessary.
  • many programs may change their behavior if they detect that the output is not a tty.
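The second point can be seen with a quick sketch: when the output is captured through a pipe, the child's stdout is not a terminal. The child here is a stand-in Python process (an assumption), not the command from the question.

```python
import subprocess
import sys

# The child reports whether its own stdout (fd 1) is a terminal.
child_code = "import os; os.write(1, str(os.isatty(1)).encode('ascii'))"
piped = subprocess.check_output([sys.executable, '-c', child_code])
print(piped)  # stdout is a pipe here, so the child sees no tty
```

A program that checks this may pick a different encoding, buffering mode, or output format than it does in an interactive terminal.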
0