How to make Popen() correctly understand UTF-8?

This is my Python code:

[...]
proc = Popen(path, stdin=stdin, stdout=PIPE, stderr=PIPE)
result = [x for x in proc.stdout.readlines()]
result = ''.join(result);

Everything works fine when the output is ASCII. When the process writes UTF-8 text to stdout, the result is unpredictable, and in most cases the output is corrupted. What is wrong here?

Btw, maybe this code needs to be optimized somehow?

2 answers

Have you tried decoding each string first and then concatenating the decoded strings? In Python 2.4+ (at least) this can be achieved with:

result = [x.decode('utf8') for x in proc.stdout.readlines()]

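To be clear, x is a byte string, and decode() interprets it in the given encoding (here, UTF-8): x.decode('utf8') returns a unicode object, meaning a "string of characters" (as opposed to a "string of integers from 0 to 255 [bytes]").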
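As a minimal sketch of the same idea in Python 3 (an assumption; the answer above targets Python 2, where decode() returns unicode rather than str), the whole output can be read as bytes and decoded once. The child command and the helper name run_and_decode are hypothetical and only serve to produce some UTF-8 output.

```python
import subprocess
import sys

# Hypothetical child process: writes UTF-8 encoded bytes to its stdout.
child_code = r"import sys; sys.stdout.buffer.write('caf\u00e9 \u2713\n'.encode('utf-8'))"


def run_and_decode(args):
    """Run a command and return its stdout decoded as UTF-8 text."""
    proc = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    # communicate() reads both pipes to the end and returns bytes objects.
    raw_out, _raw_err = proc.communicate()
    # Decode once, after all bytes have been collected.
    return raw_out.decode("utf-8")


text = run_and_decode([sys.executable, "-c", child_code])
```

Using communicate() instead of readlines() also drains stderr, so the child cannot block on a full stderr pipe.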


I ran into the same problem when using LogPipe.

To fix it, I added encoding='utf-8', errors='ignore' to the fdopen() call.

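# https://codereview.stackexchange.com/questions/6567/redirecting-subprocesses-output-stdout-and-stderr-to-the-logging-module
import logging
import os
import threading

# Assumption: the original snippet does not show how vlogger is created.
vlogger = logging.getLogger(__name__)
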
class LogPipe(threading.Thread):
    def __init__(self):
        """Setup the object with a logger and a loglevel
        and start the thread
        """
        threading.Thread.__init__(self)
        self.daemon = False
        # self.level = level
        self.fdRead, self.fdWrite = os.pipe()
        # Decode the pipe as UTF-8 and silently drop any invalid bytes.
        self.pipeReader = os.fdopen(self.fdRead, encoding='utf-8', errors='ignore')
        self.start()

    def fileno(self):
        """Return the write file descriptor of the pipe
        """
        return self.fdWrite

    def run(self):
        """Run the thread, logging everything.
        """
        for line in iter(self.pipeReader.readline, ''):
            # vlogger.log(self.level, line.strip('\n'))
            vlogger.debug(line.strip('\n'))

        self.pipeReader.close()

    def close(self):
        """Close the write end of the pipe.
        """
        os.close(self.fdWrite)
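If a separate reader thread is not needed, the same encoding and errors arguments can be passed to Popen directly in Python 3.6 and later. The sketch below is an alternative to LogPipe, not part of it, and the child command is a hypothetical stand-in that emits one invalid byte on purpose.

```python
import subprocess
import sys

# Hypothetical child process: writes valid UTF-8 plus one invalid byte (0xff).
child_code = r"import sys; sys.stdout.buffer.write(b'ok \xe2\x9c\x93 \xff bad\n')"

proc = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    encoding="utf-8",   # pipes are opened in text mode with this encoding
    errors="ignore",    # invalid bytes are dropped instead of raising
)
# With encoding set, communicate() returns str objects rather than bytes.
out, err = proc.communicate()
```

Note that errors='ignore' discards data silently; errors='replace' keeps a visible marker for each invalid byte, which may be preferable when debugging.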