I am trying to find a general solution for printing Unicode strings from a Python script.
The requirement is that it must work on both Python 2.7 and 3.x, on any platform, and with any terminal settings and environment variables (for example, LANG=C or LANG=en_US.UTF-8).
The Python print function automatically tries to encode the string to the terminal encoding when printing, but if the terminal encoding is ASCII, it fails for non-ASCII characters.
For example, the following works when the environment has "LANG=en_US.UTF-8":
x = u'\xea'
print(x)
But it fails in Python 2.7 when "LANG=C":
UnicodeEncodeError: 'ascii' codec can't encode character u'\xea' in position 0: ordinal not in range(128)
The following works regardless of the LANG setting, but the characters will not be displayed properly if the terminal uses an encoding other than UTF-8:
print(x.encode('utf-8'))
The desired behavior is to always display the Unicode characters in the terminal if possible, and to fall back to a fixed encoding if the terminal does not support them. For example, the output would be UTF-8 encoded if the terminal only supports ASCII. Basically, the goal is to do the same thing as the Python print function when it works, and to use some standard encoding in the cases where print fails.
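To make the desired behavior concrete, here is a rough sketch of what such a function might look like. The name safe_print and its parameters are hypothetical, not an existing API. It assumes the stream either exposes a binary .buffer attribute (as Python 3 text streams do) or accepts bytes directly (as sys.stdout does in Python 2.7), and that UTF-8 is an acceptable fallback.

```python
import sys


def safe_print(text, stream=None, fallback='utf-8'):
    """Write text using the stream's encoding, or `fallback` if that fails.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if stream is None:
        stream = sys.stdout
    # The encoding may be missing or None, e.g. when Python 2 output is piped.
    encoding = getattr(stream, 'encoding', None) or fallback
    try:
        data = text.encode(encoding)
    except (UnicodeEncodeError, LookupError):
        # The terminal encoding cannot represent the text (or is unknown
        # to Python), so use the fallback encoding instead.
        data = text.encode(fallback)
    data += b'\n'
    buffer = getattr(stream, 'buffer', None)
    if buffer is not None:
        # Python 3 text stream: write the bytes to the underlying binary
        # buffer, flushing first so earlier text output stays in order.
        stream.flush()
        buffer.write(data)
        buffer.flush()
    else:
        # Assumed to be a byte-accepting stream such as Python 2's sys.stdout.
        stream.write(data)
```

With this sketch, safe_print(u'\xea') would emit the character in the terminal's own encoding when that encoding can represent it, and the UTF-8 bytes otherwise. A Python 3 stream that has no .buffer (for example io.StringIO) is not handled here.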
python encoding unicode utf-8
clark800