When is chr(ord(c)) not equal to c in Python?

I am reading the source code of the Ansible module in testinfra. I found the following lines of code:

    # Ansible return an unicode object but this is bytes ...
    # A simple test case is:
    # >>> assert File("/bin/true").content == open("/bin/true").read()
    stdout_bytes = b"".join((chr(ord(c)) for c in out['stdout']))
    stderr_bytes = b"".join((chr(ord(c)) for c in out['stderr']))

It iterates through stdout, gets the integer code point of each character, and converts it back to a one-character string. But what is the point?

+5
2 answers

When c is a Unicode character that does not fit in a single byte (strictly, one whose code point is above 255, since chr() only accepts 0 to 255):

    >>> ord(u'\u2020')
    8224
    >>> chr(ord(u'\u2020'))
    ValueError: chr() arg not in range(256)

This holds only in Python 2; in Python 3, unichr was removed and chr behaves the way unichr did. It seems like odd behavior for such a library, since it will raise an unexpected error whenever the command output contains such characters, as is typical of non-English text.
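As a rough illustration of the Python 3 side of this, the sketch below shows that the round trip succeeds there, while forcing the same code point into a single byte still fails. The variable names are made up for the example and are not part of testinfra.

```python
# Python 3 sketch: chr() covers the full Unicode range, so the round trip holds.
dagger = "\u2020"                 # DAGGER, code point 8224
round_trip = chr(ord(dagger))
assert round_trip == dagger

# A single byte can only hold 0..255, so building bytes from the same
# code point fails, much like chr() did in Python 2.
try:
    bytes([ord(dagger)])
    byte_error = None
except ValueError as exc:
    byte_error = exc

print(repr(round_trip), type(byte_error).__name__)
```

So in Python 3 the expression chr(ord(c)) is an identity on str; the range limit only reappears when the value has to become a byte.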

+5

If c is an 8-bit string. From the docs for ord():

[returns] the value of the byte when the argument is an 8-bit string

chr() then converts that value back to a one-character 8-bit string. In effect this just converts each Unicode character to the corresponding byte, as the comment says.
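A minimal Python 3 sketch of the same per-character conversion, assuming every character has a code point below 256. The helper name text_to_bytes is hypothetical and is not taken from testinfra; the comparison with a Latin-1 encode is only meant to show what the loop amounts to for such input.

```python
def text_to_bytes(text):
    """Hypothetical helper: map each character to the byte with the same value."""
    # bytes() accepts an iterable of ints in range(256);
    # any larger code point raises ValueError.
    return bytes(ord(c) for c in text)


sample = "caf\u00e9"              # every code point here is below 256
converted = text_to_bytes(sample)

# For input like this, the result matches a Latin-1 encode.
same_as_latin1 = converted == sample.encode("latin-1")
```

If the input may contain characters above U+00FF, an explicit encode with a chosen codec and error handler is probably the safer design, since the failure mode is then deliberate rather than incidental.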

-1
