What is the point of chr(128) .. chr(255) in Python?

Edit: I'm talking about behavior in Python 2.7.

The chr function converts integers from 0 to 127 to ASCII characters. For example:

 >>> chr(65)
 'A'

I understand how useful this is in certain situations, and I understand why it covers 0..127, the 7-bit ASCII range.

The function also accepts arguments in 128..255. For these numbers, it simply returns what looks like the hexadecimal representation of the argument (for example, '\xe4' for 228). In this range, a given byte means different things depending on which part of the ISO-8859 standard is used.

I would understand if chr accepted another argument, for example

 >>> chr(228, encoding='iso-8859-1')  # hypothetical
 'ä'

However, there is no such option:

 chr(i) -> character

 Return a string of one character with ordinal i; 0 <= i < 256.

My question is: what is the point of raising a ValueError for i > 255 instead of i > 127? Is returning hexadecimal values all the function does for 128 <= i < 256?

+7
python ascii
3 answers

In Python 2.x, str is a sequence of bytes, so chr() returns a string of one byte and accepts values in the range 0-255, since that is the range a byte can represent. When you print the repr() of a string containing a byte in the range 128-255, that byte is shown as an escape sequence, because there is no standard way to represent such characters (ASCII defines only 0-127). You can, however, convert it to Unicode using unicode() and specify the source encoding:

 unicode(chr(200), encoding="latin1") 

In Python 3.x, str is a sequence of Unicode characters, and chr() accepts a much wider range. Bytes are handled by the bytes type.
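As a rough illustration of that split, here is a small Python 3 sketch (not Python 2.7, where the calls would differ). It shows that the same byte value decodes to different characters under different parts of ISO-8859, while chr() works on Unicode code points. The helper name `chr_accepts` and the variable names are made up for this example.

```python
# Python 3 sketch: bytes hold raw values, str holds Unicode text.
raw = bytes([200])                       # one byte with value 200 (0xC8)

# The same byte means different things under different encodings.
latin1_text = raw.decode("iso-8859-1")   # Latin capital E with grave
greek_text = raw.decode("iso-8859-7")    # Greek capital theta

# In Python 3, chr() takes a Unicode code point, not a byte value.
code_point_text = chr(200)


def chr_accepts(i):
    """Hypothetical helper: report whether chr(i) succeeds in Python 3."""
    try:
        chr(i)
        return True
    except ValueError:
        return False
```

Because ISO-8859-1 maps bytes 0-255 straight onto the first 256 Unicode code points, `chr(200)` and the Latin-1 decoding of byte 200 give the same character here. That coincidence does not hold for the other ISO-8859 parts.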

+7

I see what you are saying, but this is wrong. In Python 3.4, chr is documented as:

Return the string representing a character whose Unicode code point is the integer i.

And here are a few examples:

 >>> chr(15000)
 '㪘'
 >>> chr(5000)
 'ᎈ'

In Python 2.x, this was:

Return a string of one character whose ASCII code is the integer i.

The chr function has existed in Python for a long time, and I think that support for different encodings was developed only in more recent releases. In that light, it makes sense to support the basic ASCII table and return hexadecimal escapes for the extended range 128-255.

Even in Unicode, the ASCII set is defined as only 128 characters, not 256, so there is (or was) no standard, accepted way to let ord() return an answer for these input values.
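The 128-character limit can be checked directly. The following is a minimal Python 3 sketch, with a made-up helper name `decodes_as_ascii`, that tries every byte value against the ASCII codec and records which ones it accepts.

```python
def decodes_as_ascii(value):
    """Hypothetical helper: True if the single byte `value` is valid ASCII."""
    try:
        bytes([value]).decode("ascii")
        return True
    except UnicodeDecodeError:
        return False


# Collect every byte value (0-255) that the ASCII codec accepts.
ascii_ok = [v for v in range(256) if decodes_as_ascii(v)]
```

Only the values 0 through 127 end up in `ascii_ok`; anything from 128 upward needs some other encoding to be interpreted as text.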

0

Please note that Python 2 string handling is broken. This is one of the reasons I recommend switching to Python 3.

In Python 2, the str type was meant to represent both text and binary strings. So chr() is used to convert an integer to a byte. It has nothing to do with text, ASCII, or ISO-8859-1. The result is a binary stream of bytes:

 binary_command = chr(100) + chr(200) + chr(10)
 device.write(binary_command)
 etc()

In Python 2.7 (in fact since 2.6), the bytes type is available for forward compatibility with Python 3, and it is simply an alias for str.
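For comparison, the same idea could look roughly like this in Python 3, where the command is built as bytes rather than by concatenating chr() results. This is only a sketch: `device` here is an in-memory `io.BytesIO` standing in for whatever real device handle the snippet above assumes.

```python
import io

# Stand-in for a real device handle (an assumption for this example).
device = io.BytesIO()

# Build the three-byte command directly from integer values.
binary_command = bytes([100, 200, 10])
device.write(binary_command)

# Read back what was "sent" so it can be inspected.
sent = device.getvalue()
```

No encoding is involved at any point: the values 100, 200 and 10 go out as raw bytes, which is what the Python 2 chr() concatenation was doing as well.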

0