Assuming you only need to support Python 2.6 and later, you can simply use bytes for, well, bytes. Use b literals to create bytes objects, for example b'\x0a\x0b\x00'. When working with files, make sure the mode includes b (as in open('file.bin', 'rb')).
Beware that iterating over bytes and indexing into them behave differently in Python 2 and Python 3: in Python 3 a single element is an int, in Python 2 it is a one-byte string. In these cases you can write your code to use slices. Instead of b[0] == 0 (Python 3) or b[0] == b'\x00' (Python 2), write b[0:1] == b'\x00'. Other options are using bytearray (when the bytes need to be mutable) or helper functions.
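As a rough sketch of the slicing and bytearray approaches mentioned above (the helper iter_byte_chunks is a hypothetical name, not part of any library), something like this should behave the same on Python 2.6+ and Python 3:

```python
data = b'\x00\x0a\x0b'

# Slicing yields a bytes object on both Python 2 and Python 3,
# whereas data[0] is an int on Python 3 and a one-byte str on Python 2.
first_is_nul = data[0:1] == b'\x00'

# A bytearray yields ints when iterated, on both versions.
values = list(bytearray(data))


def iter_byte_chunks(b):
    """Hypothetical helper: yield each byte of b as a one-byte bytes object."""
    for i in range(len(b)):
        yield b[i:i + 1]


chunks = list(iter_byte_chunks(data))
```

Which option fits best depends on whether the surrounding code wants integers (bytearray) or byte strings (slices).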
Character strings should be unicode in Python 2, whether or not you are porting to Python 3; otherwise, the code is likely to be wrong as soon as it encounters non-ASCII characters. The equivalent type in Python 3 is str.
Either use u literals to create character strings (e.g. u'Düsseldorf'), or make sure every file starts with from __future__ import unicode_literals, or both. Declare the file encoding, when necessary, by starting the file with # encoding: utf-8.
Use io.open to read character strings from files. In network code, read bytes and call decode on them to get a character string.
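A minimal sketch of both cases, assuming UTF-8 throughout; the scratch file path and the payload bytes are made up for illustration, and the payload stands in for data that would normally come from a socket:

```python
import io
import os
import tempfile

# u'Düsseldorf', written with an escape so no encoding declaration is needed.
city = u'D\u00fcsseldorf'

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'city.txt')

# io.open accepts an encoding on both Python 2 and 3 and
# returns character strings (unicode / str) when reading in text mode.
with io.open(path, 'w', encoding='utf-8') as f:
    f.write(city)
with io.open(path, 'r', encoding='utf-8') as f:
    text_from_file = f.read()

# Bytes as they might arrive from the network; decode them explicitly.
payload = b'D\xc3\xbcsseldorf'
text_from_network = payload.decode('utf-8')
```

Decoding at the boundary keeps the rest of the program working with character strings only.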
If you need to support Python 2.5 (which has no b literals) or Python 3.2 (which has no u literals), see six for helpers that convert literals.
Add plenty of assert statements to make sure that functions that work on character strings do not receive bytes, and vice versa. As usual, a good test suite with 100% coverage helps.
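Such type checks might look roughly like this. The functions shout and checksum are hypothetical examples, and text_type is an assumed local alias, not a standard name:

```python
# unicode on Python 2, str on Python 3.
text_type = type(u'')


def shout(name):
    """Hypothetical function that works on character strings only."""
    assert isinstance(name, text_type), 'expected text, got %r' % type(name)
    return name.upper() + u'!'


def checksum(payload):
    """Hypothetical function that works on bytes only."""
    assert isinstance(payload, bytes), 'expected bytes, got %r' % type(payload)
    # bytearray yields ints on both Python 2 and Python 3.
    return sum(bytearray(payload)) % 256
```

Calling shout(b'abc') or checksum(u'abc') then fails immediately with an AssertionError instead of producing subtly wrong output later.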