Joining byte strings with ''.join() works just fine; the error you see appears only when you mix unicode and str objects:
>>> utf8 = [u'\u0123'.encode('utf8'), u'\u0234'.encode('utf8')]
>>> ''.join(utf8)
'\xc4\xa3\xc8\xb4'
>>> u''.join(utf8)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc4 in position 0: ordinal not in range(128)
>>> ''.join(utf8 + [u'unicode object'])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc4 in position 0: ordinal not in range(128)
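The first exception occurs because the joiner is the unicode value u''; the second occurs because a unicode string was added to the list of strings to join. In both cases Python 2 implicitly decodes the byte strings with the ASCII codec to produce a unicode result, and that fails on the non-ASCII byte 0xc4.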
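One way to avoid the implicit ASCII decode is to convert everything to a single type before joining. The following is a minimal sketch, not the only approach: the helper name to_unicode is hypothetical, and it assumes the byte strings really are UTF-8 encoded. It reuses the utf8 list from the session above.

```python
def to_unicode(value, encoding='utf8'):
    """Return value as a unicode object (hypothetical helper).

    Assumes byte strings are encoded with `encoding` (UTF-8 here).
    """
    # On Python 2.6+, `bytes` is an alias for `str`, so this matches byte strings.
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


utf8 = [u'\u0123'.encode('utf8'), u'\u0234'.encode('utf8')]
mixed = utf8 + [u'unicode object']

# Decode explicitly, then join with a unicode joiner: no implicit ASCII decode.
joined = u''.join(to_unicode(item) for item in mixed)

# Joining only byte strings with a byte-string joiner also stays safe.
joined_bytes = b''.join(utf8)
```

Decoding at the boundary and working with unicode objects throughout keeps the choice of codec explicit, instead of leaving it to the default ASCII codec.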
Martijn Pieters