Problem of Evidence 1
It throws "UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 2: ordinal not in range(128)" when executing the following code:
filename = 'Spywaj.ttf'
print repr(filename)
>> 'Sp\xc2\x88ywaj.ttf'
filepath = os.path.join('/dirname', filename)
I don't see how this code can raise that exception: both arguments to os.path.join are str objects, so Python has no reason to try to convert anything to unicode. Are you sure the code above is exactly what you ran?
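A minimal sketch of the point above, written so that it runs on both Python 2 and Python 3. The b'' prefix and the use of posixpath in place of os.path are assumptions of this example, not part of the question. It shows that joining two byte strings decodes nothing, while an explicit ASCII decode of the same bytes fails in the way the reported message describes. In Python 2 that decode happens implicitly only when a byte string is mixed with a unicode object.

```python
import posixpath

# The bytes that repr(filename) displayed in the question.
raw_name = b'Sp\xc2\x88ywaj.ttf'

# Joining two byte strings is plain concatenation; nothing is decoded,
# so no UnicodeDecodeError can come from this call.
joined = posixpath.join(b'/dirname', raw_name)
print(repr(joined))

# Decoding the same bytes as ASCII fails at the first non-ASCII byte.
try:
    raw_name.decode('ascii')
    error = None
except UnicodeDecodeError as exc:
    error = exc
print(error)
```

If the real code raised this error, one of the values passed to os.path.join was most likely a unicode object rather than a str.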
Problem of Evidence 2
With Alex's suggestion, os.path.join now works, but I still can't access the file on disk using the file path that it produced.
filename = filename.decode('utf-8')
filepath = os.path.join('/dirname', filename)
print filepath
>> /dirname/u'Sp\xc2\x88ywaj.ttf'
Sorry, but assuming filename is unchanged from the previous snippet, this output is definitely not possible. It looks like the result of os.path.join('/dirname', repr(filename)) ... please make sure to post the code that you actually ran, along with the actual output (and the actual traceback, if any).
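To see why that output cannot come from the snippet, here is a small sketch of what decoding actually produces. It assumes the same bytes as in the first snippet and again uses posixpath and explicit b''/u'' prefixes so that it runs on both Python 2 and Python 3.

```python
import posixpath

raw_name = b'Sp\xc2\x88ywaj.ttf'      # bytes shown by repr(filename) earlier
decoded = raw_name.decode('utf-8')    # bytes C2 88 collapse into one character, U+0088

# Joining two unicode objects yields a unicode path with no quotes,
# no u prefix, and no \xc2 inside it.
joined = posixpath.join(u'/dirname', decoded)
print(repr(joined))
```

After a successful UTF-8 decode the \xc2 byte no longer exists as a separate character, so a path that still shows \xc2\x88 together with a u'...' wrapper points to repr() having been applied to a string somewhere before the join.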
Confusion
new_filepath = filepath.encode('Latin-1').encode('utf-8')
Alex meant for you to try twice, each time with one of these encodings - not once with both encodings chained! Since all characters in the file path are in the ASCII range (see Problem of Evidence 2), the net effect was the same as filepath.encode('ascii').
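A short sketch of both halves of that remark, runnable on Python 2 and Python 3. The variable names are hypothetical, and name stands for what the decoded file name would look like if the decode had really been applied.

```python
# The path exactly as printed in Problem of Evidence 2: the backslashes are
# literal characters, so every character here is plain ASCII.
ascii_path = u"/dirname/u'Sp\\xc2\\x88ywaj.ttf'"

# For ASCII-only text, all three encodings produce identical bytes.
all_same = (ascii_path.encode('latin-1')
            == ascii_path.encode('utf-8')
            == ascii_path.encode('ascii'))
print(all_same)

# Trying twice, once per encoding, only matters when the text contains
# a non-ASCII character (assumed here: the decoded file name).
name = u'Sp\x88ywaj.ttf'
candidates = [name.encode(encoding) for encoding in ('latin-1', 'utf-8')]
for candidate in candidates:
    print(repr(candidate))
```

Each candidate would then be tested separately against the file system, for example with os.path.exists.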
A simple solution
You already know how to find the file name you are interested in:
valid_filepath = glob.glob('/dirname/*.ttf')[0]
If you need to hard-code this name in a script, you can use the repr() function to get a representation that you can type into your script without worrying about utf8, unicode, encode, decode and all that noise:
print repr(valid_filepath)
Suppose it prints '/dirname/Sp\xc2\x88ywaj.ttf' ... then all you have to do is carefully copy and paste it into your script:
file_path = '/dirname/Sp\xc2\x88ywaj.ttf'
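The steps above can be put together in a small sketch. first_match is a hypothetical helper, not something from the question, and it sorts the matches because glob.glob returns them in arbitrary order; the original one-liner simply takes element [0].

```python
import glob
import os


def first_match(dirname, pattern='*.ttf'):
    """Return the first matching path in sorted order, or None if nothing matches."""
    matches = sorted(glob.glob(os.path.join(dirname, pattern)))
    return matches[0] if matches else None


if __name__ == '__main__':
    valid_filepath = first_match('/dirname')
    if valid_filepath is None:
        print('no .ttf file found')
    else:
        # repr() gives a literal that can be pasted into a script as-is.
        print(repr(valid_filepath))
        print(os.path.exists(valid_filepath))
```

Because the path comes straight from the file system, it is already in whatever form the operating system accepts, so the pasted literal should open the file without any further encoding or decoding.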