It took some time, but I think I found the answer.
I assumed the word was supposed to be "Akvaléir"; I found a page describing it, in French. When I used your code snippet, I got a string like
>>> fileinfo.filename
'Akval\x82ir, La police - The Font - Fr - En.pdf'
>>>
Decoding that as UTF-8, Latin-1, CP-1251 or CP-1252 did not give the expected "é". Then I found that CP863 is a Canadian French code page, so perhaps the file came from French Canada.
>>> print unicode(fileinfo.filename, "cp863").encode("utf8")
Akvaléir, La police - The Font - Fr - En.pdf
>>>
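The trial and error above can be reproduced in one loop. This is only a minimal sketch, written in Python 3 syntax even though the session above is Python 2. The `raw` bytes are just the start of the file name from the snippet, and `results` is a name made up for illustration.

```python
# Start of the raw file name as stored in the archive (0x82 is the mystery byte).
raw = b"Akval\x82ir"

# Hypothetical lookup: encoding name -> decoded text, or None if decoding fails.
results = {}
for encoding in ("utf-8", "latin-1", "cp1251", "cp1252", "cp863", "cp437"):
    try:
        results[encoding] = raw.decode(encoding)
    except UnicodeDecodeError:
        results[encoding] = None

for encoding, text in results.items():
    # ascii() makes control characters and look-alike punctuation visible.
    print(encoding, ascii(text))
```

Of these, only cp863 and cp437 map the byte 0x82 to "é". UTF-8 rejects the lone byte, Latin-1 turns it into a control character, and the two Windows code pages turn it into a low quotation mark.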
However, I then read the Zip file format specification, which says
The ZIP format has historically supported only the original IBM PC character encoding set, commonly referred to as IBM Code Page 437.
...
If general purpose bit 11 is set, the filename and comment must support The Unicode Standard, Version 4.1.0 or greater using the character encoding form defined by the UTF-8 storage specification.
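Read literally, that rule could be sketched as follows. This is Python 3 syntax and only an illustration: `decode_zip_name` and `UTF8_FLAG` are made-up names, and the sketch assumes you already have the raw name bytes plus the entry's general purpose flags (which `zipfile.ZipInfo` exposes as `flag_bits`).

```python
UTF8_FLAG = 0x800  # general purpose bit 11, i.e. 1 << 11


def decode_zip_name(raw_name, flag_bits):
    """Decode a raw ZIP entry name: UTF-8 if bit 11 is set, otherwise cp437."""
    encoding = "utf-8" if flag_bits & UTF8_FLAG else "cp437"
    return raw_name.decode(encoding)


# Bit 11 clear: the byte 0x82 is interpreted as cp437.
print(decode_zip_name(b"Akval\x82ir", 0))
# Bit 11 set: the same letter arrives as the two-byte UTF-8 sequence C3 A9.
print(decode_zip_name(b"Akval\xc3\xa9ir", UTF8_FLAG))
```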
Testing cp437 gives me the same answer as the Canadian code page:
>>> print unicode(fileinfo.filename, "cp437").encode("utf8")
Akvaléir, La police - The Font - Fr - En.pdf
>>>
I don't have a Unicode-encoded zip file and I'm not going to create one to find out, so I'll just assume that all zip files are cp437-encoded.
import shutil
import zipfile

f = zipfile.ZipFile('akvaleir.zip', 'r')
for fileinfo in f.infolist():
    # The stored name is a byte string; decode it as cp437 (assumed, see above)
    filename = unicode(fileinfo.filename, "cp437")
    outputfile = open(filename, "wb")
    shutil.copyfileobj(f.open(fileinfo.filename), outputfile)
    outputfile.close()
On my Mac, that gives
109936 Nov 27 01:46 Akvale??ir_Normal_v2007.ttf
 25244 Nov 27 01:46 Akvale??ir, La police - The Font - Fr - En.pdf
which tab-completes to
ls Akvale\314\201ir
and shows up with a proper “é” in my file browser.
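The `e??` in the listing and the `\314\201` in the tab completion look like the accent being stored as a separate combining character, which is what a decomposed (NFD) file name would produce. I haven't verified that this is what the Mac file system did here, so treat the following Python 3 sketch as an assumption; it only shows that the bytes match.

```python
import unicodedata

composed = "\u00e9"                                  # "é" as a single code point
decomposed = unicodedata.normalize("NFD", composed)  # "e" followed by U+0301

utf8_bytes = decomposed.encode("utf-8")

# Render each byte the way the shell did: ASCII as-is, the rest as backslash + octal.
shell_style = "".join(
    chr(b) if b < 0x80 else "\\%03o" % b for b in utf8_bytes
)
print(shell_style)
```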