My best guess is that the file name itself is not encoded in UTF-8, or at least scandir() doesn't return it that way.
Maybe mb_detect_encoding() can shed some light?
var_dump(mb_detect_encoding($filename));
If not, try guessing the encoding (CP1252 or ISO-8859-1 would be my first guess), convert the name to UTF-8, and see whether the output looks right:
var_dump(mb_convert_encoding($filename, 'UTF-8', 'Windows-1252'));
var_dump(mb_convert_encoding($filename, 'UTF-8', 'ISO-8859-1'));
var_dump(mb_convert_encoding($filename, 'UTF-8', 'ISO-8859-15'));
Or using iconv():
var_dump(iconv('WINDOWS-1252', 'UTF-8', $filename));
var_dump(iconv('ISO-8859-1', 'UTF-8', $filename));
var_dump(iconv('ISO-8859-15', 'UTF-8', $filename));
Then, once you have figured out which encoding is actually used, your code should look something like this (assuming CP1252):
$filename = htmlentities(mb_convert_encoding($filename, 'UTF-8', 'Windows-1252'), ENT_QUOTES, 'UTF-8');
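If the directory might contain a mix of names that are already UTF-8 and names in the legacy encoding, converting everything blindly would double-encode the UTF-8 ones. A minimal sketch of one way to guard against that, assuming CP1252 is the fallback; the helper name filename_to_utf8() is hypothetical and not part of PHP or mbstring:

```php
<?php
// Hypothetical helper (not a built-in): return $name as UTF-8.
// Assumption: any name that is not valid UTF-8 is encoded in $fallback.
function filename_to_utf8(string $name, string $fallback = 'Windows-1252'): string
{
    // Plain ASCII and correctly encoded UTF-8 both pass this check,
    // so such names are returned unchanged.
    if (mb_check_encoding($name, 'UTF-8')) {
        return $name;
    }

    // Otherwise treat the bytes as the fallback encoding and convert.
    return mb_convert_encoding($name, 'UTF-8', $fallback);
}
```

You would then call htmlentities(filename_to_utf8($filename), ENT_QUOTES, 'UTF-8') for each entry returned by scandir(). The check is only a heuristic: a short legacy-encoded name can happen to be valid UTF-8, so it is worth eyeballing the results on your real file names.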
Jasper N. Brouwer