So this seems to be a bug in fgetcsv.
Now I process the CSV data myself (a little cumbersome), but it works, and I have no encoding problems at all.
This is a (not yet optimized) version of what I am doing:
$rawCSV = file_get_contents($csvfile);
// Split on line breaks in all operating systems: http://stackoverflow.com/a/7498886/797194
$lines = preg_split('/$\R?^/m', $rawCSV);
$words = array(); // collects one array of values per CSV line
foreach ($lines as $line) {
    array_push($words, getCSVValues($line));
}
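The line-splitting step can be tried on its own. The following is only a minimal sketch: it swaps in the simpler pattern `/\R/` rather than the pattern used above, it drops blank lines (which the snippet above does not), and the helper name `splitLines` and the sample string are made up for illustration.

```php
<?php
// Hypothetical helper (not part of the original answer):
// split raw text into lines on \r\n, \n, or \r.
function splitLines(string $raw): array
{
    // \R matches any line-break sequence, with \r\n treated as one break.
    // PREG_SPLIT_NO_EMPTY drops empty pieces, e.g. after a trailing newline.
    return preg_split('/\R/', $raw, -1, PREG_SPLIT_NO_EMPTY);
}

// Sample input mixing Windows, Unix, and old Mac line endings.
$sample = "a,b\r\nc,d\ne,f\rg,h\n";
print_r(splitLines($sample));
```

Each resulting line can then be handed to getCSVValues as in the loop above.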
getCSVValues comes from here and is needed to handle CSV strings that contain commas inside quoted fields, like this:
"I'm a string, what should I do when I need commas?",Howdy there
The function looks like this:
function getCSVValues($string, $separator=","){
    $elements = explode($separator, $string);
    for ($i = 0; $i < count($elements); $i++) {
        $nquotes = substr_count($elements[$i], '"');
        if ($nquotes %2 == 1) {
            for ($j = $i+1; $j < count($elements); $j++) {
                if (substr_count($elements[$j], '"') %2 == 1) { // Look for an odd-number of quotes
                    // Put the quoted string pieces back together again
                    array_splice($elements, $i, $j-$i+1,
                        implode($separator, array_slice($elements, $i, $j-$i+1)));
                    break;
                }
            }
        }
        if ($nquotes > 0) {
            // Remove first and last quotes, then merge pairs of quotes
            $qstr =& $elements[$i];
            $qstr = substr_replace($qstr, '', strpos($qstr, '"'), 1);
            $qstr = substr_replace($qstr, '', strrpos($qstr, '"'), 1);
            $qstr = str_replace('""', '"', $qstr);
        }
    }
    return $elements;
}
It is quite a workaround, but it seems to work fine.
EDIT:
There is also a known error here: apparently, the behavior depends on the locale settings.
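If the locale is indeed the cause, one thing to try is setting LC_CTYPE to a UTF-8 locale before calling fgetcsv. This is only a sketch under assumptions: the helper name `useUtf8Ctype` and the locale names are made up for illustration, the locales may not be installed on a given system, and I have not verified that this fixes the encoding problem described above.

```php
<?php
// Hypothetical helper (not part of the original answer):
// try a list of UTF-8 locales and report which one, if any, was set.
function useUtf8Ctype(array $candidates = ['en_US.UTF-8', 'C.UTF-8'])
{
    // setlocale() tries each candidate in order and returns the name of
    // the locale it set, or false if none of them is available.
    return setlocale(LC_CTYPE, $candidates);
}

$locale = useUtf8Ctype();
if ($locale === false) {
    echo "No UTF-8 locale available; keep the manual parser.\n";
} else {
    echo "LC_CTYPE set to $locale; fgetcsv can be retried.\n";
}
```

If no suitable locale is available, the manual parsing shown above remains the fallback.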