PHP: Sorting the letters of a UTF-8 string with usort or sort produces unknown characters

I am trying to sort the letters of a UTF-8 string alphabetically. The result contains unknown characters, and I do not know why. The same thing happens with both usort and sort.

setlocale(LC_COLLATE, 'ro_RO.UTF-8');

$word = 'ÎABAȚÂIEȘĂ';
$chars = str_split($word);

echo 'Word: ' . $word . "\n";

//sort($chars, SORT_LOCALE_STRING);

usort($chars, function($a, $b){
    echo 'Comparing: ' . $a . ' and ' . $b . "\n";
    return strcoll($a, $b);
});

echo 'Result: ' . implode($chars) . "\n";

Command line example: http://s18.postimg.org/avqfhetsp/test.gif

1 answer

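The problem is not caused by the comparison or the sorting, but by str_split(). It splits the string into single bytes, while letters such as Î, Ț, Â, Ș and Ă each take two bytes in UTF-8, so every one of them is cut into two invalid fragments that are then sorted and printed separately. Older PHP versions have no multibyte version of this function (mb_str_split() was only added in PHP 7.4), so there you must use mb_split() or preg_split() with the /u modifier for this purpose.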
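As a minimal sketch of that fix, the question's code could be changed as follows. It assumes the script file itself is saved as UTF-8 and that the ro_RO.UTF-8 locale is installed on the system; if the locale is missing, setlocale() returns false and strcoll() will not use Romanian collation. The $localeOk variable is only an illustrative name.

```php
<?php
$word = 'ÎABAȚÂIEȘĂ';

// Split into whole characters instead of bytes. The /u modifier makes PCRE
// treat the subject as UTF-8, and PREG_SPLIT_NO_EMPTY drops the empty
// strings matched at the start and end of the subject.
$chars = preg_split('//u', $word, -1, PREG_SPLIT_NO_EMPTY);

// setlocale() returns false when the requested locale is not available.
$localeOk = setlocale(LC_COLLATE, 'ro_RO.UTF-8') !== false;
if (!$localeOk) {
    echo "Note: ro_RO.UTF-8 is not installed; the sort order may differ.\n";
}

// Each element is now one complete character, so strcoll() compares
// valid UTF-8 strings.
usort($chars, function ($a, $b) {
    return strcoll($a, $b);
});

echo 'Word: ' . $word . "\n";
echo 'Result: ' . implode('', $chars) . "\n";
```

On PHP 7.4 or later with the mbstring extension loaded, mb_str_split($word) can replace the preg_split() call. The exact order of the result depends on the collation data of the installed locale.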
