Sorting an array with Collator

I have an array of words in French, ['États-Unis', 'Espagne', etc.], which I would like to sort alphabetically according to my locale (fr_FR).

I am using the following code:

    $collator = new Collator('fr-FR');
    echo $collator->getErrorMessage();
    $collator->asort($array);

but I get the error U_USING_DEFAULT_WARNING, from which I assume that English or some other language is being used. More importantly, the array is not sorted correctly: 'États-Unis' appears before 'Espagne', and I would expect the opposite.

I have the intl package installed, and my system (Ubuntu) has the corresponding locales:

    $ locale -a
    C
    C.UTF-8
    en_US.utf8
    es_ES.utf8
    fr_FR
    fr_FR.iso88591
    fr_FR.utf8
    POSIX

I tried different locale strings when creating the Collator object ("fr-FR", "fr-FR.UTF8", etc.) without any good results.

Is there anything else I'm missing?

3 answers

According to this blog post, for the words cote, coté, côte and côté (listed here in English sort order), the sort order in French is: cote, côte, coté and côté. The code below sorts the words using French collation:

    $words = array('cote', 'coté', 'côte', 'côté');
    print_r($words);

    $collator = new Collator('fr_FR');

    // print info about locale
    echo 'French Collation ' . (($collator->getAttribute(Collator::FRENCH_COLLATION) == Collator::ON) ? 'On' : 'Off') . "\n";
    echo $collator->getLocale(Locale::VALID_LOCALE) . "\n";
    echo $collator->getLocale(Locale::ACTUAL_LOCALE) . "\n";

    $collator->asort($words);
    print_r($words);

And the printed result is as follows:

    Array
    (
        [0] => cote
        [1] => coté
        [2] => côte
        [3] => côté
    )
    French Collation On
    fr_FR
    fr
    Array
    (
        [0] => cote
        [2] => côte
        [1] => coté
        [3] => côté
    )

In the same blog post, the author says:

[...] diacritics are evaluated from right to left, and not from left to right. Thus, côte comes before coté, and not after it as in languages such as English, which evaluate them from left to right. This is because côte has no ACUTE on the "e" at the end of the word, while coté does. In English and most other languages, evaluation begins on the left, and therefore the CIRCUMFLEX on the "o", or the lack of it, is the controlling factor when ordering.

So, if you have an array with the words Spain and the USA , they will have the same order in English and French.
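The right-to-left rule quoted above can be checked by toggling the attribute by hand. This is only a minimal sketch: it assumes the intl extension is loaded and the file is saved as UTF-8, and it sets Collator::FRENCH_COLLATION explicitly because the default for 'fr_FR' seems to depend on the ICU data that PHP was built against.

```php
<?php
// Requires the intl extension; the source file must be saved as UTF-8.
$words = ['cote', 'coté', 'côte', 'côté'];

$collator = new Collator('fr_FR');

// Accents compared left to right (the behaviour described for English).
$collator->setAttribute(Collator::FRENCH_COLLATION, Collator::OFF);
$leftToRight = $words;
$collator->sort($leftToRight);

// Accents compared right to left (French collation).
$collator->setAttribute(Collator::FRENCH_COLLATION, Collator::ON);
$rightToLeft = $words;
$collator->sort($rightToLeft);

echo implode(', ', $leftToRight), "\n"; // cote, coté, côte, côté
echo implode(', ', $rightToLeft), "\n"; // cote, côte, coté, côté
```

Setting the attribute explicitly makes the result independent of whichever default the installed locale data happens to carry.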

You should also keep in mind that the asort method preserves the association between array keys and values, whereas sort reindexes the array. See the difference:

    asort:
    Array
    (
        [0] => cote
        [2] => côte
        [1] => coté
        [3] => côté
    )
    sort:
    Array
    (
        [0] => cote
        [1] => côte
        [2] => coté
        [3] => côté
    )
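A short sketch of the key difference between the two methods, assuming the intl extension is loaded and the file is UTF-8. French collation is switched on explicitly here so that the order matches the output above regardless of the locale data in use.

```php
<?php
// Requires the intl extension; the source file must be saved as UTF-8.
$words = ['cote', 'coté', 'côte', 'côté'];

$collator = new Collator('fr_FR');
$collator->setAttribute(Collator::FRENCH_COLLATION, Collator::ON);

// asort() keeps each value attached to its original key.
$withKeys = $words;
$collator->asort($withKeys);

// sort() discards the keys and renumbers from 0.
$reindexed = $words;
$collator->sort($reindexed);

print_r(array_keys($withKeys));  // keys in sorted order: 0, 2, 1, 3
print_r(array_keys($reindexed)); // keys: 0, 1, 2, 3
```

If the original keys carry no meaning, sort() (or array_values() on the asort() result) gives a plain list that is easier to iterate by index.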

About U_USING_DEFAULT_WARNING

According to this API documentation :

U_USING_DEFAULT_WARNING indicates that the default locale data was used; neither the requested locale nor any of its fallback locales were found.

When I use the fr_FR locale, for example, I get U_USING_FALLBACK_WARNING, which indicates that a fallback locale was used, in this case fr.
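To tell the two warnings apart in code, the error code stored on the collator can be compared with the intl constants. This is a rough sketch: describeCollatorStatus() is a hypothetical helper introduced only for illustration, and it assumes the warning raised by the constructor is still readable through getErrorCode(), as the getErrorMessage() call in the question suggests.

```php
<?php
// Requires the intl extension.
// describeCollatorStatus() is a hypothetical helper, not part of intl.
function describeCollatorStatus(Collator $collator): string
{
    switch ($collator->getErrorCode()) {
        case U_ZERO_ERROR:
            return 'exact';     // data for the requested locale was found
        case U_USING_FALLBACK_WARNING:
            return 'fallback';  // a parent locale such as "fr" was used
        case U_USING_DEFAULT_WARNING:
            return 'default';   // only the default (root) data was used
        default:
            return 'other: ' . $collator->getErrorMessage();
    }
}

foreach (['fr_FR', 'fr', 'en_US'] as $locale) {
    $collator = new Collator($locale);
    printf(
        "%-6s %-9s valid=%s actual=%s\n",
        $locale,
        describeCollatorStatus($collator),
        $collator->getLocale(Locale::VALID_LOCALE),
        $collator->getLocale(Locale::ACTUAL_LOCALE)
    );
}
```

The printed status depends on the locale data available on the machine, so no particular output is shown here.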

Locale

It seems that your machine does not support French (or it does, but PHP somehow cannot use it and falls back to the default locale), even though the locale -a command lists the French locales. Here are some suggestions you can try.

First, list all supported locales:

 cat /usr/share/i18n/SUPPORTED 

Now generate the locales you need:

    sudo locale-gen fr_FR.UTF-8
    sudo locale-gen fr_FR.ISO-8859-1
    sudo dpkg-reconfigure locales

If this does not work, try installing the language-pack-fr and language-support-fr packages and generate the locales again.
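To see whether PHP itself can find French data after these steps, it may help to ask the intl extension directly. As far as I know, Collator takes its data from the ICU library that PHP is linked against rather than from the locales listed by locale -a, so the two lists can differ. A minimal sketch, assuming the intl extension is loaded:

```php
<?php
// Requires the intl extension.
echo 'ICU version: ', INTL_ICU_VERSION, "\n";

// Locales known to the ICU data bundled with this PHP build.
$icuLocales = ResourceBundle::getLocales('');

if ($icuLocales === false) {
    echo "Could not read the ICU locale list\n";
} else {
    // Keep "fr" and its regional variants such as "fr_FR" or "fr_CA".
    $french = array_values(array_filter($icuLocales, function ($name) {
        return $name === 'fr' || strpos($name, 'fr_') === 0;
    }));
    echo 'French locales known to ICU: ', implode(', ', $french), "\n";
}
```

An empty list here would point at the ICU data rather than at the system locales.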

This problem is odd. I have a virtual machine with Ubuntu 11.04 and PHP 5.3.8 where this works fine, and it also works on my Debian 6, and I did not install any extra package or configure anything.

I am using cygwin:

    $ locale -a | grep fr_FR
    fr_FR
    fr_FR.utf8
    fr_FR@euro

(note that fr_FR.iso88591 is not in the output)

Code (file encoding - UTF-8):

    $collator = new Collator('fr_FR');
    var_dump($collator->getErrorMessage());

    // FRENCH_COLLATION is OFF
    $arr = array('États-Unis', 'Espagne');
    var_dump($collator->getAttribute(Collator::FRENCH_COLLATION) == Collator::ON);
    var_dump($collator->getLocale(Locale::VALID_LOCALE));
    var_dump($collator->getLocale(Locale::ACTUAL_LOCALE));
    $collator->asort($arr);
    var_dump($arr);

    // FRENCH_COLLATION is ON
    $collator->setAttribute(Collator::FRENCH_COLLATION, Collator::ON);
    $arr = array('États-Unis', 'Espagne');
    var_dump($collator->getAttribute(Collator::FRENCH_COLLATION) == Collator::ON);
    var_dump($collator->getLocale(Locale::VALID_LOCALE));
    var_dump($collator->getLocale(Locale::ACTUAL_LOCALE));
    $collator->asort($arr);
    var_dump($arr);

Output:

    string(23) "U_USING_DEFAULT_WARNING"
    bool(false)
    string(5) "fr_FR"
    string(4) "root"
    array(2) {
      [1]=>
      string(7) "Espagne"
      [0]=>
      string(11) "États-Unis"
    }
    bool(true)
    string(5) "fr_FR"
    string(4) "root"
    array(2) {
      [1]=>
      string(7) "Espagne"
      [0]=>
      string(11) "États-Unis"
    }

And here's the trick: I convert the file encoding to ISO 8859-1 (in vim, I do :set fileencoding=iso-8859-1 ) and try again:

    string(23) "U_USING_DEFAULT_WARNING"
    bool(false)
    string(5) "fr_FR"
    string(4) "root"
    array(2) {
      [0]=>
      string(10) "▒tats-Unis"
      [1]=>
      string(7) "Espagne"
    }
    bool(true)
    string(5) "fr_FR"
    string(4) "root"
    array(2) {
      [0]=>
      string(10) "▒tats-Unis"
      [1]=>
      string(7) "Espagne"
    }

Some characters are broken, but I think that is because my terminal does not support this code page. The main point is that the order is now the one you described: "Espagne" appears after "États-Unis".

So I think this is a file encoding issue.
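One way to check which encoding the strings are really in before sorting them is shown below. This is a sketch under two assumptions: the mbstring extension is available, and Collator expects UTF-8 input, which is my understanding of the intl extension. The ISO 8859-1 bytes are written as an escape sequence so that the example does not depend on how the file itself is saved.

```php
<?php
// Requires the mbstring extension.
// 'États-Unis' as ISO 8859-1 bytes: É is the single byte 0xC9.
$latin1 = "\xC9tats-Unis";

// A lone 0xC9 followed by "t" is not a valid UTF-8 sequence.
$isUtf8 = mb_check_encoding($latin1, 'UTF-8');

// Convert to UTF-8, where É becomes the two bytes 0xC3 0x89.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

var_dump($isUtf8);          // bool(false)
var_dump(strlen($latin1));  // int(10)
var_dump(strlen($utf8));    // int(11)
```

The lengths 10 and 11 match the string(10) and string(11) values in the two outputs above, which suggests the second run was sorting bytes that are not valid UTF-8.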


Try just "FR", it should work for your system, I think:

 $collator = new Collator('FR'); 