I would recommend PyICU, the Python bindings for IBM's rich open-source ICU internationalization library. You create a Collator object, e.g. with:
collator = PyICU.Collator.createInstance(PyICU.Locale.getFrance())
and then you can sort, e.g., a list of UTF-8 encoded strings according to the rules of the French language, e.g. with thelist.sort(cmp=collator.compare).
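Note that the cmp= argument to sort exists only in Python 2. As a minimal sketch of the same idea on Python 3, the comparator can be wrapped with functools.cmp_to_key. This assumes a recent PyICU release, where the module is imported as icu rather than PyICU; the word list is made up for illustration, and the except branch is only a crude accent-stripping stand-in (not ICU collation) so that the snippet still runs where PyICU is not installed:

```python
import functools
import unicodedata

try:
    # Assumption: a recent PyICU, which exposes the module as `icu`.
    import icu

    collator = icu.Collator.createInstance(icu.Locale.getFrance())
    compare = collator.compare
except ImportError:
    # Hypothetical stand-in, NOT real ICU collation: compare strings by
    # their accent-stripped, lower-cased form, then by the original text.
    def compare(a, b):
        def base(s):
            decomposed = unicodedata.normalize("NFD", s)
            return "".join(
                c for c in decomposed if not unicodedata.combining(c)
            ).lower()

        ka, kb = (base(a), a), (base(b), b)
        return (ka > kb) - (ka < kb)

# Illustrative data; a plain code-point sort would put "école" after "zèbre".
words = ["zèbre", "table", "école", "eau"]

# Python 3 replacement for thelist.sort(cmp=collator.compare)
words.sort(key=functools.cmp_to_key(compare))
print(words)
```

With PyICU available, key=collator.getSortKey should give the same ordering and avoids calling the comparator repeatedly on long lists.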
The only problem I encountered was that I could not find a well-packaged, ready-to-use version of PyICU plus ICU for Mac OS X, so I ended up building and installing both from source: ICU's own sources, 3.6, from here (there are binaries for Windows and several versions of Unix, but not for Mac), and PyICU 0.8.1 from here.
Apart from these build/install issues and the rather scant documentation for the Python bindings, ICU is really a find if you are doing a significant amount of i18n-related work, and PyICU is a very useful set of bindings to it!