Sort a string collection in Python using various locale settings

I want to sort a list of strings according to the user's language preference. I have a multilingual Python web app; what is the correct way to sort strings this way?

I know that I can configure the locale, for example:

import locale
locale.setlocale(locale.LC_ALL, '')

But this should be done once, when the application starts (and the docs say it is not thread-safe!). Is it advisable to set it in each thread according to the current user's (request's) setting?

I would like something like the locale.strcoll(...) function with an additional parameter: the language to use for sorting.

+4
4 answers

Another possible solution is to use an SQL server with good language support (unfortunately, sqlite is not an option). Then I can put all the data into a temporary in-memory table and SELECT it with ORDER BY. IMO, this should be a better solution than trying to distribute locale settings across multiple processes, as another answer recommends.

0

I would recommend PyICU, the Python bindings for IBM's rich open-source ICU internationalization library. You create a Collator object, for example with:

  collator = PyICU.Collator.createInstance(PyICU.Locale.getFrance()) 

and then you can sort, for example, a list of UTF-8 encoded strings according to the rules for the French language, using thelist.sort(cmp=collator.compare) (in Python 3, where sort() no longer accepts cmp, use thelist.sort(key=collator.getSortKey) instead).

The only problem I ran into was that I could not find a good prepackaged, ready-to-use build of PyICU plus ICU for Mac OS X. I ended up building and installing from source: ICU's own sources, 3.6, from here (there are binaries for Windows and several Unix flavors, but not for the Mac), and PyICU 0.8.1 from here.

Apart from these build/install issues and the somewhat meager documentation for the Python bindings, ICU is a real find if you do a significant amount of i18n-related work, and PyICU is a very useful set of bindings to it!

+4

You will want the latest possible ICU underneath your PyICU to get the best and most up-to-date data.

+1

Given the warnings in the documentation, it seems you are on your own if you try to set the locale in different threads.
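For what it is worth, a minimal sketch of the "on your own" route might serialize every locale switch behind a single lock, so that only one thread changes and uses the collation locale at a time. The helper name sorted_for_locale is hypothetical, and the locale name passed in is assumed to be installed on the system. This only protects code that goes through the helper: any other thread calling locale-dependent functions at the same moment would still see the temporary setting.

```python
import locale
import threading

# One process-wide lock: setlocale() changes global state, so only one
# thread may switch and use the collation locale at a time.
_collate_lock = threading.Lock()


def sorted_for_locale(strings, locale_name):
    """Return strings sorted by the collation rules of locale_name.

    Hypothetical helper. locale_name must be a locale installed on the
    system (for example "de_DE.UTF-8"); otherwise locale.Error is raised.
    """
    with _collate_lock:
        # Calling setlocale() without a value only queries the current setting.
        previous = locale.setlocale(locale.LC_COLLATE)
        try:
            locale.setlocale(locale.LC_COLLATE, locale_name)
            # strxfrm() turns each string into a key that compares the
            # same way strcoll() would under the active locale.
            return sorted(strings, key=locale.strxfrm)
        finally:
            # Always restore the previous collation locale.
            locale.setlocale(locale.LC_COLLATE, previous)
```

A request handler would then call something like sorted_for_locale(names, user_locale), with user_locale taken from the request. All sorting is serialized by the lock, so this trades concurrency for safety.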

If you can split the problem into one thread per locale, could you also split it into one subprocess for each language using Python 2.6 multiprocessing?

It seems that anything that makes this possible will be a hack; you could even consider calling the sort(1) command-line tool with a different LC_ALL for each language.
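A rough sketch of the sort(1) hack, assuming a Unix-like system with sort on the PATH and Python 3.7 or later. The helper name sort_with_external_sort is hypothetical, and the strings are assumed not to contain newlines, since sort(1) treats each line as one record. The locale is set only in the child process's environment, so the Python process's own locale is never touched.

```python
import os
import subprocess


def sort_with_external_sort(strings, locale_name):
    """Sort strings by piping them through sort(1) under locale_name.

    Hypothetical helper; assumes no string contains a newline.
    """
    # Copy the current environment and override LC_ALL for the child only.
    env = dict(os.environ, LC_ALL=locale_name)
    result = subprocess.run(
        ["sort"],
        input="".join(s + "\n" for s in strings),
        env=env,
        capture_output=True,
        encoding="utf-8",
        check=True,  # raise CalledProcessError if sort exits non-zero
    )
    # Every output line ends with "\n", so drop the empty trailing piece.
    return result.stdout.split("\n")[:-1]
```

Note that a locale name the system does not know may silently fall back to plain byte order instead of raising an error, so it is worth checking that the locales you need are actually installed.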

0
