Sorting a list of tuples by locale (Swedish order)

Apparently, PostgreSQL 8.4 on Ubuntu 10.04 cannot handle the updated sorting rules for W and V in the Swedish alphabet. That is, it still orders them as if they were the same letter (old definition of Swedish order):

  • Wa
  • Vb
  • Wc
  • Vd

it should be (new definition for Swedish order):

  • Vb
  • Vd
  • Wa
  • Wc

I need to get this right for the Python/Django site I am building. I have tried various ways of sorting a list of tuples created from a Django QuerySet using values_list. The Swedish letters å, ä and ö must also be ordered correctly. So far I can get either one or the other right, but not both.

    import locale

    list_of_tuples = [(u'Wa', 1), (u'Vb', 2), (u'Wc', 3), (u'Vd', 4),
                      (u'Öa', 5), (u'äa', 6), (u'Åa', 7)]

    print '########## Ordering One ##############'
    # Sort by the lowercased name, character by character (code point order)
    ordered_list_one = sorted(list_of_tuples, key=lambda t: tuple(t[0].lower()))
    for item in ordered_list_one:
        print item[0]

    print '########## Ordering Two ##############'
    # Sort plain names with the collation rules of the Swedish locale
    locale.setlocale(locale.LC_ALL, "sv_SE.utf8")
    list_of_names = [u'Wa', u'Vb', u'Wc', u'Vd', u'Öa', u'äa', u'Åa']
    ordered_list_two = sorted(list_of_names, cmp=locale.strcoll)
    for item in ordered_list_two:
        print item

Examples give:

    ########## Ordering One ##############
    Vb
    Vd
    Wa
    Wc
    äa
    Åa
    Öa
    ########## Ordering Two ##############
    Wa
    Vb
    Wc
    Vd
    Åa
    äa
    Öa

What I want is a combination of the two, so that both V/W and å, ä, ö are ordered correctly. More precisely, I want Ordering One to respect the locale. Then, using the second element (the object id) of each tuple, I could fetch the correct object in Django.

I am starting to doubt that this is possible. Would upgrading PostgreSQL to a newer version that handles collation better, and then using raw SQL in Django, solve it?

2 answers

Running LC_ALL=sv_SE.UTF-8 sort on your example on Ubuntu 10.04 outputs Wa before Vb (the "old way"), so Ubuntu does not seem to agree with the "new way". Since PostgreSQL relies on the operating system for collation, it will behave exactly like the OS, given the same lc_collate.

In fact, there is a Debian glibc patch related to this particular sorting problem: http://sourceware.org/bugzilla/show_bug.cgi?id=9724 However, it was objected to and not accepted. If you only need this behavior on a system you administer, you can apply the change from the patch to /usr/share/i18n/locales/sv_SE and rebuild the sv_SE locale by running locale-gen sv_SE.UTF-8. Or better yet, create your own alternative locale derived from it, to avoid tampering with the original.
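If changing the system locale is not an option, one possible workaround is to sort in Python with an explicit alphabet instead of relying on the OS collation. The following is only a minimal sketch under stated assumptions: the names SWEDISH_ALPHABET, swedish_key and sort_tuples_swedish are hypothetical, the alphabet string encodes the "new" order with V and W as separate letters followed by å, ä, ö, and case is ignored. It does not handle other accented letters (such as é) the way a real locale definition would.

```python
# Assumed target order: a-z with V and W as distinct letters, then å, ä, ö.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"
RANK = {ch: i for i, ch in enumerate(SWEDISH_ALPHABET)}


def swedish_key(text):
    """Build a sort key that ranks each character by its alphabet position."""
    # Characters outside the alphabet sort after all letters, by code point.
    return [(0, RANK[ch]) if ch in RANK else (1, ord(ch)) for ch in text.lower()]


def sort_tuples_swedish(pairs):
    """Sort (name, id) tuples by name; each id stays attached to its name."""
    return sorted(pairs, key=lambda pair: swedish_key(pair[0]))


list_of_tuples = [('Wa', 1), ('Vb', 2), ('Wc', 3), ('Vd', 4),
                  ('Öa', 5), ('äa', 6), ('Åa', 7)]

if __name__ == "__main__":
    for name, object_id in sort_tuples_swedish(list_of_tuples):
        print(name, object_id)
```

Because the key is computed from the first element only, the object id in each tuple is carried along unchanged and can still be used to look up the Django object afterwards.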


This solution is roundabout because key=locale.strxfrm works directly on flat lists and on dictionary keys, but not on lists of lists or lists of tuples.

Changes from Python 2 to Python 3: use locale.setlocale(locale.LC_ALL, '') and key=locale.strxfrm (instead of cmp=locale.strcoll).

    import locale

    list_of_tuples = [('Wa', 1), ('Vb', 2), ('Wc', 3), ('Vd', 4),
                      ('Öa', 5), ('äa', 6), ('Åa', 7)]

    def locTupSorter(uLot):
        """Locale-wise list of tuples sorter - works with most European languages"""
        locale.setlocale(locale.LC_ALL, '')  # use the current user locale
        dicTups = dict(uLot)  # list of tuples to dictionary (duplicate names collapse)
        ssList = sorted(dicTups, key=locale.strxfrm)  # names sorted by locale
        # rebuild the sorted list of tuples from the sorted names
        return [(elem, dicTups[elem]) for elem in ssList]

    print("list_of_tuples=\n", list_of_tuples)
    sortedLot = locTupSorter(list_of_tuples)
    print("sorted list of tuples=\n", sortedLot)
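As a possible simplification, the dictionary round trip can be avoided by wrapping locale.strxfrm in a key function that looks only at the first element of each tuple. This is a sketch rather than a definitive replacement: the function name sort_tuples_by_locale and its locale_name parameter are hypothetical, and the resulting order is still whatever the installed OS locale defines, so V and W stay interleaved on systems that ship the old sv_SE definition.

```python
import locale


def sort_tuples_by_locale(pairs, locale_name=""):
    """Sort (name, id) tuples by name using the collation of the given locale.

    An empty locale_name selects the user's default locale.
    """
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        # The requested locale is not installed; keep the current collation.
        pass
    # strxfrm turns each name into a string that compares in locale order;
    # sorted() is stable, so tuples with equal names keep their input order.
    return sorted(pairs, key=lambda pair: locale.strxfrm(pair[0]))


if __name__ == "__main__":
    list_of_tuples = [('Wa', 1), ('Vb', 2), ('Wc', 3), ('Vd', 4),
                      ('Öa', 5), ('äa', 6), ('Åa', 7)]
    print(sort_tuples_by_locale(list_of_tuples))
```

Unlike the dictionary-based version, this keeps tuples that share the same name. Note that locale.setlocale changes process-wide state, so in a web application it is usually safer to call it once at startup than inside a request.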