LINQ OrderBy on a generic list returns an incorrectly ordered alphabetical list

I am trying to sort a generic list of objects by their Name property. I am using LINQ, and the following expression does not work:

    var query = possibleWords.OrderBy(x => x.Name.ToLower()).ToList();
    foreach (Word word in query) // possibleWords.OrderBy(word => word.Name))
    {
        listWords.Items.Add(word.Name);
    }

If I understand it correctly, "query" should now contain the elements in sorted order, and each element should then be added to the list box named listWords.

However, the result is as follows:

http://screencast.com/t/s1CkkWfXD4 (sorry for the URL link, but SO somehow blocked me from my account and I apparently can't post images with this new one).

The list is almost alphabetical, but not quite. For some reason, "aa" and "aaaa" come last. What could be the reason, and how can I solve it?

Thanks in advance.

Elaboration upon request

Entering this code in Visual Studio and running it:

    List<Word> words = new List<Word>();
    words.Add(new Word("a"));
    words.Add(new Word("Calculator"));
    words.Add(new Word("aaa"));
    words.Add(new Word("Projects"));
    words.Add(new Word("aa"));
    words.Add(new Word("bb"));
    words.Add(new Word("c"));

    IEnumerable<Word> query = words.OrderBy(x => x.Name.ToLower()).ToList();
    foreach (Word word in query)
    {
        Console.WriteLine(word.Name);
    }

Gives me the following result:

    a
    bb
    c
    Calculator
    ccc
    Projects
    aa
    aaa

This is not sorted correctly: the first "a" is correct, but subsequent entries "aa" and "aaa" are sent to the bottom of the list.

I'm not too good with character sets and encodings, so maybe I'm making a rookie mistake. But if so, I don't understand what it could be, and I am puzzled as to why the first "a" is ordered correctly while "aa" and "aaa" are not!

Further elaboration - Word class

    [Serializable()]
    public class Word
    {
        [System.Xml.Serialization.XmlAttribute("Name")]
        public string Name { get; set; }

        public Word(string name)
        {
            Name = name;
        }

        // Parameterless constructor, necessary for serialization
        public Word() { }
    }

Reason and Resolution

As @Douglas suggested, the problem was resolved by passing StringComparer.InvariantCultureIgnoreCase as the comparer to the OrderBy method.

After further research, it seems that the FindAll and OrderBy methods (and possibly others) have problems when the Danish culture (da-DK) is in use. Other methods or cultures may be affected too, but with da-DK, FindAll and OrderBy definitely do not work as expected.

The OrderBy method has the problem described in this thread (incorrect ordering). The FindAll method has a similar, very strange problem: suppose we have a list with the entries a, aa, aaa and aaaa. FindAll(x => x.StartsWith("a")) returns only "a", NOT aa, aaa and aaaa. StartsWith("aa") correctly finds aa as well as aaa and aaaa. StartsWith("aaa") again fails to find aaaa and returns only aaa! This looks like a bug in the framework.
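The single-argument StartsWith(string) overload compares using the current culture, which would be consistent with the FindAll results above if da-DK treats "aa" as a single letter. The following is a minimal sketch, not a confirmed diagnosis: PrefixFilter and StartingWith are hypothetical names, and it assumes a plain character-by-character prefix match is what is wanted.

```csharp
using System;
using System.Collections.Generic;

public static class PrefixFilter
{
    // Hypothetical helper: returns the entries that begin with the prefix,
    // comparing UTF-16 code units only, so the current culture is ignored.
    public static List<string> StartingWith(List<string> entries, string prefix)
    {
        return entries.FindAll(x => x.StartsWith(prefix, StringComparison.Ordinal));
    }

    public static void Main()
    {
        var entries = new List<string> { "a", "aa", "aaa", "aaaa", "bb" };

        // Prints a, aa, aaa and aaaa, one per line, whatever the thread culture is.
        foreach (string entry in StartingWith(entries, "a"))
        {
            Console.WriteLine(entry);
        }
    }
}
```

StringComparison.OrdinalIgnoreCase can be passed instead if the match should also ignore case.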

+7
3 answers

Could you try replacing:

 IEnumerable<Word> query = words.OrderBy(x => x.Name.ToLower()).ToList(); 

... with:

 IEnumerable<Word> query = words.OrderBy(x => x.Name, StringComparer.InvariantCultureIgnoreCase); 

There is a small chance that this is a strange culture issue.
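As a minimal, self-contained sketch of that suggestion, here it is applied to plain strings rather than Word objects. NameSorting and SortIgnoringCulture are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NameSorting
{
    // Hypothetical helper: sorts names case-insensitively with the invariant
    // culture, so the result does not depend on the thread's current culture.
    public static List<string> SortIgnoringCulture(IEnumerable<string> names)
    {
        return names.OrderBy(n => n, StringComparer.InvariantCultureIgnoreCase).ToList();
    }

    public static void Main()
    {
        var names = new List<string> { "a", "Calculator", "aaa", "Projects", "aa", "bb", "c" };

        // Prints: a aa aaa bb c Calculator Projects
        Console.WriteLine(string.Join(" ", SortIgnoringCulture(names)));
    }
}
```

Passing the comparer also makes the ToLower() call unnecessary, since case is ignored by the comparer itself.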

+6

The following code displays the expected result:

    using System;
    using System.Collections.Generic;
    using System.Linq;

    class Word
    {
        public Word(string str)
        {
            Name = str;
        }

        public string Name { get; private set; }
    }

    class Program
    {
        public static void Main(string[] args)
        {
            List<Word> words = new List<Word>();
            words.Add(new Word("a"));
            words.Add(new Word("Calculator"));
            words.Add(new Word("aaa"));
            words.Add(new Word("Projects"));
            words.Add(new Word("aa"));
            words.Add(new Word("bb"));
            words.Add(new Word("c"));

            // Sorts with the current culture's comparison rules
            IEnumerable<Word> query = words.OrderBy(x => x.Name.ToLower()).ToList();
            foreach (Word word in query)
            {
                Console.WriteLine(word.Name);
            }
        }
    }

Outputs:

    a
    aa
    aaa
    bb
    c
    Calculator
    Projects

Update: OK, the mystery is solved (sort of). If you run the following before the code:

    var cultureInfo = new CultureInfo("da-DK");
    Thread.CurrentThread.CurrentCulture = cultureInfo;
    Thread.CurrentThread.CurrentUICulture = cultureInfo;

You get the "wrong" output:

    a
    bb
    c
    Calculator
    Projects
    aa
    aaa

Danish lexicographical comparison rules are apparently different. Here is the explanation I found on the net (http://stackoverflow.com/questions/4064633/string-comparison-in-java):

Note that this is very dependent on the active locale. For example, here in Denmark we have the character "å", which used to be written as "aa" and is very distinct from two single a's. Hence the Danish sorting rules treat two consecutive a's identically to "å", which means that it goes after z. This also means that Danish dictionaries are sorted differently from English or Swedish ones.
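The rule described in the quote can be checked without changing the thread culture by passing a CultureInfo to string.Compare. This is only a sketch: DanishCompareDemo, CompareWith and CompareOrdinal are hypothetical names, and the da-DK result depends on the collation data supplied by the runtime and operating system, so no output is claimed for it here.

```csharp
using System;
using System.Globalization;

public static class DanishCompareDemo
{
    // Hypothetical helper: returns -1, 0 or 1 for left versus right under the
    // named culture's rules, ignoring case. An empty name means the invariant culture.
    public static int CompareWith(string cultureName, string left, string right)
    {
        CultureInfo culture = CultureInfo.GetCultureInfo(cultureName);
        return Math.Sign(string.Compare(left, right, culture, CompareOptions.IgnoreCase));
    }

    // Hypothetical helper: compares UTF-16 code units with no linguistic rules.
    public static int CompareOrdinal(string left, string right)
    {
        return Math.Sign(string.CompareOrdinal(left, right));
    }

    public static void Main()
    {
        Console.WriteLine("ordinal:   " + CompareOrdinal("aa", "z"));
        Console.WriteLine("invariant: " + CompareWith("", "aa", "z"));
        try
        {
            // If the runtime applies the Danish rule quoted above, this is 1 ("aa" after "z").
            Console.WriteLine("da-DK:     " + CompareWith("da-DK", "aa", "z"));
        }
        catch (CultureNotFoundException)
        {
            Console.WriteLine("da-DK culture data is not available on this system.");
        }
    }
}
```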

+5

Most likely, your last "a" is a different (non-ASCII) character. Check the character code with (int)("a"[0]) to see whether it matches the English "a".

The sort itself cannot be wrong, so if that is the case there is nothing to fix (except perhaps understanding your data better).
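A quick way to run that check on whole strings is to print every character's code. This is a minimal sketch, with CodePointDump, Describe and IsAscii as hypothetical names, and with the Cyrillic "а" (U+0430) as one example of a look-alike character.

```csharp
using System;
using System.Linq;

public static class CodePointDump
{
    // Hypothetical helper: lists each UTF-16 code unit of the text as U+XXXX.
    public static string Describe(string text)
    {
        return string.Join(" ", text.Select(c => "U+" + ((int)c).ToString("X4")));
    }

    // Hypothetical helper: true when every character is in the ASCII range.
    public static bool IsAscii(string text)
    {
        return text.All(c => c <= 0x7F);
    }

    public static void Main()
    {
        Console.WriteLine(Describe("aa"));       // U+0061 U+0061
        Console.WriteLine(Describe("\u0430a"));  // U+0430 U+0061 (Cyrillic "а", then Latin "a")
        Console.WriteLine(IsAscii("\u0430a"));   // False
    }
}
```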

+2