OrderBy ignoring accented letters

I need a method like OrderBy() that always ignores accents when ordering, treating accented letters as their unaccented equivalents. I already tried to override OrderBy(), but it seems I cannot because it is a static method.

So now I want to create a custom extension method that wraps OrderBy(), for example:

    public static IOrderedEnumerable<TSource> ToOrderBy<TSource, TKey>(
        this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
    {
        if (source == null) return null;
        var seenKeys = new HashSet<TKey>();
        var culture = new CultureInfo("pt-PT");
        return source.OrderBy(element => seenKeys.Add(keySelector(element)),
                              StringComparer.Create(culture, false));
    }

However, I get this error:

Error 2 The type arguments for method 'System.Linq.Enumerable.OrderBy<TSource, TKey>(System.Collections.Generic.IEnumerable<TSource>, System.Func<TSource, TKey>, System.Collections.Generic.IComparer<TKey>)' cannot be inferred from the usage. Try specifying the type arguments explicitly.

The compiler doesn't seem to like StringComparer. How can I solve this?

Note:

I already tried to use RemoveDiacritics() from here, but I do not know how to apply it in this case. So I tried to do something like this, which also seems nice.

2 answers

Solved! I got this error because StringComparer is an IComparer<string>, so the key returned by the lambda passed to OrderBy() must be a string. In my first attempt the lambda returned the bool from seenKeys.Add(), so TKey could not be inferred.

So, since I know the key is a string, I pass a string and use the RemoveDiacritics() method to ignore accents, so that accented letters sort as their unaccented equivalents.

    public static IOrderedEnumerable<TSource> ToOrderBy<TSource, TKey>(
        this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
    {
        if (!source.SafeAny()) return null;
        return source.OrderBy(
            element => Utils.RemoveDiacritics(keySelector(element).ToString()));
    }

To make sure RemoveDiacritics() works correctly, I also call HtmlDecode() on the text first.

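    // Requires: using System.Globalization; using System.Net; using System.Text;
    public static string RemoveDiacritics(string text)
    {
        // Return early on null; calling Normalize() on null would throw.
        if (text == null) return null;
        text = WebUtility.HtmlDecode(text);

        // FormD splits each accented letter into a base letter plus combining marks.
        string formD = text.Normalize(NormalizationForm.FormD);
        StringBuilder sb = new StringBuilder();
        foreach (char ch in formD)
        {
            UnicodeCategory uc = CharUnicodeInfo.GetUnicodeCategory(ch);
            if (uc != UnicodeCategory.NonSpacingMark)
            {
                sb.Append(ch); // keep everything except the combining marks
            }
        }
        return sb.ToString().Normalize(NormalizationForm.FormC);
    }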
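For illustration, here is a minimal, self-contained sketch of the same idea applied to a small list. The names Demo, StripMarks, and the sample data are made up for this example, and the HtmlDecode() step is left out to keep it short; it is not meant as the definitive version of the method above.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text;

static class Demo
{
    // Hypothetical helper: same approach as RemoveDiacritics(), minus HtmlDecode().
    static string StripMarks(string text)
    {
        if (text == null) return null;
        var sb = new StringBuilder();
        foreach (char ch in text.Normalize(NormalizationForm.FormD))
        {
            // Skip the combining marks produced by the FormD decomposition.
            if (CharUnicodeInfo.GetUnicodeCategory(ch) != UnicodeCategory.NonSpacingMark)
                sb.Append(ch);
        }
        return sb.ToString().Normalize(NormalizationForm.FormC);
    }

    static void Main()
    {
        string[] names = { "Zé", "Álvaro", "Ana", "Érica", "Bruno" };

        // The key is a string, so a string comparer can be passed without inference errors.
        var ordered = names.OrderBy(s => StripMarks(s), StringComparer.Ordinal);

        // Order: Álvaro, Ana, Bruno, Érica, Zé
        Console.WriteLine(string.Join(", ", ordered));
    }
}
```

The sort keys are the stripped strings ("Alvaro", "Ana", ...), while the elements returned are the original, accented values.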

OrderBy takes a keySelector as its first argument. For a sequence of strings, this keySelector should be a Func<string, T>. So you need a method that takes a string and returns the value by which the enumeration should be sorted.

Unfortunately, I'm not sure how to determine if a character is an accented letter. RemoveDiacritics does not work for my é .

So, suppose you have a method called IsAccentedLetter that determines if a character is an accented letter:

    public bool IsAccentedLetter(char c)
    {
        // I'm afraid this does NOT really do the job
        return CharUnicodeInfo.GetUnicodeCategory(c) == UnicodeCategory.NonSpacingMark;
    }

So, you can sort your list as follows:

    // wherever your strings come from
    string[] myStrings = getStrings();
    var ordered = myStrings.OrderBy(
        s => new string(s.Select(c => IsAccentedLetter(c) ? ' ' : c).ToArray()),
        StringComparer.Create(culture, false));

The lambda expression takes a string and returns the same string with each accented letter replaced by a space. OrderBy then sorts your enumeration by these strings and therefore “ignores” accented letters.
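A likely reason the IsAccentedLetter check above misses é: when é is stored as the single precomposed code point U+00E9, its Unicode category is LowercaseLetter, and a NonSpacingMark only appears after the string is decomposed with NormalizationForm.FormD. The following sketch (the class name CategoryDemo is made up for this example) prints the categories so you can verify this on your own input:

```csharp
using System;
using System.Globalization;
using System.Text;

static class CategoryDemo
{
    static void Main()
    {
        string precomposed = "\u00E9"; // é as a single code point
        string decomposed = precomposed.Normalize(NormalizationForm.FormD); // 'e' + U+0301

        // The precomposed character is an ordinary lowercase letter.
        Console.WriteLine(CharUnicodeInfo.GetUnicodeCategory(precomposed[0])); // LowercaseLetter

        // After decomposition there are two chars; the second is the combining accent.
        Console.WriteLine(decomposed.Length);                                  // 2
        Console.WriteLine(CharUnicodeInfo.GetUnicodeCategory(decomposed[1])); // NonSpacingMark
    }
}
```

So a per-character category test only finds accents if the string has been normalized to FormD first.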

UPDATE: If you have a working RemoveDiacritics(string s) method that returns the string with accented letters replaced by their unaccented equivalents, you can simply call OrderBy as follows:

    string[] myStrings = getStrings();
    var ordered = myStrings.OrderBy(RemoveDiacritics, StringComparer.Create(culture, false));