How to check if a Unicode character has diacritics in .NET?

I am designing a heuristic for automatic language recognition and would like to know whether a given letter has diacritics (for example, “Reye” - “all letters have diacritics”). Ideally, I could also get the type of each diacritic.

I looked through the list of UnicodeCategory values, but did not find anything that could help me here.

c# unicode diacritics
1 answer

One possible way is to normalize the character to a form in which letters and their diacritics are written as separate code points (canonical decomposition). Then check whether the result is a base letter followed by one or more combining marks.

Adapting from "How to remove diacritical characters (accents) from a string in .NET?", you can decompose the character with Normalize(NormalizationForm.FormD) and detect the diacritics by checking for UnicodeCategory.NonSpacingMark.

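    using System.Globalization;
    using System.Linq;
    using System.Text;

    bool IsLetterWithDiacritics(char c)
    {
        // A lone surrogate is not valid text on its own, and Normalize would throw on it.
        if (char.IsSurrogate(c)) return false;

        // Canonical decomposition: 'é' becomes 'e' followed by U+0301.
        var s = c.ToString().Normalize(NormalizationForm.FormD);

        // A base letter followed only by non-spacing (combining) marks.
        return (s.Length > 1)
            && char.IsLetter(s[0])
            && s.Skip(1).All(c2 => CharUnicodeInfo.GetUnicodeCategory(c2) == UnicodeCategory.NonSpacingMark);
    }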
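The question also asks for the type of diacritic. One way to approach that is to return the combining marks themselves rather than a boolean, and then map them to names. The following is only a sketch under stated assumptions: the class name `DiacriticInspector`, the methods `GetCombiningMarks` and `Describe`, and the `KnownMarks` table are all hypothetical names introduced for illustration, and the table is deliberately incomplete. As far as I know, .NET has no built-in lookup for Unicode character names, so any such mapping has to be supplied by hand.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text;

public static class DiacriticInspector
{
    // Hypothetical, incomplete table: names for a few common combining marks.
    private static readonly Dictionary<char, string> KnownMarks = new Dictionary<char, string>
    {
        ['\u0300'] = "grave",
        ['\u0301'] = "acute",
        ['\u0302'] = "circumflex",
        ['\u0303'] = "tilde",
        ['\u0308'] = "diaeresis",
        ['\u030A'] = "ring above",
        ['\u030C'] = "caron",
        ['\u0327'] = "cedilla",
    };

    // Returns the non-spacing marks that follow the base letter after
    // canonical decomposition, or an empty list if there are none.
    public static IReadOnlyList<char> GetCombiningMarks(char c)
    {
        // A lone surrogate cannot be normalized; treat it as having no marks.
        if (char.IsSurrogate(c)) return Array.Empty<char>();

        string decomposed = c.ToString().Normalize(NormalizationForm.FormD);
        if (decomposed.Length < 2 || !char.IsLetter(decomposed[0]))
            return Array.Empty<char>();

        return decomposed.Skip(1)
            .Where(m => CharUnicodeInfo.GetUnicodeCategory(m) == UnicodeCategory.NonSpacingMark)
            .ToList();
    }

    // Returns a name from the table above, or the code point (e.g. "U+0328") if unknown.
    public static string Describe(char mark) =>
        KnownMarks.TryGetValue(mark, out var name) ? name : $"U+{(int)mark:X4}";
}
```

For example, `DiacriticInspector.GetCombiningMarks('\u00E9').Select(DiacriticInspector.Describe)` would describe the single mark on "é" as "acute". Note that this approach, like the one above, only sees diacritics that Unicode represents through canonical decomposition; letters such as "ø" or "ł" have no decomposition and are reported as having no marks.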
