I don't know C#, but this is really a Unicode question. I would solve it with a Unicode normalization function.
First, normalize to the decomposed form. Then filter out all characters in the "Mark, Nonspacing" [Mn] category. Finally, normalize back to the composed form.
If I see correctly, your character is represented in Unicode as ARABIC LETTER ALEF WITH HAMZA ABOVE (U+0623, [Lo]) followed by ARABIC FATHATAN (U+064B, [Mn]). The first character decomposes into ARABIC LETTER ALEF (U+0627, [Lo]) + ARABIC HAMZA ABOVE (U+0654, [Mn]).
Here is the chain of transformations (the first arrow is decomposition, the second is filtering out nonspacing marks, the third is composition):
U+0623 + U+064B → U+0627 + U+0654 + U+064B → U+0627 → U+0627
After you decompose, remove all characters in the [Mn] category, and recompose, you are left with only ARABIC LETTER ALEF.
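Since I can't vouch for the C# spelling, here is a minimal sketch of the same three steps in Python, using the standard unicodedata module. The function name strip_nonspacing_marks is my own, purely for illustration; the same idea should carry over to whatever normalization and character-category calls your platform provides.

```python
import unicodedata


def strip_nonspacing_marks(text: str) -> str:
    """Remove combining marks of category Mn (hypothetical helper name)."""
    # Step 1: canonical decomposition (NFD) splits precomposed characters
    # into a base character followed by its combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Step 2: drop every character whose general category is "Mn"
    # (Mark, Nonspacing).
    filtered = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    # Step 3: canonical composition (NFC) recombines whatever is left.
    return unicodedata.normalize("NFC", filtered)


# ALEF WITH HAMZA ABOVE followed by FATHATAN, as in the question.
result = strip_nonspacing_marks("\u0623\u064B")
print([f"U+{ord(ch):04X}" for ch in result])  # prints ['U+0627']
```

Note that this removes every nonspacing mark, not only Arabic ones, so accents on Latin letters are stripped as well.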
Bolo