Arabic problem: replace أ with plain ا

Question

How do I replace an alef with tanween with a regular alef?

+7

c# regex unicode normalization unicode-normalization

Baharanji Jan 13 '11 at 16:07

3 answers

I don't know C#, but this is really a Unicode question. I would do it with Unicode normalization, using this function.

First, normalize to the decomposed form. Then filter out all characters in the "Mark, Nonspacing" [Mn] category. Finally, normalize back to the composed form.

If I see correctly, your character is represented in Unicode as ARABIC LETTER ALEF WITH HAMZA ABOVE (U+0623, [Lo]), followed by ARABIC FATHATAN (U+064B, [Mn]). The first character decomposes into ARABIC LETTER ALEF (U+0627, [Lo]) + ARABIC HAMZA ABOVE (U+0654, [Mn]).
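To check this decomposition yourself, a minimal sketch in Python (used here only for illustration, since the standard `unicodedata` module exposes the same Unicode data; `describe` is a hypothetical helper name, not something from the question):

```python
import unicodedata

def describe(text):
    """Return (code point, general category, name) for each character
    of the canonically decomposed (NFD) form of `text`."""
    decomposed = unicodedata.normalize("NFD", text)
    return [
        (f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch))
        for ch in decomposed
    ]

# ALEF WITH HAMZA ABOVE followed by FATHATAN, as in the question.
for entry in describe("\u0623\u064B"):
    print(entry)
```

The output should list one [Lo] base letter and two [Mn] marks. The normalizer is free to reorder the two marks by their combining class, so do not rely on their relative order.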

Here is the chain of transformations (the first arrow is decomposition, the second is filtering out nonspacing marks, the third is composition):

 U+0623 + U+064B → U+0627 + U+0654 + U+064B → U+0627 → U+0627

After you decompose, remove all characters in the [Mn] category, and recompose, you are left with only ARABIC LETTER ALEF.
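The three steps can be sketched as follows. This is an illustration in Python rather than C#, and the function name `strip_nonspacing_marks` is my own. The idea carries over to any language that offers NFD/NFC normalization and a way to query a character's general category.

```python
import unicodedata

def strip_nonspacing_marks(text: str) -> str:
    """Decompose, drop every [Mn] character, then recompose."""
    # Step 1: canonical decomposition (NFD).
    decomposed = unicodedata.normalize("NFD", text)
    # Step 2: keep only characters that are not "Mark, Nonspacing".
    kept = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    # Step 3: canonical composition (NFC).
    return unicodedata.normalize("NFC", kept)

# U+0623 + U+064B should collapse to plain ALEF (U+0627).
assert strip_nonspacing_marks("\u0623\u064B") == "\u0627"
```

Note that this removes every nonspacing mark, so all other diacritics in the string (fatha, kasra, shadda, and so on) disappear as well. If only the hamza and tanween should go, filtering on specific code points instead of the whole [Mn] category would be the safer choice.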

+4

Bolo Jan 13 '11 at 16:27

Take a look at this project, which gives examples of replacing Unicode characters in strings: http://www.codeproject.com/KB/string/FontGlyphSet.aspx
