How to match Arabic words with "tashkel"?

Question

I use the following function to highlight a specific word, and it works great in English:

function highlight(str, toBeHighlightedWord) {
    toBeHighlightedWord = "(\\b"
        + toBeHighlightedWord.replace(/([{}()[\]\\.?*+^$|=!:~-])/g, "\\$1")
        + "\\b)";
    var r = new RegExp(toBeHighlightedWord, "igm");
    str = str.replace(/(>[^<]+<)/igm, function(a) {
        return a.replace(r, "<span color='red' class='hl'>$1</span>");
    });
    return str;
}

but it does not work for Arabic text.

So how can I change the regular expression to match Arabic words, including Arabic words with tashkel? Tashkel are the small diacritical marks added between the base letters to decorate the word. For example, "محمد" is written without tashkel and "محَمَّد" with tashkel.

javascript regex arabic

Hager aly Jun 14 '14 at 7:06

1 answer

Casimir et Hippolyte · Accepted Answer · 2014-06-14T07:47:23+0000

In JavaScript, the \b word boundary only treats these characters as word characters: [a-zA-Z0-9_]. A lookbehind assertion is no help either, because this feature was not supported by JavaScript when this answer was written (it arrived later, in ES2018).
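A quick check illustrates the limitation. The sample strings are made up for the demonstration:

```javascript
// \b only fires between a [A-Za-z0-9_] character and anything else.
// A space and an Arabic letter both count as "non-word", so no boundary
// is ever found around an Arabic word.
var english = /\bword\b/.test("a word here");      // true
var arabic = /\bمحمد\b/.test("قال محمد هنا");        // false

console.log(english, arabic);
```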

The way to solve the problem and “emulate” a kind of word boundary is to use a negated character class listing the characters that may appear in a word (since the class is negated, it matches any character that cannot be part of a word), placed in a capture group for the left boundary. For the right boundary, a negative lookahead is simpler.

toBeHighlightedWord = "([^\\w\\u0600-\\u06FF\\uFB50-\\uFDFF\\uFE70-\\uFEFF]|^)("
    + toBeHighlightedWord.replace(/([{}()[\]\\.?*+^$|=!:~-])/g, "\\$1")
    + ")(?![\\w\\u0600-\\u06FF\\uFB50-\\uFDFF\\uFE70-\\uFEFF])";

var r = new RegExp(toBeHighlightedWord, "ig");
str = str.replace(/(>[^<]+<)/g, function(a) {
    // $1 is the left boundary character (or empty at the start), $2 is the word
    return a.replace(r, "$1<span color='red' class='hl'>$2</span>");
});

The character ranges that are used here come from three blocks of the Unicode table:

0600-06FF (Arabic)
FB50-FDFF (Arabic Presentation Forms-A)
FE70-FEFF (Arabic Presentation Forms-B)

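Note that adding the new capture group changes the replacement pattern: the word is now in $2, and $1 (the character consumed as the left boundary) has to be written back in front of the span.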
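To tie this together and cover the tashkel part of the question, here is a minimal sketch. It is not part of the original answer. It wraps the emulated boundaries in a complete function and allows optional tashkel after every letter of the search word. It assumes that tashkel means the combining marks U+064B to U+0652. The names WORD_CHARS, TASHKEEL, escapeRegExp and tashkeelTolerant are invented for this illustration.

```javascript
// Characters treated as "part of a word": ASCII word characters plus the
// three Arabic blocks listed above.
var WORD_CHARS = "\\w\\u0600-\\u06FF\\uFB50-\\uFDFF\\uFE70-\\uFEFF";
// Assumption: tashkel = the combining marks U+064B (fathatan) to U+0652 (sukun).
var TASHKEEL = "[\\u064B-\\u0652]";

// Escapes characters that have a special meaning inside a regular expression.
function escapeRegExp(s) {
    return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: strips tashkel from the search word, then allows any
// number of tashkel marks after each remaining character.
function tashkeelTolerant(word) {
    var bare = word.replace(new RegExp(TASHKEEL, "g"), "");
    return Array.from(bare).map(function(ch) {
        return escapeRegExp(ch) + TASHKEEL + "*";
    }).join("");
}

function highlight(str, word) {
    var r = new RegExp(
        "([^" + WORD_CHARS + "]|^)(" + tashkeelTolerant(word) + ")(?![" + WORD_CHARS + "])",
        "ig");
    // Only touch text between tags, as in the original function.
    return str.replace(/>[^<]+</g, function(a) {
        return a.replace(r, "$1<span color='red' class='hl'>$2</span>");
    });
}

// Usage: searching for the bare word also highlights the decorated form.
console.log(highlight("<p>قال محَمَّد</p>", "محمد"));
```

Because the tashkel marks sit inside the 0600-06FF range, the negative lookahead still treats them as word characters, so a word followed by further letters (for example a suffix) is not highlighted.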
