How to match Arabic words with "tashkel"?

I use the following function to highlight a specific word, and it works great in English:

    function highlight(str, toBeHighlightedWord) {
        toBeHighlightedWord = "(\\b" +
            toBeHighlightedWord.replace(/([{}()[\]\\.?*+^$|=!:~-])/g, "\\$1") +
            "\\b)";
        var r = new RegExp(toBeHighlightedWord, "igm");
        str = str.replace(/(>[^<]+<)/igm, function(a) {
            return a.replace(r, "<span color='red' class='hl'>$1</span>");
        });
        return str;
    }

but it does not work for Arabic text.

So how can I change the regular expression to match Arabic words, including Arabic words with tashkel? Tashkel are the marks added between the base letters of a word. For example, "محمد" is written without tashkel and "محَمَّد" is written with tashkel. Tashkel decorate the word, and these small signs are separate characters.

javascript regex arabic
1 answer

In JavaScript, the \b word boundary only treats these characters as word characters: [a-zA-Z0-9_]. It therefore never finds a boundary around Arabic letters. A lookbehind assertion is not a portable alternative either, because it is only available in engines that implement ES2018 or later.
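The limitation can be checked directly. The following is a minimal sketch; the sample strings and variable names are illustrative assumptions, and the Arabic text is written with `\u` escapes only to keep the snippet encoding-safe.

```javascript
// "\u0645\u062D\u0645\u062F" is the Arabic word from the question, written with escapes.
const arabicWord = "\u0645\u062D\u0645\u062F";
// Illustrative sentence: an Arabic word, a space, the target word, a space, another word.
const arabicText = "\u0642\u0627\u0644 " + arabicWord + " \u0634\u064A\u0626\u0627";

// \b works for ASCII word characters.
const englishHit = /\bcat\b/.test("a cat sat");

// With \b the Arabic word is not found, because neither the letters
// nor the spaces around them are \w characters, so no boundary exists.
const arabicWithB = new RegExp("\\b" + arabicWord + "\\b").test(arabicText);

// Without \b the word itself is found, so the problem is only the boundary.
const arabicPlain = new RegExp(arabicWord).test(arabicText);

console.log(englishHit, arabicWithB, arabicPlain); // true false true
```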

To solve the problem, you can "emulate" a word boundary. For the left boundary, use a negated character class that lists the characters allowed in a word, and put it in a capture group. Because the class is negated, it matches a character that cannot be part of the word. For the right boundary, a negative lookahead is simpler.

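    function highlight(str, toBeHighlightedWord) {
        // Left boundary: a non-word character (captured as $1) or the start of the string.
        // Right boundary: a negative lookahead for any word character.
        toBeHighlightedWord = "([^\\w\\u0600-\\u06FF\\uFB50-\\uFDFF\\uFE70-\\uFEFF]|^)(" +
            toBeHighlightedWord.replace(/([{}()[\]\\.?*+^$|=!:~-])/g, "\\$1") +
            ")(?![\\w\\u0600-\\u06FF\\uFB50-\\uFDFF\\uFE70-\\uFEFF])";
        var r = new RegExp(toBeHighlightedWord, "ig");
        str = str.replace(/(>[^<]+<)/g, function(a) {
            return a.replace(r, "$1<span color='red' class='hl'>$2</span>");
        });
        return str;
    }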

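The character ranges used here come from three blocks of the Unicode table: Arabic (U+0600 to U+06FF), Arabic Presentation Forms-A (U+FB50 to U+FDFF), and Arabic Presentation Forms-B (U+FE70 to U+FEFF).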

Note that the new capture group for the left boundary changes the replacement pattern: the boundary character is written back as $1, and the highlighted word is now $2.
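The pattern above matches the word only as it was typed. To also match the same word written with or without tashkel, one possible extension is to allow optional tashkel marks after every letter. This is only a sketch under stated assumptions: it treats U+064B to U+0652 as the tashkel marks, and the names `TASHKEEL`, `ARABIC_WORD_CHARS`, `escapeRegExp`, `buildTashkeelRegExp` and `highlightArabic` are hypothetical helpers, not part of the answer's code.

```javascript
// Assumption: "tashkel" means the combining marks U+064B..U+0652
// (fathatan through sukun). Extend the range if other marks are needed.
const TASHKEEL = "[\\u064B-\\u0652]";
// Same word characters as in the boundary pattern of the answer.
const ARABIC_WORD_CHARS = "\\w\\u0600-\\u06FF\\uFB50-\\uFDFF\\uFE70-\\uFEFF";

// Escape regex metacharacters in a literal string.
function escapeRegExp(s) {
  return s.replace(/[{}()[\]\\.?*+^$|=!:~-]/g, "\\$&");
}

// Hypothetical helper: builds a regex that matches `word` with or without tashkel.
function buildTashkeelRegExp(word) {
  // Drop any tashkel from the search word, then allow optional marks after each letter.
  const bare = word.replace(new RegExp(TASHKEEL, "g"), "");
  const body = Array.from(bare)
    .map(function (ch) { return escapeRegExp(ch) + TASHKEEL + "*"; })
    .join("");
  return new RegExp(
    "([^" + ARABIC_WORD_CHARS + "]|^)(" + body + ")(?![" + ARABIC_WORD_CHARS + "])",
    "ig"
  );
}

// Hypothetical variant of highlight() that uses the tashkel-tolerant regex.
function highlightArabic(str, word) {
  const r = buildTashkeelRegExp(word);
  return str.replace(/(>[^<]+<)/g, function (a) {
    return a.replace(r, "$1<span class='hl'>$2</span>");
  });
}

// Usage: the bare word finds the decorated word inside the markup.
const bareWord = "\u0645\u062D\u0645\u062F";                     // without tashkel
const markedWord = "\u0645\u062D\u064E\u0645\u064E\u0651\u062F"; // with tashkel
console.log(highlightArabic("<p>" + markedWord + "</p>", bareWord));
```

Because the search word is stripped of its marks first, the match works in both directions: a decorated search word also finds the undecorated text.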
