I have Burmese text in UTF-8, and I am using PHP to work with it. At some point along the way, some ZWSPs (zero-width spaces) crept in, and I would like to remove them. I have tried two different ways of removing the characters, and neither works.
First I tried:
$newBody = str_replace("&#8203;", "", $newBody);
to search for the HTML entity and remove it, since that is how it appears in the Web Inspector. The spaces are not removed. I also tried it as:
$newBody = str_replace("&#x200B;", "", $newBody);
and got the same result.
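A minimal sketch of why searching for the entity text may find nothing, assuming the string holds the raw character and the Web Inspector is only displaying it as an entity. The variable names `$raw` and `$escaped` are made up for illustration:

```php
<?php
// Decode the numeric entity into the actual character (U+200B) as UTF-8.
$zwsp = html_entity_decode('&#8203;', ENT_QUOTES | ENT_HTML5, 'UTF-8');
var_dump(bin2hex($zwsp)); // string(6) "e2808b"

$raw     = "a" . $zwsp . "b"; // hypothetical: contains the real character
$escaped = "a&#8203;b";       // hypothetical: contains the entity text

// The entity text does not occur in $raw, so nothing is replaced.
var_dump(str_replace('&#8203;', '', $raw) === $raw); // bool(true)

// It only matches when the string literally contains "&#8203;".
var_dump(str_replace('&#8203;', '', $escaped)); // string(2) "ab"
```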
The second method I tried came from this question: Remove ZERO WIDTH NON-JOINER character from a string in PHP
It is as follows:
$newBody = str_replace("\xE2\x80\x8C", "", $newBody);
but that did not work either. The ZWSP was not removed.
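For comparison, a small sketch of the UTF-8 bytes involved. The sequence `\xE2\x80\x8C` is ZERO WIDTH NON-JOINER (U+200C), whereas ZERO WIDTH SPACE (U+200B) ends in `\x8B`. The `$sample` string is a made-up example, and the `\u{...}` escape assumes PHP 7 or later:

```php
<?php
$zwsp = "\u{200B}"; // ZERO WIDTH SPACE
$zwnj = "\u{200C}"; // ZERO WIDTH NON-JOINER

echo strtoupper(bin2hex($zwsp)), "\n"; // E2808B
echo strtoupper(bin2hex($zwnj)), "\n"; // E2808C

// Hypothetical sample: two ASCII letters with a ZWSP between them (5 bytes).
$sample = "a" . $zwsp . "b";

// Searching for the ZWNJ bytes leaves the ZWSP in place.
$afterZwnj = str_replace("\xE2\x80\x8C", "", $sample);
var_dump(strlen($afterZwnj)); // int(5)

// Searching for the ZWSP bytes removes it.
$afterZwsp = str_replace("\xE2\x80\x8B", "", $sample);
var_dump(strlen($afterZwsp)); // int(2)
```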
An example word in the text ($newBody) looks like this: ယူ​က​ရိန်း (with a ZWSP after ယူ and after က), and I want it to look like this: ယူကရိန်း
Any ideas? Would preg_replace work better?
So I tried
$newBody = preg_replace("/\xE2\x80\x8B/", "", $newBody);
and it seems to work, but now there is another problem.
<a class="defined" title="Ukraine">ယူ​က​ရိန်း</a>
is converted to
<a class="defined _tt_t_" title="Ukraine" style="font-family: 'Masterpiece Uni Sans', TharLon, Myanmar3, Yunghkio, Padauk, Parabaik, 'WinUni Innwa', 'Win Uni Innwa', 'MyMyanmar Unicode', Panglong, 'Myanmar Sangam MN', 'Myanmar MN';">ယူကရိန်း</a>
I do not want it to add all the extra stuff. Any idea why this is happening? Other than somehow targeting only the text, is there another way to stop this extra material from being added by preg_replace? By the way, I am using Google Chrome on a Mac. It seems to behave a little differently in Firefox ...
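One way to narrow this down is to compare the string before and after the replacement in PHP itself, rather than in the browser. The sketch below assumes the input is the anchor shown above, with the Burmese word for Ukraine and a ZWSP after the second and third characters. It uses the `u` modifier with `\x{200B}` instead of raw bytes. If only the ZWSP bytes differ here, the added class and style attributes would have to come from something after PHP, such as the browser or an extension, though that is an assumption and not something confirmed here:

```php
<?php
$zwsp = "\u{200B}"; // ZERO WIDTH SPACE (PHP 7+ escape)

// Assumed input: the anchor from the question, with two ZWSPs in the text.
$before = '<a class="defined" title="Ukraine">ယူ' . $zwsp . 'က' . $zwsp . 'ရိန်း</a>';

// The u modifier makes PCRE treat pattern and subject as UTF-8,
// so \x{200B} matches the code point itself.
$after = preg_replace('/\x{200B}/u', '', $before);

// The tag and its attributes are untouched.
var_dump(strpos($after, '<a class="defined" title="Ukraine">') === 0); // bool(true)

// Exactly two ZWSPs (3 bytes each) were removed.
var_dump(strlen($before) - strlen($after)); // int(6)
```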