How can I find a match for a Russian word in a string (also in Russian) in PHP?
So, for example, something like this:
$pattern = '//'; preg_replace($pattern, $replacement, $string_in_russian)
I tried utf8_encode and htmlentities with the UTF-8 flag on $pattern, but that didn't work. Should I also encode $string_in_russian?
Update: The suggested /u flag does not work, so here is the actual code. It comes from a glossary plugin for WordPress (my site is correctly configured for Russian, and that works everywhere except in this case). Here is the code:
$glossary_title = $glossary_item->post_title;
$glossary_search = '/\b'.$glossary_title.'s*?\b(?=([^"]*"[^"]*")*[^"]*$)/iu';
$glossary_replace = '<a'.$timestamp.'>$0</a'.$timestamp.'>';
$content_temp = preg_replace($glossary_search, $glossary_replace, $content, 1);
When I do a quick echo of it inside an HTML comment, this is the kind of string I get for the pattern:
/\bs*?\b(?=([^"]*"[^"]*")*[^"]*$)/iu
And, well, that still doesn't work. I thought that maybe the "s" was tripping me up (this level of regex is a bit above my head, but I guess it is there to allow for possible plurals), but removing it did not help.
Update #2: Okay, so I decided to do a full "clean slate" test - a simple PHP file with a few lines of content in English and in Russian, plus target words to replace. Here is the code:
$content_en = 'Nulla volutpat pretium nunc, ac feugiat neque lobortis vitae. In eu sapien sit amet eros tincidunt viverra. <b style="color:purple">Proin</b> congue hendrerit felis, et consequat neque ultrices lobortis. <b style="color:purple">Proin</b> luctus bibendum libero et molestie. Sed tristique lacus a urna semper eget feugiat lacus varius. Donec vel sodales diam. <b style="color:purple">Proin</b> fringilla laoreet purus, a facilisis nisi porttitor vel. Nullam ac justo ac elit laoreet ullamcorper vel a magna. Suspendisse in arcu sapien.';
$find_en = 'proin';
$replace_with_en = '<em style="color:red">REPLACEMENT</em>';
$glossary_search = '/\b'.$find_en.'s*?\b(?=([^"]*"[^"]*")*[^"]*$)/iu';
$content_en_replaced = preg_replace($glossary_search, $replace_with_en, $content_en);

$content_ru = 'Lorem Ipsum , , , " <b style="color:purple"></b> .. <b style="color:purple"></b> .. <b style="color:purple"></b> .." HTML Lorem Ipsum .';
$find_ru = '';
$replace_with_ru = '<em style="color:red"></em>';
$glossary_search = '/\b'.$find_ru.'s*?\b(?=([^"]*"[^"]*")*[^"]*$)/iu';
$content_ru_replaced = preg_replace($glossary_search, $replace_with_ru, $content_ru);
And here is a screenshot of the output: http://www.flickr.com/photos/iliadraznin/5372578707/
As you can see, the target word is replaced in the English text but not in the Russian text, even though the code is identical and I use the /u flag. The file is also encoded as UTF-8. Any suggestions? (And again, I tried removing the "s"; still nothing.)
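For reference, here is a minimal sketch of a variant that avoids \b altogether. It assumes (this is not confirmed above) that the problem is \b not treating Cyrillic letters as word characters in the PCRE build in use. The sample sentence and the term 'слово' are hypothetical stand-ins, the optional plural "s" is left out for brevity, and the quote-counting lookahead is kept as in the code above.

```php
<?php
// Hypothetical sample data: stand-ins for the real Russian content and term.
$content = 'Это <b style="color:purple">слово</b> и ещё одно слово в тексте.';
$term    = 'слово';

// Treat Unicode letters, digits and underscore as "word" characters.
// The two lookarounds play the role of \b without depending on how
// \b classifies non-ASCII letters.
$word = '[\p{L}\p{N}_]';

// preg_quote() escapes any regex metacharacters in the term.
$pattern = '/(?<!' . $word . ')' . preg_quote($term, '/') . '(?!' . $word . ')'
         . '(?=([^"]*"[^"]*")*[^"]*$)/iu';

// Replace only the first occurrence, as in the plugin code.
$result = preg_replace($pattern, '<em>$0</em>', $content, 1);

if ($result === null) {
    // preg_replace() returns null on failure, e.g. invalid UTF-8 input.
    echo 'PCRE error code: ' . preg_last_error() . "\n";
} else {
    echo $result . "\n";
}
```

If $result comes back as null, preg_last_error() may point to an encoding problem in the subject or pattern rather than to the word-boundary logic.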