PHP preg_replace oddity with pound sign and ã

I apply the following function

    <?php
    function replaceChar($string){
        $new_string = preg_replace("/[^a-zA-Z0-9\sçéèêëñòóôõöàáâäåìíîïùúûüýÿ]/", "", $string);
        return $new_string;
    }

    $string = "This is some text and numbers 12345 and symbols !£%^#&$ and foreign letters éèêëñòóôõöàáâäåìíîïùúûüýÿ";
    echo replaceChar($string);
    ?>

which works fine, but if I add ã to the preg_replace for example

    $new_string = preg_replace("/[^a-zA-Z0-9\sçéèêëñòóôõöàáâãäåìíîïùúûüýÿ]/", "", $string);

    $string = "This is some text and numbers 12345 and symbols !£%^#&$ and foreign letters éèêëñòóôõöàáâäåìíîïùúûüýÿã";

This conflicts with the pound sign £: instead of being removed, the pound sign is replaced with the unrecognised-character symbol (a question mark in a black square).

This is not critical, but does anyone know why this is?

Thanks,

Barry

UPDATE: Thanks, everyone. I changed the function to add the u modifier (pt2.php.net/manual/en / ...), as suggested by Artefacto, and it works a treat:

    function replaceChar($string){
        $new_string = preg_replace("/[^a-zA-Z0-9\sçéèêëñòóôõøöàáâãäåìíîïùúûüýÿ]/u", "", $string);
        return $new_string;
    }
4 answers

If your string is in UTF-8, you should add the u modifier to the regex. Like this:

    function replaceChar($string){
        $new_string = preg_replace("/[^a-zA-Z0-9\sçéèêëñòóôõöàáâäåìíîïùúûüýÿ]/u", "", $string);
        return $new_string;
    }

    $string = "This is some text and numbers 12345 and symbols !£%^#&$ and foreign letters éèêëñòóôõöàáâäåìíîïùúûüýÿ";
    echo replaceChar($string);
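One thing to watch with the u modifier: if the subject is not valid UTF-8 (for example, it arrived as ISO-8859-1), preg_replace returns NULL rather than a string. A minimal sketch of guarding against that follows; replaceCharStrict is a hypothetical helper name, not part of the original answer, and the source file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical wrapper around the /u version of the function above.
// Assumes this file is saved as UTF-8 so the accented literals are valid UTF-8.
function replaceCharStrict($string) {
    $result = preg_replace("/[^a-zA-Z0-9\sçéèêëñòóôõöàáâãäåìíîïùúûüýÿ]/u", "", $string);
    if ($result === null) {
        // With /u, a subject that is not valid UTF-8 makes preg_replace return NULL.
        $reason = (preg_last_error() === PREG_BAD_UTF8_ERROR) ? "invalid UTF-8 input" : "regex error";
        throw new InvalidArgumentException("preg_replace failed: " . $reason);
    }
    return $result;
}

// "\xC2\xA3" is the UTF-8 encoding of the pound sign; it is removed cleanly.
echo replaceCharStrict("a\xC2\xA3b"), "\n"; // ab

// "\xA3" on its own is the ISO-8859-1 pound sign, which is not valid UTF-8.
try {
    replaceCharStrict("\xA3");
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), "\n"; // preg_replace failed: invalid UTF-8 input
}
```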

Your string is most likely UTF-8, but preg_replace() works on bytes unless you give it the u modifier.
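To see why adding ã affects £ in particular, it helps to look at the bytes. In UTF-8, £ is C2 A3 and ã is C3 A3, so without the u modifier, putting ã in the character class adds the byte A3 to the set of bytes that are kept. The C2 half of £ is still stripped, which leaves a lone A3 that browsers render as the unknown-character symbol. A small sketch, using a shortened character class for brevity:

```php
<?php
// Byte escapes are used so the demo does not depend on the file's encoding.
$pound  = "\xC2\xA3"; // UTF-8 bytes of the pound sign
$aTilde = "\xC3\xA3"; // UTF-8 bytes of a-tilde

echo bin2hex($pound), "\n";  // c2a3
echo bin2hex($aTilde), "\n"; // c3a3

$subject = "a" . $pound . "b";

// Without /u the class is a set of bytes: a-z, C3 and A3 are all kept.
$withoutU = preg_replace("/[^a-z" . $aTilde . "]/", "", $subject);
echo bin2hex($withoutU), "\n"; // 61a362: C2 was removed, the stray A3 survived

// With /u the class is a set of characters, so the pound sign goes as a whole.
$withU = preg_replace("/[^a-z" . $aTilde . "]/u", "", $subject);
echo $withU, "\n"; // ab
```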


This code is valid ...

Perhaps you should try a Central European character encoding:

 <?php header ('Content-type: text/html; charset=ISO-8859-2'); ?> 

Maybe you should take a look at mb_ereg_replace(). As Mark noted, preg_replace works at the byte level by default, so without the u modifier it does not handle multibyte character encodings.
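A minimal sketch of the mb_ereg_replace route, assuming the mbstring extension is available and both the input and this source file are UTF-8. The function name replaceCharMb is made up for illustration. Note that mb_ereg_replace takes the pattern without delimiters or modifiers.

```php
<?php
// Tell the mb regex functions to treat patterns and subjects as UTF-8.
mb_regex_encoding('UTF-8');

// Hypothetical multibyte variant of replaceChar from the question.
function replaceCharMb($string) {
    // No surrounding slashes here: mb_ereg_replace patterns have no delimiters.
    return mb_ereg_replace('[^a-zA-Z0-9\sçéèêëñòóôõöàáâãäåìíîïùúûüýÿ]', '', $string);
}

// The pound sign and the exclamation mark are removed, the accented letter is kept.
echo replaceCharMb("a£ã!"), "\n";
```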

Cheers
Fabian


Source: https://habr.com/ru/post/1310835/

