mb_strpos vs strpos: what's the difference?

Yes, I know: we should use the mb_* functions when working with multibyte characters. But when should we use strpos? Take a look at this code (the file is saved as UTF-8):

var_dump(strpos("My symbol utf-8 is the €.", "\xE2\x82\xAC")); // int(23) 

Is there any difference if I use mb_strpos here? Doesn't strpos do the same job? After all, strpos just searches for a string (a sequence of bytes), doesn't it? Is there any reason to use strpos instead?

+6
2 answers

For UTF-8, matching a byte sequence is exactly equivalent to matching the corresponding character sequence.

So both functions find the needle at exactly the same point, but mb_strpos counts complete UTF-8 characters up to the needle, whereas strpos counts bytes. Therefore, if the string contains another multibyte UTF-8 sequence before the needle, the results differ:

 strpos("My symbolö utf-8 is the €.", "€") !== mb_strpos("My symbolö utf-8 is the €.", "€", 0, "UTF-8") 

But:

 strpos("My symbol utf-8 is the €.", "€") === mb_strpos("My symbol utf-8 is the €.", "€", 0, "UTF-8") 
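
To make the two offsets concrete, here is a minimal sketch. It assumes the mbstring extension is loaded and PHP 7 or later (for the `\u{...}` escapes, used so the result does not depend on how the file is saved); the variable names are only illustrative.

```php
<?php
// Assumption: ext-mbstring is available and the strings are UTF-8.
$euro   = "\u{20AC}";                                  // "€", three bytes: E2 82 AC
$plain  = "My symbol utf-8 is the \u{20AC}.";          // only the needle is multibyte
$accent = "My symbol\u{00F6} utf-8 is the \u{20AC}.";  // "ö" (two bytes) precedes the needle

$plainByte  = strpos($plain, $euro);                   // byte offset
$plainChar  = mb_strpos($plain, $euro, 0, "UTF-8");    // character offset
$accentByte = strpos($accent, $euro);                  // byte offset
$accentChar = mb_strpos($accent, $euro, 0, "UTF-8");   // character offset

var_dump($plainByte, $plainChar);    // int(23) int(23)
var_dump($accentByte, $accentChar);  // int(25) int(24)

// Each offset is only meaningful with the matching family of functions.
$viaBytes = substr($accent, $accentByte, strlen($euro));  // byte offset + byte length
$viaChars = mb_substr($accent, $accentChar, 1, "UTF-8");  // char offset + char length
var_dump($viaBytes === $euro, $viaChars === $euro);       // bool(true) bool(true)
```

The practical point is to stay within one family: a byte offset from strpos belongs with substr and strlen, a character offset from mb_strpos belongs with mb_substr and mb_strlen.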
+11

Depending on the character set used and the search string, this may or may not matter.

strpos() searches for a sequence of bytes that is passed as a needle.

mb_strpos() does the same, but also respects character boundaries.

So strpos() matches wherever the byte sequence occurs in the string, even if the match starts or ends in the middle of a multibyte character. mb_strpos() matches only where the byte sequence lines up with whole characters in the given encoding.
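
A case where the boundary check matters is an encoding that, unlike UTF-8, reuses ASCII byte values inside multibyte characters. The sketch below uses Shift_JIS as an illustration and assumes the mbstring extension is loaded with its "SJIS" encoding; the choice of character and the variable names are mine, not from the question.

```php
<?php
// Assumption: ext-mbstring with Shift_JIS ("SJIS") support is available.
// Katakana "ソ" (U+30BD) is encoded in Shift_JIS as the two bytes 0x83 0x5C,
// and 0x5C is also the byte value of the ASCII backslash.
$so  = mb_convert_encoding("\u{30BD}", "SJIS", "UTF-8");
$hex = bin2hex($so);

$bytePos = strpos($so, "\\");                // byte search: hits the trail byte
$charPos = mb_strpos($so, "\\", 0, "SJIS");  // character search: no backslash character

var_dump($hex);      // string(4) "835c"
var_dump($bytePos);  // int(1)
var_dump($charPos);  // bool(false)
```

In UTF-8 this kind of false match cannot occur, because the bytes of a multibyte character never coincide with ASCII bytes or with the start of another character, which is why both functions find the same match there and differ only in how they count the offset.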

+5
