You can use a nice property of bitwise XOR (^) to achieve this: when you XOR two strings together, the characters that are the same become null bytes ("\0"). So if we XOR the two strings, we just need to find the position of the first non-null byte using strspn:
$position = strspn($string1 ^ $string2, "\0");
That's all. For example:
$string1 = 'foobarbaz';
$string2 = 'foobarbiz';
$pos = strspn($string1 ^ $string2, "\0");

printf(
    'First difference at position %d: "%s" vs "%s"',
    $pos, $string1[$pos], $string2[$pos]
);
This will output:
First difference at position 7: "a" vs "i"
And that's it. It is very efficient because it uses only C functions and needs only a single copy of the string in memory.
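One caveat worth spelling out: XOR on two strings only runs to the end of the shorter one, so for identical strings (or when one is a prefix of the other) $pos ends up equal to the shorter length, and indexing with it as in the example above would go out of bounds. A minimal sketch of a wrapper that handles this; the name firstDifferenceOffset and the null return value are my own choices for illustration, not anything built in:

```php
<?php
// Hypothetical helper (not part of the original snippet): returns the byte
// offset of the first difference, or null when the strings are identical.
function firstDifferenceOffset(string $a, string $b): ?int
{
    // XOR of two strings stops at the end of the shorter one, so $pos can
    // never exceed min(strlen($a), strlen($b)).
    $pos = strspn($a ^ $b, "\0");

    // No differing byte and both strings end exactly here: they are equal.
    if ($pos === strlen($a) && $pos === strlen($b)) {
        return null;
    }

    // Otherwise $pos is either a real mismatch or the point where the
    // shorter string ends.
    return $pos;
}

var_dump(firstDifferenceOffset('foobarbaz', 'foobarbiz')); // int(7)
var_dump(firstDifferenceOffset('foo', 'foobar'));          // int(3)
var_dump(firstDifferenceOffset('same', 'same'));           // NULL
```

When the result equals the length of the shorter string, only the longer string has a character at that offset, so check the lengths before indexing.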
Edit: a multibyte solution along the same lines:
function getCharacterOffsetOfDifference($str1, $str2, $encoding = 'UTF-8') {
    return mb_strlen(
        mb_strcut(
            $str1,
            0, strspn($str1 ^ $str2, "\0"),
            $encoding
        ),
        $encoding
    );
}
First, the difference at the byte level is found using the method above, and then the byte offset is mapped to a character offset. This is done with the mb_strcut function, which is basically substr, but respects the boundaries of multibyte characters.
var_dump(getCharacterOffsetOfDifference('foo', 'foa')); // 2
var_dump(getCharacterOffsetOfDifference('©oo', 'foa')); // 0
var_dump(getCharacterOffsetOfDifference('f©o', 'fªa')); // 1
This is not as elegant as the first solution, but it is still a one-liner (and a little simpler if you use the default encoding):
return mb_strlen(mb_strcut($str1, 0, strspn($str1 ^ $str2, "\0")));
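To see why the byte offset needs mapping at all, it may help to print the intermediate values for the 'f©o' vs 'fªa' case. This is only an illustrative sketch: it assumes UTF-8 and the mbstring extension, and writes the two non-ASCII characters as byte escapes so it does not depend on how the source file is saved.

```php
<?php
// "f©o" and "fªa" in UTF-8; © is C2 A9 and ª is C2 AA.
$str1 = "f\xC2\xA9o";
$str2 = "f\xC2\xAAa";

// Step 1: byte offset of the first difference. The shared lead byte C2
// XORs to "\0", so the mismatch is found inside the second character.
$byteOffset = strspn($str1 ^ $str2, "\0");
var_dump($byteOffset); // int(2)

// Step 2: cut at that byte offset. mb_strcut will not split a character,
// so it backs up to the previous character boundary.
$prefix = mb_strcut($str1, 0, $byteOffset, 'UTF-8');
var_dump($prefix); // string(1) "f"

// Step 3: count the characters in the common prefix.
$charOffset = mb_strlen($prefix, 'UTF-8');
var_dump($charOffset); // int(1)
```

A plain substr in step 2 would return "f\xC2", which is not valid UTF-8, so the boundary handling in mb_strcut is what makes the character count come out right.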