Levenshtein distance is a good way to compare strings. PHP's levenshtein() is faster than similar_text(), and it lets you tune the result by weighting the cost of insertions, replacements, and deletions.
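As a rough illustration of the weighting, here is a minimal sketch using the optional cost arguments of levenshtein() (insertion, replacement, deletion, in that order). The example words and the cost values are my own choices, not anything prescribed:

```php
<?php
// Default costs: each insertion, replacement, and deletion counts as 1.
$plain = levenshtein('kitten', 'sitting'); // 3

// Illustrative weights (an assumption for this example):
// make a replacement twice as expensive as an insertion or deletion.
$weighted = levenshtein('kitten', 'sitting', 1, 2, 1); // 5

echo $plain, ' ', $weighted, PHP_EOL;
```

With a replacement cost of 2, swapping a character costs the same as deleting it and inserting another, so the result depends only on how many characters the two strings share in order.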
To turn the Levenshtein distance into a useful "match" percentage, you can express it as a fraction of the average length of the two source strings:
// Assume $src1 and $src2 are your source strings and at least one is non-empty
$avgLength = ( strlen( $src1 ) + strlen( $src2 ) ) / 2;
$matchFraction = 1 - ( levenshtein( $src1, $src2 ) / $avgLength );
// $matchFraction is 1 for equal strings and 0 for equal-length strings with nothing in common.
// It can drop below 0 when the lengths differ a lot, because the distance can exceed the average length.
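With the default costs, the Levenshtein distance can be as large as the length of the longer string, so dividing by the average length can produce a negative value when the two lengths differ a lot (for example 'abc' against 'x'). If you need a result that always stays between 0 and 1, one option is to divide by the longer length instead. The helper below is a hypothetical sketch of that variant, not a replacement for the approach above; the name matchFraction() is mine, and it assumes default costs and single-byte strings, since levenshtein() and strlen() both work on bytes:

```php
<?php
// Hypothetical helper: normalizes by the longer string's length
// so the result stays within [0, 1] when default costs are used.
function matchFraction(string $src1, string $src2): float
{
    $maxLength = max(strlen($src1), strlen($src2));

    // Two empty strings are identical; this also avoids division by zero.
    if ($maxLength === 0) {
        return 1.0;
    }

    return 1 - levenshtein($src1, $src2) / $maxLength;
}

// Usage: three edits over a longest length of seven.
printf("%.3f\n", matchFraction('kitten', 'sitting'));
```

Dividing by the longer length gives lower scores than the average-length version for strings of unequal length, so pick whichever scale suits your threshold.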