How to improve PHP string matching with similar_text()?

Question


I use PHP's similar_text() to compare two strings, but the results are not good enough. For example, the best score I get is 80.95% for a match that I would like to see at 100%.

What other functions can I use to reduce the strings to their essential content before comparing them?

<!-- Overcast, Rain or Showers compared Overcast, Rain or Showers is 80.9523809524 -->
<!-- Overcast, Risk of Rain or Showers compared Overcast, Rain or Showers is 86.2068965517 -->
<!-- Overcast, Chance of Rain or Showers compared Overcast, Rain or Showers is 83.3333333333 -->

+2

string-matching php text

sandraqu May 21 '12 at 18:48


2 answers

Levenshtein distance is a good way to compare strings. It is faster than similar_text(), and levenshtein() lets you tune the result by weighting the cost of insertions, replacements, and deletions.
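As a minimal sketch of that weighting, PHP's levenshtein() accepts three optional integer costs after the two strings: insertion, replacement, and deletion. The sample strings and the cost values below are assumptions chosen only for illustration; they are not taken from the question.

```php
<?php
// Hypothetical sample strings, chosen only for illustration.
$base    = 'Rain or Showers';
$variant = 'Chance of Rain or Showers';

// Default costs: every insertion, replacement, and deletion costs 1.
$plain = levenshtein($base, $variant);

// Custom costs (assumed values): insertion 1, replacement 1, deletion 5.
// Turning $base into $variant needs only insertions, so extra words
// in the second string stay cheap.
$extraWords = levenshtein($base, $variant, 1, 1, 5);

// With the arguments swapped, the extra words have to be deleted,
// which is now five times as expensive per character.
$missingWords = levenshtein($variant, $base, 1, 1, 5);

printf("plain=%d extraWords=%d missingWords=%d\n", $plain, $extraWords, $missingWords);
```

With unequal costs the result depends on argument order, so decide which string is the reference before comparing.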

To turn the Levenshtein distance into a useful "match" percentage, you can express it as a fraction of the average length of the source strings:

 // Assume $src1 and $src2 are your source strings and at least one is non-empty
 $avgLength = ( strlen( $src1 ) + strlen( $src2 ) ) / 2;
 // The distance can exceed the average length when the lengths differ a lot, so clamp at 0
 $matchFraction = max( 0, 1 - ( levenshtein( $src1, $src2 ) / $avgLength ) );
 // $matchFraction is now between 0 and 1, with 1 being equal strings and 0 being totally different
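The same idea can be wrapped in a reusable function. This is only a sketch: the name levenshteinSimilarity() is hypothetical, and treating two empty strings as a perfect match and clamping negative values to 0 are assumptions, not part of the answer above.

```php
<?php
/**
 * Hypothetical helper: Levenshtein distance expressed as a similarity
 * fraction relative to the average length of the two strings.
 */
function levenshteinSimilarity(string $a, string $b): float
{
    $avgLength = (strlen($a) + strlen($b)) / 2;

    // Assumption: two empty strings count as identical.
    if ($avgLength == 0) {
        return 1.0;
    }

    // The distance can be larger than the average length (for example
    // an empty string against a non-empty one), so clamp the result at 0.
    return max(0.0, 1.0 - levenshtein($a, $b) / $avgLength);
}

// Usage with two of the strings from the question, as they appear in the text.
$fraction = levenshteinSimilarity(
    'Overcast, Risk of Rain or Showers',
    'Overcast, Rain or Showers'
);
echo round($fraction * 100, 2), "%\n";
```

Note that strlen() and levenshtein() both work on bytes, so strings with multibyte characters may score differently than expected.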

+3

Jazz May 21 '12 at 18:58


Jeroen · Accepted Answer · 2012-05-21T18:50:57+0000

Levenshtein distance: http://php.net/manual/en/function.levenshtein.php

It works the opposite way from similar_text(): a result of 0 means there is no difference.

 // <!-- Overcast, Rain or Showers compared Overcast, Rain or Showers is 0 -->
 // <!-- Overcast, Risk of Rain or Showers compared Overcast, Rain or Showers is 11 -->
 // <!-- Overcast, Chance of Rain or Showers compared Overcast, Rain or Showers is 13 -->
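To see the two scales side by side, the sketch below runs both functions over the same inputs. The strings are copied as they appear in this thread; the original inputs may have contained whitespace or markup that is not visible here, so the numbers it prints need not match the ones quoted above.

```php
<?php
$reference  = 'Overcast, Rain or Showers';
$candidates = [
    'Overcast, Rain or Showers',
    'Overcast, Risk of Rain or Showers',
];

$results = [];
foreach ($candidates as $candidate) {
    // similar_text() writes a percentage into its third argument:
    // higher means more similar, 100 means identical.
    similar_text($candidate, $reference, $percent);

    // levenshtein() returns an edit count:
    // lower means more similar, 0 means identical.
    $distance = levenshtein($candidate, $reference);

    $results[$candidate] = ['percent' => $percent, 'distance' => $distance];
    printf("%s: similar_text=%.2f%% levenshtein=%d\n", $candidate, $percent, $distance);
}
```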
