A method for finding similar substrings of two strings

I use this piece of Java code to find similar strings:

if( str1.indexOf(str2) >= 0 || str2.indexOf(str1) >= 0 ) .......

but with str1 = "pizzabase" and str2 = "namedpizzaowl" it does not work.

How can I find common substrings, e.g. "pizza"?

2 answers

If your algorithm says that two strings are similar when they contain a common substring, then it will always return true: the empty string "" is trivially a substring of every string. It makes more sense to measure the degree of similarity between the strings and return a number rather than a boolean.

A good algorithm for measuring the similarity of strings (or, more generally, of sequences) is the Levenshtein distance: http://en.wikipedia.org/wiki/Levenshtein_distance .
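
To make the "return a number" idea concrete, here is a minimal Java sketch of the Levenshtein distance using the standard two-row dynamic-programming table. The class name `Levenshtein` and the `similarity` normalization are my own assumptions for illustration; they do not come from the answer above or from any library.

```java
// Hypothetical helper class; all names here are illustrative.
class Levenshtein {
    // Minimum number of single-character insertions, deletions and
    // substitutions needed to turn a into b.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // "" -> first j chars of b: j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // first i chars of a -> "": i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + cost);
            }
            int[] tmp = prev; // reuse the two rows instead of reallocating
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // One possible normalization (an assumption, not a standard):
    // 1.0 for identical strings, 0.0 when every position must be edited.
    static double similarity(String a, String b) {
        int max = Math.max(a.length(), b.length());
        return max == 0 ? 1.0 : 1.0 - (double) distance(a, b) / max;
    }

    public static void main(String[] args) {
        System.out.println(distance("kitten", "sitting")); // prints 3
        System.out.println(similarity("pizzabase", "namedpizzaowl"));
    }
}
```

Note that edit distance measures how far apart the whole strings are. It will not tell you that "pizza" is the shared part, so it complements a common-substring search rather than replacing it.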


Iterate over each letter in str1, checking whether it exists in str2. If it does not, move on to the next letter. If it does, extend the substring of str1 that you look for in str2 to two characters, and keep extending it until no further match is found or you reach the end of str1.

This will find all the common substrings but, like bubble sort, it is hardly optimal; it is just a very simple example of how to solve the problem.

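Something like this Java sketch:

    // needs java.util.ArrayList and java.util.List
    List<String> matches = new ArrayList<>();
    for (int pos = 0; pos < str1.length(); pos++) {
        int len = 0;
        // extend the candidate while it still occurs in str2
        while (pos + len < str1.length()
                && str2.contains(str1.substring(pos, pos + len + 1))) {
            len++;
        }
        if (len > 0) matches.add(str1.substring(pos, pos + len));
    }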
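
If you only need the single longest shared piece (e.g. "pizza"), a different and less brute-force technique is the classic longest-common-substring dynamic program, which runs in O(n*m) time. This is a swapped-in alternative, not the method described in this answer, and the class and method names below are made up for illustration.

```java
// Hypothetical helper class; all names here are illustrative.
class CommonSubstring {
    // Returns the longest substring that occurs in both a and b
    // ("" if they share no character). On ties, the earliest match in a wins.
    static String longest(String a, String b) {
        // prev[j]: length of the longest common suffix of the prefixes
        // of a and b processed so far (previous row of the DP table)
        int[] prev = new int[b.length() + 1];
        int bestLen = 0;
        int bestEnd = 0; // exclusive end index of the best match in a
        for (int i = 1; i <= a.length(); i++) {
            int[] curr = new int[b.length() + 1];
            for (int j = 1; j <= b.length(); j++) {
                if (a.charAt(i - 1) == b.charAt(j - 1)) {
                    curr[j] = prev[j - 1] + 1; // extend the diagonal run
                    if (curr[j] > bestLen) {
                        bestLen = curr[j];
                        bestEnd = i;
                    }
                }
            }
            prev = curr;
        }
        return a.substring(bestEnd - bestLen, bestEnd);
    }

    public static void main(String[] args) {
        System.out.println(longest("pizzabase", "namedpizzaowl")); // prints pizza
    }
}
```

You could then treat two strings as similar only when the returned substring reaches some minimum length, which sidesteps the empty-string problem mentioned in the other answer.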