Find out what percentage of one string is contained in another

I need to find out what percentage (or how many characters) of one string is contained in another string. I tried Levenshtein distance, but that algorithm returns how many characters need to be changed for the strings to become equal. Can anyone help? I need this in C#, but the language is not that important.

Answer code:

    public double LongestCommonSubsequence(string s1, string s2)
    {
        // if the string is empty, the length should be 0
        if (String.IsNullOrEmpty(s1) || String.IsNullOrEmpty(s2))
            return 0;

        int[,] num = new int[s1.Length, s2.Length]; // 2D array
        char letter1;
        char letter2;

        // Actual algorithm
        for (int i = 0; i < s1.Length; i++)
        {
            letter1 = s1[i];
            for (int j = 0; j < s2.Length; j++)
            {
                letter2 = s2[j];
                if (letter1 == letter2)
                {
                    if ((i == 0) || (j == 0))
                        num[i, j] = 1;
                    else
                        num[i, j] = 1 + num[i - 1, j - 1];
                }
                else
                {
                    if ((i == 0) && (j == 0))
                        num[i, j] = 0;
                    else if ((i == 0) && !(j == 0)) // First ith element
                        num[i, j] = Math.Max(0, num[i, j - 1]);
                    else if (!(i == 0) && (j == 0)) // First jth element
                        num[i, j] = Math.Max(num[i - 1, j], 0);
                    else // if (!(i == 0) && !(j == 0))
                        num[i, j] = Math.Max(num[i - 1, j], num[i, j - 1]);
                }
            } // end j
        } // end i

        return (s2.Length - (double)num[s1.Length - 1, s2.Length - 1]) / s1.Length * 100;
    } // end LongestCommonSubsequence
2 answers

It looks like you need the longest common subsequence (LCS), which is the basis of diff algorithms. The general problem, for an arbitrary number of sequences, is NP-hard, but for two strings it can be solved in polynomial time, O(n·m), with dynamic programming. The Wikipedia page describes several approaches.
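
For exactly two strings, the standard dynamic-programming table gives the LCS length in O(n·m) time. Below is a minimal sketch, not the asker's code. The names `LcsSimilarity`, `Length` and `Percent` are made up for illustration, and it assumes the percentage wanted is the LCS length relative to the length of the first string.

```csharp
using System;

public static class LcsSimilarity
{
    // Length of the longest common subsequence of s1 and s2.
    public static int Length(string s1, string s2)
    {
        // table[i, j] = LCS length of the first i chars of s1 and the first j chars of s2.
        // Row 0 and column 0 stay at 0, which removes the special cases for i == 0 or j == 0.
        var table = new int[s1.Length + 1, s2.Length + 1];
        for (int i = 1; i <= s1.Length; i++)
        {
            for (int j = 1; j <= s2.Length; j++)
            {
                table[i, j] = s1[i - 1] == s2[j - 1]
                    ? table[i - 1, j - 1] + 1
                    : Math.Max(table[i - 1, j], table[i, j - 1]);
            }
        }
        return table[s1.Length, s2.Length];
    }

    // Assumption: "percent contained" means LCS length divided by the length of s1.
    public static double Percent(string s1, string s2)
    {
        if (string.IsNullOrEmpty(s1) || string.IsNullOrEmpty(s2))
            return 0.0;
        return Length(s1, s2) * 100.0 / s1.Length;
    }
}
```

With this definition, `LcsSimilarity.Percent("abcd", "ab")` is 50, because two of the four characters of the first string appear, in order, in the second.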


Uhh ... can't you just use the number of characters you need to change?

 (length(destination)-changed_character_count)/ length(source) 
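
The formula above could be sketched in C# roughly as follows. The class and method names are hypothetical. One deliberate deviation, made as an assumption: the sketch divides by the length of the longer string rather than by `length(source)`, so that the result always stays between 0 and 100.

```csharp
using System;

public static class LevenshteinSimilarity
{
    // Classic edit distance: insertions, deletions and substitutions each cost 1.
    public static int Distance(string source, string destination)
    {
        int n = source.Length, m = destination.Length;
        var d = new int[n + 1, m + 1];
        for (int i = 0; i <= n; i++) d[i, 0] = i; // delete all i chars
        for (int j = 0; j <= m; j++) d[0, j] = j; // insert all j chars
        for (int i = 1; i <= n; i++)
        {
            for (int j = 1; j <= m; j++)
            {
                int cost = source[i - 1] == destination[j - 1] ? 0 : 1;
                d[i, j] = Math.Min(
                    Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1),
                    d[i - 1, j - 1] + cost);
            }
        }
        return d[n, m];
    }

    // Assumption: normalise by the longer string so the value is in the range 0..100.
    public static double Percent(string source, string destination)
    {
        int longest = Math.Max(source.Length, destination.Length);
        if (longest == 0) return 100.0; // two empty strings are identical
        return (longest - Distance(source, destination)) * 100.0 / longest;
    }
}
```

For "kitten" and "sitting" the distance is 3 and the longer string has 7 characters, so the similarity comes out at about 57%.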

EDIT: Based on the revised question, treat both strings as sets of characters, compute their intersection, and take the percentage as the size of that intersection relative to the size of the original string's character set.
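
A minimal sketch of this set-based idea, assuming "the original row as a set" means the distinct characters of the first string. `CharSetOverlap` and `Percent` are illustrative names, not part of any library.

```csharp
using System;
using System.Collections.Generic;

public static class CharSetOverlap
{
    // Percentage of the distinct characters of `original` that also occur in `other`.
    public static double Percent(string original, string other)
    {
        var originalChars = new HashSet<char>(original);
        if (originalChars.Count == 0) return 0.0;

        // Copy the set so the intersection does not modify originalChars.
        var common = new HashSet<char>(originalChars);
        common.IntersectWith(other);

        return common.Count * 100.0 / originalChars.Count;
    }
}
```

Note that this ignores both character order and repetition: `CharSetOverlap.Percent("abc", "cba")` is 100, and `CharSetOverlap.Percent("hello", "help")` is 75, since three of the four distinct characters of "hello" occur in "help".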


Source: https://habr.com/ru/post/1313232/

