R-contiguous match, MATLAB

I want to compare two strings using the r-contiguous matching rule: two strings match if they share a run of at least r contiguous characters. In the examples below, with r set to 6, the result should be true for the first example and false for the second.

Example 1:

A='ABCDEFGHIJKLM' B='XYZ0123EFGHIJAB' returns true (since they share a run of 6 contiguous matching characters, 'EFGHIJ')

Example 2:

 A='ABCDEFGHJKLM' B='XYZ0123EFGHAB' returns false (since they share a run of only 4 contiguous matching characters, 'EFGH')

What is the fastest way to do this in MATLAB? My data is huge. Thanks.

1 answer

Case: input strings with unique characters

Here's one approach with ismember and strfind -

    matches = ismember(A,B) % OR any(bsxfun(@eq,A,B.'),1)
    matches_ext = [0 matches 0]
    starts = strfind(matches_ext,[0 1])
    stops = strfind(matches_ext,[1 0])
    interval_lens = stops - starts
    out = any(interval_lens >= r)
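To see what the interval lengths look like on the question's two examples, the steps above can be wrapped in a small helper. This is only a sketch: `run_lens` is a hypothetical name, not part of the answer, and it relies on `strfind` accepting numeric vectors in the same way the answer's code does.

```matlab
% run_lens is a hypothetical helper: lengths of each run of 1s in a 0/1 row vector
run_lens = @(m) strfind([0 m 0],[1 0]) - strfind([0 m 0],[0 1]);

A1 = 'ABCDEFGHIJKLM';  B1 = 'XYZ0123EFGHIJAB';   % Example 1 from the question
A2 = 'ABCDEFGHJKLM';   B2 = 'XYZ0123EFGHAB';     % Example 2 from the question
r = 6;

lens1 = run_lens(ismember(A1,B1));   % [2 6]: runs 'AB' and 'EFGHIJ'
lens2 = run_lens(ismember(A2,B2));   % [2 4]: runs 'AB' and 'EFGH'

out1 = any(lens1 >= r);              % true
out2 = any(lens2 >= r);              % false
```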

Here's another with diff and find instead of strfind -

    matches = ismember(A,B) % OR any(bsxfun(@eq,A,B.'),1)
    matches_ext = [0 matches 0]
    df = diff(matches_ext)
    interval_lens = find(df == -1) - find(df == 1)
    out = any(interval_lens >= r)
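A short trace may help show why the `diff` version recovers the run lengths. The mask `m` below is made up for illustration; the variable names are assumptions, not from the answer.

```matlab
m = [1 1 0 1 1 1];                        % illustrative mask: runs of length 2 and 3
m_ext = [0 m 0];                          % [0 1 1 0 1 1 1 0], padded so every run has both edges
df = diff(m_ext);                         % [1 0 -1 1 0 0 -1]
run_starts = find(df == 1);               % [1 4], rising edges
run_stops  = find(df == -1);              % [3 7], falling edges
interval_lens = run_stops - run_starts;   % [2 3]
```

Each run contributes exactly one rising and one falling edge, so the two `find` results always have the same length and can be subtracted element-wise.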

Here is another one with 1D convolution -

    matches = ismember(A,B) % OR any(bsxfun(@eq,A,B.'),1)
    out = any(conv(double(matches),ones(1,r)) == r)
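The convolution works because convolving a 0/1 mask with `ones(1,r)` gives the sum over every window of r consecutive entries, and that sum equals r only when the whole window is 1. A minimal trace on a made-up mask (the names here are illustrative):

```matlab
m = [1 1 0 1 1 1];                              % illustrative mask: longest run is 3
r = 3;
window_sums = conv(double(m), ones(1,r));       % [1 2 2 2 2 3 2 1]
out = any(window_sums == r);                    % true, one window of 3 is all ones
out_longer = any(conv(double(m), ones(1,4)) == 4);   % false, no run of 4
```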

Case: input strings with non-unique characters

Here's one approach using bsxfun -

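    matches = bsxfun(@eq,A,B.');             % matches(i,j) is true when B(i) == A(j)
    [nB,nA] = size(matches);
    intv = (0:r-1)*(nB+1);                   % linear-index offsets down a diagonal
    [row,col] = find(matches);
    keep = row <= nB-r+1 & col <= nA-r+1;    % starts whose r-long diagonal stays inside the matrix
    idx = row(keep) + (col(keep)-1)*nB;
    out = any(all(matches(bsxfun(@plus,idx(:),intv)),2))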
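As a cross-check on the diagonal idea, here is a sketch that uses a different technique from the answer's linear indexing: a 2-D convolution of the comparison matrix with `eye(r)`, which sums every r-long diagonal window. `has_common_run` is a hypothetical name, and the last two inputs are made up for illustration.

```matlab
% has_common_run is a hypothetical helper, not part of the original answer.
% bsxfun(@eq,A,B.') is true at (i,j) when B(i) == A(j); a common run of length r
% shows up as r consecutive true values down a diagonal, which conv2 with eye(r) sums.
has_common_run = @(A,B,r) any(any(conv2(double(bsxfun(@eq,A,B.')), eye(r), 'valid') == r));

out1 = has_common_run('ABCDEFGHIJKLM','XYZ0123EFGHIJAB',6);   % true  ('EFGHIJ')
out2 = has_common_run('ABCDEFGHJKLM','XYZ0123EFGHAB',6);      % false (longest shared run is 'EFGH')
out3 = has_common_run('ABAB','XBABY',3);                      % true  ('BAB', repeated characters)
out4 = has_common_run('ABCDEF','FEDCBA',2);                   % false (same characters, reversed order)
```

The last call shows why positions in both strings matter: a mask from `ismember` only records that each character of A occurs somewhere in B, so it would report a run of 6 for that input even though no two adjacent characters line up.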