Search for shared blocks

I have two files (f1 and f2) containing text (or binary data).
How can I quickly find the blocks they have in common?

For example:
f1: ABC DEF
f2: XXABC XEF

output:

common blocks:
length 4: "ABC " at f1 @ 0 and f2 @ 2
length 2: "EF" at f1 @ 5 and f2 @ 7

3 answers

Wikipedia has pseudo-code for finding the longest common substring of two data sequences. In your case, you would simply pull out of the table every common substring that is not a prefix of another common substring (i.e., the maximal common substrings).
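A minimal sketch of that idea in Python, assuming both inputs fit in memory as `str` or `bytes`. The names `common_blocks` and `min_len` are made up for this illustration and do not come from the Wikipedia article. It fills the usual longest-common-suffix table row by row and reports a run only when it cannot be extended by one more matching element, which is what makes it maximal.

```python
def common_blocks(a, b, min_len=1):
    """Return (length, start_in_a, start_in_b) for every maximal common block.

    a and b may be str or bytes. Offsets are 0-based.
    Runs in O(len(a) * len(b)) time and O(len(b)) extra memory.
    """
    blocks = []
    # prev[j] = length of the common run ending at a[i-2] and b[j-1]
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] != b[j - 1]:
                continue
            # Extend the run that ended one step back on the diagonal.
            cur[j] = prev[j - 1] + 1
            # The run is maximal if the next pair of elements does not match
            # (or either input ends here).
            extends = i < len(a) and j < len(b) and a[i] == b[j]
            if not extends and cur[j] >= min_len:
                blocks.append((cur[j], i - cur[j], j - cur[j]))
        prev = cur
    return blocks
```

For the inputs in the question, `common_blocks("ABC DEF", "XXABC XEF")` returns `[(4, 0, 2), (2, 5, 7)]`; the trailing space is part of the first block. The quadratic running time is fine for small inputs, but for large files a suffix-array or suffix-automaton approach would probably scale better.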


The open-source PMD project has a copy/paste detection module (CPD), which is listed on this page: http://pmd.sourceforge.net/integrations.html

