You can build an index (for example, a trie) that summarizes the input file, then count how many index entries each document matches.
E.g., build a trie from every substring of length 10 in the input file. Then, for each overlapping window of length 10 in the text files, check whether it is in the trie and count the matches.
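A minimal sketch of this idea in Python, assuming the files are already read into strings. The function names (`build_trie`, `contains`, `count_matches`) and the nested-dict trie are my own choices for illustration, not part of any library:

```python
def build_trie(text, k=10):
    """Insert every overlapping substring of length k into a nested-dict trie."""
    root = {}
    for i in range(len(text) - k + 1):
        node = root
        for ch in text[i:i + k]:
            node = node.setdefault(ch, {})
    return root


def contains(trie, window):
    """Return True if window follows an existing path in the trie.

    All stored strings have the same length k, so walking a full
    length-k window without falling off the trie means an exact match.
    """
    node = trie
    for ch in window:
        node = node.get(ch)
        if node is None:
            return False
    return True


def count_matches(trie, text, k=10):
    """Count the overlapping length-k windows of text that occur in the trie."""
    return sum(contains(trie, text[i:i + k]) for i in range(len(text) - k + 1))


# Hypothetical usage: index the input, then score a candidate document.
trie = build_trie("abcdefghijkl", k=10)
score = count_matches(trie, "xxabcdefghijxx", k=10)
```

A higher count relative to the number of windows suggests more shared text. For large files a set of hashed windows may be simpler than a trie, but that is a different trade-off from the one described above.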
Elkamina