Assume two sets of rows:
[ "Mr. Jones", "O'Flaherty", "Bob", "Rob Jenkins" ]
[ "Maxwell O'Flaherty", "Robert Jenkins", "Mrs. Smith" ]
Obviously, both sets contain Maxwell O'Flaherty and Robert Jenkins, just written differently.
Is there an algorithm that can do this mapping programmatically? I am thinking of writing something that goes through each element in an array of strings and looks for a substring that is unique, i.e. not contained in any other element of the same set, and then uses that substring as a kind of hash of the element to match the two sets.
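To make the idea concrete, here is a rough sketch of what I have in mind. It is simplified: instead of arbitrary substrings it uses lowercased, whitespace-separated words as the candidate keys, and the function name `match_by_unique_token` is just something I made up for illustration. A word counts as a key only if it occurs in exactly one element of its own set.

```python
from collections import Counter


def tokens(name):
    """Return the set of lowercased, whitespace-separated words in a name."""
    return set(name.lower().split())


def match_by_unique_token(left, right):
    """Pair elements of `left` and `right` that share a word which is
    unique within each list (hypothetical helper, simplified from substrings
    to whole words)."""
    # In how many elements of each list does every word occur?
    left_counts = Counter(t for name in left for t in tokens(name))
    right_counts = Counter(t for name in right for t in tokens(name))

    # Index the right-hand list by the words that identify exactly one element.
    right_index = {
        t: name
        for name in right
        for t in tokens(name)
        if right_counts[t] == 1
    }

    pairs = []
    for name in left:
        # sorted() keeps the result deterministic across runs.
        for t in sorted(tokens(name)):
            if left_counts[t] == 1 and t in right_index:
                pairs.append((name, right_index[t]))
                break  # one match per left-hand element is enough
    return pairs


first = ["Mr. Jones", "O'Flaherty", "Bob", "Rob Jenkins"]
second = ["Maxwell O'Flaherty", "Robert Jenkins", "Mrs. Smith"]
print(match_by_unique_token(first, second))
```

On the two sets above this pairs "O'Flaherty" with "Maxwell O'Flaherty" and "Rob Jenkins" with "Robert Jenkins", and leaves the rest unmatched. It only works when the shared word is spelled identically in both sets, so it would not connect "Rob" to "Robert" on its own.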