I want to compare two documents regardless of line breaks. If the contents are the same, but the position and number of line breaks differ, I want to match the lines in one document with the corresponding lines in the other.
Given:
Document 1
I went to Paris in July 15, where I met some nice people. And I came back to NY in Aug 15. I am planning to go there soon after I finish what I do.
Document 2
I went to Paris in July 15, where I met some nice people. And I came back to NY in Aug 15. I am planning to go there soon after I finish what I do.
I want an algorithm capable of determining that line 1 in document 1 contains the same text as lines 1 through 5 in document 2, that lines 2 and 3 in document 1 contain the same text as line 6 in document 2, etc.
1 = 1,2,3,4,5
2,3 = 6
4,5,6 = 7,8
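To make the desired grouping concrete, here is a rough non-regex sketch in Python. It assumes the two documents are identical once all whitespace is removed, and it groups lines wherever both documents have a line break at the same character offset. The function name `align_lines` and the sample line breaks are hypothetical, made up for illustration only.

```python
def align_lines(doc1, doc2):
    """Group 1-based line numbers of doc1 and doc2 that cover the same text.

    Assumption: both documents are identical once whitespace is removed.
    """
    if "".join(doc1.split()) != "".join(doc2.split()):
        raise ValueError("documents differ in more than whitespace")

    def line_ends(text):
        # For each non-blank line, record (line number, cumulative count of
        # non-whitespace characters up to the end of that line).
        ends, total = [], 0
        for number, line in enumerate(text.splitlines(), start=1):
            length = len("".join(line.split()))
            if length:
                total += length
                ends.append((number, total))
        return ends

    ends1, ends2 = line_ends(doc1), line_ends(doc2)
    groups = []
    i = j = 0
    while i < len(ends1) and j < len(ends2):
        group1, pos1 = [ends1[i][0]], ends1[i][1]
        group2, pos2 = [ends2[j][0]], ends2[j][1]
        i += 1
        j += 1
        # Pull in more lines from whichever side is behind until both sides
        # end at the same character offset.
        while pos1 != pos2:
            if pos1 < pos2:
                group1.append(ends1[i][0])
                pos1 = ends1[i][1]
                i += 1
            else:
                group2.append(ends2[j][0])
                pos2 = ends2[j][1]
                j += 1
        groups.append((group1, group2))
    return groups


# Hypothetical line breaks, chosen only to demonstrate the grouping.
doc1 = ("I went to Paris in July 15, where I met some nice people.\n"
        "And I came back\n"
        "to NY in Aug 15.\n"
        "I am planning to go there soon after I finish what I do.")
doc2 = ("I went to Paris\n"
        "in July 15,\n"
        "where I met some nice people.\n"
        "And I came back to NY in Aug 15.\n"
        "I am planning to go there soon\n"
        "after I finish what I do.")

for left, right in align_lines(doc1, doc2):
    print(",".join(map(str, left)), "=", ",".join(map(str, right)))
# 1 = 1,2,3
# 2,3 = 4
# 4 = 5,6
```

This only handles the case where the text is truly identical apart from whitespace; documents with real wording differences would probably need a diff-style comparison instead.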
Is there a way, using regular expressions, to match each line in one document even when its text spans multiple lines in the other document?
hmghaly