I have seen a lot of posts about comparing XML, but none of the ones I looked at solved my problem.
We have text documents in XML format (product descriptions with headings and paragraphs) that get updated from version to version, and I have been tasked with producing digests of the changes. That is, we want to take two successive versions of a file and create a third: the heading structure (outline) should be preserved, but only paragraphs that changed should be kept, with additions as well as deletions marked.
So I'm trying to find a way to walk both DOM trees and detect additions and deletions, but I am having trouble detecting them reliably. Obviously I need to do a diff, but I cannot use a plain diff tool, for two reasons: I want a separate diff within each element, and I cannot use traditional diff output, because the result must be a fully formed XML digest.
Any pointers before I set out to implement a solution to the longest common subsequence problem myself, which would be a huge task?
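
For illustration, here is a minimal sketch of the kind of digest described above, written in Python. Everything about the document layout is an assumption: a root element containing flat `section` elements, each with one `title` and any number of `para` children. The element names, the `change` attribute, and the function names `diff_section` and `make_digest` are hypothetical. Instead of a hand-written longest common subsequence, the sketch leans on the standard library's `difflib.SequenceMatcher`, which finds matching runs between two sequences (it is not guaranteed to produce a minimal diff). Inline markup inside a paragraph is flattened to plain text, and nested sections are not handled.

```python
import difflib
import xml.etree.ElementTree as ET


def para_texts(section):
    """Flatten each <para> of a section to its plain text."""
    if section is None:
        return []
    return ["".join(p.itertext()).strip() for p in section.findall("para")]


def diff_section(old_sec, new_sec):
    """Build a <section> that keeps the title but only the changed paragraphs."""
    old_paras = para_texts(old_sec)
    new_paras = para_texts(new_sec)
    source = new_sec if new_sec is not None else old_sec

    out = ET.Element("section")
    ET.SubElement(out, "title").text = source.findtext("title", default="")

    # Compare the two paragraph lists as sequences of whole strings.
    matcher = difflib.SequenceMatcher(a=old_paras, b=new_paras, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged paragraphs are left out of the digest
        for text in old_paras[i1:i2]:
            ET.SubElement(out, "para", change="deleted").text = text
        for text in new_paras[j1:j2]:
            ET.SubElement(out, "para", change="added").text = text
    return out


def make_digest(old_xml, new_xml):
    """Pair sections by title (an assumption) and diff each pair separately."""
    old_root = ET.fromstring(old_xml)
    new_root = ET.fromstring(new_xml)
    old_by_title = {
        s.findtext("title", default=""): s for s in old_root.findall("section")
    }

    digest = ET.Element("digest")
    seen = set()
    for new_sec in new_root.findall("section"):
        title = new_sec.findtext("title", default="")
        seen.add(title)
        digest.append(diff_section(old_by_title.get(title), new_sec))

    # Sections that exist only in the old version: all paragraphs are deletions.
    for title, old_sec in old_by_title.items():
        if title not in seen:
            digest.append(diff_section(old_sec, None))
    return digest
```

Calling `ET.tostring(make_digest(old_xml, new_xml), encoding="unicode")` on two version strings would serialize the digest. Pairing sections by title is the weakest assumption here: a renamed heading would show up as one fully deleted and one fully added section, so a stable identifier on each section, if the documents have one, would be a better key.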