I am working on diffing various large binary files. I implemented the famous Myers diff algorithm, which produces a minimal diff. However, it runs in O(ND), so to diff two very different 1 MB files I expect it to take roughly 1 million squared = 1 trillion operations. That's no good!
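For reference, the back-of-envelope arithmetic behind that estimate (just a sketch of the worst case, not measured from any implementation):

```python
# Myers diff runs in O(N*D), where N is the combined input length and D is
# the size of the minimal edit script. For two completely different files,
# D is on the same order as N, so the work grows roughly like N squared.
N = 1_000_000   # ~1 MB file
D = 1_000_000   # worst case: the files share almost nothing
steps = N * D
print(steps)    # → 1000000000000 (one trillion)
```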
What I would like is an algorithm that produces a potentially non-minimal diff, but does so much faster. I know one must exist, because Beyond Compare does it. But I don't know how!
To be sure: there are tools like xdelta or bdiff, but they produce a patch meant for computer consumption, which is different from a human-consumable diff. A patch is concerned with transforming one file into another, so it can do things like copying from earlier parts of the file. A human-consumable diff is there to visually show the differences, and can only insert and delete. For example, this transformation:
"puddi" โ "puddipuddipuddi"
would make a short patch of "copy [0,4] to [5,9] and to [10,14]", but a longer diff of "append 'puddipuddi'". I'm interested in algorithms that create the longer kind of diff.
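To make the contrast concrete, here is a minimal sketch using Python's standard difflib (chosen only for illustration; it is not Myers diff) showing what an insert/delete-only diff reports for this example:

```python
import difflib

# A human-consumable diff only keeps, inserts, and deletes; it never copies
# earlier content the way a binary patch can.
old = "puddi"
new = "puddipuddipuddi"

sm = difflib.SequenceMatcher(None, old, new)
for tag, i1, i2, j1, j2 in sm.get_opcodes():
    print(tag, old[i1:i2] or new[j1:j2])
# The diff keeps "puddi" and inserts "puddipuddi": ten inserted characters,
# versus the patch's two compact copy instructions.
```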
Thanks!
fish