Semantic Designs' CloneDR detects exact and near-miss duplicate code (clones) based on language structure, so it isn't fooled by changes in whitespace or line breaks, inserted or modified comments, or even renamed variables.
It uses parser front ends to work with C, C++, C#, Java, COBOL, PHP, Python, Fortran, Ada, ...
There are several sample clone analysis reports for different languages at the website.
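
To illustrate why comparing structure rather than text is robust to formatting, comments, and renaming, here is a minimal sketch using Python's standard `ast` module. It is not CloneDR's algorithm or API; the names `structural_fingerprint` and `is_structural_clone` are made up for this example, and it only catches clones that become identical once identifiers are normalized, not near-miss clones in general.

```python
import ast


class _Normalize(ast.NodeTransformer):
    """Replace every identifier with a placeholder (v0, v1, ...).

    Placeholders are assigned in order of first appearance, so two
    fragments that differ only in naming produce the same tree.
    """

    def __init__(self):
        self.names = {}

    def _placeholder(self, name):
        return self.names.setdefault(name, f"v{len(self.names)}")

    def visit_Name(self, node):
        node.id = self._placeholder(node.id)
        return node

    def visit_arg(self, node):
        node.arg = self._placeholder(node.arg)
        return node

    def visit_FunctionDef(self, node):
        node.name = self._placeholder(node.name)
        self.generic_visit(node)
        return node


def structural_fingerprint(source):
    """Parse source and return a text dump of its normalized syntax tree.

    Whitespace and comments never reach the tree, and ast.dump omits
    line/column attributes by default, so layout does not matter.
    """
    tree = _Normalize().visit(ast.parse(source))
    return ast.dump(tree)


def is_structural_clone(a, b):
    """Hypothetical helper: True if both fragments share a fingerprint."""
    return structural_fingerprint(a) == structural_fingerprint(b)


original = "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s\n"
renamed = (
    "def sum_all(items):  # add everything up\n"
    "    acc = 0\n\n"
    "    for item in items:\n"
    "        acc += item\n"
    "    return acc\n"
)
different = "def total(xs):\n    s = 1\n    for x in xs:\n        s *= x\n    return s\n"

print(is_structural_clone(original, renamed))
print(is_structural_clone(original, different))
```

A real detector also has to find near-miss clones, where small parts of the tree differ, and scale to large code bases; this sketch only shows the basic idea of comparing syntax trees instead of text.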
Ira Baxter