If you are working with files, first check the file sizes, and compute a hash only for files that share their size with at least one other file.
Then compare the file hashes. If two hashes are the same, you have a likely duplicate.
There is a trade-off between speed and accuracy: two different files can, in rare cases, produce the same hash. You can improve your decision this way: compute a simple, fast hash to find candidate duplicates. When the hashes differ, the files are different. When they are equal, compute a second hash with a different algorithm. If the second hashes differ, the first match was a false positive. If they are equal again, you very likely have a real duplicate.
In other words:
Get the size of each file and check whether any files share the same size. If some do, generate a fast hash for those files and compare the hashes. If they differ, ignore the files. If they are equal, generate a second hash and compare again. If the second hashes differ, ignore the files. If they are equal, treat the two files as identical.
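The steps above can be sketched in Python as follows. This is only a minimal illustration, not a definitive implementation: the choice of CRC32 as the fast hash and SHA-256 as the second hash is an assumption, and the function names (`fast_hash`, `strong_hash`, `group_by`, `find_duplicates`) are hypothetical.

```python
import hashlib
import os
import zlib
from collections import defaultdict


def fast_hash(path, chunk_size=1 << 16):
    """Cheap first-pass hash (assumption: CRC32 over the whole file)."""
    crc = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
    return crc


def strong_hash(path, chunk_size=1 << 16):
    """Second-pass hash (assumption: SHA-256 over the whole file)."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_by(paths, key):
    """Group paths by key(path), keeping only groups with two or more files."""
    groups = defaultdict(list)
    for path in paths:
        groups[key(path)].append(path)
    return [group for group in groups.values() if len(group) > 1]


def find_duplicates(paths):
    """Return groups of paths that agree on size, fast hash, and second hash."""
    # Step 1: only files that share a size can be duplicates.
    candidates = group_by(paths, os.path.getsize)
    # Steps 2 and 3: narrow each surviving group with the fast hash,
    # then confirm with the second hash.
    for key in (fast_hash, strong_hash):
        candidates = [sub for group in candidates for sub in group_by(group, key)]
    return candidates
```

Calling `find_duplicates` with a list of file paths returns a list of groups, each holding paths that matched at every stage. Files with a unique size are never opened at all, which is where most of the savings come from.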
Hashing every file up front takes too long and is wasted work if most of your files are different, which is why the size check comes first.