In C++, what is the fastest way to find out whether two text or binary files are different?

I am writing a unit test and need to compare the result file with the gold file. What is the easiest way to do this?

So far I have (for Linux environment):

int result = system("diff file1 file2"); 

The files are different if result != 0.

+5
5 answers

If you want a clean solution in C++, I would do something like this:

    #include <algorithm>
    #include <fstream>
    #include <iterator>
    #include <string>

    // Returns true only if both ranges have the same length and equal elements.
    template<typename InputIterator1, typename InputIterator2>
    bool range_equal(InputIterator1 first1, InputIterator1 last1,
                     InputIterator2 first2, InputIterator2 last2)
    {
        while (first1 != last1 && first2 != last2)
        {
            if (*first1 != *first2) return false;
            ++first1;
            ++first2;
        }
        // Equal only if both ranges were exhausted at the same time.
        return (first1 == last1) && (first2 == last2);
    }

    bool compare_files(const std::string& filename1, const std::string& filename2)
    {
        // Binary mode, so no newline translation happens on any platform.
        std::ifstream file1(filename1, std::ios::binary);
        std::ifstream file2(filename2, std::ios::binary);
        if (!file1 || !file2)
            return false; // a file that cannot be opened never matches

        std::istreambuf_iterator<char> begin1(file1);
        std::istreambuf_iterator<char> begin2(file2);
        std::istreambuf_iterator<char> end; // default-constructed: end of stream

        return range_equal(begin1, end, begin2, end);
    }

This avoids reading the entire file into memory and stops as soon as the files differ (or at the end of a file). I wrote range_equal because the three-iterator form of std::equal takes no end iterator for the second range, so it is not safe if the second range is shorter. (C++14 added a four-iterator overload of std::equal that checks both ends.)

+16

Building on DaveS's answer, this version checks the file sizes first:

    #include <algorithm>
    #include <fstream>
    #include <iterator>
    #include <string>

    bool compare_files(const std::string& filename1, const std::string& filename2)
    {
        // Open both files in binary mode, positioned at the end.
        std::ifstream file1(filename1, std::ifstream::ate | std::ifstream::binary);
        std::ifstream file2(filename2, std::ifstream::ate | std::ifstream::binary);
        if (!file1 || !file2)
            return false; // a file that cannot be opened never matches

        const std::ifstream::pos_type fileSize = file1.tellg();
        if (fileSize != file2.tellg())
            return false; // different file sizes

        file1.seekg(0); // rewind
        file2.seekg(0); // rewind

        std::istreambuf_iterator<char> begin1(file1);
        std::istreambuf_iterator<char> begin2(file2);

        // The second argument is the end-of-stream iterator. The sizes are known
        // to match, so the three-iterator std::equal cannot overrun the second range.
        return std::equal(begin1, std::istreambuf_iterator<char>(), begin2);
    }

(Interestingly, before rewinding, file1 could be used to create a more efficient end-of-stream iterator which, knowing the length of the stream, would allow std::equal to process more bytes at a time.)

+2

One way to avoid reading both files is to precompute a hash of the gold file, for example MD5. Then you only need to read and hash the test file. Please note that this can be slower than just reading both files!

Alternatively, layer your checks: look at the file sizes first. If they differ, the files are different, and you can skip the lengthy read-and-compare operation.

+1

This should work:

    #include <fstream>
    #include <iterator>
    #include <streambuf>
    #include <string>

    bool equal_files(const std::string& a, const std::string& b)
    {
        // Read the whole first file into memory.
        std::ifstream stream{a};
        std::string file1{std::istreambuf_iterator<char>(stream),
                          std::istreambuf_iterator<char>()};

        // Reuse the stream object for the second file.
        stream = std::ifstream{b};
        std::string file2{std::istreambuf_iterator<char>(stream),
                          std::istreambuf_iterator<char>()};

        return file1 == file2;
    }

I suspect this is not as fast as diff, but it avoids calling system. This should be good enough for a test case.

0

This may be overkill, but you can build a SHA-256 hash table using boost/bimap and boost/scope_exit.

Here is a video by Stephan T. Lavavej on how to do this (starts at 8:15): http://channel9.msdn.com/Series/C9-Lectures-Stephan-T-Lavavej-Advanced-STL/C9-Lectures-Stephan-T-Lavavej-Advanced-STL-5-of-n

For more information about the algorithm: http://en.wikipedia.org/wiki/SHA-2

0