Why are Crypto++ and Ruby generating slightly different SHA-1 hashes?

I use two different libraries to generate SHA-1 hashes for file verification: an older version of the Crypto++ library and the Digest::SHA1 class that ships with Ruby. I have seen other cases of mismatched hashes caused by encoding differences, but here the two libraries output hashes that are almost identical.

For example, running a file through each process gives the following results:

Crypto++: 01c15e4f46d8181b984fa2a2c740f8f67130acac

Ruby: eac15e4f46d8181b984fa2a2c740f8f67130acac

As you can see, only the first two characters of the hash string (that is, the first byte) differ, and this behavior repeats across many files. I reviewed the source code of each implementation, and the only difference I found at first glance was in how the hex constants used for the 160-bit hash are written. I have no idea how these constants are used in the algorithm, so I thought it would be faster to ask whether anyone has run into this problem before.

I have included the data from the respective libraries below. I also included the values from OpenSSL, as each of the three libraries had slightly different values.

Crypto++:

    digest[0] = 0x67452301L;
    digest[1] = 0xEFCDAB89L;
    digest[2] = 0x98BADCFEL;
    digest[3] = 0x10325476L;
    digest[4] = 0xC3D2E1F0L;

Ruby:

    context->state[0] = 0x67452301;
    context->state[1] = 0xEFCDAB89;
    context->state[2] = 0x98BADCFE;
    context->state[3] = 0x10325476;
    context->state[4] = 0xC3D2E1F0;

OpenSSL:

    #define INIT_DATA_h0 0x67452301UL
    #define INIT_DATA_h1 0xefcdab89UL
    #define INIT_DATA_h2 0x98badcfeUL
    #define INIT_DATA_h3 0x10325476UL
    #define INIT_DATA_h4 0xc3d2e1f0UL

By the way, here is the code used to generate the hash in Ruby. I do not have access to the source code of the Crypto++ implementation.

    File.class_eval do
      def self.hash_digest filename, options = {}
        opts = {:buffer_length => 1024, :method => :sha1}.update(options)
        hash_func = (opts[:method].to_s == 'sha1') ? Digest::SHA1.new : Digest::MD5.new
        open(filename, "r") do |f|
          while !f.eof
            b = f.read
            hash_func.update(b)
          end
        end
        hash_func.hexdigest
      end
    end
2 answers

My guess is that you are off by a byte when printing the SHA-1 hashes. Can we see the code that prints them? If not, here are a couple of potentially useful diagnostics:

  • Make a very short file (say, one word) and enter its contents as a hexadecimal string at http://www.fileformat.info/tool/hash.htm . You do need to know the exact hexadecimal contents of the file, though. You can use xxd on Unix, but you will have to watch for endianness issues. I'm not sure how to do this on other OSs.

  • Does the same file, hashed several times with the same SHA-1 implementation, always print the same value in that first byte? If so, does the value change when you change files?
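
Both diagnostics can be sketched in a few lines of Ruby. This is only an illustration: `hex_bytes` and `repeated_digests` are hypothetical helper names, not part of any library, and the sample input `abc` is chosen because its SHA-1 digest is a published test vector.

```ruby
require 'digest/sha1'

# Hypothetical helper: hex dump of the raw bytes, in file order.
def hex_bytes(data)
  data.unpack('C*').map { |byte| format('%02x', byte) }.join(' ')
end

# Hypothetical helper: hash the same file several times, reading it in binary mode.
def repeated_digests(path, runs = 3)
  Array.new(runs) { Digest::SHA1.hexdigest(File.binread(path)) }
end

# First diagnostic: a tiny input whose bytes and digest are known in advance.
sample = 'abc'
puts hex_bytes(sample)               # 61 62 63
puts Digest::SHA1.hexdigest(sample)  # a9993e364706816aba3e25717850c26c9cd0d89d

# Second diagnostic: only runs when a file path is given on the command line.
if ARGV[0] && File.file?(ARGV[0])
  digests = repeated_digests(ARGV[0])
  puts digests
  puts(digests.uniq.length == 1 ? 'stable across runs' : 'varies between runs')
end
```

If the other library prints a different first byte for the same three-byte input, the problem is very likely in how that digest is printed rather than in how the file is read.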

This doesn't make much sense. If something were wrong with the SHA1 implementation, for example with those constants, it would most likely produce hashes that are completely different from the real SHA1 hashes, not hashes that are off by a single byte. Even if something were wrong with your file-reading loop, say it dropped a newline, changing one byte in the input stream would still give you a completely different hash, not one that is a single byte off from the real SHA1 hash.
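
To illustrate the point, here is a minimal sketch that hashes two strings differing in a single character and counts how many hex positions of the digests differ. The two sample sentences are just an assumption for demonstration; any pair of inputs that differ by one byte would do.

```ruby
require 'digest/sha1'

original = 'The quick brown fox jumps over the lazy dog'
altered  = 'The quick brown fox jumps over the lazy cog'  # one character changed

a = Digest::SHA1.hexdigest(original)
b = Digest::SHA1.hexdigest(altered)

# Count the hex positions at which the two digests differ.
differing = a.chars.zip(b.chars).count { |x, y| x != y }

puts a
puts b
puts "#{differing} of #{a.length} hex characters differ"
```

A one-byte change in the input scatters differences across the whole digest, which is why a mismatch confined to the first byte points away from the hashing itself.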

If I use your method in the following program, I get the correct results.

    #!/usr/bin/env ruby
    require 'digest/sha1'
    require 'digest/md5'

    File.class_eval do
      def self.hash_digest filename, options = {}
        opts = {:buffer_length => 1024, :method => :sha1}.update(options)
        hash_func = (opts[:method].to_s == 'sha1') ? Digest::SHA1.new : Digest::MD5.new
        open(filename, "r") do |f|
          while !f.eof
            b = f.read
            hash_func.update(b)
          end
        end
        hash_func.hexdigest
      end
    end

    puts File.hash_digest(ARGV[0])

Here is its output compared with the output of OpenSSL:

    tmp$ dd if=/dev/urandom of=random.bin bs=1MB count=1
    1+0 records in
    1+0 records out
    1000000 bytes (1.0 MB) copied, 0.287903 s, 3.5 MB/s
    tmp$ ./digest.rb random.bin
    a511d8153426ebea4e4694cde78db4e3a9e413d1
    tmp$ openssl sha1 random.bin
    SHA1(random.bin)= a511d8153426ebea4e4694cde78db4e3a9e413d1

So there is nothing wrong with your hashing method. Something goes wrong between its return value and the point where it is printed.
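
One way to narrow this down is to compare the two printed digests byte by byte. The sketch below does that for the two hashes from the question; `differing_bytes` is a hypothetical helper name, not a library function.

```ruby
# Hypothetical helper: indexes of the bytes at which two hex digests differ.
def differing_bytes(hex_a, hex_b)
  bytes_a = [hex_a].pack('H*').bytes
  bytes_b = [hex_b].pack('H*').bytes
  bytes_a.each_index.select { |i| bytes_a[i] != bytes_b[i] }
end

# The two digests quoted in the question.
crypto_pp = '01c15e4f46d8181b984fa2a2c740f8f67130acac'
ruby      = 'eac15e4f46d8181b984fa2a2c740f8f67130acac'

p differing_bytes(crypto_pp, ruby)  # prints [0]
```

Only byte 0 differs and the remaining 19 bytes match, which is consistent with the digest being computed correctly and then altered in its first byte on the way to the output.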
