Why do hash functions like sha1 only use 16 different characters (hexadecimal)?

Sorry, this is just curiosity on my part.

sha1 uses the characters [a-f0-9] for its hashes. Can anyone tell me why it does not use all possible characters [a-z0-9]? Using all of those characters would greatly increase the number of possible different hashes, thereby reducing the likelihood of a collision.

If you do not think this is a real question, just leave a comment and I will delete it immediately.

===

As the answers point out, sha1 does NOT use only 16 characters. The correct fact: sha1 is 160 bits of binary data (cit.). I added this to prevent confusion.

+4
5 answers

You are confusing presentation with content.

sha1 is 160 bits of binary data. You can just as easily represent it as:

 hex:     0xf1d2d2f924e986ac86fdf7b36c94bcdf32beec15
 decimal: 1380568310619656533693587816107765069100751973397
 binary:  1111000111010010110100101111100100100100111010011000011010101100100001101111110111110111101100110110110010010100101111001101111100110010101111101110110000010101
 base 62: xufK3qj2bZgDrLA0XN0cLv1jZXc

There is nothing magical about hexadecimal. It is simply a very common mechanism for displaying content because it divides neatly on 4-bit boundaries.
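To see that the base really is just presentation, here is a minimal sketch using Ruby's built-in Integer#to_s, which supports bases 2 through 36; base 62 is beyond it, which is why the hand-rolled routine below is needed:

 #!/usr/bin/ruby
 # The same 160-bit value printed in a few different bases with the
 # built-in conversion. Integer#to_s only goes up to base 36, so the
 # base-62 output requires the hand-rolled routine further down.
 n = 0xf1d2d2f924e986ac86fdf7b36c94bcdf32beec15

 puts n.to_s(16)   # hexadecimal, 40 digits
 puts n.to_s(10)   # decimal
 puts n.to_s(2)    # binary, 160 digits
 puts n.to_s(36)   # base 36, [0-9a-z], already shorter than hex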

The base 62 output is generated with this small bit of Ruby:

 #!/usr/bin/ruby
 # Convert an integer to base 62, most significant digit first.
 # Digits 0-9 print as themselves; the rest map into the lower- and
 # uppercase alphabets (c == 10 and c == 36 land on index -1, i.e.
 # 'z' and 'Z', so the digit alphabet is rotated by one place -- it
 # is still a one-to-one mapping).
 def chars_from_hex(s)
   c = s % 62
   s = s / 62
   chars_from_hex(s) if s > 0
   if c < 10
     print c
   elsif c < 36
     print "abcdefghijklmnopqrstuvwxyz"[c-11].chr
   elsif c < 62
     print "ABCDEFGHIJKLMNOPQRSTUVWXYZ"[c-37].chr
   else
     puts "error c", c
   end
 end

 chars_from_hex(0xf1d2d2f924e986ac86fdf7b36c94bcdf32beec15)

It uses the standard idiom for converting from one base to another, treating 0-9 as the digits 0-9, a-z as 10-35, and A-Z as 36-61. It could trivially be extended to support more digits by including, for example, !@#$%^&*()-_=+\|[]{},.<>/?;:'"~` if that were desired. (Or any of the huge array of Unicode codepoints.)
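Going the other way is the same idiom in reverse. A small illustrative sketch that rebuilds the integer by constructing the same digit table chars_from_hex uses (assuming the base-62 string above really was produced by that routine):

 #!/usr/bin/ruby
 # Rebuild the digit table exactly as chars_from_hex picks characters,
 # then use it to turn a base-62 string back into an integer.
 DIGITS = (0...62).map do |c|
   if c < 10
     c.to_s
   elsif c < 36
     "abcdefghijklmnopqrstuvwxyz"[c-11].chr
   else
     "ABCDEFGHIJKLMNOPQRSTUVWXYZ"[c-37].chr
   end
 end

 def int_from_chars(str)
   str.chars.inject(0) { |acc, ch| acc * 62 + DIGITS.index(ch) }
 end

 # Should print the original 160-bit value in hex, assuming the string
 # came from chars_from_hex above.
 printf("%x\n", int_from_chars("xufK3qj2bZgDrLA0XN0cLv1jZXc"))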

@yes123 asked specifically about an ascii representation of the hash, so here is the result of interpreting the 160-bit hash directly as ascii:

 ñÒÒù$é¬ý÷³l¼ß2¾ì 

It doesn't come out well (a quick check of the byte values follows this list) because:

  • ascii does not have a good printable representation for byte values less than 32
  • ascii itself cannot represent byte values greater than 127; bytes between 127 and 255 are interpreted according to iso-8859-1 or some other character encoding scheme
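Here is a rough sketch of that check: unpack the 160-bit value into its 20 bytes and count how many land in the printable ascii range:

 #!/usr/bin/ruby
 # Split the 160-bit value into its 20 bytes and count how many land in
 # the printable ascii range (32..126) -- only a handful do for this value.
 n = 0xf1d2d2f924e986ac86fdf7b36c94bcdf32beec15
 bytes = 20.times.map { |i| (n >> (8 * (19 - i))) & 0xff }

 printable = bytes.count { |b| b.between?(32, 126) }
 puts bytes.inspect
 puts "#{printable} of #{bytes.size} bytes are printable ascii"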

This base conversion can be practically useful; the Base64 encoding uses 64 (rather than 62) characters to represent 6 bits at a time; it needs two extra characters for "digits" plus one character for padding. UUEncoding chose a different set of "digits". And a colleague once had a problem that was easily solved by converting numbers from the input base to a different output base.
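For comparison, Ruby's standard library already ships a Base64 encoder. A small sketch packing the same 160-bit digest into its raw bytes and Base64-encoding it, which gives 28 characters instead of 40 hex digits:

 #!/usr/bin/ruby
 require 'base64'

 # The 160-bit digest from above, packed into its 20 raw bytes and then
 # Base64-encoded: 28 characters (including '=' padding) vs. 40 hex digits.
 hex = "f1d2d2f924e986ac86fdf7b36c94bcdf32beec15"
 raw = [hex].pack("H*")                 # 20 raw bytes
 puts Base64.strict_encode64(raw)       # => 8dLS+STphqyG/fezbJS83zK+7BU=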

+11

This is a false premise. sha1 uses 40 * 4 = 160 bits.

It just happens to be convenient (and therefore conventional) to format this as 40 hexadecimal digits.

You can use a different cryptographic hash with a larger hash size if you feel you are in a problem domain where collisions start to become probable at 160 bits (a short Ruby sketch follows the list):

  sha224: 224 bits
  sha256: 256 bits
  md5:    128 bits
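Most of these are available from Ruby's standard Digest library; a minimal sketch (the input string is arbitrary) showing that the hex digest length is always the hash size divided by 4:

 #!/usr/bin/ruby
 require 'digest'        # MD5, SHA1
 require 'digest/sha2'   # SHA256 (and SHA384 / SHA512)

 # Hex digest length tracks the hash size: each hex character encodes 4 bits.
 msg = "hello world"
 puts Digest::MD5.hexdigest(msg).length      # 32 chars -> 128 bits
 puts Digest::SHA1.hexdigest(msg).length     # 40 chars -> 160 bits
 puts Digest::SHA256.hexdigest(msg).length   # 64 chars -> 256 bits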
+2

Using hex just makes it easier to read. SHA1 produces 160 bits. Encoding them as hex makes it easy to display and transmit the digest as a string. That's all.
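As a small illustration with Ruby's standard Digest library (the input string here is arbitrary), the raw digest and its hex form are just two views of the same bits:

 #!/usr/bin/ruby
 require 'digest'

 # The same 160 bits, two presentations: raw bytes vs. a printable hex string.
 sha = Digest::SHA1.new
 sha.update("some message")
 puts sha.digest.bytesize     # 20 raw bytes -- awkward to display or paste
 puts sha.hexdigest           # 40 hex characters -- easy to print and copy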

+2

The output of the hash algorithm is just bits. Showing them in hexadecimal is purely a matter of presentation. The length is 0 mod 16, so it splits evenly into hex digits; a representation in, say, base 17 would be inconvenient.
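A tiny sketch of that point, assuming Ruby: 160 bits split exactly into 4-bit hex digits, while a base that is not a power of two has no such clean grouping:

 #!/usr/bin/ruby
 # 160 bits divide evenly into 4-bit groups, one hex digit each;
 # base 17 gives no such clean fit.
 puts 160 / 4                   # 40 hex digits, exactly
 puts Math.log(2**160, 17)      # ~39.1 -- a fractional number of base-17 "digits"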

+1

sha-1 produces a 160-bit hash, i.e. 20 bytes, which gives 1461501637330902918203684832716283019655932542976 possible values. That is simply how the hash algorithm is defined.

However, it is often useful to encode the hash as readable text, and a convenient way is simply to encode those 20 bytes as hexadecimal (which takes up 40 characters). And the hexadecimal characters happen to be [a-f0-9].
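A minimal sketch of that last step, assuming Ruby and an arbitrary input string: each of the 20 digest bytes becomes two hex characters, giving the 40-character string:

 #!/usr/bin/ruby
 require 'digest'

 # Each of the 20 digest bytes becomes two characters from [0-9a-f].
 raw = Digest::SHA1.digest("hello world")   # arbitrary input string
 hex = raw.unpack("H*").first               # same result as Digest::SHA1.hexdigest
 puts raw.bytesize                          # 20
 puts hex.length                            # 40
 puts hex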

+1
