Bash: how do I compute a hash value of a string?

I just want to convert a string of any length to an integer value. Each line should produce an integer, which may or may not be unique. Is there an existing open-source command that does this?

Bonus points if the integers are unique, for example by computing the string's position in lexicographic order with a bash command.

+5
2 answers

You need to be careful when using the hash functions built into common programming languages. It has become customary to introduce randomized seeds into hash functions, so hash values are consistent only within a single program run. This avoids the denial-of-service attacks described in oCERT advisory 2011-003. (As the advisory notes, the problem was described in 2003 in a paper presented at USENIX.)

For example, the Python hash function has been randomized by default since v3.3:

 $ python3 -c 'from sys import argv;print(hash(argv[1]))' abc
 -2595772619214671013
 $ python3 -c 'from sys import argv;print(hash(argv[1]))' abc
 -6001956461950650533
 $ python3 -c 'from sys import argv;print(hash(argv[1]))' abc
 -7414807274805087300
 $ python3 -c 'from sys import argv;print(hash(argv[1]))' abc
 -327608370992723225
 # Python2 generates consistent hash values
 $ python -c 'from sys import argv;print(hash(argv[1]))' abc
 1453079729188098211
 $ python -c 'from sys import argv;print(hash(argv[1]))' abc
 1453079729188098211
 $ python -c 'from sys import argv;print(hash(argv[1]))' abc
 1453079729188098211

You can control hash randomization in Python by setting the environment variable PYTHONHASHSEED.
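As a minimal sketch of pinning the seed, assuming python3 is on the PATH: the seed value 0 below is an arbitrary choice for illustration (0 disables randomization), and the variable names first and second are placeholders. The resulting number may still differ between Python versions or builds, so this only makes runs of the same interpreter repeatable.

```shell
# Run the same one-liner twice with a fixed seed and compare the results.
# The seed value 0 is an illustrative assumption; any fixed value works.
first=$(PYTHONHASHSEED=0 python3 -c 'from sys import argv; print(hash(argv[1]))' abc)
second=$(PYTHONHASHSEED=0 python3 -c 'from sys import argv; print(hash(argv[1]))' abc)

# With a fixed seed the two runs should agree.
if [ "$first" = "$second" ]; then
    echo "stable: $first"
else
    echo "unstable: $first vs $second"
fi
```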

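Or you can use a standardized cryptographic hash such as SHA-1. The commonly available sha1sum utility prints the result in hexadecimal, but you can convert it to decimal with bash (truncated to 64 bits). The trailing 0 turns the " -" file name marker that sha1sum prints after the hash into a harmless subtraction of zero: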

 $ echo $((0x$(sha1sum <<<"string to hash")0))
 -7037254581539467098

or in its full 160-bit glory using bc (which requires hex to be uppercase):

 $ bc <<<ibase=16\;$(sha1sum <<<"string to hash"|tr a-z A-Z)0
 861191872165666513280590001082621748432296579238

If you only need a hash value modulo some power of 16, you can use the first few hex digits of the SHA-1 sum. (You can use any selection of digits; they are all equally well distributed, but the first ones are somewhat easier to extract.)

 $ echo $((0x$(sha1sum <<<"string to hash"|cut -c1-2)))
 150
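The same idea can be stretched to a range that is not a power of 16 by taking a remainder. The following is only a sketch under stated assumptions: bucket_of is a hypothetical helper name, and the choice of 15 hex digits (60 bits) is mine, picked so the value stays positive in signed 64-bit shell arithmetic. The printf with a trailing newline feeds sha1sum the same bytes as the here-string form.

```shell
# bucket_of STRING N: map STRING to an integer in the range 0..N-1.
# (bucket_of is a hypothetical helper, not a standard command.)
bucket_of() {
    # Take the first 15 hex digits (60 bits) of the SHA-1 sum so the
    # number is always non-negative in signed 64-bit arithmetic.
    hex=$(printf '%s\n' "$1" | sha1sum | cut -c1-15)
    echo $(( 0x$hex % $2 ))
}

bucket_of "string to hash" 10
```

Because N rarely divides 2^60 exactly, the buckets are very slightly uneven, which is negligible for small N.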

Note: As @gniourf_gniourf points out in a comment, the above does not actually compute the SHA-1 checksum of the given string, because bash's here-string syntax (<<<word) appends a newline to word. Since the checksum of a string with a newline appended is just as good a hash as the checksum of the string itself, this is not a problem as long as you always use the same mechanism to produce the hash.
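If the hash has to match what other tools compute for the exact string, one option is to feed sha1sum through printf without a newline. This is a small sketch; sha1_exact is a hypothetical helper name.

```shell
# sha1_exact STRING: hash exactly the bytes of STRING, with no trailing newline.
# (sha1_exact is a hypothetical helper, not a standard command.)
sha1_exact() {
    printf '%s' "$1" | sha1sum | cut -d ' ' -f 1
}

# "abc" is a standard SHA-1 test vector:
sha1_exact abc    # a9993e364706816aba3e25717850c26c9cd0d89d

# With the newline included, as the here-string does, the digest differs:
printf '%s\n' abc | sha1sum | cut -d ' ' -f 1
```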

+12

You can use sum or cksum (the latter is preferred) to generate a base-10 integer:

 $ cksum <<< 'hello world' | cut -f 1 -d ' '
 3733384285
 $ cksum <<< 'goodbye world' | cut -f 1 -d ' '
 2600070097
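Since the question asks for one integer per line, the cksum call can be wrapped in a read loop. This is a rough sketch, and hash_lines is a hypothetical helper name.

```shell
# hash_lines: read lines from standard input and print one CRC per line.
# (hash_lines is a hypothetical helper, not a standard command.)
hash_lines() {
    while IFS= read -r line; do
        # printf re-adds the newline, so each value matches `cksum <<< "$line"`.
        printf '%s\n' "$line" | cksum | cut -d ' ' -f 1
    done
}

printf '%s\n' 'hello world' 'goodbye world' | hash_lines
```

Each printed value is the same as the one the corresponding single cksum here-string call produces. Starting a pipeline per line is slow for large inputs, so this suits short lists better than big files.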

If you are interested in the math behind these simple hashes, check out the original implementations:

+5

Source: https://habr.com/ru/post/1214623/

