Haskell SHA1 Encoding

I have a list of file paths, and I want to compute the SHA-1 hash of each file and collect the hashes in a list again. It should be as general as possible, so the files can be either text or binary. My questions are:

  • What packages should be used and why?
  • How consistent is the approach? By this I mean: can the result differ from that of other programs that compute SHA-1 hashes (for example, sha1sum)?
1 answer

The cryptohash package is probably the easiest to use. Just read your input into a lazy[1] ByteString and use the hashlazy function to get a strict ByteString containing the resulting hash. Here is a small sample program that you can use to compare the output with that of sha1sum.

    import Crypto.Hash.SHA1 (hashlazy)
    import qualified Data.ByteString as Strict
    import qualified Data.ByteString.Lazy as Lazy
    import System.Process (system)
    import Text.Printf (printf)

    hashFile :: FilePath -> IO Strict.ByteString
    hashFile = fmap hashlazy . Lazy.readFile

    toHex :: Strict.ByteString -> String
    toHex bytes = Strict.unpack bytes >>= printf "%02x"

    test :: FilePath -> IO ()
    test path = do
      hashFile path >>= putStrLn . toHex
      system $ "sha1sum " ++ path
      return ()

Since it reads raw bytes, not characters, there should be no encoding problems, and it should always give the same result as sha1sum:

    > test "/usr/share/dict/words"
    d6e483cb67d6de3b8cfe8f4952eb55453bb99116
    d6e483cb67d6de3b8cfe8f4952eb55453bb99116  /usr/share/dict/words
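The question asks about a whole list of paths, so here is a minimal sketch of how the single-file version could be mapped over a list. The name hashFiles is hypothetical, and the use of evaluate is my own assumption rather than part of the program above: it forces each digest before moving on, so that each lazily read file should be fully consumed before the next one is opened.

```haskell
import Control.Exception (evaluate)
import Crypto.Hash.SHA1 (hashlazy)            -- from the cryptohash package
import qualified Data.ByteString as Strict
import qualified Data.ByteString.Lazy as Lazy
import Text.Printf (printf)

-- Hash one file. 'evaluate' forces the digest to be computed here,
-- which reads the lazily loaded file to the end before we return.
hashFile :: FilePath -> IO Strict.ByteString
hashFile path = Lazy.readFile path >>= evaluate . hashlazy

-- Render the raw digest bytes as lowercase hexadecimal.
toHex :: Strict.ByteString -> String
toHex bytes = Strict.unpack bytes >>= printf "%02x"

-- Hypothetical helper: hash every path in order and return the
-- hex digests in the same order as the input list.
hashFiles :: [FilePath] -> IO [String]
hashFiles = mapM (fmap toHex . hashFile)
```

In GHCi, something like hashFiles ["a.txt", "b.bin"] >>= mapM_ putStrLn would then print one digest per line (the file names here are placeholders).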

This also works for any hash supported by the cryptohash package. Just change the import, for example to Crypto.Hash.SHA256, to use a different hash.

[1] Using lazy ByteStrings avoids loading the entire file into memory at once, which is important when working with large files.
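If you would rather not rely on lazy I/O, the same constant-memory behaviour can be sketched with the incremental interface of Crypto.Hash.SHA1 (init, update, finalize), feeding the file in fixed-size strict chunks. This is an alternative to the answer's approach, not a replacement for it; the name hashFileChunked and the 64 KiB chunk size are assumptions chosen for illustration.

```haskell
import qualified Crypto.Hash.SHA1 as SHA1     -- from the cryptohash package
import qualified Data.ByteString as Strict
import System.IO (IOMode (ReadMode), withBinaryFile)

-- Assumed chunk size; any reasonable value should work.
chunkSize :: Int
chunkSize = 64 * 1024

-- Hypothetical helper: hash a file by reading it chunk by chunk,
-- so only one chunk is held in memory at a time.
hashFileChunked :: FilePath -> IO Strict.ByteString
hashFileChunked path = withBinaryFile path ReadMode (go SHA1.init)
  where
    go ctx handle = do
      chunk <- Strict.hGetSome handle chunkSize
      if Strict.null chunk
        -- An empty chunk signals end of file: produce the digest.
        then return $! SHA1.finalize ctx
        -- Otherwise fold the chunk into the context and continue,
        -- forcing the new context so no thunks pile up.
        else do
          let ctx' = SHA1.update ctx chunk
          ctx' `seq` go ctx' handle
```

The result is the same raw 20-byte digest that hashlazy returns, so it can be passed to the toHex function from the answer.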
