How to generate md5 hashes for large files using VBA?

I have the following functions to generate md5 hashes for files. Functions work fine for small files, but crash and generation Run-time error 7 - Not enough memory when I try to hash files over ~ 250 MB (I don't know at what exact size the breaks are, but files below 200 MB work fine).

I don’t understand why it breaks down in a certain size, therefore, if someone can shed light on this, I would really appreciate it.

Also, is there anything I can do to make functions handle large files? I intend to use functions in a larger tool where I will need to generate hashes for files of unknown sizes. Most of them will be small enough for the current functions to work, but I will also have to process large files.

I got my current functions from the most used answer to this post. How to get MD5 hexagonal hash for a file using VBA?

Public Function FileToMD5Hex(ByVal strFileName As String) As String Dim varEnc As Variant Dim varBytes As Variant Dim strOut As String Dim intPos As Integer Set varEnc = CreateObject("System.Security.Cryptography.MD5CryptoServiceProvider") 'Convert the string to a byte array and hash it varBytes = GetFileBytes(strFileName) varBytes = varEnc.ComputeHash_2((varBytes)) 'Convert the byte array to a hex string For intPos = 1 To LenB(varBytes) strOut = strOut & LCase(Right("0" & Hex(AscB(MidB(varBytes, intPos, 1))), 2)) Next FileToMD5Hex = strOut Set varEnc = Nothing End Function Private Function GetFileBytes(ByVal strPath As String) As Byte() Dim lngFileNum As Long Dim bytRtnVal() As Byte lngFileNum = FreeFile 'If file exists, get number of bytes If LenB(Dir(strPath)) Then Open strPath For Binary Access Read As lngFileNum ReDim bytRtnVal(LOF(lngFileNum)) As Byte Get lngFileNum, , bytRtnVal Close lngFileNum Else MsgBox "Filen finns inte" & vbCrLf & "Avbryter", vbCritical, "Filen hittades inte" Exit Function End If GetFileBytes = bytRtnVal Erase bytRtnVal End Function 

thanks

+3
vba excel-vba hash
source share
2 answers

It looks like you have reached the limit of memory. It would be best to calculate the MD5 file by block:

 Public Function ComputeMD5(filepath As String) As String Dim buffer() As Byte, svc As Object, hFile%, blockSize&, i& blockSize = 2 ^ 16 ' open the file ' If Len(Dir(filepath)) Then Else Err.Raise 5, , "file not found" & vbCr & filepath hFile = FreeFile Open filepath For Binary Access Read As hFile ' allocate buffer ' If LOF(hFile) < blockSize Then blockSize = ((LOF(hFile) + 1024) \ 1024) * 1024 ReDim buffer(0 To blockSize - 1) ' compute hash ' Set svc = CreateObject("System.Security.Cryptography.MD5CryptoServiceProvider") For i = 1 To LOF(hFile) \ blockSize Get hFile, , buffer svc.TransformBlock buffer, 0, blockSize, buffer, 0 Next Get hFile, , buffer svc.TransformFinalBlock buffer, 0, LOF(hFile) Mod blockSize buffer = svc.Hash ' cleanup ' svc.Clear Close hFile ' convert to an hexa string ' ComputeMD5 = String$(32, "0") For i = 0 To 15 Mid$(ComputeMD5, i + i + 2 + (buffer(i) > 15)) = Hex(buffer(i)) Next End Function 
+5
source share

This is an extension for FlorentB's answer that worked brilliantly for me until my files exceeded the 2GB LOF () size limit.

I tried to adapt to get the file length by alternative means as follows:

 Public Function ComputeMD5(filepath As String) As String If Len(Dir(filepath)) Then Else Err.Raise 5, , "File not found." & vbCr & filepath Dim blockSize As Long: blockSize = 2 ^ 20 Dim blockSize_f As Double Dim buffer() As Byte Dim fileLength As Variant Dim hFile As Integer Dim n_Reads As Long Dim i As Long Dim svc As Object: Set svc = CreateObject("System.Security.Cryptography.MD5CryptoServiceProvider") fileLength = DecGetFileSize(filepath) If fileLength < blockSize Then blockSize = ((fileLength + 1024) \ 1024) * 1024 ReDim buffer(0 To blockSize - 1) n_Reads = fileLength / blockSize blockSize_f = fileLength - (CDbl(blockSize) * n_Reads) hFile = FreeFile Open filepath For Binary Access Read As hFile For i = 1 To n_Reads Get hFile, i, buffer svc.TransformBlock buffer, 0, blockSize, buffer, 0 Next i Get hFile, i, buffer svc.TransformFinalBlock buffer, 0, blockSize_f buffer = svc.Hash svc.Clear Close hFile ComputeMD5 = String$(32, "0") For i = 0 To 15 Mid$(ComputeMD5, i + i + 2 + (buffer(i) > 15)) = Hex(buffer(i)) Next End Function Public Function DecGetFileSize(fname As String) As Variant Dim fso As New FileSystemObject Dim f: Set f = fso.GetFile(fname) DecGetFileSize = CDec(f.Size) Set f = Nothing Set fso = Nothing End Function 

This all works fine by returning a line, however this line is not equal to MD5 calculated using other tools in the same file.

I cannot understand where the discrepancy arises.

I checked and double checked filelength, n_reads, blockSize and blockSize_f, and I'm sure that these values ​​are correct.

I had some problems with the Get function, where if I had not explicitly indicated the block number to it, it dies in block 2048.

Any ideas / pointers would be highly appreciated.

0
source share

All Articles