Convert a VERY LARGE binary file to a Base64 string incrementally

I need help converting a VERY LARGE binary file (a ZIP file) to a Base64 string and back again. The files are too large to load into memory all at once (doing so throws OutOfMemoryException), otherwise this would be a simple task. I do not want to process the contents of the ZIP file individually; I want to process the ZIP file as a whole.

Problem:

I can convert the whole ZIP file (test sizes currently range from 1 MB to 800 MB) to a Base64 string, but when I convert it back, the result is corrupted. The new ZIP file is the right size, it is recognized as a ZIP file by Windows, WinRAR, 7-Zip, etc., and I can even look inside it and see the contents with the correct sizes and properties. But when I try to extract from it I get "Error: 0x80004005", which is a generic error code.

I am not sure where or why the corruption occurs. I did some investigating and noticed the following:

With a large text file, converting to a Base64 string piece by piece is easy. If calling Convert.ToBase64String on the whole file gives "abcdefghijklmnopqrstuvwx", then calling it on the file in two parts gives "abcdefghijkl" and "mnopqrstuvwx".

Unfortunately, if the file is binary, the result is different. While the whole file might give "abcdefghijklmnopqrstuvwx", processing it in two parts gives something like "oiweh87yakgb" and "kyckshfguywp".

Is there a way to Base64-encode a binary file incrementally while avoiding this corruption?

My code is:

    private void ConvertLargeFile()
    {
        FileStream inputStream = new FileStream("C:\\Users\\test\\Desktop\\my.zip", FileMode.Open, FileAccess.Read);
        // MultipleOfThree: buffer size constant (a multiple of 3), not shown here.
        byte[] buffer = new byte[MultipleOfThree];
        int bytesRead = inputStream.Read(buffer, 0, buffer.Length);
        while (bytesRead > 0)
        {
            // Keep a copy of the current chunk, then read ahead to detect the final chunk.
            byte[] secondaryBuffer = new byte[buffer.Length];
            int secondaryBufferBytesRead = bytesRead;
            Array.Copy(buffer, secondaryBuffer, buffer.Length);
            bool isFinalChunk = false;
            Array.Clear(buffer, 0, buffer.Length);
            bytesRead = inputStream.Read(buffer, 0, buffer.Length);
            if (bytesRead == 0)
            {
                // Final chunk: trim it to the number of bytes actually read.
                isFinalChunk = true;
                buffer = new byte[secondaryBufferBytesRead];
                Array.Copy(secondaryBuffer, buffer, buffer.Length);
            }
            String base64String = Convert.ToBase64String(isFinalChunk ? buffer : secondaryBuffer);
            File.AppendAllText("C:\\Users\\test\\Desktop\\Base64Zip", base64String);
        }
        inputStream.Dispose();
    }

Decoding is much the same. I use the length of the base64String variable above (which varies with the original buffer size I'm testing with) as the size of the buffer to decode. Then, instead of Convert.ToBase64String(), I call Convert.FromBase64String() and write to a different file name / path.
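A minimal sketch of the chunked decoding described above, not the code actually used in the question. It assumes the encoded file contains only Base64 characters with no line breaks, and that every encoded chunk except the last came from a multiple of 3 bytes, so padding appears only at the very end. The class name, method name, and parameters are hypothetical.

```csharp
using System;
using System.IO;

public static class Base64ChunkDecoder
{
    // Hypothetical helper: decodes a Base64 text file to a binary file chunk by chunk.
    // charsPerChunk must be a multiple of 4, because 4 Base64 characters encode 3 bytes.
    public static void DecodeFile(string base64Path, string outputPath, int charsPerChunk = 4 * 1024)
    {
        if (charsPerChunk <= 0 || charsPerChunk % 4 != 0)
            throw new ArgumentException("Chunk size must be a positive multiple of 4.", nameof(charsPerChunk));

        using (var reader = new StreamReader(base64Path))
        using (var output = new FileStream(outputPath, FileMode.Create, FileAccess.Write))
        {
            char[] chars = new char[charsPerChunk];
            int charsRead;
            // ReadBlock keeps reading until the buffer is full or the file ends,
            // so only the last chunk can be shorter than charsPerChunk.
            while ((charsRead = reader.ReadBlock(chars, 0, chars.Length)) > 0)
            {
                byte[] bytes = Convert.FromBase64CharArray(chars, 0, charsRead);
                output.Write(bytes, 0, bytes.Length);
            }
        }
    }
}
```

If the encoded file was produced from chunks that were not multiples of 3 bytes, padding characters end up in the middle of the text and a decoder like this would be expected to throw a FormatException rather than silently produce a damaged file.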

EDIT:

In my rush to cut down the code (I moved it into a new project, separate from the other processing, to exclude code that is not central to this problem), I introduced a bug. The Base64 conversion must be performed on secondaryBuffer for every iteration except the last (identified by isFinalChunk), where buffer should be used instead. I have corrected the code above.

EDIT No. 2:

Thank you all for your comments / feedback. After fixing the bug (see the edit above), I re-tested my code and it actually works now. I intend to test and implement @rene's solution, as it seems to be the best, but I thought I should let everyone know what I found.

+7
c# base64 zipfile
3 answers

The following code works. It is based on the code in a blog post by Wiktor Zychla. The same solution is mentioned in the comments section of Convert.ToBase64String, as pointed out by Ivan Stoev.

    // using System.Security.Cryptography
    private void ConvertLargeFile()
    {
        // encode
        var filein = @"C:\Users\test\Desktop\my.zip";
        var fileout = @"C:\Users\test\Desktop\Base64Zip";
        using (FileStream fs = File.Open(fileout, FileMode.Create))
        using (var cs = new CryptoStream(fs, new ToBase64Transform(), CryptoStreamMode.Write))
        using (var fi = File.Open(filein, FileMode.Open))
        {
            fi.CopyTo(cs);
        }
        // the zip file is now stored in Base64Zip

        // and decode
        using (FileStream f64 = File.Open(fileout, FileMode.Open))
        using (var cs = new CryptoStream(f64, new FromBase64Transform(), CryptoStreamMode.Read))
        using (var fo = File.Open(filein + ".orig", FileMode.Create))
        {
            cs.CopyTo(fo);
        }
        // the original file is in my.zip.orig
        // use the command-line tool
        //   fc my.zip my.zip.orig
        // to verify that the start file and the encoded and decoded file
        // are the same
    }

The code uses standard classes from the System.Security.Cryptography namespace: CryptoStream, ToBase64Transform, and its counterpart FromBase64Transform.
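The same idea can be sketched against arbitrary streams rather than fixed file paths, which makes it easier to try out in memory. This is only an illustration under the assumption that the input holds plain Base64 text when decoding; the class and method names are hypothetical.

```csharp
using System.IO;
using System.Security.Cryptography;

public static class Base64Streams
{
    // Hypothetical helper: Base64-encodes everything in input and writes it to output.
    public static void Encode(Stream input, Stream output)
    {
        using (var cs = new CryptoStream(output, new ToBase64Transform(), CryptoStreamMode.Write))
        {
            input.CopyTo(cs);
        } // Disposing the CryptoStream writes the final padded block and closes output.
    }

    // Hypothetical helper: decodes Base64 text from input and writes the bytes to output.
    public static void Decode(Stream input, Stream output)
    {
        using (var cs = new CryptoStream(input, new FromBase64Transform(), CryptoStreamMode.Read))
        {
            cs.CopyTo(output);
        } // Disposing the CryptoStream closes input; output is left open.
    }
}
```

Because CopyTo moves the data through a small internal buffer, memory use should stay roughly constant regardless of file size.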

+10

You can avoid using a secondary buffer by passing the offset and length to Convert.ToBase64String, for example:

    private void ConvertLargeFile()
    {
        using (var inputStream = new FileStream("C:\\Users\\test\\Desktop\\my.zip", FileMode.Open, FileAccess.Read))
        {
            byte[] buffer = new byte[MultipleOfThree];
            int bytesRead = inputStream.Read(buffer, 0, buffer.Length);
            while (bytesRead > 0)
            {
                String base64String = Convert.ToBase64String(buffer, 0, bytesRead);
                File.AppendAllText("C:\\Users\\test\\Desktop\\Base64Zip", base64String);
                bytesRead = inputStream.Read(buffer, 0, buffer.Length);
            }
        }
    }

The above should work, but I think rene's answer is actually the best solution.
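As a rough illustration of why the buffer length needs to be a multiple of three: Base64 maps every 3 input bytes to 4 output characters and pads a shorter tail with '=', so separately encoded chunks concatenate cleanly only when each chunk except the last is a multiple of 3 bytes long. The helper below is a hypothetical sketch for experimenting with chunk sizes in memory, not part of the answer's code.

```csharp
using System;
using System.Text;

public static class ChunkedBase64Demo
{
    // Hypothetical helper: encodes data chunk by chunk and concatenates the pieces.
    public static string EncodeInChunks(byte[] data, int chunkSize)
    {
        var sb = new StringBuilder();
        for (int offset = 0; offset < data.Length; offset += chunkSize)
        {
            // The last chunk may be shorter than chunkSize.
            int count = Math.Min(chunkSize, data.Length - offset);
            sb.Append(Convert.ToBase64String(data, offset, count));
        }
        return sb.ToString();
    }
}
```

For a 10-byte array, EncodeInChunks(data, 6) matches Convert.ToBase64String(data), while EncodeInChunks(data, 4) does not, because padding characters are emitted at the end of each 4-byte chunk in the middle of the output.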

+8

Use this code:

    public void ConvertLargeFile(string source, string destination)
    {
        using (FileStream inputStream = new FileStream(source, FileMode.Open, FileAccess.Read))
        {
            int buffer_size = 30000; // or any multiple of 3
            byte[] buffer = new byte[buffer_size];
            int bytesRead = inputStream.Read(buffer, 0, buffer.Length);
            while (bytesRead > 0)
            {
                byte[] buffer2 = buffer;
                if (bytesRead < buffer_size)
                {
                    buffer2 = new byte[bytesRead];
                    Buffer.BlockCopy(buffer, 0, buffer2, 0, bytesRead);
                }
                string base64String = System.Convert.ToBase64String(buffer2);
                File.AppendAllText(destination, base64String);
                bytesRead = inputStream.Read(buffer, 0, buffer.Length);
            }
        }
    }
+1
