File encoding errors with Base64 in Java

I have this class for encoding and decoding a file. When I run it with .txt files, the result is correct. But when I run it with .jpg or .doc files, I cannot open the output file, or it is not identical to the original. I do not know why this is happening. I adapted this class from http://myjeeva.com/convert-image-to-string-and-string-to-image-in-java.html , but I want to change this line

byte imageData[] = new byte[(int) file.length()]; 

to

 byte example[] = new byte[1024]; 

and read the file in as many chunks as needed. Thanks.

    import java.io.*;
    import java.util.*;

    public class Encode {

input = path of the input file, output = path of the output file, imageDataString = the encoded string

        String input;
        String output;
        String imageDataString;

        public void setFileInput(String input) {
            this.input = input;
        }

        public void setFileOutput(String output) {
            this.output = output;
        }

        public String getFileInput() {
            return input;
        }

        public String getFileOutput() {
            return output;
        }

        public String getEncodeString() {
            return imageDataString;
        }

        public String processCode() {
            StringBuilder sb = new StringBuilder();
            try {
                File fileInput = new File(getFileInput());
                FileInputStream imageInFile = new FileInputStream(fileInput);

In the examples, I saw that people create a byte[] with the same length as the file. I do not want this because I do not know how long the file will be.

                byte buff[] = new byte[1024];
                int r = 0;
                while ((r = imageInFile.read(buff)) > 0) {
                    String imageData = encodeImage(buff);
                    sb.append(imageData);
                    if (imageInFile.available() <= 0) {
                        break;
                    }
                }
            } catch (FileNotFoundException e) {
                System.out.println("File not found" + e);
            } catch (IOException ioe) {
                System.out.println("Exception while reading the file " + ioe);
            }
            imageDataString = sb.toString();
            return imageDataString;
        }

        public void processDecode(String str) throws IOException {
            byte[] imageByteArray = decodeImage(str);
            File fileOutput = new File(getFileOutput());
            FileOutputStream imageOutFile = new FileOutputStream(fileOutput);
            imageOutFile.write(imageByteArray);
            imageOutFile.close();
        }

        public static String encodeImage(byte[] imageByteArray) {
            return Base64.getEncoder().withoutPadding().encodeToString(imageByteArray);
        }

        public static byte[] decodeImage(String imageDataString) {
            return Base64.getDecoder().decode(imageDataString);
        }

        public static void main(String[] args) throws IOException {
            Encode a = new Encode();
            a.setFileInput("C://Users//xxx//Desktop//original.doc");
            a.setFileOutput("C://Users//xxx//Desktop//original-copied.doc");
            a.processCode();
            a.processDecode(a.getEncodeString());
            System.out.println("COPIED");
        }
    }

I tried to change

 String imageData = encodeImage(buff); 

to

 String imageData = encodeImage(buff,r); 

and encodeImage method

    public static String encodeImage(byte[] imageByteArray, int r) {
        byte[] aux = new byte[r];
        for (int i = 0; i < aux.length; i++) {
            aux[i] = imageByteArray[i];
            if (aux[i] <= 0) {
                break;
            }
        }
        return Base64.getDecoder().decode(aux);
    }

But I have an error:

 Exception in thread "main" java.lang.IllegalArgumentException: Last unit does not have enough valid bits 
+5
2 answers

You have two problems in your program.

The first, as @Joop Eggen mentioned, is that you are not processing your input correctly.

In fact, Java does not promise that a read will fill all 1024 bytes, even in the middle of a file. It could read just 50 bytes and tell you that it read 50 bytes, and then read more than 50 bytes the next time.

Suppose you read 1024 bytes in the previous round, and in the current round you read only 50. Your byte array now contains 50 new bytes, and the rest are old bytes from the previous read!

Therefore, you always need to copy exactly the number of bytes read into a new array and pass that array to your encoding function.

So, to fix this specific problem, you need to do something like:

    while ((r = imageInFile.read(buff)) > 0) {
        byte[] realBuff = Arrays.copyOf(buff, r);
        String imageData = encodeImage(realBuff);
        ...
    }

However, this is not the only problem. Your real problem is the Base64 encoding itself.

What Base64 does is take your bytes, break them into 6-bit chunks, and treat each chunk as a number N between 0 and 63. It then takes the Nth character from its character table to represent that chunk.

But this means that it cannot simply encode one byte or two bytes on their own. One byte contains 8 bits, which is one 6-bit chunk plus 2 remaining bits. Two bytes contain 16 bits, which is two 6-bit chunks plus 4 remaining bits.

To solve this problem, Base64 always encodes groups of 3 consecutive bytes (24 bits, or exactly four 6-bit chunks). If the input length is not a multiple of three, it fills the last incomplete chunk with extra zero bits.
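The 3-bytes-to-4-characters step can be sketched by hand. The class name `SixBitDemo` and the helper `encodeTriple` below are made up for illustration; the alphabet string is the standard Base64 table, and the result is compared with what `java.util.Base64` produces for the same three bytes.

```java
import java.util.Base64;

public class SixBitDemo {
    // The standard Base64 character table: index N gives the character for value N.
    static final String ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    /** Hypothetical helper: encodes exactly three bytes into four characters. */
    public static String encodeTriple(byte b0, byte b1, byte b2) {
        // Pack the three bytes into one 24-bit integer.
        int bits = ((b0 & 0xFF) << 16) | ((b1 & 0xFF) << 8) | (b2 & 0xFF);
        StringBuilder sb = new StringBuilder(4);
        // Cut the 24 bits into four 6-bit chunks, most significant first.
        for (int shift = 18; shift >= 0; shift -= 6) {
            sb.append(ALPHABET.charAt((bits >> shift) & 0x3F));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] triple = { 0b01010101, (byte) 0b11110000, (byte) 0b10101010 };
        String manual = encodeTriple(triple[0], triple[1], triple[2]);
        String library = Base64.getEncoder().encodeToString(triple);
        System.out.println(manual + " " + library); // prints "VfCq VfCq"
    }
}
```

With only one or two bytes there are not enough bits to fill the last chunk, which is where the zero filling comes in.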

Here is a small program that demonstrates the problem:

    package testing;

    import java.util.Base64;

    public class SimpleTest {

        public static void main(String[] args) {

            // An array containing six bytes to encode and decode.
            byte[] fullArray = { 0b01010101, (byte) 0b11110000, (byte) 0b10101010,
                                 0b00001111, (byte) 0b11001100, 0b00110011 };

            // The same array broken into three chunks of two bytes.
            byte[][] threeTwoByteArrays = {
                { 0b01010101, (byte) 0b11110000 },
                { (byte) 0b10101010, 0b00001111 },
                { (byte) 0b11001100, 0b00110011 }
            };

            Base64.Encoder encoder = Base64.getEncoder().withoutPadding();

            // Encode the full array
            String encodedFullArray = encoder.encodeToString(fullArray);

            // Encode the three chunks consecutively
            StringBuilder encodedStringBuilder = new StringBuilder();
            for (byte[] twoByteArray : threeTwoByteArrays) {
                encodedStringBuilder.append(encoder.encodeToString(twoByteArray));
            }
            String encodedInChunks = encodedStringBuilder.toString();

            System.out.println("Encoded full array: " + encodedFullArray);
            System.out.println("Encoded in chunks of two bytes: " + encodedInChunks);

            // Now decode the two resulting strings
            Base64.Decoder decoder = Base64.getDecoder();

            byte[] decodedFromFull = decoder.decode(encodedFullArray);
            System.out.println("Byte array decoded from full: " + byteArrayBinaryString(decodedFromFull));

            byte[] decodedFromChunked = decoder.decode(encodedInChunks);
            System.out.println("Byte array decoded from chunks: " + byteArrayBinaryString(decodedFromChunked));
        }

        /**
         * Convert a byte array to a string representation in binary
         */
        public static String byteArrayBinaryString(byte[] bytes) {
            StringBuilder sb = new StringBuilder();
            sb.append('[');
            for (byte b : bytes) {
                sb.append(Integer.toBinaryString(Byte.toUnsignedInt(b))).append(',');
            }
            if (sb.length() > 1) {
                sb.setCharAt(sb.length() - 1, ']');
            } else {
                sb.append(']');
            }
            return sb.toString();
        }
    }

So imagine my 6-byte array is your image file. And imagine that your buffer does not read 1024 bytes, but 2 bytes each time. This will be the encoding output:

  Encoded full array: VfCqD8wz
 Encoded in chunks of two bytes: VfAqg8zDM

As you can see, the encoding of the full array gave us 8 characters. Each group of three bytes is converted into four pieces of 6 bits, which, in turn, are converted to four characters.

But encoding the three two-byte arrays gave you a string of 9 characters. This is a completely different string! Each group of two bytes was extended to three 6-bit chunks by filling with zeros. And since you asked for no padding, it generates only 3 characters per group, without the extra = that normally marks input whose length is not divisible by 3.

The output of the part of the program that decodes the correct 8-character string is fine:

  Byte array decoded from full: [1010101,11110000,10101010,1111,11001100,110011]

But the result of trying to decode a 9-character invalid encoded string is:

  Exception in thread "main" java.lang.IllegalArgumentException: Last unit does not have enough valid bits
     at java.util.Base64$Decoder.decode0(Base64.java:734)
     at java.util.Base64$Decoder.decode(Base64.java:526)
     at java.util.Base64$Decoder.decode(Base64.java:549)
     at testing.SimpleTest.main(SimpleTest.java:34)

Not good! A padded Base64 string always has a multiple of 4 characters, and even an unpadded one can never have a length that leaves a remainder of 1 when divided by 4. We have 9.

Since you chose a buffer size of 1024, which is not a multiple of 3, this problem will occur. You must encode a multiple of three bytes each time to produce a correct string. So you actually need a buffer of size 3072 or something like that.

But because of the first problem, be very careful about what you pass to the encoder. It can always happen that you read fewer than 3072 bytes, and then, if that number is not divisible by three, the same problem will arise.
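One way to handle both problems together is to keep reading until the buffer is completely full (or the stream ends) before encoding, so that only the very last chunk can have a length that is not a multiple of 3. The following is a minimal sketch under that assumption, not the asker's final code; the class `ChunkedBase64` and the helpers `fill` and `encode` are hypothetical names, and it uses the default padded encoder rather than `withoutPadding()`.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.Base64;

public class ChunkedBase64 {
    // Any multiple of 3 works; 3072 is the size suggested above.
    static final int CHUNK = 3072;

    /** Reads until buf is full or the stream ends; returns the bytes stored. */
    public static int fill(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int r = in.read(buf, total, buf.length - total);
            if (r < 0) {
                break; // end of stream
            }
            total += r;
        }
        return total;
    }

    /** Encodes the whole stream chunk by chunk into one Base64 string. */
    public static String encode(InputStream in) throws IOException {
        Base64.Encoder encoder = Base64.getEncoder();
        StringBuilder sb = new StringBuilder();
        byte[] buf = new byte[CHUNK];
        int n;
        while ((n = fill(in, buf)) > 0) {
            // Every chunk except the last holds exactly CHUNK bytes,
            // so padding can only appear at the very end of the string.
            sb.append(encoder.encodeToString(Arrays.copyOf(buf, n)));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a binary file: 10,000 arbitrary bytes.
        byte[] data = new byte[10_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (i * 31);
        }
        String chunked = encode(new ByteArrayInputStream(data));
        String whole = Base64.getEncoder().encodeToString(data);
        System.out.println(chunked.equals(whole)); // prints "true"
        System.out.println(Arrays.equals(data, Base64.getDecoder().decode(chunked))); // prints "true"
    }
}
```

For a real file, a `FileInputStream` would take the place of the `ByteArrayInputStream`. The standard library also offers `Base64.Encoder.wrap(OutputStream)` and `Base64.Decoder.wrap(InputStream)`, which do the chunk bookkeeping internally and avoid building the whole string in memory.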

+4

Take a look:

    while ((r = imageInFile.read(buff)) > 0) {
        String imageData = encodeImage(buff);

read returns -1 at the end of the file or the actual number of bytes read.

Thus, the last buff may not be completely filled and may even contain garbage from a previous read. Therefore you need to use r.
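The stale-bytes effect is easy to reproduce with a small in-memory stream. This is only an illustration: the class `StaleBufferDemo` and the helper `describeReads` are invented for the demo, and a `ByteArrayInputStream` with an 8-byte buffer stands in for the file and the 1024-byte buffer.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StaleBufferDemo {

    /** Hypothetical helper: reads the stream and describes each read. */
    public static List<String> describeReads(InputStream in, int bufSize) throws IOException {
        List<String> lines = new ArrayList<>();
        byte[] buff = new byte[bufSize];
        int r;
        while ((r = in.read(buff)) > 0) {
            // The whole buffer, including bytes left over from earlier reads.
            String whole = new String(buff, StandardCharsets.US_ASCII);
            // Only the r bytes that this read actually delivered.
            String valid = new String(Arrays.copyOf(buff, r), StandardCharsets.US_ASCII);
            lines.add("r=" + r + " whole=" + whole + " valid=" + valid);
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "ABCDEFGHIJ".getBytes(StandardCharsets.US_ASCII);
        for (String line : describeReads(new ByteArrayInputStream(data), 8)) {
            System.out.println(line);
        }
        // prints:
        // r=8 whole=ABCDEFGH valid=ABCDEFGH
        // r=2 whole=IJCDEFGH valid=IJ
    }
}
```

The second read delivers only `IJ`, but the buffer still holds `CDEFGH` from the first read, so encoding the whole buffer would append six bytes that are not in that position in the file.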

Since this is an assignment, the rest is up to you.

By the way:

  byte[] array = new byte[1024] 

is the more common form in Java. The syntax:

  byte array[] = ... 

exists for compatibility with C / C++.

0