Problem using base64 encoder and InputStreamReader

I have several CLOB columns in the database into which I need to insert Base64-encoded binary files. These files can be large, so I need to stream them; I cannot read each one into memory all at once.

I am using org.apache.commons.codec.binary.Base64InputStream to do the encoding, and I have run into a problem. My code is essentially this:

    FileInputStream fis = new FileInputStream(file);
    Base64InputStream b64is = new Base64InputStream(fis, true, -1, null);
    BufferedReader reader = new BufferedReader(new InputStreamReader(b64is));
    preparedStatement.setCharacterStream(1, reader);

When I run the above code, I get the following exception while executing the update: java.io.IOException: Underlying input stream returned zero bytes. It is thrown from deep inside the InputStreamReader code.

Why is this not working? It seems to me that the reader would read from the Base64 stream, which in turn would read from the file stream, and everything should be happy.

+7
java encoding base64 jdbc
3 answers

This seems to be a bug in Base64InputStream. You are calling it correctly.

You should report this to the Apache Commons Codec project.

A simple test case:

    import java.io.*;
    import org.apache.commons.codec.binary.Base64InputStream;

    class tmp {
        public static void main(String[] args) throws IOException {
            FileInputStream fis = new FileInputStream(args[0]);
            Base64InputStream b64is = new Base64InputStream(fis, true, -1, null);
            while (true) {
                byte[] c = new byte[1024];
                int n = b64is.read(c);
                if (n < 0) break;
                if (n == 0) throw new IOException("returned 0!");
                for (int i = 0; i < n; i++) {
                    System.out.print((char) c[i]);
                }
            }
        }
    }

The read(byte[]) call of an InputStream is not allowed to return 0 when the buffer has a non-zero length. Here it returns 0 for any file whose length is a multiple of 3 bytes.
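The failure can be reproduced without Commons Codec. The following is a minimal sketch, not the library's code: ZeroFirstInputStream is a hypothetical stand-in for the buggy stream that reports 0 bytes on its first bulk read, and readerFails is an illustrative helper that drains the stream through an InputStreamReader.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class ZeroReadDemo {

    /** Hypothetical stand-in for the buggy stream: the first bulk read reports 0 bytes. */
    static class ZeroFirstInputStream extends FilterInputStream {
        private boolean first = true;

        ZeroFirstInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            if (first && len > 0) {
                first = false;
                return 0; // breaks the InputStream contract for len > 0
            }
            return super.read(b, off, len);
        }
    }

    /** Returns true if draining the stream through an InputStreamReader throws an IOException. */
    static boolean readerFails(InputStream in) {
        try (Reader reader = new InputStreamReader(in, StandardCharsets.US_ASCII)) {
            char[] buf = new char[16];
            while (reader.read(buf) >= 0) {
                // keep draining until end of stream
            }
            return false;
        } catch (IOException e) {
            System.out.println("Reader failed: " + e.getMessage());
            return true;
        }
    }

    public static void main(String[] args) {
        byte[] data = "QUJD".getBytes(StandardCharsets.US_ASCII);
        // Misbehaving stream wrapped around the data
        System.out.println(readerFails(new ZeroFirstInputStream(new ByteArrayInputStream(data))));
        // Well-behaved stream over the same data
        System.out.println(readerFails(new ByteArrayInputStream(data)));
    }
}
```

The same data reads cleanly through a plain ByteArrayInputStream, which suggests the reader itself is fine and the zero-length read is what it rejects.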

+14

Interestingly, I did some tests here, and it does indeed throw this exception when you read the Base64InputStream through an InputStreamReader, regardless of the source of the stream, but it works flawlessly when you read it as a binary stream. As Trashgod mentioned, Base64 encoding is framed. The InputStreamReader should really have called flush() on the Base64InputStream once more to see whether it would return any more data.

At first I saw no other way to fix this than to implement your own Base64InputStreamReader or Base64Reader. However, this is actually a bug; see Keith's answer.
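For illustration, a hand-rolled Base64Reader could look roughly like the sketch below. It assumes Java 8 or later and uses java.util.Base64 rather than Commons Codec; the class name, the chunk size and the encodeAll helper are illustrative choices, not anything from the library. Raw bytes are read in chunks whose size is a multiple of 3, so '=' padding can only appear at the very end of the output.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

class Base64Reader extends Reader {
    // A multiple of 3, so only the final chunk can produce '=' padding.
    private static final int CHUNK = 3 * 1024;

    private final InputStream in;
    private final byte[] raw = new byte[CHUNK];
    private char[] encoded = new char[0];
    private int pos = 0;
    private boolean eof = false;

    Base64Reader(InputStream in) {
        this.in = in;
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        if (len == 0) return 0;
        if (pos == encoded.length && !fill()) return -1;
        int n = Math.min(len, encoded.length - pos);
        System.arraycopy(encoded, pos, cbuf, off, n);
        pos += n;
        return n; // always > 0 here, never a zero-length read
    }

    /** Reads up to one full chunk of raw bytes and encodes it; returns false at end of input. */
    private boolean fill() throws IOException {
        if (eof) return false;
        int filled = 0;
        while (filled < CHUNK) {
            int n = in.read(raw, filled, CHUNK - filled);
            if (n < 0) {
                eof = true;
                break;
            }
            filled += n;
        }
        if (filled == 0) return false;
        byte[] out = Base64.getEncoder().encode(Arrays.copyOf(raw, filled));
        encoded = new String(out, StandardCharsets.ISO_8859_1).toCharArray();
        pos = 0;
        return true;
    }

    @Override
    public void close() throws IOException {
        in.close();
    }

    /** Illustrative helper: encodes a byte array by draining a Base64Reader in small reads. */
    static String encodeAll(byte[] data) {
        try (Reader r = new Base64Reader(new ByteArrayInputStream(data))) {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[100];
            int n;
            while ((n = r.read(buf)) >= 0) sb.append(buf, 0, n);
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encodeAll("ABC".getBytes(StandardCharsets.US_ASCII))); // QUJD
    }
}
```

With the names from the question, such a reader would be passed directly as preparedStatement.setCharacterStream(1, new Base64Reader(fis)), with no InputStreamReader in between.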

As a workaround, you can store the data in a BLOB instead of a CLOB and use PreparedStatement#setBinaryStream() instead. It does not matter whether it is stored as binary data or as character data; you do not want such large Base64 data to be indexable or searchable anyway.


Update: since that is not an option, and since it may take some time before the Apache Commons Codec developers fix the Base64InputStream bug, which I reported as CODEC-101, you could use another third-party Base64 API. I found one here (public domain, so you can do whatever you want with it, even place it in your own package); I tested it here and it works fine.

 InputStream base64 = new Base64.InputStream(input, Base64.ENCODE); 

Update 2: the Commons Codec developer fixed it pretty quickly.

    Index: src/java/org/apache/commons/codec/binary/Base64InputStream.java
    ===================================================================
    --- src/java/org/apache/commons/codec/binary/Base64InputStream.java (revision 950817)
    +++ src/java/org/apache/commons/codec/binary/Base64InputStream.java (working copy)
    @@ -145,21 +145,41 @@
             } else if (len == 0) {
                 return 0;
             } else {
    -            if (!base64.hasData()) {
    -                byte[] buf = new byte[doEncode ? 4096 : 8192];
    -                int c = in.read(buf);
    -                // A little optimization to avoid System.arraycopy()
    -                // when possible.
    -                if (c > 0 && b.length == len) {
    -                    base64.setInitialBuffer(b, offset, len);
    +            int readLen = 0;
    +            /*
    +             Rationale for while-loop on (readLen == 0):
    +             -----
    +             Base64.readResults() usually returns > 0 or EOF (-1). In the
    +             rare case where it returns 0, we just keep trying.
    +
    +             This is essentially an undocumented contract for InputStream
    +             implementors that want their code to work properly with
    +             java.io.InputStreamReader, since the latter hates it when
    +             InputStream.read(byte[]) returns a zero. Unfortunately our
    +             readResults() call must return 0 if a large amount of the data
    +             being decoded was non-base64, so this while-loop enables proper
    +             interop with InputStreamReader for that scenario.
    +             -----
    +             This is a fix for CODEC-101
    +            */
    +            while (readLen == 0) {
    +                if (!base64.hasData()) {
    +                    byte[] buf = new byte[doEncode ? 4096 : 8192];
    +                    int c = in.read(buf);
    +                    // A little optimization to avoid System.arraycopy()
    +                    // when possible.
    +                    if (c > 0 && b.length == len) {
    +                        base64.setInitialBuffer(b, offset, len);
    +                    }
    +                    if (doEncode) {
    +                        base64.encode(buf, 0, c);
    +                    } else {
    +                        base64.decode(buf, 0, c);
    +                    }
                     }
    -                if (doEncode) {
    -                    base64.encode(buf, 0, c);
    -                } else {
    -                    base64.decode(buf, 0, c);
    -                }
    +                readLen = base64.readResults(b, offset, len);
                 }
    -            return base64.readResults(b, offset, len);
    +            return readLen;
             }
         }

I tried it here and it works great.
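If upgrading to a patched Commons Codec is not possible, the same "keep trying while the read returns 0" idea can be applied from the outside. The sketch below is an assumption-laden illustration, not part of the patch: NonZeroReadInputStream is a hypothetical wrapper, FlakyInputStream is a hypothetical stream that reports 0 bytes on every other bulk read, and readAsText is an illustrative helper.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical wrapper that hides zero-length reads from callers such as InputStreamReader. */
class NonZeroReadInputStream extends FilterInputStream {

    NonZeroReadInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        if (len == 0) return 0; // a zero-length request may legitimately return 0
        int n;
        do {
            n = super.read(b, off, len); // retry until data arrives or EOF (-1)
        } while (n == 0);
        return n;
    }

    /** Hypothetical misbehaving stream: every other bulk read reports 0 bytes. */
    static class FlakyInputStream extends FilterInputStream {
        private boolean zeroNext = true;

        FlakyInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            if (len > 0 && zeroNext) {
                zeroNext = false;
                return 0;
            }
            zeroNext = true;
            return super.read(b, off, len);
        }
    }

    /** Illustrative helper: drains the stream through an InputStreamReader into a String. */
    static String readAsText(InputStream in) {
        try (Reader reader = new InputStreamReader(in, StandardCharsets.US_ASCII)) {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[64];
            int n;
            while ((n = reader.read(buf)) >= 0) sb.append(buf, 0, n);
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "QUJDRA==".getBytes(StandardCharsets.US_ASCII);
        InputStream flaky = new FlakyInputStream(new ByteArrayInputStream(data));
        System.out.println(readAsText(new NonZeroReadInputStream(flaky)));
    }
}
```

One caveat with this approach: if the wrapped stream kept returning 0 forever, the loop would spin indefinitely, so it only suits streams that return 0 occasionally, as in the bug described here.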

+4

"For maximum efficiency, consider wrapping the InputStreamReader in a BufferedReader . For example:"

 BufferedReader in = new BufferedReader(new InputStreamReader(b64is)); 

Addendum: since Base64 output is padded to a multiple of 4 characters, make sure the source is not truncated. A flush() may be required.

0
