Why is the result of the GZip algorithm not the same in Android and .Net?

Why does GZip compression of the same string produce different output on Android and in .NET?

My Android code:

    public static String compressString(String str) {
        String str1 = null;
        ByteArrayOutputStream bos = null;
        try {
            bos = new ByteArrayOutputStream();
            BufferedOutputStream dest = null;
            byte b[] = str.getBytes();
            GZIPOutputStream gz = new GZIPOutputStream(bos, b.length);
            gz.write(b, 0, b.length);
            bos.close();
            gz.close();
        } catch (Exception e) {
            System.out.println(e);
            e.printStackTrace();
        }
        byte b1[] = bos.toByteArray();
        return Base64.encode(b1);
    }

My code in the .NET web service:

    public static string compressString(string text)
    {
        byte[] buffer = Encoding.UTF8.GetBytes(text);
        MemoryStream ms = new MemoryStream();
        using (GZipStream zip = new GZipStream(ms, CompressionMode.Compress, true))
        {
            zip.Write(buffer, 0, buffer.Length);
        }
        ms.Position = 0;
        MemoryStream outStream = new MemoryStream();
        byte[] compressed = new byte[ms.Length];
        ms.Read(compressed, 0, compressed.Length);
        byte[] gzBuffer = new byte[compressed.Length + 4];
        System.Buffer.BlockCopy(compressed, 0, gzBuffer, 4, compressed.Length);
        System.Buffer.BlockCopy(BitConverter.GetBytes(buffer.Length), 0, gzBuffer, 0, 4);
        return Convert.ToBase64String(gzBuffer);
    }

On Android:

 compressString("hello"); -> "H4sIAAAAAAAAAMtIzcnJBwCGphA2BQAAAA==" 

In .Net:

 compressString("hello"); -> "BQAAAB+LCAAAAAAABADtvQdgHEmWJSYvbcp7f0r1StfgdKEIgGATJNiQQBDswYjN5pLsHWlHIymrKoHKZVZlXWYWQMztnbz33nvvvffee++997o7nU4n99//P1xmZAFs9s5K2smeIYCqyB8/fnwfPyLmeVlW/w+GphA2BQAAAA==" 

Interestingly, when I use the Android Decompress method on the result of the .NET compressString method, it correctly returns the original string, but I get an error when I decompress the result of the Android compressString method.

Android Decompress method:

    public static String Decompress(String zipText) throws IOException {
        int size = 0;
        byte[] gzipBuff = Base64.decode(zipText);
        ByteArrayInputStream memstream = new ByteArrayInputStream(gzipBuff, 4, gzipBuff.length - 4);
        GZIPInputStream gzin = new GZIPInputStream(memstream);
        final int buffSize = 8192;
        byte[] tempBuffer = new byte[buffSize];
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        while ((size = gzin.read(tempBuffer, 0, buffSize)) != -1) {
            baos.write(tempBuffer, 0, size);
        }
        byte[] buffer = baos.toByteArray();
        baos.close();
        return new String(buffer, "UTF-8");
    }

I think there is an error in the Android compressString method. Can anybody help me?

+4
3 answers

Based on this answer, I ended up with four methods: compress and decompress for both Android and .NET. They are compatible with each other, with one exception.

+2

In the Android version, you should close gz first and bos afterwards, so that the GZIP trailer has been written into bos before you read its contents.

Additionally, this line in compressString can cause problems:

 byte b[] = str.getBytes(); 

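This converts characters to bytes using the platform's default charset. On Android the default is UTF-8, but on other Java runtimes it depends on the system configuration, so it is safer to name the encoding explicitly. The .NET version, on the other hand, always uses UTF-8. On Android, try instead: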

 byte b[] = str.getBytes("UTF-8"); 
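To illustrate why the explicit charset matters, here is a minimal sketch in plain Java. The class and method names (CharsetDemo, utf8Bytes) are made up for this example, and it uses StandardCharsets from the JDK instead of the "UTF-8" string, which avoids the checked UnsupportedEncodingException.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original question or answers.
final class CharsetDemo {
    // Encodes text as UTF-8 regardless of the platform default charset.
    static byte[] utf8Bytes(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "h\u00e9llo"; // "hello" with an accented e, one non-ASCII character
        System.out.println("default charset:   " + Charset.defaultCharset());
        // The same string yields a different byte count in a different encoding.
        System.out.println("UTF-8 length:      " + utf8Bytes(text).length);
        System.out.println("ISO-8859-1 length: " + text.getBytes(StandardCharsets.ISO_8859_1).length);
    }
}
```

For pure ASCII input such as "hello" both encodings give identical bytes, so the charset cannot explain the difference shown in the question, but it would matter for other input.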

EDIT: upon further review of the code, I suggest you rewrite it as follows:

    byte b[] = str.getBytes("UTF-8");
    GZIPOutputStream gz = new GZIPOutputStream(bos);
    gz.write(b, 0, b.length);
    gz.finish();
    gz.close();
    bos.close();

Changes: use UTF-8 for character encoding; use the default internal buffer size for GZIPOutputStream; call gz.close() before calling bos.close() (the latter is probably not even needed); and call gz.finish() before calling gz.close(). Since gz.close() finishes the stream itself, the explicit finish() call is only a safeguard.
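The same sequence can be sketched with try-with-resources, which closes gz before the bytes are read. This is only an illustration in plain Java: the class name GzipRoundTrip and the helpers gzip and gunzip are invented for the example, and IOException is wrapped in UncheckedIOException purely to keep the calls short.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, not part of the original question or answers.
final class GzipRoundTrip {
    // Compresses text to a plain GZIP byte array (no extra length prefix).
    static byte[] gzip(String text) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // gz is closed here, so the GZIP trailer is already in bos.
        return bos.toByteArray();
    }

    // Decompresses a plain GZIP byte array back to a UTF-8 string.
    static String gunzip(byte[] data) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] packed = gzip("hello");
        System.out.println(gunzip(packed));
    }
}
```

Note that this produces standard GZIP data only. It does not add the four-byte prefix that the Decompress method in the question skips.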

EDIT 2:

Well, I had to dig into what was going on. The GZIPOutputStream class is, in my opinion, a stupid design. It gives you no way to specify the compression level; its internal Deflater is created with Deflater.DEFAULT_COMPRESSION. To choose a different level, you need to subclass it and override the default. The easiest way is to do this:

    GZIPOutputStream gz = new GZIPOutputStream(bos) {
        {
            def.setLevel(Deflater.BEST_COMPRESSION);
        }
    };

This sets the internal Deflater that GZIPOutputStream uses to best compression. (By the way, if you are not familiar with it, the syntax I use here is called an instance initialization block.)

+2

The main difference is that your .NET code puts the length of the uncompressed data (buffer.Length) in the first four bytes of the binary data. Your Java code does not; it omits the length field.
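The prefix is visible in the outputs quoted in the question. The sketch below decodes only the first eight Base64 characters of each output ("BQAAAB+L" for .NET, "H4sIAAAA" for Android) and checks where the GZIP magic bytes 0x1f 0x8b appear. The class and method names are invented for the example, java.util.Base64 stands in for the Base64 class used in the question, and the length is assumed to be little-endian, which is what BitConverter produces on typical platforms.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Base64;

// Hypothetical inspection helper, not part of the original question or answers.
final class PrefixInspector {
    // Reads the first four bytes as a little-endian 32-bit integer.
    static int littleEndianInt(byte[] data) {
        return ByteBuffer.wrap(data, 0, 4).order(ByteOrder.LITTLE_ENDIAN).getInt();
    }

    // Returns true if the GZIP magic number 0x1f 0x8b starts at the given offset.
    static boolean hasGzipMagicAt(byte[] data, int offset) {
        return (data[offset] & 0xff) == 0x1f && (data[offset + 1] & 0xff) == 0x8b;
    }

    public static void main(String[] args) {
        byte[] dotNet = Base64.getDecoder().decode("BQAAAB+L");  // start of the .NET output
        byte[] android = Base64.getDecoder().decode("H4sIAAAA"); // start of the Android output

        System.out.println(".NET prefix value:     " + littleEndianInt(dotNet));
        System.out.println(".NET magic at 4:       " + hasGzipMagicAt(dotNet, 4));
        System.out.println("Android magic at 0:    " + hasGzipMagicAt(android, 0));
    }
}
```

The .NET data carries the value 5 (the byte length of "hello") ahead of the GZIP header, while the Android data starts with the GZIP header directly.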

When unpacking, however, you expect the length in the first four bytes and start the GZIP decompression at position 4 (skipping the first four bytes).
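One way to make the Android side match is to write the same four-byte length before the GZIP data. The following is a minimal sketch under some assumptions: the class name PrefixedGzip and its methods are invented for the example, java.util.Base64 replaces the Base64 class from the question, and the prefix is written little-endian to mirror BitConverter.GetBytes on typical platforms.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, not part of the original question or answers.
final class PrefixedGzip {
    // Writes [4-byte little-endian uncompressed length][GZIP data], Base64-encoded.
    static String compress(String text) {
        byte[] raw = text.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        byte[] prefix = ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN).putInt(raw.length).array();
        bos.write(prefix, 0, prefix.length);
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Base64.getEncoder().encodeToString(bos.toByteArray());
    }

    // Skips the four-byte prefix and inflates the rest, like Decompress in the question.
    static String decompress(String base64) {
        byte[] data = Base64.getDecoder().decode(base64);
        try (GZIPInputStream in = new GZIPInputStream(
                new ByteArrayInputStream(data, 4, data.length - 4))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String packed = compress("hello");
        System.out.println(packed);
        System.out.println(decompress(packed));
    }
}
```

The Base64 text will still not be identical to the .NET output, because the two deflate implementations may encode the same data differently, but each side should be able to decompress what the other produces. The alternative design is to drop the prefix on both sides and exchange plain GZIP data.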

0