How to open a zip file generated by a Java program using UTF-8 encoding

Our product has an export function that uses ZipOutputStream to zip a directory. However, when we zip a directory containing file names with Chinese or Japanese characters, the export does not work properly: for some reason, the files inside the zip end up with different names from the originals. Here is an example of our code:

    ZipOutputStream out = new ZipOutputStream(new FileOutputStream(zipFileName));
    out.setEncoding("UTF-8");
    // program to add directory to zip
    // program add/create file to zip
    out.close();

My import routine, also written in Java, imports the zipped file correctly even if it contains Chinese or Japanese characters in the file or directory names.

    ZipFile zipFile = new ZipFile(zipPath, "UTF-8");
    Enumeration e = zipFile.getEntries();
    while (e.hasMoreElements()) {
        ZipEntry entry = (ZipEntry) e.nextElement();
        String name = entry.getName();
        ....

Do zip programs have problems unpacking entries with UTF-8 encoded names, or is something special needed to create a zip file that existing software can open correctly when UTF-8 encoding is used?


I wrote an example program:

    package ZipFile;

    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;

    import org.apache.tools.zip.ZipEntry;
    import org.apache.tools.zip.ZipOutputStream;

    public class ZipFolder {

        public static void main(String[] a) throws Exception {
            String srcFolder = "D:/9.4_work/openscript_repo/中文124.All/中文";
            String destZipFile = "D:/Eclipse_Projects/OpenScriptDebuggingProject/src/ZipFile/demo.zip";
            zipFolder(srcFolder, destZipFile);
        }

        static public void zipFolder(String srcFolder, String destZipFile) throws Exception {
            ZipOutputStream zip = null;
            FileOutputStream fileWriter = null;
            fileWriter = new FileOutputStream(destZipFile);
            zip = new ZipOutputStream(fileWriter);
            zip.setEncoding("UTF-8");
            // using GBK encoding, the chinese name can be correctly displayed when unzip
            // zip.setEncoding("GBK");
            addFolderToZip("", srcFolder, zip);
            zip.flush();
            zip.close();
        }

        static private void addFileToZip(String path, String srcFile, ZipOutputStream zip) throws Exception {
            File folder = new File(srcFile);
            if (folder.isDirectory()) {
                addFolderToZip(path, srcFile, zip);
            } else {
                byte[] buf = new byte[1024];
                int len;
                FileInputStream in = new FileInputStream(srcFile);
                try {
                    zip.putNextEntry(new ZipEntry(path + "/" + folder.getName()));
                    while ((len = in.read(buf)) > 0) {
                        zip.write(buf, 0, len);
                    }
                    zip.closeEntry();
                } finally {
                    // close the source file so the handle is not leaked
                    in.close();
                }
            }
        }

        static private void addFolderToZip(String path, String srcFolder, ZipOutputStream zip) throws Exception {
            File folder = new File(srcFolder);
            for (String fileName : folder.list()) {
                if (path.equals("")) {
                    addFileToZip(folder.getName(), srcFolder + "/" + fileName, zip);
                } else {
                    addFileToZip(path + "/" + folder.getName(), srcFolder + "/" + fileName, zip);
                }
            }
        }
    }

2 answers

The top answer to the question linked below may answer yours. Unfortunately, it seems that the zip format does not actually let you create a zip file whose file names display correctly on every computer:

https://superuser.com/questions/60379/linux-zip-tgz-filenames-encoding-problem

I expect it works when you set the encoding to GBK because that is your system's default encoding, and 7-Zip therefore uses it for every zip file it opens.
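The effect described above can be reproduced without any zip tool. The following is only a minimal sketch: it assumes the unzip program decodes the stored name bytes with its system default charset (GBK here), and the class and method names are hypothetical, chosen for illustration. It also assumes the JVM ships the GBK charset, which full JDK installations normally do.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ZipNameMojibakeDemo {

    /**
     * Encodes the name as UTF-8 (what setEncoding("UTF-8") stores in the zip)
     * and decodes those bytes with the charset the reading tool assumes.
     */
    static String decodeAs(String name, Charset readerCharset) {
        byte[] storedBytes = name.getBytes(StandardCharsets.UTF_8);
        return new String(storedBytes, readerCharset);
    }

    public static void main(String[] args) {
        // "\u4e2d\u6587" is the Chinese word used in the question's folder name
        String name = "\u4e2d\u6587.txt";

        // Assumption: the unzip tool falls back to the system default, GBK
        Charset gbk = Charset.forName("GBK");

        // A reader that assumes GBK turns the UTF-8 bytes into different characters
        System.out.println("read as GBK:   " + decodeAs(name, gbk));

        // A reader that assumes UTF-8 recovers the original name
        System.out.println("read as UTF-8: " + decodeAs(name, StandardCharsets.UTF_8));
    }
}
```

The bytes written to the archive are the same in both cases; only the charset assumed by the reader changes, which is why the same zip can look fine in one program and garbled in another.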

That answer also suggests that the rar and 7z formats have better support for this.

I also found a blog post about UTF-8 in zip files with Java. It describes a newer version of the ZIP specification that current versions of Java cannot produce, but Java 7 will. I don't know whether the Apache classes use it either.

http://blogs.oracle.com/xuemingshen/entry/non_utf_8_encoding_in
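For readers on Java 7 or later, here is a minimal sketch of what this looks like with the standard library. It relies on the `java.util.zip` constructors that accept a `Charset`; the class and method names are hypothetical, and the flag check assumes the archive begins with a local file header, as archives written by `ZipOutputStream` do. Whether a given unzip tool honors the flag is outside Java's control, so treat this as an illustration rather than a guaranteed fix.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class Utf8ZipRoundTrip {

    /** Writes a one-entry zip in memory, encoding the entry name as UTF-8. */
    static byte[] writeZip(String entryName, String content) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes, StandardCharsets.UTF_8)) {
            zip.putNextEntry(new ZipEntry(entryName));
            zip.write(content.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads the name of the first entry, decoding it as UTF-8. */
    static String firstEntryName(byte[] zipBytes) {
        try (ZipInputStream zip =
                new ZipInputStream(new ByteArrayInputStream(zipBytes), StandardCharsets.UTF_8)) {
            ZipEntry entry = zip.getNextEntry();
            return entry == null ? null : entry.getName();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Checks bit 11 of the general-purpose flag in the first local file header
     * (2 bytes, little-endian, at offset 6). That bit marks the name as UTF-8.
     */
    static boolean hasUtf8NameFlag(byte[] zipBytes) {
        int flags = (zipBytes[6] & 0xFF) | ((zipBytes[7] & 0xFF) << 8);
        return (flags & 0x0800) != 0;
    }

    public static void main(String[] args) {
        String name = "\u4e2d\u6587/demo.txt"; // Chinese folder name, as in the question
        byte[] zipBytes = writeZip(name, "hello");
        System.out.println("name round-trips: " + name.equals(firstEntryName(zipBytes)));
        System.out.println("UTF-8 flag set:   " + hasUtf8NameFlag(zipBytes));
    }
}
```

Tools that understand the UTF-8 flag can decode such names regardless of the system default encoding; older tools that ignore it will still fall back to their own charset.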


The following utility class lets you compress and decompress strings using the GZIP compression algorithm. This can be useful if, for example, you want to store long strings in a database.

    import java.io.ByteArrayOutputStream;
    import java.io.ByteArrayInputStream;
    import java.util.zip.GZIPOutputStream;
    import java.util.zip.GZIPInputStream;

    public class GzipStringUtil {

        public static byte[] compressString(String uncompressedString)
                throws IllegalArgumentException, IllegalStateException {
            if (uncompressedString == null) {
                throw new IllegalArgumentException("The uncompressed string specified was null.");
            }
            try {
                byte[] utfEncodedBytes = uncompressedString.getBytes("UTF-8");
                ByteArrayOutputStream baos = new ByteArrayOutputStream();
                GZIPOutputStream gzipOutputStream = new GZIPOutputStream(baos);
                gzipOutputStream.write(utfEncodedBytes);
                gzipOutputStream.finish();
                gzipOutputStream.close();
                return baos.toByteArray();
            } catch (Exception e) {
                throw new IllegalStateException("GZIP compression failed: " + e, e);
            }
        }

        public static String uncompressString(byte[] compressedString)
                throws IllegalArgumentException, IllegalStateException {
            if (compressedString == null) {
                throw new IllegalArgumentException("The compressed string specified was null.");
            }
            try {
                ByteArrayInputStream bais = new ByteArrayInputStream(compressedString);
                GZIPInputStream gzipInputStream = new GZIPInputStream(bais);
                ByteArrayOutputStream baos = new ByteArrayOutputStream();
                for (int value = 0; value != -1;) {
                    value = gzipInputStream.read();
                    if (value != -1) {
                        baos.write(value);
                    }
                }
                gzipInputStream.close();
                baos.close();
                return new String(baos.toByteArray(), "UTF-8");
            } catch (Exception e) {
                throw new IllegalStateException("GZIP uncompression failed: " + e, e);
            }
        }
    }

Here is a JUnit TestCase that shows how to use the class above:

    import java.util.Arrays;

    import junit.framework.TestCase;

    public class GzipStringUtilTest extends TestCase {

        public void testGzipStringUtil() {
            String input = "This is a test. This is a test. This is a test. This is a test. This is a test.";
            System.out.println("Input: [" + input + "]");

            byte[] compressed = GzipStringUtil.compressString(input);
            System.out.println("Compressed: " + Arrays.toString(compressed));
            System.out.println("-> Compressed input string of length " + input.length()
                    + " to " + compressed.length + " bytes");

            String uncompressed = GzipStringUtil.uncompressString(compressed);
            System.out.println("Uncompressed: [" + uncompressed + "]");

            assertEquals("The uncompressed string [" + uncompressed
                    + "] unexpectedly does not match the input string [" + input + "]",
                    input, uncompressed);
            System.out.println("The input was compressed and uncompressed successfully, and the input matches uncompressed output.");
        }
    }