Incorrect ZIP entry names when writing file names containing non-English characters, even with Java 7

I am trying to write code that can handle ZIP files whose entry names contain non-English characters (umlauts, Arabic, etc.), but the resulting archive contains invalid names. I am using Java version 1.7.0_45, so this should not be caused by the bug mentioned here. I pass UTF-8 as the charset to the ZipOutputStream constructor, and according to the Javadocs that should meet my requirements.

I am sure that the zip file is written correctly, because reading the entries back from the file returns the correct file names (as expected).

However, when I open or extract the archive with Ubuntu's default Archive Manager or the unzip utility, the file names are garbled.

Here is my code:

    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.nio.charset.Charset;
    import java.util.List;
    import java.util.zip.ZipEntry;
    import java.util.zip.ZipInputStream;
    import java.util.zip.ZipOutputStream;

    private void convertFilesToZip(List<File> files) {
        try {
            byte[] buffer = new byte[1024];
            // Write every file into the archive, encoding entry names as UTF-8.
            try (ZipOutputStream outputStream = new ZipOutputStream(
                    new FileOutputStream("zipFile.zip"), Charset.forName("UTF-8"))) {
                for (File file : files) {
                    String filename = file.getName();
                    System.out.println("Adding file : " + filename);
                    outputStream.putNextEntry(new ZipEntry(filename));
                    // Close each input stream once its file is copied
                    // (the original closed only the last one).
                    try (FileInputStream inputStream = new FileInputStream(file)) {
                        int length;
                        while ((length = inputStream.read(buffer)) > 0) {
                            outputStream.write(buffer, 0, length);
                        }
                    }
                    outputStream.closeEntry();
                }
            }
            System.out.println("Zip created successfully");
            System.out.println("=======================================================");
            System.out.println("Reading zip Entries");
            // Read the archive back and print the entry names.
            try (ZipInputStream zipInputStream = new ZipInputStream(
                    new FileInputStream(new File("zipFile.zip")), Charset.forName("UTF-8"))) {
                ZipEntry zipEntry;
                while ((zipEntry = zipInputStream.getNextEntry()) != null) {
                    System.out.println(zipEntry.getName());
                    zipInputStream.closeEntry();
                }
            }
        } catch (IOException exception) {
            exception.printStackTrace();
        }
    }

The output for the files is as follows:

    Adding file : umlaut_ḧ.txt
    Adding file : ذ ر ز س ش ص ض.txt
    Adding file : äǟc̈ḧös̈ ẗǚẍŸ_uploadFile4.txt
    Adding file : pingüino.txt
    Adding file : ÄÖÜäöüß- Español deEspaña.ppt
    Zip created successfully
    =======================================================
    Reading zip Entries
    umlaut_ḧ.txt
    ذ ر ز س ش ص ض.txt
    äǟc̈ḧös̈ ẗǚẍŸ_uploadFile4.txt
    pingüino.txt
    ÄÖÜäöüß- Español deEspaña.ppt

Has anyone successfully implemented what I am trying to achieve here? Can someone point out what I may have missed or done wrong? I did my best and even tried Apache Commons Compress, but still no luck.

The bug report says the issue is resolved in Java 7, so why does the code not work?

Any help is greatly appreciated. Thanks in advance.

1 answer

[Update] I finally realized that the problem is not in the code but in Ubuntu's default archive manager, which does not recognize or extract the contents properly. When the same file is opened or extracted with the Windows zip handler, it works flawlessly.
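One way to confirm that the archive itself is fine, independently of any extraction tool, is to look at the raw header. The ZIP format defines bit 11 of the general purpose bit flag as the marker for UTF-8 encoded entry names, and a reader that does not honor that flag may display such names incorrectly. The following is only a minimal sketch: the class name `ZipNameFlagCheck` and its helper methods are hypothetical, and it inspects just the first local file header of an in-memory archive.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of the original question or answer.
class ZipNameFlagCheck {

    // Bit 11 of the general purpose bit flag marks entry names as UTF-8.
    static final int UTF8_FLAG = 0x0800;

    // Builds a small in-memory archive with a single entry, using UTF-8 names.
    static byte[] zipWithEntry(String entryName) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(bytes, StandardCharsets.UTF_8)) {
            out.putNextEntry(new ZipEntry(entryName));
            out.write("sample".getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
        }
        return bytes.toByteArray();
    }

    // Reads the flag from the first local file header:
    // 4-byte signature "PK\003\004", 2-byte version, then the 2-byte
    // little-endian general purpose bit flag at offset 6.
    static boolean hasUtf8Flag(byte[] zip) {
        if (zip.length < 8 || zip[0] != 'P' || zip[1] != 'K' || zip[2] != 3 || zip[3] != 4) {
            throw new IllegalArgumentException("Data does not start with a local file header");
        }
        int flag = (zip[6] & 0xFF) | ((zip[7] & 0xFF) << 8);
        return (flag & UTF8_FLAG) != 0;
    }

    public static void main(String[] args) throws IOException {
        // "ping\u00fcino.txt" is "pingüino.txt", escaped to avoid source-encoding issues.
        byte[] zip = zipWithEntry("ping\u00fcino.txt");
        System.out.println("UTF-8 flag set: " + hasUtf8Flag(zip));
    }
}
```

If the flag is reported as set and the names still look wrong in a particular tool, that points to the tool rather than to the Java code that wrote the archive.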

In addition, commons-compress supports a range of other formats besides the zip and gzip formats that Java supports out of the box.

http://commons.apache.org/proper/commons-compress/index.html
