Zip file with utf-8 file names

On my website there is an option to download all the images uploaded by users as a single ZIP archive. The problem is images with Hebrew names (I need to keep the original file name). I tried converting the file names to another encoding, but this does not help. Here is the code:

    using ICSharpCode.SharpZipLib.Zip;

    Encoding iso = Encoding.GetEncoding("ISO-8859-1");
    Encoding utf8 = Encoding.UTF8;
    byte[] utfBytes = utf8.GetBytes(file.Name);
    byte[] isoBytes = Encoding.Convert(utf8, iso, utfBytes);
    string name = iso.GetString(isoBytes);

    var entry = new ZipEntry(name + ".jpg");
    zipStream.PutNextEntry(entry);

    using (var reader = new System.IO.FileStream(file.Name, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
    {
        byte[] buffer = new byte[ChunkSize];
        int bytesRead;
        while ((bytesRead = reader.Read(buffer, 0, buffer.Length)) > 0)
        {
            byte[] actual = new byte[bytesRead];
            Buffer.BlockCopy(buffer, 0, actual, 0, bytesRead);
            zipStream.Write(actual, 0, actual.Length);
        }
    }

After this conversion, the file names in the archive come out as garbage, for example: ??????.jpg. Where is my mistake?

2 answers

Unicode (UTF-8 is one of its binary encodings) can represent far more characters than any single 8-bit encoding. Moreover, your code round-trips the name through ISO-8859-1, which has no Hebrew letters at all, so the characters cannot survive and you get garbage for your file names. You really should read Joel's article on Unicode.
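To see the effect in isolation, here is a minimal sketch that repeats the conversion from the question on a short Hebrew string. The class and method names (Latin1Demo, ThroughLatin1) are made up for illustration; the behaviour relies on the default replacement fallback of Encoding.GetEncoding, which substitutes "?" for every character the target encoding cannot represent.

```csharp
using System;
using System.Text;

static class Latin1Demo
{
    // Same steps as in the question: UTF-8 bytes -> ISO-8859-1 bytes -> string.
    public static string ThroughLatin1(string name)
    {
        Encoding iso = Encoding.GetEncoding("ISO-8859-1");
        byte[] utfBytes = Encoding.UTF8.GetBytes(name);
        byte[] isoBytes = Encoding.Convert(Encoding.UTF8, iso, utfBytes);
        return iso.GetString(isoBytes);
    }

    static void Main()
    {
        // Four Hebrew letters (shin, lamed, vav, final mem), written as escapes
        // so the source file encoding does not matter.
        string hebrew = "\u05E9\u05DC\u05D5\u05DD";

        // Hebrew has no representation in ISO-8859-1, so every letter is lost.
        Console.WriteLine(ThroughLatin1(hebrew) == "????");

        // Plain ASCII survives the round trip unchanged.
        Console.WriteLine(ThroughLatin1("photo") == "photo");
    }
}
```

Both comparisons should evaluate to true: the information is destroyed before the name ever reaches the ZIP library.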

...

Now that you have read the article, you should know that a C# string can already store Unicode data, so you probably do not need to convert file.Name at all: pass it directly to the ZipEntry constructor. This works as long as the library has no bugs in its encoding handling (which is always possible).
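A minimal sketch of that suggestion, assuming SharpZipLib is referenced and that the installed version exposes the ZipEntry.IsUnicodeText property (which asks the library to store the name as UTF-8 and mark it as such). ZipHelper and AddEntry are hypothetical names, and the entry content is taken from a Stream so the helper does not depend on how the files are located.

```csharp
using System.IO;
using ICSharpCode.SharpZipLib.Zip;

static class ZipHelper
{
    // Adds one entry to an open ZipOutputStream, keeping the name as-is.
    // No Encoding.Convert step: the C# string is already Unicode.
    public static void AddEntry(ZipOutputStream zipStream, string entryName, Stream content)
    {
        var entry = new ZipEntry(entryName)
        {
            // Assumption: this flag is honoured by the SharpZipLib version in use.
            IsUnicodeText = true
        };

        zipStream.PutNextEntry(entry);
        content.CopyTo(zipStream);   // replaces the manual read/copy loop
        zipStream.CloseEntry();
    }
}
```

With the code from the question, a call could look like ZipHelper.AddEntry(zipStream, file.Name + ".jpg", reader), where reader is the FileStream that is already being opened there.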


You are doing the wrong conversion, since strings in C# are already Unicode. What tool are you using to check the file names in the archive? By default, Windows ZIP implementations use the system DOS (OEM) code page for file names, while other implementations may use a different encoding.
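One way to take the archive viewer out of the equation is to read the entry names back programmatically. The sketch below uses the framework's own System.IO.Compression rather than SharpZipLib, purely as an independent reader; ZipNameCheck and ReadEntryNames are hypothetical names, and the archive path is expected as the first command-line argument.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Text;

static class ZipNameCheck
{
    // Returns the entry names as decoded by System.IO.Compression.
    public static List<string> ReadEntryNames(Stream zipData)
    {
        var names = new List<string>();
        using (var archive = new ZipArchive(zipData, ZipArchiveMode.Read, leaveOpen: true))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                names.Add(entry.FullName);
            }
        }
        return names;
    }

    static void Main(string[] args)
    {
        // Otherwise the console itself may turn Hebrew into question marks.
        Console.OutputEncoding = Encoding.UTF8;

        using (FileStream zipFile = File.OpenRead(args[0]))
        {
            foreach (string name in ReadEntryNames(zipFile))
            {
                Console.WriteLine(name);
            }
        }
    }
}
```

If the names print correctly here but look wrong in another program, the archive is probably fine and the viewer is decoding the names with a different code page.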

