Unpacking archives containing files with French names

I am trying to deliver a project to a client. The task is to pack the files into an archive; simple, right? Well, the files have (and should have) French characters in their names. I create the archive from the Linux command line, and it is opened from the desktop on Windows.

At first I tried "zip" and it did not work. Character support seems to depend on the implementation, from what I read here on StackOverflow. After unpacking, the file names did not look right either for me (Ubuntu Archive Manager) or for her (WinZip, Windows).

Then we tried tar. Now everything looks fine on my side, but it still doesn't work for the client (who tried PeaZip and 7-Zip for Windows).

I really did not expect this to be a problem. French computer users need to archive things too; what do they use?

Any insight or help on this would be much appreciated. Thanks!

+4
4 answers

Try using an archive program that lets you specify the character encoding (for example, UTF-8), or figure out how to do this with the one you have. This forum topic may help you, because it is similar to what you are asking, although the other way around and for German rather than French: http://sourceforge.net/projects/sevenzip/forums/forum/45797/topic/3710172

+2

ZIP traditionally encodes file names using the IBM437 encoding (code page 437). However, as far as I know, many tools (incorrectly) use the system's default encoding instead, which is likely to cause problems in this situation, since the two ends may be using different encodings.
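To make the mismatch concrete, here is a minimal Python sketch of what happens when one side writes a name as UTF-8 and the other side reads the same bytes with a different code page. The file name and the choice of code pages (cp437 and cp1252) are only illustrative assumptions; what a given unpacker really does depends on the tool and the system.

```python
# Sketch: one file name, stored as UTF-8 bytes by the sender and
# interpreted with several encodings by the receiver.
name = "présentation_été.txt"   # hypothetical file name

# Bytes that a UTF-8 Linux system would typically store for this name.
raw = name.encode("utf-8")


def decode_as(data: bytes, encoding: str) -> str:
    """Interpret raw file-name bytes using the given encoding."""
    return data.decode(encoding, errors="replace")


# Only the first line shows the intended name; the others show garbled text.
for enc in ("utf-8", "cp437", "cp1252"):
    print(f"{enc:>7}: {decode_as(raw, enc)}")
```

The bytes never change; only the interpretation does, which is why the names can look fine on one machine and broken on another.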

In theory, ZIP also supports UTF-8 by now, which should solve these problems, but again, tool support will be an issue. For example, as far as I know, the ZIP support built into Windows Explorer cannot handle UTF-8 encoded file names.
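As an illustration of the UTF-8 support mentioned above, here is a small sketch using Python's standard `zipfile` module, which marks non-ASCII names as UTF-8 through a flag in the entry header (general purpose bit 11, value 0x800). The file names are made up, and whether the client's unpacker honours the flag is exactly the tool-support question, so treat this as a way to inspect an archive rather than a guaranteed fix.

```python
import io
import zipfile

UTF8_FLAG = 0x800  # general purpose bit 11: file name is stored as UTF-8


def make_zip(names):
    """Build an in-memory ZIP with one small placeholder entry per name."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for entry_name in names:
            zf.writestr(entry_name, b"placeholder content")
    return buf.getvalue()


def utf8_flags(zip_bytes):
    """Map each entry name to whether its UTF-8 name flag is set."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return {
            info.filename: bool(info.flag_bits & UTF8_FLAG)
            for info in zf.infolist()
        }


# Hypothetical names: one plain ASCII, one with French accents.
flags = utf8_flags(make_zip(["readme.txt", "résumé_été.txt"]))
print(flags)
```

Running `utf8_flags` on an archive produced by another tool (read from disk instead of built in memory) is a quick way to see whether that tool set the flag at all.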

So what it comes down to is this: both ends must agree on the encoding used for the file names, and you need an encoding that supports all the characters you have (any Unicode encoding will be fine; I'm not sure about IBM437, though). ZIP has been around for a long time, and so there are many tools that do not agree on the encoding. If possible, specify the encoding explicitly and prefer Unicode. In terms of compatibility with arbitrary tools, you might be better off using a newer format that was designed with Unicode in mind.
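The doubt about IBM437 can be checked directly. The sketch below assumes Python's built-in `cp437` codec as a stand-in for IBM437, and the list of French letters is my own selection, not something from the question.

```python
# Accented letters and ligatures that can occur in French, both cases.
FRENCH_LETTERS = "àâçéèêëîïôùûüÿœæ" + "ÀÂÇÉÈÊËÎÏÔÙÛÜŸŒÆ"


def encodable(ch: str, encoding: str) -> bool:
    """Return True if the character can be represented in the encoding."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def missing_from(encoding: str, letters: str = FRENCH_LETTERS) -> str:
    """Return the letters that the encoding cannot represent."""
    return "".join(ch for ch in letters if not encodable(ch, encoding))


for enc in ("cp437", "cp1252", "utf-8"):
    print(enc, "cannot encode:", missing_from(enc) or "(nothing)")
```

Anything reported as missing for a code page cannot survive in a name stored with that code page, no matter how well the two ends agree.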

7-Zip supports UTF-8 in ZIP archives starting with version 4.58 beta, according to the change log, but will only use it if the local code page does not support the required characters. With the command line switch -mcu it will use UTF-8 for everything that is not ASCII. Local encodings usually differ only in the non-ASCII range, so this is likely to do the trick. That is, if the tool used for unpacking also supports UTF-8 (which is more likely for 7-Zip's own 7z format than for ZIP, because it is not as old as ZIP and there are fewer tools that unpack it).
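For scripting this, a rough sketch of driving 7-Zip from Python could look as follows. The spelling `-mcu=on`, the `-tzip` type switch, and the executable names `7z` / `7za` are assumptions based on the 7-Zip command-line documentation; check them against the version you have installed. The function names are hypothetical.

```python
import shutil
import subprocess


def seven_zip_utf8_command(archive, files, exe="7z"):
    """Build the argument list for a ZIP archive with UTF-8 file names.

    Assumed switches: 'a' adds files, '-tzip' selects the ZIP format,
    '-mcu=on' asks for UTF-8 on names containing non-ASCII characters.
    """
    return [exe, "a", "-tzip", "-mcu=on", archive, *files]


def create_archive(archive, files):
    """Run 7-Zip if an executable is found on PATH (not called here)."""
    exe = shutil.which("7z") or shutil.which("7za")
    if exe is None:
        raise FileNotFoundError("no 7z executable found on PATH")
    subprocess.run(seven_zip_utf8_command(archive, files, exe), check=True)
```

Passing the names as a list avoids shell quoting problems with accented characters and spaces.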

WinRAR may also be worth a try.

+6

Alternatively... you could strip the accents from the characters. If French-speaking users are on the receiving end of the file transfer, they may or may not be sympathetic (ask your users!).

French does not, in fact, have all that many accents. You have [ae]-grave, e-acute, [aeiou]-circumflex and c-cedilla to worry about, in upper and lower case (although upper case is less likely for the grave and acute ones, unless someone has left Caps Lock on).

GNU tar has the --transform option. If you write a sed expression that turns all the accented a, e, i, o, u and c characters, encoded as ISO Latin-1, into their unaccented versions, you'll probably be fine.
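If a script is easier than a sed expression, the same idea can be sketched in Python. Note that this swaps in a different technique: instead of a hand-written character table it uses Unicode decomposition (NFD) and drops the combining marks, then writes the tar entries under the stripped names. The function names and file names are hypothetical, and ligatures such as œ have no decomposition, so they pass through unchanged.

```python
import io
import tarfile
import unicodedata


def strip_accents(name: str) -> str:
    """Remove accents by decomposing and dropping combining marks."""
    decomposed = unicodedata.normalize("NFD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def tar_with_ascii_names(files: dict) -> bytes:
    """Build an in-memory tar; 'files' maps original names to bytes."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=strip_accents(name))
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def tar_names(tar_bytes: bytes) -> list:
    """List the member names stored in an in-memory tar."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes)) as tar:
        return tar.getnames()


print(strip_accents("élève_français.txt"))
```

Two different original names can collapse to the same stripped name, so it is worth checking for collisions before shipping the archive.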

0

I think you should go with 7z compression. On Linux, this can be done with PeaZip, or by installing p7zip and using it through a front end such as Ark or File Roller, depending on your desktop (I prefer PeaZip because it can be used on any desktop). The 7z format was designed with Unicode in mind (the author is Russian), and in my experience it has never failed.

0

Source: https://habr.com/ru/post/1315436/