Git and Umlaut issue on Mac OS X

Today I discovered a bug in Git on Mac OS X.

For example, I want to commit a file named Überschrift.txt, which starts with the German special character Ü. The git status command gives me the following output:

 Users-iMac: user$ git status
 # On branch master
 # Untracked files:
 #   (use "git add <file>..." to include in what will be committed)
 #
 #       "U\314\210berschrift.txt"
 nothing added to commit but untracked files present (use "git add" to track)

Git 1.7.2 seems to have problems with German special characters on Mac OS X. Is there a way to get Git to read the file names correctly?

+59
git versioning macos
Apr 7 2018-11-11T00:
7 answers

Enable core.precomposeunicode on the Mac:

 git config --global core.precomposeunicode true 

This requires Git 1.8.2 or later.

Mountain Lion ships with 1.7.5. To get a newer Git, use git-osx-installer or Homebrew (requires Xcode).

That's it.

+80
Mar 21 '13 at 17:06

The cause is a difference in how file systems store file names.

In Unicode, Ü can be represented in two ways: either as the single precomposed character Ü, or as a plain U followed by a combining diaeresis. A Unicode string may contain both forms, but because that is confusing, the file system normalizes file names by converting every umlauted U consistently to one form: either Ü, or U plus combining diaeresis.

On Linux, file names are normally in the former form, called Normal Form Composed (NFC), while Mac OS X uses the latter, called Normal Form Decomposed (NFD).

Apparently, Git does not take this into account and simply uses the byte sequence of the file name, which leads to the problem you are seeing.
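To make the difference between the two forms concrete, here is a minimal Python sketch using the standard-library unicodedata module. It is only an illustration of the normalization idea, not how Git or the file system implements it; same_name is a hypothetical helper name.

```python
import unicodedata

composed = "\u00dcberschrift.txt"      # Ü as a single code point (NFC)
decomposed = "U\u0308berschrift.txt"   # U followed by COMBINING DIAERESIS (NFD)


def same_name(a: str, b: str) -> bool:
    """Hypothetical helper: compare two file names after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Byte for byte the two names differ, which is all a byte-oriented comparison sees.
print(composed.encode("utf-8") == decomposed.encode("utf-8"))  # False

# After normalization they are the same name.
print(same_name(composed, decomposed))  # True
```

A comparison that normalizes both sides first treats the two spellings as one file name, which is the idea behind comparing names after normalization.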

The Git mailing list thread on Mac OS X and German special characters includes a patch that makes Git compare file names after normalization.

+28
Apr 7 2018-11-14T00:

To make git add file work with umlauts in file names on Mac OS X, you can convert file path strings from composed to canonically decomposed UTF-8 using iconv.

 # test case
 mkdir testproject
 cd testproject

 git --version    # git version 1.7.6.1
 locale charmap   # UTF-8

 git init
 file=$'\303\234berschrift.txt'    # composed UTF-8 (Linux-compatible)
 touch "$file"
 echo 'Hello, world!' > "$file"

 # convert composed into canonically decomposed UTF-8
 # cf. http://codesnippets.joyent.com/posts/show/12251
 # printf '%s' "$file" | iconv -f utf-8 -t utf-8-mac | LC_ALL=C vis -fotc

 #git add "$file"
 git add "$(printf '%s' "$file" | iconv -f utf-8 -t utf-8-mac)"

 git commit -a -m 'This is my commit message!'
 git show
 git status
 git ls-files '*'
 git ls-files -z '*' | tr '\0' '\n'

 touch $'caf\303\251 1' $'caf\303\251 2' $'caf\303\251 3'
 git ls-files --other '*'
 git ls-files -z --other '*' | tr '\0' '\n'

+5
Sep 06 2018-11-15T00:

The following entries in ~/.gitconfig work for me on macOS 10.12.1 Sierra for UTF-8 names:

 [core]
     precomposeunicode = true
     quotepath = false

The first setting is needed so that Git "understands" UTF-8 file names, and the second so that it does not escape the characters.

+4
Dec 20 '16 at 17:44

Set the OS X-specific option core.precomposeunicode to true for the repository:

 git config core.precomposeunicode true

To make sure that new repositories get this setting as well, also run:

 git config --global core.precomposeunicode true 

Here is the corresponding snippet from the man page:

This option is only used by the Mac OS implementation of Git. When core.precomposeunicode=true, Git reverts the Unicode decomposition of file names done by Mac OS. This is useful when sharing a repository between Mac OS and Linux or Windows. (Git for Windows 1.7.10 or higher is needed, or Git under cygwin 1.7.) When false, file names are handled fully transparently by Git, which is backward compatible with older versions of Git.
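Before sharing a repository across platforms, it can help to see which names in a working tree are not in precomposed form. The following Python sketch is one way to check that; it is not part of Git, and non_nfc_names and scan_tree are hypothetical helper names introduced here for illustration.

```python
import os
import unicodedata


def non_nfc_names(names):
    """Return the names that change under NFC normalization (i.e. are not precomposed)."""
    return [n for n in names if unicodedata.normalize("NFC", n) != n]


def scan_tree(root="."):
    """Yield paths below root whose last component is not in NFC form.

    Note: this simple walk does not skip the .git directory.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        for name in non_nfc_names(dirnames + filenames):
            yield os.path.join(dirpath, name)


if __name__ == "__main__":
    for path in scan_tree("."):
        print(path)
```

Run from the top of a working tree, it prints every path whose name would look different after precomposition. On a file system that hands back decomposed names, that is likely to include every file with an umlaut or accent.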

+3
Dec 02 '13 at

This output is correct.

Your file name is in UTF-8 and is represented as LATIN CAPITAL LETTER U + COMBINING DIAERESIS (Unicode U+0308, UTF-8 0xcc 0x88) instead of LATIN CAPITAL LETTER U WITH DIAERESIS (Unicode U+00DC, UTF-8 0xc3 0x9c). The Mac OS X HFS file system decomposes Unicode in this way. Git, in turn, shows the non-ASCII bytes of the file name as octal escapes.

Please note that Unicode file names may make your repository non-portable. For example, msysgit has had problems with Unicode file names.
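The escaped name from git status can be decoded by hand to confirm this. Below is a minimal Python sketch; unquote_git_path is a hypothetical helper written for this illustration, and it handles only three-digit octal escapes, not the other escapes Git may emit (such as \" or \\).

```python
import re
import unicodedata


def unquote_git_path(quoted: str) -> str:
    """Hypothetical helper: decode the \\NNN octal escapes of a quoted path from git status."""
    body = quoted.strip('"').encode("ascii")
    # Replace every \NNN octal escape with the single byte it stands for.
    raw = re.sub(rb"\\([0-7]{3})", lambda m: bytes([int(m.group(1), 8)]), body)
    return raw.decode("utf-8")


name = unquote_git_path(r'"U\314\210berschrift.txt"')

# The first two code points are the base letter and the combining mark.
print([unicodedata.name(c) for c in name[:2]])
# ['LATIN CAPITAL LETTER U', 'COMBINING DIAERESIS']

# Normalizing to NFC yields the precomposed Ü, whose UTF-8 bytes are 0xc3 0x9c.
print(unicodedata.normalize("NFC", name).encode("utf-8")[:2])
# b'\xc3\x9c'
```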

+1
Apr 07 2018-11-11T00:

I had a similar problem with my personal repository, so I wrote a helper script with Python 3. You can view it here: https://github.com/sjtoik/umlaut-cleaner

The script requires a bit of manual labor, but not much.

0
Apr 22 '14 at 10:07