How to stop git from breaking file encoding on checkout

I recently added a .gitattributes file to a C# repository with the following settings:

    * text=auto
    *.cs text diff=csharp

I renormalized the repository following these instructions from GitHub, and it seemed to work fine.

The problem is that when I look at some files (not all), I see many strange characters mixed in with the real code. It seems to happen when git runs the files through the LF->CRLF conversion specified in the .gitattributes file above.

According to Notepad++, the files that are messed up use UCS-2 Little Endian or UCS-2 Big Endian encoding. The files that seem to work fine are either ANSI or UTF-8 encoded.

For reference, my version of git is 1.8.0.msysgit.0, and my OS is Windows 8.

Any ideas how I can fix this? Will it be enough to change the encoding of the files?

2 answers

This happens when the files use an encoding in which every character takes two bytes.
CRLF is then encoded as \0\r\0\n (in big-endian byte order).

Git assumes a single-byte encoding, sees a \n that is not preceded by \r, and turns the sequence into \0\r\0\r\n.
This shifts the following line by one byte, so every other line ends up full of Chinese characters (because \0 becomes the low-order byte instead of the high-order byte).
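The byte shift described above can be reproduced in a few lines. This is only a sketch: `NaiveLfToCrlf` is a hypothetical stand-in for a single-byte LF->CRLF conversion, not git's actual implementation, and the class name is made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class Utf16CrlfDemo
{
    // Hypothetical stand-in for a single-byte LF -> CRLF conversion:
    // inserts 0x0D before every 0x0A that is not already preceded by 0x0D.
    public static byte[] NaiveLfToCrlf(byte[] input)
    {
        var output = new List<byte>(input.Length);
        for (int i = 0; i < input.Length; i++)
        {
            bool isLf = input[i] == 0x0A;
            bool afterCr = i > 0 && input[i - 1] == 0x0D;
            if (isLf && !afterCr)
                output.Add(0x0D);
            output.Add(input[i]);
        }
        return output.ToArray();
    }

    static void Main()
    {
        // "a\r\nb" encoded as UTF-16 big-endian (GetBytes emits no BOM).
        byte[] original = Encoding.BigEndianUnicode.GetBytes("a\r\nb");
        byte[] converted = NaiveLfToCrlf(original);

        Console.WriteLine(BitConverter.ToString(original));  // 00-61-00-0D-00-0A-00-62
        Console.WriteLine(BitConverter.ToString(converted)); // 00-61-00-0D-00-0D-0A-00-62
    }
}
```

The converted array has an odd length: the extra 0D pushes every byte after it to the other half of its two-byte pair, so a UTF-16 decoder reads the rest of the text with high and low bytes swapped until the next inserted byte restores the alignment.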

You can convert the files to UTF-8 using this LINQPad script:

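    const string path = @"C:\...";

    // Re-save every matching file as UTF-8 (with BOM) and CRLF line endings.
    foreach (var file in Directory.EnumerateFiles(path, "*", SearchOption.AllDirectories))
    {
        // Only process the listed extensions; adjust the list as needed.
        if (!new[] { ".html", ".js" }.Contains(Path.GetExtension(file)))
            continue;

        // ReadAllLines detects the source encoding from the byte order mark.
        File.WriteAllText(
            file,
            String.Join("\r\n", File.ReadAllLines(file)),
            new UTF8Encoding(encoderShouldEmitUTF8Identifier: true));

        file.Dump(); // LINQPad: print the path of each converted file
    }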

This will not fix files that are already broken; you can repair those by replacing \r\n with \n in a hex editor. I do not have a LINQPad script for that (there is no simple Replace() method for byte[]).
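The byte-level replacement of \r\n with \n could be sketched roughly as follows. The class and method names are hypothetical. The sketch assumes the file contained no legitimate 0D 0A byte pairs before it was broken, and it writes the result to a separate `.fixed` copy so the broken file stays available for comparison.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class CrlfRepair
{
    // Drops every 0x0D byte that is immediately followed by 0x0A,
    // which is the byte-level equivalent of replacing \r\n with \n.
    public static byte[] ReplaceCrLfWithLf(byte[] input)
    {
        var output = new List<byte>(input.Length);
        for (int i = 0; i < input.Length; i++)
        {
            bool crBeforeLf = input[i] == 0x0D
                && i + 1 < input.Length
                && input[i + 1] == 0x0A;
            if (!crBeforeLf)
                output.Add(input[i]);
        }
        return output.ToArray();
    }

    static void Main(string[] args)
    {
        // Hypothetical usage: pass the path of one broken file.
        if (args.Length != 1)
        {
            Console.Error.WriteLine("usage: CrlfRepair <file>");
            return;
        }

        byte[] broken = File.ReadAllBytes(args[0]);
        File.WriteAllBytes(args[0] + ".fixed", ReplaceCrLfWithLf(broken));
    }
}
```

Working on raw bytes rather than decoded text matters here, because the broken file is no longer valid UTF-16 and a text-based replace would operate on already-misread characters.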


To fix this, either convert the files to a different encoding (UTF-8 should be fine) or turn off the automatic line-ending conversion (set git config core.autocrlf false and drop the text settings from the .gitattributes you have).
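Before converting, it can help to know which files are affected. A minimal sketch, assuming the two-byte-encoded files start with a byte order mark and that only `*.cs` files matter; the class and method names are hypothetical:

```csharp
using System;
using System.IO;

static class Utf16Finder
{
    // Returns true when the first two bytes are a UTF-16 byte order mark.
    // Note: a UTF-32 little-endian BOM also starts with FF FE.
    public static bool HasUtf16Bom(byte[] head)
    {
        if (head.Length < 2)
            return false;
        bool littleEndian = head[0] == 0xFF && head[1] == 0xFE;
        bool bigEndian = head[0] == 0xFE && head[1] == 0xFF;
        return littleEndian || bigEndian;
    }

    static void Main(string[] args)
    {
        // Hypothetical usage: pass the repository root, or run from inside it.
        string root = args.Length > 0 ? args[0] : ".";
        foreach (var file in Directory.EnumerateFiles(root, "*.cs", SearchOption.AllDirectories))
        {
            var head = new byte[2];
            int read;
            using (var stream = File.OpenRead(file))
                read = stream.Read(head, 0, 2);

            if (read == 2 && HasUtf16Bom(head))
                Console.WriteLine(file);
        }
    }
}
```

Files without a byte order mark would be missed by this check, so the list it prints is a starting point rather than a complete inventory.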
