Writing C# source code to files

I have a dumb problem. I am reading some .cs files from disk, performing a large number of regular expression replacements and other operations on them with a .NET program I wrote, and then writing them back to disk.

The resulting files are somehow incorrectly encoded. What encoding are C# source files in? And then there is the byte order mark at the beginning of the file; is it necessary? Does this work when I use File.WriteAllText()?

The file-changing program is a simple .NET application, and the code is just:

string text = System.IO.File.ReadAllText(fn);
string newText = Regex.Replace(text, regexStr, replaceStr);
System.IO.File.WriteAllText(fn, newText);

There are comments in the C# files, and they contain characters that don't seem to be part of the standard code page.

One of the problematic characters is "Γ€"

Solution:

It looks like this works correctly:

  string text = System.IO.File.ReadAllText(fn, Encoding.GetEncoding(1252));
  string newText = Regex.Replace(text, regexStr, replaceStr);
  System.IO.File.WriteAllText(fn, newText, Encoding.GetEncoding(1252));
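The snippet above hard-codes code page 1252, which only helps if every file really is in that code page. A minimal sketch of a more general round trip follows: it lets StreamReader detect a byte order mark, falls back to a caller-supplied encoding when there is none, and writes the file back with whatever encoding was actually used. The names SourceRewriter and ReplaceInFile are hypothetical, and ISO-8859-1 is used in Main only as a stand-in for a legacy code page, because on .NET Core and later Encoding.GetEncoding(1252) needs the code pages encoding provider to be registered first.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

static class SourceRewriter
{
    // Hypothetical helper: applies one regex replacement and writes the file
    // back in the same encoding it was read with.
    public static void ReplaceInFile(string path, string pattern,
                                     string replacement, Encoding fallback)
    {
        string text;
        Encoding used;
        // The third argument asks StreamReader to look for a byte order mark;
        // when none is present it decodes with the fallback encoding.
        using (var reader = new StreamReader(path, fallback, true))
        {
            text = reader.ReadToEnd();
            used = reader.CurrentEncoding; // only meaningful after reading
        }
        File.WriteAllText(path, Regex.Replace(text, pattern, replacement), used);
    }

    static void Main()
    {
        string path = Path.GetTempFileName();
        // "// a" followed by byte 0xE4, which is U+00E4 in ISO-8859-1; no BOM.
        File.WriteAllBytes(path, new byte[] { 0x2F, 0x2F, 0x20, 0x61, 0xE4 });
        ReplaceInFile(path, "a", "b", Encoding.GetEncoding("iso-8859-1"));
        Console.WriteLine(BitConverter.ToString(File.ReadAllBytes(path)));
        File.Delete(path);
    }
}
```

With this approach a file that starts with a UTF-8 byte order mark stays UTF-8, and a file without one keeps the fallback encoding, so the non-ASCII bytes in comments are not rewritten.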
3 answers

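System.IO.File.ReadAllText(fn) tries to guess the encoding of the input file. It does so by looking for a byte order mark and otherwise assumes UTF-8, so this can go horribly wrong for a file that was saved in a legacy code page without a byte order mark.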
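To illustrate how the guess can fail, here is a small sketch. It writes a few bytes with no byte order mark, one of which (0xE4) is a valid character in ISO-8859-1 and Windows-1252 but not valid UTF-8 on its own, and then reads them back once without and once with an explicit encoding. The names DetectionDemo and ReadBothWays are made up for this example.

```csharp
using System;
using System.IO;
using System.Text;

static class DetectionDemo
{
    // Writes the bytes to a temp file, then decodes them two ways:
    // with the default overload and with the encoding the caller names.
    public static (string Guessed, string Explicit) ReadBothWays(byte[] bytes, Encoding actual)
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllBytes(path, bytes);
            string guessed = File.ReadAllText(path);          // no BOM, so UTF-8 is assumed
            string explicitText = File.ReadAllText(path, actual);
            return (guessed, explicitText);
        }
        finally
        {
            File.Delete(path);
        }
    }

    static void Main()
    {
        // "// " followed by 0xE4 (U+00E4 in ISO-8859-1).
        byte[] bytes = { 0x2F, 0x2F, 0x20, 0xE4 };
        var result = ReadBothWays(bytes, Encoding.GetEncoding("iso-8859-1"));

        // The guessed text contains U+FFFD, the replacement character.
        Console.WriteLine(result.Guessed.IndexOf('\uFFFD') >= 0);
        // The explicit read recovers the original character.
        Console.WriteLine(result.Explicit == "// \u00E4");
    }
}
```

Once the invalid byte has been replaced during decoding, writing the string back cannot restore it, which is why the damage shows up in the output file.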

Visual Studio 2008 creates new files in UTF-8 by default. Similarly, you should try to use UTF-8 where possible by specifying Encoding.UTF8 when writing files to disk.
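On the byte order mark question, the following sketch checks what the different File.WriteAllText calls put at the start of the file. BomDemo and HasUtf8Bom are hypothetical names for this example; the encoding classes are the standard System.Text ones.

```csharp
using System;
using System.IO;
using System.Text;

static class BomDemo
{
    // Returns true when the file starts with the UTF-8 byte order mark EF BB BF.
    public static bool HasUtf8Bom(string path)
    {
        byte[] b = File.ReadAllBytes(path);
        return b.Length >= 3 && b[0] == 0xEF && b[1] == 0xBB && b[2] == 0xBF;
    }

    static void Main()
    {
        string path = Path.GetTempFileName();
        string text = "// caf\u00E9";

        File.WriteAllText(path, text);                          // UTF-8, no BOM
        Console.WriteLine(HasUtf8Bom(path));

        File.WriteAllText(path, text, Encoding.UTF8);           // UTF-8 with BOM
        Console.WriteLine(HasUtf8Bom(path));

        File.WriteAllText(path, text, new UTF8Encoding(false)); // UTF-8, BOM explicitly off
        Console.WriteLine(HasUtf8Bom(path));

        File.Delete(path);
    }
}
```

The two-argument overload writes UTF-8 without a byte order mark, while passing Encoding.UTF8 adds one. If the files originally had the "UTF-8 with signature" mark and other tools rely on it, passing Encoding.UTF8 keeps it.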


By default, source files should be encoded with the same code page as specified in the machine's regional settings. Usually that will be "Unicode (UTF-8 with signature) - Codepage 65001", but you can use any code page you want; for example, you can also use "Western European (Windows) - Codepage 1252".


I have written several code generators in my time and always used ASCII encoding (plain text). What language are you using to run the regular expressions on the .cs files?
