In VC++ 2003, I could just save the source file as UTF-8, and all the strings were used as is. In other words, the following code would print the strings to the console unchanged. If the source file was saved as UTF-8, then the output would be UTF-8.
printf("Chinese (Traditional)");
printf("中国語 (繁体)");
printf("중국어 (번체)");
printf("Chinês (Tradicional)");
I saved the file in UTF-8 format with the UTF-8 signature (BOM). However, compiling with VC2008 results in:
warning C4566: character represented by universal-character-name '\uC911' cannot be represented in the current code page (932)
warning C4566: character represented by universal-character-name '\uAD6D' cannot be represented in the current code page (932)
etc.
The characters causing these warnings are corrupted in the output. Those that can be represented in the locale (in this case 932 = Japanese) are converted to the locale encoding, i.e. Shift-JIS.
I cannot find a way to get VC++ 2008 to compile this for me. Note that it does not matter which locale I specify in the source file. There does not appear to be a locale that says: "I know what I'm doing, so don't change my string literals." In particular, the useless UTF-8 pseudo-locale does not work.
#pragma setlocale(".65001")
=> error C2175: '.65001' : invalid locale
Neither does "C":
#pragma setlocale("C")
=> see warnings above (in particular, the locale is still 932)
It seems that VC2008 forces all characters into the specified (or default) locale, and that locale cannot be UTF-8. I don't want to change the file to use escape sequences such as "\xbf\x11...", because the same source is compiled with gcc, which deals with UTF-8 files quite happily.
Is it possible to indicate that compiling the source file should leave the string literals intact?
To put it differently: what compile flags can I use to specify backward compatibility with VC2003 when compiling the source file? That is, do not change the string literals; use them byte for byte as they are.
Update
Thanks for the suggestions, but I want to avoid wchar. Since this application deals exclusively with strings in UTF-8, using wchar would require me to convert all strings back to UTF-8, which should be unnecessary. All input, output, and internal processing is in UTF-8. It is a simple application that works fine as is on Linux and when compiled with VC2003. I want to be able to compile the same application with VC2008 and have it work.
For this to happen, I need VC2008 not to try to convert the strings to my local machine's locale (Japanese, 932). I want VC2008 to be backward compatible with VC2003. I want a locale or compiler setting that says strings are used as is, essentially as opaque char arrays, or as UTF-8. It looks like I might be stuck with VC2003 and gcc, though; VC2008 is trying to be too smart in this case.