I am trying to print a Chinese character using the types wchar_t, char16_t and char32_t, but to no avail.

Question

I am trying to print the Chinese character 中 using the types wchar_t, char16_t and char32_t, without success (live example).

    #include <iostream>

    int main()
    {
        char x[] = "中";              // Chinese character with Unicode code point U+4E2D
        char y[] = u8"中";
        wchar_t z = L'中';
        char16_t b = u'\u4e2d';
        char32_t a = U'\U00004e2d';

        std::cout << x << '\n';       // Ok
        std::cout << y << '\n';       // Ok
        std::wcout << z << '\n';      // ??
        std::cout << a << '\n';       // prints the decimal number (20013) corresponding to the code point U+4E2D
        std::cout << b << '\n';       // same: prints the decimal number 20013
    }

c++ c++14 cout

François-Marie Arouet Jul 22 '15 at 18:40

1 answer

Cubbi · Accepted Answer · 2015-07-23T02:28:27+0000

Since you ran your test on a Linux system, the source code is UTF-8, which is why x and y are the same thing. Those bytes are passed through, unmodified, to standard output by std::cout << x and std::cout << y, and when you view the web page (or look at the Linux terminal), you see the character you expected.
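One way to check that x and y really hold the same bytes is to dump them in hexadecimal. The following is only a minimal sketch; the helper name hex_bytes is hypothetical and not part of the question or the answer.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: render each byte of s as two lowercase hex digits,
// separated by single spaces.
std::string hex_bytes(const std::string& s) {
    std::string out;
    char buf[4];
    for (unsigned char c : s) {
        std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned>(c));
        if (!out.empty()) out += ' ';
        out += buf;
    }
    return out;
}
```

With a UTF-8 source file, std::cout << hex_bytes(x) and std::cout << hex_bytes(y) should both show e4 b8 ad, which is the UTF-8 encoding of U+4E2D.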

std::wcout << z will print if you do two things:

    std::ios::sync_with_stdio(false);
    std::wcout.imbue(std::locale("en_US.utf8"));

Without unsyncing from C, GNU libstdc++ goes through the C I/O streams, which can never print a wide character after printing a narrow character on the same stream. LLVM libc++ appears to work even when synchronized, but it still needs the imbue call to tell the stream how to convert wide characters to the bytes it sends to standard output.
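The two steps can be wrapped in a small helper that also copes with the locale not being installed, since std::locale throws std::runtime_error for a name the system does not know. This is a sketch under assumptions: the helper name setup_wide_output is hypothetical, and the right locale name (for example "en_US.utf8") depends on the platform.

```cpp
#include <iostream>
#include <locale>
#include <stdexcept>

// Hypothetical helper: prepare std::wcout for wide output.
// Returns false if the named locale is not available on this system.
bool setup_wide_output(const char* locale_name) {
    // Step 1: detach the C++ streams from C stdio. This should happen
    // before any output has been written.
    std::ios::sync_with_stdio(false);
    try {
        // Step 2: give wcout a locale whose codecvt facet converts wide
        // characters into the bytes sent to standard output.
        std::wcout.imbue(std::locale(locale_name));
    } catch (const std::runtime_error&) {
        return false;  // std::locale throws when the name is unknown
    }
    return true;
}
```

Calling setup_wide_output("en_US.utf8") at the top of main and checking the result before std::wcout << z makes a missing locale visible instead of silently producing no output.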

To print b and a, you have to convert them to wide or narrow characters; even with wbuffer_convert, setting up a char32_t stream is a lot of work. The conversion looks like this:

    // requires <codecvt> and <locale>
    std::wstring_convert<std::codecvt_utf8<char32_t>, char32_t> conv32;
    std::cout << conv32.to_bytes(a) << '\n';
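Note that std::wstring_convert and the <codecvt> facets were deprecated in C++17. As an alternative to the approach in this answer, the narrow conversion can be written by hand. The sketch below is not the answer's method; encode_utf8 is a hypothetical helper that encodes one code point and returns an empty string for invalid input.

```cpp
#include <string>

// Hypothetical helper: encode one Unicode code point as UTF-8.
// Returns an empty string for surrogates and values above U+10FFFF.
std::string encode_utf8(char32_t cp) {
    std::string out;
    if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        if (cp >= 0xD800 && cp <= 0xDFFF) return out;  // lone surrogate
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0x10FFFF) {           // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

With this helper, std::cout << encode_utf8(a) << '\n' writes the three bytes e4 b8 ad. The same call works for b, because a single char16_t that is not half of a surrogate pair converts implicitly to the same code point as a char32_t.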

Putting it all together: http://coliru.stacked-crooked.com/a/a809c38e21cc1743
