Something I quite often get confused about:
Confusion over Unicode and Multibyte
After reading everyone's comments, I looked at an old article (2001), http://www.hastingsresearch.com/net/04-unicode-limitations.shtml , which says of Unicode:
- This is a 16-bit character definition that theoretically allows about 65,000 characters. However, the world's complete character sets add up to over 170,000 characters.
I also looked at the current "modern" article, http://en.wikipedia.org/wiki/Unicode , which says:
- The most commonly used encodings are UTF-8 (which uses 1 byte for all ASCII characters, which have the same code values as in the standard ASCII encoding, and up to 4 bytes for other characters), the now-obsolete UCS-2 (which uses 2 bytes for all characters, but does not include every character in the Unicode standard), and UTF-16 (which extends UCS-2, using 4 bytes to encode the characters missing from UCS-2).
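To make the byte counts in that quote concrete, here is a minimal sketch. The helper names (utf8_bytes, utf16_units, fits_in_ucs2) are my own and purely illustrative, and the functions assume a valid code point up to U+10FFFF without checking it.

```cpp
#include <cassert>
#include <cstddef>

// Bytes UTF-8 needs for one code point.
// Assumption: cp is a valid code point (<= 0x10FFFF); nothing is validated here.
std::size_t utf8_bytes(unsigned long cp)
{
    if (cp < 0x80)    return 1;  // ASCII range, same values as plain ASCII
    if (cp < 0x800)   return 2;
    if (cp < 0x10000) return 3;  // rest of the Basic Multilingual Plane (BMP)
    return 4;                    // everything above U+FFFF
}

// 16-bit code units UTF-16 needs: 1 inside the BMP, 2 (a surrogate pair) above it.
std::size_t utf16_units(unsigned long cp)
{
    return cp < 0x10000 ? 1 : 2;
}

// UCS-2 is a single 16-bit unit per character, so it stops at U+FFFF.
bool fits_in_ucs2(unsigned long cp)
{
    return cp < 0x10000;
}

// A few worked cases; 0x6211 is the first character of the test string used later.
void check_examples()
{
    assert(utf8_bytes(0x61) == 1);       // 'a'
    assert(utf8_bytes(0x6211) == 3);     // a CJK character in the BMP
    assert(utf16_units(0x6211) == 1);
    assert(utf8_bytes(0x1F600) == 4);    // a code point outside the BMP
    assert(utf16_units(0x1F600) == 2);   // 2 units = 4 bytes
    assert(!fits_in_ucs2(0x1F600));
}
```

So a Chinese character inside the BMP takes 3 bytes in UTF-8 but only one 16-bit unit in either UCS-2 or UTF-16; the two 16-bit encodings only differ for code points above U+FFFF.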
It seems that the Unicode option under Character Set in the VC2008 project settings really means "Unicode encoded as UCS-2" (or UTF-16? I'm not sure).
I tried to verify this by running the following code under VC2008:
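#include <cstdio>    // getchar
#include <iostream>

int main()
{
    // sizeof on a wide literal counts every wchar_t, including the terminating L'\0'
    std::cout << sizeof(L"我爱你") << std::endl;
    std::cout << sizeof(L"abc") << std::endl;
    getchar();  // keep the console window open
    return 0;
}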
When compiled with the Unicode character set option, the result matched my guess.
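Since all the characters in that test are inside the BMP, it cannot by itself tell UCS-2 from UTF-16. A possible follow-up check is sketched below: it counts the wchar_t units the compiler emits for a code point above U+FFFF. This assumes the compiler accepts \U universal character names in wide literals, which I have not verified for VC2008; wide_units, non_bmp_units and report are names I made up for the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>

// Number of wchar_t units in a wide literal, not counting the terminating L'\0'.
template <std::size_t N>
std::size_t wide_units(const wchar_t (&)[N])
{
    return N - 1;
}

// U+1F600 lies outside the BMP, so UCS-2 has no way to represent it.
// With a 2-byte wchar_t holding UTF-16 it should take 2 units (a surrogate pair);
// with a 4-byte wchar_t (typical on Linux) it fits in 1 unit.
std::size_t non_bmp_units()
{
    return wide_units(L"\U0001F600");
}

void report()
{
    assert(wide_units(L"abc") == 3);  // sanity check on the helper
    std::cout << "sizeof(wchar_t)   = " << sizeof(wchar_t) << std::endl;
    std::cout << "units for U+1F600 = " << non_bmp_units() << std::endl;
}
```

Calling report() from main should show 2 units wherever wide strings are UTF-16 with a 2-byte wchar_t, which would point to UTF-16 rather than pure UCS-2.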