This is simply a bug in the compiler. §2.1/1 refers to translation phase 1,

(An implementation may use any internal encoding, so long as an actual extended character encountered in the source file, and the same extended character expressed in the source file as a universal-character-name (i.e. using the \uXXXX notation), are handled equivalently.)
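As a minimal sketch of what this equivalence guarantees (my illustration, assuming the source file is saved as UTF-8 and a conforming compiler): the two wide string literals below must denote the same string, because phase 1 treats the directly written € and its \u20AC spelling as the same character.

    #include <cassert>
    #include <cwchar>

    int main() {
        const wchar_t *direct = L"€";       // extended character written directly
        const wchar_t *ucn    = L"\u20AC";  // same character as a universal-character-name
        assert(std::wcscmp(direct, ucn) == 0);  // must hold on a conforming implementation
    }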
This is not in a note or footnote, so it is normative. C++0x adds an exception for raw string literals, which may solve your problem at hand, if you have one.
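To illustrate that raw string exception (again my sketch, not code from the original question): inside a raw string literal no escape processing happens, so the \u20AC spelling stays as six ordinary characters instead of naming the euro sign.

    #include <iostream>

    int main() {
        std::cout << R"(\u20AC)" << '\n';  // prints the six characters \u20AC verbatim
        std::cout << "\u20AC" << '\n';     // prints the euro sign (assuming a UTF-8
                                           // narrow execution encoding)
    }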
This program clearly demonstrates the bug:
    #include <iostream>

    // Stringize the argument and paste L onto the front, yielding a wide
    // string literal of the argument's spelling.
    #define GET_UCN(X) L ## #X

    int main() {
        // On a conforming compiler both lines print the same text, because
        // translation phase 1 makes € and \u20AC equivalent.
        std::wcout << GET_UCN("€") << '\n' << GET_UCN("\u20AC") << '\n';
    }
http://ideone.com/lb9jc
Since both string literals are wide, the first should be mangled into several characters if the compiler cannot interpret the multibyte input sequence. In this example, a complete lack of UTF-8 support would cause the compiler to split the euro sign into its three constituent bytes.
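For concreteness, here is a small sketch (my addition, assuming the file is stored as UTF-8) that dumps the three bytes such a compiler would see in place of the single euro sign:

    #include <cstdio>

    int main() {
        const char euro[] = "€";  // assumes this source file is saved as UTF-8
        // Print every byte of the literal except the terminating '\0'.
        for (unsigned i = 0; i + 1 < sizeof euro; ++i)
            std::printf("%02X ", static_cast<unsigned char>(euro[i]));
        std::printf("\n");  // expected: E2 82 AC with a UTF-8 execution charset
    }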
Potatoswatter