C# definitely uses UTF-16. The correct way to define characters above U+FFFF is the escape sequence that takes 8 hexadecimal digits:
string s = "\U0001D11E";
If you use \u1D11E, it is interpreted as the character U+1D11 followed by the letter E, because the \u escape takes exactly four hexadecimal digits.
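A minimal sketch contrasting the two escapes (the class and variable names here are just illustrative):

using System;

class EscapeDemo
{
    static void Main()
    {
        // Correct: \U with 8 hex digits gives U+1D11E (MUSICAL SYMBOL G CLEF),
        // stored as a surrogate pair of two UTF-16 code units.
        string clef = "\U0001D11E";

        // Incorrect: \u consumes exactly 4 hex digits, so this is the
        // character U+1D11 followed by the literal letter 'E'.
        string oops = "\u1D11E";

        Console.WriteLine(char.ConvertToUtf32(clef, 0)); // 119070, i.e. 0x1D11E
        Console.WriteLine(oops.Length);                  // 2: two unrelated characters
    }
}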
When using these characters, keep in mind that the String.Length property and most string methods work with UTF-16 code units, not Unicode characters. From the MSDN documentation:
The Length property returns the number of Char objects in this instance, and not the number of Unicode characters. The reason is that a Unicode character can be represented by more than one Char. Use the System.Globalization.StringInfo class to work with each Unicode character instead of each Char.
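A short sketch of that difference, using the G clef character from above (again, the names are illustrative):

using System;
using System.Globalization;

class LengthDemo
{
    static void Main()
    {
        string s = "\U0001D11E"; // one Unicode character, two UTF-16 code units

        Console.WriteLine(s.Length);                               // 2: counts Char objects
        Console.WriteLine(new StringInfo(s).LengthInTextElements); // 1: counts text elements

        // Iterating over text elements instead of Chars:
        TextElementEnumerator te = StringInfo.GetTextElementEnumerator(s);
        while (te.MoveNext())
            Console.WriteLine((string)te.Current); // the surrogate pair comes back as one element
    }
}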