Strings in Java are encoding-agnostic (they use UTF-16 internally, but that doesn't matter here). The codes that you enter after \u are Unicode code points, not the actual binary representation of characters. Each character has an associated code point; different encodings determine how a given code point is mapped to a binary representation.
In this case, you create a string from code points, and then convert it to a particular encoding using the getBytes() method. For example, the euro sign (€):
"\u20AC".getBytes("UTF-8"); //-30, -126, -84 "\u20AC".getBytes("UTF-16"); //-2, -1, 32, -84 "\u20AC".getBytes("UTF-32"); // 0, 0, 32, -84
It is worth remembering: UTF-16 does not actually use 16 bits per character all the time! Characters outside the Basic Multilingual Plane are stored as surrogate pairs, i.e. two char values (32 bits).
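A minimal sketch illustrating both points above: the euro sign (U+20AC) fits in a single UTF-16 code unit, while an emoji such as U+1F600 (outside the BMP) needs a surrogate pair, so it occupies two char values and four bytes. The class name Utf16Demo is arbitrary.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Utf16Demo {
    public static void main(String[] args) {
        // U+20AC (€) is a BMP character: one char, 3 bytes in UTF-8.
        String euro = "\u20AC";
        System.out.println(euro.length()); // 1
        System.out.println(Arrays.toString(euro.getBytes(StandardCharsets.UTF_8))); // [-30, -126, -84]

        // U+1F600 is outside the BMP: Java stores it as a surrogate pair,
        // so length() reports 2 chars and UTF-16BE needs 4 bytes.
        String emoji = new String(Character.toChars(0x1F600));
        System.out.println(emoji.length()); // 2
        System.out.println(emoji.getBytes(StandardCharsets.UTF_16BE).length); // 4
    }
}
```

Note that String.length() counts char values (UTF-16 code units), not code points; codePointCount() gives the latter.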