JavaScript uses UCS-2 internally.
This means that supplementary Unicode characters (those outside the Basic Multilingual Plane) are exposed as two separate code units (the surrogate halves). For example, '𝌆'.length == 2, even though it is only one Unicode character.
Because of this, if you want to get a Unicode code point for each character in a string, you need to convert the UCS-2 string to an array of UTF-16 code points (where each surrogate pair forms one code point). You can use the Punycode.js functions for this:
    punycode.ucs2.decode('abc'); // [97, 98, 99]
    punycode.ucs2.decode('𝌆'); // [119558]
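To make the surrogate-pair arithmetic concrete, here is a minimal sketch of that kind of decoding written by hand. The function name `toCodePoints` is hypothetical and this is not Punycode.js's actual source; it only illustrates how a high and low surrogate combine into one code point, and it passes unpaired surrogates through unchanged.

```javascript
// Hypothetical helper: converts a string of UTF-16 code units
// into an array of Unicode code points.
function toCodePoints(str) {
  const out = [];
  for (let i = 0; i < str.length; i++) {
    const unit = str.charCodeAt(i);
    // A high surrogate (0xD800-0xDBFF) followed by a low surrogate
    // (0xDC00-0xDFFF) encodes a single supplementary code point.
    if (unit >= 0xD800 && unit <= 0xDBFF && i + 1 < str.length) {
      const next = str.charCodeAt(i + 1);
      if (next >= 0xDC00 && next <= 0xDFFF) {
        out.push((unit - 0xD800) * 0x400 + (next - 0xDC00) + 0x10000);
        i++; // skip the low surrogate that was just consumed
        continue;
      }
    }
    // BMP character or unpaired surrogate: keep the code unit as-is.
    out.push(unit);
  }
  return out;
}

console.log(toCodePoints('abc'));          // [ 97, 98, 99 ]
console.log(toCodePoints('\uD834\uDF06')); // [ 119558 ], i.e. U+1D306
```

In environments that support ES2015, `Array.from(str, c => c.codePointAt(0))` should give the same result without a library, since string iteration there steps through code points rather than code units.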
Mathias Bynens