There is no way to do this in JavaScript natively. (See Riccardo Galli's answer for a modern approach.)
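For comparison, here is a minimal sketch of what that modern approach presumably looks like. This assumes it means the TextEncoder API mentioned just below; the linked answer is not quoted here, and the function name `lengthInUtf8BytesModern` is made up for illustration.

```javascript
// Hypothetical helper, not taken from the linked answer.
// TextEncoder always encodes to UTF-8, so the length of the resulting
// Uint8Array is the number of UTF-8 bytes in the string.
function lengthInUtf8BytesModern(str) {
  return new TextEncoder().encode(str).length;
}
```

This only works where TextEncoder exists, which is why the older technique below can still be useful.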
The rest of this answer is kept for historical reference, and for cases where the TextEncoder APIs are still unavailable.
If you know the character encoding, you can calculate the byte length yourself.
encodeURIComponent assumes UTF-8 as the character encoding, so if that is the encoding you need, you can do this:
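    function lengthInUtf8Bytes(str) {
      // Match only the 10xxxxxx continuation bytes of multibyte sequences;
      // their percent escapes start with the hex digit 8, 9, A or B.
      var m = encodeURIComponent(str).match(/%[89ABab]/g);
      // Caveat: str.length counts UTF-16 code units, so each character
      // outside the BMP (a surrogate pair) is overcounted by one byte.
      return str.length + (m ? m.length : 0);
    }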
This should work because of the way UTF-8 encodes multibyte sequences. The first encoded byte always has either a most significant bit of zero, for a single-byte sequence, or a first hexadecimal digit of C, D, E or F. The second and subsequent bytes are the ones whose first two bits are 10. Those are the extra bytes you want to count in UTF-8.
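To make the lead-byte and continuation-byte distinction concrete, here is a small sketch that labels each UTF-8 byte of a string by its role. The helper name `utf8ByteRoles` is hypothetical; it relies only on encodeURIComponent producing UTF-8 percent escapes, as described above.

```javascript
// Hypothetical helper for illustration: returns one label per UTF-8 byte.
function utf8ByteRoles(str) {
  var encoded = encodeURIComponent(str);
  var roles = [];
  for (var i = 0; i < encoded.length; i++) {
    var byte;
    if (encoded.charAt(i) === '%') {
      // A percent escape "%XX" stands for exactly one byte.
      byte = parseInt(encoded.substring(i + 1, i + 3), 16);
      i += 2;
    } else {
      // Unescaped characters are always plain ASCII, one byte each.
      byte = encoded.charCodeAt(i);
    }
    if (byte < 0x80) {
      roles.push('single');        // 0xxxxxxx
    } else if ((byte & 0xC0) === 0x80) {
      roles.push('continuation');  // 10xxxxxx
    } else {
      roles.push('lead');          // first hex digit C, D, E or F
    }
  }
  return roles;
}

// utf8ByteRoles('\u00e9') returns ['lead', 'continuation'] (bytes C3 A9).
```

The number of 'continuation' entries is what the regular expression approach counts.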
The table on Wikipedia makes this clearer:
    Bits  Last code point  Byte 1    Byte 2    Byte 3
    7     U+007F           0xxxxxxx
    11    U+07FF           110xxxxx  10xxxxxx
    16    U+FFFF           1110xxxx  10xxxxxx  10xxxxxx
    ...
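The same table can also be read as a direct lookup from code point to byte count. The sketch below is one way to do that, not part of the original technique: the name `utf8LengthByCodePoint` is hypothetical, it assumes an ES2015 environment (for...of and codePointAt), and the 4-byte case for code points above U+FFFF is an assumption based on how UTF-8 is defined, since the table above is cut off before that row.

```javascript
// Hypothetical helper: sums the UTF-8 byte count of each code point.
function utf8LengthByCodePoint(str) {
  var bytes = 0;
  // for...of iterates by code point, so a surrogate pair is visited once.
  for (var ch of str) {
    var cp = ch.codePointAt(0);
    if (cp <= 0x7F) {
      bytes += 1;   // up to U+007F
    } else if (cp <= 0x7FF) {
      bytes += 2;   // up to U+07FF
    } else if (cp <= 0xFFFF) {
      bytes += 3;   // up to U+FFFF
    } else {
      bytes += 4;   // assumption: everything above U+FFFF takes 4 bytes
    }
  }
  return bytes;
}
```

Because it walks code points rather than UTF-16 code units, this variant does not depend on str.length.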
If instead you need the length in the page's own encoding, you can use this trick:
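    function lengthInPageEncoding(s) {
      // Let the browser percent-encode the string as a URL fragment.
      var a = document.createElement('A');
      a.href = '#' + s;
      var sEncoded = a.href;
      sEncoded = sEncoded.substring(sEncoded.indexOf('#') + 1);
      // Each %XX escape is three characters but only one byte.
      // The i flag covers browsers that emit upper-case hex digits.
      var m = sEncoded.match(/%[0-9a-f]{2}/gi);
      return sEncoded.length - (m ? m.length * 2 : 0);
    }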
Mike Samuel Apr 01 2018-11-11T00: 00Z