You want to decode UTF-16, not convert to UTF-8. Decoding means the result is a string of abstract characters. Of course, strings do have an internal encoding (UTF-16, or UCS-2 in older JavaScript engines), but that is an implementation detail.
With strings, the goal is that you don't need to worry about the encoding at all, only about manipulating characters "as is". Thus, you can write string methods that don't require any input decoding. Of course, there are edge cases where this abstraction falls apart.
You cannot decode UTF-16 by simply deleting the zero bytes. That will work fine for the first 256 Unicode code points, but you will get garbage if any of the other ~110,000 characters in Unicode is used. You can't even handle the most popular non-ASCII characters, like the em dash or smart quotes.
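To see why, here is a sketch of the naive zero-stripping approach applied to an em dash (U+2014), which is encoded in UTF-16LE as the bytes 0x14 0x20, neither of which is zero:

```javascript
// UTF-16LE bytes for "a\u2014b" (em dash U+2014): 61 00 14 20 62 00
var bytes = "\x61\x00\x14\x20\x62\x00";

// Naive "delete the zeros" decoding
var naive = bytes.split("").filter(function (c) {
    return c !== "\x00";
}).join("");

// naive is "a\u0014 b" (a control character and a space), not "a\u2014b"
```

The zero bytes that the naive approach relies on are only there for characters below U+0100; everything else keeps both bytes and comes out as two wrong characters.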
Also, judging by your example, the data is UTF-16LE.
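If you can't be sure of the byte order in advance, a byte order mark at the start of the data tells you. This is a hypothetical helper (the name `detectUTF16Endianness` is my own) that sniffs an optional BOM from a binary string like the one `atob` returns:

```javascript
// Sniff the byte order from an optional BOM (U+FEFF) at the start.
function detectUTF16Endianness(binaryStr) {
    var b0 = binaryStr.charCodeAt(0);
    var b1 = binaryStr.charCodeAt(1);
    if (b0 === 0xff && b1 === 0xfe) return "LE"; // FF FE: BOM stored little-endian
    if (b0 === 0xfe && b1 === 0xff) return "BE"; // FE FF: BOM stored big-endian
    return "unknown"; // no BOM: guess from context
}
```

Your example has no BOM, so the guess has to come from the pattern of the bytes themselves (ASCII text in UTF-16LE has a zero byte in every second position).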
```javascript
// Braindead decoder that assumes fully valid input
function decodeUTF16LE(binaryStr) {
    var cp = [];
    for (var i = 0; i < binaryStr.length; i += 2) {
        // Combine each little-endian byte pair into one UTF-16 code unit
        cp.push(binaryStr.charCodeAt(i) | (binaryStr.charCodeAt(i + 1) << 8));
    }
    return String.fromCharCode.apply(String, cp);
}

// In Chrome and Firefox, atob is a native method available for base64 decoding
var base64decode = atob;

var base64 = "VABlAHMAdABpAG4AZwA";
var binaryStr = base64decode(base64);
var result = decodeUTF16LE(binaryStr); // "Testing"
```
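On newer platforms there is also a less hand-rolled route: `TextDecoder` has a built-in `"utf-16le"` decoder. This sketch assumes `atob` and `TextDecoder` are available, as they are in modern browsers and recent Node versions:

```javascript
// Decode base64-encoded UTF-16LE data using the platform's TextDecoder.
function decodeBase64UTF16LE(base64) {
    var binaryStr = atob(base64); // one character per byte
    var bytes = new Uint8Array(binaryStr.length);
    for (var i = 0; i < binaryStr.length; i++) {
        bytes[i] = binaryStr.charCodeAt(i);
    }
    return new TextDecoder("utf-16le").decode(bytes);
}

decodeBase64UTF16LE("VABlAHMAdABpAG4AZwA="); // "Testing"
```

Unlike the braindead decoder above, `TextDecoder` also handles malformed input gracefully (it substitutes U+FFFD rather than producing lone surrogates).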
Now you can even use smart quotes:
```javascript
var base64 = "HCBoAGUAbABsAG8AHSA=";
var binaryStr = base64decode(base64);
var result = decodeUTF16LE(binaryStr); // "\u201Chello\u201D"
```