Unicode in Rhino

For some reason, Unicode strings don't behave correctly in Rhino, Mozilla's JavaScript engine. If I enter Unicode text in the REPL or manipulate it, the REPL returns gibberish.

    js> 'тотальная киборгизация'
    B>B0;L=0O :81>@3870F8O
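One observation about the gibberish above, offered as a hint rather than a confirmed diagnosis: it is exactly what you get if every 16-bit code unit of a Cyrillic string is truncated to its low 8 bits. The sketch below assumes the input was `'тотальная киборгизация'` (reconstructed by adding 0x0400 to each garbled character, not taken from a log), and `lowBytes` is a hypothetical helper name. It uses `console.log`, so it targets Node.js; in the Rhino shell, `print` would be the equivalent.

```javascript
// Keep only the low 8 bits of every 16-bit code unit in a string.
// This mimics a writer that treats each character as a single byte.
function lowBytes(text) {
  var out = '';
  for (var i = 0; i < text.length; i++) {
    out += String.fromCharCode(text.charCodeAt(i) & 0xFF);
  }
  return out;
}

// Assumed input: reconstructed from the gibberish, not from the original post.
var input = 'тотальная киборгизация';
var truncated = lowBytes(input);

console.log(truncated); // prints B>B0;L=0O :81>@3870F8O
```

If this holds, the string inside the engine is probably intact and the damage happens at a byte-oriented I/O boundary. The sketch does not show which boundary that is.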

ASCII characters work fine.

    js> 'reprap for everyone'
    reprap for everyone

Unix commands work just fine too:

 $ echo ' '   

JVM output is also fine. Compiling and running

    class Test {
        public static void main(String[] args) {
            System.out.println(" ");
        }
    }

displays the Cyrillic alphabet correctly.

Versions of Java and Rhino:

    $ java -version
    java version "1.7.0_09"
    OpenJDK Runtime Environment (IcedTea7 2.3.3) (7u9-2.3.3-0ubuntu1~12.10.1)
    OpenJDK 64-Bit Server VM (build 23.2-b09, mixed mode)
    $ rhino
    Rhino 1.7 release 3 2012 05 18

Locale settings:

    $ echo $LC_TYPE

    $ echo $LANG
    en_US.UTF-8

Changing LC_ALL to en_US.UTF-8 does not help.

Is this related to the Stack Overflow question about JavaScript using UCS-2?

What is the problem, and how can I get correct Unicode handling in the Rhino REPL?

1 answer

It should be noted that JavaScript does not handle Unicode fully, since its string model predates UTF-16. It exposes strings as sequences of 16-bit code units, in the manner of UCS-2. That is similar to UTF-16 but not the same: a character outside the Basic Multilingual Plane shows up as two separate code units.

This writeup explains the problem well and provides libraries and workarounds.
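To make the 16-bit code unit point concrete, here is a minimal sketch in ES5-style JavaScript. `countCodePoints` is a hypothetical helper written for this illustration, not a built-in. The sample strings are written with `\u` escapes so the result does not depend on how the source file is decoded.

```javascript
// Count Unicode code points by treating a valid surrogate pair as one character.
function countCodePoints(text) {
  var count = 0;
  for (var i = 0; i < text.length; i++) {
    var unit = text.charCodeAt(i);
    // A high surrogate followed by a low surrogate encodes a single code point.
    if (unit >= 0xD800 && unit <= 0xDBFF && i + 1 < text.length) {
      var next = text.charCodeAt(i + 1);
      if (next >= 0xDC00 && next <= 0xDFFF) {
        i++; // skip the low surrogate
      }
    }
    count++;
  }
  return count;
}

var cyrillic = '\u0442\u043E\u0442'; // three Cyrillic letters, all inside the BMP
var clef = '\uD834\uDD1E';           // U+1D11E MUSICAL SYMBOL G CLEF as a surrogate pair

console.log(cyrillic.length, countCodePoints(cyrillic)); // 3 3
console.log(clef.length, countCodePoints(clef));         // 2 1
```

Cyrillic letters lie inside the Basic Multilingual Plane, so each one is a single code unit and `length` matches the character count. The mismatch only appears for characters above U+FFFF.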
