How to use ICU with UTF-16?

Question

I am looking into using ICU to process Unicode strings in my own Node.js module, because as far as I can tell v8::String (according to its documentation) has no C++ API for this.

As far as I know, V8 expects UTF-16 in ExternalStringResource and other APIs, so I would like to use ICU to handle UTF-16.
I need to:

Iterate over the characters (not just the 16-bit code units) of a UTF-16 string
Report the number of characters (not just 16-bit code units) in a UTF-16 string

So, I looked at the ICU documentation and found UnicodeString and CharacterIterator. However, UnicodeString does not have a fromUTF16 method, only fromUTF8 and fromUTF32.

Another thing I'm not sure about is whether the UnicodeString constructors copy the data I give them. I would much prefer a zero-copy approach, where I work with an immutable object that performs no copy operations and just uses the buffer I point it to.

I'm also not sure if I can just use UCharIterator (assuming that I can somehow get a UChar* from my UTF-16 strings).

So my question is: how can I use ICU for the above purposes?

Thanks in advance for your answers!

c++ node.js unicode utf-16 icu

Venemo Nov 07 '13 at 16:59

1 answer

R. Martinho Fernandes · Accepted Answer · 2013-11-07T17:04:42+0000

UnicodeString already uses UTF-16 internally. That is why it only has fromUTF8 and fromUTF32: data that is already UTF-16 needs no conversion.

By default, yes, the constructors copy the data. UnicodeString has value semantics, like std::string.

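As for UCharIterator, yes, you can use it. UChar is a 16-bit type. Exactly which 16-bit type depends on the platform, and it can be overridden by defining UCHAR_TYPE: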

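Define UChar to be UCHAR_TYPE, if that is #defined (for example, to char16_t), or wchar_t if that is 16 bits wide; always assumed to be unsigned.
If neither is available, then define UChar to be uint16_t.
This makes the definition of UChar platform-dependent but allows direct string type compatibility with platforms with 16-bit wchar_t types.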
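
To make the difference between code units and characters (code points) concrete, here is a minimal sketch in plain standard C++ that walks a UTF-16 buffer in place. It is not ICU code: next_code_point, code_points and count_code_points are hypothetical helper names made up for this illustration, and the sketch assumes the buffer is already available as const char16_t*. Unpaired surrogates are simply passed through as single code points.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not an ICU function): decodes the code point that
// starts at units[i] and advances i past it. An unpaired surrogate is
// returned unchanged as a single code point.
char32_t next_code_point(const char16_t* units, std::size_t length, std::size_t& i) {
    assert(i < length);  // precondition: there is at least one unit left
    char32_t c = units[i++];
    const bool is_lead = c >= 0xD800 && c <= 0xDBFF;
    if (is_lead && i < length) {
        const char32_t trail = units[i];
        if (trail >= 0xDC00 && trail <= 0xDFFF) {
            ++i;  // consume the trail surrogate as part of this code point
            c = 0x10000 + ((c - 0xD800) << 10) + (trail - 0xDC00);
        }
    }
    return c;
}

// Reads the buffer in place and returns its code points in order.
std::vector<char32_t> code_points(const char16_t* units, std::size_t length) {
    std::vector<char32_t> out;
    for (std::size_t i = 0; i < length;) {
        out.push_back(next_code_point(units, length, i));
    }
    return out;
}

// Counts code points rather than 16-bit code units.
std::size_t count_code_points(const char16_t* units, std::size_t length) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < length; ++n) {
        next_code_point(units, length, i);
    }
    return n;
}
```

For a std::u16string s holding u"a\U0001F600b", s.size() is 4 code units while count_code_points(s.data(), s.size()) is 3. ICU covers the same ground with the U16_NEXT macro from unicode/utf16.h and with UnicodeString::countChar32. If avoiding the copy matters, UnicodeString also has a read-only aliasing constructor that takes an isTerminated flag, the buffer and its length; check the class documentation for the lifetime rules before relying on it.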
