I have a sequence of Unicode code points. What I really need is to iterate over them as a sequence of characters rather than as individual code points, and to determine the properties of each character, for example whether it is a letter.
For example, imagine that I wrote a Unicode-aware text field and the user entered a character made up of more than one code point, such as an "e" with a diacritic. I know that this particular character can also be represented as a single code point, and that it can be normalized to that form, but I don't think that is possible in the general case. How would I implement backspace? Obviously it can't just delete the last code point, because the user may have entered a character made up of more than one code point.
How can I iterate over a sequence of Unicode code points as characters?
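To make the behaviour I'm after concrete, here is a rough sketch in Python. It assumes a much simpler rule than real grapheme cluster segmentation (UAX #29): any code point whose general category is a combining mark (Mn, Mc, Me) is attached to the code point before it. The names `is_extender`, `iter_clusters`, and `backspace` are made up for illustration, and the sketch ignores things like zero-width joiner sequences and Hangul jamo.

```python
import unicodedata


def is_extender(ch: str) -> bool:
    # Simplifying assumption: only combining marks (Mn, Mc, Me) extend
    # the preceding character. Real segmentation has more rules.
    return unicodedata.category(ch) in ("Mn", "Mc", "Me")


def iter_clusters(text: str):
    """Yield user-perceived characters: a base code point plus any
    combining marks that follow it."""
    cluster = ""
    for ch in text:
        # A non-extending code point starts a new cluster.
        if cluster and not is_extender(ch):
            yield cluster
            cluster = ""
        cluster += ch
    if cluster:
        yield cluster


def backspace(text: str) -> str:
    """Remove the last user-perceived character, not just the last code point."""
    clusters = list(iter_clusters(text))
    return "".join(clusters[:-1])


if __name__ == "__main__":
    s = "ae\u0301"  # 'a', then 'e' followed by COMBINING ACUTE ACCENT
    print([len(c) for c in iter_clusters(s)])
    print(backspace(s) == "a")
```

With this rule, backspacing over "e" plus a combining acute accent removes both code points at once, which is what I would expect from a text field.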
Edit: The break iterators offered by ICU seem to be pretty much what I need. However, I am not using ICU, so any reference on how to implement my own equivalent functionality will be the accepted answer.
Another edit: It turns out that the Windows API does offer this functionality. MSDN just isn't very good at listing all the string functions in one place. CharNext is the function I was looking for.
Puppy