I have a twelve-year-old Windows program. As may be obvious to those in the know, it was designed for ASCII characters, not Unicode. Most of it has been converted, but there is one place that still needs to be changed. However, there is a serious constraint: the exact same ASCII sequence MUST be produced by different pieces of code, some of which will run on systems other than Windows.
I am trying to determine whether UTF-8 will do the trick. I have heard in passing that different UTF-8 sequences may represent the same Unicode string, which would be a problem here.
So the question is: given the Unicode string, can I expect one canonical UTF-8 sequence to be generated by any standards-compliant converter implementation? Or are there several possibilities?
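To make the concern concrete, here is a minimal Python 3 sketch of the kind of check involved. It is only an illustration, not a statement about any particular converter: the helper name `utf8_bytes` and the sample strings are made up for this example. It relies on two things that hold for Python's built-in codec: a fixed sequence of code points encodes to one fixed byte sequence, and "overlong" byte sequences are rejected as invalid. It also shows the case that can look like "several UTF-8 sequences for one string": the same visible text stored as different code points (precomposed versus combining form) yields different bytes until both sides normalize to the same form.

```python
import unicodedata


def utf8_bytes(text: str) -> bytes:
    """Hypothetical helper: encode a string of code points as UTF-8."""
    return text.encode("utf-8")


# The same visible character written as two different code point sequences.
composed = "\u00e9"      # U+00E9, precomposed e with acute accent
decomposed = "e\u0301"   # U+0065 followed by U+0301 combining acute accent

composed_bytes = utf8_bytes(composed)
decomposed_bytes = utf8_bytes(decomposed)

# Different code points give different bytes, even though the text looks alike.
bytes_differ = composed_bytes != decomposed_bytes

# Normalizing both sides to one form (NFC here) makes the bytes agree.
normalized_bytes = utf8_bytes(unicodedata.normalize("NFC", decomposed))
bytes_agree_after_nfc = normalized_bytes == composed_bytes

# An overlong two-byte spelling of "/" (U+002F) is not valid UTF-8,
# so a strict decoder refuses it instead of treating it as an alternative.
try:
    b"\xc0\xaf".decode("utf-8")
    overlong_rejected = False
except UnicodeDecodeError:
    overlong_rejected = True
```

If this picture is right, the byte-for-byte requirement would depend on every producer agreeing on the code points (for example, by normalizing to the same form first), not on the UTF-8 step itself.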