Unicode version in .NET.

The CharUnicodeInfo.GetUnicodeCategory documentation says:

Please note that CharUnicodeInfo.GetUnicodeCategory does not always return the same UnicodeCategory as the Char.GetUnicodeCategory method when passing a specific character as a parameter.

The CharUnicodeInfo.GetUnicodeCategory method is intended to reflect the current version of the Unicode standard. In contrast, although the Char.GetUnicodeCategory method typically reflects the current version of the Unicode standard, it can return a character category based on a previous version of the standard, or it can return a category that is different from the current standard to maintain backward compatibility.

So, which version of the Unicode standard do CharUnicodeInfo.GetUnicodeCategory and Char.GetUnicodeCategory reflect in each version of the .NET Framework?
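
To see where the two methods actually disagree on a given runtime, one option is to compare them over every UTF-16 code unit. The following is a minimal sketch, not an authoritative test: the class name CategoryDiff is made up for illustration, and the output depends entirely on the runtime and operating system it runs on.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: lists every UTF-16 code unit for which the two
// methods report a different UnicodeCategory on the current runtime.
static class CategoryDiff
{
    static void Main()
    {
        int mismatches = 0;

        // Use an int counter so the loop can include char.MaxValue without overflowing.
        for (int i = char.MinValue; i <= char.MaxValue; i++)
        {
            char c = (char)i;
            UnicodeCategory viaChar = char.GetUnicodeCategory(c);
            UnicodeCategory viaInfo = CharUnicodeInfo.GetUnicodeCategory(c);

            if (viaChar != viaInfo)
            {
                mismatches++;
                Console.WriteLine("U+{0:X4}: Char={1}, CharUnicodeInfo={2}", i, viaChar, viaInfo);
            }
        }

        Console.WriteLine("Mismatches: {0}", mismatches);
        Console.WriteLine("Runtime version: {0}", Environment.Version);
    }
}
```

Running this on different framework versions and operating systems shows which characters are affected, although it does not by itself say which Unicode version the data comes from.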

4 answers

The documentation for the String class states which version of the Unicode standard the .NET Framework 4 and 4.5 conform to:

.NET Framework 4

In the .NET Framework 4, sorting, casing, normalization, and Unicode information are synchronized with Windows 7 and comply with the Unicode 5.1 standard.

.NET Framework 4.5

In the .NET Framework 4.5 running on Windows 8, sorting, casing, normalization, and Unicode information conform to the Unicode 6.0 standard. On other operating systems, they conform to the Unicode 5.0 standard.


As far as I can tell, the Unicode version is not recorded anywhere. Character lookup is implemented by storing the character information in an embedded resource named "charinfo.nlp" in mscorlib.dll, which is used as an internal lookup table. The header of this table has a "version" property, but in the binary data (offset 0x20) it is "0", so I'm not sure which version it is, or whether the field is simply not implemented.
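
For anyone who wants to check this, here is a rough sketch that dumps the bytes at that offset. It assumes the resource can be opened through GetManifestResourceStream under the name "charinfo.nlp", which should hold only on the .NET Framework versions discussed here. The 8-byte read length and the class name CharInfoHeader are my own assumptions, not something documented.

```csharp
using System;
using System.IO;
using System.Reflection;

// Hypothetical helper: prints the raw bytes at offset 0x20 of the
// "charinfo.nlp" resource, where the answer above locates the version field.
static class CharInfoHeader
{
    static void Main()
    {
        // The assembly that defines System.Object (mscorlib.dll on the .NET Framework).
        Assembly coreLib = typeof(object).Assembly;

        using (Stream stream = coreLib.GetManifestResourceStream("charinfo.nlp"))
        {
            if (stream == null)
            {
                Console.WriteLine("charinfo.nlp is not an embedded resource in this runtime.");
                return;
            }

            const int versionOffset = 0x20; // offset taken from the answer above
            const int bytesToDump = 8;      // assumption: the width of the field is not documented

            if (stream.Length < versionOffset + bytesToDump)
            {
                Console.WriteLine("Resource is shorter than expected.");
                return;
            }

            stream.Seek(versionOffset, SeekOrigin.Begin);
            byte[] buffer = new byte[bytesToDump];
            int read = stream.Read(buffer, 0, buffer.Length);

            // Hex dump of whatever was read, e.g. "00-00-..." if the field is zeroed.
            Console.WriteLine(BitConverter.ToString(buffer, 0, read));
        }
    }
}
```

If the stream is null, the runtime does not ship the table as an embedded resource under that name and the sketch does not apply.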


As Michael Kaplan states:

Version released by the Unicode Consortium.

Because in fact there is no definitive answer to this very non-specific question. The answer always depends entirely on the [usually one] specific question that the person wants answered.

So, the polite answer in the end is IT DEPENDS ON WHAT YOU CAN. CAN YOU BATTLE A BIT?


This page contains a wiki comment from Shawn Steele from Microsoft, which I think should explain why using CharUnicodeInfo is preferable.

