I looked in mscorelib.dll with .NET Reflector and came across a Char class. I always wondered how methods like Char.isLetter were made. I was expecting a huge list of tests, but having bought a little, I found a very short code that defines the Unicode category. However, this code uses some tables and some sliding voodoo battles. Can someone explain to me how this is done, or point me to some resources?
EDIT:
Here is the code. This is in System.Globalization.CharUnicodeInfo.
internal static unsafe byte InternalGetCategoryValue(int ch, int offset)
{
ushort num = s_pCategoryLevel1Index[ch >> 8];
num = s_pCategoryLevel1Index[num + ((ch >> 4) & 15)];
byte* numPtr = (byte*) (s_pCategoryLevel1Index + num);
byte num2 = numPtr[ch & 15];
return s_pCategoriesValue[(num2 * 2) + offset];
}
s_pCategoryLevel1Indexis short*and s_pCategoryValuesisbyte*
Both are created in the static constructor of CharUnicodeInfo:
static unsafe CharUnicodeInfo()
{
s_pDataTable = GlobalizationAssembly.GetGlobalizationResourceBytePtr(typeof(CharUnicodeInfo).Assembly, "charinfo.nlp");
UnicodeDataHeader* headerPtr = (UnicodeDataHeader*) s_pDataTable;
s_pCategoryLevel1Index = (ushort*) (s_pDataTable + headerPtr->OffsetToCategoriesIndex);
s_pCategoriesValue = s_pDataTable + ((byte*) headerPtr->OffsetToCategoriesValue);
s_pNumericLevel1Index = (ushort*) (s_pDataTable + headerPtr->OffsetToNumbericIndex);
s_pNumericValues = s_pDataTable + ((byte*) headerPtr->OffsetToNumbericValue);
s_pDigitValues = (DigitValues*) (s_pDataTable + headerPtr->OffsetToDigitValue);
nativeInitTable(s_pDataTable);
}
Here is the UnicodeDataHeader.
internal struct UnicodeDataHeader
{
[FieldOffset(40)]
internal uint OffsetToCategoriesIndex;
[FieldOffset(0x2c)]
internal uint OffsetToCategoriesValue;
[FieldOffset(0x34)]
internal uint OffsetToDigitValue;
[FieldOffset(0x30)]
internal uint OffsetToNumbericIndex;
[FieldOffset(0x38)]
internal uint OffsetToNumbericValue;
[FieldOffset(0)]
internal char TableName;
[FieldOffset(0x20)]
internal ushort version;
}
Note. . Hope this does not violate any licenses. If so, I will remove the code.