I was reading this article because Jon Skeet linked to it in this answer. I'm trying to understand how hashing works, and why Jon likes the algorithm he provided so much. I don't claim to have an answer to that yet, but I do have a specific question about the base implementation of System.String's GetHashCode.
Consider the code below; the line I am asking about is marked with an arrow (<<<<<==========):
public override unsafe int GetHashCode()
{
    if (HashHelpers.s_UseRandomizedStringHashing)
        return string.InternalMarvin32HashString(this, this.Length, 0L);
    fixed (char* chPtr = this)
    {
        int num1 = 352654597;
        int num2 = num1;
        int* numPtr = (int*) chPtr;
        int length = this.Length;
        while (length > 2)
        {
            num1 = (num1 << 5) + num1 + (num1 >> 27) ^ *numPtr;
            num2 = (num2 << 5) + num2 + (num2 >> 27) ^ numPtr[1];
            numPtr += 2;
            length -= 4; // <<<<<==========
        }
        if (length > 0)
            num1 = (num1 << 5) + num1 + (num1 >> 27) ^ *numPtr;
        return num1 + num2 * 1566083941;
    }
}
Why do they only process every fourth character? And, if you are willing, why do they process it from right to left?
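To make the question concrete, here is a minimal sketch (my own, not framework code) that mirrors the loop's pointer arithmetic in safe C# and records which character indices each read would touch. The names StrideTrace and CharsRead are hypothetical. It rests on two assumptions: that each int read through the int* covers two adjacent 16-bit chars, and that the string data is followed by a terminating '\0' in memory.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: models the reads in GetHashCode without unsafe code.
static class StrideTrace
{
    // Returns the char indices covered by each int-sized read, in read order.
    // An index equal to `length` stands for the assumed terminating '\0'.
    public static List<int> CharsRead(int length)
    {
        var indices = new List<int>();
        int pos = 0;             // char index that numPtr would point at
        int remaining = length;  // mirrors the local `length` in GetHashCode

        while (remaining > 2)
        {
            indices.Add(pos);     indices.Add(pos + 1); // *numPtr    (assumed: 2 chars)
            indices.Add(pos + 2); indices.Add(pos + 3); // numPtr[1]  (assumed: 2 chars)
            pos += 4;        // numPtr += 2 moves two ints, i.e. four chars
            remaining -= 4;  // the line marked with the arrow
        }
        if (remaining > 0)
        {
            indices.Add(pos); indices.Add(pos + 1);     // final *numPtr
        }
        return indices;
    }
}
```

If those assumptions hold, StrideTrace.CharsRead(10) yields the indices 0 through 9 in ascending order. That would suggest length -= 4 counts the four chars consumed per iteration rather than skipping any, and that the walk goes left to right. I would still like confirmation that this reading is correct.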