Why does String.GetHashCode process only every fourth character?

Question

I was reading this article because it was linked from this answer by Jon Skeet. I'm trying to understand how hashing works, and why Jon likes the algorithm he provided so much. I'm not claiming to have an answer to that yet, but I do have a specific question about the base System.String implementation of GetHashCode.

Consider the code below, paying attention to the line annotated with <<<<<==========:

public override unsafe int GetHashCode()
{
  if (HashHelpers.s_UseRandomizedStringHashing)
    return string.InternalMarvin32HashString(this, this.Length, 0L);
  fixed (char* chPtr = this)
  {
    int num1 = 352654597;
    int num2 = num1;
    int* numPtr = (int*) chPtr;
    int length = this.Length;
    while (length > 2)
    {
      num1 = (num1 << 5) + num1 + (num1 >> 27) ^ *numPtr;
      num2 = (num2 << 5) + num2 + (num2 >> 27) ^ numPtr[1];
      numPtr += 2;
      length -= 4;   // <<<<<==========
    }
    if (length > 0)
      num1 = (num1 << 5) + num1 + (num1 >> 27) ^ *numPtr;
    return num1 + num2 * 1566083941;
  }
}

Why do they only process every fourth character? And, if you're willing, why do they process it from right to left?

+3

c# algorithm hash

Mike Perrenoud Dec 04 '13 at 18:25

3 answers

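They don't. The key is this line:

int* numPtr = (int*) chPtr;

It takes an int*, not a char*, and assigns it to numPtr. So each of these statements:

num1 = (num1 << 5) + num1 + (num1 >> 27) ^ *numPtr;
num2 = (num2 << 5) + num2 + (num2 >> 27) ^ numPtr[1];

reads 4 bytes, that is, 2 characters.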

+4

MarcinJuraszek Dec 04 '13 at 18:28

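numPtr is a pointer to a 32-bit integer.
Each iteration of the loop reads two 32-bit values (*numPtr and numPtr[1]).

Therefore, each iteration does numPtr += 2 (to skip 2 32-bit values) and length -= 4 (to skip 4 16-bit chars).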
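
The size relationship can be checked without unsafe code. The following is only a sketch: it assumes a runtime where string.AsSpan and MemoryMarshal.Cast are available (.NET Core 2.1 or later), and the class and method names are made up for illustration. It reinterprets the chars of a string as 32-bit ints, which is roughly what the (int*) cast in the question does.

```csharp
using System;
using System.Runtime.InteropServices;

public static class PackedCharsSketch
{
    // Hypothetical helper: views the UTF-16 chars of a string as 32-bit ints,
    // a safe-code analogue of the (int*)chPtr cast in the question.
    public static int[] AsInts(string s)
    {
        // Cast silently drops a trailing odd char; the pointer version
        // instead reads into the string's null terminator.
        ReadOnlySpan<int> ints = MemoryMarshal.Cast<char, int>(s.AsSpan());
        return ints.ToArray();
    }

    public static void Main()
    {
        int[] ints = AsInts("abcdefgh");

        // 8 chars * 2 bytes = 16 bytes = 4 ints, so one loop iteration
        // (two int reads) consumes 4 chars.
        Console.WriteLine(ints.Length); // prints 4

        // On little-endian hardware the first value is 0x00620061:
        // 'a' (0x61) in the low half, 'b' (0x62) in the high half.
        foreach (int v in ints)
            Console.WriteLine($"0x{v:X8}");
    }
}
```

Every char shows up inside one of the ints, so none of them is skipped.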

+3

SLaks Dec 04 '13 at 18:29

Reed Copsey · Accepted Answer · 2013-12-04T18:28:44+0000

Why do they only process every fourth character? And, if you're willing, why do they process it from right to left?

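They don't. They process the string in blocks of integer values (note the use of *numPtr and numPtr[1] in the while loop). Two Int32 values take up the same number of bytes as 4 characters, so each iteration processes 4 characters and length is reduced by 4.

This processes the characters in order (not in reverse), and it is most likely done this way for performance. In other words, the "every 4th character" reading is not what happens: every character feeds into the hash.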
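
To see that every character contributes, the loop from the question can be restated without pointers. This is only a sketch: it assumes little-endian packing of two chars per int and treats reads at or past the end of the string as 0 (standing in for the null terminator). The names Pack, Hash and CountSensitivePositions are illustrative and not part of the framework.

```csharp
using System;

public static class LegacyHashSketch
{
    // Packs chars i and i+1 into one int, low half first (little-endian assumed).
    // Indexes at or past the end yield 0, standing in for the null terminator.
    static int Pack(string s, int i)
    {
        int lo = i < s.Length ? s[i] : 0;
        int hi = i + 1 < s.Length ? s[i + 1] : 0;
        return lo | (hi << 16);
    }

    // Same arithmetic as the loop quoted in the question, using indexes.
    public static int Hash(string s)
    {
        unchecked
        {
            int num1 = 352654597;
            int num2 = num1;
            int i = 0;
            int length = s.Length;
            while (length > 2)
            {
                num1 = ((num1 << 5) + num1 + (num1 >> 27)) ^ Pack(s, i);
                num2 = ((num2 << 5) + num2 + (num2 >> 27)) ^ Pack(s, i + 2);
                i += 4;       // four chars consumed per iteration
                length -= 4;
            }
            if (length > 0)
                num1 = ((num1 << 5) + num1 + (num1 >> 27)) ^ Pack(s, i);
            return num1 + num2 * 1566083941;
        }
    }

    // Counts the positions where replacing the char changes the hash.
    public static int CountSensitivePositions(string s, char replacement)
    {
        int baseline = Hash(s);
        int count = 0;
        for (int i = 0; i < s.Length; i++)
        {
            char[] chars = s.ToCharArray();
            chars[i] = replacement;
            if (Hash(new string(chars)) != baseline) count++;
        }
        return count;
    }

    public static void Main()
    {
        // All four positions of a 4-char string affect the result.
        Console.WriteLine(CountSensitivePositions("abcd", 'X')); // prints 4
    }
}
```

If only every fourth character were hashed, three of the four replacements would leave the result unchanged. Here all four change it, because chars 0 and 1 are folded into num1 and chars 2 and 3 into num2.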
