This question is not about why one multiplies, which is pretty obvious about its distribution.
Why use a prime in hashCode?
But rather, it is more about one property of multiplication, which becomes more important, the more factors are included in the hash code calculation formula.
A simple calculation can obviously overwhelm, but that doesn't really matter.
a * 31 + b
The real problem is when many elements are in the formula.
((a * 31) + b) * 31 ... 6n.
After more than 5 or 6 terms include the value of the first term, it is lost because its bits overflow by the time the hash code reaches the inclusion of 5+. Using this system, only the last 5 or so terms are really significant for the final meaning.
31 ^ 7 > Integer.MAX_VALUE
So why do most calculations roll bits that overflow back, and xor w / lower bits of the result. I understand that this requires some bit, and the calculations must be done using longs (64 bits), so the top 32 bits can be XOR'd with the whole result, but at least no bits will be lost.
Is there any special reason why overflow is ignored? Its not so expensive to use a long one, as described earlier.
EDIT
100000*31^7= 2751261411100000 0x9C641F717C560 6553600000*31^7 180306667837849600000 0xC641F717C5600000
Note that the latter value is exactly 65,536 times the previous one, which also means that its response is 16 bits larger. Note that the integer value 0xC641F717C5600000 is 0xC5600000, the actual significant values ββare lost from the 16-bit value.
*SAMPLE A* 65536*4096*27512614111 =7385361114638319616 =0x667E12CDF0000000 12345678 =0xF0000000 *SAMPLE B* 9*65536*4096*27512614111 =66468250031744876544 =0x9A6EA93D70000000 12345678 =0x70000000
Note that the topmost bit of SAMPLE B, which is exactly 9x SAMPLE A, is practically no different from the final 32-bit value - if I changed 9x to 17x, then the lower bits would be identical. However, if the upper most bits were not "lost" due to overflow and xord with lower 32 bits, then the value will be different.