Good hash for small class? (override GetHashCode)

I use some identity classes / structs that contain 1-2 ints, possibly a DateTime or a small string. I use them as keys in a dictionary.

What would be a good GetHashCode override for something like this? Something very simple, but still reasonably performant.

thanks

2 answers

The accepted answer to this SO question is the method I use.

What is the best algorithm for overriding System.Object.GetHashCode?
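As a rough sketch of the approach commonly recommended for this, each field can be folded into a running total with a small prime multiplier. The type name `OrderKey` and its fields are hypothetical, chosen only to match the kind of key described in the question; the constants 17 and 23 are a conventional choice, not the only valid one.

```csharp
using System;

// Hypothetical key type: two ints, a DateTime and a small string.
public readonly struct OrderKey : IEquatable<OrderKey>
{
    public int CustomerId { get; }
    public int OrderNumber { get; }
    public DateTime Created { get; }
    public string Region { get; }

    public OrderKey(int customerId, int orderNumber, DateTime created, string region)
    {
        CustomerId = customerId;
        OrderNumber = orderNumber;
        Created = created;
        Region = region;
    }

    // Equality uses exactly the fields that feed the hash code.
    public bool Equals(OrderKey other) =>
        CustomerId == other.CustomerId
        && OrderNumber == other.OrderNumber
        && Created == other.Created
        && string.Equals(Region, other.Region, StringComparison.Ordinal);

    public override bool Equals(object obj) => obj is OrderKey other && Equals(other);

    public override int GetHashCode()
    {
        unchecked // let integer overflow wrap around instead of throwing
        {
            int hash = 17;
            hash = hash * 23 + CustomerId;
            hash = hash * 23 + OrderNumber;
            hash = hash * 23 + Created.GetHashCode();
            hash = hash * 23 + (Region?.GetHashCode() ?? 0); // null-safe
            return hash;
        }
    }
}
```

Because the struct is read-only, the hash of a key cannot change while it sits in a dictionary.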


Take a look at Essential C#.

It contains a detailed description of how to correctly override GetHashCode().

Book Extract

The goal of the hash code is to efficiently balance the hash table by creating a number corresponding to the value of the object.

  • Required: Equal objects must have equal hash codes (if a.Equals(b), then a.GetHashCode() == b.GetHashCode()).
  • Required: The value GetHashCode() returns over the lifetime of a particular object should be constant (the same value), even if the object's data changes. In many cases, you should cache the method's return value to enforce this.
  • Required: GetHashCode() should not throw any exceptions; GetHashCode() must always successfully return a value.
  • Performance: Hash codes should be unique whenever possible. However, since a hash code is only an int, hash codes must overlap for objects that have potentially more values than an int can hold, which is virtually all types. (An obvious example is long, since there are more possible long values than an int can uniquely identify.)
  • Performance: The possible hash code values should be distributed evenly over the range of an int. For example, a hash that ignores the fact that strings in Latin-based languages are concentrated in the first 128 ASCII characters will produce a very uneven distribution of string values and will not be a strong GetHashCode() algorithm.
  • Performance: GetHashCode() should be optimized for performance. GetHashCode() is commonly used in Equals() implementations to short-circuit a full equality comparison when the hash codes differ. As a result, it is called frequently when the type is used as a key type in dictionary collections.
  • Performance: Small differences between two objects should result in large differences between their hash codes. Ideally, a 1-bit difference in the object changes around 16 bits of the hash code, on average. This helps ensure that the hash table remains balanced no matter how it "buckets" the hash values.
  • Security: It should be difficult for an attacker to craft an object that has a particular hash code. The attack is to flood a hash table with large amounts of data that all hash to the same value. The hash table then degrades to O(n) instead of O(1), which opens the door to a denial-of-service attack.
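Several of the rules above can be illustrated with a minimal sketch: an immutable key that computes its hash once in the constructor (so it stays constant and cannot throw later), and an Equals() that uses the cached hash as a cheap first check. The type `SessionKey` and its fields are hypothetical. The sketch assumes `System.HashCode` is available (.NET Core 2.1 or later, or the Microsoft.Bcl.HashCode package); on older frameworks a manual combination would be needed instead.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical immutable key type used only for illustration.
public sealed class SessionKey : IEquatable<SessionKey>
{
    private readonly int _userId;
    private readonly string _token;
    private readonly int _hash; // cached: never changes after construction

    public SessionKey(int userId, string token)
    {
        _userId = userId;
        _token = token ?? string.Empty; // normalize null so hashing stays trivial
        _hash = HashCode.Combine(_userId, _token);
    }

    public bool Equals(SessionKey other)
    {
        if (ReferenceEquals(other, null)) return false;
        if (ReferenceEquals(this, other)) return true;
        // Short-circuit: different hash codes mean the objects cannot be equal.
        if (_hash != other._hash) return false;
        return _userId == other._userId
            && string.Equals(_token, other._token, StringComparison.Ordinal);
    }

    public override bool Equals(object obj) => Equals(obj as SessionKey);

    // Returns the cached value; no work and no exceptions at call time.
    public override int GetHashCode() => _hash;
}

public static class SessionKeyDemo
{
    public static string Lookup()
    {
        var sessions = new Dictionary<SessionKey, string>
        {
            [new SessionKey(7, "abc")] = "alice"
        };
        // A distinct but equal key finds the same entry.
        return sessions[new SessionKey(7, "abc")];
    }
}
```

Making the fields read-only is the design choice that keeps the cached hash valid: if the data could change after construction, the cached value and Equals() would drift apart.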

The book also covers the points to consider when overriding Equals(), with code examples showing how to implement both methods.

This extract should give you a starting point, but I recommend buying the book and reading all of chapter 9 (at least the first twelve pages) for the full picture of how to implement these two important methods properly.


Source: https://habr.com/ru/post/1315074/

