I need to use Dictionary<long, string> collections such that, given two instances d1 and d2 that contain the same KeyValuePair<long, string> entries (possibly inserted in a different order), both of the following hold:
- (d1 == d2) evaluates to true
- d1.GetHashCode() == d2.GetHashCode()
The first requirement was most easily achieved by using a SortedDictionary instead of a regular Dictionary.
The second requirement is necessary because at one point I need to store a Dictionary<Dictionary<long, string>, List<string>>, where the main Dictionary type is used as the key of another Dictionary. If the hash codes are not computed from the contents, ContainsKey() will not work the way I want: if an element with key d1 has already been inserted into the dictionary, then dictionary.ContainsKey(d2) should evaluate to true.
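A minimal sketch of the problem, with illustrative names (DefaultKeyBehaviourDemo and its method are hypothetical): a plain Dictionary does not override Equals or GetHashCode, so two instances with identical contents are treated as different keys.

```csharp
using System;
using System.Collections.Generic;

public static class DefaultKeyBehaviourDemo
{
    // Returns whether a lookup with d2 finds the entry that was stored under d1.
    public static bool LookupByEqualContents()
    {
        var d1 = new Dictionary<long, string> { { 1L, "a" }, { 2L, "b" } };
        var d2 = new Dictionary<long, string> { { 2L, "b" }, { 1L, "a" } };

        var outer = new Dictionary<Dictionary<long, string>, List<string>>();
        outer.Add(d1, new List<string> { "payload" });

        // Dictionary<TKey, TValue> inherits reference-based Equals/GetHashCode
        // from object, so d2 is a different key even though its contents match.
        return outer.ContainsKey(d2);
    }

    public static void Main()
    {
        Console.WriteLine(LookupByEqualContents()); // False
    }
}
```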
To do this, I created a new class, ComparableDictionary : SortedDictionary<long, string>, and included the following:
    public override int GetHashCode()
    {
        // Build a string from the entries (enumerated in key order) and hash it.
        StringBuilder str = new StringBuilder();
        foreach (var item in this)
        {
            str.Append(item.Key);
            str.Append("_");
            str.Append(item.Value);
            str.Append("%%");
        }
        return str.ToString().GetHashCode();
    }
In my unit tests, this meets the criteria for both equality and hash codes. However, while reading "Guidelines and rules for GetHashCode", I came across the following:
Rule: the integer returned by GetHashCode must never change while the object is contained in a data structure that depends on the hash code remaining stable

It is permissible, though dangerous, to make an object whose hash code value can mutate as the fields of the object mutate. If you have such an object and you put it in a hash table, then the code that mutates the object and the code that maintains the hash table are required to have some agreed-upon protocol that ensures that the object is not mutated while it is in the hash table. What that protocol looks like is up to you.

If an object's hash code can mutate while it is in the hash table, then clearly the Contains method stops working. You put the object in bucket #5, you mutate it, and when you ask the set whether it contains the mutated object, it looks in bucket #74 and does not find it.

Remember that objects can be put into hash tables in ways you did not expect. Many of the LINQ sequence operators use hash tables internally. Do not dangerously mutate objects while enumerating a LINQ query that returns them!
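The hazard described in the quote can be sketched as follows. ContentHashedDictionary is a hypothetical stand-in for the ComparableDictionary class above; its GetHashCode mirrors the one shown earlier, while the Equals override is an assumption added only so that the sketch is self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical stand-in for ComparableDictionary; the Equals override is an
// assumption, not part of the code shown in the question.
public class ContentHashedDictionary : SortedDictionary<long, string>
{
    public override bool Equals(object obj)
    {
        var other = obj as ContentHashedDictionary;
        // SortedDictionary enumerates in key order, so equal contents
        // produce equal sequences regardless of insertion order.
        return other != null && this.SequenceEqual(other);
    }

    public override int GetHashCode()
    {
        var str = new StringBuilder();
        foreach (var item in this)
        {
            str.Append(item.Key).Append("_").Append(item.Value).Append("%%");
        }
        return str.ToString().GetHashCode();
    }
}

public static class MutatedKeyDemo
{
    // Returns (found before mutation, found after mutation).
    public static Tuple<bool, bool> Run()
    {
        var key = new ContentHashedDictionary { { 1L, "a" } };
        var outer = new Dictionary<ContentHashedDictionary, List<string>>();
        outer.Add(key, new List<string>());

        bool foundBefore = outer.ContainsKey(key);

        // Mutating the key changes its content-based hash code, but the outer
        // dictionary still files the entry under the hash computed at Add time.
        key.Add(2L, "b");
        bool foundAfter = outer.ContainsKey(key);

        return Tuple.Create(foundBefore, foundAfter);
    }

    public static void Main()
    {
        var result = Run();
        Console.WriteLine(result.Item1); // True
        // Expected to be False, barring an extremely unlikely collision
        // between the old and new string hash codes.
        Console.WriteLine(result.Item2);
    }
}
```

The entry is still physically present in the outer dictionary (its Count is unchanged); it simply can no longer be found through the mutated key.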
Now, the Dictionary<ComparableDictionary, List<String>> is used only once in the code, at a point where the contents of all the ComparableDictionary collections should already be set. Thus, according to these guidelines, I believe it would be acceptable to override GetHashCode as I did (based entirely on the contents of the dictionary).
After this introduction, my questions are:
- I know that the performance of SortedDictionary is very poor compared to Dictionary (and I can have hundreds of object instances). The only reason I use SortedDictionary is so that the equality comparison works on the contents of the dictionary, regardless of insertion order. Is there a better way to meet this equality requirement without using a SortedDictionary?
- Is my GetHashCode implementation acceptable given the requirements? Even though it is based on mutable contents, I don't think it should pose any risk, since the only place it is used (I think) is after the contents have been set.
Note: while I have set these up using Dictionary or SortedDictionary, I am not tied to these collection types. The basic need is a collection that can store pairs of values and meets the equality and hash requirements defined above.
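For illustration only, one possible direction that avoids SortedDictionary is to leave the keys as plain Dictionary instances and pass a content-based IEqualityComparer to the outer dictionary. This is a rough sketch under assumptions, not a measured or recommended solution: DictionaryContentComparer and ComparerDemo are hypothetical names, the XOR-based hash is just one simple order-independent combination, and this approach does not make d1 == d2 or d1.GetHashCode() == d2.GetHashCode() hold on the instances themselves. The rule about not mutating keys while they are in the outer dictionary still applies.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer: treats two dictionaries as equal when they hold the
// same key/value pairs, regardless of insertion order.
public sealed class DictionaryContentComparer : IEqualityComparer<Dictionary<long, string>>
{
    public bool Equals(Dictionary<long, string> x, Dictionary<long, string> y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null || x.Count != y.Count) return false;

        // Same count plus every pair of x present in y implies equal contents.
        foreach (var pair in x)
        {
            string other;
            if (!y.TryGetValue(pair.Key, out other) || !string.Equals(pair.Value, other))
                return false;
        }
        return true;
    }

    public int GetHashCode(Dictionary<long, string> d)
    {
        if (d == null) return 0;
        int hash = 0;
        foreach (var pair in d)
        {
            int valueHash = pair.Value == null ? 0 : pair.Value.GetHashCode();
            // XOR is commutative, so enumeration order does not matter.
            hash ^= unchecked(pair.Key.GetHashCode() * 397) ^ valueHash;
        }
        return hash;
    }
}

public static class ComparerDemo
{
    // Returns whether a lookup with d2 finds the entry that was stored under d1.
    public static bool LookupByEqualContents()
    {
        var d1 = new Dictionary<long, string> { { 1L, "a" }, { 2L, "b" } };
        var d2 = new Dictionary<long, string> { { 2L, "b" }, { 1L, "a" } };

        var outer = new Dictionary<Dictionary<long, string>, List<string>>(
            new DictionaryContentComparer());
        outer.Add(d1, new List<string>());

        return outer.ContainsKey(d2);
    }

    public static void Main()
    {
        Console.WriteLine(LookupByEqualContents()); // True
    }
}
```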