Two lines for a dictionary key

I have two lines that I would like to use as a dictionary key, but I'm lazy to create another object, calculate the hash code of the lines, etc.

So, instead, can I get the hash codes from two lines, add them and use the result as a dictionary?

May I cause a collision? right?.

Any ideas?

+8
string hashcode dictionary c # data-structures
source share
4 answers

I have two lines that I like use them as a dictionary key, but I'm lazy to create another object

In .NET 4.0, you can use the Tuple<T1, T2> class as a key, with T1 and T2 = string.

Is it possible to get hash codes from two lines, add them and use the result as a dictionary key?

The Tuple<T1, T2> formula uses hash codes to combine - it’s something like (not documented or guaranteed did not change): ((h1 << 5) + h1) ^ h2 , which should be decent enough for your purposes. By the way, naive adding is usually not the best way to combine hash codes.

May I cause a collision? right?.

It is always possible even with one line. There are more lines than 32-bit integers.

+19
source share

If you are using .NET 4, you can use the Tuple class:

 Dictionary<Tuple<string, string>, TValue> dict = new ... 

If you are not using .NET 4, you must create your own type to save this.

You can use KeyValuePair , but it inherits the appropriate methods from the base value type and thus relies heavily on reflection. This has performance implications (see bottom of answer.)

For KeyValuePair:

 Dictionary<KeyValuePair<string, string>, TValue> dict = new ... 

Here is a generic type that you can use if you don't want to cook it yourself:

 public struct SimpleTuple<TValue1, TValue2> { private readonly TValue1 _Value1; private readonly TValue2 _Value2; public SimpleTuple(TValue1 value1, TValue2 value2) { _Value1 = value1; _Value2 = value2; } public TValue1 Value1 { get { return _Value1; } } public TValue2 Value2 { get { return _Value2; } } public int GetHashCode() { unchecked { int result = 37; result *= 23; if Value1 != null) result += Value1.GetHashCode(); result *= 23; if (Value2 != null) result += Value2.GetHashCode(); return result; } } public override bool Equals(object obj) { if (obj == null) return false; if (obj.GetType() != typeof(SimpleTuple<TValue1, TValue2>)) return false; var other = (SimpleTuple<TValue1, TValue2>)obj; return Equals(other.Value1, Value1) && Equals(other.Value2, Value2); } } 

Of course, KeyValuePair also works on .NET 4.0 just like good bad.

As for collisions, it depends on what you mean. A hash table (a dictionary uses the internal structure of a hash table) always has the ability to get key collisions, but that's where the comparison comes into play. If two different keys generate the same hash code, the dictionary class will compare the key with the key to see if they are really the same value, or just create the same hash code.

The reason that a hash table will always have the possibility of collision is best described using the pidgeonhole principle (Wikipedia) .

This means that if you encounter two different keys, this will not be a problem, they will both be stored with the correct values ​​in the dictionary.

Of course, if you create the same key twice, the dictionary will consider it the same key and either will not be able to add a new value or overwrite the existing one (depending on how you ask it to add the value).

This will throw an exception for duplicate keys:

 dict.Add(key, value); 

This will add or overwrite the existing one:

 dict[key] = value; 

In response to Ani's comment, I wrote the following simple test script for LINQPad . Output:

 KeyValuePair: 975ms
 MyKeyValuePair: 52ms 

script:

 void Main() { const int iterations = 10 * 1000 * 1000; // JIT preheat Test1(1); Test2(1); Stopwatch sw = Stopwatch.StartNew(); Test1(iterations); sw.Stop(); Debug.WriteLine("KeyValuePair: " + sw.ElapsedMilliseconds + "ms"); sw = Stopwatch.StartNew(); Test2(iterations); sw.Stop(); Debug.WriteLine("MyKeyValuePair: " + sw.ElapsedMilliseconds + "ms"); } public static void Test1(int iterations) { for (int index = 0; index < iterations; index++) { var kvp = new KeyValuePair<int, int>(index, index); kvp.GetHashCode(); } } public static void Test2(int iterations) { for (int index = 0; index < iterations; index++) { var kvp = new MyKeyValuePair<int, int>(index, index); kvp.GetHashCode(); } } public struct MyKeyValuePair<TKey, TValue> { private readonly TKey _Key; private readonly TValue _Value; public MyKeyValuePair(TKey key, TValue value) { _Key = key; _Value = value; } public TKey Key { get { return _Key; } } public TValue Value { get { return _Value; } } public int GetHashCode() { unchecked { int result = 37; result *= 23; if (Key != null) result += Key.GetHashCode(); result *= 23; if (Value != null) result += Value.GetHashCode(); return result; } } public override bool Equals(object obj) { if (obj == null) return false; if (obj.GetType() != typeof(MyKeyValuePair<TKey, TValue>)) return false; var other = (MyKeyValuePair<TKey, TValue>)obj; return Equals(other.Key, Key) && Equals(other.Value, Value); } } 
+10
source share

Use a tuple:

 var dict = new Dictionary<Tuple<string,string>,SomeType>(); dict.Add(Tuple.Create("Hello","World"), new SomeType()); 
+3
source share

A simple solution, and it works with all versions of .net. Just connect the strings together.

 var dictionary = new Dictionary<string, int>(); dictionary.Add("The meaning" + " of life, the universe, and everything", 42); 

Of course, this only works with two lines (although you can use .ToString () for many other types), and if you do not need to search for a dictionary of only one of the two lines, but if you have both it is quite simple.

+3
source share

All Articles