Unexpected results when comparing dictionary searches with multiple operators in .NET 4.7

I am having a problem when I need to do a dynamic dispatch based on an object type. The types based on which I need to send are known at compile time - in my example they are 17 .

My initial assumption was to use Dictionary<Type, Action<Object>> to dispatch and use obj.GetType() to find the appropriate action. But then I decided to use BenchmarkDotNet to find out if I can do better and exactly how expensive the search for a shipment will be. Bellow is the code I used for the test.

 public class Program { private static readonly Object Value = Guid.NewGuid(); private static readonly Dictionary<Type, Action<Object>> Dictionary = new Dictionary<Type, Action<Object>>() { [ typeof( Byte ) ] = x => Empty( (Byte)x ), [ typeof( Byte[] ) ] = x => Empty( (Byte[])x ), [ typeof( SByte ) ] = x => Empty( (SByte)x ), [ typeof( Int16 ) ] = x => Empty( (Int16)x ), [ typeof( UInt16 ) ] = x => Empty( (UInt16)x ), [ typeof( Int32 ) ] = x => Empty( (Int32)x ), [ typeof( UInt32 ) ] = x => Empty( (UInt32)x ), [ typeof( Int64 ) ] = x => Empty( (Int64)x ), [ typeof( UInt64 ) ] = x => Empty( (UInt64)x ), [ typeof( Decimal ) ] = x => Empty( (Decimal)x ), [ typeof( Single ) ] = x => Empty( (Single)x ), [ typeof( Double ) ] = x => Empty( (Double)x ), [ typeof( String ) ] = x => Empty( (String)x ), [ typeof( DateTime ) ] = x => Empty( (DateTime)x ), [ typeof( TimeSpan ) ] = x => Empty( (TimeSpan)x ), [ typeof( Guid ) ] = x => Empty( (Guid)x ), [ typeof( Char ) ] = x => Empty( (Char)x ), }; [Benchmark] public void Switch() => Switch( Value ); [Benchmark] public void Lookup() => Lookup( Value ); private static void Switch( Object value ) { if ( value is Byte ) goto L_Byte; if ( value is SByte ) goto L_SByte; if ( value is Int16 ) goto L_Int16; if ( value is UInt16 ) goto L_UInt16; if ( value is Int32 ) goto L_Int32; if ( value is UInt32 ) goto L_UInt32; if ( value is Int64 ) goto L_Int64; if ( value is UInt64 ) goto L_UInt64; if ( value is Decimal ) goto L_Decimal; if ( value is Single ) goto L_Single; if ( value is Double ) goto L_Double; if ( value is DateTime ) goto L_DateTime; if ( value is TimeSpan ) goto L_TimeSpan; if ( value is DateTimeOffset ) goto L_DateTimeOffset; if ( value is String ) goto L_String; if ( value is Byte[] ) goto L_ByteArray; if ( value is Char ) goto L_Char; if ( value is Guid ) goto L_Guid; return; L_Byte: Empty( (Byte)value ); return; L_SByte: Empty( (SByte)value ); return; L_Int16: Empty( (Int16)value ); return; L_UInt16: Empty( (UInt16)value ); return; L_Int32: Empty( (Int32)value ); return; L_UInt32: Empty( (UInt32)value ); return; L_Int64: Empty( (Int64)value ); return; L_UInt64: Empty( (UInt64)value ); return; L_Decimal: Empty( (Decimal)value ); return; L_Single: Empty( (Single)value ); return; L_Double: Empty( (Double)value ); return; L_DateTime: Empty( (DateTime)value ); return; L_DateTimeOffset: Empty( (DateTimeOffset)value ); return; L_TimeSpan: Empty( (TimeSpan)value ); return; L_String: Empty( (String)value ); return; L_ByteArray: Empty( (Byte[])value ); return; L_Char: Empty( (Char)value ); return; L_Guid: Empty( (Guid)value ); return; } private static void Lookup( Object value ) { if ( Dictionary.TryGetValue( value.GetType(), out var action ) ) { action( value ); } } [MethodImpl( MethodImplOptions.NoInlining )] private static void Empty<T>( T value ) { } static void Main( string[] args ) { BenchmarkRunner.Run( typeof( Program ) ); Console.ReadLine(); } } 

In my example, I tested the test with the boxed Guid , which is the worst case in the handcrafted switch function. The results were unexpected, to say the least:

 BenchmarkDotNet=v0.10.11, OS=Windows 10 Redstone 3 [1709, Fall Creators Update] (10.0.16299.125) Processor=Intel Core i7-4790K CPU 4.00GHz (Haswell), ProcessorCount=8 Frequency=3903988 Hz, Resolution=256.1483 ns, Timer=TSC [Host] : .NET Framework 4.7 (CLR 4.0.30319.42000), 32bit LegacyJIT-v4.7.2600.0 DefaultJob : .NET Framework 4.7 (CLR 4.0.30319.42000), 32bit LegacyJIT-v4.7.2600.0 Method | Mean | Error | StdDev | ------- |---------:|----------:|----------:| Switch | 13.21 ns | 0.1057 ns | 0.0989 ns | Lookup | 28.22 ns | 0.1082 ns | 0.1012 ns | 

The switching function is 2 times faster for him in the worst case. If I reorder the ifs, so the most common types will be the first, then on average I expect it to work 3-5 times faster.

My question is why are 18 checks much faster than searching through a single dictionary? Am I missing something?

EDIT:

The original test was x86 (preferred 32 bit) mode on an x64 machine. I also tested the tests in 64 releases:

  Method | Mean | Error | StdDev | ---------- |----------:|----------:|----------:| Switch | 12.451 ns | 0.0600 ns | 0.0561 ns | Lookup | 22.552 ns | 0.1108 ns | 0.1037 ns | 
+8
c # microbenchmark
source share
1 answer

I am by no means the guru of performing IL, but if you decompile and especially look at IL, that makes sense.

The is statement is just 4 operation codes (ldarg, isinst, ldnull, cgt), and each part of the switch consists of just 7, to which goto is added. Part of the Switch action for calling Empty() then another 6, giving 17 * 7 + 6 = 125 max.

Unlike Dictionary.TryGetValue can only be one method call, but inside that it handles hashing, looping and comparing values ​​a lot:

http://referencesource.microsoft.com/#mscorlib/system/collections/generic/dictionary.cs,2e5bc6d8c0f21e67

 public bool TryGetValue(TKey key, out TValue value) { int i = FindEntry(key); if (i >= 0) { value = entries[i].value; return true; } value = default(TValue); return false; } 

http://referencesource.microsoft.com/#mscorlib/system/collections/generic/dictionary.cs,bcd13bb775d408f1

 private int FindEntry(TKey key) { if( key == null) { ThrowHelper.ThrowArgumentNullException(ExceptionArgument.key); } if (buckets != null) { int hashCode = comparer.GetHashCode(key) & 0x7FFFFFFF; for (int i = buckets[hashCode % buckets.Length]; i >= 0; i = entries[i].next) { if (entries[i].hashCode == hashCode && comparer.Equals(entries[i].key, key)) return i; } } return -1; } 

The for loop inside FindEntry one - 31 opcodes for each loop, giving a maximum of 527 opcodes for this part only.

This is a very simple analysis, but it is easy to see that the switch should be more efficient. As it often happens, you need to consider performance compared to readability and maintainability. If using dictionary lookups gives you nicer code, it is rarely the case that performance loss (15ns) outweighs this benefit.

+5
source share

All Articles