I recently tested the performance of a for loop against a foreach loop in C#, and I noticed that when summing an array of ints into a long, the foreach loop can actually come out faster. Here is the full test program. I used Visual Studio 2012, x86, Release mode, with optimizations on.
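For reference, here is a minimal sketch of the two methods being compared, reconstructed from the source lines that appear interleaved in the disassembly. The loop bodies come from those listings. The class name `LoopSum`, the method names, and the `TimeMs` helper are assumptions for illustration, not the original test program.

```csharp
using System;
using System.Diagnostics;

public static class LoopSum
{
    // Sums the array with foreach; the body matches the source lines
    // shown in the foreach disassembly listing.
    public static long SumForEach(int[] collection)
    {
        long sum = 0;
        foreach (var i in collection)
        {
            sum += i;
        }
        return sum;
    }

    // Sums the array with an indexed for loop; the body matches the
    // source lines shown in the for disassembly listing.
    public static long SumFor(int[] collection)
    {
        long sum = 0;
        for (int i = 0; i < collection.Length; ++i)
        {
            sum += collection[i];
        }
        return sum;
    }

    // Hypothetical timing helper (not from the original program):
    // runs the method once as a warm-up so JIT compilation is not timed,
    // then returns the elapsed milliseconds for `repeats` calls.
    public static long TimeMs(Func<int[], long> method, int[] data, int repeats)
    {
        method(data);
        var stopwatch = Stopwatch.StartNew();
        for (int r = 0; r < repeats; ++r)
        {
            method(data);
        }
        stopwatch.Stop();
        return stopwatch.ElapsedMilliseconds;
    }
}
```

Calling `LoopSum.TimeMs(LoopSum.SumForEach, data, 100)` and `LoopSum.TimeMs(LoopSum.SumFor, data, 100)` on the same large array gives a rough comparison. Note that going through a delegate prevents inlining, so the absolute numbers may differ from a direct call.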
Here is the assembly code for both loops. The foreach:
                long sum = 0;
    00000000  push        ebp
    00000001  mov         ebp,esp
    00000003  push        edi
    00000004  push        esi
    00000005  push        ebx
    00000006  xor         ebx,ebx
    00000008  xor         edi,edi
                foreach (var i in collection) {
    0000000a  xor         esi,esi
    0000000c  cmp         dword ptr [ecx+4],0
    00000010  jle         00000025
    00000012  mov         eax,dword ptr [ecx+esi*4+8]
                    sum += i;
    00000016  mov         edx,eax
    00000018  sar         edx,1Fh
    0000001b  add         ebx,eax
    0000001d  adc         edi,edx
    0000001f  inc         esi
                foreach (var i in collection) {
    00000020  cmp         dword ptr [ecx+4],esi
    00000023  jg          00000012
                }
                return sum;
    00000025  mov         eax,ebx
    00000027  mov         edx,edi
    00000029  pop         ebx
    0000002a  pop         esi
    0000002b  pop         edi
    0000002c  pop         ebp
    0000002d  ret
And for:
                long sum = 0;
    00000000  push        ebp
    00000001  mov         ebp,esp
    00000003  push        edi
    00000004  push        esi
    00000005  push        ebx
    00000006  push        eax
    00000007  xor         ebx,ebx
    00000009  xor         edi,edi
                for (int i = 0; i < collection.Length; ++i) {
    0000000b  xor         esi,esi
    0000000d  mov         eax,dword ptr [ecx+4]
    00000010  mov         dword ptr [ebp-10h],eax
    00000013  test        eax,eax
    00000015  jle         0000002A
                    sum += collection[i];
    00000017  mov         eax,dword ptr [ecx+esi*4+8]
    0000001b  cdq
    0000001c  add         eax,ebx
    0000001e  adc         edx,edi
    00000020  mov         ebx,eax
    00000022  mov         edi,edx
                for (int i = 0; i < collection.Length; ++i) {
    00000024  inc         esi
    00000025  cmp         dword ptr [ebp-10h],esi
    00000028  jg          00000017
                }
                return sum;
    0000002a  mov         eax,ebx
    0000002c  mov         edx,edi
    0000002e  pop         ecx
    0000002f  pop         ebx
    00000030  pop         esi
    00000031  pop         edi
    00000032  pop         ebp
    00000033  ret
As you can see, the main loop (from the array load through the `jg` that jumps back) is 8 instructions for "foreach" and 9 instructions for "for". This translates into an approximately 10% performance difference in my tests.
I am not very good at reading assembly code, and I do not understand why the for loop would not be at least as efficient as foreach. What is going on here?