I did some timing tests in a C program using the clock and the results were inconclusive, so I wrote this code:
    for (int i = 0; i < 100000000; i++) { g += 2; }
and this code:
    int i2 = 0; while (i2 < 100000000) { g += 2; i2++; }
When compiled in Dev-C++ with gcc, both versions come out as exactly the same machine code:
    CPU Disasm
    Address   Hex dump           Command                                Comments
    0040160C  |.  C745 E0 00000  MOV DWORD PTR SS:[LOCAL.8],0
    00401613  |>  817D E0 FFE0F  /CMP DWORD PTR SS:[LOCAL.8],5F5E0FF
    0040161A  |.  7F 0D          |JG SHORT 00401629
    0040161C  |.  8D45 E4        |LEA EAX,[LOCAL.7]
    0040161F  |.  8300 02        |ADD DWORD PTR DS:[EAX],2
    00401622  |.  8D45 E0        |LEA EAX,[LOCAL.8]
    00401625  |.  FF00           |INC DWORD PTR DS:[EAX]
    00401627  |.^ EB EA          \JMP SHORT 00401613
Both pieces of code compile to this exact block of assembly, so you can see that they are equally good. Combined with the timing tests, where one came out faster on one run and the other came out faster on the next, that confirms it.
If you want, you can find your code in a disassembler and optimize it by hand ;)
For example, these loops can be rewritten as follows:
    CPU Disasm
    Address   Hex dump           Command                                Comments
    00401492  |.  60             PUSHAD
    00401493  |.  31DB           XOR EBX,EBX
    00401495  |.  31C0           XOR EAX,EAX
    00401497  |>  81FB FFE0F505  /CMP EBX,5F5E0FF
    0040149D  |.  7F 05          |JG SHORT 004014A4
    0040149F  |.  40             |INC EAX
    004014A0  |.  40             |INC EAX
    004014A1  |.  43             |INC EBX
    004014A2  |.^ EB F3          \JMP SHORT 00401497
    004014A4  |>  8945 E4        MOV DWORD PTR SS:[LOCAL.7],EAX
    004014A7  |.  61             POPAD
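You could replace the two INC EAX instructions with a single ADD EAX,2, but that takes 1 more byte, which I did not want to spend :)
This hand-optimized version keeps the counter and the sum in registers instead of in memory, and it is about twice as fast :)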