Using gcc 4.6 with -O3, I timed the following four programs with the simple time command.

    #include <iostream>

    int main(int argc, char* argv[])
    {
        double val = 1.0;
        unsigned int numIterations = 1e7;
        for(unsigned int ii = 0; ii < numIterations; ++ii) {
            val *= 0.999;
        }
        std::cout << val << std::endl;
    }

Case 1 runs in 0.09 seconds.

    #include <iostream>

    int main(int argc, char* argv[])
    {
        double val = 1.0;
        unsigned int numIterations = 1e8;
        for(unsigned int ii = 0; ii < numIterations; ++ii) {
            val *= 0.999;
        }
        std::cout << val << std::endl;
    }

Case 2 runs in 17.6 seconds.

    int main(int argc, char* argv[])
    {
        double val = 1.0;
        unsigned int numIterations = 1e8;
        for(unsigned int ii = 0; ii < numIterations; ++ii) {
            val *= 0.999;
        }
    }

Case 3 runs in 0.8 seconds.

    #include <iostream>

    int main(int argc, char* argv[])
    {
        double val = 1.0;
        unsigned int numIterations = 1e8;
        for(unsigned int ii = 0; ii < numIterations; ++ii) {
            val *= 0.999999;
        }
        std::cout << val << std::endl;
    }

Case 4 runs in 0.8 seconds.
My question is: why is the second case so much slower than all the other cases? Case 3 shows that removing the cout brings the runtime back to what I would expect, and Case 4 shows that changing the multiplier also reduces the runtime significantly. What optimization is or is not happening in Case 2, and why?
Update:
When I originally ran these tests, there was no separate numIterations variable; the value was hard-coded in the for loop condition. In general, hard-coding the value made things run slower than in the cases given here. This is especially true for Case 3, which ran almost instantly with the numIterations variable as shown above, indicating that James McNellis is right about the entire loop being optimized out. I'm not sure why hard-coding 1e8 in the for loop prevents the loop from being removed in Case 3 or makes the other cases slower. However, the basic premise that Case 2 is significantly slower holds even more strongly.
Diffing the assembly output for the above cases gives:
Case 2 and Case 1:
    movl $100000000, 16(%esp)
    movl $10000000, 16(%esp)
Case 2 and Case 4:
    .long -652835029
    .long 1072691150
    .long -417264663
    .long 1072693245