Micro-optimization: loop counter as a local variable vs. a class member

I thought I would save some time by declaring a loop variable as a member of a class:

    struct Foo {
        int i;
        void method1() { for(i=0; i<A; ++i) ... }
        void method2() { for(i=0; i<B; ++i) ... }
    } foo;

However, it looks like the following version is about 20% faster:

    struct Foo {
        void method1() { for(int i=0; i<A; ++i) ... }
        void method2() { for(int i=0; i<B; ++i) ... }
    } foo;

when called from this code:

    void loop() { // Arduino loops
        foo.method1();
        foo.method2();
    }

Can you explain the difference in performance?

(I need to run a lot of simple parallel "processes" on Arduino, where such microoptimization matters.)

2 answers

When you declare a loop variable inside the loop, its scope is very narrow. The compiler can keep it in a register the whole time, so it may never be written to memory even once.

When you declare a loop variable as an instance variable, the compiler does not have that flexibility. It must keep the variable in memory in case some of your methods want to examine its state. For example, if you do this in your first code example:

    void method2() {
        for(i=0; i<B; ++i) {
            method3();
        }
    }
    void method3() { printf("%d\n", i); }

the value of i in method3 must change as the loop advances. The compiler has no choice but to write every update of i to memory. Moreover, it cannot assume that i is unchanged when you return from method3, which further increases the number of memory accesses.

Updating a value in memory takes a lot more CPU cycles than updating a variable held in a register. This is why it is always a good idea to keep the scope of loop variables down to the loop itself.
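To make the point above concrete, here is a minimal sketch of the two cases. The names MemberCounter, LocalCounter, seen, count and calls, and the value chosen for B, are made up for illustration and are not from the question. In the first struct, method3 reads the member i, so each increment has to be visible to it. In the second, nothing outside the loop can observe i, so the compiler is free to keep it in a register.

```cpp
// Hypothetical bound; the question leaves B unspecified.
constexpr int B = 4;

// Case 1: the counter is a member, and another method reads it.
struct MemberCounter {
    int i = 0;          // loop counter shared by all member functions
    int seen[B] = {};   // values of i that method3 observed
    int count = 0;      // number of values recorded so far

    // Reads the member i, so every ++i in method2 must be visible here.
    void method3() {
        if (count < B) seen[count++] = i;
    }

    void method2() {
        for (i = 0; i < B; ++i) {
            method3();
        }
    }
};

// Case 2: the counter is local to the loop.
struct LocalCounter {
    int calls = 0;

    // Has no way to see the caller's loop counter.
    void method3() { ++calls; }

    void method2() {
        for (int i = 0; i < B; ++i) {   // i never has to leave a register
            method3();
        }
    }
};
```

After calling method2() on a MemberCounter, seen holds 0, 1, 2, 3 and the member i is left at B. With LocalCounter, the counter is gone once the loop ends and only calls remains.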


Can you explain the difference in performance?

The most plausible explanation that I could come up with for this performance difference is:

The data member i is declared in global memory and cannot be kept in a register all the time, so operations on it will be slower than on the loop variable i, because its scope is very wide (the data member i has to serve all member functions of the class).

@DarioOO adds:

In addition, the compiler is not free to keep it in a register, because method3() could throw an exception, leaving the object in an unwanted state. (In theory, nothing stops you from writing int k=this->i; for(k=0;k<A;k++)method3(); this->i=k; yourself. That code would be almost as fast as a local variable, but you have to keep in mind what happens when method3() throws. I believe that when there is a guarantee that it does not throw, the compiler will do this optimization itself at -O3 or -O4; to be checked.)
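A rough sketch of the manual version described in that comment, written so the exception caveat can be seen. The value of A, the initial value of i, and the members calls and fail_at are assumptions added for illustration; fail_at is just a hypothetical knob that makes method3 throw on a chosen call.

```cpp
#include <stdexcept>

// Hypothetical bound; the question leaves A unspecified.
constexpr int A = 5;

struct Foo {
    int i = -1;         // member counter, only updated after the loop
    int calls = 0;      // how many times method3 completed
    int fail_at = -1;   // hypothetical: call index at which method3 throws, -1 = never

    void method3() {
        if (calls == fail_at) throw std::runtime_error("method3 failed");
        ++calls;
    }

    void method1() {
        int k = 0;                  // local counter, can stay in a register
        for (; k < A; ++k) {
            method3();
        }
        i = k;                      // single write-back; skipped if method3 throws
    }
};
```

If method3 never throws, i ends up equal to A, just as with the member-counter loop. If it throws partway through, i keeps its old value instead of the iteration it had reached. That is a visible change in behavior, and it is the reason the compiler cannot apply this rewrite on its own unless it can prove method3 does not throw and does not read i.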
