Is a boolean condition in a loop that is always false optimized away?

I have the following situation.

    bool user_set_flag;
    getFlagFromUser(&user_set_flag);
    while (1) {
        if (user_set_flag) {
            // do some computation and output
        }
        // do other computation
    }

The user_set_flag variable is set exactly once, at the very beginning of the program; in effect, the user chooses what the program should do. Say the user chooses user_set_flag = false. Will the compiler generate code so that if(user_set_flag) is checked only once, or will it be checked on every iteration? Can I give the compiler hints, for example by making the bool const?

The reason I'm asking is that my application is performance-critical and processes frames as quickly as possible. Shouldn't it be possible to determine at runtime, at some point, that a branch is always false?

+4
7 answers

Firstly, processors have a feature called branch prediction. After a few loop iterations, the processor will notice that your if always goes the same way. (It may even notice regular patterns, for example true false true false.) It will then speculatively execute that branch, and as long as it predicts correctly, the additional cost of the if is largely eliminated. If you think the user is more likely to select true rather than false, you can even tell the compiler so (GCC-specific).
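A minimal sketch of such a hint, assuming GCC or Clang, where __builtin_expect is available. The UNLIKELY macro, the processFrames function and its checksum are hypothetical names made up for illustration; they stand in for the loop from the question. Whether the hint changes anything measurable depends on the compiler and the processor.

```cpp
#include <cassert>
#include <cstddef>

// GCC/Clang-specific hint; other compilers just see the plain condition.
#if defined(__GNUC__)
#  define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#  define UNLIKELY(x) (x)
#endif

// Hypothetical stand-in for the frame loop in the question.
// Returns a checksum so the work cannot be discarded by the optimizer.
long processFrames(const int* frames, std::size_t count, bool user_set_flag) {
    assert(frames != nullptr || count == 0);
    long sum = 0;
    for (std::size_t i = 0; i < count; ++i) {
        if (UNLIKELY(user_set_flag)) {
            sum += 2L * frames[i];  // the optional computation
        }
        sum += frames[i];           // the computation that always runs
    }
    return sum;
}
```

The hint only tells the compiler which path to lay out as the fall-through case; the result is the same whichever way the flag is set.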

However, in one of your comments you mentioned that you have a "more complex sequence of bools". I think it is possible that the processor does not have enough memory to pattern-match all of these branches: by the time it returns to the first if, the knowledge of which way that branch went has been displaced from its memory. But we can help here...

The compiler is able to transform loops and if statements into forms it considers more optimal. For instance, it could turn your code into the form given by schnaader. This is called loop unswitching. You can help it by using profile-guided optimization (PGO), which lets the compiler know where the hotspots are. (Note: in GCC, -funswitch-loops is only enabled at -O3.)

You need to profile your code at the instruction level (VTune is a good tool for this) to see whether the if statements really are a bottleneck. If they are, and if, looking at the generated assembly, you think the compiler got it wrong despite PGO, you can try hoisting the if statement out of the loop yourself. Perhaps templated code would make this more convenient:

    template<bool B> void innerLoop() {
        for (int i = 0; i < 10000; i++) {
            if (B) {
                // some stuff..
            } else {
                // some other stuff..
            }
        }
    }

    if (user_set_flag) innerLoop<true>();
    else innerLoop<false>();

+14

I do not think it is generally possible to optimize this further. The compiler is smart enough to know that the value of user_set_flag will not change during the execution of the loop and will generate the most efficient machine code for it.

This also falls somewhat into the territory of second-guessing the compiler. Unless you really, really know what you are doing, it is best to stick to the simplest solution.

As an exercise, try timing the execution with both if (true) and if (user_set_flag). I expect there will be no difference in run time.
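One way to set up that exercise is sketched here, assuming a made-up arithmetic workload in place of the real frame processing. The names runLoopConstant, runLoopRuntime and secondsTaken are hypothetical helpers, not part of any library.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical workload with the condition fixed at compile time.
template <bool Flag>
std::uint64_t runLoopConstant(std::uint64_t iterations) {
    std::uint64_t acc = 0;
    for (std::uint64_t i = 0; i < iterations; ++i) {
        if (Flag) acc ^= i;        // optional computation
        acc += i * 2654435761u;    // computation that always runs
    }
    return acc;
}

// Same workload, but the flag is only known at run time.
std::uint64_t runLoopRuntime(std::uint64_t iterations, bool user_set_flag) {
    std::uint64_t acc = 0;
    for (std::uint64_t i = 0; i < iterations; ++i) {
        if (user_set_flag) acc ^= i;
        acc += i * 2654435761u;
    }
    return acc;
}

// Runs f once, stores its return value in `result`, and returns
// the elapsed wall-clock time in seconds.
template <typename F>
double secondsTaken(F f, std::uint64_t& result) {
    const auto start = std::chrono::steady_clock::now();
    result = f();
    const auto stop = std::chrono::steady_clock::now();
    assert(stop >= start);  // steady_clock never goes backwards
    return std::chrono::duration<double>(stop - start).count();
}
```

To compare, call secondsTaken once with a lambda that runs runLoopConstant<true> and once with one that runs runLoopRuntime, using a large iteration count and a flag that really comes from user input (for example from argc), so the compiler cannot fold it into a constant. Print or otherwise use both results so neither loop is removed as dead code, and repeat the measurement a few times.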

+6

An alternative could be:

    if (user_set_flag) {
        while (1) {
            ComputationAndOutput();
            OtherComputation();
        }
    } else {
        while (1) {
            OtherComputation();
        }
    }

but, as Smashery already said, this is a micro-optimization and will not speed up your program as much as other optimizations you can probably make.

+5

Technically, the compiler can optimize such situations.

For instance:

    #include <cstdio>

    int main(int argc, char* [])
    {
        while (true) {
            if (argc == 1) {
                puts("one");
            }
            puts("some more");
        }
    }

main compiles to (g++ -O3):

        cmpl    $1, 8(%ebp)
        je      L9
        .p2align 4,,15
    L2:
        movl    $LC1, (%esp)
        call    _puts
        jmp     L2
    L9:
        movl    $LC0, (%esp)
        call    _puts
        movl    $LC1, (%esp)
        call    _puts
        movl    $LC0, (%esp)
        call    _puts
        movl    $LC1, (%esp)
        call    _puts
        jmp     L9

As you can see, the condition is evaluated only once, to decide which loop to use. And it unrolled the true branch a bit :)

I would conclude that there is no reason to worry about such micro-optimizations unless you have determined that the compiler fails to optimize away the re-evaluation of an unchanging boolean (for example, if it were a global, how would the compiler know that function calls do not change it?) and that this really is a bottleneck.
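For the global case mentioned above, one common workaround is to read the global once into a local const before the loop, so that the loop condition no longer depends on memory that a function call might modify. This is only a sketch: g_user_set_flag, otherComputation and frameLoop are hypothetical names, and a compiler that can see every function body may well work this out on its own.

```cpp
#include <cassert>
#include <cstddef>

bool g_user_set_flag = false;  // hypothetical global, set once at startup

// Hypothetical stand-in for the work that always runs. In a real program
// it might live in another translation unit, where the compiler cannot
// prove that it leaves g_user_set_flag alone.
long otherComputation(int frame) {
    return frame + 1L;
}

long frameLoop(const int* frames, std::size_t count) {
    assert(frames != nullptr || count == 0);
    const bool flag = g_user_set_flag;  // read the global exactly once
    long sum = 0;
    for (std::size_t i = 0; i < count; ++i) {
        if (flag) {                      // local copy: invariant in the loop
            sum += 2L * frames[i];       // optional computation
        }
        sum += otherComputation(frames[i]);
    }
    return sum;
}
```

Note that this changes behavior if something really does modify the global while the loop is running, so it only fits the situation described in the question, where the flag is set once before the loop starts.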

+4

You say that the user actually has an option that sets this flag to true or false. That means it can change at runtime, which means it (usually) cannot be optimized away.

In general, the compiler can only "optimize" things it knows at compile time, that is, at the moment you click "Build" in your editor's menu. If the value can change, it usually cannot be optimized away.

However, it is quite easy (well, depending on the parts you did not show) to optimize it yourself. If the one assembly instruction that is executed inside the loop bothers you, move the if statement outside the loop. That way it is executed only once per call of the function.

+1

If you know the value of the flag at compile time, you can use a compile-time definition so that the if statement is not included at all:

    while (1) {
    #ifdef user_set_flag
        {
            // do some computation and output
        }
    #endif
        // do other computation
    }

0

If you really want it to be as fast as possible, then you want to do aggressive performance tuning. So forget about trying to guess what the compiler can do to optimize your program.

That is being timid.

Instead, take charge. This shows how.

0
