Optimization and multithreading in B. Stroustrup's new book

Please refer to Section 41.2.2, "Instruction Reordering", of "TCPL", 4th edition, by B. Stroustrup, which I quote below:

To gain performance, compilers, optimizers, and hardware reorder instructions. Consider:

    // thread 1:
    int x;
    bool x_init;

    void init()
    {
        x = initialize();   // no use of x_init in initialize()
        x_init = true;
        // ...
    }

For this piece of code, there is no stated reason to assign to x before assigning to x_init. The optimizer (or the hardware instruction scheduler) may decide to speed up the program by executing x_init = true first. We probably meant for x_init to indicate whether x had been initialized by initialize() or not. However, we did not say that, so the hardware, the compiler, and the optimizer do not know it.

Add another thread to the program:

    // thread 2:
    extern int x;
    extern bool x_init;

    void f2()
    {
        int y;
        while (!x_init)     // if necessary, wait for initialization to complete
            this_thread::sleep_for(milliseconds{10});
        y = x;
        // ...
    }

Now we have a problem: thread 2 may never wait and thus will assign an uninitialized x to y. Even if thread 1 did not set x_init and x in "the wrong order", we still may have a problem. In thread 2, there are no assignments to x_init, so an optimizer may decide to lift the evaluation of !x_init out of the loop, so that thread 2 either never sleeps or sleeps forever.

  • Does the Standard allow the reordering in thread 1? (A quotation from the Standard would be appreciated.) Why would this speed up the program?
  • Both answers in this SO discussion show that such optimization does not occur when there are global variables in the code, like x_init above.
  • What does the author mean by "lift the evaluation of !x_init out of the loop"? Is it something like this?

    if (!x_init)
        while (true)
            this_thread::sleep_for(milliseconds{10});
    y = x;
2 answers

This is not so much an issue with the C++ compiler or the standard as with modern processors. Take a look here. The compiler will not emit memory-fence instructions between the assignments to x and x_init unless you tell it to.

For what it's worth, before C++11 the standard had no notion of multithreading in its abstract machine model. Things are a little nicer these days.
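As an illustration of what "telling the compiler" can look like since C++11, here is a minimal sketch that makes x_init a std::atomic<bool> and pairs a release store with an acquire load. This is one possible fix, not the one given in the book excerpt; the body of initialize() and the helper run_demo() are assumptions added only so the sketch is self-contained.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

int x = 0;                        // plain data, published through x_init
std::atomic<bool> x_init{false};  // atomic flag instead of a plain bool

// Hypothetical stand-in for the initialize() of the excerpt.
int initialize() { return 42; }

// Thread 1: write x, then publish it with a release store.
// The write to x may not be reordered after this store.
void init() {
    x = initialize();
    x_init.store(true, std::memory_order_release);
}

// Thread 2: poll the flag with acquire loads, then read x.
// Each iteration performs a real load, so the test is not lifted out.
int f2() {
    while (!x_init.load(std::memory_order_acquire))
        std::this_thread::sleep_for(std::chrono::milliseconds{10});
    return x;  // sees the value written before the release store
}

// Hypothetical driver: runs both threads and returns what thread 2 read.
int run_demo() {
    int y = 0;
    std::thread t2([&y] { y = f2(); });
    std::thread t1(init);
    t1.join();
    t2.join();
    return y;
}
```

Calling run_demo() from main should yield the value produced by initialize(). The default memory_order_seq_cst would also work here; release/acquire is simply the weakest ordering that still publishes x correctly in this pattern.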

  • The C++11 standard does not "allow" or "prevent" reordering. It specifies ways to force a specific "barrier" which, in turn, prevents the compiler from moving instructions across it. In this example the compiler might reorder the assignments because that can be more efficient on a CPU with multiple execution units (ALUs, hyperthreading, etc.), even with a single core. Typically, if your CPU has 2 ALUs that can work in parallel, there is no reason the compiler would not try to feed them as much work as it can. I'm not talking about the out-of-order execution done internally by the CPU (Intel's, for example), but about compile-time ordering meant to keep all the computing resources busy.

  • I think it depends on the compiler's flags. Usually, unless you tell it otherwise, the compiler must assume that another compilation unit (say B.cpp, which is not visible at compile time) can declare "extern bool x_init" and change it at any time. The reordering optimization would then break the expected behavior (B could define the initialize() function). This example is trivial and unlikely to break. The linked SO answers are not about this "optimization"; in their case, the compiler simply cannot assume that the global array is not modified externally, and so cannot perform the optimization. That is not like your example.

  • Yes. This is a very common optimization. Instead of:

    // test is a bool
    for (int i = 0; i < 345; i++) {
        if (test) do_something();
    }

The compiler might generate:

    if (test)
        for (int i = 0; i < 345; i++) {
            do_something();
        }

And save 344 useless tests.
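To see why the compiler is entitled to do this in single-threaded code, the two forms can be compared directly. The sketch below is only an illustration: the counter calls and the function names loop_as_written and loop_hoisted are hypothetical, introduced to make the effect of do_something() observable.

```cpp
// Hypothetical counter standing in for the side effect of do_something().
int calls = 0;
void do_something() { ++calls; }

// The loop as written: test is evaluated on every iteration.
int loop_as_written(bool test) {
    calls = 0;
    for (int i = 0; i < 345; i++) {
        if (test) do_something();
    }
    return calls;
}

// The loop as the compiler might emit it: the invariant test is
// evaluated once, outside the loop.
int loop_hoisted(bool test) {
    calls = 0;
    if (test)
        for (int i = 0; i < 345; i++) {
            do_something();
        }
    return calls;
}
```

Both functions make the same number of calls for either value of test, which is all a single-threaded observer can detect. The transformation relies on nothing visible to the compiler changing test inside the loop. Applied to a plain bool x_init that another thread writes, the same reasoning yields the "never sleeps or sleeps forever" outcome from the excerpt.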
