Does the C++ volatile keyword introduce a memory fence?

I understand that volatile informs the compiler that the value may be changed, but in order to accomplish this functionality, does the compiler need to introduce a memory fence to make it work?

For example, the sequence of operations upon volatile objects cannot be reordered and must be preserved. This seems to imply that some memory fences are necessary and that there is no way around this. Am I right?




There is an interesting discussion on this related question.

Jonathan Wakely writes:

... accesses to distinct volatile variables cannot be reordered by the compiler as long as they occur in separate full expressions ... right that volatile is useless for thread-safety, but not for the reasons he gives. It's not because the compiler might reorder accesses to volatile objects, but because the CPU might reorder them. Atomic operations and memory barriers prevent the compiler and the CPU from reordering.

To which David Schwartz replies in the comments:

... There is no difference, from the standpoint of the C++ standard, between the compiler doing something and the compiler emitting instructions that cause the hardware to do something. If the CPU may reorder accesses to volatiles, then the standard doesn't require that their order be preserved. ...

... The C++ standard doesn't make any distinction about what does the reordering. And you can't argue that the CPU can reorder them with no observable effect so that's okay -- the C++ standard defines their order as observable. A compiler is compliant with the C++ standard on a platform if it generates code that makes the platform do what the standard requires. If the standard requires accesses to volatiles not to be reordered, then a platform that reorders them is not compliant. ...

My point is that if the C++ standard prohibits the compiler from reordering accesses to distinct volatiles, on the theory that the order of such accesses is part of the program's observable behavior, then it also requires the compiler to emit code that prohibits the CPU from doing so. The standard does not differentiate between what the compiler does and what the compiler's generated code makes the CPU do.

This yields two questions: Is either of them "right"? What do actual implementations really do?

+71
c++ multithreading volatile c++11
Oct 10 '14 at 19:51
source share
12 answers

Instead of explaining what volatile does, allow me to explain when you should use volatile.

  • Inside a signal handler. Because writing to a volatile variable is pretty much the only thing the standard allows you to do from within a signal handler. Since C++11 you can use std::atomic for that purpose, but only if the atomic is lock-free.
  • When dealing with setjmp, according to Intel.
  • When directly interacting with hardware and you want to make sure that the compiler does not optimize your reads or writes away.

For example:

 volatile int *foo = some_memory_mapped_device;
 while (*foo)
     ; // wait until *foo turns false

Without the volatile specifier, the compiler is allowed to completely optimize the loop away. The volatile specifier tells the compiler that it may not assume that two subsequent reads return the same value.

Note that volatile has nothing to do with threads. The above example does not work if there is a different thread writing to *foo, because there is no acquire operation involved.
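For cross-thread signaling, the C++11 replacement is std::atomic. Below is a minimal sketch (the names ready, payload, and atomic_wait_demo are illustrative, not from the original) in which a release store publishes a value and an acquire load in a spin loop observes it -- the synchronization that the volatile loop above lacks:

```cpp
#include <atomic>
#include <thread>

std::atomic<bool> ready{false};
int payload = 0;

// The release store publishes payload; the acquire load in the spinning
// thread synchronizes with it, so the plain write becomes visible.
int atomic_wait_demo() {
    std::thread producer([] {
        payload = 42;                                  // plain write...
        ready.store(true, std::memory_order_release);  // ...published here
    });
    while (!ready.load(std::memory_order_acquire))
        ;  // spin: unlike volatile, this load is an acquire operation
    producer.join();
    return payload;  // guaranteed to be 42 after the acquire load sees true
}
```

Unlike the volatile version, the acquire load synchronizes with the release store, so the reader is guaranteed to see the writer's plain store to payload.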

In all other cases, usage of volatile should be considered non-portable and no longer pass code review, except when dealing with pre-C++11 compilers and compiler extensions (such as msvc's /volatile:ms switch, which is enabled by default under X86/I64).
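For the signal-handler case in the first bullet, a minimal sketch might look like this (the names got_signal, on_signal, and signal_demo are illustrative); writing to a volatile std::sig_atomic_t is one of the few things a handler may do portably:

```cpp
#include <csignal>

// A volatile std::sig_atomic_t flag is one of the few objects the standard
// allows a signal handler to write to without undefined behavior.
volatile std::sig_atomic_t got_signal = 0;

extern "C" void on_signal(int) {
    got_signal = 1;  // allowed: write to a volatile std::sig_atomic_t
}

bool signal_demo() {
    std::signal(SIGINT, on_signal);
    std::raise(SIGINT);      // deliver the signal synchronously to ourselves
    return got_signal == 1;  // the handler has run by the time raise() returns
}
```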

+43
Oct 10 '14 at 21:44
source share

Does the C++ volatile keyword introduce a memory fence?

A C++ compiler which conforms to the specification is not required to introduce a memory fence. Your particular compiler might; direct your question to the authors of your compiler.

The "volatile" feature in C++ has nothing to do with threading. Remember, the purpose of "volatile" is to disable compiler optimizations so that reading from a register that is changing due to exogenous conditions is not optimized away. Is a memory address that is being written to by a different thread on a different CPU a register that is changing due to exogenous conditions? No. Again, if some compiler authors choose to treat memory addresses being written to by different threads on different CPUs as though they were registers changing due to exogenous conditions, that's their business; they are not required to do so. Nor are they required -- even if they do introduce a memory fence -- to, for instance, ensure that every thread sees a consistent ordering of volatile reads and writes.

In fact, volatile is pretty much useless for threading in C/C++. Best practice is to avoid it.

Moreover: memory fences are an implementation detail of particular processor architectures. In C#, where volatile explicitly is designed for multithreading, the specification does not say that half fences will be introduced, because the program might be running on an architecture that doesn't have fences in the first place. Rather, again, the specification makes certain (extremely weak) guarantees about which optimizations will be eschewed by the compiler, runtime and CPU, to put certain (extremely weak) constraints on how some side effects will be ordered. In practice these optimizations are eliminated by use of half fences, but that's an implementation detail subject to change in the future.

The fact that you care about the semantics of volatile in any language as they pertain to multithreading indicates that you're thinking about sharing memory across threads. Consider simply not doing that. It makes your program far harder to understand and far more likely to contain subtle, impossible-to-reproduce bugs.

+20
Oct 11 '14 at 1:39
source share

First of all, the C++ standard does not guarantee the memory barriers needed for properly ordering reads/writes that are non-atomic. volatile variables are recommended for use with MMIO, signal handling, etc. On most implementations volatile is not useful for multithreading, and its use for that purpose is generally discouraged.

Regarding the implementation of volatile accesses, it is the compiler's choice.

This article, describing gcc's behavior, shows that you cannot use a volatile object as a memory barrier to order a sequence of writes to volatile memory.

Regarding icc's behavior, I found this source also telling that volatile does not guarantee ordering of memory accesses.

Microsoft's VS2013 compiler has a different behavior. This documentation explains how volatile enforces Release/Acquire semantics and enables volatile objects to be used in locks/releases in multithreaded applications.

Another aspect that needs to be taken into consideration is that the same compiler may have different behavior with respect to volatile depending on the targeted hardware architecture. This post regarding the MSVS 2013 compiler clearly states the specifics of compiling with volatile for ARM platforms.

So my answer to:

Does the C++ volatile keyword introduce a memory fence?

would be: Not guaranteed, probably not, but some compilers might do it. You should not rely on the fact that it does.

+12
Oct 10 '14 at 19:55
source share

What David overlooks is the fact that the C++ standard specifies the behavior of several threads interacting only in specific situations; everything else results in undefined behavior. A race condition involving at least one write is undefined if you do not use atomic variables.

Consequently, the compiler is perfectly entitled to forgo any synchronization instructions, since your CPU will only notice the difference in a program that exhibits undefined behavior due to missing synchronization.

+12
Oct 10 '14 at
source share

The compiler only inserts a memory fence on the Itanium architecture, as far as I know.

The volatile keyword is really best used for asynchronous changes, e.g., signal handlers and memory-mapped registers; it is usually the wrong tool to use for multithreaded programming.

+7
Oct. 10 '14 at 19:55
source share

It depends on which compiler "the compiler" is. Visual C++ does, since 2005. But the standard does not require it, so some other compilers do not.

+6
Oct 10 '14 at 20:03
source share

It doesn't have to. Volatile is not a synchronization primitive. It just disables optimizations, i.e. you get a predictable sequence of reads and writes within a thread, in the same order as prescribed by the abstract machine. But reads and writes in different threads have no order in the first place; it makes no sense to speak of preserving or not preserving their order. The order between threads can be established by synchronization primitives; you get UB without them.

A bit of explanation regarding memory barriers. A typical CPU has several levels of memory access. There is a memory pipeline, several levels of cache, then RAM, etc.

Membar instructions flush the pipeline. They don't change the order in which reads and writes are executed; they just force outstanding ones to be executed at a given moment. This is useful for multithreaded programs, but not much otherwise.

The cache(s) are normally automatically coherent between CPUs. If one wants to make sure the cache is in sync with RAM, a cache flush is needed. It is very different from a membar.
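To make the ordering point concrete, here is a hedged sketch (the names flag, data, and fence_demo are illustrative) of how C++11 standalone fences, std::atomic_thread_fence, order plain accesses around a relaxed atomic flag:

```cpp
#include <atomic>
#include <thread>

std::atomic<bool> flag{false};
int data = 0;

// The release fence orders the write to data before the relaxed store to
// flag; the acquire fence on the reader side pairs with it, so once the
// reader sees flag == true, it is guaranteed to see data == 1.
int fence_demo() {
    std::thread writer([] {
        data = 1;
        std::atomic_thread_fence(std::memory_order_release);
        flag.store(true, std::memory_order_relaxed);
    });
    while (!flag.load(std::memory_order_relaxed))
        ;  // spin until the flag flips
    std::atomic_thread_fence(std::memory_order_acquire);
    writer.join();
    return data;  // the fence pair guarantees this reads 1
}
```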

+5
Oct 11 '14 at 10:19
source share

This is mostly from memory, and based on pre-C++11, without threads. But having participated in the threading discussions in the committee, I can say that there was never an intent by the committee that volatile could be used for synchronization between threads. Microsoft proposed it, but the proposal didn't carry.

The key specification of volatile is that access to a volatile represents "observable behavior", just like IO. In the same way the compiler cannot reorder or remove specific IO, it cannot reorder or remove accesses to a volatile object (or more correctly, accesses through an lvalue expression with volatile-qualified type). The original intent of volatile was, in fact, to support memory-mapped IO. The "problem" with this, however, is that it is implementation-defined what constitutes a "volatile access". And many compilers implement it as if the definition were "an instruction which reads or writes memory". Which is a legal, although useless, definition, if the implementation specifies it. (I have yet to find the actual specification for any compiler.)

Arguably (and it's an argument I accept), this violates the intent of the standard, since unless the hardware recognizes the addresses as memory-mapped IO and inhibits any reordering, etc., you can't even use volatile for memory-mapped IO, at least on Sparc or Intel architectures. Nevertheless, none of the compilers I've looked at (Sun CC, g++ and MSC) output any fence or membar instructions. (Around the time Microsoft proposed extending the rules for volatile, I think some of their compilers implemented their proposal, and did emit fence instructions for volatile accesses. I've not verified what recent compilers do, but it wouldn't surprise me if it depended on some compiler option. The version I checked -- I think it was VS6.0 -- didn't emit fences, however.)
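A small illustration of the "observable behavior" point: in the sketch below (the function name poll_twice is illustrative), the compiler must emit both loads through the volatile lvalue, whereas for a plain int* it could legally fold them into a single load:

```cpp
// Both reads below are volatile accesses and hence observable behavior:
// the compiler must perform each one, even though they look redundant.
int poll_twice(volatile int* reg) {
    int a = *reg;  // first volatile read: must be emitted
    int b = *reg;  // second volatile read: must also be emitted
    return a + b;
}
```

Whether the hardware then keeps those two loads in order is exactly the question this answer raises; the standard constrains the compiler's output, not what an MMIO-unaware bus does with it.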

+5
Oct 12 '14 at 11:30
source share

The compiler must introduce a memory fence around volatile accesses if, and only if, that is necessary to make the uses of volatile specified in the standard (setjmp, signal handlers, and so on) work on that particular platform.

Note that some compilers do go beyond what the C++ standard requires in order to make volatile more powerful or useful on those platforms. Portable code shouldn't rely on volatile to do anything beyond what is specified in the C++ standard.

+4
Oct 13 '14 at 0:27
source share

I always use volatile in interrupt service routines, e.g. the ISR (often assembly code) modifies some memory location, and the higher-level code that runs outside of the interrupt context accesses the memory location through a pointer to volatile.

I do this for RAM as well as for memory-mapped IO.

Based on the discussion here, it seems this is still a valid use of volatile, but it doesn't have anything to do with multiple threads or CPUs. If the compiler for a microcontroller "knows" that there can't be any other accesses (e.g. everything is on-chip, there is no cache, and there's only one core), I would think that a memory fence isn't implied at all; the compiler just needs to prevent certain optimizations.

As we pile more stuff into the "system" that executes the object code, almost all bets are off, at least that's how I read this discussion. How could a compiler ever cover all bases?

+2
Oct. 14 '14 at 22:19
source share

I think the confusion around volatile and instruction reordering stems from the two notions of reordering CPUs do:

  • Out-of-order execution.
  • Sequence of memory reads/writes as seen by other CPUs (reordering in the sense that each CPU might see a different sequence).

Volatile affects how the compiler generates code, assuming single-threaded execution (this includes interrupts). It doesn't imply anything about memory barrier instructions, but it rather precludes the compiler from performing certain kinds of optimizations related to memory accesses.
A typical example is re-fetching a value from memory, instead of using one cached in a register.

Out-of-order execution

CPUs can execute instructions out-of-order/speculatively provided that the end result could have happened in the original code. CPUs can perform transformations that are disallowed in compilers, because compilers can only perform transformations which are correct in all circumstances. In contrast, CPUs can check the validity of these optimizations and back out of them if they turn out to be incorrect.

Sequence of memory reads/writes as seen by other CPUs

The end result of a sequence of instructions, the effective order, must agree with the semantics of the code generated by the compiler. However, the actual execution order chosen by the CPU can be different. The effective order as seen by other CPUs (every CPU can have a different view) can be constrained by memory barriers.
I'm not sure how much effective and actual order can differ, because I don't know to what extent memory barriers can preclude CPUs from performing out-of-order execution.


0
Nov 19 '17 at 0:38
source share

While I was working through an online downloadable video tutorial for 3D graphics and game engine development working with modern OpenGL, we used volatile within one of our classes. The tutorial website can be found here, and the video working with the volatile keyword is found in the Shader Engine series, video 98. These works are not my own but are accredited to Marek A. Krzeminski, MASc, and this is an excerpt from the video download page:

"Since we can now have our games run in multiple threads it is important to synchronize data between threads properly. In this video I show how to create a volatile locking class to ensure volatile variables are properly synchronized..."

And if you are subscribed to his website and have access to his videos, within this video he references this article concerning the use of volatile with multithreaded programming.

Here is the article from the link above: http://www.drdobbs.com/cpp/volatile-the-multithreaded-programmers-b/184403766

volatile: The Multithreaded Programmer's Best Friend

By Andrei Alexandrescu, February 01, 2001

The volatile keyword was devised to prevent compiler optimizations that might render code incorrect in the presence of certain asynchronous events.

I don't want to spoil your mood, but this column addresses the dreaded topic of multithreaded programming. Multithreaded programs are notoriously hard to write, prove correct, debug, maintain, and tame in general. Incorrect multithreaded programs can run for years without a glitch, only to unexpectedly run amok because some critical timing condition has been met.

Needless to say, a programmer writing multithreaded code needs all the help she can get. This column focuses on race conditions -- a common source of trouble in multithreaded programs -- and provides you with insights and tools on how to avoid them and, amazingly enough, have the compiler work hard at helping you with that.

Although both the C and C++ Standards are conspicuously silent when it comes to threads, they do make a little concession to multithreading, in the form of the volatile keyword.

Just like its better-known counterpart const, volatile is a type modifier. It's intended to be used in conjunction with variables that are accessed and modified in different threads. Basically, without volatile, either writing multithreaded programs becomes impossible, or the compiler wastes vast optimization opportunities. An explanation is in order.

Consider the following code:

 class Gadget {
 public:
     void Wait() {
         while (!flag_) {
             Sleep(1000); // sleeps for 1000 milliseconds
         }
     }
     void Wakeup() {
         flag_ = true;
     }
     ...
 private:
     bool flag_;
 };

The purpose of Gadget::Wait above is to check the flag_ member variable every second and return when flag_ has been set to true by another thread calling Wakeup. At least that is what its programmer intended, but, alas, Wait is incorrect.

Suppose the compiler figures out that Sleep(1000) is a call into an external library that cannot possibly modify the member variable flag_. Then the compiler concludes that it can cache flag_ in a register and use that register instead of accessing the slower on-board memory. This is an excellent optimization for single-threaded code, but in this case it harms correctness: after you call Wait for some Gadget object, although another thread calls Wakeup, Wait will loop forever, because the change to flag_ is never reflected in the register that caches it.

Caching variables in registers is a very valuable optimization that applies most of the time, so it would be a pity to waste it. C and C++ give you the chance to explicitly disable it. If you use the volatile modifier on a variable, the compiler won't cache that variable in registers -- each access will hit the actual memory location of that variable. So all you have to do to make Gadget's Wait/Wakeup combo work is to qualify flag_ appropriately:

 class Gadget {
 public:
     ... as above ...
 private:
     volatile bool flag_;
 };

Most explanations of the rationale and usage of volatile stop here and advise you to volatile-qualify the primitive types that you use in multiple threads. However, you can do much more with volatile, because it is part of C++'s wonderful type system.

Using volatile with User-Defined Types

You can volatile-qualify not only primitive types, but also user-defined types. In that case, volatile modifies the type in a way similar to const. (You can also apply const and volatile to the same type simultaneously.)

Unlike const, volatile discriminates between primitive types and user-defined types. Namely, unlike classes, primitive types still support all of their operations (addition, multiplication, assignment, etc.) when volatile-qualified. For example, you can assign a non-volatile int to a volatile int, but you cannot assign a non-volatile object to a volatile object.

Let's illustrate how volatile works on user-defined types with an example.

 class Gadget {
 public:
     void Foo() volatile;
     void Bar();
     ...
 private:
     String name_;
     int state_;
 };
 ...
 Gadget regularGadget;
 volatile Gadget volatileGadget;

If you think volatile is not that useful with objects, prepare for some surprise. On a volatile object, you can invoke only volatile member functions.

 volatileGadget.Foo(); // ok, volatile fun called for
                       // volatile object
 regularGadget.Foo();  // ok, volatile fun called for
                       // non-volatile object
 volatileGadget.Bar(); // error! Non-volatile function called for
                       // volatile object!

The conversion from a non-qualified type to its volatile counterpart is trivial. However, just as with const, you cannot make the trip back from volatile to non-qualified. You must use a cast:

 Gadget& ref = const_cast<Gadget&>(volatileGadget);
 ref.Bar(); // ok

A volatile-qualified class gives access only to a subset of its interface, a subset that is under the control of the class implementer. Users can gain full access to that type's interface only by using a const_cast. In addition, just like constness, volatileness propagates from the class to its members (for example, volatileGadget.name_ and volatileGadget.state_ are volatile variables).

volatile, Critical Sections, and Race Conditions

The simplest and most often-used synchronization device in multithreaded programs is the mutex. A mutex exposes the Acquire and Release primitives. Once you call Acquire in some thread, any other thread calling Acquire will block. Later, when that thread calls Release, precisely one thread blocked in an Acquire call will be let through. In other words, for a given mutex, only one thread can get processor time in between a call to Acquire and a call to Release. The executing code between a call to Acquire and a call to Release is called a critical section. (Windows terminology is a bit confusing because it calls the mutex itself a critical section, while "mutex" is actually an inter-process mutex.)

Mutexes are used to protect data against race conditions. By definition, a race condition occurs when the effect of more threads on data depends on how those threads are scheduled. Race conditions appear when two or more threads compete for using the same data. Because threads can interrupt each other at arbitrary moments in time, data can be corrupted or misinterpreted. Consequently, changes, and sometimes accesses, to data must be carefully protected with critical sections. In object-oriented programming, this usually means that you store a mutex in a class as a member variable and use it whenever you access that class's state.

Now for the volatile connection. We can draw an analogy between the C++ types' world and the threading-semantics world:

  • Outside a critical section, any thread might interrupt any other at any time; there is no control, so consequently variables accessible from multiple threads are volatile. This is in keeping with the original intent of volatile -- that of preventing the compiler from unwittingly caching values used by multiple threads at once.
  • Inside a critical section defined by a mutex, only one thread has access. Consequently, inside a critical section, the executing code has single-threaded semantics. The controlled variable is not volatile anymore -- you can remove the volatile qualifier.

In short, data shared between threads is conceptually volatile outside a critical section, and non-volatile inside a critical section.

You lock a variable by acquiring a mutex. You remove the volatile qualifier by applying a const_cast. If we manage to put these two operations together, we create a connection between C++'s type system and an application's threading semantics -- and we can make the compiler check for race conditions for us.

LockingPtr

We need a tool that collects a mutex acquisition and a const_cast. Let's develop a LockingPtr class template that you initialize with a volatile object obj and a mutex mtx. During its lifetime, a LockingPtr keeps mtx acquired. It also offers access to the volatile-stripped obj through operator-> and operator*. The const_cast is performed inside LockingPtr. The cast is semantically valid because LockingPtr keeps the mutex acquired for its lifetime, so the object cannot be accessed by any other thread in the meantime.

First, let's define the skeleton of a Mutex class with which LockingPtr will work:

 class Mutex {
 public:
     void Acquire();
     void Release();
     ...
 };

To use LockingPtr, you implement Mutex using your operating system's native data structures and synchronization primitives.

LockingPtr is templated with the type of the controlled variable. For example, if you want to control a Widget, you use a LockingPtr<Widget> that you initialize with a variable of type volatile Widget.

LockingPtr's definition is very simple. It implements an unsophisticated smart pointer and focuses solely on collecting a const_cast and a critical section.

 template <typename T>
 class LockingPtr {
 public:
     // Constructors/destructors
     LockingPtr(volatile T& obj, Mutex& mtx)
         : pObj_(const_cast<T*>(&obj)), pMtx_(&mtx)
     { mtx.Acquire(); }
     ~LockingPtr()
     { pMtx_->Release(); }
     // Pointer behavior
     T& operator*()
     { return *pObj_; }
     T* operator->()
     { return pObj_; }
 private:
     T* pObj_;
     Mutex* pMtx_;
     LockingPtr(const LockingPtr&);
     LockingPtr& operator=(const LockingPtr&);
 };

In spite of its simplicity, LockingPtr is a very useful aid in writing correct multithreaded code. You should define objects shared between threads as volatile and never use const_cast with them -- always use LockingPtr automatic objects. Let's illustrate this with an example.

Say you have two threads that share a vector<char> object:

 class SyncBuf {
 public:
     void Thread1();
     void Thread2();
 private:
     typedef vector<char> BufT;
     volatile BufT buffer_;
     Mutex mtx_; // controls access to buffer_
 };

Inside Thread1, you use a LockingPtr<BufT> to get controlled access to buffer_:

 void SyncBuf::Thread1() {
     LockingPtr<BufT> lpBuf(buffer_, mtx_);
     BufT::iterator i = lpBuf->begin();
     for (; i != lpBuf->end(); ++i) {
         ... use *i ...
     }
 }

The code is very easy to write and understand -- whenever you need to use buffer_, you must create a LockingPtr pointing to it. Once you do that, you have access to vector's entire interface.

The nice part is that if you make a mistake, the compiler will point it out:

 void SyncBuf::Thread2() {
     // Error! Cannot access 'begin' for a volatile object
     BufT::iterator i = buffer_.begin();
     // Error! Cannot access 'end' for a volatile object
     for ( ; i != lpBuf->end(); ++i ) {
         ... use *i ...
     }
 }

You cannot access any member function of buffer_ without applying a const_cast or using LockingPtr. The difference is that LockingPtr offers an ordered, disciplined way of applying const_cast to volatile variables.

LockingPtr is remarkably expressive. If you only need to call one function, you can create an unnamed temporary LockingPtr object and use it directly:

 unsigned int SyncBuf::Size() {
     return LockingPtr<BufT>(buffer_, mtx_)->size();
 }

We saw how nicely volatile protects objects against uncontrolled access and how LockingPtr provides a simple and effective way of writing thread-safe code. Let's now return to primitive types, which are treated differently by volatile.

Consider a simple example featuring a counter shared by multiple threads, an int:

 class Counter {
 public:
     ...
     void Increment() { ++ctr_; }
     void Decrement() { --ctr_; }
 private:
     int ctr_;
 };

If Increment and Decrement are to be called from different threads, the fragment above is flawed. First, ctr_ must be volatile. Second, even a seemingly atomic operation such as ++ctr_ is actually a three-step operation: it reads ctr_ from memory, increments the value, and writes it back. The thread can be interrupted between any of these steps, and another thread can interleave its own increment, so updates can be lost. Such operations are called:

RMW (Read-Modify-Write). An RMW operation reads a value from memory, computes a new value from it, and writes the new value back to memory.

For RMW operations on shared data, disabling caching is not enough: you must ensure the entire operation executes atomically.

The correct way to implement Counter, then, is again to protect the shared variable with a mutex, via LockingPtr:

 class Counter {
 public:
     ...
     void Increment() { ++*LockingPtr<int>(ctr_, mtx_); }
     void Decrement() { --*LockingPtr<int>(ctr_, mtx_); }
 private:
     volatile int ctr_;
     Mutex mtx_;
 };

Now the code is correct, but its quality is inferior when compared to SyncBuf's code. Why? Because with Counter, the compiler will not warn you if you mistakenly access ctr_ directly (without locking it). The compiler compiles ++ctr_ even though ctr_ is volatile, although the generated code is simply incorrect. So volatile's protection is weaker for primitive types, and you must exercise extra discipline.

What is the remedy? Avoid exposing shared primitive types directly; wrap them in classes, so that accidental unlocked access fails to compile -- otherwise, one forgotten lock, and you have a race!
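As an aside, in modern C++ (which post-dates this 2001 article), std::atomic provides atomic RMW operations directly, without a mutex. A minimal sketch (the names ctr and counter_demo are illustrative):

```cpp
#include <atomic>
#include <thread>
#include <vector>

std::atomic<int> ctr{0};

// ++ on a volatile int still compiles to separate read, modify, and write
// steps that threads can interleave; fetch_add performs the whole RMW
// indivisibly, so no increments are lost.
int counter_demo() {
    std::vector<std::thread> workers;
    for (int i = 0; i < 4; ++i)
        workers.emplace_back([] {
            for (int j = 0; j < 10000; ++j)
                ctr.fetch_add(1, std::memory_order_relaxed);  // atomic RMW
        });
    for (auto& w : workers)
        w.join();
    return ctr.load();  // 40000: no lost updates
}
```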

volatile Member Functions

So far we've had classes that contain volatile data members; now let's think of designing classes that in turn will be part of larger objects and shared between threads. Here is where volatile member functions can be of great help.

When designing your class, you volatile-qualify only those member functions that are thread safe. You must assume that code from the outside will call the volatile functions from any code at any time. Don't forget: volatile equals free multithreaded code and no critical section; non-volatile equals single-threaded scenario, or inside a critical section.

For example, you can define a class Widget that implements an operation in two variants -- a thread-safe one and a fast, unprotected one:

 class Widget {
 public:
     void Operation() volatile;
     void Operation();
     ...
 private:
     Mutex mtx_;
 };

Notice the use of overloading. Widget's user can now invoke Operation using a uniform syntax either on volatile objects and get thread safety, or on regular objects and get speed. The user must be careful to define the shared Widget objects as volatile.

When implementing a volatile member function, the first operation is usually to lock this with a LockingPtr. Then the work is done by calling the non-volatile sibling:

 void Widget::Operation() volatile {
     LockingPtr<Widget> lpThis(*this, mtx_);
     lpThis->Operation(); // invokes the non-volatile function
 }

When writing multithreaded programs, you can use volatile to your advantage. You must stick to the following rules:

  • Define all shared objects as volatile.
  • Don't use volatile directly with primitive types.
  • When defining shared classes, use volatile member functions to express thread safety.

If you do this, and if you use the simple generic component LockingPtr, you can write thread-safe code and worry much less about race conditions, because the compiler will worry for you and will diligently point out the spots where you are wrong.

A few projects I've been involved with use volatile and LockingPtr to great effect. The code is clean and understandable. I remember a few deadlocks, but I prefer deadlocks to race conditions because they are so much easier to debug. There were virtually no problems related to race conditions.

Many thanks to James Kanze and Sorin Jianu who helped with insightful ideas.




Andrei Alexandrescu is a Development Manager at RealNetworks Inc. (www.realnetworks.com), based in Seattle, WA, and author of the acclaimed book Modern C++ Design. He may be contacted at www.moderncppdesign.com. Andrei is also one of the featured instructors of The C++ Seminar (www.gotw.ca/cpp_seminar).

I found this article very informative regarding the use of volatile with multithreaded programming, and I hope it helps others who, like the OP, are wondering whether volatile does anything about memory fences.

0
Nov 19 '17 at 17:09
source share


