Is a function call an effective memory barrier on modern platforms?

In a codebase I reviewed, I found the following idiom.

    void notify(struct actor_t act) { write(act.pipe, "M", 1); }

    // thread A sending data to thread B
    void send(byte *data) {
        global.data = data;
        notify(threadB);
    }

    // in thread B event loop
    read(this.sock, &cmd, 1);
    switch (cmd) {
        case 'M': use_data(global.data); break;
        ...
    }

"Hold it," I said to the author, a senior member of my team, "there's no memory barrier here! You can't guarantee that global.data will be flushed from the cache to main memory. If thread A and thread B run on two different processors, this scheme may fail."

The senior programmer grinned and explained slowly, as if teaching his five-year-old how to tie his shoelaces: "Listen, kid, we've seen plenty of thread-related bugs here, in high-load testing and at real customers," he paused to scratch his long beard, "but we've never had a bug with this idiom."

"But it is said in the book ..."

"Quiet!" he hushed me promptly. "Maybe theoretically it's not guaranteed, but in practice the fact that you used a function call is effectively a memory barrier. The compiler will not reorder the instruction global.data = data, since it can't know whether anyone uses it inside the function call, and the x86 architecture will ensure that the other CPUs see this piece of global data by the time thread B reads the command from the pipe. Rest assured, we have enough real-world problems to worry about. We don't need to invest extra effort in bogus theoretical problems."

"Rest assured, my boy, in time you will understand to separate the real problem from the problems associated with I-need-to-get-a-PhD."

Is he right? Is this really a non-issue in practice (say, on x86, x64 and ARM)?

It goes against everything I learned, but he does have a long beard and a really smart look!

Extra points if you can show me a piece of code proving him wrong!

+56
c multithreading memory-barriers
May 22 '12 at 8:16
4 answers

Memory barriers aren't only there to prevent instruction reordering. Even if instructions aren't reordered, there can still be cache-coherence problems. As for reordering, it depends on your compiler and settings. ICC is particularly aggressive with reordering; MSVC with whole-program optimization can be, too.

If your shared data variable is declared volatile, then even though it's not in the spec, most compilers will generate a memory fence around reads and writes of that variable and prevent reordering. This is not the correct way to use volatile, nor what it was meant for.

(If I had any votes left, I'd upvote your question for the storytelling.)

+8
May 22, '12 at 8:21

In practice, a function call is a compiler barrier, meaning the compiler will not move global memory accesses past the call. A caveat is functions the compiler knows something about, e.g. builtins, inlined functions (keep IPO in mind!), etc.

So in theory a processor memory barrier (in addition to the compiler barrier) is needed to make this work. However, since you're calling read and write, which are syscalls that change global state, I'm fairly sure the kernel issues memory barriers somewhere in their implementation. There is no such guarantee, though, so in theory you need the barriers.

+8
May 22 '12 at 8:24 AM

The basic rule is: the compiler must make the global state appear exactly as you coded it, but if it can prove that a given function doesn't use global variables, then it can implement the algorithm any way it chooses.

The upshot is that traditional compilers always treated functions in another compilation unit as a memory barrier, because they couldn't see inside those functions. Increasingly, modern compilers are growing "whole program" or "link time" optimization strategies that break down these barriers and will cause poorly written code to fail, even though it has worked fine for years.

If the function in question is in a shared library, the compiler won't be able to see inside it; but if it is a function defined by the C standard, then it doesn't need to (it already knows what the function does), so you have to be careful with those, too. Note that the compiler won't recognize a kernel call for what it is, but the very act of inserting something it can't recognize (inline assembler, or a call into an assembler file) creates a memory barrier in itself.

In your case, notify will either be a black box the compiler can't see inside (a library function) or it will contain a recognizable memory barrier, so you are most likely safe.

In practice, you have to write very bad code to trip over this.

+2
May 22 '12 at 9:34

In practice he is correct, and in this specific case a memory barrier is implied.

But the point is that if its presence is "debatable", the code is already too complex and unclear.

Really, guys, use a mutex or other proper constructs. It's the only safe way to deal with threads and to write maintainable code.

And you may spot other errors; for example, the code is unpredictable if send() is called more than once.

+1
May 28 '12 at 16:32
