Is the example on the membarrier man page meaningless on x86?

Maybe it's just me, but the example on the man 2 page for membarrier seems meaningless.

In principle, membarrier() is an asynchronous memory barrier: given two cooperating pieces of code (call them the fast path and the slow path), it lets you move the entire hardware cost of the barrier to the slow path and leave the fast path with only a compiler barrier [1]. There are several ways to implement membarrier's behavior, for example sending an IPI to every processor involved, or waiting for the code running on each processor to be descheduled, but the exact implementation details are not important here.
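To make the call itself concrete, here is a minimal sketch of how membarrier() is typically reached from user space on Linux. It assumes the call goes through syscall(2) because glibc has historically not exported a wrapper; the helper names membarrier_sketch and membarrier_supported_sketch are hypothetical and not taken from the man page.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/membarrier.h>   /* MEMBARRIER_CMD_* constants */
#include <sys/syscall.h>        /* __NR_membarrier */
#include <unistd.h>             /* syscall() */

/* Hypothetical thin wrapper around the raw system call.
 * The third argument (cpu_id) is ignored when flags == 0. */
static int membarrier_sketch(int cmd, unsigned int flags)
{
    return (int)syscall(__NR_membarrier, cmd, flags, 0);
}

/* Returns 1 if the running kernel reports support for `cmd`, else 0.
 * MEMBARRIER_CMD_QUERY returns a bitmask of supported commands,
 * or -1 if the system call is unavailable or blocked. */
static int membarrier_supported_sketch(int cmd)
{
    assert(cmd != MEMBARRIER_CMD_QUERY);   /* QUERY is 0, not a bit */

    int mask = membarrier_sketch(MEMBARRIER_CMD_QUERY, 0);
    if (mask < 0)
        return 0;
    return (mask & cmd) != 0;
}
```

A caller would check membarrier_supported_sketch(MEMBARRIER_CMD_SHARED) once at startup and only then rely on membarrier_sketch(MEMBARRIER_CMD_SHARED, 0) on the slow path.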

Now, here is the example transformation given in the man page:

Original code

static volatile int a, b;

static void
fast_path(void)
{
   int read_a, read_b;

   read_b = b;
   asm volatile ("mfence" : : : "memory");
   read_a = a;

   /* read_b == 1 implies read_a == 1. */

   if (read_b == 1 && read_a == 0)
       abort();
}

static void
slow_path(void)
{
   a = 1;
   asm volatile ("mfence" : : : "memory");
   b = 1;
}

Converted Code

(some syscall wrapper and initialization code omitted)

static volatile int a, b;

static void
fast_path(void)
{
   int read_a, read_b;

   read_b = b;
   asm volatile ("" : : : "memory");
   read_a = a;

   /* read_b == 1 implies read_a == 1. */

   if (read_b == 1 && read_a == 0)
       abort();
}

static void
slow_path(void)
{
   a = 1;
   membarrier(MEMBARRIER_CMD_SHARED, 0);
   b = 1;
}

Here slow_path performs two writes (a, then b), separated by a barrier, and fast_path performs two reads (b, then a), also separated by a barrier.

However, the x86 memory model does not allow load-load or store-store reordering! So, as far as I can tell, membarrier() is not required at all in these scenarios, nor is the mfence needed in the original code. It seems that plain compiler barriers in both places would be enough [2].
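As a sketch of that claim, this is what the man page example looks like with only compiler barriers on both sides. It relies entirely on x86 keeping loads ordered with loads and stores ordered with stores, so it is not a portable pattern; the function names fast_path_x86 and slow_path_x86 and the compiler_barrier macro are hypothetical, and the inline asm uses GCC/Clang syntax.

```c
#include <assert.h>

static volatile int a, b;

/* Compiler-only barrier: stops the compiler from moving memory
 * accesses across it, but emits no fence instruction. */
#define compiler_barrier() __asm__ __volatile__("" : : : "memory")

/* Reader: load b, then a. Returns the value read from b. */
static int fast_path_x86(void)
{
    int read_b = b;
    compiler_barrier();
    int read_a = a;

    /* read_b == 1 implies read_a == 1 (x86 assumption: no
     * load-load reordering here, no store-store in the writer). */
    assert(!(read_b == 1 && read_a == 0));
    return read_b;
}

/* Writer: store a, then b. */
static void slow_path_x86(void)
{
    a = 1;
    compiler_barrier();
    b = 1;
}
```

On a weakly ordered architecture the same code could trip the assertion, which is where a real barrier (or membarrier() on the slow path) would earn its keep.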

An example that really makes sense, IMO, would have a store followed by a load, separated by a barrier, on the fast path.
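A minimal sketch of such a store-then-load (Dekker-style) pattern follows. Here the fast path would need an mfence on x86 if the slow path used an ordinary fence, because a store may be reordered with a later load to a different address; with membarrier() on the slow path, the fast path keeps only a compiler barrier. The variable and function names are hypothetical, the raw syscall(2) invocation is an assumption about how the call is reached, and error handling is reduced to returning -1.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/membarrier.h>
#include <sys/syscall.h>
#include <unistd.h>

static volatile int x, y;

/* Fast path: store x, then load y, with only a compiler barrier. */
static int fast_path_dekker(void)
{
    x = 1;
    __asm__ __volatile__("" : : : "memory");
    return y;
}

/* Slow path: store y, issue membarrier(), then load x.
 * Returns the value read from x, or -1 if membarrier() failed
 * (in which case the ordering guarantee does not hold). */
static int slow_path_dekker(void)
{
    y = 1;
    if (syscall(__NR_membarrier, MEMBARRIER_CMD_SHARED, 0, 0) != 0)
        return -1;
    return x;
}

/* Intended invariant: the two paths must not both read 0. */
static void check_dekker(int r_fast, int r_slow)
{
    assert(!(r_fast == 0 && r_slow == 0));
}
```

Run from two threads, the results of fast_path_dekker() and slow_path_dekker() would be passed to check_dekker(); the sketch leaves out thread creation and the startup query for membarrier() support.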

- ?


1 ( ), - , , .

2 , , , , x86, membarrier() x86.


. x86 membarrier() . , , Intel SDM, x86:

Intel SDM Vol. 3 §8.2.3.2, "Neither Loads Nor Stores Are Reordered with Like Operations"


man-, Dekker. . https://lkml.org/lkml/2017/9/18/779

, membarrier x86, Linux.

!
