Your answer, as far as I can see, is in the comments: MESI keeps the caches coherent, not the store/load buffers. But the description of `LOCK CMPXCHG` says that locked operations serialize all outstanding load and store operations. That is why it has to drain the store/load buffers of this CPU (and not those of the other CPUs, as detailed here).
Thus, the current processor must perform the atomic operation on the most recent value, which may still be sitting in its store/load buffers, so a fence is needed to actually commit those buffered values before the locked operation runs.
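To make this concrete, here is a minimal C++ sketch, not taken from the question. The names `data`, `flag` and `publish` are hypothetical. It assumes an x86 target, where compilers typically emit a plain `mov` for the relaxed store and `lock cmpxchg` for `compare_exchange_strong`; the exact instructions depend on the compiler and flags.

```cpp
#include <atomic>

// Hypothetical shared variables, for illustration only.
std::atomic<int> data{0};
std::atomic<int> flag{0};

// Writes `value`, then tries to flip `flag` from 0 to 1.
// Returns true if this call performed the 0 -> 1 transition.
bool publish(int value) {
    // Relaxed store: on x86 this is usually a plain mov, so the value
    // may sit in this core's store buffer for a while.
    data.store(value, std::memory_order_relaxed);

    int expected = 0;
    // On x86 this usually becomes LOCK CMPXCHG. The locked instruction
    // acts as a full barrier for this core: earlier stores, including
    // the one to `data`, are committed before it completes.
    return flag.compare_exchange_strong(expected, 1,
                                        std::memory_order_seq_cst);
}
```

A thread that observes `flag == 1` with an acquire (or stronger) load is then guaranteed to see the value written to `data` by the call that set the flag. Only the calling core's buffers are drained; the other cores learn about the new values through the normal cache coherence traffic.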