The MESI protocol makes the memory caches effectively invisible. This means that multithreaded programs don't have to worry about a core reading stale data from them, or about two cores writing to different parts of a cache line and having half of one write and half of the other sent to main memory.
However, this doesn't help with read-modify-write operations such as increment, compare-and-swap, and so on. The MESI protocol won't stop two cores from each reading the same chunk of memory, each adding one to it, and then each writing the same value back, turning two increments into one.
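The lost update described above can be replayed by hand. This is only a toy sketch in Python, not real hardware behavior: the names `memory`, `reg_a`, and `reg_b` are made up for illustration, and the interleaving is fixed so the outcome is deterministic.

```python
# Toy model: one shared memory location and two "cores",
# each with its own private register (hypothetical names).
memory = {"counter": 0}

# Both cores read the value before either one writes it back.
reg_a = memory["counter"]  # core A reads 0
reg_b = memory["counter"]  # core B reads 0

# Each core adds one to its private copy.
reg_a += 1
reg_b += 1

# Each core writes its result back; the second write
# overwrites the first with the same value.
memory["counter"] = reg_a
memory["counter"] = reg_b

print(memory["counter"])  # prints 1, although two increments ran
```

Every individual read and write here is coherent, which is all MESI promises. The problem is the gap between the read and the write.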
On modern CPUs, the LOCK prefix locks the cache line so that the read-modify-write operation is logically atomic. The step lists that follow are oversimplified, but hopefully they'll give you the idea.
Unlocked increment:
- Acquire the cache line; shareable is fine. Read the value.
- Add one to the read value.
- Acquire the cache line exclusively (if it is not already E or M) and lock it.
- Write the new value to the cache line.
- Change the cache line to modified and unlock it.
Locked increment:
- Acquire the cache line exclusively (if it is not already E or M) and lock it.
- Read the value.
- Add one to it.
- Write the new value to the cache line.
- Change the cache line to modified and unlock it.
Notice the difference? In the unlocked increment, the cache line is only locked during the write operation, just like all writes. In the locked increment, the cache line is held across the entire instruction, all the way from the read operation to the write operation, including the increment itself.
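The two step sequences can be compared in a small simulation. This is a rough sketch, not a description of how any CPU implements locking: `CacheLine`, `unlocked_increment`, `locked_increment`, and `run_round_robin` are hypothetical names, each `yield` marks a point where the other core gets a turn, and the scheduler simply alternates between two cores.

```python
class CacheLine:
    """Toy cache line: one value plus the core that has it locked, if any."""
    def __init__(self):
        self.value = 0
        self.locked_by = None

def unlocked_increment(line, core):
    value = line.value          # read; a shared line is fine
    yield                       # the other core may run here
    value += 1                  # add one to the private copy
    yield
    while line.locked_by is not None:
        yield                   # wait for the line to be free
    line.locked_by = core       # lock only for the write itself
    line.value = value
    line.locked_by = None       # mark modified and unlock

def locked_increment(line, core):
    while line.locked_by is not None:
        yield                   # wait for the line to be free
    line.locked_by = core       # lock before reading
    value = line.value          # read
    yield                       # the other core runs but finds the line locked
    value += 1
    yield
    line.value = value          # write
    line.locked_by = None       # mark modified and unlock

def run_round_robin(make_increment):
    """Alternate two cores one step at a time; return the final value."""
    line = CacheLine()
    cores = [make_increment(line, "A"), make_increment(line, "B")]
    while cores:
        for gen in list(cores):
            try:
                next(gen)
            except StopIteration:
                cores.remove(gen)
    return line.value

print(run_round_robin(unlocked_increment))  # 1: one increment was lost
print(run_round_robin(locked_increment))    # 2: both increments survived
```

In the locked version, core B cannot even read the value until core A has written its result, which is what makes the whole operation behave as one indivisible step.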
Also, some CPUs have things other than memory caches that can affect memory visibility. For example, some CPUs have a read prefetcher or a posted write buffer that can result in memory operations executing out of order. Where needed, the LOCK prefix (or equivalent functionality on other CPUs) will also do whatever needs to be done to handle memory operation ordering issues.
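To see why a posted write buffer matters, here is a toy model of one, again only a sketch under assumptions: the `Core` class, its `drain` method, and the `litmus` function are invented for illustration, and `drain` merely stands in for whatever a locked instruction or barrier actually does on a given CPU.

```python
class Core:
    """Toy core with a posted write buffer in front of shared memory."""
    def __init__(self, memory):
        self.memory = memory
        self.write_buffer = []  # pending (address, value) stores

    def store(self, addr, value):
        # The store is queued and is not yet visible to other cores.
        self.write_buffer.append((addr, value))

    def load(self, addr):
        # A core sees its own pending stores first, then shared memory.
        for a, v in reversed(self.write_buffer):
            if a == addr:
                return v
        return self.memory[addr]

    def drain(self):
        # Stand-in for a barrier: make all pending stores visible.
        for a, v in self.write_buffer:
            self.memory[a] = v
        self.write_buffer.clear()

def litmus(use_barrier):
    """Each core stores to one variable, then loads the other."""
    memory = {"x": 0, "y": 0}
    c0, c1 = Core(memory), Core(memory)
    c0.store("x", 1)
    c1.store("y", 1)
    if use_barrier:
        c0.drain()
        c1.drain()
    r0 = c0.load("y")
    r1 = c1.load("x")
    return r0, r1

print(litmus(use_barrier=False))  # (0, 0): each load ran ahead of the other core's store
print(litmus(use_barrier=True))   # (1, 1)
```

Without the drain, both cores read 0 even though each one issued its store before its load. That result is impossible if the operations take effect in program order.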
David Schwartz