Do spinlocks really need DMB?

I am working with a dual-core Cortex-A9 system, and I have been trying to understand exactly why spinlock functions need to use DMB. It seems that as long as the store buffer has been drained, the lock value should end up in the L1 of the unlocking core, and the SCU should either invalidate or update the value in the L1 of the other core. Isn't that enough to maintain coherency and safe locking? And doesn't STREX skip the merging store buffer anyway, meaning we don't even need the flush?

DMB seems a bit of a blunt hammer, especially since it uses the system domain by default, which probably means writing all the way to main memory, which can be expensive.

Are the DMBs in the locks there as a workaround for drivers that do not use smp_mb correctly?

Based on the performance counters, I am currently seeing about 5% of my system cycles disappear in stalls caused by DMB.

2 answers

I found that these articles can answer your question:

In particular:

You will notice the Data Memory Barrier (DMB) instructions that appear after the lock has been acquired. The DMB guarantees that all memory accesses before the barrier will be observed by all other CPUs in the system before any memory accesses made after the barrier. This makes sense when you consider that once the lock has been acquired, the program will go on to access the data structure protected by the lock. The DMB in the lock function above ensures that accesses to the locked data structure are observed after accesses to the lock.
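The lock function that the quoted passage refers to is not reproduced here. As a rough illustration of where the barriers sit, here is a minimal sketch in C11 atomics rather than ARM assembly. The names `demo_spinlock`, `demo_trylock`, `demo_lock` and `demo_unlock` are hypothetical and not taken from the articles. The assumption is that on ARMv7 a compiler typically lowers the relaxed compare-exchange to an LDREX/STREX loop and the fences to DMB instructions, but that depends on the toolchain.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical spinlock type for illustration: 0 = free, 1 = held. */
typedef struct {
    atomic_int locked;
} demo_spinlock;

/* Try to take the lock once; returns true on success. */
static inline bool demo_trylock(demo_spinlock *l)
{
    int expected = 0;
    /* Relaxed read-modify-write: claims the lock word but orders nothing else. */
    if (!atomic_compare_exchange_strong_explicit(&l->locked, &expected, 1,
                                                 memory_order_relaxed,
                                                 memory_order_relaxed))
        return false;
    /* Barrier after the acquisition: accesses in the critical section
       may not be ordered before the access to the lock word. */
    atomic_thread_fence(memory_order_acquire);
    return true;
}

/* Spin until the lock is taken. */
static inline void demo_lock(demo_spinlock *l)
{
    while (!demo_trylock(l)) {
        /* busy-wait */
    }
}

/* Release the lock. */
static inline void demo_unlock(demo_spinlock *l)
{
    /* Barrier before the release: accesses in the critical section
       must be visible before the lock word is seen as free. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&l->locked, 0, memory_order_relaxed);
}
```

Without the two fences, the lock word itself would still be updated atomically, but nothing would order the protected data relative to it, which is the point the quote is making.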


DMB is necessary in the SMP case because, without it, another processor could observe the memory accesses in a different order. That is, accesses from within the critical section could appear, from the second core's point of view, to happen before the lock is taken.

Thus the second core could see itself holding the lock while also seeing updates from inside the critical section still running on the other core, which breaks consistency.
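The same ordering problem can be shown with a simpler handoff, sketched here in C11 under stated assumptions: `payload` stands in for the data protected by the lock and `ready` stands in for the lock word. Both names, and the functions `publish` and `try_consume`, are made up for this illustration. The release store and acquire load play the role that the barriers play in the lock and unlock paths.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;        /* plain data, stands in for the protected structure */
static atomic_bool ready;  /* stands in for the lock word; starts out false */

/* Writer side: update the data, then signal. */
static void publish(int value)
{
    payload = value;
    /* Release ordering: the write to payload must be visible to any
       core that observes ready == true. */
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Reader side: check the signal, then read the data. */
static bool try_consume(int *out)
{
    /* Acquire ordering: the read of payload may not be ordered
       before the read of ready. */
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;
    *out = payload;
    return true;
}
```

If both operations on `ready` were relaxed, cache coherency would still deliver the new value of `ready` to the other core, but that core would be allowed to observe it before the new `payload`. Coherency keeps each location consistent; it does not order accesses to different locations, which is what the barrier is for.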
