I am working with a dual Cortex-A9 system and I have been trying to understand exactly why spinlock functions need to use DMB. It seems that as long as the merging store buffer is flushed, the lock value should end up in the L1 cache of the unlocking core, and the SCU should either invalidate or update the value in the L1 of the other core. Is this enough to maintain coherency and safe locking? And doesn't STREX bypass the merging store buffer anyway, meaning we don't even need the flush?
DMB seems like something of a blunt hammer, especially since it defaults to the system domain. That probably means a write all the way to main memory, which can be expensive.
Are the DMBs in locks there a workaround for drivers that are not using smp_mb correctly?
Based on the performance counters, I currently see about 5% of my system cycles disappearing in stalls caused by DMB.
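For reference, here is a minimal sketch of the kind of lock in question, written with portable C11 atomics rather than the kernel's actual ARM assembly. The names `sketch_spinlock_t`, `sketch_lock`, `sketch_trylock`, and `sketch_unlock` are hypothetical and exist only for illustration. On ARMv7, compilers typically lower the acquire and release orderings below to an LDREX/STREX loop plus a DMB, which is where the barriers being asked about come from; that lowering is an assumption about the toolchain, not something the code itself guarantees.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical spinlock type, for illustration only. */
typedef struct {
    atomic_flag locked;
} sketch_spinlock_t;

/* Try once to take the lock; returns true on success.
 * Acquire ordering keeps accesses in the critical section from
 * being reordered before the lock is observed as taken. */
static bool sketch_trylock(sketch_spinlock_t *l)
{
    return !atomic_flag_test_and_set_explicit(&l->locked,
                                              memory_order_acquire);
}

/* Spin until the lock is taken. */
static void sketch_lock(sketch_spinlock_t *l)
{
    while (!sketch_trylock(l)) {
        /* busy-wait; a real implementation might use WFE here */
    }
}

/* Release ordering makes writes done inside the critical section
 * visible before the lock is observed as free by another core. */
static void sketch_unlock(sketch_spinlock_t *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

The point of the sketch is that the barrier orders the data accesses inside the critical section against the lock word itself. Coherency of the lock word alone, as provided by the SCU, does not say anything about that ordering.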