If I use DMA for RAM ↔ GPU transfers in CUDA C++, how can I be sure that the memory will be read from the pinned (page-locked) RAM and not from the CPU cache?
After all, with DMA the CPU does not know that someone else has changed memory, so it sees no need for CPU synchronization (cache ↔ RAM). And as far as I know, a C++11 memory fence such as std::atomic_thread_fence() does not help with DMA: it will not force a read from RAM, but only enforces ordering with respect to the L1/L2/L3 caches. Moreover, in the general case there is no protocol for resolving a conflict between the cache and RAM on the processor — only the cache-coherence protocols between the L1/L2/L3 cache levels of a CPU and between processors in NUMA systems: MOESI/MESIF.
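For context, here is a minimal sketch of the usual pinned-memory pattern with the standard CUDA runtime API (cudaHostAlloc, cudaMemcpy). On x86 platforms the GPU's PCIe DMA engine snoops the CPU caches, so transfers to and from page-locked memory are hardware cache-coherent and no manual flush is needed; the buffer name and kernel are illustrative only:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Illustrative kernel: increment every element in place.
__global__ void addOne(int* p, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) p[i] += 1;
}

int main() {
    const int n = 1 << 20;
    int* host = nullptr;  // pinned (page-locked) host buffer
    int* dev  = nullptr;  // device buffer

    // cudaHostAlloc returns page-locked memory the GPU can DMA directly.
    cudaHostAlloc(&host, n * sizeof(int), cudaHostAllocDefault);
    cudaMalloc(&dev, n * sizeof(int));

    for (int i = 0; i < n; ++i) host[i] = i;

    cudaMemcpy(dev, host, n * sizeof(int), cudaMemcpyHostToDevice);
    addOne<<<(n + 255) / 256, 256>>>(dev, n);
    cudaMemcpy(host, dev, n * sizeof(int), cudaMemcpyDeviceToHost);

    // cudaMemcpy is synchronous here; once it returns, the CPU reads
    // up-to-date values -- on x86 the DMA write snoops/invalidates any
    // stale cache lines, so no explicit cache flush is required.
    printf("host[42] = %d\n", host[42]);  // expect 43

    cudaFreeHost(host);
    cudaFree(dev);
    return 0;
}
```

Note that on non-coherent platforms (some ARM SoCs), the driver instead maps pinned memory uncached or performs cache maintenance for you, so the same API keeps working.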
Tags: synchronization, caching, gpgpu, cuda, dma
Alex