Background: I've implemented a stochastic algorithm that requires random ordering for good convergence. This obviously destroys memory locality. However, I found that prefetching the next iteration's data minimizes the performance drop.
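To make the access pattern concrete, here is a minimal sketch of a shuffled-order loop that hints the next iteration's data into cache. It is an illustration only: `process_row` and `visit_shuffled` are hypothetical names, the row layout (contiguous rows of `dim` doubles) is an assumption, and only the first cache line of the next row is prefetched.

```cpp
#include <xmmintrin.h>  // _mm_prefetch, _MM_HINT_T0 (x86/x64 only)
#include <cstddef>
#include <vector>

// Hypothetical per-sample work: sums one row of `dim` doubles.
inline double process_row(const double* row, std::size_t dim) {
    double s = 0.0;
    for (std::size_t j = 0; j < dim; ++j) s += row[j];
    return s;
}

// Visits rows in the (already shuffled) order given by `order`.
// Before working on the current row, it hints the start of the
// next row into cache so the miss overlaps with useful work.
inline double visit_shuffled(const std::vector<double>& data,
                             std::size_t dim,
                             const std::vector<std::size_t>& order) {
    double total = 0.0;
    for (std::size_t i = 0; i < order.size(); ++i) {
        if (i + 1 < order.size()) {
            const double* next = data.data() + order[i + 1] * dim;
            // Only a hint: it covers the cache line containing `next`.
            _mm_prefetch(reinterpret_cast<const char*>(next), _MM_HINT_T0);
        }
        total += process_row(data.data() + order[i] * dim, dim);
    }
    return total;
}
```

Rows larger than one cache line would need one prefetch per line, which is where the line length starts to matter.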
I can prefetch n cache lines using _mm_prefetch in a simple way that is mostly portable across OSes and compilers, but how long is a cache line? Right now I am using a hardcoded value of 64 bytes, which seems to be the norm on x64 processors, but I don't know how to detect this at runtime, and as of last year there was no easy solution to this.
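A minimal sketch of that hardcoded approach, assuming a 64-byte line, might look like the following. The helper names `lines_spanned` and `prefetch_range` are made up for illustration, and the `_MM_HINT_T0` hint is one choice among several.

```cpp
#include <xmmintrin.h>  // _mm_prefetch, _MM_HINT_T0 (x86/x64 only)
#include <cstddef>
#include <cstdint>

#define CACHE_LINE_LEN 64  // assumed value, not detected at runtime

// Number of cache lines touched by the byte range [p, p + bytes).
inline std::size_t lines_spanned(const void* p, std::size_t bytes) {
    if (bytes == 0) return 0;
    const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
    const std::uintptr_t first = addr / CACHE_LINE_LEN;
    const std::uintptr_t last = (addr + bytes - 1) / CACHE_LINE_LEN;
    return static_cast<std::size_t>(last - first + 1);
}

// Issues one prefetch hint per cache line covering [p, p + bytes).
inline void prefetch_range(const void* p, std::size_t bytes) {
    if (bytes == 0) return;
    const char* c = static_cast<const char*>(p);
    // Stepping by the line length from p lands in each successive line.
    for (std::size_t off = 0; off < bytes; off += CACHE_LINE_LEN) {
        _mm_prefetch(c + off, _MM_HINT_T0);
    }
    // When p is not line-aligned, the final byte may sit in one more line.
    _mm_prefetch(c + bytes - 1, _MM_HINT_T0);
}
```

If the real line is longer than 64 bytes, this issues some redundant hints; if it is shorter, some lines are skipped.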
I saw GetLogicalProcessorInformation on Windows, but I'd rather not use such a complex API for something so simple, and it won't work on Macs or Linux anyway.
Perhaps there is some completely different API or intrinsic that can prefetch a region of memory specified in bytes (or words, or something else), so that I can prefetch without knowing the cache line length?
Basically, is there a reasonable alternative to _mm_prefetch with #define CACHE_LINE_LEN 64?
c++ caching 64bit
Eamon Nerbonne