How can I prefetch a memory region most simply?

Background: I implemented a stochastic algorithm that requires random ordering for better convergence. However, this clearly destroys memory locality. I found that prefetching the next iteration's data minimizes the performance drop.

I can prefetch n cache lines using _mm_prefetch in a simple way that is mostly portable across OSes and compilers, but how long is a cache line? Right now I am using a hardcoded value of 64, which seems to be the norm on x64 processors these days, but I don't know how to detect this at runtime, and as of last year there seemed to be no easy solution to this.

I saw GetLogicalProcessorInformation on Windows, but I would rather not use such a complex API for something so simple, and it won't work on Macs or Linux anyway.

Perhaps there is some entirely different API or intrinsic that can prefetch a region of memory specified in bytes (or words, or whatever), letting me prefetch without knowing the cache line length?

Basically, is there a reasonable alternative to _mm_prefetch with #define CACHE_LINE_LEN 64?
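For concreteness, the hardcoded approach described above might look roughly like the sketch below. The helper name `prefetch_region` and the choice of the `_MM_HINT_T0` hint are assumptions made for illustration, not something taken from the original code; the sketch also assumes an x86/x64 target where `<xmmintrin.h>` is available.

```cpp
#include <cstddef>
#include <cstdint>
#include <xmmintrin.h>  // _mm_prefetch, _MM_HINT_T0 (x86/x64 only)

// Assumed line size: 64 bytes, the usual value on current x64 processors.
#define CACHE_LINE_LEN 64

// Hypothetical helper: issues one prefetch per cache line that overlaps
// [ptr, ptr + bytes) and returns the number of prefetches issued.
inline std::size_t prefetch_region(const void* ptr, std::size_t bytes) {
    if (bytes == 0) return 0;
    const std::uintptr_t mask = ~static_cast<std::uintptr_t>(CACHE_LINE_LEN - 1);
    const std::uintptr_t start = reinterpret_cast<std::uintptr_t>(ptr);
    const std::uintptr_t first = start & mask;                // line holding the first byte
    const std::uintptr_t last = (start + bytes - 1) & mask;   // line holding the last byte
    std::size_t count = 0;
    for (std::uintptr_t line = first;; line += CACHE_LINE_LEN) {
        _mm_prefetch(reinterpret_cast<const char*>(line), _MM_HINT_T0);
        ++count;
        if (line == last) break;
    }
    return count;
}
```

A caller would pass the address and byte size of the next iteration's data, for example `prefetch_region(&items[next], sizeof(items[next]))`. Only the `CACHE_LINE_LEN` constant depends on the processor, which is the part the question is about.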

c++ caching 64bit
1 answer

Much the same question has been asked here before. You can read the line size from CPUID if you are willing to delve into some assembly. This will, of course, require platform-specific code.
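As a rough illustration of the CPUID route, the sketch below uses compiler intrinsics rather than hand-written assembly. It reads the CLFLUSH line size from CPUID leaf 1 (EBX bits 15:8, reported in units of 8 bytes), which matches the cache line size on common x86 processors. The function name `cache_line_size` and the fallback of 64 are assumptions for illustration, and a stricter version would also check the CLFSH feature bit (EDX bit 19) before trusting the value.

```cpp
#include <cstddef>
#if defined(_MSC_VER)
#include <intrin.h>   // __cpuid (MSVC)
#elif defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>    // __get_cpuid (GCC/Clang)
#endif

// Hypothetical helper: returns the line size in bytes reported by CPUID
// leaf 1, or `fallback` when CPUID is unavailable or reports zero.
inline std::size_t cache_line_size(std::size_t fallback = 64) {
#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_IX86))
    int regs[4] = {0, 0, 0, 0};
    __cpuid(regs, 1);
    const unsigned ebx = static_cast<unsigned>(regs[1]);
#elif defined(__x86_64__) || defined(__i386__)
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) return fallback;
#else
    const unsigned ebx = 0;  // not an x86 target: fall through to the fallback
#endif
    // EBX bits 15:8 hold the CLFLUSH line size in 8-byte units.
    const std::size_t bytes = ((ebx >> 8) & 0xFFu) * 8u;
    return bytes != 0 ? bytes : fallback;
}
```

Calling this once at startup and caching the result would replace the hardcoded constant with a runtime value on x86, while other architectures simply get the fallback.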

You are probably already familiar with Agner Fog's optimization guide, which provides cache information for many popular processors. If you can determine which CPUs you expect to encounter, you can simply hardcode their cache line sizes and look up the CPU vendor information at runtime to select the line size.
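The vendor-lookup idea could be sketched along these lines. The helpers `cpu_vendor` and `line_size_for_vendor` are hypothetical names, and the table entries are illustrative placeholders that simply reflect the 64-byte norm mentioned in the question; they are not copied from the guide and should be checked against it for the processors you actually target.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#if defined(_MSC_VER)
#include <intrin.h>   // __cpuid (MSVC)
#elif defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>    // __get_cpuid (GCC/Clang)
#endif

// Hypothetical helper: returns the 12-character vendor string from CPUID
// leaf 0 (stored in EBX, EDX, ECX order), or "" where CPUID is unavailable.
inline std::string cpu_vendor() {
    unsigned ebx = 0, ecx = 0, edx = 0;
#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_IX86))
    int regs[4] = {0, 0, 0, 0};
    __cpuid(regs, 0);
    ebx = static_cast<unsigned>(regs[1]);
    ecx = static_cast<unsigned>(regs[2]);
    edx = static_cast<unsigned>(regs[3]);
#elif defined(__x86_64__) || defined(__i386__)
    unsigned eax = 0;
    if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx)) return std::string();
#else
    return std::string();
#endif
    char name[12];
    std::memcpy(name + 0, &ebx, 4);
    std::memcpy(name + 4, &edx, 4);
    std::memcpy(name + 8, &ecx, 4);
    return std::string(name, 12);
}

struct VendorLine { const char* vendor; std::size_t line_size; };

// Hypothetical lookup: placeholder values, to be verified per target CPU.
inline std::size_t line_size_for_vendor(const std::string& vendor,
                                        std::size_t fallback = 64) {
    static const VendorLine table[] = {
        {"GenuineIntel", 64},  // assumed value
        {"AuthenticAMD", 64},  // assumed value
    };
    for (const VendorLine& entry : table)
        if (vendor == entry.vendor) return entry.line_size;
    return fallback;
}
```

Usage would be something like `line_size_for_vendor(cpu_vendor())` at startup. A real table would likely need to key on family and model as well as vendor, since line sizes have differed between processor generations.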
