Let me preface this by saying that I know such micro-optimizations are rarely cost-effective; I'm just curious about how the hardware works. For all the cache numbers etc., I'm talking about an Intel x86-64 i5 processor. Obviously the numbers will differ for other CPUs.
I've often had the impression that walking forward through an array is faster than walking backward. I believed this was because large fetches are performed in ascending order — that is, if I read byte 128, then the cache line (assuming a 64-byte line size) covers bytes 128-191 inclusive. So if the next byte I wanted was 129, it would already be in the cache.
However, after reading up a bit, I now have the impression that it shouldn't really matter? Since cache-line alignment snaps the starting point to the nearest 64-byte boundary, if I start at byte 127 I will load bytes 64-127 inclusive, and therefore have the data in cache for my walk backward. I'll take a cache miss when crossing from 128 to 127, but that's only because I picked these addresses for the example, not for any real reason.
I know that cache lines are filled in 8-byte chunks, and so the full line would have to finish loading before the first operation if we walk backward, but I doubt that makes a hugely significant difference.
Can anyone clarify whether present me or past me is right? I've been searching all day and still couldn't find a definitive answer.
tl;dr: Does the direction in which we walk an array really matter? Does it ever matter? Did it matter in the past (up to 15 years ago or so)?
I tested with the following basic code and see the same results walking forward and backward:
#include <windows.h>
I apologize for A) the Windows-specific code and B) how hacky it is. It was thrown together to test a hypothesis, not as a rigorous proof.
It would be great to get any insight into how walking direction can matter, not only with respect to the cache but in other ways as well!
c++ caching memory
Mike b