Is prefetching triggered by the stream of exact addresses or the stream of cache lines?

On modern x86 processors, hardware prefetching is an important technique for bringing cache lines into various levels of the cache hierarchy before user code explicitly requests them.

The basic idea is that when the processor detects a series of accesses to sequential or strided-sequential locations [1], it goes ahead and fetches further memory locations in the sequence, even before executing the instructions that (may) actually access those locations.

My question is whether detection of a prefetch sequence is based on full addresses (the actual addresses requested by user code) or on cache line addresses, which are essentially the full address with the lower 6 bits [2] stripped off.

For example, on a system with 64-byte cache lines, accesses to full addresses 1, 2, 3, 65, 150 touch cache lines 0, 0, 0, 1, 2.
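
As a minimal sketch of that mapping, assuming 64-byte lines (so the low 6 bits are the offset within a line), the cache line index is just the address shifted right by 6. The names `LINE_SIZE`, `OFFSET_BITS`, and `cache_line` are illustrative only:

```python
LINE_SIZE = 64                            # bytes per cache line (assumption)
OFFSET_BITS = LINE_SIZE.bit_length() - 1  # 6 low bits select the byte within a line


def cache_line(addr: int) -> int:
    """Drop the within-line offset bits to get the cache line index."""
    return addr >> OFFSET_BITS


addresses = [1, 2, 3, 65, 150]
lines = [cache_line(a) for a in addresses]
print(lines)  # [0, 0, 0, 1, 2]
```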

The difference could matter when a series of accesses is more regular in cache line addressing than in full addressing. For example, a series of full addresses such as:

32, 24, 8, 0, 64 + 32, 64 + 24, 64 + 8, 64 + 0, ..., N*64 + 32, N*64 + 24, N*64 + 8, N*64 + 0

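might not look like a strided sequence at the full-address level (indeed, it might incorrectly trigger a backward prefetcher, since each group of 4 accesses resembles a reverse sequence with an 8-byte stride), but at the cache line level it simply advances one cache line at a time (just like the simple sequence 0, 8, 16, 24, ...).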
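
To make the contrast concrete, here is a small sketch that generates the address series above for the first three cache lines and compares the deltas seen at the full-address level with those seen at the cache line level. It assumes 64-byte lines. The helper names (`address_stream`, `deltas`) are made up for illustration, and the code only does address arithmetic, so it says nothing about what real hardware observes:

```python
LINE_SIZE = 64              # bytes per cache line (assumption)
OFFSETS = (32, 24, 8, 0)    # within-line offsets, in the order they are accessed


def address_stream(n_lines):
    """Full addresses: N*64 + 32, N*64 + 24, N*64 + 8, N*64 + 0 for each line N."""
    return [n * LINE_SIZE + off for n in range(n_lines) for off in OFFSETS]


def deltas(seq):
    """Differences between consecutive elements."""
    return [b - a for a, b in zip(seq, seq[1:])]


addrs = address_stream(3)
full_deltas = deltas(addrs)

# Cache line touched by each access, with consecutive repeats collapsed.
line_stream = [a // LINE_SIZE for a in addrs]
distinct_lines = [l for i, l in enumerate(line_stream)
                  if i == 0 or l != line_stream[i - 1]]
line_deltas = deltas(distinct_lines)

print(full_deltas)  # [-8, -16, -8, 96, -8, -16, -8, 96, -8, -16, -8]
print(line_deltas)  # [1, 1]
```

At the full-address level the deltas are mostly negative with a periodic jump of +96, while the collapsed cache line stream is a plain +1 sequence.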

Which scheme, if either, is used on modern hardware?


. , , , , , - "". ".


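[1] Strided-sequential just means accesses that have the same stride (delta) between them, even if that delta is not 1. For example, a series of accesses to locations 100, 200, 300, ... could be detected as a strided access with a stride of 100, and in principle the CPU could prefetch based on that pattern (which would mean that some cache lines might be "skipped").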
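
A toy illustration of the strided pattern in footnote 1 (100, 200, 300, ...): the sketch below reports a stride once the most recent deltas agree, then extrapolates the next few addresses. This is a hypothetical model written for this post, not a description of how any real prefetcher is implemented. `detect_stride`, `min_repeats`, and the 64-byte line size are all assumptions:

```python
def detect_stride(addresses, min_repeats=2):
    """Return the stride if the last `min_repeats` deltas are equal and nonzero,
    otherwise None. Purely illustrative; real hardware logic is not public."""
    deltas = [b - a for a, b in zip(addresses, addresses[1:])]
    if len(deltas) < min_repeats:
        return None
    recent = deltas[-min_repeats:]
    if recent[0] != 0 and all(d == recent[0] for d in recent):
        return recent[0]
    return None


history = [100, 200, 300]
stride = detect_stride(history)

# Extrapolate the next three addresses along the detected stride.
predicted = [history[-1] + stride * k for k in (1, 2, 3)]

# Cache lines touched, assuming 64-byte lines (address >> 6).
touched_lines = [a >> 6 for a in history + predicted]

print(stride)         # 100
print(predicted)      # [400, 500, 600]
print(touched_lines)  # [1, 3, 4, 6, 7, 9]
```

Lines 2, 5, and 8 never appear in `touched_lines`, which is the "skipping" the footnote mentions. With the same toy detector, `detect_stride([32, 24, 8, 0])` returns `None`, because the deltas (-8, -16, -8) are not constant.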

[2] Assuming a 64-byte cache line here.


, , . , Intel Haswell.

, , . -, , , . -, , . prefetcher , , , . . , .

4 . DCU L2 , 64- .

, L2 . , . , .

IP DCU . :

  • , , .
  • , , .