When L1 misses are much different from L2 accesses ... TLB related?

I ran some benchmarks on a few algorithms and profiled their memory usage and efficiency (L1 / L2 / TLB accesses and misses), and some of the results are quite intriguing to me.

Given an inclusive cache hierarchy (L1 and L2 caches), shouldn't the number of L1 cache misses match the number of L2 cache accesses? One explanation I found is related to the TLB: when a virtual address is not mapped in the TLB, the system automatically skips the lookup in some cache levels. Does this seem plausible?

+5
2 answers

First, inclusive cache hierarchies may not be as common as you assume. For example, I do not think any current Intel processors (not Nehalem, not Sandybridge, possibly Atoms) have an L1 that is included in the L2. (Nehalem and probably Sandybridge do, however, have both L1 and L2 included in the L3; in current Intel terminology, the FLC and MLC are included in the LLC.)

. , L1, , , L2. , . - , , () L2, . , - , , L1, L2, , , L1, , L2 , , , .

, , L1 L2.

, - , Intel x86s, Nehalem Sandybridge, EMON , L1 L2 .. , , , , ARM Power.

, . . , , L1 L2, - , .

: . .

" L1", , [*] () , L1. , Intel , . , , L1, . , , L2 L2.)

, : Squashed_Cache_Misses.

([*] , "" , " , ". . , , RTL, , . .)

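An example. Suppose you access A[0], A[1], A[2], ... A[63], A[64], ...

If the address of A[0] is a multiple of 64, then A[0]..A[63] all lie in the same 64-byte cache line. If all of these accesses are issued before the line has been filled, each one is counted as an L1 miss, yet only a single request is sent to L2. QED: 64 memory accesses, 64 L1 misses, but only one L2 access.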

(, , . , 64 L1 L2.)
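The A[0]..A[63] example can be illustrated with a toy counting model. This is only a sketch under stated assumptions, not a description of how any real counter is implemented: `count_events`, `LINE_SIZE`, and `fill_latency` are hypothetical names, one access is issued per time step, nothing is ever evicted, and an access to a line whose fill is still in flight is counted as an L1 miss but merged into the outstanding request instead of going to L2 again.

```python
LINE_SIZE = 64  # bytes per cache line (assumed, matching the example above)


def count_events(addresses, fill_latency):
    """Toy model returning (l1_misses, l2_requests).

    Access number i is issued at time i. A line first requested at time t
    is assumed to be present in L1 from time t + fill_latency onward.
    """
    ready_at = {}  # line number -> time at which the line is present in L1
    l1_misses = 0
    l2_requests = 0
    for now, addr in enumerate(addresses):
        line = addr // LINE_SIZE
        if line not in ready_at:
            # First touch of this line: a miss that is sent on to L2.
            l1_misses += 1
            l2_requests += 1
            ready_at[line] = now + fill_latency
        elif now < ready_at[line]:
            # Fill still in flight: counted as a miss, merged with the
            # pending request, so no new L2 access.
            l1_misses += 1
        # Otherwise the line is already in L1 and this is a hit.
    return l1_misses, l2_requests
```

Under these assumptions, `count_events(range(64), fill_latency=100)` gives 64 L1 misses and 1 L2 request, while `count_events(range(64), fill_latency=1)` gives 1 and 1, since the fill completes before the second access. Real hardware counters will not be this clean.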

:

L2 , L1 ( , ), , . , . , , . Prefetches_from_L2 Prefetches_from_Memory.

, L1, L2. , Intel .

+6

, ( ), (). , , - L1-D, L2.

L2, L1.

+1
