GPU L1 and L2 cache statistics

Question

I wrote some simple tests that perform a series of global memory accesses. When I measured the L1 and L2 cache statistics, I found that (on a GTX 580, which has 16 SMs):

 total L1 cache misses * 16 != total L2 cache queries

Indeed, the right side is much higher than the left (about five times). I have heard that register spills can also end up in L2, but my kernel uses fewer than 28 registers, which is not many. What could be the source of this difference? Or am I misinterpreting the meaning of these performance counters?

Thanks

+5

opencl gpu gpgpu cuda

Zk1001 Sep 19 '11 at 10:00

2 answers

Gaszton · Answer 1 · 2011-11-28T14:04:33+0000

From the CUDA Programming Guide, Section G.4.2:

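Global memory accesses are cached. Using the -dlcm compilation flag, they can be configured at compile time to be cached in both L1 and L2 (-Xptxas -dlcm=ca) (this is the default setting) or in L2 only (-Xptxas -dlcm=cg). A cache line is 128 bytes and maps to a 128-byte aligned segment in device memory. Memory accesses that are cached in both L1 and L2 are serviced with 128-byte memory transactions, whereas memory accesses that are cached in L2 only are serviced with 32-byte memory transactions. Caching in L2 only can therefore reduce over-fetch, for example, in the case of scattered memory accesses.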

Ravi · Answer 2 · 2011-11-28T03:15:10+0000

That is because an L1 cache line is 128 bytes, while an L2 line is 32 bytes.
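To make the size mismatch concrete, here is a minimal back-of-the-envelope sketch. It assumes that every L1 load miss fetches a full 128-byte line and that L2 counts each 32-byte piece of that line as a separate query; it also assumes the L1 miss counter is reported per SM, as the multiplication by 16 in the question suggests. The counter value and the function name are hypothetical, chosen only for illustration.

```python
L1_LINE_BYTES = 128     # L1 cache line size quoted in the answers
L2_SEGMENT_BYTES = 32   # L2 transaction size quoted in the answers


def expected_l2_queries(l1_misses_per_sm, num_sms):
    """Estimate the L2 queries caused by L1 load misses alone.

    Assumption: each L1 miss is split into 128 / 32 = 4 L2 queries.
    """
    segments_per_line = L1_LINE_BYTES // L2_SEGMENT_BYTES
    return l1_misses_per_sm * num_sms * segments_per_line


# Hypothetical counter value, not a measurement.
l1_misses_per_sm = 1000
num_sms = 16  # GTX 580

naive = l1_misses_per_sm * num_sms            # the left side in the question
modeled = expected_l2_queries(l1_misses_per_sm, num_sms)

print(naive, modeled, modeled // naive)
```

Under these assumptions the L2 query count comes out four times larger than "L1 misses * 16", which accounts for most of the roughly fivefold gap reported in the question. The model does not explain the remainder; other traffic that reaches L2 without being counted as an L1 load miss may contribute.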
