When we check a kernel's resource usage by compiling with -Xptxas -v, we see something like this:
ptxas info : Used 63 registers, 244 bytes cmem[0], 51220 bytes cmem[2], 24 bytes cmem[14], 20 bytes cmem[16]
Is there any documentation that clearly explains cmem[x]? What is the point of dividing constant memory into several banks, how many banks are there in total, and what are the banks other than 0, 2, 14 and 16 used for?
As a side note, @njuffa (thank you) previously explained on the NVIDIA forum what banks 0, 2, 14 and 16 are:
Used constant memory is partitioned into constant program "variables" (bank 1), plus compiler-generated constants (bank 14).
cmem[0]: kernel arguments
cmem[2]: user-defined constant objects
cmem[16]: compiler-generated constants (some of which may correspond to literal constants in the source code)
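To make the question concrete, here is a minimal sketch of the kind of source that I would expect to touch each of the categories above. It is not taken from njuffa's post; the names coeffs, scale and run_demo are made up for illustration, and the bank numbers ptxas actually reports for it may differ from the ones listed, since they seem to depend on the target architecture and toolkit version.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// User-defined constant object (hypothetical name). This is the kind of
// data described above as "user-defined constant objects" (cmem[2] in the
// output shown; the bank number may differ on other architectures).
__constant__ float coeffs[256];   // 256 * 4 = 1024 bytes

// The kernel arguments (a pointer, a float and an int) are the kind of
// data described above as "kernel arguments" (cmem[0]).
__global__ void scale(float *out, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        // 0.5f is a source literal. The compiler may encode it as an
        // immediate or place it in a compiler-generated constant bank;
        // which one it picks is not guaranteed.
        out[i] = coeffs[i % 256] * factor + 0.5f;
    }
}

// Runs the kernel once. Returns the number of mismatching elements,
// or -1 if a CUDA call fails.
int run_demo()
{
    const int n = 1024;
    float host_coeffs[256];
    for (int i = 0; i < 256; ++i) host_coeffs[i] = (float)i;

    // Fill the __constant__ array from the host.
    if (cudaMemcpyToSymbol(coeffs, host_coeffs, sizeof(host_coeffs)) != cudaSuccess)
        return -1;

    float *d_out = nullptr;
    if (cudaMalloc(&d_out, n * sizeof(float)) != cudaSuccess)
        return -1;

    scale<<<(n + 255) / 256, 256>>>(d_out, 2.0f, n);

    static float host_out[1024];
    cudaError_t err = cudaMemcpy(host_out, d_out, n * sizeof(float),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    if (err != cudaSuccess)
        return -1;

    // All values involved are exactly representable in float, so an
    // exact comparison is safe here.
    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        float expected = host_coeffs[i % 256] * 2.0f + 0.5f;
        if (host_out[i] != expected) ++mismatches;
    }
    return mismatches;
}

int main()
{
    int result = run_demo();
    printf("mismatches: %d\n", result);
    return result == 0 ? 0 : 1;
}
```

Assuming the file is saved as demo.cu, compiling it with nvcc -Xptxas -v demo.cu -o demo should print a ptxas info line per kernel, in which the cmem[...] entries can be compared against the list above.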
cuda gpu-constant-memory
King crimson