CUDA: Understanding PTX Information

Question


I cannot find much useful information about the output of --ptxas-options=-v. I found the 2008 NVCC PDF, which has a short blurb on it, but no details.
1) What does 64 bytes cmem[0], 12 bytes cmem[16] mean? I understand that this refers to constant memory. I do not use constant memory in my code, so this must come from the compiler. (What goes into read-only memory?)
2) What does 49152+0 bytes smem mean? Yes, this is shared memory, but what do the two numbers mean?
3) Is there a document that will help me with this? (What is it called?)
4) Where can I find a document that explains the *.ptx file? (I would like to be able to read and understand the CUDA assembly code.)


cuda

Doug Sep 7 '12 at 17:40


2 answers

aland · Answer 1 · 2012-09-07T18:27:20+0000

cmem is discussed here. In your case, it means that 64 bytes are used to pass arguments to the kernel, and 12 bytes are occupied by compiler-generated constants.
In the case of smem, the first number is the amount of shared memory your code requests, and the second number (0) indicates how much shared memory is used for system purposes.
I do not know of any official documentation on the detailed output format of ptxas. For example, in the CUDA Occupancy Calculator, they simply say to sum the values for smem, without any explanation.
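To make the format concrete, here is a minimal Python sketch that pulls the numbers out of one ptxas resource line and sums the two smem values, as the Occupancy Calculator asks. The function name parse_ptxas_usage, the sample line, and its register count are made up for illustration; only the smem and cmem figures come from the question, and the real ptxas output may differ between toolkit versions.

```python
import re

def parse_ptxas_usage(line):
    """Parse one 'Used ...' resource line printed by ptxas in verbose mode."""
    usage = {"registers": None, "smem": None, "cmem": {}}

    m = re.search(r"Used (\d+) registers", line)
    if m:
        usage["registers"] = int(m.group(1))

    # smem is printed as "<requested>+<system> bytes smem"; the "+<system>"
    # part is treated as optional here (an assumption).
    m = re.search(r"(\d+)(?:\+(\d+))? bytes smem", line)
    if m:
        usage["smem"] = (int(m.group(1)), int(m.group(2) or 0))

    # cmem is printed once per constant bank: "<bytes> bytes cmem[<bank>]".
    for size, bank in re.findall(r"(\d+) bytes cmem\[(\d+)\]", line):
        usage["cmem"][int(bank)] = int(size)

    return usage

# Hypothetical sample line; the register count is invented.
line = ("ptxas info    : Used 10 registers, 49152+0 bytes smem, "
        "64 bytes cmem[0], 12 bytes cmem[16]")

usage = parse_ptxas_usage(line)
smem_total = sum(usage["smem"])  # value to enter in the Occupancy Calculator

print(usage)
print(smem_total)
```

With the numbers from the question, the two smem parts are 49152 and 0, and the constant usage is split across banks 0 and 16.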
There are several PTX documents on the NVIDIA website. The most fundamental is PTX: Parallel Thread Execution ISA Version 3.0.

lxkarthi · Answer 2 · 2013-03-28T17:41:48+0000

See “Miscellaneous NVCC Usage.” They mention that the constant bank allocation is profile-specific.

In the PTX manual, they say that apart from the 64 KB of constant memory, there are 10 more banks for constant memory. The driver may allocate and initialize constant buffers in these regions and pass pointers to the buffers as kernel function parameters.

I guess the profile given to nvcc takes care of which constants go into which memory. In any case, we need not worry as long as the usage of each cmem[n] constant bank is less than 64 KB, since each bank is 64 KB in size and is shared by all threads in the grid.
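As a rough illustration of that per-bank check, the sketch below flags any bank whose reported usage exceeds 64 KB. The helper name oversized_banks and the dictionary layout (bank index mapped to bytes) are assumptions made for this example, not part of any NVIDIA tool.

```python
BANK_SIZE = 64 * 1024  # 64 KB per constant bank, as stated in the answer above

def oversized_banks(cmem_usage, bank_size=BANK_SIZE):
    """Return the sorted bank indices whose reported usage exceeds bank_size."""
    return sorted(bank for bank, used in cmem_usage.items() if used > bank_size)

# Usage reported in the question: 64 bytes in cmem[0], 12 bytes in cmem[16].
reported = {0: 64, 16: 12}
print(oversized_banks(reported))
```

For the figures in the question, no bank comes anywhere near the limit, so the list is empty.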
