The sample code does not actually measure what the OP expects, because the compiler optimizes some instructions away.
In the example, loading into the local variable (ptest) does not affect any state outside the kernel, so the compiler can remove the instruction entirely. This can be seen in the SASS code: it is the same whether ptest=vel[globalz*nx+globalx]; is active or both statements (ptest and stest) are deleted. To check the SASS code, you can run cuobjdump --dump-sass on the object file.
In the shared memory example, the instructions are apparently not optimized away, which can also be checked in the SASS code. (In fact, I expected these instructions to be deleted as well. Are there any side effects that I am missing?)
As already discussed in the comments, once a simple calculation (ptest*=ptest) is added and the result is written to global memory, the compiler can no longer delete the load, because the result changes global state.
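To make the difference concrete, here is a minimal sketch of the two cases. The kernel names dead_load and live_load are my own, as is the assumption that the launch grid covers the array exactly (so there is no bounds check); only ptest, vel, nx, globalx and globalz come from the question. The build commands in the comment use the sm_30 target mentioned in this answer, which newer toolkits may no longer accept, so substitute an architecture your nvcc supports.

```cuda
// Possible way to inspect the generated SASS (file names are placeholders):
//   nvcc -arch=sm_30 -cubin -o kernels.cubin kernels.cu
//   cuobjdump --dump-sass kernels.cubin

// Case 1: the loaded value is never used, so the load has no effect that is
// visible outside the kernel and the compiler is free to drop it.
__global__ void dead_load(const float *vel, int nx)
{
    int globalx = blockIdx.x * blockDim.x + threadIdx.x;
    int globalz = blockIdx.y * blockDim.y + threadIdx.y;

    float ptest = vel[globalz * nx + globalx];
    (void)ptest;  // silences the unused-variable warning, nothing more
}

// Case 2: the loaded value feeds a store to global memory, so the load
// has to stay in the generated code.
__global__ void live_load(float *vel, int nx)
{
    int globalx = blockIdx.x * blockDim.x + threadIdx.x;
    int globalz = blockIdx.y * blockDim.y + threadIdx.y;

    float ptest = vel[globalz * nx + globalx];
    ptest *= ptest;
    vel[globalz * nx + globalx] = ptest;
}
```

If the reasoning above holds, the SASS for dead_load should contain no global load, while live_load should show a load, a multiply and a store. I have not verified this for every architecture and compiler version.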
From the OP's comments, I assume there is a misunderstanding of how loading into shared memory works. In fact, the data is first loaded from global memory into registers and then stored to shared memory. The corresponding generated SASS instructions (for sm_30) look like this:

    LD.E R2, [R6];    // load to register R2
    STS [R0], R2;     // store from register R2 to shared memory

The following example, multiply and save in global memory, demonstrates another case where the compiler does not generate code that you might naively expect:

    stest[localz][localx]=vel[globalz*nx+globalx];    // load to shared memory
    stest[localz][localx]*=stest[localz][localx];     // multiply
    vel[globalz*nx+globalx]=stest[localz][localx];    // save to global memory

The SASS code shows that the value is only stored to shared memory after the calculation, and that shared memory is never read:

    LD.E R2, [R6];       // load to register
    FMUL R0, R2, R2;     // multiply
    STS [R3], R0;        // store the result in shared memory
    ST.E [R6], R0;       // store the result in global memory

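For anyone who wants to reproduce this, the three statements can be wrapped in a complete program along the following lines. This is only a sketch under my own assumptions: the kernel name square_via_shared, the BLOCK tile width of 16, the 64x64 array size and the host-side check are all hypothetical and not taken from the OP's code.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

#define BLOCK 16  // hypothetical tile width, not from the original question

__global__ void square_via_shared(float *vel, int nx)
{
    __shared__ float stest[BLOCK][BLOCK];

    int localx  = threadIdx.x;
    int localz  = threadIdx.y;
    int globalx = blockIdx.x * BLOCK + localx;
    int globalz = blockIdx.y * BLOCK + localz;

    stest[localz][localx] = vel[globalz * nx + globalx];  // load to shared memory
    stest[localz][localx] *= stest[localz][localx];       // multiply
    vel[globalz * nx + globalx] = stest[localz][localx];  // save to global memory
}

int main()
{
    const int nx = 64, nz = 64;  // assumed to be multiples of BLOCK
    const size_t bytes = static_cast<size_t>(nx) * nz * sizeof(float);

    std::vector<float> host(static_cast<size_t>(nx) * nz, 3.0f);

    float *vel = nullptr;
    if (cudaMalloc(&vel, bytes) != cudaSuccess) {
        std::fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }
    cudaMemcpy(vel, host.data(), bytes, cudaMemcpyHostToDevice);

    dim3 block(BLOCK, BLOCK);
    dim3 grid(nx / BLOCK, nz / BLOCK);
    square_via_shared<<<grid, block>>>(vel, nx);

    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) {
        // This blocking copy also waits for the kernel to finish.
        err = cudaMemcpy(host.data(), vel, bytes, cudaMemcpyDeviceToHost);
    }
    cudaFree(vel);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Every element started at 3.0f, so each one should now be 9.0f.
    int mismatches = 0;
    for (float v : host) {
        if (v != 9.0f) ++mismatches;
    }
    std::printf("mismatches: %d\n", mismatches);
    return mismatches == 0 ? 0 : 1;
}
```

Dumping the SASS of this kernel with cuobjdump --dump-sass should let you check whether your compiler version also keeps the value in a register and never reads it back from shared memory. The exact instruction sequence may differ between architectures.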
I am not an expert in SASS code, so please correct me if I am wrong or have left out something important.