Bart's comment is mostly right. In more detail, the PTX ISA 3.1 manual says:
For some instructions, the destination operand is optional. The bit-bucket operand indicated by the underscore ( _ ) can be used instead of the destination register.
In fact, the PTX 3.1 specification names only one instruction class for which _ is a valid destination: atom. Here are the semantics of atom:
Atomically loads the original value at location a into destination register d, performs the reduction operation with operand b and the value at location a, and saves the result of the specified operation at location a, overwriting the original value.
There is also a note for atom:
Simple reductions may be specified by using the bit bucket destination operand _.
So we can build an example:
atom.global.add.s32 _, [a], 4;
This adds 4 to the integer at memory location a without returning the previous value of a in a register, so you can use this form whenever you do not need the previous value. I assume the compiler generates it for code like:
atomicAdd(&a, 4);
since the return value of atomicAdd is not stored in a variable.
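To check that assumption yourself, a minimal CUDA sketch along these lines may help. The kernel names add_discard and add_keep and the host helper run_add_discard are hypothetical, made up for illustration. The first kernel ignores the return value of atomicAdd, the second stores it. Compiling with nvcc -ptx and comparing the PTX for the two kernels shows what your compiler version actually emits; I am not claiming it will use the _ form.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread adds 4 and ignores the previous value.
__global__ void add_discard(int *counter)
{
    atomicAdd(counter, 4);  // return value (old contents) is not used
}

// Hypothetical kernel: each thread adds 4 and keeps the previous value.
__global__ void add_keep(int *counter, int *old_values)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    old_values[tid] = atomicAdd(counter, 4);  // old contents needed in a register
}

// Hypothetical host helper: runs add_discard and returns the final counter.
// Error checking is omitted to keep the sketch short.
int run_add_discard(int blocks, int threads)
{
    int *d_counter = nullptr;
    int h_counter = 0;
    cudaMalloc(&d_counter, sizeof(int));
    cudaMemset(d_counter, 0, sizeof(int));
    add_discard<<<blocks, threads>>>(d_counter);
    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(&h_counter, d_counter, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_counter);
    return h_counter;
}

int main()
{
    // Every one of the blocks * threads threads adds 4 to the counter.
    printf("counter = %d\n", run_add_discard(2, 128));
    return 0;
}
```

The result in memory is the same whether or not the old value is kept; the only difference is whether a destination register is needed for it.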
harrism