Memory data state after CUDA exceptions

The CUDA documentation does not make clear how the contents of memory change after a CUDA application throws an exception.

For example, if a kernel launch (dynamic) encounters an exception (e.g. Warp Out-of-range Address), the current kernel launch is terminated. After this point, is the data on the device (for example, __device__ variables) still kept, or is it removed along with the exception?

A specific example would be:

  • The CPU launches a kernel
  • The kernel updates the value of the __device__ variable variableA to 5, and then crashes
  • The CPU copies the value of variableA from device to host with a memcpy. What value does the CPU receive in this case: 5 or something else?

Can someone explain the reasoning behind this?
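For concreteness, the three steps above could be reproduced with a sketch like the following. The kernel name failingKernel, the host variable hostA, and the use of a null pointer write to provoke the fault are assumptions made for illustration; they stand in for whatever out-of-range access the real application performs. The program only prints the status codes the runtime returns and makes no claim about what value arrives on the host.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__device__ int variableA = 0;

// Hypothetical kernel: updates variableA, then faults by writing
// through an invalid pointer supplied by the host.
__global__ void failingKernel(int *bad) {
    variableA = 5;
    *bad = 1;  // bad is nullptr in this sketch, so this store is illegal
}

int main() {
    // Step 1: the CPU launches the kernel.
    int *bad = nullptr;
    failingKernel<<<1, 1>>>(bad);

    // Step 2: the kernel runs and crashes; the fault surfaces at the next sync.
    cudaError_t syncErr = cudaDeviceSynchronize();
    printf("sync:   %s\n", cudaGetErrorString(syncErr));

    // Step 3: the CPU tries to copy variableA back to the host.
    int hostA = -1;
    cudaError_t copyErr = cudaMemcpyFromSymbol(&hostA, variableA, sizeof(hostA));
    printf("memcpy: %s\n", cudaGetErrorString(copyErr));

    if (copyErr == cudaSuccess) {
        printf("variableA = %d\n", hostA);
    } else {
        // The copy reported an error, so hostA carries no reliable information.
        printf("copy failed; hostA (%d) should not be trusted\n", hostA);
    }
    return 0;
}
```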

1 answer

The behavior is undefined in the case of a CUDA error that corrupts the CUDA context.

This type of error is easy to recognize because it is "sticky": once it occurs, every call to a CUDA API returns that error until the context is destroyed.
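A minimal sketch of how this stickiness could be observed: after a kernel faults, several unrelated runtime calls are issued and the status of each is printed. The kernel faultingKernel and the helper report are hypothetical names introduced here, and the null pointer write is only one assumed way to crash a kernel. No particular output is promised, since the exact error code depends on the fault.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel that crashes by storing through an invalid pointer.
__global__ void faultingKernel(int *bad) {
    *bad = 1;
}

// Print the symbolic name of the status returned by one runtime call.
static void report(const char *what, cudaError_t err) {
    printf("%-24s %s\n", what, cudaGetErrorName(err));
}

int main() {
    int *bad = nullptr;
    faultingKernel<<<1, 1>>>(bad);
    report("cudaDeviceSynchronize", cudaDeviceSynchronize());

    // Calls that have nothing to do with the crashed kernel.
    // With a corrupted context, each of them is expected to report an error too.
    void *p = nullptr;
    report("cudaMalloc", cudaMalloc(&p, 256));

    size_t freeBytes = 0, totalBytes = 0;
    report("cudaMemGetInfo", cudaMemGetInfo(&freeBytes, &totalBytes));

    report("cudaDeviceSynchronize", cudaDeviceSynchronize());
    return 0;
}
```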

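Non-sticky errors are cleared automatically after they are returned by a CUDA API call (with the exception of cudaPeekAtLastError). Any "crashed kernel" type of error (invalid access, unspecified launch failure, etc.) is a "sticky" error. In your example, step 3 would (always) return an API error from the cudaMemcpy call that transfers variableA from device to host, so the results of that cudaMemcpy operation are undefined and unreliable. It is as if the cudaMemcpy operation had also failed in some unspecified way.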

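Since the behavior of a corrupted CUDA context is undefined, there is no definition for the contents of any allocations or, more generally, for the state of the machine after such an error.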

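An example of a non-sticky error is an attempt to cudaMalloc more data than is available in device memory. Such an operation returns an out-of-memory error, but that error is cleared after being read, and subsequent (valid) CUDA API calls can complete successfully without returning an error. A non-sticky error does not corrupt the CUDA context, and the context behaves exactly as if the invalid operation had never been requested.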
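As an illustration of an error that does not stick, here is a minimal sketch that requests an unreasonably large allocation with cudaMalloc, reads the error back, and then issues a small, valid request. The size of 2^60 bytes is an arbitrary assumption chosen to exceed any device's memory, and the variable names are hypothetical. The program prints whatever status the runtime returns rather than assuming a specific code.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    void *big = nullptr;
    void *small = nullptr;

    // Ask for far more memory than any device offers (illustrative size).
    size_t hugeBytes = static_cast<size_t>(1) << 60;
    cudaError_t err = cudaMalloc(&big, hugeBytes);
    printf("oversized cudaMalloc: %s\n", cudaGetErrorString(err));

    // Read the recorded error; for an error that does not stick,
    // this also resets the runtime's error state.
    err = cudaGetLastError();
    printf("cudaGetLastError:     %s\n", cudaGetErrorString(err));

    // A valid request issued afterwards, to see whether the context still works.
    err = cudaMalloc(&small, 1024);
    printf("small cudaMalloc:     %s\n", cudaGetErrorString(err));

    if (err == cudaSuccess) {
        cudaFree(small);
    }
    return 0;
}
```

If the last line reports success, the failed allocation left the context usable, which is the contrast with a crashed kernel.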

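This distinction between sticky and non-sticky errors is called out in many of the documented error code descriptions, for example:

Non-sticky, does not corrupt the CUDA context:

cudaErrorMemoryAllocation = 2 The API call failed because it was unable to allocate enough memory to perform the requested operation.

Sticky, corrupts the CUDA context:

cudaErrorMisalignedAddress = 74 The device encountered a load or store instruction on a memory address which is not aligned. The context cannot be used, so it must be destroyed (and a new one should be created). All existing device memory allocations from this context are invalid and must be reconstructed if the program is to continue using CUDA.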

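Note that cudaDeviceReset() by itself is not sufficient to restore the GPU to proper functional behavior. To accomplish that, the "owning" process must also terminate.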
