CUDA debugging procedure for non-deterministic output

I am debugging my CUDA 4.0 / Thrust recovery code on my 64-bit Ubuntu 10.10 system, and I am trying to figure out how to debug a runtime error that turns my output images into random "noise". There is no random number generator in my code, so I expect the result to be consistent between runs, even if it is wrong. However, this is not ...

I'm wondering whether there is a general procedure for debugging CUDA runtime errors like this one. I do not use shared memory in my CUDA kernels. I did my best to avoid race conditions on global memory, but I could have missed something.

I tried using GPU Ocelot, but it has problems recognizing some of my calls to CUDA and CUSPARSE functions.

Also, my code usually works. It is only when I change one particular parameter that I get these non-deterministic results. I checked all the code associated with this parameter, but I cannot see what I'm doing wrong. If I can reduce it to something I can post here, I will, but at this point it is too complicated to post.

1 answer

Are you sure that all your kernels handle the block remainder properly? One place where we saw non-deterministic results was when data elements at the end of an array were not being processed.

Our kernels were originally designed for data whose length was known to be an integer multiple of 256 elements. So we used a block size of 256 and a simple integer division to get the number of blocks. When the data was later allowed to have any length, the division rounded down and the remaining 255 or fewer elements were never processed. Those spots in the output then contained random data.
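A minimal sketch of the usual fix, assuming a simple one-dimensional kernel: round the block count up instead of down, and guard the kernel body so the extra threads in the last block do nothing. The names `blocksFor`, `scaleKernel` and `scaleOnGpu` are hypothetical and only for illustration; they are not from the asker's code.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Number of blocks needed to cover n elements. Rounds up, so a partial
// final block is launched for the remainder. Plain n / blockSize would
// round down and leave up to blockSize - 1 trailing elements unprocessed.
inline int blocksFor(int n, int blockSize) {
    return (n + blockSize - 1) / blockSize;
}

// Hypothetical kernel: writes 2 * in[i] to out[i] for every i < n.
__global__ void scaleKernel(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {  // guard: the last block may extend past the end of the data
        out[i] = 2.0f * in[i];
    }
}

// Hypothetical host wrapper: copies data to the device, launches the
// kernel over all n elements, and copies the result back.
std::vector<float> scaleOnGpu(const std::vector<float>& hostIn) {
    const int n = static_cast<int>(hostIn.size());
    std::vector<float> hostOut(n, 0.0f);
    if (n == 0) return hostOut;

    const size_t bytes = n * sizeof(float);
    float* devIn = 0;
    float* devOut = 0;
    cudaMalloc((void**)&devIn, bytes);
    cudaMalloc((void**)&devOut, bytes);
    cudaMemcpy(devIn, &hostIn[0], bytes, cudaMemcpyHostToDevice);

    const int blockSize = 256;
    scaleKernel<<<blocksFor(n, blockSize), blockSize>>>(devIn, devOut, n);

    // Report launch errors instead of silently returning stale data.
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel launch failed: %s\n", cudaGetErrorString(err));
    }

    // This blocking copy also waits for the kernel to finish.
    err = cudaMemcpy(&hostOut[0], devOut, bytes, cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        fprintf(stderr, "copy back failed: %s\n", cudaGetErrorString(err));
    }

    cudaFree(devIn);
    cudaFree(devOut);
    return hostOut;
}
```

With 1000 elements and a block size of 256, truncating division gives 3 blocks (768 threads), while `blocksFor` gives 4, so the last 232 elements are covered as well. A quick way to test for this kind of bug is to fill the output buffer with a known value (for example with `cudaMemset`) before the launch and check whether any of it survives.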
