CUDA Newbie - Simple var increment not working

I am working on a project with CUDA. To try it out, I wrote the following code.

    #include <iostream>
    using namespace std;

    __global__ void inc(int *foo) {
        ++(*foo);
    }

    int main() {
        int count = 0, *cuda_count;

        cudaMalloc((void**)&cuda_count, sizeof(int));
        cudaMemcpy(cuda_count, &count, sizeof(int), cudaMemcpyHostToDevice);
        cout << "count: " << count << '\n';
        inc <<< 100, 25 >>> (&count);
        cudaMemcpy(&count, cuda_count, sizeof(int), cudaMemcpyDeviceToHost);
        cudaFree(cuda_count);
        cout << "count: " << count << '\n';
        return 0;
    }

Output:

    count: 0
    count: 0

What is the problem?

Thanks in advance!

+6
c++ cuda
3 answers

I found the solution. Besides passing cuda_count (the device pointer) to the kernel instead of &count, I needed to use an atomic function, that is, a function that executes without interference from other threads: no other thread can access the target address until the operation completes.

The code:

    #include <iostream>
    using namespace std;

    __global__ void inc(int *foo) {
        atomicAdd(foo, 1);
    }

    int main() {
        int count = 0, *cuda_count;

        cudaMalloc((void**)&cuda_count, sizeof(int));
        cudaMemcpy(cuda_count, &count, sizeof(int), cudaMemcpyHostToDevice);
        cout << "count: " << count << '\n';
        inc <<< 100, 25 >>> (cuda_count);
        cudaMemcpy(&count, cuda_count, sizeof(int), cudaMemcpyDeviceToHost);
        cudaFree(cuda_count);
        cout << "count: " << count << '\n';
        return 0;
    }

Output:

    count: 0
    count: 2500

Thank you for making me understand the mistake I made.

+6

You must pass cuda_count to your kernel function. Apart from that, all your threads are trying to increment the same memory location. The effect of that is not well defined (at least one write will succeed, but more than one may).

You need to prevent that, for example by letting only one thread do the work:

    __global__ void inc(int *foo) {
        if (blockIdx.x == 0 && threadIdx.x == 0)
            ++*foo;
    }

(untested)
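To make the trade-off concrete, here is a minimal sketch that runs three variants of the kernel with the question's launch configuration: the unsynchronized increment, the single-thread guard shown in this answer, and atomicAdd as used in the other answer. The names inc_racy, inc_guarded, inc_atomic and the helper run are hypothetical and exist only for this illustration; error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Racy: every thread does an unsynchronized read-modify-write on *foo.
__global__ void inc_racy(int *foo) { ++(*foo); }

// Guarded: only thread 0 of block 0 writes, so there is no race.
__global__ void inc_guarded(int *foo) {
    if (blockIdx.x == 0 && threadIdx.x == 0) ++(*foo);
}

// Atomic: each increment is an indivisible read-modify-write.
__global__ void inc_atomic(int *foo) { atomicAdd(foo, 1); }

// Hypothetical helper: zero a device counter, launch `kernel`,
// copy the counter back to the host and return it.
int run(void (*kernel)(int *), int blocks, int threads) {
    int count = 0, *d_count = nullptr;
    cudaMalloc(&d_count, sizeof(int));
    cudaMemcpy(d_count, &count, sizeof(int), cudaMemcpyHostToDevice);
    kernel<<<blocks, threads>>>(d_count);
    // cudaMemcpy waits for the kernel to finish before copying.
    cudaMemcpy(&count, d_count, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_count);
    return count;
}

int main() {
    std::printf("racy:    %d\n", run(inc_racy, 100, 25));    // result is not well defined
    std::printf("guarded: %d\n", run(inc_guarded, 100, 25)); // one thread increments once
    std::printf("atomic:  %d\n", run(inc_atomic, 100, 25));  // every thread increments once
    return 0;
}
```

With this launch the guarded kernel yields 1 and the atomic one yields 2500, so the guard removes the race but also changes what is computed. Which one is right depends on whether the counter should be bumped once per launch or once per thread.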

+8

The problem with your code is that you pass the kernel &count, a pointer to the host variable, instead of cuda_count, the pointer to the device memory you allocated and copied into.

This line

    inc <<< 100, 25 >>> (&count);

should be

    inc <<< 100, 25 >>> (cuda_count);
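
The question's program ignores every CUDA return code, which is likely why the bad pointer produced a silent "count: 0" rather than a visible failure. The following is a minimal sketch of the corrected program with error checking added. CUDA_CHECK and run_increment are hypothetical names introduced here, not part of the CUDA API; the runtime calls themselves (cudaGetLastError, cudaDeviceSynchronize, cudaGetErrorString) are standard.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper macro (not part of CUDA): print the error string
// and exit when a runtime call does not return cudaSuccess.
#define CUDA_CHECK(call)                                             \
    do {                                                             \
        cudaError_t err_ = (call);                                   \
        if (err_ != cudaSuccess) {                                   \
            std::fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,  \
                         cudaGetErrorString(err_));                  \
            std::exit(EXIT_FAILURE);                                 \
        }                                                            \
    } while (0)

__global__ void inc(int *foo) { atomicAdd(foo, 1); }

// Hypothetical wrapper: increments a device counter once per thread
// and returns the final value.
int run_increment(int blocks, int threads) {
    int count = 0, *cuda_count = nullptr;
    CUDA_CHECK(cudaMalloc(&cuda_count, sizeof(int)));
    CUDA_CHECK(cudaMemcpy(cuda_count, &count, sizeof(int), cudaMemcpyHostToDevice));
    inc<<<blocks, threads>>>(cuda_count);   // device pointer, not &count
    CUDA_CHECK(cudaGetLastError());         // reports launch errors
    CUDA_CHECK(cudaDeviceSynchronize());    // reports errors raised while the kernel ran
    CUDA_CHECK(cudaMemcpy(&count, cuda_count, sizeof(int), cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(cuda_count));
    return count;
}

int main() {
    std::printf("count: %d\n", run_increment(100, 25));
    return 0;
}
```

If the launch is changed back to pass &count, the checks after the kernel would be expected to report an error on most setups, which points at the faulty argument much more directly than an unchanged counter does.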
0
