I am learning CUDA and currently have something like this:
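__device__ void iterate_temperatures(int fieldSize, Atom *atoms) {
    // Each thread handles one temperature.
    int temperature = threadIdx.x + blockDim.x * blockIdx.x;
    // Total number of atoms in the field: fieldSize^DIMENSION.
    int nAtoms = (int)pow((double)fieldSize, (double)DIMENSION);
    // A launch needs <<<grid, block>>>; one block of nAtoms threads is assumed here.
    iterate_atoms<<<1, nAtoms>>>(atoms, nAtoms, temperature);
}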
The thing is, every temperature needs a final result.
How can I make each block wait for the last one to finish?
Thanks!
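For context, here is a minimal sketch of one way the waiting could be arranged. It assumes a different layout from the snippet above: one block per temperature, with the threads of that block sharing the atoms. `__syncthreads()` only synchronizes threads inside a single block, so each block waits for its own threads before writing its final result, and the host waits for all blocks with `cudaDeviceSynchronize()`. The helper `atom_contribution`, the `results` array, and the sizes in `main` are hypothetical placeholders, not part of the original code.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for the per-atom work; not the original iterate_atoms.
__device__ float atom_contribution(int atom, int temperature) {
    return (float)(atom % 7) * (float)(temperature + 1);
}

// Assumed layout: one block per temperature, threads stride over the atoms.
__global__ void iterate_temperatures(int nAtoms, float *results) {
    extern __shared__ float partial[];  // one slot per thread of the block
    int temperature = blockIdx.x;

    // Each thread accumulates a partial sum over its share of the atoms.
    float sum = 0.0f;
    for (int atom = threadIdx.x; atom < nAtoms; atom += blockDim.x)
        sum += atom_contribution(atom, temperature);
    partial[threadIdx.x] = sum;
    __syncthreads();  // wait until every thread of this block has written

    // Tree reduction in shared memory (assumes blockDim.x is a power of two).
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (threadIdx.x < stride)
            partial[threadIdx.x] += partial[threadIdx.x + stride];
        __syncthreads();  // all threads reach this barrier on every pass
    }

    // Thread 0 stores the final result for this temperature.
    if (threadIdx.x == 0)
        results[temperature] = partial[0];
}

int main() {
    const int nTemperatures = 4, nAtoms = 1000, threads = 256;  // example sizes
    float *results;
    cudaMallocManaged(&results, nTemperatures * sizeof(float));

    iterate_temperatures<<<nTemperatures, threads, threads * sizeof(float)>>>(nAtoms, results);
    cudaDeviceSynchronize();  // host waits until every block has finished

    for (int t = 0; t < nTemperatures; ++t)
        printf("temperature %d: %f\n", t, results[t]);
    cudaFree(results);
    return 0;
}
```

If a later step needs the results of all temperatures, putting that step in a second kernel launched after this one would give the ordering for free, since kernels on the same stream run one after another.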