I have an array A[0...N] of double and an array B[0...N] of int. Each B[i] takes a value in [0...P]. All I need to do is compute the array C[0...P]:
C[j] = SUM( A[i] : B[i] = j)
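To pin down the semantics, here is a minimal sequential sketch of the computation on the host. The function name `bucket_sum` is hypothetical, and the sketch assumes that A and B have the same length and that every B[i] really lies in [0, P]. It is only a reference for checking results, not a GPU solution.

```cpp
#include <cstddef>
#include <vector>

// Sequential reference: C[j] = sum of A[i] over all i with B[i] == j.
// Assumption: A.size() == B.size() and 0 <= B[i] <= P for every i.
std::vector<double> bucket_sum(const std::vector<double>& A,
                               const std::vector<int>& B,
                               int P) {
    std::vector<double> C(P + 1, 0.0);  // one accumulator per value of j
    for (std::size_t i = 0; i < A.size(); ++i) {
        C[B[i]] += A[i];                // scatter-add A[i] into its bucket
    }
    return C;
}
```

Any parallel version should reproduce this output up to floating-point rounding, since the order of the additions will differ.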
I cannot use N threads with atomicAdd(), since it does not support double as far as I know. A direct implementation with P threads suffers from heavy divergence. Is there a better way?