I have an array A[0...N] of double and an array B[0...N] of int. Each B[i] takes a value in [0...P]. All I need to do is compute the array C[0...P]:
C[j] = SUM( A[i] : B[i] = j)
I cannot use N threads with the atomicAdd() function, since it does not support double as far as I know. A direct implementation with P threads diverges heavily. Is there a better way?
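For reference, the direct version with one thread per output element might look roughly like the sketch below. The names `bucketSumDirectKernel` and `bucketSumDirect` are hypothetical, and the sketch assumes `n` input elements, `p` output bins, and every B[i] in the range 0 to p-1. It is only meant to make the computation concrete, not to be an efficient solution.

```cuda
#include <cuda_runtime.h>

// One thread per output bin j: each thread scans all n inputs and adds up
// the A[i] whose label B[i] equals j. No atomics are needed, but every
// thread reads the whole input, so the total work is O(n * p).
__global__ void bucketSumDirectKernel(const double* A, const int* B,
                                      int n, int p, double* C)
{
    int j = blockIdx.x * blockDim.x + threadIdx.x;
    if (j >= p) return;               // guard threads past the last bin

    double sum = 0.0;
    for (int i = 0; i < n; ++i) {
        if (B[i] == j) sum += A[i];   // branch outcome differs per thread
    }
    C[j] = sum;
}

// Hypothetical host wrapper (assumed name): copies the inputs to the device,
// launches the kernel, and copies the p sums back into hC.
// Error checking is omitted to keep the sketch short.
void bucketSumDirect(const double* hA, const int* hB, int n, int p, double* hC)
{
    double* dA = nullptr;
    double* dC = nullptr;
    int* dB = nullptr;

    cudaMalloc(&dA, n * sizeof(double));
    cudaMalloc(&dB, n * sizeof(int));
    cudaMalloc(&dC, p * sizeof(double));

    cudaMemcpy(dA, hA, n * sizeof(double), cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB, n * sizeof(int), cudaMemcpyHostToDevice);

    const int threads = 128;                        // arbitrary block size
    const int blocks = (p + threads - 1) / threads; // enough blocks to cover p
    bucketSumDirectKernel<<<blocks, threads>>>(dA, dB, n, p, dC);

    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(hC, dC, p * sizeof(double), cudaMemcpyDeviceToHost);

    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
}
```

Calling `bucketSumDirect(A, B, n, p, C)` from host code fills C with the per-bin sums. Each thread walks the entire input, which is the cost I would like to avoid.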