I am computing Euclidean distances between n-dimensional points using OpenCL. I receive two lists of n-dimensional points and have to return an array containing only the distances from each point in the first list to each point in the second list.
My approach is a regular double loop (for each point in list 1 { for each point in list 2 { ... } }), with the calculation for each pair of points done in parallel.
The Euclidean distance calculation is then split into 4 steps:
1. take the difference between the two points in each dimension;
2. square that difference (still per dimension);
3. sum all the values obtained in step 2;
4. take the square root of the value obtained in step 3 (this step is omitted in this example).
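For reference, steps 1 to 3 for a single pair of points can be sketched in plain host-side C as below. This is only an illustration of the arithmetic, not the kernel itself; the name squared_distance is a hypothetical helper that does not appear in my code.

```c
#include <stddef.h>

/* Hypothetical helper (not part of the kernel): sum of squared
   per-dimension differences between two n-dimensional points. */
float squared_distance(const float *a, const float *b, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        float diff = a[i] - b[i];  /* step 1: difference in dimension i */
        sum += diff * diff;        /* steps 2 and 3: square and accumulate */
    }
    return sum;                    /* step 4 would be sqrtf(sum), omitted here */
}
```

Calling it on two arrays of 128 floats gives the value that one entry of the result vector should hold.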
Everything works like a charm until I try to accumulate the sum of the squared differences (that is, step 3 of the procedure above, the C[loop] += ... statement in the kernel below).
As test data I use DescriptorLists with 2 points each:
DescriptorList1: 001, 002, 003, ..., 127, 128 (P1); 129, 130, 131, ..., 255, 256 (P2)
DescriptorList2: 000, 001, 002, ..., 126, 127 (P1); 128, 129, 130, ..., 254, 255 (P2)
Thus, the resulting vector should contain the values 128, 2064512, 2130048, 128 (sums of squared differences, since the square root is omitted). Instead I get random numbers that change with every run.
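The expected values can be checked on the host with a serial reference, sketched below under the assumption that each list stores its points back to back with 128 floats per point. The names DIMS, all_pairs_squared and fill_test_data are hypothetical and only serve this check.

```c
#include <stddef.h>

#define DIMS 128  /* dimensions per point, assumed equal to BLOCK_SIZE */

/* Hypothetical host-side reference: out[i * num_b + j] receives the sum of
   squared differences between point i of list a and point j of list b. */
void all_pairs_squared(const float *a, size_t num_a,
                       const float *b, size_t num_b, float *out)
{
    for (size_t i = 0; i < num_a; i++) {
        for (size_t j = 0; j < num_b; j++) {
            float sum = 0.0f;
            for (size_t d = 0; d < DIMS; d++) {
                float diff = a[i * DIMS + d] - b[j * DIMS + d];
                sum += diff * diff;
            }
            out[i * num_b + j] = sum;
        }
    }
}

/* Fills the two test lists: list1 holds 1..256, list2 holds 0..255. */
void fill_test_data(float *list1, float *list2)
{
    for (size_t k = 0; k < 2 * DIMS; k++) {
        list1[k] = (float)(k + 1);
        list2[k] = (float)k;
    }
}
```

Within each pair, every dimension differs by the same constant (1, -127, 129 and 1 respectively), so each result is 128 times the square of that constant.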
I would appreciate any help or pointers on what I am doing wrong. I hope the scenario I am working in is clear.
#define BLOCK_SIZE 128

typedef struct
{
    int length;
    int num_elements;
    __global float *elements;
} DescriptorList;

__kernel void CompareDescriptors_deb(__global float *C, DescriptorList A, DescriptorList B, int elements, __local float As[BLOCK_SIZE])
{
    int gpidA = get_global_id(0);
    int featA = get_local_id(0);
    float dif_acum[BLOCK_SIZE];
    int loop = 0;

    for (int i = 0; i < A.num_elements/BLOCK_SIZE; i++){
        DescriptorList tmpA = GetDescriptor(A, i);
        As[featA] = GetElement(tmpA, 0, featA);
        barrier(CLK_LOCAL_MEM_FENCE);

        for (int k = 0; k < B.num_elements/BLOCK_SIZE; k++){
            dif_acum[featA] = As[featA]-B.elements[k*BLOCK_SIZE + featA];
            barrier(CLK_LOCAL_MEM_FENCE);

            C[loop] = 0;
            C[loop] += dif_acum[featA]*dif_acum[featA];
            loop += 1;
            barrier(CLK_LOCAL_MEM_FENCE);
        }
    }
}
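
A possible explanation, offered as an assumption rather than a verified diagnosis: dif_acum is a private array, so every work-item has its own copy, and all 128 work-items run C[loop] = 0 and C[loop] += ... on the same global element with no synchronization between them. That is a data race, which would fit results that change from run to run. A common alternative is a tree reduction in local memory, where only one work-item writes the final sum. The sketch below emulates a single work-group in plain host-side C so the pattern can be run and checked. Each iteration of a loop over lid stands in for one work-item, and the comments mark where a kernel would call barrier(CLK_LOCAL_MEM_FENCE). The names group_squared_distance and scratch are hypothetical.

```c
#include <stddef.h>

#define BLOCK_SIZE 128  /* must be a power of two for this reduction */

/* Emulates one work-group computing the sum of squared differences
   between two points of BLOCK_SIZE dimensions each. */
float group_squared_distance(const float *a, const float *b)
{
    float scratch[BLOCK_SIZE];  /* would be __local memory in a kernel */

    /* Every work-item stores its own squared difference. */
    for (size_t lid = 0; lid < BLOCK_SIZE; lid++) {
        float diff = a[lid] - b[lid];
        scratch[lid] = diff * diff;
    }
    /* barrier(CLK_LOCAL_MEM_FENCE) would go here */

    /* Tree reduction: halve the number of active work-items each round. */
    for (size_t stride = BLOCK_SIZE / 2; stride > 0; stride /= 2) {
        for (size_t lid = 0; lid < stride; lid++) {
            scratch[lid] += scratch[lid + stride];
        }
        /* barrier(CLK_LOCAL_MEM_FENCE) would go here */
    }

    /* In a kernel, only work-item 0 would write this value to C. */
    return scratch[0];
}
```

In a kernel version the barriers have to sit outside any branch on the local id, so that every work-item in the group reaches them.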