How big are the image and the kernel? If the kernel is large, you can use FFT-based convolution; for small kernels, just use direct convolution.
The DSP may not be the best way to do this: just because it has a MAC instruction does not mean it will be more efficient. Does the ARM processor on the Beagle board have NEON SIMD? If so, that may be the way to go (and more fun too).
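To get a feel for where the crossover might lie, here is a rough operation-count sketch. It is only a back-of-the-envelope model, not a benchmark: the function names are made up for illustration, the FFT estimate assumes three power-of-two 2D transforms (image, kernel, inverse) plus one pointwise multiply, and all constant factors and memory effects are ignored.

```c
#include <assert.h>

/* Smallest power of two >= x (x > 0). */
static unsigned long next_pow2(unsigned long x)
{
    unsigned long p = 1;
    while (p < x)
        p <<= 1;
    return p;
}

/* log2 of a value that is already a power of two. */
static unsigned ilog2(unsigned long p)
{
    unsigned l = 0;
    while (p > 1) {
        p >>= 1;
        ++l;
    }
    return l;
}

/* Direct convolution: roughly one multiply-accumulate per kernel tap per pixel. */
unsigned long direct_cost(unsigned long m, unsigned long n, unsigned long k)
{
    return m * n * k * k;
}

/* FFT convolution (hypothetical cost model): pad to powers of two,
   count N*log2(N) per 2D transform, three transforms, one pointwise multiply. */
unsigned long fft_cost(unsigned long m, unsigned long n, unsigned long k)
{
    assert(m > 0 && n > 0 && k > 0);
    unsigned long p = next_pow2(m + k - 1);
    unsigned long q = next_pow2(n + k - 1);
    unsigned long pq = p * q;
    return 3 * pq * (ilog2(p) + ilog2(q)) + pq;
}
```

Under this model, a 512x512 image favours direct convolution for a 3x3 kernel and FFT for a 31x31 kernel, but the real crossover depends heavily on the hardware and the FFT library, so measure before committing.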
For a small kernel, you can do a direct convolution as follows:
    // in, out are mxn images (integer data)
    // K is the kernel size (KxK) - currently needs to be an odd number, e.g. 3
    // coeffs[K][K] is a 2D array of integer coefficients
    // scale is a scaling factor to normalise the filter gain

    for (i = K / 2; i < m - K / 2; ++i) // iterate through image
    {
        for (j = K / 2; j < n - K / 2; ++j)
        {
            int sum = 0; // sum will be the sum of input data * coeff terms

            for (ii = - K / 2; ii <= K / 2; ++ii) // iterate over kernel
            {
                for (jj = - K / 2; jj <= K / 2; ++jj)
                {
                    int data = in[i + ii][j + jj];
                    int coeff = coeffs[ii + K / 2][jj + K / 2];

                    sum += data * coeff;
                }
            }
            out[i][j] = sum / scale; // scale sum of convolution products and store in output
        }
    }
You can modify this to support even values of K. It just takes a little care with the upper/lower bounds on the two inner loops.
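One way the bounds might be adjusted for even K is sketched below. This is an illustration under some assumptions rather than the only option: the function name `convolve2d` is made up, the images and kernel are stored as flat row-major arrays, and for even K the extra tap is placed above/left of the anchor pixel (the opposite choice works just as well). Border pixels are left untouched, as in the loop above.

```c
#include <assert.h>

/* Direct 2D filtering with a k x k kernel, k odd or even.
   in, out: m x n images stored row-major, i.e. pixel (i, j) is in[i * n + j].
   coeffs:  k x k kernel stored row-major.
   scale:   divisor that normalises the filter gain. */
void convolve2d(const int *in, int *out, int m, int n,
                const int *coeffs, int k, int scale)
{
    assert(k > 0 && k <= m && k <= n && scale != 0);

    int lo = k / 2;      /* taps above/left of the anchor pixel */
    int hi = k - 1 - lo; /* taps below/right; equals lo when k is odd */

    for (int i = lo; i < m - hi; ++i) {
        for (int j = lo; j < n - hi; ++j) {
            int sum = 0;
            for (int ii = -lo; ii <= hi; ++ii) {
                for (int jj = -lo; jj <= hi; ++jj) {
                    int data = in[(i + ii) * n + (j + jj)];
                    int coeff = coeffs[(ii + lo) * k + (jj + lo)];
                    sum += data * coeff;
                }
            }
            out[i * n + j] = sum / scale;
        }
    }
}
```

For odd k this reduces to the same bounds as the original loop (lo and hi are both K / 2); for even k the window simply extends one pixel further up and to the left than down and to the right.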
Paul r