OpenMP parallel code does not have the same output as serial code

I had to change and extend my algorithm for some signal analysis (using the polyphase filter technique) and could not reuse my old OpenMP code. In the new code the results are not as expected: the values at the first positions of the array are incorrect compared with a serial run (the serial code produces the expected result).

In the first loop, tFFTin holds some FFT data, which I multiply by a window function.

The goal is for each thread to run the inner loops for one polyphase factor. To avoid locks, I use a reduction pragma (the standard defines no reduction for complex types, so I declare my own, in which each thread's omp_priv variable is initialized with omp_orig [that is, with tFFTin]). I use the ordered pragma because the results have to be added to the output vector in order.

typedef std::complex<float> TComplexType;
typedef std::vector<TComplexType> TFFTContainer;

#pragma omp declare reduction(complexMul:TFFTContainer:\
        transform(omp_in.begin(), omp_in.end(),\
                omp_out.begin(), omp_out.begin(),\
                std::multiplies<TComplexType>()))\
                initializer (omp_priv(omp_orig))


void ConcreteResynthesis::ApplyPolyphase(TFFTContainer& tFFTin, TFFTContainer& tFFTout, TWindowContainer& tWindow, *someparams*) {


#pragma omp parallel for shared(tWindow) firstprivate(sFFTParams) reduction(complexMul: tFFTin) ordered  if(iFFTRawDataLen>cMinParallelSize)
    for (int p = 0; p < uPolyphase; ++p) {
        int iPolyphaseOffset = p * uFFTLength;
        for (int i = 0; i < uFFTLength; ++i) {
            tFFTin[i] *= tWindow[iPolyphaseOffset + i]; ///< get FFT input data from raw data
        }    

#pragma omp ordered
        {
            // using the overlap and add method
            for (int i = 0; i < sFFTParams.uFFTLength; ++i) {
                pDataPool->GetFullSignalData(workSignal)[mSignalPos + iPolyphaseOffset + i] += tFFTin[i];
            }
        }

    }

    mSignalPos = mSignalPos + mStep;
}

Is there a race condition or something else that produces the wrong output at the start? Or do I have a logic error?
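To make the question concrete, here is a minimal sketch that emulates in plain serial C++ what I understand a multiplicative reduction to do: every thread gets a private copy, works only on that copy, and at the end the original variable is combined with all private copies. The names `Outcome`, `run_serial` and `emulate_reduction` are made up for this illustration, a single scalar stands in for tFFTin, and the contiguous split of iterations across threads is an assumption (the real schedule depends on the implementation).

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<float>;

// Hypothetical helper type for this illustration only.
struct Outcome {
    std::vector<Complex> seen;  // value visible right after each iteration's multiply
    Complex result;             // value of the variable after the whole loop
};

// Serial reference: one shared value is scaled by every window factor in turn.
Outcome run_serial(Complex x, const std::vector<Complex>& window) {
    Outcome out{{}, x};
    for (const Complex& w : window) {
        out.result *= w;
        out.seen.push_back(out.result);
    }
    return out;
}

// Emulates a multiplicative reduction: each "thread" owns a private copy
// that starts at `init`, processes a contiguous block of iterations
// (assumed schedule), and is finally multiplied into the original value.
Outcome emulate_reduction(Complex x, Complex init,
                          const std::vector<Complex>& window,
                          std::size_t numThreads) {
    assert(numThreads > 0);
    const std::size_t n = window.size();
    const std::size_t chunk = (n + numThreads - 1) / numThreads;
    Outcome out{std::vector<Complex>(n), x};
    for (std::size_t t = 0; t < numThreads; ++t) {
        Complex priv = init;  // private copy of this "thread"
        for (std::size_t k = t * chunk; k < n && k < (t + 1) * chunk; ++k) {
            priv *= window[k];
            out.seen[k] = priv;  // what an ordered block would read here
        }
        out.result *= priv;  // combiner: original *= private copy
    }
    return out;
}
```

With an input of 2, window factors 3 and 5, and two emulated threads, the serial run sees 6 and then 30. Initializing the private copies with the original value gives 6 and 10 per iteration and a final value of 120, because the original enters the product three times. Initializing with the multiplicative identity 1 restores the final value of 30, but the per-iteration values (3 and 5) still differ from the serial ones, since a private copy never contains the work of the other threads.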

Another question: I do not really like my solution with the ordered pragma. Is there a better approach? (I tried to use a reduction for this as well, but the compiler does not allow a pointer type there.)
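One direction that might avoid both the reduction and the ordered block is to parallelize over the sample index i instead of the polyphase index p, and to keep the running product in a local variable. This is only a sketch under assumptions: `apply_polyphase` and its parameters are hypothetical stand-ins for the member function above, the window is assumed to hold real floats, uFFTLength and sFFTParams.uFFTLength are assumed to be equal, and the cumulative in-place multiplication of the serial loop is assumed to be the intended behavior.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<float>;

// Hypothetical free-function version of the loop nest. Each index i is
// handled by exactly one thread, and every (p, i) pair maps to its own
// element of `signal`, so no two threads write to the same location.
void apply_polyphase(std::vector<Complex>& fftIn,
                     const std::vector<float>& window,   // assumed real-valued
                     std::vector<Complex>& signal,
                     std::size_t signalPos,
                     int polyphase) {
    const int fftLength = static_cast<int>(fftIn.size());

    #pragma omp parallel for
    for (int i = 0; i < fftLength; ++i) {
        Complex acc = fftIn[i];  // running product for this sample only
        for (int p = 0; p < polyphase; ++p) {
            const std::size_t offset =
                static_cast<std::size_t>(p) * fftLength + i;
            acc *= window[offset];            // same order as the serial loop
            signal[signalPos + offset] += acc; // overlap-add into own slot
        }
        fftIn[i] = acc;  // keep the serial loop's in-place side effect
    }
}
```

Because each element is multiplied in the same order as in the serial loop, the per-element arithmetic should match the serial run. The price is a strided walk through window and signal in the inner loop, which may be less cache-friendly than the original loop order.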


