In the chart posted below, I compare the results of an IFFT run in FFTW and in CUFFT.
What are the possible reasons for the results coming out different? Is it really that much rounding error?
Here is the relevant code snippet:
cufftHandle plan;
cufftComplex *d_data;
cufftComplex *h_data;
cudaMalloc((void**)&d_data, sizeof(cufftComplex)*W);
complex<float> *temp = (complex<float>*)fftwf_malloc(sizeof(fftwf_complex) * W);
h_data = (cufftComplex *)malloc(sizeof(cufftComplex)*W);
memset(h_data, 0, W*sizeof(cufftComplex));
cufftPlan1d(&plan, W, CUFFT_C2C, 1);
if (!reader->getData(rowBuff, row))
    return 0;
The FFTW plan, ifft, was defined as follows:
ifft = fftwf_plan_dft_1d(freqCols,
                         reinterpret_cast<fftwf_complex*>(indata),
                         reinterpret_cast<fftwf_complex*>(outdata),
                         FFTW_BACKWARD, FFTW_ESTIMATE);
To generate the graph, I dumped h_data and outdata after fftw_execute. W is the width of one row of the image I am processing.
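To put a number on the mismatch instead of judging it from the chart, the two dumped buffers could be compared element by element. This is only a minimal sketch: `maxRelativeError` is a hypothetical helper that is not part of my code, and it assumes both results have already been copied into `std::vector<std::complex<float>>` containers of the same length.

```cpp
#include <algorithm>
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original code): largest element-wise
// difference between two buffers, relative to the largest magnitude in `ref`.
float maxRelativeError(const std::vector<std::complex<float>>& ref,
                       const std::vector<std::complex<float>>& test)
{
    assert(ref.size() == test.size());
    float maxDiff = 0.0f;  // largest |ref[i] - test[i]| seen so far
    float maxMag = 0.0f;   // largest |ref[i]| seen so far
    for (std::size_t i = 0; i < ref.size(); ++i) {
        maxDiff = std::max(maxDiff, std::abs(ref[i] - test[i]));
        maxMag = std::max(maxMag, std::abs(ref[i]));
    }
    // Guard against an all-zero reference buffer.
    return maxMag > 0.0f ? maxDiff / maxMag : maxDiff;
}
```

Here `ref` would hold the FFTW output (outdata) and `test` the CUFFT output (h_data, where each cufftComplex stores the real part in `.x` and the imaginary part in `.y`). For single-precision transforms I would expect pure rounding error to give a very small value, so a large one would suggest the difference comes from somewhere else.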
Am I missing anything obvious?

c++ cuda fftw
Derek