    #pragma omp parallel
    for(int i=0; i<N; i++) {
        ...
    }
This code creates a parallel region in which every thread executes the entire loop. In other words, the full loop runs once per thread, instead of the threads sharing the loop and completing all iterations only once.
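To see the difference, here is a minimal sketch that counts how many times the loop body runs in each case. The function names `count_replicated` and `count_shared` are made up for this illustration; it assumes you compile with OpenMP enabled (for example `-fopenmp` with GCC). Without OpenMP the pragmas are ignored and both functions simply return `n`.

```c
#include <assert.h>

/* Hypothetical helper: the loop sits in a bare parallel region,
   so every thread in the team runs all n iterations. */
long count_replicated(int n) {
    assert(n >= 0);
    long total = 0;
    #pragma omp parallel reduction(+:total)
    for (int i = 0; i < n; i++) {
        total += 1;   /* each thread adds n to its private copy */
    }
    return total;     /* n * (number of threads in the team) */
}

/* Hypothetical helper: "parallel for" splits the n iterations
   among the threads, so the body runs n times in total. */
long count_shared(int n) {
    assert(n >= 0);
    long total = 0;
    #pragma omp parallel for reduction(+:total)
    for (int i = 0; i < n; i++) {
        total += 1;
    }
    return total;     /* n, regardless of the thread count */
}
```

Calling `count_replicated(1000)` should give a multiple of 1000 that depends on how many threads the runtime starts, while `count_shared(1000)` should always give 1000.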
You can do:
    #pragma omp parallel
    {
        #pragma omp for
        for( int i=0; i < N; ++i )
        {
        }
        #pragma omp for
        for( int i=0; i < N; ++i )
        {
        }
    }
This will create one parallel region (that is, one fork/join, which is expensive, so you do not want one for every loop) and run several loops in parallel inside that region. Just make sure that if you are already inside a parallel region you use #pragma omp for, not #pragma omp parallel for, since the latter opens a new nested parallel region in every thread: with nested parallelism enabled, each of your threads spawns a further team of threads to execute the loop.
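As a rough sketch of the single-region pattern with actual loop bodies, the example below fills one array and then derives a second one from it. The function `fill_then_double` and its parameters are invented for illustration. It relies on the implicit barrier at the end of an `omp for` without `nowait`, so the second loop only starts once the first has finished on every thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: two worksharing loops inside ONE parallel
   region, so the thread team is forked and joined only once. */
void fill_then_double(int *a, int *b, int n) {
    assert(a != NULL && b != NULL && n >= 0);
    #pragma omp parallel
    {
        /* First loop: iterations are divided among the team. */
        #pragma omp for
        for (int i = 0; i < n; ++i) {
            a[i] = i;
        }
        /* Implicit barrier here: all of a[] is written before
           any thread starts the second loop. */
        #pragma omp for
        for (int i = 0; i < n; ++i) {
            b[i] = 2 * a[i];
        }
    }
}
```

After `fill_then_double(a, b, n)` returns, `a[i]` holds `i` and `b[i]` holds `2 * i` for every index, whether or not OpenMP is enabled at compile time.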
Ryanp