I have two scenarios for measuring metrics such as computation time and parallel speedup (sequential_time / parallel_time).
Scenario 1:
Sequential time measurement:
    startTime = omp_get_wtime();
    for loop computation
    endTime = omp_get_wtime();
    seq_time = endTime - startTime;
Parallel time measurement:
    startTime = omp_get_wtime();
    #pragma omp parallel for reduction(+:pi) private(i)
    for (blah blah) {
        computation;
    }
    endTime = omp_get_wtime();
    paralleltime = endTime - startTime;
    speedup = seq_time / paralleltime;
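To make Scenario 1 concrete, here is a minimal sketch in C. The actual computation is not shown in this post, so the midpoint-rule estimate of pi is an assumption suggested only by the `reduction(+:pi)` clause. The names `wall_time`, `pi_sequential`, `pi_parallel`, and `scenario1_speedup` are hypothetical helpers, and the `clock()` fallback is there only so the sketch still builds when OpenMP is not enabled.

```c
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Wall-clock seconds. Assumption: fall back to clock() when the file
   is built without OpenMP, so the sketch compiles either way. */
static double wall_time(void) {
#ifdef _OPENMP
    return omp_get_wtime();
#else
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}

/* Assumed workload: midpoint-rule estimate of the integral of
   4/(1+x^2) over [0,1], which equals pi. */
static double pi_sequential(long n) {
    double step = 1.0 / (double)n, sum = 0.0;
    for (long i = 0; i < n; i++) {
        double x = ((double)i + 0.5) * step;
        sum += 4.0 / (1.0 + x * x);
    }
    return sum * step;
}

/* Same loop, split across threads with a sum reduction. */
static double pi_parallel(long n) {
    double step = 1.0 / (double)n, sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; i++) {
        double x = ((double)i + 0.5) * step;
        sum += 4.0 / (1.0 + x * x);
    }
    return sum * step;
}

/* Scenario 1: one timer pair around each whole loop, so thread
   creation and the reduction are inside the timed region.
   Returns seq_time / par_time (0.0 if the timer resolution is too
   coarse to measure the parallel run). */
static double scenario1_speedup(long n, double *pi_seq, double *pi_par) {
    double t0 = wall_time();
    *pi_seq = pi_sequential(n);
    double seq_time = wall_time() - t0;

    t0 = wall_time();
    *pi_par = pi_parallel(n);
    double par_time = wall_time() - t0;

    return par_time > 0.0 ? seq_time / par_time : 0.0;
}
```

Calling `scenario1_speedup(n, &a, &b)` for increasing `n` (built with something like `-fopenmp` on GCC) should reproduce the kind of measurement described here; the exact numbers will depend on the machine and thread count.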
Scenario 2:
Sequential time measurement:
    for loop {
        startTime = omp_get_wtime();
        computation;
        endTime = omp_get_wtime();
        seq_time += endTime - startTime;
    }
Parallel time measurement:
    #pragma omp parallel for reduction(+:pi, paralleltime) private(i, startTime, endTime)
    for (blah blah) {
        startTime = omp_get_wtime();
        computation;
        endTime = omp_get_wtime();
        paralleltime = endTime - startTime;
    }
    speedup = seq_time / paralleltime;
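For comparison, a minimal sketch of Scenario 2 in C. It makes two assumptions that are not confirmed by the snippet here: the workload is again a midpoint-rule estimate of pi, and the parallel loop is meant to accumulate with `+=` the way the sequential one does. `wall_time`, `timed_sequential`, and `timed_parallel` are hypothetical names. Note that with a `+` reduction, the value returned by `timed_parallel` is the sum of the per-iteration intervals measured on all threads.

```c
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Wall-clock seconds. Assumption: fall back to clock() when the file
   is built without OpenMP, so the sketch compiles either way. */
static double wall_time(void) {
#ifdef _OPENMP
    return omp_get_wtime();
#else
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}

/* Scenario 2, sequential: time every iteration and accumulate. */
static double timed_sequential(long n, double *pi_out) {
    double step = 1.0 / (double)n, sum = 0.0, total = 0.0;
    for (long i = 0; i < n; i++) {
        double t0 = wall_time();
        double x = ((double)i + 0.5) * step;
        sum += 4.0 / (1.0 + x * x);
        total += wall_time() - t0;
    }
    *pi_out = sum * step;
    return total;
}

/* Scenario 2, parallel: the timers sit inside the loop body, so
   thread creation and the final reduction are outside every timed
   interval. Each thread adds its own intervals to a private copy of
   total, and the reduction sums those copies at the end. */
static double timed_parallel(long n, double *pi_out) {
    double step = 1.0 / (double)n, sum = 0.0, total = 0.0;
    #pragma omp parallel for reduction(+:sum, total)
    for (long i = 0; i < n; i++) {
        double t0 = wall_time();   /* declared in the body, so private */
        double x = ((double)i + 0.5) * step;
        sum += 4.0 / (1.0 + x * x);
        total += wall_time() - t0;
    }
    *pi_out = sum * step;
    return total;
}
```

The ratio `timed_sequential(n, &a) / timed_parallel(n, &b)` then corresponds to the `speedup` computed in this scenario, under the assumptions above.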
I know that Scenario 2 is NOT the best production code, but I think it measures the actual theoretical performance because it OVERLOOKS the overhead of creating OpenMP threads and managing them (thread switching). Thus, it will give us linear speedup. Scenario 1, on the other hand, includes the overhead of spawning and managing threads.
My doubt is this: with Scenario 1, I get a speedup that starts out linear but tapers off as I go to more iterations. With Scenario 2, I get fully linear speedup regardless of the number of iterations. I was told that Scenario 1 should actually give me linear speedup regardless of the number of iterations, but I think it won't, because of the high overhead of thread management. Can someone explain to me why I'm wrong?
Thanks! And sorry for the rather long post.