How to use omp parallel for and omp simd together?

I want to test #pragma omp parallel for and #pragma omp simd on a simple matrix addition program. When I use each of them separately, I get no errors and everything seems fine. But I want to check how much performance can be gained by using both of them. If I put #pragma omp parallel for before the outer loop and #pragma omp simd before the inner loop, I also get no errors. The errors occur when I put both of them before the outer loop. I get the errors at compile time, not at run time. ICC and GCC report errors, but Clang does not. Perhaps this is because Clang disregards the parallelization: in my experiments, Clang does not parallelize the loop and runs the program with only one thread.

The program is here:

    #include <stdio.h>
    //#include <x86intrin.h>
    #define N 512
    #define M N

    int __attribute__((aligned(32))) a[N][M],
        __attribute__((aligned(32))) b[N][M],
        __attribute__((aligned(32))) c_result[N][M];

    int main()
    {
        int i, j;
        #pragma omp parallel for
        #pragma omp simd
        for (i = 0; i < N; i++) {
            for (j = 0; j < M; j++) {
                c_result[i][j] = a[i][j] + b[i][j];
            }
        }
        return 0;
    }

Errors:

ICC:

    IMP1.c(20): error: omp directive is not followed by a parallelizable for loop
      #pragma omp parallel for
      ^

    compilation aborted for IMP1.c (code 2)

GCC:

    IMP1.c: In function 'main':
    IMP1.c:21:10: error: for statement expected before '#pragma'
      #pragma omp simd

In another experiment of mine, #pragma omp simd on the outer loop gave better performance, so I need to put it there (don't I?).

Platform: Intel Core i7 6700 HQ, Fedora 27

Tested Compilers: ICC 18, GCC 7.2, Clang 5

Compiler Command Line:

icc -O3 -qopenmp -xHOST -no-vec

gcc -O3 -fopenmp -march=native -fno-tree-vectorize -fno-tree-slp-vectorize

clang -O3 -fopenmp=libgomp -march=native -fno-vectorize -fno-slp-vectorize

1 answer

From the OpenMP 4.5 specification:

2.11.4 Parallel loop SIMD Construct

The parallel loop SIMD construct is a shortcut for specifying a parallel construct containing one loop SIMD construct and no other statement.

The syntax of the parallel loop SIMD construct is as follows:

#pragma omp parallel for simd ...

You can also write:

    #pragma omp parallel
    {
        #pragma omp for simd
        for ...
    }