Existence of "simd reduction(:)" in GCC and MSVC?

The simd pragma can be used with the icc compiler to vectorize a reduction:

 #pragma simd
 #pragma simd reduction(+:acc)
 #pragma ivdep
 for(int i( 0 ); i < N; ++i )
 {
     acc += x[i];
 }

Is there an equivalent solution in MSVC and/or GCC?

Ref (p28): http://d3f8ykwhia686p.cloudfront.net/1live/intel/CompilerAutovectorizationGuide.pdf

3 answers

GCC can definitely vectorize. Suppose you have a reduc.c file with the contents:

 int foo(int *x, int N)
 {
     int acc = 0;  /* start the running sum at zero */
     int i;
     for( i = 0; i < N; ++i )
     {
         acc += x[i];
     }
     return acc;
 }

Compile it (I used gcc 4.7.2) with the command line:

 $ gcc -O3 -S reduc.c -ftree-vectorize -msse2 

Now you can see the vectorized loop in the generated assembly (reduc.s).

In addition, you can ask the vectorizer for a verbose report (newer GCC releases replace this flag with -fopt-info-vec), say, with

 $ gcc -O3 -S reduc.c -ftree-vectorize -msse2 -ftree-vectorizer-verbose=1 

You will now receive a console report:

 Analyzing loop at reduc.c:5

 Vectorizing loop at reduc.c:5

 5: LOOP VECTORIZED.
 reduc.c:1: note: vectorized 1 loops in function.

Check out the white papers for a better understanding of the cases where GCC can and cannot vectorize.


For Visual Studio 2012: with the /O1, /O2 and /GL options, use /Qvec-report:1 or /Qvec-report:2 to report vectorization:

 int s = 0;
 for ( int i = 0; i < 1000; ++i )
 {
     s += A[i]; // vectorizable
 }

For reductions over the "float" or "double" type, vectorization requires the /fp:fast switch to be set. This is because vectorizing the reduction depends on floating-point reassociation, and reassociation is allowed only when /fp:fast is selected.
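To see why reassociation needs an explicit opt-in, here is a minimal C sketch. It is not taken from the cookbook: the names sum_sequential, sum_four_lanes and sample are made up for illustration. It compares a strict left-to-right float sum with the four-partial-sums order that a 4-wide SIMD reduction would typically use, assuming IEEE 754 single precision with no excess intermediate precision.

```c
#include <stddef.h>

/* Strict left-to-right sum: the order the language semantics prescribe. */
float sum_sequential(const float *x, size_t n)
{
    float acc = 0.0f;
    size_t i;
    for (i = 0; i < n; ++i) {
        acc += x[i];
    }
    return acc;
}

/* Scalar emulation of a 4-wide SIMD reduction: element i is added to
   partial sum i % 4, and the four partial sums are combined at the end. */
float sum_four_lanes(const float *x, size_t n)
{
    float lane[4] = { 0.0f, 0.0f, 0.0f, 0.0f };
    size_t i;
    for (i = 0; i < n; ++i) {
        lane[i % 4] += x[i];
    }
    return (lane[0] + lane[1]) + (lane[2] + lane[3]);
}

/* Hypothetical sample input: in float, 1e8f + 1.0f rounds back to 1e8f. */
static const float sample[8] = { 1e8f, 1.0f, 1.0f, 1.0f,
                                 -1e8f, 1.0f, 1.0f, 1.0f };
```

Under those assumptions, sum_sequential(sample, 8) yields 3 while sum_four_lanes(sample, 8) yields 6, which is the exact sum. The vectorized order is no less accurate here, but it is a different result, so the compiler may only switch to it when the floating-point model permits reassociation.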

Ref (related doc; p12) http://blogs.msdn.com/b/nativeconcurrency/archive/2012/07/10/auto-vectorizer-in-visual-studio-11-cookbook.aspx


gcc requires -ffast-math to enable this optimization for float or double reductions (as mentioned in the link above), regardless of the use of #pragma omp simd reduction. icc is becoming less dependent on pragmas for this optimization (except that /fp:fast is needed in the absence of a pragma), but the additional ivdep and simd pragmas in the original question are undesirable. icc can do bad things when given a simd pragma that does not include all the relevant reduction, firstprivate and lastprivate clauses (and gcc can break with -ffast-math, especially when combined with -march or -mavx).

msvc 2012/2013 is very limited in auto-vectorization: there is no vectorization inside OpenMP parallel regions, no vectorization of conditional expressions, and no advantage taken of __restrict in vectorization (there is a run-time check that vectorizes less efficiently, but safely, without __restrict).
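The #pragma omp simd reduction form mentioned above can be sketched as follows. This is a minimal sketch, not code from the answers: the function name sum_omp_simd is hypothetical, and it assumes a compiler with OpenMP 4.0 SIMD support (for GCC, version 4.9 or later with -fopenmp or -fopenmp-simd). A compiler that does not recognize the pragma should simply ignore it and compile an ordinary scalar loop, so this integer sum gives the same result either way.

```c
/* Sum n ints. The pragma asks the compiler to vectorize the loop and to
   treat acc as a '+' reduction variable. */
int sum_omp_simd(const int *x, int n)
{
    int acc = 0;
    int i;
    #pragma omp simd reduction(+:acc)
    for (i = 0; i < n; ++i) {
        acc += x[i];
    }
    return acc;
}
```

With GCC this would be built with something like gcc -O3 -fopenmp-simd, which enables the simd pragmas without pulling in the OpenMP threading runtime.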
