Will gfortran or ifort compilers intelligently use SIMD instructions when summing the product of two arrays?

Question

I have code written with numpy and I am considering porting it to Fortran for better performance.

One operation that I do several times is summing the element-wise product of two arrays:

sum(A*B)

This looks like something fused multiply-add (FMA) instructions could help with. My current processor does not support these instructions, so I cannot test it yet. However, I may upgrade to a new processor that supports FMA3 (an Intel Haswell processor).

Does anyone know whether compiling the program with "-march=native" (or the ifort equivalent) is enough to get the compiler (either gfortran or ifort) to use SIMD instructions intelligently to optimize this code, or will I need to hand-hold the compilers or the code?

+4

fortran simd gfortran intel-fortran fma

lnmaurer Jan 10 '14 at 17:43

3 answers

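With -march=native on a machine that supports SIMD, the compiler can generate SIMD instructions; the corresponding flag for ifort is -xHost.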

Whether the compilers do this "intelligently" is less certain. [...] -O3 with ifort and gfortran (SIMD [...]). [...]

Library routines are another option, for example vdmul from MKL or gsl_vector_mul from GSL.

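You can compile with -march=NEWARCH to generate code for NEWARCH, but the resulting binary may not run on a processor that lacks NEWARCH's instructions. With -mtune=NEWARCH the code is only tuned for NEWARCH and does not use its new instructions. [...]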

With ifort, the flag -vec-report=1 reports which loops were vectorized. [...] gfortran [...].

+2

Xiaolei Zhu Jan 11 '14 at 17:53

With gfortran, sum(a*b) [...] dot_product(a,b) [...] fma with AVX2.
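As a small illustration of the two spellings mentioned here, the sketch below computes the same reduction as sum(a*b), as dot_product(a, b), and as an explicit loop. The program name and data are made up for the example; the values are small integers so that every partial sum is exactly representable and the three results must agree regardless of summation order. It says nothing about which form a given compiler vectorizes better.

```fortran
program compare_dot
  implicit none
  integer, parameter :: n = 8
  real :: a(n), b(n), s_loop
  integer :: i

  ! Hypothetical test data: small integer values, so all products and
  ! partial sums are exact in single precision.
  a = [(real(i), i = 1, n)]
  b = [(real(2*i), i = 1, n)]

  ! Explicit loop form of the same reduction.
  s_loop = 0.0
  do i = 1, n
    s_loop = s_loop + a(i)*b(i)
  end do

  ! All three values should be 408 (2 * sum of i**2 for i = 1..8).
  print *, sum(a*b), dot_product(a, b), s_loop

  if (sum(a*b) /= dot_product(a, b)) error stop "sum and dot_product differ"
  if (sum(a*b) /= s_loop) error stop "sum and loop differ"
end program compare_dot
```

With general floating-point data the three forms may differ in the last bits, because the compiler is free to use a different summation order or fused operations for each.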

dot_product ([...]) [...] fma [...]. gfortran [...] simd fma [...] dot_product [...].

Use -O2 -ftree-vectorize -ffast-math -march=native or -O3 -ffast-math -march=native ([...]); gfortran [...] OpenMP.
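The OpenMP remark is not fully recoverable here, so the following is only one possible reading: a minimal sketch, assuming a gfortran recent enough to accept the OpenMP 4.0 simd directive. The reduction clause tells the compiler it may reorder the additions, which is the same permission -ffast-math grants globally. The program name, array size, and tolerance are made up for the example.

```fortran
program simd_reduction
  implicit none
  integer, parameter :: n = 1024
  real :: a(n), b(n), s
  integer :: i

  ! Runtime data so the result cannot be folded at compile time.
  call random_number(a)
  call random_number(b)

  s = 0.0
  ! Ask for a vectorized reduction; ignored as a comment unless the
  ! compiler is run with -fopenmp or -fopenmp-simd.
  !$omp simd reduction(+:s)
  do i = 1, n
    s = s + a(i)*b(i)
  end do

  print *, s, dot_product(a, b)

  ! The two results may differ in the last bits because the summation
  ! order can differ, so compare with a loose relative tolerance.
  if (abs(s - dot_product(a, b)) > 1.0e-3*abs(s)) error stop "results disagree"
end program simd_reduction
```

A possible way to build it is gfortran -O2 -fopenmp-simd -march=native simd_reduction.f90; whether SIMD or FMA instructions are actually emitted still has to be checked in the disassembly.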

gfortran 4.9 [...] -ftree-vectorizer-verbose [...]. -fdump-tree-vect produces a .vect file [...] gcc.

+1

tim18 28 . '14 17:37

lnmaurer · Accepted Answer · 2014-01-11T21:47:56+0000

Thanks to the hint from Xiaolei Zhu, I now know that gfortran will use fused multiply-add instructions to optimize sum(A*B). For example, take this code:

program test
implicit none
real, dimension(7) :: a, b
a = (/ 2.0, 3.0, 5.0, 7.0, 11.0, 13.0, 17.0 /)
b = (/ 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0 /)
print *, sum(a*b)
end program

After compiling it with f95 sum.f95 -o sum -O3 -march=core-avx2, running objdump -d sum | grep vfmadd prints:

40088b: c4 e2 71 99 44 24 30    vfmadd132ss 0x30(%rsp),%xmm1,%xmm0
400892: c4 e2 69 b9 44 24 34    vfmadd231ss 0x34(%rsp),%xmm2,%xmm0
400899: c4 e2 61 b9 44 24 38    vfmadd231ss 0x38(%rsp),%xmm3,%xmm0
4008a0: c4 e2 59 b9 44 24 3c    vfmadd231ss 0x3c(%rsp),%xmm4,%xmm0
4008a7: c4 e2 51 b9 44 24 40    vfmadd231ss 0x40(%rsp),%xmm5,%xmm0
4008ae: c4 e2 49 b9 44 24 44    vfmadd231ss 0x44(%rsp),%xmm6,%xmm0
4008b5: c4 e2 41 b9 44 24 48    vfmadd231ss 0x48(%rsp),%xmm7,%xmm0

So gfortran unrolls the sum into 7 fused multiply-add instructions. [...] vfmadd231ss (the scalar single-precision form).
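The listing above contains only scalar instructions (the ss suffix), which is not surprising for a 7-element array of compile-time constants. To see whether the packed form (vfmadd231ps for single precision) is generated, one could try a variant like the sketch below, with a larger array filled at run time. The program name, file name, and array size are made up for the example, and no output is shown because the result depends on the compiler version and flags.

```fortran
program sum_large
  implicit none
  integer, parameter :: n = 4096
  real, allocatable :: a(:), b(:)

  allocate(a(n), b(n))

  ! Fill the arrays at run time so the compiler cannot evaluate the
  ! reduction at compile time.
  call random_number(a)
  call random_number(b)

  ! The reduction whose generated code we want to inspect.
  print *, sum(a*b)
end program sum_large
```

Assuming the file is saved as sum_large.f95, it can be built and inspected the same way as before, for example with f95 sum_large.f95 -o sum_large -O3 -ffast-math -march=core-avx2 followed by objdump -d sum_large | grep vfmadd. The -ffast-math flag is included because the other answer lists it among the options for vectorizing the reduction.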
