Does Clang have something like #pragma GCC target?

I have some code that uses AVX intrinsics when they are available on the current CPU. In GCC and Clang, unlike Visual C++, you must enable the instruction set on the command line in order to use the intrinsics.

The problem with GCC and Clang is that when you enable these options, you give the compiler free rein to use those instructions anywhere in the source file. This is very bad when you have header files that contain inline functions or template functions, because the compiler will generate these functions using AVX instructions.

When linking, duplicate functions are discarded. However, because some source files were compiled with -mavx and some were not, the various compilations of the inline/template functions will differ. If you are unlucky, the linker will pick the version that contains AVX instructions, and the program will crash when run on a system without AVX.

GCC solves this with #pragma GCC target. You can turn off the special instructions around header files, and the code generated for them will not use AVX:

    #pragma GCC push_options
    #pragma GCC target("no-avx")
    #include "MyHeader.h"
    #pragma GCC pop_options

Does Clang have anything like this? It appears to ignore these pragmas and generates AVX code anyway.

1 answer

You should probably use static inline instead of inline, so that a version of a function compiled with -mavx is only used by callers in that translation unit.

The linker will still merge actual duplicates, instead of just picking one non-inline definition by name.

This also has the advantage that the compiler does not waste time emitting a stand-alone definition for functions that it decides to inline into every caller in that translation unit.
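As a minimal sketch of the static inline suggestion, a header helper might look like the following. The function name sum_floats and the header name in the comment are hypothetical and are not taken from the question's code.

```cpp
// my_header.h (hypothetical header name, for illustration only)
#include <cassert>
#include <cstddef>

// `static` gives the function internal linkage: every translation unit that
// includes this header gets its own private copy, so the linker has no
// same-named external definition to swap in from a file built with -mavx.
static inline float sum_floats(const float* data, std::size_t n) {
    assert(data != nullptr || n == 0);
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}
```

A file compiled with -mavx may then get an AVX version of sum_floats, while a file compiled without it keeps a baseline version, and neither replaces the other at link time.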


The gcc/clang approach makes sense if you are used to it and design your code for it. Note that MSVC also needs AVX enabled if you are compiling functions that use AVX. Otherwise it will mix VEX and non-VEX encodings, which leads to large penalties, instead of using the VEX encoding for something like a 128-bit _mm_add_ps in the horizontal add at the end of a _mm256_add_ps loop.

So you basically have the same problem with MSVC: compiling _mm_whatever will produce AVX-only machine code.
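The pattern mentioned above, a _mm256_add_ps loop followed by a 128-bit _mm_add_ps horizontal add, can be sketched as follows. This sketch makes assumptions that are not part of the answer: it uses the GCC/Clang per-function __attribute__((target("avx"))) extension so that only one function is compiled for AVX, and __builtin_cpu_supports for the runtime check. The names sum_avx, sum_scalar and sum_dispatch are hypothetical. It does not apply to MSVC.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>

// Only this function is compiled for AVX (GCC/Clang extension), so both the
// 256-bit loop and the 128-bit tail below are emitted with VEX encodings.
__attribute__((target("avx")))
static float sum_avx(const float* data, std::size_t n) {
    assert(n % 8 == 0);  // sketch only: no remainder handling
    __m256 acc = _mm256_setzero_ps();
    for (std::size_t i = 0; i < n; i += 8) {
        acc = _mm256_add_ps(acc, _mm256_loadu_ps(data + i));
    }
    // Horizontal add: fold 8 lanes to 4, then 2, then 1.
    __m128 lo = _mm256_castps256_ps128(acc);
    __m128 hi = _mm256_extractf128_ps(acc, 1);
    __m128 sum = _mm_add_ps(lo, hi);
    sum = _mm_add_ps(sum, _mm_movehl_ps(sum, sum));
    sum = _mm_add_ss(sum, _mm_shuffle_ps(sum, sum, 0x55));
    return _mm_cvtss_f32(sum);
}
#endif

// Baseline version, safe on any CPU.
static float sum_scalar(const float* data, std::size_t n) {
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}

// Calls the AVX version only when the running CPU reports AVX support.
float sum_dispatch(const float* data, std::size_t n) {
#if defined(__x86_64__) || defined(__i386__)
    if (n % 8 == 0 && __builtin_cpu_supports("avx")) {
        return sum_avx(data, n);
    }
#endif
    return sum_scalar(data, n);
}
```

Callers use only sum_dispatch, so AVX instructions stay confined to sum_avx and the rest of the file can be built without -mavx.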
