How to use this macro to check memory alignment?

I am a SIMD beginner, and I have read this article on the topic (since I use an AVX2-compatible machine).

I have also read this question about how to check whether a pointer is aligned.

I am testing it with this main.cpp toy example:

    #include <iostream>
    #include <immintrin.h>

    #define is_aligned(POINTER, BYTE_COUNT) \
        (((uintptr_t)(const void *)(POINTER)) % (BYTE_COUNT) == 0)

    int main()
    {
        float a[8];
        for(int i=0; i<8; i++){
            a[i]=i;
        }
        __m256 evens = _mm256_set_ps(2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0);
        std::cout<<is_aligned(a, 16)<<" "<<is_aligned(&evens, 16)<<std::endl;
        std::cout<<is_aligned(a, 32)<<" "<<is_aligned(&evens, 32)<<std::endl;
    }

I compile it with icpc -std=c++11 -o main main.cpp.

The resulting output is:

    1 1
    1 1

However, if I add these 3 lines before the 4 prints:

    for(int i=0; i<8; i++)
        std::cout<<a[i]<<" ";
    std::cout<<std::endl;

This is the result:

    0 1 2 3 4 5 6 7
    1 1
    0 1

In particular, I do not understand that last 0. Why is it different from the previous run? What am I missing?

1 answer

Your is_aligned (which is a macro, not a function) checks whether an object happens to be located at a particular boundary. It does not tell you the alignment requirement of the object's type.

For an array of float, the compiler only guarantees that it is aligned to at least the alignment requirement of float, which is typically 4. 32 is not a factor of 4, so there is no guarantee that the array is aligned to a 32-byte boundary. However, many memory addresses are divisible by both 4 and 32, so an address on a 4-byte boundary may happen to be on a 32-byte boundary as well. That is what happened in your first test, but as explained, there is no guarantee that it will. In your second test you added some code, and the array ended up at a different memory location. That location happened not to be on a 32-byte boundary.

To request a stricter alignment, which SIMD instructions may require, you can use the alignas specifier:

 alignas(32) float a[8]; 