GCC - How to realign the stack?

I am trying to build an application that uses pthreads and the SSE type __m128. According to the GCC manual, the default stack alignment is 16 bytes. Using __m128 requires 16-byte alignment.

My target processor supports SSE. I use a GCC version that does not support realigning the stack at runtime (e.g. -mstackrealign). I cannot use another version of GCC.

My test application looks like this:

    #include <xmmintrin.h>
    #include <pthread.h>

    void *f(void *x){
        __m128 y;
        ...
    }

    int main(void){
        pthread_t p;
        pthread_create(&p, NULL, f, NULL);
    }

The application raises an exception and exits. After some simple debugging (printf("%p", &y)), I found that the variable y is not 16-byte aligned.
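The alignment check described above can be sketched as a small helper. This is only a minimal sketch, not the code used in the question; the name misalignment_of is hypothetical. It returns how many bytes an address lies past the previous multiple of the requested alignment, so a result of 0 means the address is aligned.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of bytes by which p exceeds the previous
   multiple of `alignment`. A result of 0 means p is aligned. */
static size_t misalignment_of(const void *p, size_t alignment)
{
    return (size_t)((uintptr_t)p % alignment);
}
```

Inside f, printing misalignment_of(&y, 16) next to the address shows directly whether y sits on a 16-byte boundary.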

My question is: how can I correctly realign the stack (to 16 bytes) without using any GCC flags or attributes (they do not help)? Should I use GCC inline assembly in the thread function f()?

+7
c gcc stack pthreads sse
5 answers

I solved this problem. Here is my solution:

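    void another_function(){
        __m128 y;
        ...
    }

    void *f(void *x){
        asm("pushl %esp");         /* push the current stack pointer */
        asm("subl $16,%esp");      /* grow the stack by 16 bytes */
        asm("andl $-0x10,%esp");   /* round %esp down to a multiple of 16 */
        another_function();
        /* Caveat: this pops from the realigned top of the stack, not from the
           slot written by pushl, so restoring %esp in practice relies on the
           function epilogue reloading it from %ebp. */
        asm("popl %esp");
        return NULL;
    }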

First, we grow the stack by 16 bytes. Second, we clear the least significant nibble of the stack pointer. We preserve the stack pointer with push/pop instructions. We then call another function, which has all of its local variables aligned to 16 bytes. All nested functions will also have their local variables aligned to 16 bytes.

And it works!
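The pointer arithmetic performed by the three instructions before the call can be modelled in plain C. This is a sketch under the assumption of a 32-bit stack pointer and a 4-byte push; the function name realigned_esp is hypothetical, and the code only simulates the arithmetic, it does not touch the real stack.

```c
#include <stdint.h>

/* Models the effect of these instructions on a 32-bit stack pointer:
   pushl %esp; subl $16,%esp; andl $-0x10,%esp */
static uint32_t realigned_esp(uint32_t esp)
{
    esp -= 4;                 /* pushl stores one 4-byte word */
    esp -= 16;                /* subl $16,%esp */
    esp &= ~(uint32_t)15;     /* andl $-0x10,%esp clears the low nibble */
    return esp;
}
```

For example, a starting value of 0x1008 ends up at 0xff0, which is below the original pointer and a multiple of 16.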

0

Allocate an array on the stack that is 15 bytes larger than sizeof(__m128), and use the first aligned address in this array. If you need several, allocate them in one array with a single 15-byte margin for alignment.

I don’t remember whether allocating an unsigned char array makes you safe from the compiler's strict-aliasing optimizations, or whether it only works the other way round.

    #include <stdint.h>
    #include <xmmintrin.h>

    void *f(void *x)
    {
        unsigned char y[sizeof(__m128)+15];
        /* round the start of y up to the next 16-byte boundary */
        __m128 *py = (__m128*) (((uintptr_t)&y + 15) & ~(uintptr_t)15);
        ...
    }
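The rounding step can be checked in isolation, without any SSE types. The following is a minimal sketch; the names align_up and aligned_slot_fits are hypothetical, and a plain 16-byte slot stands in for __m128.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: round p up to the next multiple of `alignment`,
   which must be a power of two. */
static void *align_up(void *p, size_t alignment)
{
    uintptr_t mask = (uintptr_t)alignment - 1;
    return (void *)(((uintptr_t)p + mask) & ~mask);
}

/* Returns 1 if a 16-byte slot carved out of an oversized local buffer is
   aligned and lies entirely inside that buffer, 0 otherwise. */
static int aligned_slot_fits(void)
{
    unsigned char raw[16 + 15];              /* payload plus worst-case slack */
    unsigned char *slot = align_up(raw, 16);
    return ((uintptr_t)slot % 16 == 0) &&
           (slot >= raw) && (slot + 16 <= raw + sizeof raw);
}
```

The 15 extra bytes are the worst case: the start of the buffer can be at most 15 bytes short of the next 16-byte boundary, so the aligned slot always fits.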
+7

This should not happen in the first place, but to solve the problem you can try:

    void *f(void *x)
    {
        __m128 y __attribute__ ((aligned (16)));
        ...
    }
+3

Another solution would be to use a padding function that first aligns the stack and then calls f. So instead of calling f directly, you call pad, which first pads the stack and then calls f with the stack aligned.

The code will look like this:

    #include <stdint.h>
    #include <xmmintrin.h>
    #include <pthread.h>

    #define ALIGNMENT 16

    void *f(void *x)
    {
        __m128 y;
        // other stuff
        return NULL;
    }

    void *pad(void *val)
    {
        unsigned int x; // to get the current address from the stack
        // uintptr_t instead of unsigned int: a pointer may not fit in an int
        unsigned char padding[ALIGNMENT - ((uintptr_t) &x) % ALIGNMENT];
        return f(val);
    }

    int main(void){
        pthread_t p;
        pthread_create(&p, NULL, pad, NULL);
        pthread_join(p, NULL); // wait for the thread before exiting
    }
+1

Sorry to resurrect an old thread...

For those with a newer compiler than the OP's: the OP mentions -mstackrealign, which led me to __attribute__((force_align_arg_pointer)). If your function is optimized to use SSE but %ebp is misaligned, this attribute makes the necessary corrections at runtime, transparently. I also found out that this is only a problem on i386. The x86_64 ABI guarantees that arguments are aligned to 16 bytes.

    __attribute__((force_align_arg_pointer))
    void i_crash_when_not_aligned_to_16_bytes()
    {
        ...
    }
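Applied to a thread entry point, the attribute might be used roughly as follows. This is a sketch that assumes GCC or a compatible compiler; the macro REALIGN_STACK and the function thread_entry are hypothetical names, and the attribute is only applied on x86 targets because it is x86-specific.

```c
#include <stddef.h>
#include <stdint.h>

/* The attribute is x86-specific, so it is only applied there (assumption:
   GCC or a compatible compiler). */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define REALIGN_STACK __attribute__((force_align_arg_pointer))
#else
#define REALIGN_STACK
#endif

/* Hypothetical thread entry point. Stores 1 in *arg if a local that
   requests 16-byte alignment really is aligned, 0 otherwise. */
REALIGN_STACK
static void *thread_entry(void *arg)
{
    unsigned char buf[16] __attribute__((aligned(16)));
    /* volatile keeps the compiler from folding the check away */
    volatile uintptr_t addr = (uintptr_t)buf;
    *(int *)arg = (addr % 16 == 0);
    return NULL;
}
```

Such a function would be passed to pthread_create in place of f, with a pointer to an int as its argument, so the caller can inspect the result after joining the thread.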

A cool article for those who might want to know more: http://wiki.osdev.org/System_V_ABI

0