Why are there extra instructions in my gcc output?

GCC compiles (using gcc --omit-frame-pointer -S):

  int the_answer() { return 42; } 

to

  .text
  .globl _the_answer
  _the_answer:
      subl $12, %esp
      movl $42, %eax
      addl $12, %esp
      ret
  .subsections_via_symbols

What is the constant '$12' for, and what is the %esp register doing?

+5
assembly gcc x86
Feb 01 '09 at 0:05
4 answers

Short answer: stack frames.

Long answer: when you call a function, the compiler adjusts the stack pointer to make room for local data, such as the function's variables. Your code changes esp, the stack pointer, so I assume that is what is happening here. I would have thought GCC was smart enough to optimize this away where it is not required, but you may not be compiling with optimization enabled.
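To make the "local data" point concrete, here is a minimal sketch contrasting the function from the question with one that has a local array passed by address. The names sum and answer_from_locals are made up for illustration. Whether a given compiler actually reserves stack space for either function depends on the target and the optimization level, so treat the comments as the typical case, not a guarantee.

```c
#include <assert.h>
#include <stddef.h>

/* The function from the question: the result fits in a register,
   so no stack space is inherently needed for it. */
int the_answer(void) { return 42; }

/* Hypothetical helper: reads the values through a pointer, so the
   caller's array has to exist somewhere in memory. */
int sum(const int *values, size_t n)
{
    assert(values != NULL || n == 0);
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += values[i];
    return total;
}

/* Hypothetical example: the local array typically lives in this
   function's stack frame, which the prologue reserves by lowering
   the stack pointer and the epilogue releases again. */
int answer_from_locals(void)
{
    int parts[3] = { 40, 1, 1 };
    return sum(parts, 3);
}
```

Compiling both functions with -S and comparing the output is one way to see which of them gets a subl/addl pair on esp.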

+8
Feb 01 '09 at 0:12
  _the_answer:
      subl $12, %esp
      movl $42, %eax
      addl $12, %esp
      ret

The first subl decrements the stack pointer to make room for variables that may be used in your function. One slot may be used for the frame pointer and another to hold the return address, for example. You said that it should omit the frame pointer. Usually that means it omits the loads/stores that save and restore the frame pointer, but the code often still reserves memory for it. The reason is that this makes code that analyzes the stack much easier: the stack offset can be given a minimum width, so you know you can always access FP + 0x12 to get at the first local variable slot, even if saving the frame pointer is omitted.

As far as I know, eax on x86 is used to hand the return value back to the caller. The final addl simply tears down the frame that was created for your function.

The instructions generated at the beginning and end of a function are called the function's "prologue" and "epilogue". Here is what my port does when it needs to create a function prologue in GCC (it is much more complex for real-world ports that aim to be as fast and versatile as possible):

  void eco32_prologue(void) {
      int i, j;
      /* reserve space for all callee saved registers, and 2 additional ones:
       * for the frame pointer and return address */
      int regs_saved = registers_to_be_saved() + 2;
      int stackptr_off = (regs_saved * 4 + get_frame_size());

      /* decrement the stack pointer */
      emit_move_insn(stack_pointer_rtx,
                     gen_rtx_MINUS(SImode, stack_pointer_rtx,
                                   GEN_INT(stackptr_off)));

      /* save return address, if we need to */
      if(eco32_ra_ever_killed()) {
          /* note: reg 31 is return address register */
          emit_move_insn(gen_rtx_MEM(SImode,
                             plus_constant(stack_pointer_rtx,
                                           -4 + stackptr_off)),
                         gen_rtx_REG(SImode, 31));
      }

      /* save the frame pointer, if it is needed */
      if(frame_pointer_needed) {
          emit_move_insn(gen_rtx_MEM(SImode,
                             plus_constant(stack_pointer_rtx,
                                           -8 + stackptr_off)),
                         hard_frame_pointer_rtx);
      }

      /* save callee save registers */
      for(i=0, j=3; i<FIRST_PSEUDO_REGISTER; i++) {
          /* if we ever use the register, and if it is not used in calls
           * (would be saved already) and it is not a special register */
          if(df_regs_ever_live_p(i) && !call_used_regs[i] && !fixed_regs[i]) {
              emit_move_insn(gen_rtx_MEM(SImode,
                                 plus_constant(stack_pointer_rtx,
                                               -4 * j + stackptr_off)),
                             gen_rtx_REG(SImode, i));
              j++;
          }
      }

      /* set the new frame pointer, if it is needed now */
      if(frame_pointer_needed) {
          emit_move_insn(hard_frame_pointer_rtx,
                         plus_constant(stack_pointer_rtx, stackptr_off));
      }
  }

I omitted some code that deals with other issues, mainly telling GCC which instructions are important for exception handling (for example, where the frame pointer is stored). Callee-saved registers are the ones the caller does not need to save before a call; the called function takes care of saving and restoring them as needed. As you can see in the first lines, we always allocate space for the return address and the frame pointer. That space is only a few bytes and does not matter, but we generate the stores/loads only when they are needed. Finally, note that the "hard" frame pointer is the "real" frame pointer register; this distinction is necessary for some GCC-internal reasons. The "frame_pointer_needed" flag is set by GCC whenever I must not omit storing the frame pointer. In some cases it has to be stored, for example when alloca is used (it changes the stack pointer dynamically). GCC takes care of all that. Please note that it has been a while since I wrote this code, so I hope the additional comments I added above are not all wrong :)
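The frame-size arithmetic in that prologue can be checked in isolation. The sketch below is not GCC code: frame_total_size and slot_offset are hypothetical helpers that only mirror the stackptr_off computation and the "-4 * j + stackptr_off" addressing from the port, assuming 4-byte (SImode) slots.

```c
#include <assert.h>

enum { WORD_SIZE = 4 };  /* assumption: each saved slot is 4 bytes (SImode) */

/* Bytes the prologue subtracts from the stack pointer: callee-saved
   registers plus two extra slots (frame pointer and return address),
   plus the space for local variables. */
int frame_total_size(int callee_saved_regs, int local_frame_size)
{
    assert(callee_saved_regs >= 0 && local_frame_size >= 0);
    int regs_saved = callee_saved_regs + 2;
    return regs_saved * WORD_SIZE + local_frame_size;
}

/* Offset of a save slot from the new (decremented) stack pointer.
   Slot 1 is the return address, slot 2 the frame pointer, and
   slots 3 and up are the callee-saved registers, matching j in the loop. */
int slot_offset(int total_size, int slot)
{
    assert(slot >= 1 && WORD_SIZE * slot <= total_size);
    return total_size - WORD_SIZE * slot;
}
```

For example, with two callee-saved registers and 16 bytes of locals the frame is 32 bytes; the return address sits at offset 28, the frame pointer at 24, and the first callee-saved register at 20, all relative to the new stack pointer.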

+4
Feb 01 '09 at 0:18

Stack alignment. On entry to the function, esp is -4 mod 16 (that is, 12 mod 16), because call has just pushed the 4-byte return address. Subtracting 12 realigns it to a 16-byte boundary. There is no good reason for the stack to be aligned to 16 bytes on x86, except in multimedia code that uses mmx/sse/etc. But somewhere in the 3.x era, the gcc developers decided that the stack should always be aligned anyway, imposing the prologue/epilogue overhead, increased stack size, and the resulting increase in cache thrashing on all programs for the sake of a few special-purpose interests (which, by the way, happen to be some of my own areas of interest, but I still think it is an unfair and bad decision).

Usually, if you enable any level of optimization, gcc will remove the unnecessary stack-alignment prologue/epilogue for leaf functions (functions that make no function calls themselves), but it will come back as soon as you start making calls.

You can also fix the problem with -mpreferred-stack-boundary=2 .
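The arithmetic behind the 12 can be spelled out in a few lines. This is only a sketch of the modular arithmetic, not of anything GCC does internally; the function names and the sample address 0xbffff000 are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes by which sp sits above the nearest lower multiple of boundary.
   boundary must be a power of two. */
uint32_t misalignment(uint32_t sp, uint32_t boundary)
{
    assert(boundary != 0 && (boundary & (boundary - 1)) == 0);
    return sp & (boundary - 1);
}

/* Stack pointer seen by the callee's first instruction: the caller had
   it aligned, then call pushed a 4-byte return address. */
uint32_t sp_at_entry(uint32_t aligned_sp_before_call)
{
    return aligned_sp_before_call - 4;
}

/* Amount the prologue has to subtract to get back onto the boundary. */
uint32_t prologue_padding(uint32_t sp, uint32_t boundary)
{
    return misalignment(sp, boundary);
}
```

With a hypothetical aligned esp of 0xbffff000 before the call, sp_at_entry gives 0xbfffeffc, and prologue_padding(0xbfffeffc, 16) is 12, matching the subl $12 in the question. -mpreferred-stack-boundary=2 asks for a 2^2 = 4 byte boundary, and prologue_padding(0xbfffeffc, 4) is 0, so no padding is needed for alignment in that case.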

+3
Mar 30 '11 at 14:20

Using GCC 4.3.2 I get this for a function:

  the_answer:
      movl $42, %eax
      ret

... plus the surrounding garbage, using the following command line:

  echo 'int the_answer() { return 42; }' | gcc --omit-frame-pointer -S -xc -o - -

Which version are you using?

+1
Feb 01 '09 at 0:15