No assembly of entire programs in general?

If you are writing an application that:

  • Single threaded
  • No cycles in call schedule
  • Doesn't use alloca or VLAs

Can modern compilers that optimize the entire program optimize the distribution of all stacks (for example, GCC, MSVC, ICC)? It seems that in these circumstances, he should be able to statically set all possible stack spaces. For "the whole program," I mean that the compiler has access to / all / source code (there is no way to perform operations at runtime, etc.).

+7
source share
4 answers

If you can guarantee the conditions you specified, then yes: it would be possible to efficiently completely statically distribute the stack. Each function will have a stack memory block.

However, will real compilers do this? Not.

It means absolutely nothing. In fact, he can get less than nothing. Often, most of the working stack is in the cache, so changing it is pretty cheap. If the stack was in static memory, then the only case when any particular function the memory “stack” function will be cached will be if you recently called this function. Using a real stack, you are likely to work in the cache.

In addition, by providing each function with a stack memory block, you can easily make the use of your static memory in your program much more than necessary. The stack is a fixed-size structure; no matter how many functions you have, the stack takes a certain size. If you have 100,000 functions, and each function occupies 64 bytes of space, then your static "stack" should occupy ~ 6.4 MB of space.

Why? You will never use most of this memory at any given time. The program will work perfectly in a stack of 1 MB or even 512 KB; Why take this memory 6 times for nothing?

Therefore, this is not a performance optimization and can inflate your program memory.

+2
source

This is a comment that is too long to be a comment:

Note that while all stack distributions can theoretically be optimized, more can be allocated than necessary. This is not what the OP asked about, but it may be interesting to consider. Finding the required minimum size will be equivalent to solving the stop problem. Imagine a program structured as:

<do 'something'> <call last thing which happens to require more stack space than everything else in 'something'> 

You only need extra stack space if <do 'something'> "stops".

You can also imagine other options in which optimization becomes arbitrarily difficult. For example, your program can simply evaluate a 3SAT expression with user input and do something depending on it - but this 3SAT expression may or may not have any value that leads to true.

Perhaps there is a more trivial case: the user may simply never enter an input that requires more stack space to process.

+1
source

This is possible for the compiler, but it will be such a specific optimization that it probably will not.

If you had a fully integrated program, you would take care of the overhead of setting up the stack frames for function calls.

However, if you want to also get rid of the stack distribution for local variables, the compiler must convert these local variables to global variables. I don’t know the compiler, and on some platforms additional instructions are required for referencing a global variable compared to a local variable (since the address must be loaded with two instructions, not one). In addition, since the reference to the stack variable is such a general operation, it is usually encoded into a smaller instruction.

+1
source

If you do not add “does not use any external libraries” to the list, any calls to external functions will require a stack, because they would be compiled, expecting the calling code to pass its parameters in a certain way, most likely to the stack. In addition, such libraries will almost certainly have to configure stacks for their own locales.

Also, depending on your exact application, even if you know that the stack can be assigned statically, it may be very difficult for the compiler to find out that there are no callbacks in your code, etc., which may result in the stack allocation required .

I just don’t see a case where the compilation will try to perform this optimization, because allocating stack space is already trivially faster (I assume a couple of register manipulations)

0
source

All Articles