Why are there local variables in stack-based IL bytecode?

In a stack-based intermediate language such as CIL or Java bytecode, why do local variables exist? You could, in principle, use only the stack. That may not be easy for a human writing IL by hand, but a compiler could certainly do it. Yet my C# compiler does not.

Both the evaluation stack and the local variables are private to the method and go out of scope when the method returns, so this cannot be about side effects visible from outside the method (say, from another thread).

If I understand correctly, the JIT compiler eliminates the loads and stores for both stack slots and local variables when generating machine code, so the JIT does not need the locals either.

Yet the C# compiler emits loads and stores for local variables even when compiling with optimizations enabled. Why?


Take, for example, the following contrived code example:

  static int X()
  {
      int a = 3;
      int b = 5;
      int c = a + b;
      int d;
      if (c > 5)
          d = 13;
      else
          d = 14;
      c += d;
      return c;
  }

Compiling this in C# with optimizations produces:

  ldc.i4.3          # Load constant int 3
  stloc.0           # Store in local var 0
  ldc.i4.5          # Load constant int 5
  stloc.1           # Store in local var 1
  ldloc.0           # Load from local var 0
  ldloc.1           # Load from local var 1
  add               # Add
  stloc.2           # Store in local var 2
  ldloc.2           # Load from local var 2
  ldc.i4.5          # Load constant int 5
  ble.s label1      # If less than or equal, goto label1
  ldc.i4.s 13       # Load constant int 13
  stloc.3           # Store in local var 3
  br.s label2       # Goto label2
  label1:
  ldc.i4.s 14       # Load constant int 14
  stloc.3           # Store in local var 3
  label2:
  ldloc.2           # Load from local var 2
  ldloc.3           # Load from local var 3
  add               # Add
  stloc.2           # Store in local var 2
  ldloc.2           # Load from local var 2
  ret               # Return the value

Note all the loads and stores on four local variables. I could express the same computation (leaving aside the obvious constant-propagation optimization) without using any local variables:

  ldc.i4.3          # Load constant int 3
  ldc.i4.5          # Load constant int 5
  add               # Add
  dup               # Duplicate top stack element
  ldc.i4.5          # Load constant int 5
  ble.s label1      # If less than or equal, goto label1
  ldc.i4.s 13       # Load constant int 13
  br.s label2       # Goto label2
  label1:
  ldc.i4.s 14       # Load constant int 14
  label2:
  add               # Add
  ret               # Return the value

This looks correct to me, and it is shorter and seemingly more efficient. So why do stack-based intermediate languages have local variables at all? And why does the optimizing compiler use them so heavily?
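To sanity-check that the stack-only sequence really computes the same result as the C# source, here is a minimal Python sketch of a stack machine. The opcode names (`ldc`, `ble`, etc.) are simplified stand-ins modeled on the CIL above, not real CIL; this is an illustration, not an IL interpreter.

```python
# Minimal stack-machine sketch (hypothetical opcodes modeled on the CIL above).

def run(program):
    """Execute a list of (opcode, arg) pairs; return the stack top at 'ret'."""
    # Resolve label positions first so branches can jump to them.
    labels = {arg: i for i, (op, arg) in enumerate(program) if op == "label"}
    stack, pc = [], 0
    while True:
        op, arg = program[pc]
        if op == "ldc":              # push a constant
            stack.append(arg)
        elif op == "add":            # pop two values, push their sum
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "dup":            # duplicate the top stack element
            stack.append(stack[-1])
        elif op == "ble":            # pop two; branch if value1 <= value2
            b, a = stack.pop(), stack.pop()
            if a <= b:
                pc = labels[arg]
                continue
        elif op == "br":             # unconditional branch
            pc = labels[arg]
            continue
        elif op == "label":          # no-op marker, fall through
            pass
        elif op == "ret":            # return the top of stack
            return stack.pop()
        pc += 1

# The stack-only version of X() from the question:
stack_only = [
    ("ldc", 3), ("ldc", 5), ("add", None),          # c = 3 + 5
    ("dup", None), ("ldc", 5), ("ble", "L1"),       # if c <= 5 goto L1
    ("ldc", 13), ("br", "L2"),                      # d = 13
    ("label", "L1"), ("ldc", 14),                   # d = 14
    ("label", "L2"), ("add", None), ("ret", None),  # return c + d
]

result = run(stack_only)
print(result)  # 21, matching the C# source: c = 8, d = 13, 8 + 13
```

Running it returns 21, the same value the local-variable version computes, which confirms the two listings are behaviorally equivalent for this input-free method.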

+8
java c# bytecode local-variables
2 answers

It depends on the situation, but especially when calls are involved and the arguments must be arranged on the stack to match the call signature, a pure stack is not enough unless you have registers or variables at your disposal. To get by with the stack alone, you would need additional stack-manipulation opcodes, such as one to swap the top two stack elements.
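The argument-ordering point can be sketched concretely. In this hedged Python sketch (a plain list stands in for the evaluation stack; the scenario and names are hypothetical), the operands for a call happen to be computed in the wrong order, so a stack-only scheme needs an explicit swap:

```python
# Sketch: getting call arguments into the right order on a pure stack.
# Suppose sub(a, b) expects a pushed first, then b, but evaluation
# happened to produce b first. Without locals, a "swap" opcode is needed:

stack = []
stack.append(10)                              # b was computed first
stack.append(3)                               # a was computed second
stack[-1], stack[-2] = stack[-2], stack[-1]   # swap: stack is now [a, b] = [3, 10]

b = stack.pop()                               # the callee pops in reverse order
a = stack.pop()
result = a - b                                # sub(a=3, b=10)
assert result == -7
```

With a local variable, the compiler instead just stores b into a local, computes a, and loads b back afterwards, with no extra stack-shuffling opcodes in the instruction set.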

In the end, even though everything could in principle be expressed purely in terms of the stack, doing so can add considerable complexity, code bloat, and optimization difficulty (local variables are ideal candidates for register allocation).

Also, remember that in .NET you can pass parameters by reference. How would you generate IL for a call to this method without a local variable?

 bool TryGet(int key, out string value) { value = null; return false; }
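Python has no `out` parameters, but the point can be modeled: in CIL the caller passes the address of a local slot (via `ldloca.s`), and a transient stack value has no address. In this hypothetical sketch, the caller's local-variable slots are a list and the "address" is an index into it; the function and variable names are invented for illustration:

```python
# Model of "out string value": the callee writes its result through an
# index into the caller's locals array (the analogue of ldloca.s).
# A value that lives only on the evaluation stack has no such address.

def try_get(table, key, caller_locals, out_index):
    """Hypothetical TryGet: stores the result through out_index."""
    if key in table:
        caller_locals[out_index] = table[key]   # write through the "address"
        return True
    caller_locals[out_index] = None             # out params must be assigned
    return False

table = {42: "answer"}
locals_ = [None]                    # caller reserves local slot 0 for 'value'
ok = try_get(table, 42, locals_, 0)
assert ok and locals_[0] == "answer"
```

The caller must reserve an addressable storage slot for `value` before the call, which is exactly what a local variable provides and a pure stack cell does not.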
+4

This answer is purely speculative, but I suspect there are three parts to it.

1: Transforming code to prefer dup over local variables is highly nontrivial, even if you ignore side effects. It adds a lot of complexity and potentially a lot of time spent in the optimizer.

2: You cannot ignore side effects. In an example where everything is a literal, it is easy to see that the values are on the stack or in locals and therefore fully under the control of the current instructions. As soon as values come from the heap, static memory, or method calls, you can no longer shuffle things around to use dup instead of locals: changing the order of operations can change behavior and cause unintended consequences through side effects or external access to shared memory. This means you usually cannot perform these optimizations.
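The side-effect hazard is easy to demonstrate. In this sketch (hypothetical functions, Python standing in for compiled code), reordering two calls preserves the arithmetic result but changes the observable effect order, which is exactly why a compiler may not reschedule evaluation just to enable dup/swap tricks:

```python
# Sketch: why side effects block stack-shuffling optimizations.
# When operands come from calls with observable effects, reordering
# the operand evaluation changes program behavior.

log = []

def f():
    log.append("f")   # observable side effect
    return 1

def g():
    log.append("g")   # observable side effect
    return 2

# Original evaluation order: f() first, then g().
r1 = f() + g()
order1 = list(log)

# A "rescheduled" version that evaluates g() first:
log.clear()
r2 = g() + f()
order2 = list(log)

assert r1 == r2 == 3     # the arithmetic result is identical...
assert order1 != order2  # ...but the observable effect order differs
```

Spilling an operand to a local pins down when it was computed, so later uses can safely read the stored value without re-running or reordering the effectful computation.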

3: The assumption that stack values are faster than local variables is not a good one. For a particular IL-to-machine-code translation it may happen to be true, but there is no reason a smart JIT could not put a stack slot in memory and a local variable in a register. Knowing what is fast and what is slow on the current machine is the JIT's job, and so is making that decision. By design, the CIL compiler does not answer the question of whether locals or stack slots are faster; so the only measurable difference the choice makes is code size.

Putting these together: 1 means the optimization is complicated and has a nontrivial cost, 2 means the real-world cases where it would pay off are few, and 3 means that, given 1 and 2, it hardly matters anyway.

Even if the goal is to minimize the size of the CIL, which is a measurable goal for a CIL compiler, reason 2 makes this a small improvement in a small number of cases. The Pareto principle cannot tell us that implementing this kind of optimization is a BAD idea, but it does suggest that development time is probably BETTER spent elsewhere.

0
