The principle you need is called "backpatching": fill in a dummy value for a direct jump, generate a code for the operator’s body, and then go back and replace the dummy value with the real one at the end.
eg.
# start of codegen for 'while' L1: [evaluate cond1] jne L2 # forward reference: use a dummy value for now # start of codegen for 'if ... else' L3: [evaluate cond2] jne L4 # another forward reference: use a dummy value for now [code in 'if' branch] j L5 # another forward reference: use a dummy value for now L4: [code in 'else' branch before 'break'] j L2 [code in 'else' branch after 'break'] L5: # end of 'if ... else' # now go back and fill in references to L4, L5 inside the 'if ... else' block # end of codegen for 'if ... else' # codegen for 'while' continues... j L1 # loop L2: # end of 'while' loop # now go back and fill in references to L2 inside the 'while' block # end of codegen for 'while'
Regarding your editing:
If I compile AST recursively when I reach the break statement within a number of loops and if-else blocks, how should the compiler know which empty label to go to? I suppose some kind of label with an inscription external to the recursive AST walk function.
The goal of a jump implementing the break statement is the end of the innermost closed loop; yes, you can track this with an explicit external stack.
But, if you have a recursive AST hosting function, you already have an implicit stack of call frames for recursive function calls, so you probably don't need an explicit one either.
eg. in
... while (outer) ... if (outer_done) break ... while (inner) ... if (inner_done) break ... [break from inner 'while' ends up here] ... if (outer_done_2) break ... [break from outer 'while' ends up here] ...
all code generation for the inner while occurs in the recursive motion of the AST for the body of the outer while . Inside this recursive call, you only need to take care of the inner loop, not the outer loop.
So all you have to do is:
- keep current backpatching state at the beginning of processing
while - run a new empty list to track any
break that appears in the body of this while - processes the body (which may result in a recursive call)
- apply backlinks
- then restore the previous state before exiting.
eg. something like that:
codegen for while: save previous break backpatch list initialise break backpatch list as empty perform codegen for evaluating condition perform codegen for body statements apply backpatches restore previous break backpatch list
The current break backpatch list should be part of some state that is passed to all codegen functions (you need to add code for break for it). But the saved list can simply be tracked as a local variable of the while codegen function.
Matthew slattery
source share