How do production compilers implement destructor handling across control flow?

In short: I am writing a compiler, and having reached the OOP features I face a dilemma about how to emit destructor calls. Basically, I have two options:

  • 1 - emit, at every exit point, calls to all destructors that must run there. This option is convenient and easy to implement, but it bloats the code, because depending on the control flow some destructor calls get duplicated several times.

  • 2 - put the destructor calls for each block of code behind labels and "spaghetti-jump" through only the ones that are needed. The upside: no destructor call is duplicated. The downside: it means non-sequential execution and jumping around, plus extra hidden variables and symbols, needed, for example, to tell whether a block is being left to continue execution in the parent block or via break / continue / goto / return, which increases complexity. And those extra variables and checks can easily eat the space this approach saves, depending on how many objects there are and how complex the structure and control flow around them is.

And I know the usual answer to such questions is "do both, profile, and decide", and that is what I would do if this were a trivial task, but building a full-featured compiler has turned out to be hard enough that I would rather ask an expert than build two bridges, see which one holds, and burn the other.

I tagged this C++ because the language I am implementing is somewhat similar to it, and because my compiler also models the RAII paradigm.

+7
c++ oop destructor flow-control compiler-design
4 answers

For the most part, a call to a destructor can be handled just like a regular function call.

The smaller part is EH. I have noticed that MSC generates a mix of inlined destructor calls in the "regular" code, and for x86-64 emits separate cleanup code, which itself may or may not contain a copy of the destructor logic.

IMO, the simplest solution is to always call non-trivial destructors like regular functions.

If optimization appears on the horizon, weigh those calls like everything else: will the code fit in cache alongside everything else? Will it take up too much space in the image? Etc.

The front end can insert "calls" to non-trivial destructors at the end of each live block in its AST output.

The back end can then treat these like ordinary function calls, chain them together, emit one big block-o'-destructors somewhere and jump to it, etc.

Folding functions with identical logic together seems pretty common. For example, MSC tends to fold all trivial functions into the same implementation, destructor or otherwise, optimizing or not.

This is mostly from experience. As usual, YMMV.

One more thing:

EH cleanup logic tends to work like a jump table. For a given function, you can simply jump to one of several lists of destructor calls, depending on where the exception was thrown (if applicable).

+4

I don't know how commercial compilers generate code, but assuming we ignore exceptions for now [1], the approach I would take is to emit a call to the destructor rather than inline it. Each destructor contains the complete destruction logic for its object. Use a loop to destroy arrays.

Inlining the calls is an optimization, and you should not do it unless you know it pays off (in code size and speed).

You will need to deal with "destruction at the end of the enclosing block", but as long as there are no early exits from the block, that should be easy. Block exits (return, break, etc.) mean that you need to jump to the code fragment that cleans up the block you are in.

[1] Commercial compilers have special tables keyed on "where the exception was thrown", and the cleanup code generated for this typically reuses the same cleanup for many exception points, with several entry labels into each cleanup fragment.

+2

An optimizing compiler transforms internal representations of the compiled source code.

Usually it builds a directed (and usually cyclic) graph of basic blocks. While constructing this control-flow graph, it inserts the calls to destructors.

For GCC (a free-software compiler, as is Clang/LLVM, so you can study its source code), you could try compiling simple C++ code with -fdump-tree-all and then see how this is done during gimplification. BTW, you can extend g++ with MELT to explore its internal representations.

By the way, I don't think dealing with destructors is that hard (note that in C++ they are called implicitly at syntactically defined places, e.g. at the closing } of their defining scope). Most of the work in such a compiler is optimization (and there, dealing with destructors is not very special; they are almost ordinary routines, like any other).

+1

Compilers use a combination of both approaches. MSVC uses inlined destructor calls for the normal code path, and cleanup code blocks in reverse order for early returns and exceptions. Along the normal path it uses a single hidden local integer to track construction progress so far, so it knows where to jump to for early returns. A single integer is sufficient because scopes always form a tree (rather than, say, a bitmask per object recording whether it has been successfully constructed). For example, the following fairly short code using a class with a non-trivial destructor, with some random returns sprinkled around ...

    ...
    if (randomBool()) return;
    Foo a;
    if (randomBool()) return;
    Foo b;
    if (randomBool()) return;
    {
        Foo c;
        if (randomBool()) return;
    }
    {
        Foo d;
        if (randomBool()) return;
    }
    ...

... can expand to pseudo-code like the following on x86, where the construction-progress value is bumped immediately after each constructor call (sometimes to more than the next sequential value) and is lowered (or "popped" back to an earlier value) immediately before each destructor call. Note that classes with trivial destructors do not affect this value.

    ...
    save previous exception handler   // for x86, not 64-bit table-based handling
    preallocate stack space for locals
    set new exception handler address to ExceptionCleanup
    set constructor progress = 0
    if randomBool(), goto Cleanup0

    Foo a;
    set constructor progress = 1      // Advance to 1
    if randomBool(), goto Cleanup1

    Foo b;
    set constructor progress = 2      // And once more
    if randomBool(), goto Cleanup2

    {
        Foo c;
        set constructor progress = 3
        if randomBool(), goto Cleanup3
        set constructor progress = 2  // Pop to 2 again
        c.~Foo();
    }
    {
        Foo d;
        set constructor progress = 4  // Increment 2 to 4, not 3 again
        if randomBool(), goto Cleanup4
        set constructor progress = 2  // Pop to 2 again
        d.~Foo();
    }

    // alternate Cleanup2
    set constructor progress = 1
    b.~Foo();
    // alternate Cleanup1
    set constructor progress = 0
    a.~Foo();

    Cleanup0:
    restore previous exception handler
    wipe stack space for locals
    return;

    ExceptionCleanup:
    switch (constructor progress)
    {
    case 0: goto Cleanup0;  // nothing to destroy
    case 1: goto Cleanup1;
    case 2: goto Cleanup2;
    case 3: goto Cleanup3;
    case 4: goto Cleanup4;
    }
    // admitting ignorance here, as I don't know how the exception
    // is propagated upward, and whether the exact same cleanup
    // blocks are shared for both early returns and exceptions.

    Cleanup4:
    set constructor progress = 2
    d.~Foo();
    goto Cleanup2;

    Cleanup3:
    set constructor progress = 2
    c.~Foo();
    // fall through to Cleanup2

    Cleanup2:
    set constructor progress = 1
    b.~Foo();

    Cleanup1:
    set constructor progress = 0
    a.~Foo();
    goto Cleanup0;  // or it may instead return directly here

The compiler can of course rearrange these blocks however it believes is more efficient, rather than putting all the cleanup at the end. Early returns might jump to the alternate Cleanup1/2 blocks at the end of the function instead. In 64-bit MSVC code, exceptions are handled through tables that map the instruction pointer at the point the exception occurred to the corresponding cleanup blocks.

+1
