Let's start with some context.
A custom memory pool used code similar to the following:
    struct FastInitialization {};

    template <typename T>
    T* create() {
        static FastInitialization const F = {};

        void* ptr = malloc(sizeof(T));
        memset(ptr, 0, sizeof(T));

        new (ptr) T(F);

        return reinterpret_cast<T*>(ptr);
    }
The idea is that, when called with FastInitialization, a constructor can assume that the storage is already zero-initialized and therefore only initialize those members that need a different value.
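To make the intended pattern concrete, here is a minimal sketch of a type written for this pool. `Widget` and its members are hypothetical names, not part of the original code, and the sketch returns the pointer produced by placement new rather than casting. Whether the constructor may actually rely on the zeros written before it ran is exactly what is in question, so the sketch only ever reads the member the constructor sets itself.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <new>

struct FastInitialization {};

// Hypothetical pooled type: only mScale needs a non-zero starting value.
struct Widget {
    // mCount is deliberately left alone, on the assumption that the pool
    // already zeroed the storage. That assumption is the open question.
    explicit Widget(FastInitialization) : mScale(1.0) {}

    double mScale;
    unsigned mCount;
};

template <typename T>
T* create() {
    static FastInitialization const F = {};

    void* ptr = std::malloc(sizeof(T));
    assert(ptr != nullptr);          // sketch only: no real error handling
    std::memset(ptr, 0, sizeof(T));  // zero the raw storage first

    return new (ptr) T(F);           // then construct in place
}
```

A caller would write `Widget* w = create<Widget>();` and later release it with `std::free(w)`, which is acceptable here only because `Widget` is trivially destructible.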
GCC (6.2 and 6.3 at least), however, applies an "interesting" optimization here.
    struct Memset {
        Memset(FastInitialization) { memset(this, 0, sizeof(Memset)); }

        double mDouble;
        unsigned mUnsigned;
    };

    Memset* make_memset() {
        return create<Memset>();
    }
Compiles to:
    make_memset():
            sub     rsp, 8
            mov     edi, 16
            call    malloc
            mov     QWORD PTR [rax], 0
            mov     QWORD PTR [rax+8], 0
            add     rsp, 8
            ret
But:
    struct DerivedMemset: Memset {
        DerivedMemset(FastInitialization f): Memset(f) {}

        double mOther;
        double mYam;
    };

    DerivedMemset* make_derived_memset() {
        return create<DerivedMemset>();
    }
Compiles to:
    make_derived_memset():
            sub     rsp, 8
            mov     edi, 32
            call    malloc
            mov     QWORD PTR [rax], 0
            mov     QWORD PTR [rax+8], 0
            add     rsp, 8
            ret
That is, only the first 16 bytes of the struct, the part corresponding to its base, are initialized. Debug information confirms that the call to memset(ptr, 0, sizeof(T)); has been completely elided.
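The byte counts can be checked with a little layout arithmetic. The sketch below uses layout-only copies of the two structs (constructors omitted on purpose); `base_bytes`, `uncovered_bytes` and `check_layout` are hypothetical helper names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Layout-only mirrors of the structs in the question.
struct Memset {
    double mDouble;
    unsigned mUnsigned;
};

struct DerivedMemset : Memset {
    double mOther;
    double mYam;
};

// Bytes written by the base constructor's memset(this, 0, sizeof(Memset)).
constexpr std::size_t base_bytes() { return sizeof(Memset); }

// Bytes of the derived object that the base constructor never touches.
constexpr std::size_t uncovered_bytes() {
    return sizeof(DerivedMemset) - sizeof(Memset);
}

// Sanity check: the base sub-object is strictly smaller than the whole.
inline void check_layout() {
    assert(base_bytes() < sizeof(DerivedMemset));
}
```

On the x86-64 target used in the listings, `base_bytes()` is 16 and `uncovered_bytes()` is 16, which matches the `mov edi, 16` and `mov edi, 32` arguments passed to malloc: once the pool's memset is dropped, mOther and mYam are never written.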
ICC and Clang, on the other hand, both call memset with the full size. Here is Clang's output:
make_derived_memset():
So the behavior of GCC and Clang differs, and the question becomes: is GCC correct and producing better assembly, or is Clang correct and GCC buggy?
Or, in language-lawyer terms:
In what circumstances can a constructor rely on a previous value stored in allocated memory?
Note: I assume this only matters with placement new, but I'm happy to be shown otherwise.