Accessing the common part of a union from a base class

I have a Result<T> template that holds a union of some error_type and T. I would like to expose the common part (the error) in a base class without resorting to virtual functions.

Here is my attempt:

    #include <exception>
    #include <new>
    #include <stdexcept>
    #include <type_traits>

    using error_type = std::exception_ptr;

    struct ResultBase
    {
        // Relies on mError living at offset zero of the derived object.
        error_type error() const
        {
            return *reinterpret_cast<const error_type*>(this);
        }

    protected:
        ResultBase() { }
    };

    template <class T>
    struct Result : ResultBase
    {
        Result() { new (&mError) error_type(); }
        ~Result() { mError.~error_type(); }

        void setError(error_type error) { mError = error; }

    private:
        union
        {
            error_type mError;
            T mValue;
        };
    };

    static_assert(std::is_standard_layout<Result<int>>::value, "");

    void check(bool condition)
    {
        if (!condition) std::terminate();
    }

    void f(const ResultBase& alias, Result<int>& r)
    {
        r.setError(std::make_exception_ptr(std::runtime_error("!")));
        check(alias.error() != nullptr);

        r.setError(std::exception_ptr());
        check(alias.error() == nullptr);
    }

    int main()
    {
        Result<int> r;
        f(r, r);
    }

(This is abbreviated; see the extended version if anything is unclear.)

The base class relies on standard layout to find the error field at offset zero. It then casts that pointer to error_type (assuming this is really the currently active member of the union).

Can this be considered portable? Or does it violate the pointer aliasing rules?


EDIT: My question was "is this portable", but many commenters are puzzled by the use of inheritance here, so I will clarify.

Firstly, this is a toy example. Please do not take it too literally or assume that the base class has no use.

The design has three goals:

  1. Compactness. The error and the result are mutually exclusive, so they should share a union.
  2. No runtime overhead. Virtual functions are excluded (besides, holding a vtable pointer conflicts with goal 1). RTTI is excluded too.
  3. Homogeneity. The common fields of different Result types must be accessible through homogeneous pointers or wrappers. For example, if instead of Result<T> we were talking about Future<T>, it should be possible to call whenAny(FutureBase& a, FutureBase& b) regardless of the concrete types of a and b.

If you are willing to sacrifice (1), this becomes trivial. Something like:

    struct ResultBase
    {
        error_type mError;
    };

    template <class T>
    struct Result : ResultBase
    {
        std::aligned_storage_t<sizeof(T), alignof(T)> mValue;
    };

If instead of goal (1) we sacrifice (2), it may look like this:

    struct ResultBase
    {
        virtual error_type error() const = 0;
    };

    template <class T>
    struct Result : ResultBase
    {
        error_type error() const override { ... }

        union
        {
            error_type mError;
            T mValue;
        };
    };

Again, the rationale is not the point. I just want to make sure the original sample is conforming C++11 code.

5 answers

Here is my own attempt at an answer focusing solely on portability.

Standard layout is defined in §9 [class] / 7:

A standard-layout class is a class that:

  • has no non-static data members of type non-standard-layout class (or array of such types) or reference,
  • has no virtual functions (10.3) and no virtual base classes (10.1),
  • has the same access control (Clause 11) for all non-static data members,
  • has no non-standard-layout base classes,
  • either has no non-static data members in the most derived class and at most one base class with non-static data members, or has no base classes with non-static data members, and
  • has no base classes of the same type as the first non-static data member.

By this definition, Result<T> is a standard-layout class provided that:

  • Both error_type and T are standard-layout types. Note that this is not guaranteed for std::exception_ptr, although it is probably true in practice.
  • T is not ResultBase.
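The conditions quoted above can be checked at compile time with std::is_standard_layout. The following is only a sketch: EmptyBase, Derived, BothHaveData and MixedAccess are hypothetical types invented for illustration, not part of the question.

```cpp
#include <type_traits>

// Hypothetical types, for illustration only.
struct EmptyBase {};                             // no data members at all

struct Derived : EmptyBase {                     // data only in the derived class
    int first;
};

struct BothHaveData : Derived {                  // data in a base and in the derived class
    int second;
};

struct MixedAccess {                             // members with different access control
    int visible;
private:
    int hidden;
};

static_assert(std::is_standard_layout<EmptyBase>::value, "empty class qualifies");
static_assert(std::is_standard_layout<Derived>::value, "one class in the hierarchy holds data");
static_assert(!std::is_standard_layout<BothHaveData>::value, "data in base and derived");
static_assert(!std::is_standard_layout<MixedAccess>::value, "mixed access control");
```

Derived has the same shape as Result<T> in the question: an empty base plus data members declared only in the derived class.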

§9.2 [class.mem] / 20 states that:

A pointer to a standard-layout struct object, suitably converted using a reinterpret_cast, points to its initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa. [Note: There might therefore be unnamed padding within a standard-layout struct object, but not at its beginning, as necessary to achieve appropriate alignment. -end note]

This means that the empty base class optimization is mandatory for standard-layout types. Assuming Result<T> is standard-layout, this in ResultBase is guaranteed to point to the first field of Result<T>.
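A minimal sketch of that guarantee, using hypothetical types Base and Holder rather than the Result<T> from the question. The conversion from the whole object to its initial member is what the quoted paragraph covers. The second check, that the empty base subobject shares the same address, is an assumption about mainstream compilers that this sketch merely observes.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical types for illustration only.
struct Base {};

struct Holder : Base {
    int first;
    int second;
};

static_assert(std::is_standard_layout<Holder>::value, "Holder must be standard-layout");
static_assert(offsetof(Holder, first) == 0, "the initial member sits at offset zero");

// A pointer to a standard-layout object, converted with reinterpret_cast,
// points to its initial member.
int read_initial_member(const Holder& h) {
    return *reinterpret_cast<const int*>(&h);
}

// Reports whether the empty base subobject has the same address as the
// whole object (expected on mainstream compilers).
bool base_shares_address(const Holder& h) {
    const void* whole = &h;
    const void* base = static_cast<const Base*>(&h);
    return whole == base;
}
```

Setting h.first to some value and calling read_initial_member(h) should return that value. Whether the same cast is valid when it starts from a Base pointer, as in the question, is exactly what this answer argues for.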

§9.5 [class.union] / 1:

In a union, at most one of the non-static data members can be active at any time, that is, the value of at most one of the non-static data members can be stored in a union at any time. [...] Each non-static data member is allocated as if it were the sole member of a struct.
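To illustrate the "at most one active member" rule, here is a sketch of a tagged union that switches the active member by destroying the old one and constructing the new one with placement new. ErrorOr and its member names are hypothetical, and the sketch assumes that T's copy constructor does not throw.

```cpp
#include <cassert>
#include <exception>
#include <new>
#include <stdexcept>
#include <string>

using error_type = std::exception_ptr;

// Hypothetical tagged union (not the asker's Result<T>): a flag records
// which union member is currently active.
// Assumption: T's copy constructor does not throw.
template <class T>
class ErrorOr {
public:
    ErrorOr() : mHasValue(false) { new (&mError) error_type(); }
    ~ErrorOr() { destroyActive(); }
    ErrorOr(const ErrorOr&) = delete;
    ErrorOr& operator=(const ErrorOr&) = delete;

    void setError(error_type error) {
        destroyActive();                    // end the lifetime of the active member
        new (&mError) error_type(error);    // make mError the active member
        mHasValue = false;
    }

    void setValue(const T& value) {
        destroyActive();
        new (&mValue) T(value);             // make mValue the active member
        mHasValue = true;
    }

    bool hasValue() const { return mHasValue; }

    // Returns a null exception_ptr while a value is stored.
    error_type error() const {
        if (mHasValue) return error_type();
        return mError;
    }

    const T& value() const { return mValue; }   // precondition: hasValue()

private:
    void destroyActive() {
        if (mHasValue) mValue.~T();
        else mError.~error_type();
    }

    bool mHasValue;
    union {
        error_type mError;
        T mValue;
    };
};
```

The flag costs extra storage compared with the question's Result<T>, which leaves tracking the active member to the caller.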

And additionally §3.10 [basic.lval] / 10:

If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:

  • the dynamic type of the object,
  • a cv-qualified version of the dynamic type of the object,
  • a type similar (as defined in 4.4) to the dynamic type of the object,
  • a type that is the signed or unsigned type corresponding to the dynamic type of the object,
  • a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,
  • an aggregate or union type that includes one of the aforementioned types among its elements or non-static data members (including, recursively, an element or non-static data member of a subaggregate or contained union),
  • a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,
  • a char or unsigned char type.

This ensures that reinterpret_cast<const error_type*>(this) will result in a valid pointer to the mError field.
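As a side illustration of the aliasing list above, the sketch below shows one access that the list permits (inspecting an object through unsigned char) and the usual way to avoid one that it does not permit (reading a float through an integer glvalue). The function names are hypothetical, and the second function assumes a 32-bit float.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Permitted by the last bullet: any object may be inspected through
// unsigned char.
template <class T>
std::size_t count_nonzero_bytes(const T& object) {
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&object);
    std::size_t count = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        if (bytes[i] != 0) ++count;
    }
    return count;
}

// std::uint32_t is not on the list for an object whose dynamic type is
// float, so the bytes are copied instead of dereferencing a cast pointer.
std::uint32_t bits_of(float f) {
    static_assert(sizeof(std::uint32_t) == sizeof(float), "sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

In the question, the glvalue type is error_type and the accessed object is mError, so the access falls under the first bullet as long as mError is the active member.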

Controversy aside, this technique looks portable. Just keep the formal restrictions in mind: error_type and T must be standard-layout types, and T must not be ResultBase.

Side note: on most compilers (at least GCC, Clang and MSVC), non-standard-layout types will work as well. As long as Result<T> has a predictable layout, the error and result types do not matter.


To answer the question: Is it portable?

No, it is not even possible.


Details:

This is impossible without type erasure (which does not require RTTI / dynamic_cast, but does require at least one virtual function). Working type-erasure solutions already exist (Boost.Any).

The reason is as follows:

  • You want to instantiate a class

    Result<int> r;

Instantiating the class template means that the compiler works out the size of the member variables so that it can allocate the object on the stack.

However, in your implementation:

    private:
        union
        {
            error_type mError;
            T mValue;
        };

You have an error_type variable that you want to use polymorphically. However, once the type is fixed at template instantiation you cannot change it later (another type may have a different size! You could force a common size on the objects yourself, but do not do that: it is ugly and hacky).

So you have two options: use virtual functions or use error codes.

You may be able to do what you want, but you cannot do this:

    Result<int> r;
    r.setError(...);

with the exact interface you want.

There are many possible solutions if you allow virtual functions or error codes, so why do you not want virtual functions here? If performance matters, remember that the cost of "setting" the error is just the cost of setting a pointer to a virtual class (if there is no error, the vtable is never consulted, and in any case a vtable in a template will most likely be optimized away).

Also, if you do not want to allocate the error objects on demand, you can preallocate them.

You can do the following:

    template <typename Rtype>
    class Result
    {
        // ... your details here

        ~Result()
        {
            if (error)
                delete resultOrError.errorInstance;
            else
                delete resultOrError.resultValue;
        }

    private:
        union
        {
            bool error;                 // true when errorInstance is the active member
            std::max_align_t mAligner;  // forces maximum alignment
        };

        union uif
        {
            Rtype* resultValue;
            PointerToVirtualErrorHandler errorInstance;
        } resultOrError;
    };

Here you hold either one pointer to the result or one pointer to a virtual class that carries the error. You check the boolean to see whether you currently have an error or a result, and then read the corresponding member of the union. The virtual call cost is paid only when there is an error; for a regular result the only penalty is the boolean check.

Of course, the solution above uses a pointer to the result because that works for any result type. If you only care about results of a basic data type, or a POD struct made of basic data types, you can store the result directly and avoid the pointer.

Note that in your case std::exception_ptr already performs type erasure, but you lose the type information. To recover it you could implement something like std::exception_ptr, but with enough virtual methods to allow a safe cast to the correct exception type.


A common mistake among C++ programmers is the belief that virtual functions lead to higher CPU and memory usage. I call it a mistake even though I know that virtual functions do cost some memory and CPU time, because handwritten replacements for the virtual function mechanism are in many cases much worse.

You have already said how to achieve the goal using virtual functions. Just to repeat:

    class ResultBase
    {
    public:
        virtual ~ResultBase() {}

        virtual bool hasError() const = 0;
        virtual std::exception_ptr error() const = 0;

    protected:
        ResultBase() {}
    };

And its implementation:

    template <class T>
    class Result : public ResultBase
    {
    public:
        Result(error_type error) { this->construct(error); }
        Result(T value) { this->construct(value); }

        ~Result(); // this does not change

        bool hasError() const override { return mHasError; }
        std::exception_ptr error() const override { return mData.mError; }

        void setError(error_type error); // similar to your original approach
        void setValue(T value);          // similar to your original approach

    private:
        bool mHasError;

        union Data
        {
            Data() {}  // this way you can also use non-POD types
            ~Data() {}

            error_type mError;
            T mValue;
        } mData;

        void construct(error_type error)
        {
            mHasError = true;
            new (&mData.mError) error_type(error);
        }

        void construct(T value)
        {
            mHasError = false;
            new (&mData.mValue) T(value);
        }
    };

Check out the full example here. As you can see, the version with virtual functions is 3 times smaller and 7 (!) times faster, so it is not so bad...

Another advantage is that you get a "cleaner" design and no problems with aliasing or alignment.

If you really have a reason called compactness (I have no idea what it is), then for this very simple example you can implement virtual functions manually (but why?!). Here you are:

    class ResultBase;

    struct ResultBaseVtable
    {
        bool (*hasError)(const ResultBase&);
        error_type (*error)(const ResultBase&);
    };

    class ResultBase
    {
    public:
        bool hasError() const { return vtable->hasError(*this); }
        std::exception_ptr error() const { return vtable->error(*this); }

    protected:
        ResultBase(ResultBaseVtable* vtable) : vtable(vtable) {}

    private:
        ResultBaseVtable* vtable;
    };

And the implementation is identical to the previous version, with the differences below:

    template <class T>
    class Result : public ResultBase
    {
    public:
        Result(error_type error) : ResultBase(&Result<T>::vtable)
        {
            this->construct(error);
        }

        Result(T value) : ResultBase(&Result<T>::vtable)
        {
            this->construct(value);
        }

    private:
        static bool hasErrorVTable(const ResultBase& result)
        {
            return static_cast<const Result&>(result).hasError();
        }

        static error_type errorVTable(const ResultBase& result)
        {
            return static_cast<const Result&>(result).error();
        }

        static ResultBaseVtable vtable;
    };

    template <typename T>
    ResultBaseVtable Result<T>::vtable{
        &Result<T>::hasErrorVTable,
        &Result<T>::errorVTable,
    };

The version above is identical in CPU and memory usage to the "virtual" implementation (surprise)...


Use an abstract base class with two implementations, one for errors and one for data, both with multiple inheritance, and use RTTI or an is_valid() member to tell them apart at runtime.

    union
    {
        error_type mError;
        T mValue;
    };

Type T is not guaranteed to work inside a union; for example, it may have a non-trivial constructor. For some background on unions and constructors, see: Initializing a union with a non-trivial constructor
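A small sketch of what changes when a union member has non-trivial special members, assuming a C++11 compiler with unrestricted unions. The union names Plain, Unmanaged and Managed and the function round_trip are hypothetical. Such a member is allowed, but the union's own constructor and destructor become implicitly deleted unless you provide them, and the member's lifetime must be managed by hand.

```cpp
#include <cassert>
#include <new>
#include <string>
#include <type_traits>

// Only trivial members: the union keeps its implicit special members.
union Plain {
    int number;
    double real;
};

// std::string has non-trivial special members, so this union's default
// constructor and destructor are implicitly deleted.
union Unmanaged {
    int number;
    std::string text;
};

// Providing both makes the union usable again; neither touches any member.
union Managed {
    Managed() {}
    ~Managed() {}
    int number;
    std::string text;
};

static_assert(std::is_default_constructible<Plain>::value, "");
static_assert(!std::is_default_constructible<Unmanaged>::value, "");
static_assert(!std::is_destructible<Unmanaged>::value, "");
static_assert(std::is_default_constructible<Managed>::value, "");

// Constructs the string member in place, copies it out, and destroys it by hand.
std::string round_trip(const std::string& input) {
    using Text = std::string;
    Managed storage;
    new (&storage.text) Text(input);   // start the lifetime of the member
    Text copy = storage.text;
    storage.text.~Text();              // end it explicitly; ~Managed() will not
    return copy;
}
```

The Result<T> in the question uses an anonymous union inside a class with a user-provided constructor and destructor, which is the same idea applied one level up.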
