OO polymorphism in C, aliasing issues?

My colleague and I are trying to achieve a simple polymorphic class hierarchy. We are working on an embedded system and are restricted to using only a C compiler. We have a basic design idea that compiles without warnings (-Wall -Wextra -fstrict-aliasing -pedantic) and runs fine under gcc 4.8.1.

However, we are a little concerned about aliasing issues, as we do not fully understand when aliasing becomes a problem.

To demonstrate, we wrote a toy example with the "interface" IHello and two classes that implement this interface "Cat" and "Dog".

    #include <stdio.h>

    /* -------- IHello -------- */
    struct IHello_;
    typedef struct IHello_
    {
        void (*SayHello)(const struct IHello_* self, const char* greeting);
    } IHello;

    /* Helper function */
    void SayHello(const IHello* self, const char* greeting)
    {
        self->SayHello(self, greeting);
    }

    /* -------- Cat -------- */
    typedef struct Cat_
    {
        IHello hello;
        const char* name;
        int age;
    } Cat;

    void Cat_SayHello(const IHello* self, const char* greeting)
    {
        const Cat* cat = (const Cat*) self;
        printf("%s I am a cat! My name is %s and I am %d years old.\n",
               greeting, cat->name, cat->age);
    }

    Cat Cat_Create(const char* name, const int age)
    {
        static const IHello catHello = { Cat_SayHello };
        Cat cat;

        cat.hello = catHello;
        cat.name = name;
        cat.age = age;

        return cat;
    }

    /* -------- Dog -------- */
    typedef struct Dog_
    {
        IHello hello;
        double weight;
        int age;
        const char* sound;
    } Dog;

    void Dog_SayHello(const IHello* self, const char* greeting)
    {
        const Dog* dog = (const Dog*) self;
        printf("%s I am a dog! I can make this sound: %s I am %d years old and weigh %.1f kg.\n",
               greeting, dog->sound, dog->age, dog->weight);
    }

    Dog Dog_Create(const char* sound, const int age, const double weight)
    {
        static const IHello dogHello = { Dog_SayHello };
        Dog dog;

        dog.hello = dogHello;
        dog.sound = sound;
        dog.age = age;
        dog.weight = weight;

        return dog;
    }

    /* Client code */
    int main(void)
    {
        const Cat cat = Cat_Create("Mittens", 5);
        const Dog dog = Dog_Create("Woof!", 4, 10.3);

        SayHello((IHello*) &cat, "Good day!");
        SayHello((IHello*) &dog, "Hi there!");

        return 0;
    }

Output:

Good day! I am a cat! My name is Mittens and I am 5 years old.

Hi there! I am a dog! I can make this sound: Woof! I am 4 years old and weigh 10.3 kg.

We are fairly sure that the "upcast" from Cat and Dog to IHello is safe, since IHello is the first member of both structs.

Our real concern is the "downcast" from IHello to Cat and Dog, respectively, in the corresponding implementations of the SayHello interface function. Does this cause any strict aliasing problems? Is our code guaranteed to work by the C standard, or are we just lucky that it works with gcc?

Update

The solution that we ultimately decide to use must be standard C and cannot rely on, for example, gcc extensions. The code must compile and run on different processors using various (proprietary) compilers.

The purpose of this "pattern" is that client code receives pointers to IHello and can therefore only call functions in the interface. However, these calls must behave differently depending on which implementation of IHello was received. In short, we want behavior identical to the OOP concept of interfaces and classes that implement them.

We are aware that the code only works if the IHello interface struct is placed as the first member of the structs that implement the interface. This is a limitation that we are willing to accept.

According to: Does accessing the first field of a struct via a C cast violate strict aliasing?

§6.7.2.1/13:

Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared. A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning.

The aliasing rule reads as follows (§6.5/7):

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

  • a type compatible with the effective type of the object,
  • a qualified version of a type compatible with the effective type of the object,
  • a type that is the signed or unsigned type corresponding to the effective type of the object,
  • a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
  • an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
  • a character type.

According to the fifth bullet above, and the fact that structs contain no padding at the beginning, we are fairly sure that "upcasting" a derived struct that implements the interface to a pointer to the interface is safe, i.e.

    Cat cat;
    const IHello* catPtr = (const IHello*) &cat; /* Upcast */

    /* Inside client code */
    void Greet(const IHello* interface, const char* greeting)
    {
        /* Users do not need to know whether interface points to a Cat or Dog. */
        interface->SayHello(interface, greeting); /* Dereferencing should be safe */
    }
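As a side note, the upcast can also be written without any cast expression by taking the address of the embedded interface member. The following is only a minimal sketch under that idea: the IHello and Cat definitions are repeated so that the snippet stands alone, and `Cat_AsHello` is a hypothetical helper name that does not appear in the original code.

```c
#include <assert.h>
#include <stddef.h>

typedef struct IHello_
{
    void (*SayHello)(const struct IHello_* self, const char* greeting);
} IHello;

typedef struct Cat_
{
    IHello hello;       /* interface is the first member */
    const char* name;
    int age;
} Cat;

/* Hypothetical helper: upcast by taking the member's address.
 * No cast is involved, so the compiler checks the types for us. */
static const IHello* Cat_AsHello(const Cat* cat)
{
    assert(cat != NULL);
    return &cat->hello;
}
```

Because hello is the initial member, the pointer returned here has the same address as the Cat object itself, which is what §6.7.2.1/13 describes.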

The big question is whether the "downcast" used when implementing interface functions is safe. As seen above:

    void Cat_SayHello(const IHello* hello, const char* greeting)
    {
        /* Is the following statement safe if we know for
         * a fact that hello points to a Cat?
         * Does it violate strict aliasing rules? */
        const Cat* cat = (const Cat*) hello;
        /* Access internal state in Cat */
    }

Also note that changing the signature of the implementation functions to

    Cat_SayHello(const Cat* cat, const char* greeting);
    Dog_SayHello(const Dog* dog, const char* greeting);

and commenting out the "downcast" obviously also compiles and runs fine. However, it generates a compiler warning for the function signature mismatch.

+5
2 answers

I have been doing objects in C for many years, using exactly the kind of composition that you are doing here. I would recommend that you not do the simple cast you are describing, but to justify that I need an example. Take, for instance, a timer callback mechanism used in a layered implementation:

    typedef struct MSecTimer_struct MSecTimer;
    struct MSecTimer_struct {
        DoubleLinkedListNode m_list;
        void (*m_expiry)(MSecTimer *);
        unsigned int m_ticks;
        unsigned int m_needsClear: 1;
        unsigned int m_user: 7;
    };

When one of these timers expires, the control system calls the m_expiry function and passes a pointer to the object:

    timer->m_expiry(timer);

Then take a base object that does something awesome:

    typedef struct BaseDoer_struct BaseDoer;
    struct BaseDoer_struct {
        DebugID m_id;
        void (*v_beAmazing)(BaseDoer *); //object "virtual" function
    };

    //BaseDoer version of BaseDoer 'virtual' beAmazing function
    void BaseDoer_v_BaseDoer_beAmazing( BaseDoer *self )
    {
        printf("Basically, I'm amazing\n");
    }

My naming system has a purpose here, but that is not really the point. Next we can see the variety of object-oriented function calls that may be needed:

    typedef struct DelayDoer_struct DelayDoer;
    struct DelayDoer_struct {
        BaseDoer m_baseDoer;
        MSecTimer m_delayTimer;
    };

    //DelayDoer version of BaseDoer 'virtual' beAmazing function
    void DelayDoer_v_BaseDoer_beAmazing( BaseDoer *base_self )
    {
        //instead of just casting, have the compiler do something smarter
        DelayDoer *self = GetObjectFromMember(DelayDoer,m_baseDoer,base_self);
        //the timer is a member of the recovered object, so reach it through self
        MSecTimer_start(&self->m_delayTimer,1000); //make them wait for it
    }

    //DelayDoer::DelayTimer version of MSecTimer 'virtual' expiry function
    void DelayDoer_DelayTimer_v_MSecTimer_expiry( MSecTimer *timer_self )
    {
        DelayDoer *self = GetObjectFromMember(DelayDoer,m_delayTimer,timer_self);
        BaseDoer_v_BaseDoer_beAmazing(&self->m_baseDoer);
    }

I have been using the same macro for GetObjectFromMember since about 1990, and somewhere along the line the Linux kernel created the same macro and called it container_of (with the parameters in a different order):

    #define GetObjectFromMember(ObjectType,MemberName,MemberPointer) \
        ((ObjectType *)(((char *)MemberPointer) - ((char *)(&(((ObjectType *)0)->MemberName)))))

which relies on (technically) undefined behavior (dereferencing a NULL object), but is portable to every old (and new) C compiler I have ever tested. The newer version requires the offsetof macro, which is now part of the standard (as of C89, apparently):

    #define container_of(ptr, type, member) ({ \
        const typeof( ((type *)0)->member ) *__mptr = (ptr); \
        (type *)( (char *)__mptr - offsetof(type,member) );})

Of course, I prefer my name, but whatever. Using this method means your code does not rely on the base object being the first member, and it also makes the second use case possible, which I find very useful in practice. All of the compiler aliasing issues are handled inside the macro (by casting through char *, I think, but I am not a standards lawyer).
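Since the question asks for standard C only, note that the kernel-style container_of relies on GNU extensions (statement expressions and typeof). The same idea can be sketched with nothing but offsetof from <stddef.h>. This is a minimal sketch, not the kernel's or the answer author's exact macro: `OBJECT_FROM_MEMBER`, `Timer`, `Doer` and `Doer_FromTimer` are hypothetical names chosen for illustration, and this version gives up the type check on the member pointer that the GNU version performs.

```c
#include <assert.h>
#include <stddef.h>  /* offsetof, NULL */

/* Hypothetical name; same parameter order as GetObjectFromMember.
 * Steps back from the member's address to the start of the object. */
#define OBJECT_FROM_MEMBER(ObjectType, MemberName, MemberPointer) \
    ((ObjectType *)((char *)(MemberPointer) - offsetof(ObjectType, MemberName)))

typedef struct Timer_
{
    unsigned int ticks;
} Timer;

typedef struct Doer_
{
    int id;
    Timer delayTimer;   /* deliberately NOT the first member */
} Doer;

/* Recover the enclosing Doer from a pointer to its embedded timer.
 * Only valid if timer really points at the delayTimer of a Doer. */
static Doer* Doer_FromTimer(Timer* timer)
{
    assert(timer != NULL);
    return OBJECT_FROM_MEMBER(Doer, delayTimer, timer);
}
```

A timer expiry callback would call `Doer_FromTimer(timer_self)` in the same way the answer's code calls GetObjectFromMember, without requiring the timer to be the first member.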

+1

From the section of the standard that you specified:

A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa.

It is definitely safe to convert a pointer to cat->hello into a Cat pointer, and likewise for dog->hello, so the casts in your SayHello functions should be fine.

At the call site, you are doing the opposite: converting a pointer to a struct into a pointer to its first member. That is also guaranteed to work.
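One way to keep the downcast contained is to put it in a single helper and to make the "interface must be first" assumption fail at compile time if someone reorders the struct. The sketch below is an illustration under those assumptions: `Cat_FromHello` and the `Cat_hello_must_be_first` typedef are hypothetical names, and the struct definitions are repeated from the question so the snippet stands alone.

```c
#include <assert.h>
#include <stddef.h>  /* offsetof, NULL */

typedef struct IHello_
{
    void (*SayHello)(const struct IHello_* self, const char* greeting);
} IHello;

typedef struct Cat_
{
    IHello hello;       /* must stay the first member */
    const char* name;
    int age;
} Cat;

/* C89-style compile-time check: the array size becomes -1 (an error)
 * if hello is ever moved away from offset 0. */
typedef char Cat_hello_must_be_first[(offsetof(Cat, hello) == 0) ? 1 : -1];

/* Hypothetical helper: the one place where the downcast happens.
 * Only valid if hello really is the hello member of a Cat object. */
static const Cat* Cat_FromHello(const IHello* hello)
{
    assert(hello != NULL);
    return (const Cat*) hello;
}
```

Cat_SayHello could then start with `const Cat* cat = Cat_FromHello(self);` instead of repeating the cast in every interface function.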

+2