Type punning a struct in C and C++ via a union

Question


I compiled this with both gcc and g++ using -pedantic, and I don't get a warning from either one:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    struct a { struct a *next; int i; };
    struct b { struct b *next; int i; };

    struct c {
        int x, x2, x3;
        union { struct a a; struct b b; } u;
    };

    void foo(struct b *bar) { bar->next->i = 9; return; }

    int main(int argc, char *argv[])
    {
        struct c c;
        memset(&c, 0, sizeof c);
        c.u.a.next = (struct a *)calloc(1, sizeof(struct a));
        foo(&c.u.b);
        printf("%d\n", c.u.a.next->i);
        return 0;
    }

Is this legal in both C and C++? I have read about type punning, but I don't understand it. Is foo(&c.u.b) any different from foo((struct b *)&c.u.a)? Wouldn't they be exactly the same? This exception for structs in a union (from C89, 3.3.2.3) says:

If a union contains several structures that share a common initial sequence, and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them. Two structures share a common initial sequence if corresponding members have compatible types for a sequence of one or more initial members.

In the union, the first member of struct a is struct a *next, and the first member of struct b is struct b *next. As you can see, a pointer is written through struct a *next, and then in foo a pointer is read through struct b *next. Are they compatible types? They are both pointers to a struct, and pointers to any struct should be the same size, so they should be compatible and the layout should be the same, right? Is it OK to read i from one struct and write to the other? Am I committing any kind of aliasing or type-punning violation?

+11

c++ c language-lawyer strict-aliasing type-punning

loop Feb 14 '15 at 22:52


4 answers

What you could do (and I have been bitten by this before) is declare the initial pointer member of both structs as void * and cast as needed. Since void * is convertible to and from any object pointer type, you would only have to pay an ugliness tax, rather than risk gcc reordering your operations (which I have seen happen, even when using a union, as a result of a compiler bug in some versions). As @TC correctly points out, layout compatibility of types means that they are convertible at the language level; even if two types happen to have the same size, they are not necessarily layout-compatible, which could allow some aggressive compilers to make further assumptions based on this.

+2

Mark Nunberg Feb 15 '15 at 17:04


I had a similar question a while ago, and I think I can answer yours.

Indeed, struct a and struct b are not compatible types, and pointers to them are not compatible either.

Yes, what you are doing is illegal even from the point of view of the outdated C89 standard. However, it may be interesting to note that if you swap the order of the members in struct a and struct b, you will be able to access the int i of the struct c instance (but not its next pointer in any way, i.e. bar->i = 9; instead of bar->next->i = 9;), though only from the point of view of C89.

But even if you swap the order of the members in the two structs, what you are doing will still be illegal under the C99 and C11 standards (as interpreted by the committee). In C99, the part of the standard you quoted was changed to:

One special guarantee is made in order to simplify the use of unions: if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the completed type of the union is visible.

The last phrase is a bit ambiguous, since "visible" can be interpreted in several ways, but according to the committee it means that the inspection must be performed on an object of the union type in question.

So, in your case, the correct way to handle this would be as follows:

    struct a { int i; struct a *next; };
    struct b { int i; struct b *next; };

    union un { struct a a; struct b b; };

    struct c { int x, x2, x3; union un u; };

    /* ... */

    void foo(union un *bar)
    {
        bar->b.next->i = 9; /* This is the "inspection" operation */
        return;
    }

    /* ... */

    foo(&c.u);

This is all well and good from a language-lawyer point of view, but in practice, unless you apply different packing settings to them, structs with the same initial sequence will have the same layout (in 99.9% of cases). In fact, they should have the same layout even in your original setup, since pointers to struct a and struct b should be the same size. So, if your compiler doesn't get nasty when you break strict aliasing, you can more or less safely cast between them or use them in a union the way you do now.

EDIT: as @underscore_d noted in the comments on this answer, since the corresponding clauses in the C++ standards do not contain the phrase "anywhere that a declaration of the completed type of the union is visible", it is possible that the C++ standard takes the same position on this issue as the C89 standard.

+2

Mints97 Feb 15 '15 at 17:37


Yes, that’s fine; the bold part of the quote in your question covers this case.

-1

Lightness Races in Orbit Feb 14 '15 at 23:36


TC · Accepted Answer · 2015-02-15 02:34

In C:

struct a and struct b are not compatible types. Even in

    typedef struct s1 { int x; } t1, *tp1;
    typedef struct s2 { int x; } t2, *tp2;

s1 and s2 are not compatible types. (See the Example in 6.7.8/p5.) An easy way to identify incompatible structs is this: if two struct types are compatible, then something of one type can be assigned to something of the other type. If you would expect the compiler to complain when you try to do that, then they are not compatible types.

Therefore, struct a * and struct b * are also not compatible types, and so struct a and struct b do not share a common initial subsequence. Instead, your union punning is governed by the same rule as union punning in other cases (6.5.2.3, footnote 95):

If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called "type punning"). This might be a trap representation.

In C++, struct a and struct b also do not share a common initial subsequence. [class.mem]/p18 (quoting N4140):

Two standard-layout structs share a common initial sequence if corresponding members have layout-compatible types and either neither member is a bit-field or both are bit-fields with the same width for a sequence of one or more initial members.

[basic.types]/p9:

If two types T1 and T2 are the same type, then T1 and T2 are layout-compatible types. [Note: Layout-compatible enumerations are described in 7.2. Layout-compatible standard-layout structs and standard-layout unions are described in 9.2. -end note]

struct a * and struct b * are neither structs nor unions nor enumerations; therefore, they are layout-compatible only if they are the same type, which they are not.

It is true that ([basic.compound]/p3)

Pointers to cv-qualified and cv-unqualified versions (3.9.3) of layout-compatible types shall have the same value representation and alignment requirements (3.11).

But that does not mean that these pointer types are layout-compatible types, as that term is defined in the standard.
