Are C structs with the same member types guaranteed to have the same layout in memory?

Essentially, if I have

    typedef struct { int x; int y; } A;
    typedef struct { int h; int k; } B;

and I have A a, does the C standard guarantee that ((B*)&a)->k is the same as a.y?

+15
c casting struct
Nov 06 '13 at 5:17
3 answers

Are C structs with the same member types guaranteed to have the same layout in memory?

Almost, yes. Close enough for me.

From n1516, section 6.5.2.3, paragraph 6:

... if a union contains several structures that share a common initial sequence ..., and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the completed type of the union is visible. Two structures share a common initial sequence if corresponding members have compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.

This means that if you have the following code:

    struct a { int x; int y; };
    struct b { int h; int k; };
    union { struct a a; struct b b; } u;

If you assign to u.a, the standard says that you can read the corresponding values from u.b. It stretches the bounds of plausibility to suggest that struct a and struct b could have different layouts, given this requirement. Such a system would be pathological in the extreme.
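As a minimal sketch of that guarantee (the union tag `ab` and the helper name `read_k_through_b` are made up for illustration, not taken from the standard), the following writes through one member and inspects the common initial sequence through the other. Both accesses go through the union object itself, and the completed union type is visible at file scope:

```c
#include <assert.h>

struct a { int x; int y; };
struct b { int h; int k; };

/* Completed union type declared at file scope, so it is visible
   everywhere the members are inspected. */
union ab { struct a a; struct b b; };

/* Hypothetical helper: store through member a, then inspect the
   common initial sequence through member b. */
int read_k_through_b(int x, int y)
{
    union ab u;
    u.a.x = x;
    u.a.y = y;
    assert(u.b.h == x);   /* first member of the common initial sequence */
    return u.b.k;         /* second member, corresponds to u.a.y */
}
```

Calling `read_k_through_b(1, 2)` should give back the value stored in `u.a.y`.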

Remember that the standard also guarantees that:

  • Structures are never trap representations.

  • Addresses of fields in a structure increase (a.x is always before a.y).

  • The offset of the first field is always zero.
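The offset guarantees can be checked at compile time. This is only a sketch: the first two assertions restate what the standard promises, while the last two merely confirm that the implementation compiling the file lays out both structures the same way, which is an assumption rather than a standalone guarantee. The helper `offset_of_second_member` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct a { int x; int y; };
struct b { int h; int k; };

/* Promised by the standard: no padding before the first member, and
   later members are placed at higher addresses. */
static_assert(offsetof(struct a, x) == 0,
              "first member is at offset zero");
static_assert(offsetof(struct a, y) > offsetof(struct a, x),
              "members are laid out in declaration order");

/* Assumption about this implementation only: the build fails if the
   two structures are not laid out identically. */
static_assert(offsetof(struct a, y) == offsetof(struct b, k),
              "second members are expected at the same offset");
static_assert(sizeof(struct a) == sizeof(struct b),
              "both structures are expected to have the same size");

/* Hypothetical helper exposing one of the offsets at run time. */
size_t offset_of_second_member(void)
{
    return offsetof(struct a, y);
}
```

Note that identical offsets still do not make the pointer cast from the question valid; they only say something about layout.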

However, and this is important!

You rephrased the question,

does the C standard guarantee that ((B*)&a)->k is the same as a.y?

No! And it very explicitly states that they are not the same!

    struct a { int x; };
    struct b { int x; };

    int test(int value)
    {
        struct a a;
        a.x = value;
        return ((struct b *) &a)->x;
    }

This is an aliasing violation.
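If the goal is just to get the same values into the other type, one way to stay clear of the aliasing rules is to copy the bytes with memcpy instead of casting the pointer. This is a sketch under the assumption that both structures have the same size and member offsets on the implementation at hand (checked with static_assert); the helper name `a_to_b` is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct { int x; int y; } A;
typedef struct { int h; int k; } B;

/* Assumptions about this implementation, checked at compile time. */
static_assert(sizeof(A) == sizeof(B), "A and B must have the same size");
static_assert(offsetof(A, y) == offsetof(B, k),
              "second members must be at the same offset");

/* Hypothetical helper: copy the bytes of an A into a separate B object.
   Each object is only ever accessed through its own type. */
B a_to_b(const A *a)
{
    B b;
    memcpy(&b, a, sizeof b);
    return b;
}
```

Unlike the cast, this never reads an object of type A through an lvalue of type B.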

+14
Nov 06 '13 at 6:03

Piggybacking on the other replies with a warning about section 6.5.2.3. There seems to be some debate about the exact wording of "anywhere that a declaration of the completed type of the union is visible", and at least GCC does not implement it as written. There are a few tangential C defect reports here and here, with follow-up comments from the committee.

I recently tried to figure out how other compilers (in particular, GCC 4.8.2, ICC 14, and clang 3.4) interpret this using the following code from the standard:

    // Undefined, result could (realistically) be either -1 or 1
    struct t1 { int m; } s1;
    struct t2 { int m; } s2;

    int f(struct t1 *p1, struct t2 *p2)
    {
        if (p1->m < 0)
            p2->m = -p2->m;
        return p1->m;
    }

    int g()
    {
        union {
            struct t1 s1;
            struct t2 s2;
        } u;
        u.s1.m = -1;
        return f(&u.s1, &u.s2);
    }

GCC: -1, clang: -1, ICC: 1 and warns about the aliasing violation

    // Global union declaration, result should be 1 according to a literal reading of 6.5.2.3/6
    struct t1 { int m; } s1;
    struct t2 { int m; } s2;

    union u {
        struct t1 s1;
        struct t2 s2;
    };

    int f(struct t1 *p1, struct t2 *p2)
    {
        if (p1->m < 0)
            p2->m = -p2->m;
        return p1->m;
    }

    int g()
    {
        union u u;
        u.s1.m = -1;
        return f(&u.s1, &u.s2);
    }

GCC: -1, clang: -1, ICC: 1, but warns about the aliasing violation

    // Global union definition, result should be 1 as well.
    struct t1 { int m; } s1;
    struct t2 { int m; } s2;

    union u {
        struct t1 s1;
        struct t2 s2;
    } u;

    int f(struct t1 *p1, struct t2 *p2)
    {
        if (p1->m < 0)
            p2->m = -p2->m;
        return p1->m;
    }

    int g()
    {
        u.s1.m = -1;
        return f(&u.s1, &u.s2);
    }

GCC: -1, clang: -1, ICC: 1, no warning

Of course, without strict aliasing optimizations, all three compilers return the expected result every time. Since clang and gcc do not give distinguishable results in any of the cases, the only real information comes from ICC's lack of a diagnostic in the last one. This also agrees with the example given by the standards committee in the first defect report mentioned above.

In other words, this aspect of C is a real minefield, and you need to be wary that your compiler does the right thing even if you follow the standard to the letter. All the worse, since it is intuitive that such a pair of structs ought to be compatible in memory.

+6
Nov 06 '13 at 8:40

This sort of aliasing specifically requires a union type. C11 §6.5.2.3/6:

One special guarantee is made in order to simplify the use of unions: if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the completed type of the union is visible. Two structures share a common initial sequence if corresponding members have compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.

And the accompanying example:

The following is not a valid fragment (because the union type is not visible within function f):

    struct t1 { int m; };
    struct t2 { int m; };

    int f(struct t1 *p1, struct t2 *p2)
    {
        if (p1->m < 0)
            p2->m = -p2->m;
        return p1->m;
    }

    int g()
    {
        union {
            struct t1 s1;
            struct t2 s2;
        } u;
        /* ... */
        return f(&u.s1, &u.s2);
    }

The requirements appear to be that 1. the object being aliased is stored inside a union and 2. the definition of that union type is in scope.
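For contrast, here is a sketch of a variant that meets both requirements and also sidesteps the "visible declaration" debate: the union type is declared at file scope, and f receives a pointer to the union, so every access goes through the union object. Passing `union u *` instead of two struct pointers is my own change to the standard's example, not something the standard prescribes.

```c
#include <assert.h>

struct t1 { int m; };
struct t2 { int m; };

/* Completed union type, visible to both f and g. */
union u {
    struct t1 s1;
    struct t2 s2;
};

/* Both members are reached through the union object, so the
   common-initial-sequence rule applies to every access. */
int f(union u *p)
{
    if (p->s1.m < 0)
        p->s2.m = -p->s2.m;
    return p->s1.m;
}

int g(void)
{
    union u u;
    u.s1.m = -1;
    assert(u.s2.m == -1);   /* inspecting s2 while s1 is stored */
    return f(&u);
}
```

With this shape, g() evaluates to 1: the negation written through s2 is seen when s1 is read back.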

For what it's worth, the corresponding initial-subsequence relationship in C++ does not require a union. And in general, such a dependence on a union would be extremely pathological behavior for a compiler. If there is some way the existence of a union type could affect a concrete memory model, it is probably better not to try to picture it.

I suppose the intent is that a memory access verifier (think Valgrind on steroids) can check a potential aliasing error against these "strict" rules.

+3
Nov 06 '13 at 5:56


