Strict aliasing and overlay inheritance

Question

Consider this code example:

    #include <stdio.h>

    typedef struct A A;
    struct A {
        int x;
        int y;
    };

    typedef struct B B;
    struct B {
        int x;
        int y;
        int z;
    };

    int main()
    {
        B b = {1,2,3};
        A *ap = (A*)&b;

        *ap = (A){100,200}; //a clear http://port70.net/~nsz/c/c11/n1570.html#6.5p7 violation

        ap->x = 10; ap->y = 20; //lvalues of types int and int at the right addresses, ergo correct ?

        printf("%d %d %d\n", b.x, b.y, b.z);
    }

I used to think that something like casting a B * to an A * and using the A * to manipulate the B object was a strict aliasing violation. But then I realized that the standard really only requires that:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types: 1) a type compatible with the effective type of the object, (...)

and expressions such as ap->x do have the correct type and address, and the type of ap shouldn't really matter there (or does it?). This would, in my mind, imply that this kind of overlay inheritance is correct as long as the substructure isn't manipulated as a whole.

Is this interpretation flawed, or ostensibly at odds with what the authors of the standard intended?

+8

c language-lawyer struct strict-aliasing

PSkocik Feb 20 '17 at 19:20

2 answers

When C89 was written, it would have been impractical for a compiler to uphold the Common Initial Sequence (CIS) guarantees for unions without also upholding them for struct pointers. By contrast, specifying the CIS guarantees for struct pointers would not imply that unions exhibit similar behavior if their address is not taken. The CIS guarantees had been applicable to struct pointers since January 1974, before the union keyword was even added to the language, and a lot of code had relied on such behavior for years in circumstances that could not plausibly involve objects of union type. The authors of C89 were also more interested in making the Standard concise than in making it "language-lawyer-proof". I would therefore suggest that C89's specification of the CIS rule in terms of unions rather than struct pointers was almost certainly motivated by a desire to avoid redundancy, not by a desire to give compilers the freedom to go out of their way to break more than 15 years of precedent in applying CIS guarantees to struct pointers.

The authors of C99 recognized that in some cases applying the CIS rule to struct pointers might impair what would otherwise be useful optimizations, and specified that if a pointer to one struct type is used to inspect a CIS member of another, the CIS guarantee does not hold unless a definition of a complete union type containing both structs is in scope. Thus, for your example to be compatible with C99, it would need to contain a definition of a union type containing both of your structs. This rule seems to have been motivated by a desire to let compilers limit application of the CIS to cases where they would have reason to expect that two types might be used in a related fashion, and to let code indicate which types are related without having to add a new language construct for that purpose.

The authors of gcc seem to think that because it would be unusual for code to receive a pointer to one member of a union and then want to access another member, the mere visibility of a complete union type definition should not be enough to force a compiler to uphold the CIS guarantees, even though most uses of the CIS have always revolved around struct pointers rather than unions. Consequently, the authors of gcc refuse to support constructs like yours even in cases where the C99 Standard would require it.

+1

supercat Mar 2 '17 at 0:18

MM · Accepted Answer · 2017-02-20T20:52:00+0000

The line with *ap = is a strict aliasing violation: an object of type B is written using an lvalue expression of type A.

Suppose that line were not present and we moved on to ap->x = 10; ap->y = 20; . In this case, an lvalue of type int is used to write objects of type int.

There is disagreement about whether this is a strict aliasing violation or not. I think the letter of the Standard says that it is not, but others (including the gcc and clang developers) consider ap->x as implying that *ap was accessed. Most agree that the Standard's definition of strict aliasing is too vague and needs improvement.

Sample code using your struct definitions:

    /* Uses struct A and struct B (and <stdio.h>) from the question. */
    void f(A* ap, B* bp)
    {
        ap->x = 213;
        ++bp->x;
        ap->x = 213;
        ++bp->x;
    }

    int main()
    {
        B b = { 0 };
        f( (A *)&b, &b );
        printf("%d\n", b.x);
    }

For me, this outputs 214 at -O2 and 2 at -O3, with gcc. The generated assembly on godbolt for gcc 6.3 was:

    f:
        movl (%rsi), %eax
        movl $213, (%rdi)
        addl $2, %eax
        movl %eax, (%rsi)
        ret

which shows that the compiler has changed this function to:

    int temp = bp->x + 2;
    ap->x = 213;
    bp->x = temp;

and therefore the compiler must be assuming that ap->x cannot alias bp->x .
