Unions and strict aliasing in C11

Suppose I have a union like this:

    union buffer {
        struct { T* data; int count; int capacity; };
        struct { void* data; int count; int capacity; } __type_erased;
    };

Under the C11 aliasing rules, will I run into problems if I mix reads and writes through the anonymous struct members and the __type_erased members?

In particular, I am interested in the behavior that occurs if they are accessed independently (for example, using different pointers). To illustrate:

    grow_buffer(&buffer.__type_erased);
    buffer.data[buffer.count] = ...

I have read all the related questions I could find, but I still don't understand this fully: some people say this behavior is undefined, while others say it is legal. In addition, the information I find mixes the rules of C++, C99, C11 and so on, which makes it hard to digest. Here I am explicitly interested in the behavior mandated by C11 and exhibited by popular compilers (Clang, GCC).

Edit: additional information

I have now run some experiments with several compilers and want to share my findings in case someone runs into a similar issue. The background of my question is that I was trying to write a user-friendly, high-performance generic dynamic array implementation in plain C. The idea is that ordinary array operations are carried out by macros, while heavy-duty operations (such as growing the array) are performed by a function working on an aliased, type-erased struct. For example, I might have a macro like this:

    #define ALLOC_ONE(A) \
        (_array_ensure_size(&A.__type_erased, A.count+1), A.count++)

which grows the array if necessary and returns the index of the newly allocated item. The specification (6.5.2.3) states that accessing the same location through different union members is permitted. My interpretation is that although _array_ensure_size() does not know the union type, the compiler should know that the __type_erased member could potentially be mutated by a side effect. That is, I would expect this to work. However, it seems to be a gray area (and, frankly, the specification is not clear on what constitutes a member access).

Apple's latest Clang (clang-800.0.33.1) has no problems with it: the code compiles without warnings and runs as expected. When compiled with GCC 5.3.0, however, the code crashes with a segfault. In fact, I strongly suspect that GCC's behavior is a bug. I tried to make the mutation of the union member explicit by removing the mutable pointer argument and adopting a clear functional style, for example:

    #define ALLOC_ONE(A) \
        (A.__type_erased = _array_ensure_size(A.__type_erased, A.count+1), \
         A.count++)

Again, this works with Clang as expected but still crashes with GCC. My conclusion is that advanced type manipulation with unions is a gray area where one has to tread carefully.
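One way to sidestep the gray area altogether is to avoid having two views of the same members. The following is a minimal sketch, not the code from the question: the helper `array_ensure_size`, the struct `int_buffer` and the function `int_buffer_push` are hypothetical names. The type-erased helper receives the data pointer by value as `void *` and the capacity as a plain `int *`, so every object is only ever accessed through its declared type. Overflow of the capacity for very large arrays is not handled here.

```c
#include <stdlib.h>

/* Hypothetical helper: grows a type-erased allocation so that it can hold
   at least `needed` elements of `elem_size` bytes each. Returns the
   (possibly moved) block, or NULL on failure, in which case the old block
   and *capacity are left untouched. */
static void *array_ensure_size(void *data, int *capacity,
                               int needed, size_t elem_size)
{
    if (needed <= *capacity)
        return data;
    int new_cap = *capacity ? *capacity : 8;
    while (new_cap < needed)
        new_cap *= 2;
    void *p = realloc(data, (size_t)new_cap * elem_size);
    if (p != NULL)
        *capacity = new_cap;
    return p;
}

/* Hypothetical typed buffer: no union and no second view of the members. */
struct int_buffer { int *data; int count; int capacity; };

/* Appends one element; returns its index, or -1 if allocation failed. */
static int int_buffer_push(struct int_buffer *b, int value)
{
    void *p = array_ensure_size(b->data, &b->capacity,
                                b->count + 1, sizeof *b->data);
    if (p == NULL)
        return -1;
    b->data = p;               /* void * converts implicitly back to int * */
    b->data[b->count] = value;
    return b->count++;
}
```

A macro-based front end could wrap the same helper call for any element type, since only the element size differs between instantiations.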

language-lawyer strict-aliasing c11 unions
Jul 19 '16 at 7:36
1 answer

The C11 standard states the following:

6.5.2.3 Structure and union members

95) If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called "type punning"). This might be a trap representation.

So, as far as reading and writing union members is concerned, this is valid in C11. However, strict aliasing is a type-based analysis, so a naive implementation may conclude that the read and the write are independent. As far as I understand, modern GCC can detect cases that involve union members and avoids such errors.
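As an illustration of the footnote, here is a minimal sketch of type punning where both accesses go directly through the union object. The names `float_bits` and `bits_of` are hypothetical, and the sketch assumes that `float` is a 32-bit IEEE 754 binary32 value, which the C standard does not guarantee.

```c
#include <stdint.h>

/* Assumption: float and uint32_t have the same size on this platform. */
_Static_assert(sizeof(float) == sizeof(uint32_t),
               "float must be 32 bits wide");

/* Hypothetical union with two views of the same four bytes. */
union float_bits { float f; uint32_t u; };

/* Returns the object representation of x reinterpreted as an integer. */
static uint32_t bits_of(float x)
{
    union float_bits pun;
    pun.f = x;      /* store through one member ...                */
    return pun.u;   /* ... and read back through the other member. */
}
```

Because the store and the load both name the union object, the compiler can see that they refer to the same storage, which is the situation the footnote describes.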

Also, keep in mind that some uses of pointers to union members are not valid. The standard gives this example:

The following is not a valid fragment (because the union type is not visible within function f):

    struct t1 { int m; };
    struct t2 { int m; };
    int f(struct t1 *p1, struct t2 *p2)
    {
        if (p1->m < 0)
            p2->m = -p2->m;
        return p1->m;
    }
    int g()
    {
        union {
            struct t1 s1;
            struct t2 s2;
        } u;
        /* ... */
        return f(&u.s1, &u.s2);
    }
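For contrast, the fragment can be reshaped so that every access goes through the union type. This is only a sketch under assumptions: the union tag `t12` and the `start` parameter are hypothetical additions, introduced so that the union type is visible inside `f` and the example has a defined input.

```c
struct t1 { int m; };
struct t2 { int m; };

/* Hypothetical named union, so that f can see the union type. */
union t12 { struct t1 s1; struct t2 s2; };

/* Every access names the union object, so the compiler knows that
   s1.m and s2.m refer to the same storage. */
static int f(union t12 *p)
{
    if (p->s1.m < 0)
        p->s2.m = -p->s2.m;
    return p->s1.m;
}

static int g(int start)
{
    union t12 u;
    u.s1.m = start;
    return f(&u);
}
```

With this shape, `g(-5)` negates the stored value through `s2` and reads it back through `s1`, so the result is 5.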

In my opinion, using unions to read and write through different members is dangerous, and it is better to avoid it.

Jul 19 '16 at 15:13