Is full initialization of a subobject followed by partial initialization undefined behavior?

Consider the following structure initialization:

    #include <stdio.h>

    struct bar {
        int b;
        int a;
        int r;
    };

    struct foo {
        struct bar bar;
    };

    int main(int argc, char **argv) {
        struct bar b = {1, 2, 3};
        struct foo f = {.bar = b, .bar.a = 5 };

        // should this print "1, 5, 3", "1, 5, 0", or "0, 5, 0"?
        // clang on Mac prints "1, 5, 3", while gcc on Ubuntu prints "0, 5, 0"
        printf("%d, %d, %d\n", f.bar.b, f.bar.a, f.bar.r);

        return 0;
    }

The C11 standard seems to do a pretty poor job of describing what kind of behavior should be expected here in section 6.7.9, but it seems to believe that it does a reasonable job since I don't see any warnings about undefined behavior in this case.

In practice, it seems that the behavior is either not standardized, or the standard is violated by at least one common compiler, with clang / llvm 8.0.0 on a Mac producing "1, 5, 3" and gcc 5.4 on Ubuntu producing "0, 5, 0" .

According to the C standard, should f.bar.b and f.bar.r be well-defined at this point, or does this initialization lead to undefined or unspecified behavior?

+7
4 answers

The C11 standard seems to do a pretty poor job of describing what behavior to expect here in section 6.7.9.

The standard can be hard to read, but I do not think this area of the standard is any worse in that regard than would be expected.

but it seems to believe that it does a reasonable job, since I don't see any warnings about undefined behavior in this case.

The standard does not need to explicitly declare behavior undefined. In fact, it contains a blanket statement that wherever it does not define the behavior for a given piece of code, that behavior is undefined. However, I believe section 6.7.9 covers this area pretty well. The main open area is:

The evaluations of the initialization list expressions are indeterminately sequenced with respect to one another and thus the order in which any side effects occur is unspecified.

(C2011, 6.7.9/23)

This is not a problem for your example.

In practice, it seems that the behavior is either not standardized or the standard is violated by at least one common compiler, with clang / llvm on Mac producing "1, 5, 3" and gcc on Ubuntu, producing "0, 5, 0".

I am fully prepared to believe that one or the other of them is non-conforming in this area. However, pay attention also to the compiler versions and compilation options: they may be compiling for different versions of the standard, with or without extensions.

According to the C standard, should f.bar.b and f.bar.r be clearly defined at this point, or does this initialization lead to undefined or unspecified behavior?

If the declaration of an object has an associated initializer, then the whole object is initialized, and the resulting initial value is well defined by the standard, subject to the caveats arising from 6.7.9/23. As for the initial values required of a conforming implementation in your example, the key provisions are:

The initialization shall occur in initializer list order, each initializer provided for a particular subobject overriding any previously listed initializer for the same subobject; all subobjects that are not initialized explicitly shall be initialized implicitly the same as objects that have static storage duration.

(C2011, 6.7.9/19)

Each designator list begins its description with the current object associated with the closest surrounding brace pair. Each item in the designator list (in order) specifies a particular member of its current object and changes the current object for the next designator (if any) to be that member. The current object that results at the end of the designator list is the subobject to be initialized by the following initializer.

(C2011, 6.7.9/18)

If the aggregate or union contains elements or members that are aggregates or unions, these rules apply recursively to the subaggregates or contained unions.

(C2011, 6.7.9/20)

So, given the initializer for f

  struct foo f = {.bar = b, .bar.a = 5 }; 

we first process the .bar = b element, as required by 6.7.9/19. It contains a designator list designating f.bar, of type struct bar, as the object to be initialized by the following initializer. That initializer takes the "single expression that has compatible structure or union type" option of 6.7.9/13, so the initial value of f.bar is the value of b, subject to partial or complete override by subsequent initializers.

Then we process the second element, .bar.a = 5. This initializes f.bar.a, and only that subobject, per 6.7.9/18, overriding the initialization specified by the previous initializer, per 6.7.9/19.

The result on a conforming implementation is therefore that the program prints

 1, 5, 3 

GCC appears to get this wrong by reinitializing all of f.bar when it processes the second initializer, not just f.bar.a.

+2

The standard says ...

I'm going to quote from §6.7.9 Initialization of ISO/IEC 9899:2011 (the C11 standard), the same section that Vlad from Moscow quotes in his answer:

¶16 Otherwise, the initializer for an object that has aggregate or union type shall be a brace-enclosed list of initializers for the elements or named members.

¶17 Each brace-enclosed initializer list has an associated current object. When no designations are present, subobjects of the current object are initialized in order according to the type of the current object: array elements in increasing subscript order, structure members in declaration order, and the first named member of a union. 148) In contrast, a designation causes the following initializer to begin initialization of the subobject described by the designator. Initialization then continues forward in order, beginning with the next subobject after that described by the designator. 149)

¶18 Each designator list begins its description with the current object associated with the closest surrounding brace pair. Each item in the designator list (in order) specifies a particular member of its current object and changes the current object for the next designator (if any) to be that member. 150) The current object that results at the end of the designator list is the subobject to be initialized by the following initializer.

¶19 The initialization shall occur in initializer list order, each initializer provided for a particular subobject overriding any previously listed initializer for the same subobject; 151) all subobjects that are not initialized explicitly shall be initialized implicitly the same as objects that have static storage duration.

¶20 If the aggregate or union contains elements or members that are aggregates or unions, these rules apply recursively to the subaggregates or contained unions. If the initializer of a subaggregate or contained union begins with a left brace, the initializers enclosed by that brace and its matching right brace initialize the elements or members of the subaggregate or the contained union. Otherwise, only enough initializers from the list are taken to account for the elements or members of the subaggregate or the first member of the contained union; any remaining initializers are left to initialize the next element or member of the aggregate of which the current subaggregate or contained union is a part.

¶21 If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, or fewer characters in a string literal used to initialize an array of known size than there are elements in the array, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration.

148) If the initializer list for a subaggregate or contained union does not begin with a left brace, its subobjects are initialized as usual, but the subaggregate or contained union does not become the current object: current objects are associated only with brace-enclosed initializer lists.

149) After a union member is initialized, the next object is not the next member of the union; instead, it is the next subobject of an object containing the union.

150) Thus, a designator can only specify a strict subobject of the aggregate or union that is associated with the surrounding brace pair. Note, too, that each separate designator list is independent.

151) Any initializer for the subobject which is overridden and so not used to initialize that subobject might not be evaluated at all.


Interpretation

I think your code is well-formed, that GCC handles it incorrectly, and that Clang handles it correctly.

With the code modified only so that the unused argc and argv are replaced with void, working on a Mac running macOS Sierra 10.12.1, and compiling with GCC 6.2.0 and with Apple's clang, which identifies itself as "Apple LLVM version 8.0.0 (clang-800.0.42.1)", I get the same results as you:

  • 0, 5, 0 from GCC.
  • 1, 5, 3 from Clang.

Key wording in the standard:

In contrast, a designation causes the following initializer to begin initialization of the subobject described by the designator.

In your initializer, you have:

  struct foo f = { .bar = b, .bar.a = 5 }; 

The first part of the initializer, .bar = b, explicitly initializes the subobject bar. At this point .bar.b is 1, .bar.a is 2, and .bar.r is 3. If you omit the , .bar.a = 5 part of the initializer, the compilers agree.

When you include the , .bar.a = 5 part, the designation causes the following initializer to begin initialization of the subobject described by the designator. The designation is .bar.a, so the initializer 5 initializes .bar.a. The compilers agree on this; both set .bar.a to 5. But the subobject designated by .bar was previously initialized, so the initializer for .bar.a should only affect the .a member; it should not alter any other member.

If the initializer is extended with , 19, then the 19 has no designation, but it initializes the subobject after the previous designation, which is .bar.r. Both compilers agree on this.

This test code, a minor variant of your code, illustrates the point:

    #include <stdio.h>

    struct bar { int b; int a; int r; };
    struct foo { struct bar bar; };

    static inline void foobar(struct foo f)
    {
        printf("%d, %d, %d\n", f.bar.b, f.bar.a, f.bar.r);
    }

    int main(void)
    {
        struct bar b = {1, 2, 3};
        struct foo f0 = {.bar = b, .bar.a = 5 };
        struct foo f1 = {.bar = b, .bar.a = 5, 19 };
        struct foo f2 = {.bar = b };

        foobar(f0);
        foobar(f1);
        foobar(f2);

        return 0;
    }

Output from GCC:

    0, 5, 0
    0, 5, 19
    1, 2, 3

Output from Clang:

    1, 5, 3
    1, 5, 19
    1, 2, 3

Note that even without any special warning options requested, clang flags this code:

    $ clang -O3 -g -std=c11 so-4092-0714.c -o so-4092-0714
    so-4092-0714.c:21:36: warning: subobject initialization overrides initialization of other fields within its enclosing subobject [-Winitializer-overrides]
        struct foo f0 = {.bar = b, .bar.a = 5 };
                                       ^~~~~~
    so-4092-0714.c:21:29: note: previous initialization is here
        struct foo f0 = {.bar = b, .bar.a = 5 };
                                ^
    so-4092-0714.c:22:36: warning: subobject initialization overrides initialization of other fields within its enclosing subobject [-Winitializer-overrides]
        struct foo f1 = {.bar = b, .bar.a = 5, 19 };
                                       ^~~~~~
    so-4092-0714.c:22:29: note: previous initialization is here
        struct foo f1 = {.bar = b, .bar.a = 5, 19 };
                                ^
    2 warnings generated.
    $

As I said, I think Clang is initializing these structures correctly, even if it complains more than necessary while doing so.

+3

The C standard says (6.7.9 Initialization)

17 Each brace-enclosed initializer list has an associated current object. When no designations are present, subobjects of the current object are initialized in order according to the type of the current object: array elements in increasing subscript order, structure members in declaration order, and the first named member of a union. 148) In contrast, a designation causes the following initializer to begin initialization of the subobject described by the designator. Initialization then continues forward in order, beginning with the next subobject after that described by the designator.

and

19 The initialization shall occur in initializer list order, each initializer provided for a particular subobject overriding any previously listed initializer for the same subobject; 151) all subobjects that are not initialized explicitly shall be initialized implicitly the same as objects that have static storage duration.

This footnote is important.

148) If the initializer list for a subaggregate or contained union does not begin with a left brace, its subobjects are initialized as usual, but the subaggregate or contained union does not become the current object: current objects are associated only with brace-enclosed initializer lists.

Thus, I see neither undefined nor unspecified behavior.

In my opinion, the result should look like { 1, 5, 3 } .

Setting the Standard aside, the reasonable approach is to first initialize the memory with the default initialization and then overwrite it with the explicit initializers.

+2

This behavior is not undefined.

From section 6.7.9 of the C standard:

19 The initialization shall occur in initializer list order, each initializer provided for a particular subobject overriding any previously listed initializer for the same subobject; all subobjects that are not initialized explicitly shall be initialized implicitly the same as objects that have static storage duration.

So, when there is a conflict between the designated initializers, the last one takes precedence.

In your example, you initialize .bar, then .bar.a. Both of these initialize .bar, so the second one is used. So .bar is initialized along with its subfield .bar.a, but not .bar.b or .bar.r. And since some fields are initialized, but not all, the rest are initialized to 0:

21 If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, or fewer characters in a string literal used to initialize an array of known size than there are elements in the array, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration.

This means that the correct behavior is to output "0, 5, 0". So gcc conforms, but the Mac compiler does not.

0