Say I have a struct typedef that is:
typedef struct {
int32_t memberOne;
int32_t memberTwo;
} myStruct_t;
I instantiate a const of that type as follows:
const myStruct_t myConst = {.memberTwo = 32};
What does C say the compiler should set memberOne to? I've tried it, of course, and I happen to get 0 on the compilers I've tried. What I'm after is whether the C standard requires uninitialised members of a const struct to be initialised to something by the compiler, i.e. would the code above be considered portable? Clause 6.7.9 of C11 says:
If an object that has static or thread storage duration is not initialized explicitly, then:
— if it has pointer type, it is initialized to a null pointer;
— if it has arithmetic type, it is initialized to (positive or unsigned) zero;
— if it is an aggregate, every member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;
— if it is a union, the first named member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;
...but what about consts? Are they considered to have static storage duration, just without the static keyword?
does the C standard require uninitialised members of a const struct to be initialised to something
Yes, they are guaranteed to be set to zero/null pointers as long as you initialize at least one member explicitly. const plays no part in it.
You've already found the relevant part about how objects with static storage duration are initialized. Just keep reading the same chapter:
C17 6.7.9 §19
The initialization shall occur in initializer list order, each initializer provided for a
particular subobject overriding any previously listed initializer for the same subobject;
all subobjects that are not initialized explicitly shall be initialized implicitly the same as
objects that have static storage duration.
C17 6.7.9 §21
If there are fewer initializers in a brace-enclosed list than there are elements or members of an
aggregate, or fewer characters in a string literal used to initialize an array of known size than there
are elements in the array, the remainder of the aggregate shall be initialized implicitly the same as
objects that have static storage duration.
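A minimal sketch of the rule quoted above, reusing the question's myStruct_t. The helper name read_member_one is hypothetical and not part of the question. The object is a block-scope const with automatic storage duration, yet memberOne is still zero because another member was initialized explicitly.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    int32_t memberOne;
    int32_t memberTwo;
} myStruct_t;

/* Hypothetical helper: returns the member that got no explicit initializer. */
int32_t read_member_one(void)
{
    /* Block scope, so automatic storage duration; const changes nothing. */
    const myStruct_t myConst = {.memberTwo = 32};

    assert(myConst.memberTwo == 32);  /* explicitly initialized */
    return myConst.memberOne;         /* implicitly initialized to 0 */
}
```

Calling read_member_one() should return 0 on any conforming compiler, with or without the const qualifier.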
I've seen code like this:
char str[1024] = {0, };
and suspect that it is similar to doing this:
char str[1024];
str[0] = '\0';
But I couldn't find anything on it so I'm not sure.
What is this (called) and what does it do?
Disclaimer: I'm aware this might have been asked and answered before, but searching for {0, } is astonishingly hard. If you can point out a duplicate, I'll happily delete this question.
No, they are not the same.
This statement
char str[1024] = {0, };
initializes the first element to the given value 0, and all other elements are to be initialized as if they have static storage, in this case, with a value 0. Syntactically this is analogous to using
char str[1024] = {0};
Quoting C11, chapter 6.7.9, p21
If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, or fewer characters in a string literal used to initialize an array of known size than there are elements in the array, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration.
and, from p10 (emphasis mine)
If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate. If an object that has static or thread storage duration is not initialized explicitly, then:
if it has pointer type, it is initialized to a null pointer;
if it has arithmetic type, it is initialized to (positive or unsigned) zero;
if it is an aggregate, every member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;
if it is a union, the first named member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;
On the other hand
char str[1024];
str[0] = '\0';
only assigns the first element; the remaining elements remain uninitialized, containing indeterminate values.
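The difference can be sketched as follows; both helper names are hypothetical. The first function may inspect every element because all of them were initialized. The second deliberately reads only str[0], since the other elements are indeterminate.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts nonzero bytes in a {0, }-initialized array. */
size_t count_nonzero_brace_init(void)
{
    char str[1024] = {0, };      /* first element explicit, the rest implicit */
    size_t nonzero = 0;

    assert(str[0] == 0);         /* the one explicit initializer */
    for (size_t i = 1; i < sizeof str; ++i) {
        if (str[i] != 0) {
            ++nonzero;           /* never reached on a conforming compiler */
        }
    }
    return nonzero;
}

/* Hypothetical helper: the assignment form only gives str[0] a value. */
char first_char_after_assignment(void)
{
    char str[1024];              /* all 1024 elements indeterminate */
    str[0] = '\0';               /* only this element is now defined */
    return str[0];               /* reading str[1] onwards would be a bug */
}
```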
The {0, } initializer is the same as the more common {0} initializer.
The trailing comma is allowed by the syntax but it makes no difference.
6.7.9--Initialization:
initializer:
assignment-expression
{ initializer-list }
{ initializer-list , }
initializer-list:
designation(opt) initializer
initializer-list , designation(opt) initializer
designation:
designator-list =
designator-list:
designator
designator-list designator
designator:
[ constant-expression ]
. identifier
Semantically, the 0 (which the grammar requires, since an initializer-list must contain at least one item; this is a somewhat arbitrary requirement, and compilers frequently support an empty {} as well) initializes the first subobject recursively (6.7.9p17):
Each brace-enclosed initializer list has an associated current object. When no designations are present, subobjects of the current object are initialized in order according to the type of the current object: array elements in increasing subscript order, structure members in declaration order, and the first named member of a union.148) In contrast, a designation causes the following initializer to begin initialization of the subobject described by the designator. Initialization then continues forward in order, beginning with the next subobject after that described by the designator.
and the rest is initialized as it would be if the whole object had static storage duration (6.7.9p19, 6.7.9p21). In practice this means zero (as with memset(,0,)), with the caveats that padding need not be initialized and that zero-initialized pointers need not be "all-bits zero".
As far as I know, compilers on usual platforms (where null pointers are all-bits-zero) mostly just do what they would do with memset(,0,).
This "universal" zero initialization works because the first 0 will recursively hit a scalar type (a number or a pointer), which can always be initialized with the 0 initializer. The default "as-with-static-storage-duration" initialization then applies to the rest.
A perhaps slightly more interesting demonstration of the trailing initializer comma doing nothing would be:
int main()
{
char one[]={0,}; //<the comma doesn't introduce another member
_Static_assert(sizeof(one)==1,""); //holds
}
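To illustrate the "universal" zero initializer on something less trivial than a char array, here is a small sketch. The types struct inner and struct outer and the helper outer_is_all_zero are hypothetical, made up for this example. The single 0 lands on o.in.id, the first scalar reached recursively, and everything else is zeroed implicitly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types, chosen so the first member is itself an aggregate. */
struct inner { int id; double weight; };
struct outer { struct inner in; const char *name; int values[4]; };

/* Returns 1 if every member of a {0}-initialized struct outer is zero/null. */
int outer_is_all_zero(void)
{
    struct outer o = {0};   /* the 0 initializes o.in.id */
    int ok;

    assert(o.in.id == 0);   /* explicitly initialized */
    ok = (o.in.weight == 0.0) && (o.name == NULL);
    for (size_t i = 0; i < sizeof o.values / sizeof o.values[0]; ++i) {
        ok = ok && (o.values[i] == 0);
    }
    return ok;
}
```

Some compilers may emit a missing-braces style warning for this form; the initialization is valid all the same.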
Is it safe to initialize some of the elements in the array like this?
const char *str_array[50] = {
[0] = "str_0",
[10] = "str_10",
[24] = "str_24",
[45] = "str_45",
};
Can I rely on the other elements of the array being properly initialized?
The initialization shown in the question is safe, and the elements not specifically initialized with a designated initializer are (in this context) initialized to NULL. In general, the uninitialized elements are initialized the same as a static variable of the same type would be initialized, which is some variation on the theme of 'zero'.
The relevant section of the C standard (ISO/IEC 9899:2011) is §6.7.9 Initialization, and specifically ¶19:
The initialization shall occur in initializer list order, each initializer provided for a particular subobject overriding any previously listed initializer for the same subobject;151) all subobjects that are not initialized explicitly shall be initialized implicitly the same as objects that have static storage duration.
151) Any initializer for the subobject which is overridden and so not used to initialize that subobject might not be evaluated at all.
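A short sketch that checks this for the array from the question; the helper name count_set_entries is hypothetical. Only the four designated elements should be non-null.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts the elements that are not null pointers. */
size_t count_set_entries(void)
{
    const char *str_array[50] = {
        [0]  = "str_0",
        [10] = "str_10",
        [24] = "str_24",
        [45] = "str_45",
    };
    size_t set = 0;

    assert(strcmp(str_array[10], "str_10") == 0);  /* explicit element */
    for (size_t i = 0; i < sizeof str_array / sizeof str_array[0]; ++i) {
        if (str_array[i] != NULL) {
            ++set;   /* true only for indices 0, 10, 24 and 45 */
        }
    }
    return set;
}
```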
There was a rejoinder in the comments:
What if I want other elements to set to same default value?
Unfortunately, you have to choose NULL (in this case; zero in the general case) as the default value. You don't have any other alternatives in standard C (unlike sophisticated modern languages such as, oh, I dunno — let's think of Fortran 66). There's no way in standard C to repeat an initializer other than by writing it many times.
GCC has an extension that allows you to do that (which is documented in the GCC manual in a section with the title Designated Initializers that documents both standard behaviour and non-standard behaviour). Using the GNU extension, you could write:
const char *str_array[50] = {
[1 ... 49] = "empty string", // GCC extension
[0] = "str_0",
[10] = "str_10",
[24] = "str_24",
[45] = "str_45",
};
Note that it is OK to specify two initializers for a cell (such as 10, 24, 45, where the other comes from the repeated initializer); the last one mentioned wins. Also note the spaces separating the ... from the 1 and the 49; they are crucial because of the 'maximal munch' rule: in [1...49] the digits and dots would be absorbed into a single malformed number token rather than being read as 1, ..., 49, which is not what's wanted.
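If the GNU range extension is not available, one portable alternative is to fill the array at run time instead of in the initializer. This is only a sketch of that workaround, not something taken from the GCC manual; fill_str_array and STR_COUNT are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STR_COUNT 50   /* hypothetical name for the array length */

/* Hypothetical helper: sets a default everywhere, then the special cases. */
void fill_str_array(const char *out[STR_COUNT])
{
    assert(out != NULL);
    for (size_t i = 0; i < STR_COUNT; ++i) {
        out[i] = "empty string";   /* the default value for every slot */
    }
    out[0]  = "str_0";             /* later assignments override the default */
    out[10] = "str_10";
    out[24] = "str_24";
    out[45] = "str_45";
}
```

The cost of this approach is that the array can no longer be initialized statically, so the array object itself cannot be declared const.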
Yes, it's quite safe.
See N1570 section 6.7.9.
Paragraph 19, discussing initializer lists:
... all subobjects that are not initialized explicitly shall be
initialized implicitly the same as objects that have static storage
duration.
Paragraph 10:
If an object that has automatic storage duration is not initialized
explicitly, its value is indeterminate. If an object that has static
or thread storage duration is not initialized explicitly, then:
if it has pointer type, it is initialized to a null pointer;
if it has arithmetic type, it is initialized to (positive or unsigned) zero;
— if it is an aggregate, every member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;
if it is a union, the first named member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;
Consider the following code:
void func()
{
int p;
...
if (p > MAX) {
struct my_struct s;
...
/* here we access the contents 's' as '&s' */
}
}
In this snippet s is on the stack. Is it guaranteed that the compiler initializes all structure fields to zero?
If a variable (struct or otherwise) is declared local to a function or a containing scope (i.e. has automatic storage duration), it is not initialized in any way. You need to explicitly set the fields in the struct.
If you initialize at least one field of a struct but not all, then the remaining fields will be initialized the same as file scope variables (i.e. variables with static storage duration), which means NULL for pointer types and 0 for numeric types.
From section 6.7.9 of the C standard:
10 If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate. If an object that
has static or thread storage duration is not initialized explicitly,
then:
— if it has pointer type, it is initialized to a null pointer;
— if it has arithmetic type, it is initialized to (positive or unsigned)
zero;
— if it is an aggregate, every member is initialized
(recursively) according to these rules, and any padding is initialized
to zero bits;
— if it is a union, the first named member is
initialized (recursively) according to these rules, and any padding is
initialized to zero bits;
...
21 If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, or fewer characters in
a string literal used to initialize an array of known size than there
are elements in the array, the remainder of the aggregate shall be
initialized implicitly the same as objects that have static storage
duration.
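Applied to the shape of the code in the question, this might look like the sketch below. The fields of struct my_struct are not shown in the question, so the layout here is an assumption, as are the helper is_zeroed and the max parameter that stands in for MAX.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout; the question does not show the fields of my_struct. */
struct my_struct {
    int count;
    double ratio;
    const char *name;
};

/* Hypothetical helper: returns 1 if every field of *s is zero or null. */
static int is_zeroed(const struct my_struct *s)
{
    return s->count == 0 && s->ratio == 0.0 && s->name == NULL;
}

/* max stands in for the question's MAX macro. */
int func(int p, int max)
{
    if (p > max) {
        struct my_struct s = {0};  /* without "= {0}" the fields are indeterminate */
        assert(s.count == 0);      /* the one explicit initializer */
        return is_zeroed(&s);      /* the other fields were zeroed implicitly */
    }
    return -1;                     /* branch not taken */
}
```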
No, it's quite the opposite.
Since s is a block-scoped variable with automatic storage duration, its contents are indeterminate unless it is initialized explicitly.
Quoting C11, chapter §6.7.9
If an object that has automatic storage duration is not initialized explicitly, its value is
indeterminate. [...].
However, if you want to zero-initialize the variable for an(y) aggregate type, you can simply use an initialization statement like
aggregate-type variable = {0};
which uses the following property from paragraph 21 of the same chapter, (emphasis mine)
If there are fewer initializers in a brace-enclosed list than there are elements or members
of an aggregate, or fewer characters in a string literal used to initialize an array of known
size than there are elements in the array, the remainder of the aggregate shall be
initialized implicitly the same as objects that have static storage duration.
No, they won't be initialized at all. The structure values will end up with whatever garbage is on the stack where the structure is placed.
Consider the following struct initialization:
#include<stdio.h>
struct bar {
int b;
int a;
int r;
};
struct foo {
struct bar bar;
};
int main(int argc, char **argv) {
struct bar b = {1, 2, 3};
struct foo f = {.bar = b, .bar.a = 5 };
// should this print "1, 5, 3", "1, 5, 0", or "0, 5, 0"?
// clang on Mac prints "1, 5, 3", while gcc on Ubuntu prints "0, 5, 0"
printf("%d, %d, %d\n", f.bar.b, f.bar.a, f.bar.r);
return 0;
}
The C11 standard seems to do a quite poor job of describing what behavior should be expected here in section 6.7.9, but seems to think it's doing a reasonable job, as I don't see any warnings regarding undefined behavior in this case either.
In practice, it seems the behavior is either not standardized or the standard is violated by at least one common compiler, with clang/llvm 8.0.0 on a Mac producing "1, 5, 3", and gcc 5.4 on Ubuntu producing "0, 5, 0".
According to the C standard, should f.bar.b and f.bar.r be well defined at this point, or does this initialization result in undefined or unspecified behavior?
The C11 standard seems to do a quite poor job of describing what behavior should be expected here in section 6.7.9,
Standardese can be difficult to read, but I don't think this area of the standard is worse in that respect than should be expected.
but seems to think it's doing a reasonable job, as I don't see any warnings regarding undefined behavior in this case either.
The standard is not required to explicitly declare undefined behaviors. Indeed, the standard contains a blanket statement that wherever it does not define behavior for a given piece of code, that code's behavior is undefined. Nevertheless, I do think section 6.7.9 covers this area pretty thoroughly. The main area left open is this:
The evaluations of the initialization list expressions are indeterminately sequenced with respect to one another and thus the order in which any side effects occur is unspecified.
(C2011, 6.7.9/23)
That doesn't present any problem for your example.
In practice, it seems the behavior is either not standardized or the standard is violated by at least one common compiler, with clang/llvm on a Mac producing "1, 5, 3", and gcc on Ubuntu producing "0, 5, 0".
I'm completely prepared to believe that one or the other of those is non-conforming in this area. However, do also pay attention to compiler versions and compilation options -- they may be compiling for different versions of the standard, with or without extensions.
According to the C standard, should f.bar.b and f.bar.r be well defined at this point, or does this initialization result in undefined or unspecified behavior?
If the declaration of an object has an associated initializer then the whole object is initialized, and furthermore, the resulting initial value is well-defined by the standard, subject to caveats arising from 6.7.9/23. As for the initial values required of a conforming implementation in your example, the key provisions are these:
The initialization shall occur in initializer list order, each initializer provided for a particular subobject overriding any previously listed initializer for the same subobject; all subobjects that are not initialized explicitly shall be initialized implicitly the same as objects that have static storage duration.
(C2011, 6.7.9/19; emphasis added)
Each designator list begins its description with the current object associated with the closest surrounding brace pair. Each item in the designator list (in order) specifies a particular member of its current object and changes the current object for the next designator (if any) to be that member. The current object that results at the end of the designator list is the subobject to be initialized by the following initializer.
(C2011, 6.7.9/18; emphasis added)
If the aggregate or union contains elements or members that are aggregates or unions, these rules apply recursively to the subaggregates or contained unions.
(C2011, 6.7.9/20)
Thus, given f's initializer,
struct foo f = {.bar = b, .bar.a = 5 };
we first process element .bar = b, as required by 6.7.9/19. That contains a designator list designating f.bar, of type struct bar, as the object to initialize from the following initializer. This initializer exercises the option of being "a single expression that has compatible structure or union type", per 6.7.9/13, therefore the initial value of f.bar is the value of b, subject to partial or full override by subsequent initializers.
We next process the second element, .bar.a = 5. This initializes f.bar.a and only that subobject, per 6.7.9/18, overriding the initialization specified by the previous initializer per 6.7.9/19.
The result of conforming initialization thus leads to printing
1, 5, 3
GCC seems to be failing by re-initializing all of f.bar when it processes the second initializer, instead of only f.bar.a.
In the C Standard there is written (6.7.9 Initialization)
17 Each brace-enclosed initializer list has an associated current
object. When no designations are present, subobjects of the current
object are initialized in order according to the type of the current
object: array elements in increasing subscript order, structure
members in declaration order, and the first named member of a
union.148) In contrast, a designation causes the following initializer
to begin initialization of the subobject described by the designator.
Initialization then continues forward in order, beginning with the
next subobject after that described by the designator
And
19 The initialization shall occur in initializer list order, each
initializer provided for a particular subobject overriding any
previously listed initializer for the same subobject;151) all
subobjects that are not initialized explicitly shall be initialized
implicitly the same as objects that have static storage duration.
This footnote is important
148) If the initializer list for a subaggregate or contained union
does not begin with a left brace, its subobjects are initialized as
usual, but the subaggregate or contained union does not become the
current object: current objects are associated only with
brace-enclosed initializer lists.
Thus I see neither undefined nor unspecified behavior.
In my opinion the result should look like { 1, 5, 3 }.
Setting the Standard aside, the reasonable approach is to first initialize the memory with the default initializers and then overwrite it with the explicit initializers.
The standard says…
I'm going to quote from §6.7.9 Initialization of ISO/IEC 9899:2011 (the C11 standard), the same section as Vlad from Moscow quotes in his answer:
¶16 Otherwise, the initializer for an object that has aggregate or union type shall be a brace-enclosed list of initializers for the elements or named members.
¶17 Each brace-enclosed initializer list has an associated current object. When no designations are present, subobjects of the current object are initialized in order according to the type of the current object: array elements in increasing subscript order, structure members in declaration order, and the first named member of a union.148) In contrast, a designation causes the following initializer to begin initialization of the subobject described by the designator. Initialization then continues forward in order, beginning with the next subobject after that described by the designator.149)
¶18 Each designator list begins its description with the current object associated with the closest surrounding brace pair. Each item in the designator list (in order) specifies a particular member of its current object and changes the current object for the next designator (if any) to be that member.150) The current object that results at the end of the designator list is the subobject to be initialized by the following initializer.
¶19 The initialization shall occur in initializer list order, each initializer provided for a particular subobject overriding any previously listed initializer for the same subobject;151) all subobjects that are not initialized explicitly shall be initialized implicitly the same as objects that have static storage duration.
¶20 If the aggregate or union contains elements or members that are aggregates or unions, these rules apply recursively to the subaggregates or contained unions. If the initializer of a subaggregate or contained union begins with a left brace, the initializers enclosed by that brace and its matching right brace initialize the elements or members of the subaggregate or the contained union. Otherwise, only enough initializers from the list are taken to account for the elements or members of the subaggregate or the first member of the contained union; any remaining initializers are left to initialize the next element or member of the aggregate of which the current subaggregate or contained union is a part.
¶21 If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, or fewer characters in a string literal used to initialize an array of known size than there are elements in the array, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration.
148) If the initializer list for a subaggregate or contained union does not begin with a left brace, its subobjects are initialized as usual, but the subaggregate or contained union does not become the current object: current objects are associated only with brace-enclosed initializer lists.
149) After a union member is initialized, the next object is not the next member of the union; instead, it is the next subobject of an object containing the union.
150) Thus, a designator can only specify a strict subobject of the aggregate or union that is associated with the surrounding brace pair. Note, too, that each separate designator list is independent.
151) Any initializer for the subobject which is overridden and so not used to initialize that subobject might not be evaluated at all.
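The "continues forward" rule of ¶17 and the override rule of ¶19 can be seen in isolation with a plain int array, where only scalars are involved. This is a sketch of my own; designated_element is a hypothetical helper, and some compilers will warn about the deliberate override of element 2.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns element i of a designated-initialized array. */
int designated_element(size_t i)
{
    /* [2] = 7 starts at index 2, and the undesignated 8 continues at index 3.
       [0] = 1 jumps back to index 0, and [2] = 9 overrides the earlier 7.
       Indices 1, 4 and 5 have no initializer and are implicitly zero. */
    const int a[6] = {[2] = 7, 8, [0] = 1, [2] = 9};

    assert(i < sizeof a / sizeof a[0]);  /* precondition: index in range */
    return a[i];
}
```

The resulting contents should be 1, 0, 9, 8, 0, 0.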
Interpretation
I think your code is well-formed and that GCC is handling it incorrectly and Clang is handling it correctly.
With your code modified only so that the unused argc and argv are replaced by void, running on a Mac with macOS Sierra 10.12.1, compiling with GCC 6.2.0 and with Apple's clang version 'Apple LLVM version 8.0.0 (clang-800.0.42.1)', I get the same results as you:
0, 5, 0 from GCC.
1, 5, 3 from Clang.
The key wording in the standard is:
In contrast, a designation causes the following initializer to begin initialization of the subobject described by the designator.
In your initializer, you have:
struct foo f = { .bar = b, .bar.a = 5 };
The first part of the initializer, .bar = b, clearly initializes the bar subobject. At that point, .bar.b has the value 1, .bar.a has the value 2, .bar.r has the value 3. If you omit the , .bar.a = 5 portion of the initializer, the compilers agree.
When you include the , .bar.a = 5, the designator causes the following initializer to begin initialization of the subobject described by the designator, and the designator is .bar.a, so the initializer 5 initializes .bar.a. The compilers agree on this; both set .bar.a to 5. But the subobject designated by .bar was previously initialized, so the initializer for .bar.a only affects the .a element; it should not override any other element.
If the initializer is extended with , 19, then the 19 has no designation, but it initializes the subobject after the previous designation, which is .bar.r. Both the compilers agree with this.
This test code, a minor variant on your code, illustrates:
#include <stdio.h>
struct bar
{
int b;
int a;
int r;
};
struct foo
{
struct bar bar;
};
static inline void foobar(struct foo f)
{
printf("%d, %d, %d\n", f.bar.b, f.bar.a, f.bar.r);
}
int main(void)
{
struct bar b = {1, 2, 3};
struct foo f0 = {.bar = b, .bar.a = 5 };
struct foo f1 = {.bar = b, .bar.a = 5, 19 };
struct foo f2 = {.bar = b };
foobar(f0);
foobar(f1);
foobar(f2);
return 0;
}
The output from GCC is:
0, 5, 0
0, 5, 19
1, 2, 3
The output from Clang is:
1, 5, 3
1, 5, 19
1, 2, 3
Note that even with no warnings specifically enabled, clang gripes about this code:
$ clang -O3 -g -std=c11 so-4092-0714.c -o so-4092-0714
so-4092-0714.c:21:36: warning: subobject initialization overrides initialization of other fields within its
enclosing subobject [-Winitializer-overrides]
struct foo f0 = {.bar = b, .bar.a = 5 };
^~~~~~
so-4092-0714.c:21:29: note: previous initialization is here
struct foo f0 = {.bar = b, .bar.a = 5 };
^
so-4092-0714.c:22:36: warning: subobject initialization overrides initialization of other fields within its
enclosing subobject [-Winitializer-overrides]
struct foo f1 = {.bar = b, .bar.a = 5, 19 };
^~~~~~
so-4092-0714.c:22:29: note: previous initialization is here
struct foo f1 = {.bar = b, .bar.a = 5, 19 };
^
2 warnings generated.
$
As I said, I think Clang is initializing these structures correctly, even if it complains more than necessary while doing so.
This is not undefined behavior.
From section 6.7.9 of the C standard:
19 The initialization shall occur in initializer list order, each initializer provided for a particular subobject overriding
any previously listed initializer for the same subobject; all
subobjects that are not initialized explicitly shall be initialized
implicitly the same as objects that have static storage duration.
So when there is a conflict among designated initializers, the last one listed takes precedence.
In your example, you initialize .bar, then .bar.a. Both of these initialize .bar, so the second one is used. So .bar is initialized, along with its subfield .bar.a, but not .bar.b or .bar.r. And because some fields are initialized but not all, the others are initialized to 0:
21 If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, or fewer characters in
a string literal used to initialize an array of known size than
there are elements in the array, the remainder of the
aggregate shall be initialized implicitly the same as objects that
have static storage duration.
This means that the correct behavior is to output "0,5,0". So gcc is conforming and the Mac compiler is not.
Since this question was posted, the C18 standard has been released, and includes some additional clarifications that make the answer completely clear.
A clarification request similar to this question was submitted to the standards body as early as 2012, and some changes to the wording were briefly discussed that might make the meaning clearer. Ultimately it was decided that the C11 wording was already correct, but that an example should be added to clarify it.
Example 12 in section 6.7.9 of the last publicly available C17 draft demonstrates the correct behavior when a subobject is fully and then partially initialized, noting that any trailing values of the larger subobject not explicitly overwritten should survive (rather than be overwritten by default values), "because implicit initialization does not override explicit initialization."
So the correct behavior is to print "1, 5, 3".
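For code that still has to build with a compiler that produces "0, 5, 0", one way to sidestep the disagreement is to avoid mixing a whole-subobject initializer with a later designator for one of its members. The sketch below is a workaround of my own, not something the standard or either compiler prescribes; make_foo is a hypothetical helper.

```c
#include <assert.h>

struct bar { int b; int a; int r; };
struct foo { struct bar bar; };

/* Hypothetical helper: copies b into f.bar, then changes only f.bar.a. */
struct foo make_foo(struct bar b)
{
    struct foo f = {.bar = b};  /* one initializer for the whole subobject */

    f.bar.a = 5;                /* plain assignment touches only this member */
    assert(f.bar.b == b.b && f.bar.r == b.r);  /* the other members survive */
    return f;
}
```

Because the helper returns the struct by value, its result can still be used to initialize a const object at block scope.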