This is a theoretical question. I know how to write this unambiguously, but I got curious, dug into the standard, and now need a second pair of standards-lawyer eyes.
Let's start with two structs and one init function:
struct foo {
    int a;
};
struct bar {
    struct foo *f;
};
struct bar *
init_bar(struct foo *f)
{
    struct bar *b = malloc(sizeof *b);
    if (!b)
        return NULL;
    b->f = f;
    return b;
}
We now have a sloppy programmer who doesn't check return values:
void
x(void)
{
    struct bar *b;
    b = init_bar(&((struct foo){ .a = 42 }));
    b->f->a++;
    free(b);
}
From my reading of the standard there's nothing wrong here other than potentially dereferencing a NULL pointer. Modifying struct foo through the pointer in struct bar should be legal because the lifetime of the compound literal passed to init_bar is that of the block that contains it, which is the whole body of function x.
But now we have a more careful programmer:
void
y(void)
{
    struct bar *b;
    if ((b = init_bar(&((struct foo){ .a = 42 }))) == NULL)
        err(1, "couldn't allocate b");
    b->f->a++;
    free(b);
}
Code does the same thing, right? So it should work too. But more careful reading of the C11 standard is leading me to believe that this leads to undefined behavior. (emphasis in quotes mine)
6.5.2.5 Compound literals
5 The value of the compound literal is that of an unnamed object initialized by the
initializer list. If the compound literal occurs outside the body of a function, the object
has static storage duration; otherwise, it has automatic storage duration associated with
the enclosing block.
6.8.4 Selection statements
3 A selection statement is a block whose scope is a strict subset of the scope of its
enclosing block. Each associated substatement is also a block whose scope is a strict
subset of the scope of the selection statement.
Am I reading this right? Does the fact that the if is a block mean that the lifetime of the compound literal is just the if statement?
(In case anyone wonders about where this contrived example came from, in real code init_bar is actually pthread_create and the thread is joined before the function returns, but I didn't want to muddy the waters by involving threads).
The second part of the Standard you quoted (6.8.4 Selection statements) says this. In code:
{ //scope 1
    if( ... ) //scope 2
    {
    } //end scope 2
} //end scope 1
Scope 2 is entirely inside scope 1. Note that a selection statement in this case is the entire if statement, including the controlling expression, not just the part between the braces:
if( ... ){ ... }
Anything defined in that statement belongs to scope 2. In your third example, the compound literal appears in the controlling expression of the if, so its lifetime ends where the if statement ends (end of scope 2). The later b->f->a++ therefore accesses an object whose lifetime is over, which is undefined behavior if the function returns non-NULL (or NULL, if err() doesn't terminate the program).
Note that I used braces in the if statement even though the third example doesn't. That part of the example is equivalent to this (6.8.2 Compound statement):
if ((b = init_bar(&((struct foo){ .a = 42 }))) == NULL)
{
    err(1, "couldn't allocate b");
}
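One way to sidestep the question entirely is to give the struct foo a name in the function's outermost block, so its lifetime clearly spans every later use. The following is only a sketch under that assumption: y_fixed is a hypothetical name, and it returns an error code instead of calling err() so that it does not depend on BSD extensions.

```c
#include <stdlib.h>

struct foo {
    int a;
};

struct bar {
    struct foo *f;
};

static struct bar *init_bar(struct foo *f)
{
    struct bar *b = malloc(sizeof *b);
    if (!b)
        return NULL;
    b->f = f;
    return b;
}

/* Hypothetical replacement for y(): returns the incremented value,
 * or -1 if the allocation fails. */
static int y_fixed(void)
{
    /* Named object in the outermost block of the function: its lifetime
     * lasts until y_fixed returns, regardless of any if statement. */
    struct foo f = { .a = 42 };
    struct bar *b = init_bar(&f);

    if (b == NULL)
        return -1;

    b->f->a++;              /* f is still alive here */
    int result = b->f->a;
    free(b);
    return result;
}
```

Moving the compound literal into a separate statement before the if (for example, storing its address in a local pointer first) would have the same effect, since the literal would then be associated with the function's block.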
In the following code, why is the variable i not assigned the value 1?
#include <stdio.h>
int main(void)
{
    int val = 0;
    switch (val) {
        int i = 1; //i is defined here
    case 0:
        printf("value: %d\n", i);
        break;
    default:
        printf("value: %d\n", i);
        break;
    }
    return 0;
}
When I compile, I get a warning about i not being initialized, even though int i = 1; appears to initialize it:
$ gcc -Wall test.c
warning: ‘i’ is used uninitialized in this function [-Wuninitialized]
printf("value %d\n", i);
^
If val = 0, then the output is 0.
If val = 1 or anything else, then the output is also 0.
Please explain to me why the variable i is declared but not defined inside the switch. The object whose identifier is i exists with automatic storage duration (within the block) but is never initialized. Why?
According to the C standard (6.8 Statements and blocks), emphasis mine:
3 A block allows a set of declarations and statements to be grouped
into one syntactic unit. The initializers of objects that have
automatic storage duration, and the variable length array declarators
of ordinary identifiers with block scope, are evaluated and the values
are stored in the objects (including storing an indeterminate value
in objects without an initializer) each time the declaration is
reached in the order of execution, as if it were a statement, and
within each declaration in the order that declarators appear.
And (6.8.4.2 The switch statement)
4 A switch statement causes control to jump to, into, or past the
statement that is the switch body, depending on the value of a
controlling expression, and on the presence of a default label and the
values of any case labels on or in the switch body. A case or default
label is accessible only within the closest enclosing switch
statement.
Thus the initializer of variable i is never evaluated, because the declaration
switch (val) {
    int i = 1; //i is defined here
    //...
is not reached in the order of execution: the switch jumps straight to a case label. Like any object with automatic storage duration whose initializer has not been executed, i has an indeterminate value.
See also this example from 6.8.4.2/7 (examples in the standard are informative rather than normative):
EXAMPLE In the artificial program fragment
switch (expr)
{
    int i = 4;
    f(i);
case 0:
    i = 17; /* falls through into default code */
default:
    printf("%d\n", i);
}
the object whose identifier is i exists with
automatic storage duration (within the block) but is never
initialized, and thus if the controlling expression has a nonzero
value, the call to the printf function will access an indeterminate
value. Similarly, the call to the function f cannot be reached.
When val is not zero, execution jumps directly to the label default; when it is zero, execution jumps to case 0. Either way the declaration is skipped, so the variable i, while defined in the block, isn't initialized and its value is indeterminate.
6.8.4.2 The switch statement
A switch statement causes control to jump to, into, or past the statement that is the
switch body, depending on the value of a controlling expression, and on the presence of a
default label and the values of any case labels on or in the switch body. A case or
default label is accessible only within the closest enclosing switch statement.
Indeed, your i is declared inside the switch block, so it only exists inside the switch. However, its initialization is never reached, so it stays uninitialized whatever the value of val.
It is a bit like the following code:
{
    int i;
    if (val==0) goto zerovalued;
    else goto nonzerovalued;
    i=1; // statement never reached
zerovalued:
    i = 10;
    printf("value:%d\n",i);
    goto next;
nonzerovalued:
    printf("value:%d\n",i);
    goto next;
next:
    return 0;
}
Intuitively, think of raw declaration like asking the compiler for some location (on the call frame in your call stack, or in a register, or whatever), and think of initialization as an assignment statement. Both are separate steps, and you could look at an initializing declaration in C like int i=1; as syntactic sugar for the raw declaration int i; followed by the initializing assignment i=1;.
(actually, things are slightly more complex e.g. with int i= i!=i; and even more complex in C++)
The line int i = 1; that initializes i is never executed, because it does not belong to any of the available cases.
The initialization of variables with automatic storage durations is detailed in C11 6.2.4p6:
For such an object that does not have a variable length array type, its lifetime extends from entry into the block with which it is associated until execution of that block ends in any way. (Entering an enclosed block or calling a function suspends, but does not end, execution of the current block.) If the block is entered recursively, a new instance of the object is created each time. The initial value of the object is indeterminate. If an initialization is specified for the object, it is performed each time the declaration or compound literal is reached in the execution of the block; otherwise, the value becomes indeterminate each time the declaration is reached.
I.e. the lifetime of i in
switch(a) {
    int i = 2;
case 1: printf("%d",i);
    break;
default: printf("Hello\n");
}
is from { to }. Its value is indeterminate unless the declaration int i = 2; is reached in the execution of the block. Because the declaration comes before any case label, it can never be reached: the switch jumps to the matching case label and thus over the initialization.
Therefore i remains uninitialized. Since it does, and since its address is never taken, using the uninitialized value leads to undefined behaviour per C11 6.3.2.1p2:
[...] If the lvalue designates an object of automatic storage duration that could have been declared with the register storage class (never had its address taken), and that object is uninitialized (not declared with an initializer and no assignment to it has been performed prior to use), the behavior is undefined.
(Notice that the standard itself here words the contents in the clarifying parenthesis incorrectly - it is declared with an initializer but the initializer is not executed).
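To make the practical consequence concrete, here is a minimal sketch of two ways to get an initialized variable into a switch: declare it before the switch, or open a braced block after a label. The function name show and the variable j are hypothetical and only serve the illustration.

```c
#include <stdio.h>

/* Hypothetical helper mirroring the question's switch.
 * Returns the value it printed. */
static int show(int val)
{
    int i = 1;   /* declared before the switch: the initializer always runs */

    switch (val) {
    case 0:
        printf("value: %d\n", i);
        break;
    default: {
        /* A braced block after the label is entered normally,
         * so this declaration is reached and j is initialized. */
        int j = i + 1;
        printf("value: %d\n", j);
        i = j;
        break;
    }
    }
    return i;
}
```

Both declarations are reached in the normal order of execution, so neither object is ever read with an indeterminate value.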
Is it allowed to take the address of an object on the right hand-side of its definition, as happens in foo() below:
typedef struct { char x[100]; } chars;
chars make(void *p) {
    printf("p = %p\n", p);
    chars c;
    return c;
}
void foo(void) {
    chars b = make(&b);
}
If it is allowed, is there any restriction on its use, e.g., is printing it OK, can I compare it to another pointer, etc?
In practice it seems to compile on the compilers I tested, with the expected behavior most of the time (but not always), but that's far from a guarantee.
To answer the question in the title, with your code sample in mind, yes it may. The C standard says as much in §6.2.4:
The lifetime of an object is the portion of program execution during
which storage is guaranteed to be reserved for it. An object exists,
has a constant address, and retains its last-stored value throughout
its lifetime.
For such an object that does not have a variable length array type,
its lifetime extends from entry into the block with which it is
associated until execution of that block ends in any way.
So yes, you may take the address of a variable from the point of declaration, because the object has the address at this point and it's in scope. A condensed example of this is the following:
void *p = &p;
It serves very little purpose, but is perfectly valid.
As for your second question, what you can do with it: I wouldn't use that address to access the object until initialization is complete, because the order of evaluation of the expressions in an initializer is left unspecified (§6.7.9). You can easily find your foot shot off.
One place where this does come through, is when defining all sorts of tabular data structures that need to be self referential. For instance:
typedef struct tab_row {
    // Useful data
    struct tab_row *p_next;
} row;
row table[3] = {
    [1] = { /*Data 1*/, &table[0] },
    [2] = { /*Data 2*/, &table[1] },
    [0] = { /*Data 0*/, &table[2] },
};
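A compilable variant of the table above might look like the following sketch. The int data member stands in for the "useful data" placeholder, and walk_sum is a hypothetical helper that follows the p_next links; neither is part of the original example.

```c
typedef struct tab_row {
    int data;                  /* stand-in for the "useful data" */
    struct tab_row *p_next;
} row;

/* Each row stores the address of another row of the same table.
 * Only addresses are taken inside the initializer; no row is read. */
static row table[3] = {
    [1] = { 1, &table[0] },
    [2] = { 2, &table[1] },
    [0] = { 0, &table[2] },
};

/* Hypothetical helper: follows p_next three times starting at table[0]
 * and sums the data fields along the way (0, then 2, then 1). */
static int walk_sum(void)
{
    const row *r = &table[0];
    int sum = 0;

    for (int i = 0; i < 3; i++) {
        sum += r->data;
        r = r->p_next;
    }
    return sum;
}
```

The initializer is fine because the identifier table is in scope as soon as its declarator is complete, and &table[n] is an address constant for an object with static storage duration.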
6.2.1 Scopes of identifiers
Structure, union, and enumeration tags have scope that begins just after the appearance of
the tag in a type specifier that declares the tag. Each enumeration constant has scope that
begins just after the appearance of its defining enumerator in an enumerator list. Any
other identifier has scope that begins just after the completion of its declarator.
In
chars b = make(&b);
// ^^
the declarator is b, so it is in scope in its own initializer.
6.2.4 Storage durations of objects
For such an [automatic] object that does not have a variable length array type, its lifetime extends
from entry into the block with which it is associated until execution of that block ends in
any way.
So in
{ // X
    chars b = make(&b);
}
the lifetime of b starts at X, so by the time the initializer executes, it is both alive and in scope.
As far as I can tell, this is effectively identical to
{
    chars b;
    b = make(&b);
}
There's no reason you couldn't use &b there.
The question has already been answered, but for reference, it doesn't make much sense. This is how you would write the code:
typedef struct { char x[100]; } chars;
chars make (void) {
    chars c;
    /* init c */
    return c;
}
void foo(void) {
    chars b = make();
}
Or perhaps preferably in case of an ADT or similar, return a pointer to a malloc:ed object. Passing structs by value is usually not a good idea.
I am trying to understand the effect of the following in C:
int func(int arg) {
    if (arg == 0) {
        double *d = malloc(...);
    }
    //...
}
My understanding is:
Regardless of the value of arg, stack space will be made for the pointer d when func is invoked
d is only initialised, i.e. malloc called, if arg == 0
d can only be accessed inside the if block; trying to access it outside will generate a compile error - even though the stack space for d is allocated regardless.
So, it is equivalent to the following except for the scoping rules that prevent access outside the if block:
int func(int arg) {
    double *d;
    if (arg == 0) {
        d = malloc(...);
    }
    //...
}
Is this correct? I am compiling with icc default settings which seems to be std=gnu89.
The lifetime of the object denoted by d starts at the beginning of the block in which it is declared (which may be before the declaration itself), not necessarily at the beginning of the function. In practice, compilers may choose to allocate space for all variables at function entry; GCC, for example, compiles both versions of func to identical assembly. With only a few automatic variables in a function, it's likely that they are all placed in registers and no stack space is used for them at all.
Initialization happens at the point where the initializer appears. All of this is subject to the as-if rule, as always: in this case, GCC doesn't generate any call to malloc when optimizing (and thereby removes the memory leak), because a compiler is allowed to "know" what standard library functions do. If this were not a library function and its definition were not known to the compiler, the call would be guaranteed to occur exactly when the initializer is reached.
Using an undeclared identifier (or one that has gone out of scope) is a syntax error, and thus caught at compile-time. The lifetime of the denoted object (with automatic storage duration) ends with the enclosing block, any attempt to refer to it afterwards (through a pointer which used to point to the object) is undefined, no diagnostic required.
In the second code snippet, it's not only syntactically possible to use d after the if block, it's also defined to access the denoted object.
To illustrate the difference between the scope of an identifier and the lifetime of the denoted object, this is valid C99 (and C11) code:
void foo(void) {
    int *p = 0;
again:
    if(p) {
        printf("%d\n", *p); /* n is not in scope here, but the object exists */
        *p = 0;
    }
    int n = 42;
    printf("%d\n", n);
    if(!p) {
        p = &n;
        goto again;
    }
}
The output is 42 three times: when the initializer is reached the second time, n is re-initialized to 42 (and does not stay 0).
Such questions don't arise for C89 (where a label cannot be above a declaration); in GNU89, mixed declarations and code is allowed, though it's not clear to me from the documentation if the C99 rules of lifetime are guaranteed to be honoured.
This code is undefined (in all C standards):
void foo(void) {
    int *p = 0;
    for(int i=0; i<2; ++i) {
        int n = 42;
        if(p) { /* (*) */
            printf("%d\n", *p);
        }
        p = &n;
    }
}
In the second iteration, p refers to the n of the first iteration after its lifetime has ended, even though both instances of n likely reside at the same storage location and 42 is printed. NB: the behaviour is already undefined when (*) is reached the second time, because merely reading an invalid pointer value is undefined, not only the indirection in the printf call.
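For contrast, a well-defined variant declares n in the function's outermost block, so a single object stays alive across all iterations. This is only a sketch: the name last_seen and the choice to store 42 + i are assumptions made so the result can be checked.

```c
/* Hypothetical well-defined counterpart of the loop above.
 * Returns the value read through p in the second iteration. */
static int last_seen(void)
{
    int *p = 0;
    int n = 0;        /* one object, alive until the function returns */
    int seen = -1;

    for (int i = 0; i < 2; ++i) {
        n = 42 + i;   /* plain assignment; no new object per iteration */
        if (p)
            seen = *p;   /* valid: p points to n, which is still alive */
        p = &n;
    }
    return seen;
}
```

In the second iteration p still points to a live object, so both reading p and dereferencing it are defined, and the function returns the value assigned in that iteration.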
Let's assume that we've got a type:
typedef struct __BUFF_T__
{
    u_int8_t *buf;
    u_int32_t size;
} buff_t;
Is it correct to allocate memory in the following way in C99?
buff_t a = {.size = 20,.buf = calloc(a.size,1)};
Compiler shows warning
Variable 'data' is uninitialized when used within its own initialization
Memory's available and all, but are there some other non-warning options to do the same?
From 6.7.9p23:
The evaluations of the initialization list expressions are indeterminately sequenced with
respect to one another [...] (152) In particular, the evaluation order need not be the same as the order of subobject initialization.
So there is no guarantee that a.size is initialized at the point calloc(a.size, 1) is evaluated for the initialization of a.buf.
In this case, a suitable initializer would be a creation function:
/* static inline avoids the C99 requirement for a separate external definition */
static inline buff_t create_buff(u_int32_t size) {
    return (buff_t) {.size = size, .buf = calloc(size, 1)};
}
buff_t a = create_buff(20);
This can't be used for static or file-scope objects; in that case a macro would be necessary (or, for example, a gcc statement expression, which could be used in a macro).
The structure is not fully initialized until after the assignment of a, because you don't know in which order the expressions will be evaluated.
If you need to use a structure field to initialize another field in the same structure, you have to do it in separate steps.
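As a sketch of the "separate steps" approach: set size first, then allocate using the value that has already been stored. The helper name make_buff is hypothetical, and the standard uint8_t and uint32_t types from <stdint.h> are used here in place of the question's u_int8_t and u_int32_t.

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    uint8_t *buf;
    uint32_t size;
} buff_t;

/* Hypothetical helper: fills *out in two sequenced steps.
 * Returns 0 on success, -1 if the allocation fails. */
static int make_buff(buff_t *out, uint32_t size)
{
    out->size = size;                   /* step 1: size is now stored */
    out->buf = calloc(out->size, 1);    /* step 2: safe to read out->size */
    return out->buf ? 0 : -1;
}
```

Because the two assignments are separate full expressions, the second is sequenced after the first, which is exactly the ordering the single initializer list does not guarantee.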