In the following code, why is the variable i not assigned the value 1?
#include <stdio.h>
int main(void)
{
int val = 0;
switch (val) {
int i = 1; //i is defined here
case 0:
printf("value: %d\n", i);
break;
default:
printf("value: %d\n", i);
break;
}
return 0;
}
When I compile, I get a warning that i is uninitialized, even though int i = 1; clearly initializes it:
$ gcc -Wall test.c
warning: ‘i’ is used uninitialized in this function [-Wuninitialized]
printf("value %d\n", i);
^
If val = 0, then the output is 0.
If val = 1 or anything else, then the output is also 0.
Please explain to me why the variable i is declared but not defined inside the switch. The object whose identifier is i exists with automatic storage duration (within the block) but is never initialized. Why?
According to the C standard (6.8 Statements and blocks), emphasis mine:
3 A block allows a set of declarations and statements to be grouped
into one syntactic unit. The initializers of objects that have
automatic storage duration, and the variable length array declarators
of ordinary identifiers with block scope, are evaluated and the values
are stored in the objects (including storing an indeterminate value
in objects without an initializer) each time the declaration is
reached in the order of execution, as if it were a statement, and
within each declaration in the order that declarators appear.
And (6.8.4.2 The switch statement)
4 A switch statement causes control to jump to, into, or past the
statement that is the switch body, depending on the value of a
controlling expression, and on the presence of a default label and the
values of any case labels on or in the switch body. A case or default
label is accessible only within the closest enclosing switch
statement.
Thus the initializer of variable i is never evaluated because the declaration
switch (val) {
int i = 1; //i is defined here
//...
is not reached in the order of execution, because control jumps straight to a case label. Like any object with automatic storage duration that has not been initialized, i therefore has an indeterminate value.
See also this example from 6.8.4.2/7:
EXAMPLE In the artificial program fragment
switch (expr)
{
int i = 4;
f(i);
case 0:
i = 17; /* falls through into default code */
default:
printf("%d\n", i);
}
the object whose identifier is i exists with
automatic storage duration (within the block) but is never
initialized, and thus if the controlling expression has a nonzero
value, the call to the printf function will access an indeterminate
value. Similarly, the call to the function f cannot be reached.
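One way to make sure the initializer runs is to move the declaration out of the switch body, so that execution reaches it before the jump to a case label. A minimal sketch, assuming a hypothetical helper print_value() that is not part of the question's code:

```c
#include <stdio.h>

/* Hypothetical helper (not from the question): prints and returns i. */
int print_value(int val)
{
    int i = 1;  /* reached before the switch, so the initializer is evaluated */

    switch (val) {
    case 0:
        printf("value: %d\n", i);
        break;
    default:
        printf("value: %d\n", i);
        break;
    }
    return i;
}
```

Calling print_value(0) or print_value(5) takes different case labels, but both paths pass through the declaration first.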
In the case when val is not zero, the execution jumps directly to the label default. This means that the variable i, while defined in the block, isn't initialized and its value is indeterminate.
6.8.4.2 The switch statement
A switch statement causes control to jump to, into, or past the statement that is the
switch body, depending on the value of a controlling expression, and on the presence of a
default label and the values of any case labels on or in the switch body. A case or
default label is accessible only within the closest enclosing switch statement.
Indeed, your i is declared inside the switch block, so it only exists inside the switch. However, its initialization is never reached, so i is still uninitialized when it is printed.
It is a bit like the following code:
{
int i;
if (val==0) goto zerovalued;
else goto nonzerovalued;
i=1; // statement never reached
zerovalued:
i = 10;
printf("value:%d\n",i);
goto next;
nonzerovalued:
printf("value:%d\n",i);
goto next;
next:
return 0;
}
Intuitively, think of a raw declaration as asking the compiler for some location (on the call frame in your call stack, or in a register, or whatever), and think of initialization as an assignment statement. The two are separate steps, and you could look at an initializing declaration in C like int i=1; as syntactic sugar for the raw declaration int i; followed by the initializing assignment i=1;.
(Actually, things are slightly more complex, e.g. with int i= i!=i;, and even more complex in C++.)
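The view that an initializer behaves like an assignment executed whenever the declaration is reached can be checked with a loop. A small sketch, where count_resets() is a hypothetical name chosen for illustration: because int n = 0; is reached on every iteration, n is reset each time and never accumulates.

```c
/* Hypothetical demonstration: returns the value of n seen on the last iteration. */
int count_resets(int iterations)
{
    int last = -1;  /* -1 means the loop body never ran */

    for (int k = 0; k < iterations; ++k) {
        int n = 0;  /* initializer runs every time this declaration is reached */
        n++;
        last = n;   /* always 1: n does not keep its value across iterations */
    }
    return last;
}
```

The switch in the question is the opposite situation: the declaration is never reached, so the "assignment" half never happens.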
The line that initializes i, int i = 1;, is never executed because it does not belong to any of the case labels: the switch jumps straight past it.
The initialization of variables with automatic storage durations is detailed in C11 6.2.4p6:
For such an object that does not have a variable length array type, its lifetime extends from entry into the block with which it is associated until execution of that block ends in any way. (Entering an enclosed block or calling a function suspends, but does not end, execution of the current block.) If the block is entered recursively, a new instance of the object is created each time. The initial value of the object is indeterminate. If an initialization is specified for the object, it is performed each time the declaration or compound literal is reached in the execution of the block; otherwise, the value becomes indeterminate each time the declaration is reached.
I.e. the lifetime of i in
switch(a) {
int i = 2;
case 1: printf("%d",i);
break;
default: printf("Hello\n");
}
is from { to }. Its value is indeterminate unless the declaration int i = 2; is reached in the execution of the block. Because the declaration precedes every case label, it can never be reached: the switch jumps to the matching case label, and thus over the initialization.
Therefore i remains uninitialized. Because of that, and because its address is never taken, using the uninitialized value leads to undefined behaviour per C11 6.3.2.1p2:
[...] If the lvalue designates an object of automatic storage duration that could have been declared with the register storage class (never had its address taken), and that object is uninitialized (not declared with an initializer and no assignment to it has been performed prior to use), the behavior is undefined.
(Notice that the standard itself here words the contents in the clarifying parenthesis incorrectly - it is declared with an initializer but the initializer is not executed).
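If i is only needed in one branch, another option is to give that case its own block and declare i after the label, so that the declaration is reached on that path. A sketch, assuming a hypothetical pick() helper that stands in for the question's printf calls; the braces are there because, before C23, a label must be followed by a statement rather than a declaration:

```c
/* Hypothetical helper: returns the value each branch would use. */
int pick(int val)
{
    int result;

    switch (val) {
    case 0: {
        int i = 1;  /* follows the case label, so it is reached and initialized */
        result = i;
        break;
    }
    default: {
        int i = 2;  /* a separate object with its own initializer */
        result = i;
        break;
    }
    }
    return result;
}
```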
Is it allowed to jump to a label that's inside an inner scope or a sibling scope? If so, is it allowed to use variables declared in that scope?
Consider this code:
int cond(void);
void use(int);
void foo()
{
{
int y = 2;
label:
use(y);
}
{
int z = 3;
use(z);
/* jump to sibling scope: */ if(cond()) goto label;
}
/* jump to inner scope: */ if(cond()) goto label;
}
Are these gotos legal?
If so, is y guaranteed to exist when I jump to label and to hold the last value assigned to it (2)?
Or is the compiler allowed to assume y won't be used after it goes out of scope, which means a single memory location may be used for both y and z?
If this code's behavior is undefined, how can I get GCC to emit a warning about it?
From the C99 standard (emphasis mine):
6.2.4 Storage durations of objects
[6] For such an object that does have a variable length array type, its lifetime extends from the declaration of the object until execution of the program leaves the scope of the declaration. ... If the scope is entered recursively, a new instance of the object is created each time. The initial value of the object is indeterminate.
6.8.6.1 The goto statement
[1] The identifier in a goto statement shall name a label located somewhere in the enclosing function. A goto statement shall not jump from outside the scope of an identifier having a variably modified type to inside the scope of that identifier.
[4] ... A goto statement is not allowed to jump past any declarations of objects with variably modified types.
Conclusion
y does not have a variably modified type, so, according to the standard, the jumps are legal.
y is guaranteed to exist, however, the jumps skip the initialization (y = 2), so the value of y is indeterminate.
You can use -Wjump-misses-init to get GCC to emit a warning like the following:
warning: jump skips variable initialization [-Wjump-misses-init]
In C++, the jumps are not legal: C++ does not allow a jump to skip the initialization of y.
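If a jump into a block is really needed in C, one way to keep the value determinate is to declare the variable without an initializer and assign it right after the label. A minimal sketch, assuming a hypothetical count_passes() function invented for illustration (it is not the question's foo()):

```c
/* Hypothetical example: adds y to passes until passes reaches limit. */
int count_passes(int limit)
{
    int passes = 0;

    {
        int y;      /* no initializer, so a jump cannot skip one */
    label:
        y = 2;      /* every path through the label executes this assignment */
        passes += y;
    }
    if (passes < limit)
        goto label; /* jumps into the inner block; y is assigned before use */
    return passes;
}
```

With limit 5 the block runs three times (2, 4, 6); with limit 0 it runs once.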
The jumps are legal (in C, in C++ they aren't).
is y guaranteed to exist when I jump to label
Yes.
and to hold the last value assigned to it (2)?
No.
From the C11 Standard (draft) 6.2.4/6:
For such an object [without the storage-class
specifier static] that does not have a variable length array type, its lifetime extends
from entry into the block with which it is associated until execution of that block ends in
any way. [...] The initial value of the object is indeterminate. If an
initialization is specified for the object, it is performed each time the declaration [...] is reached in the execution of the block; otherwise, the value becomes
indeterminate each time the declaration is reached.
From the above one would conclude that the 2nd and 3rd time use(y) gets called, the value of y is indeterminate, as the initialisation of y is not "reached".
I am trying to understand the effect of the following in C:
int func(int arg) {
if (arg == 0) {
double *d = malloc(...);
}
//...
}
My understanding is:
Regardless of the value of arg, stack space will be made for the pointer d when func is invoked
d is only initialised, i.e. malloc called, if arg == 0
d can only be accessed inside the if block; trying to access it outside will generate a compile error - even though the stack space for d is allocated regardless.
So, it is equivalent to the following except for the scoping rules that prevent access outside the if block:
int func(int arg) {
double *d;
if (arg == 0) {
d = malloc(...);
}
//...
}
Is this correct? I am compiling with icc default settings which seems to be std=gnu89.
The lifetime of the object denoted by d starts at the beginning of the block in which it is declared (which might be prior to the declaration), not necessarily at the beginning of the function. In practice, compilers may choose to allocate space for all variables at function entry; GCC, for example, compiles both versions of func to identical assembly. With only a few automatic variables in a function, it's likely that they are all placed in registers and no stack space is used for them at all.
Initialization happens at the point where the initializer appears. All this is subject to the as-if rule (as always): in this case, GCC doesn't generate any call to malloc when optimizing (and thereby removes the memory leak), because a compiler is allowed to "know" what standard library functions do. If this weren't a library function and its definition were not known to the compiler, the call would be guaranteed to occur exactly when the initializer is reached.
Using an undeclared identifier (or one that has gone out of scope) is a constraint violation that the compiler must diagnose, and is thus caught at compile-time. The lifetime of the denoted object (with automatic storage duration) ends with the enclosing block; any attempt to refer to it afterwards (through a pointer which used to point to the object) is undefined, no diagnostic required.
In the second code snippet, it's not only syntactically possible to use d after the if block, it's also defined to access the denoted object.
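A practical consequence of the second form is that d stays in scope after the if block, so the function can release the allocation. A minimal sketch, assuming an element count of 4 and a return value convention that are both invented for illustration (the question elides the malloc argument and the rest of the body):

```c
#include <stdlib.h>

/* Hypothetical version of func(): returns 1 if memory was allocated and
   released, 0 if arg was nonzero, -1 if malloc failed. */
int func(int arg)
{
    double *d = NULL;               /* declared in the function's outer block */

    if (arg == 0) {
        d = malloc(4 * sizeof *d);  /* 4 elements is an arbitrary, assumed size */
        if (d == NULL)
            return -1;
        d[0] = 1.5;
    }

    int allocated = (d != NULL);    /* d is still in scope here */
    free(d);                        /* free(NULL) is a no-op */
    return allocated;
}
```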
To illustrate the difference between the scope of an identifier and the lifetime of the denoted object, this is valid C99 (and C11) code:
void foo(void) {
int *p = 0;
again:
if(p) {
printf("%d\n", *p); /* n is not in scope here, but the object exists */
*p = 0;
}
int n = 42;
printf("%d\n", n);
if(!p) {
p = &n;
goto again;
}
}
The output is 42 three times: when the initializer is reached the second time, n is re-initialized to 42 (and does not stay 0).
Such questions don't arise for C89 (where a label cannot be above a declaration); in GNU89, mixed declarations and code are allowed, though it's not clear to me from the documentation whether the C99 lifetime rules are guaranteed to be honoured.
This code is undefined (in all C standards):
void foo(void) {
int *p = 0;
for(int i=0; i<2; ++i) {
int n = 42;
if(p) { /* (*) */
printf("%d\n", *p);
}
p = &n;
}
}
In the second iteration, p refers to the n of the first iteration after its lifetime has ended, even though both n objects likely reside at the same storage location and 42 is printed. Note that the behaviour is undefined as soon as (*) is reached the second time: reading an invalid pointer value is itself undefined, not only the indirection in the printf call.
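The dangling pointer disappears if n is declared outside the loop, because its lifetime then spans every iteration. A minimal sketch, assuming a hypothetical read_previous() function that returns the value instead of printing it:

```c
#include <stddef.h>

/* Hypothetical variant of foo(): n lives for the whole function body,
   so p never points to an object whose lifetime has ended. */
int read_previous(int iterations)
{
    int n = 0;
    int *p = NULL;
    int last = -1;          /* -1 means there was no previous iteration */

    for (int i = 0; i < iterations; ++i) {
        if (p) {
            last = *p;      /* reads the value stored by the previous iteration */
        }
        n = 42 + i;
        p = &n;
    }
    return last;
}
```

With two iterations the second one reads the 42 stored by the first; with a single iteration p is still null when it is tested.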
#include<stdio.h>
int main()
{
int i = 10;
printf("0 i %d %p\n",i,&i);
if (i == 10)
goto f;
{
int i = 20;
printf("1 i %d\n",i);
}
{
int i = 30;
f:
printf("2 i %d %p\n",i,&i); //statement X
}
return 0;
}
Output:
[test]$ ./a.out
0 i 10 0xbfbeaea8
2 i 134513744 0xbfbeaea4
I have difficulty in understanding how statement X works?? As you see the output it is junk. It should rather say i not declared??
That's because goto skips the shadowing variable i's initialization.
This is one of the minor differences between C and C++. In strict C++, a goto that crosses a variable initialization is an error, while in C it is not. GCC confirms this: compiled with -std=c11 the code is accepted, while with -std=c++11 it complains: jump to label 'f' crosses initialization of 'int i'.
From C99:
A goto statement shall not jump from outside the scope of an identifier having a variably modified type to inside the scope of that identifier.
VLAs are of variably modified type. Jumps inside a scope not containing VM types are allowed.
From C++11 (emphasis mine):
A program that jumps from a point where a variable with automatic storage duration is not in scope to a point where it is in scope is ill-formed unless the variable has scalar type, class type with a trivial default constructor and a trivial destructor, a cv-qualified version of one of these types, or an array of one of the preceding types and is declared without an initializer.
From the output, it is clear that the two i objects have distinct addresses, since they are declared in different scopes.
0 i 10 0xbfbeaea8
2 i 134513744 0xbfbeaea4
how statement X works?? As you see the output it is junk. It should
rather say I not declared??
i is also declared in the local scope of statement X, but the initialization of i to 30 is skipped because of the goto statement. Therefore the local variable i contains a garbage value.
In the first printf statement, you accessed the i in address 0xbfbeaea8 which was declared and initialized in the statement int i = 10;
Once you hit the goto f; statement, you are in the scope of the 2nd i, which is declared at this point and resides in address 0xbfbeaea4 but which is not initialized as you skipped the initialization statement.
That's why you were getting rubbish.
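If the jump has to stay, the label can be placed before the declaration so that the initializer is reached on both paths. A minimal sketch, assuming a hypothetical function outer_plus_inner() and a renamed inner variable (inner instead of a second i) to avoid shadowing; it needs C99 or later because the declaration follows a statement:

```c
/* Hypothetical rework of the question's program as a function. */
int outer_plus_inner(int skip)
{
    int i = 10;
    int result;

    if (skip)
        goto f;
    i = 20;
    {
    f:  ;               /* null statement: the label now precedes the declaration */
        int inner = 30; /* reached on both paths, so always initialized */
        result = i + inner;
    }
    return result;
}
```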
When control reaches the third block, the inner i is in scope as far as the compiler is concerned, so i names that block's own object and the program reads it. But because the goto skipped int i = 30;, that object was never initialized, so its value is indeterminate; it is unrelated to the outer i, which still holds 10.
My suggestion to understand somewhat complex code is to strip out, one by one, all "unnecessary" code and leave the bare problem. How do you know what's unnecessary? Initially, when you're not fluent with the language, you'll be removing parts of the code at random, but very quickly you'll learn what's necessary and what is not.
Give it a try: my hint is to start removing or commenting out the "goto" statement. Recompile and, if there are no errors, see what changed when you run the program again.
Another suggestion would be: try to recreate the problem "from scratch": imagine you are working on a top-secret project and you cannot show any single line of code to anyone, let alone post on Stack Overflow. Now, try to replicate the problem by rewriting equivalent source code, that would show the same behaviour.
As they say, "asking the right question is often solving half the problem".
The i you print in the statement printf("2 i %d %p\n",i,&i); is not the i whose value was 10 in the if statement, and because the goto skips int i = 30;, you print garbage. That int i = 30; is the actual definition of the i that is printed, i.e. where the compiler allocates room for i and gives it a value.
The problem is that your goto skips the initialization of the second i, which shadows (conceals) the first i whose value you've set, so you're printing out an uninitialized variable.
You'll get a similar wrong answer from this:
#include<stdio.h>
int main()
{
int i = 10; /* First "i" */
printf("0 i %d %p\n",i,&i);
{ /* New block scope */
int i; /* Second "i" shadows first "i" */
printf("2 i %d %p\n",i,&i);
}
return 0;
}
Three lessons: don't shadow variables; don't create blocks ({ ... }) for no reason; and turn on compiler warnings.
Just to clarify: variable scope is a compile-time concept based on where variables are declared, not something that is subject to what happens at runtime. The declaration of i#2 conceals i#1 inside the block that i#2 is declared in. It doesn't matter if the runtime control path jumps into the middle of the block — i#2 is the i that will be used and i#1 is hidden (shadowed). Runtime control flow doesn't carry scope around in a satchel.
Suppose I have a function that declares and initializes two local variables – which by default have the storage duration auto. This function then calls a second function, to which it passes the addresses of these two local variables. Can this second function safely use these pointers?
A trivial programmatic example, to supplement that description:
#include <stdio.h>
int adder(int *a, int *b)
{
return *a + *b;
}
int main()
{
auto int a = 5; // `auto' is redundant; included for clarity
auto int b = 3;
// adder() gets the addresses of two auto variables! is this an issue?
int result = adder(&a, &b);
printf("5 + 3 = %d\n", result);
return 0;
}
This program works as expected, printing 5 + 3 = 8.
Usually, when I have questions about C, I turn to the standard, and this was no exception. Specifically, I checked ISO/IEC 9899, §6.2.4. It says there, in part:
4
An object whose identifier is declared with no linkage and without
the storage-class specifier static has automatic storage duration.
5
For such an object that does not have a variable length array type,
its lifetime extends from entry into the block with which it is
associated until execution of that block ends in any way. (Entering an
enclosed block or calling a function suspends, but does not end,
execution of the current block.) If the block is entered recursively,
a new instance of the object is created each time. The initial value
of the object is indeterminate. If an initialization is specified for
the object, it is performed each time the declaration is reached in
the execution of the block; otherwise, the value becomes indeterminate
each time the declaration is reached.
Reading this, I reason the following points:
Variables a and b have storage duration auto, which I've made explicit using the auto keyword.
Calling the adder() function corresponds to the parenthetical in clause 5, in the partial quote above. That is, entering the adder() function "suspends, but does not end," the execution of the current block (which is main()).
Since the main() block is not "end[ed] in any way," storage for a and b is guaranteed. Thus, accessing them using the addresses &a and &b, even inside adder(), should be safe.
My question, then, is: am I correct in this? Or am I just getting "lucky," and accessing memory locations that, by happenstance, have not been overwritten?
P.S. I was unable to find an exact answer to this question through either Google or SO's search. If you can, mark this as a duplicate and I'll delete it.
Yes, it is safe and basically your assumptions are correct. The lifetime of an automatic object is from the entry in the block where it has been declared until the block terminates.
(C99, 6.2.4p5) "For such an object [...] its lifetime extends from entry into the block with which it is associated until execution of that block ends in any way."
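The same rule is what makes the common out-parameter idiom work: the caller's locals stay alive for the whole call, so the callee may write through pointers to them. A minimal sketch, where divide() and check_divide() are hypothetical names that do not appear in the question:

```c
/* Hypothetical helper: writes quotient and remainder through the pointers. */
void divide(int a, int b, int *quot, int *rem)
{
    *quot = a / b;
    *rem = a % b;
}

/* The caller's locals q and r are alive for the whole call to divide(). */
int check_divide(void)
{
    int q, r;

    divide(17, 5, &q, &r);
    return q * 10 + r;  /* 3 * 10 + 2 */
}
```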
Your reasoning is correct for your particular function call chain, and you have read and quoted the relevant portions of the standard. This is a perfectly valid use of pointers to local variables.
Where you have to be wary is if the function stores the pointer values in a structure that has a lifetime longer than its own call. Consider two functions, foo(), and bar():
int *g_ptr;
void bar (int *p) {
g_ptr = p;
}
void foo () {
int x = 10;
bar(&x);
}
int main () {
foo ();
/* ...do something with g_ptr? */
return 0;
}
In this case, the lifetime of the variable x ends when foo() returns. However, the pointer to x has been stored in g_ptr by bar(), so g_ptr dangles afterwards. In this case, it was an error for foo() to pass a pointer to its local variable x to bar().
What this means is that in order to know whether or not it is valid to pass a pointer to a local variable to a function, you have to know what that function will do with it.
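When the called function needs the data beyond the call, one option is to store a copy of the value rather than the pointer. A minimal sketch, assuming a hypothetical g_value global and a value_after_foo() wrapper that replace g_ptr and main() from the example above:

```c
static int g_value;     /* stores a copy, not a pointer */

void bar(const int *p)
{
    g_value = *p;       /* copy while the pointed-to object is still alive */
}

void foo(void)
{
    int x = 10;

    bar(&x);            /* x is alive for the duration of this call */
}

int value_after_foo(void)
{
    foo();
    return g_value;     /* safe: no pointer to x is used after foo() returned */
}
```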
Those variables are typically allocated on the stack. As long as you do not return from the function that declared them, they remain valid.
As I'm not yet allowed to comment, I'd rather write another answer as amendment to jxh's answer above:
Please see my elaborate answer here for a similar question. This contains a real world example where the aliasing in the called function makes your code break even though it follows all the c-language rules.
Even though it is legal in C, I consider it harmful to pass pointers to automatic variables in a function call. You never know (and often you don't want to know) what exactly the called function does with the passed values. When the called function establishes an alias, you get into big trouble.