Variable i when was it defined but statement was never executed? [duplicate] - c

In the following code, why is the variable i not assigned the value 1?
#include <stdio.h>
int main(void)
{
int val = 0;
switch (val) {
int i = 1; //i is defined here
case 0:
printf("value: %d\n", i);
break;
default:
printf("value: %d\n", i);
break;
}
return 0;
}
When I compile, I get a warning about i not being initialized despite int i = 1; that clearly initializes it
$ gcc -Wall test.c
warning: ‘i’ is used uninitialized in this function [-Wuninitialized]
printf("value %d\n", i);
^
If val = 0, then the output is 0.
If val = 1 or anything else, then the output is also 0.
Please explain to me why the variable i is declared but not defined inside the switch. The object whose identifier is i exists with automatic storage duration (within the block) but is never initialized. Why?

According to the C standard (6.8 Statements and blocks), emphasis mine:
3 A block allows a set of declarations and statements to be grouped
into one syntactic unit. The initializers of objects that have
automatic storage duration, and the variable length array declarators
of ordinary identifiers with block scope, are evaluated and the values
are stored in the objects (including storing an indeterminate value
in objects without an initializer) each time the declaration is
reached in the order of execution, as if it were a statement, and
within each declaration in the order that declarators appear.
And (6.8.4.2 The switch statement)
4 A switch statement causes control to jump to, into, or past the
statement that is the switch body, depending on the value of a
controlling expression, and on the presence of a default label and the
values of any case labels on or in the switch body. A case or default
label is accessible only within the closest enclosing switch
statement.
Thus the initializer of variable i is never evaluated because the declaration
switch (val) {
int i = 1; //i is defined here
//...
is not reached in the order of execution due to jumps to case labels and like any variable with the automatic storage duration has indeterminate value.
See also this normative example from 6.8.4.2/7:
EXAMPLE In the artificial program fragment
switch (expr)
{
int i = 4;
f(i);
case 0:
i = 17; /* falls through into default code */
default:
printf("%d\n", i);
}
the object whose identifier is i exists with
automatic storage duration (within the block) but is never
initialized, and thus if the controlling expression has a nonzero
value, the call to the printf function will access an indeterminate
value. Similarly, the call to the function f cannot be reached.

In the case when val is not zero, the execution jumps directly to the label default. This means that the variable i, while defined in the block, isn't initialized and its value is indeterminate.
6.8.2.4 The switch statement
A switch statement causes control to jump to, into, or past the statement that is the
switch body, depending on the value of a controlling expression, and on the presence of a
default label and the values of any case labels on or in the switch body. A case or
default label is accessible only within the closest enclosing switch statement.

Indeed, your i is declared inside the switch block, so it only exists inside the switch. However, its initialization is never reached, so it stays uninitialized when val is not 0.
It is a bit like the following code:
{
int i;
if (val==0) goto zerovalued;
else goto nonzerovalued;
i=1; // statement never reached
zerovalued:
i = 10;
printf("value:%d\n",i);
goto next;
nonzerovalued:
printf("value:%d\n",i);
goto next;
next:
return 0;
}
Intuitively, think of raw declaration like asking the compiler for some location (on the call frame in your call stack, or in a register, or whatever), and think of initialization as an assignment statement. Both are separate steps, and you could look at an initializing declaration in C like int i=1; as syntactic sugar for the raw declaration int i; followed by the initializing assignment i=1;.
(actually, things are slightly more complex e.g. with int i= i!=i; and even more complex in C++)

Line for initialization of i variable int i = 1; is never called because it does not belong to any of available cases.

The initialization of variables with automatic storage durations is detailed in C11 6.2.4p6:
For such an object that does not have a variable length array type, its lifetime extends from entry into the block with which it is associated until execution of that block ends in any way. (Entering an enclosed block or calling a function suspends, but does not end, execution of the current block.) If the block is entered recursively, a new instance of the object is created each time. The initial value of the object is indeterminate. If an initialization is specified for the object, it is performed each time the declaration or compound literal is reached in the execution of the block; otherwise, the value becomes indeterminate each time the declaration is reached.
I.e. the lifetime of i in
switch(a) {
int i = 2;
case 1: printf("%d",i);
break;
default: printf("Hello\n");
}
is from { to }. Its value is indeterminate, unless the declaration int i = 2; is reached in the execution of the block. Since the declaration is before any case label, the declaration cannot be ever reached, since the switch jumps to the corresponding case label - and over the initialization.
Therefore i remains uninitialized. And since it does, and since it has its address never taken, the use of the uninitialized value to undefined behaviour C11 6.3.2.1p2:
[...] If the lvalue designates an object of automatic storage duration that could have been declared with the register storage class (never had its address taken), and that object is uninitialized (not declared with an initializer and no assignment to it has been performed prior to use), the behavior is undefined.
(Notice that the standard itself here words the contents in the clarifying parenthesis incorrectly - it is declared with an initializer but the initializer is not executed).

Related

fscanf crash segmentation fault with gdb [duplicate]

In the following code, why is the variable i not assigned the value 1?
#include <stdio.h>
int main(void)
{
int val = 0;
switch (val) {
int i = 1; //i is defined here
case 0:
printf("value: %d\n", i);
break;
default:
printf("value: %d\n", i);
break;
}
return 0;
}
When I compile, I get a warning about i not being initialized despite int i = 1; that clearly initializes it
$ gcc -Wall test.c
warning: ‘i’ is used uninitialized in this function [-Wuninitialized]
printf("value %d\n", i);
^
If val = 0, then the output is 0.
If val = 1 or anything else, then the output is also 0.
Please explain to me why the variable i is declared but not defined inside the switch. The object whose identifier is i exists with automatic storage duration (within the block) but is never initialized. Why?
According to the C standard (6.8 Statements and blocks), emphasis mine:
3 A block allows a set of declarations and statements to be grouped
into one syntactic unit. The initializers of objects that have
automatic storage duration, and the variable length array declarators
of ordinary identifiers with block scope, are evaluated and the values
are stored in the objects (including storing an indeterminate value
in objects without an initializer) each time the declaration is
reached in the order of execution, as if it were a statement, and
within each declaration in the order that declarators appear.
And (6.8.4.2 The switch statement)
4 A switch statement causes control to jump to, into, or past the
statement that is the switch body, depending on the value of a
controlling expression, and on the presence of a default label and the
values of any case labels on or in the switch body. A case or default
label is accessible only within the closest enclosing switch
statement.
Thus the initializer of variable i is never evaluated because the declaration
switch (val) {
int i = 1; //i is defined here
//...
is not reached in the order of execution due to jumps to case labels and like any variable with the automatic storage duration has indeterminate value.
See also this normative example from 6.8.4.2/7:
EXAMPLE In the artificial program fragment
switch (expr)
{
int i = 4;
f(i);
case 0:
i = 17; /* falls through into default code */
default:
printf("%d\n", i);
}
the object whose identifier is i exists with
automatic storage duration (within the block) but is never
initialized, and thus if the controlling expression has a nonzero
value, the call to the printf function will access an indeterminate
value. Similarly, the call to the function f cannot be reached.
In the case when val is not zero, the execution jumps directly to the label default. This means that the variable i, while defined in the block, isn't initialized and its value is indeterminate.
6.8.2.4 The switch statement
A switch statement causes control to jump to, into, or past the statement that is the
switch body, depending on the value of a controlling expression, and on the presence of a
default label and the values of any case labels on or in the switch body. A case or
default label is accessible only within the closest enclosing switch statement.
Indeed, your i is declared inside the switch block, so it only exists inside the switch. However, its initialization is never reached, so it stays uninitialized when val is not 0.
It is a bit like the following code:
{
int i;
if (val==0) goto zerovalued;
else goto nonzerovalued;
i=1; // statement never reached
zerovalued:
i = 10;
printf("value:%d\n",i);
goto next;
nonzerovalued:
printf("value:%d\n",i);
goto next;
next:
return 0;
}
Intuitively, think of raw declaration like asking the compiler for some location (on the call frame in your call stack, or in a register, or whatever), and think of initialization as an assignment statement. Both are separate steps, and you could look at an initializing declaration in C like int i=1; as syntactic sugar for the raw declaration int i; followed by the initializing assignment i=1;.
(actually, things are slightly more complex e.g. with int i= i!=i; and even more complex in C++)
Line for initialization of i variable int i = 1; is never called because it does not belong to any of available cases.
The initialization of variables with automatic storage durations is detailed in C11 6.2.4p6:
For such an object that does not have a variable length array type, its lifetime extends from entry into the block with which it is associated until execution of that block ends in any way. (Entering an enclosed block or calling a function suspends, but does not end, execution of the current block.) If the block is entered recursively, a new instance of the object is created each time. The initial value of the object is indeterminate. If an initialization is specified for the object, it is performed each time the declaration or compound literal is reached in the execution of the block; otherwise, the value becomes indeterminate each time the declaration is reached.
I.e. the lifetime of i in
switch(a) {
int i = 2;
case 1: printf("%d",i);
break;
default: printf("Hello\n");
}
is from { to }. Its value is indeterminate, unless the declaration int i = 2; is reached in the execution of the block. Since the declaration is before any case label, the declaration cannot be ever reached, since the switch jumps to the corresponding case label - and over the initialization.
Therefore i remains uninitialized. And since it does, and since it has its address never taken, the use of the uninitialized value to undefined behaviour C11 6.3.2.1p2:
[...] If the lvalue designates an object of automatic storage duration that could have been declared with the register storage class (never had its address taken), and that object is uninitialized (not declared with an initializer and no assignment to it has been performed prior to use), the behavior is undefined.
(Notice that the standard itself here words the contents in the clarifying parenthesis incorrectly - it is declared with an initializer but the initializer is not executed).

Using goto to jump to inner or sibling scope

Is it allowed to jump to a label that's inside an inner scope or a sibling scope? If so, is it allowed to use variables declared in that scope?
Consider this code:
int cond(void);
void use(int);
void foo()
{
{
int y = 2;
label:
use(y);
}
{
int z = 3;
use(z);
/* jump to sibling scope: */ if(cond()) goto label;
}
/* jump to inner scope: */ if(cond()) goto label;
}
Are these gotos legal?
If so, is y guaranteed to exist when I jump to label and to hold the last value assigned to it (2)?
Or is the compiler allowed to assume y won't be used after it goes out of scope, which means a single memory location may be used for both y and z?
If this code's behavior is undefined, how can I get GCC to emit a warning about it?
From the C99 standard (emphasis mine):
6.2.4 Storage durations of objects
[6] For such an object that does have a variable length array type, its lifetime extends from the declaration of the object until execution of the program leaves the scope of the declaration. ... If the scope is entered recursively, a new instance of the object is created each time. The initial value of the object is indeterminate.
6.8.6.1 The goto statement
[1] The identifier in a goto statement shall name a label located somewhere in the enclosing function. A goto statement shall not jump from outside the scope of an identifier having a variably modified type to inside the scope of that identifier.
[4] ... A goto statement is not allowed to jump past any declarations of objects with variably modified types.
Conclusion
y is not a variably modified type, so, according to the standard, the jumps are legal.
y is guaranteed to exist, however, the jumps skip the initialization (y = 2), so the value of y is indeterminate.
You can use -Wjump-misses-init to get GCC to emit a warning like the following:
warning: jump skips variable initialization [-Wjump-misses-init]
In C++, the jumps are not legal, C++ does not allow to skip the initialization of y.
The jumps are legal (in C, in C++ they aren't).
is y guaranteed to exist when I jump to label
Yes.
and to hold the last value assigned to it (2)?
No.
From the C11 Standard (draft) 6.2.4/6:
For such an object [without the storage-class
specifier static] that does not have a variable length array type, its lifetime extends
from entry into the block with which it is associated until execution of that block ends in
any way. [...] The initial value of the object is indeterminate. If an
initialization is specified for the object, it is performed each time the declaration [...] is reached in the execution of the block; otherwise, the value becomes
indeterminate each time the declaration is reached.
From the above one would conclude the for the 2nd and 3rd time use(y) gets called the value of y ins "indeterminate[d]", as the initialisation of y is not "reached".

goto statement in C

#include<stdio.h>
int main()
{
int i = 10;
printf("0 i %d %p\n",i,&i);
if (i == 10)
goto f;
{
int i = 20;
printf("1 i %d\n",i);
}
{
int i = 30;
f:
printf("2 i %d %p\n",i,&i); //statement X
}
return 0;
}
Output:
[test]$ ./a.out
0 i 10 0xbfbeaea8
2 i 134513744 0xbfbeaea4
I have difficulty in understanding how statement X works?? As you see the output it is junk. It should rather say i not declared??
That's because goto skips the shadowing variable i's initialization.
This is one of the minor nuances of the differences between C and C++. In strict C++ go to crossing variable initialization is an error, while in C it's not. GCC also confirms this, when you compile with -std=c11 it allows while with std=c++11 it complains: jump to label 'f' crosses initialization of 'int i'.
From C99:
A goto statement shall not jump from outside the scope of an identifier having a variably modified type to inside the scope of that identifier.
VLAs are of variably modified type. Jumps inside a scope not containing VM types are allowed.
From C++11 (emphasis mine):
A program that jumps from a point where a variable with automatic storage duration is not in scope to a point where it is in scope is ill-formed unless the variable has scalar type, class type with a trivial default constructor and a trivial destructor, a cv-qualified version of one of these types, or an array of one of the preceding types and is declared without an initializer.
From the output, it is clear that the address of 'i's are unique, since they are declared in different scopes.
0 i 10 0xbfbeaea8
2 i 134513744 0xbfbeaea4
how statement X works?? As you see the output it is junk. It should
rather say I not declared??
i is also declared in the local scope of statement x but the initialization of i to 30 is skipped because of goto statement. Therefore the local variable i contains a garbage value.
In the first printf statement, you accessed the i in address 0xbfbeaea8 which was declared and initialized in the statement int i = 10;
Once you hit the goto f; statement, you are in the scope of the 2nd i, which is declared at this point and resides in address 0xbfbeaea4 but which is not initialized as you skipped the initialization statement.
That's why you were getting rubbish.
When control reaches the third block, i is declared for the compiler, hence i represents some memory address therefore compiler tries to read it again. But since i has now become out-of-scope, you cannot be sure that it will contain the same value what it originally had.
My suggestion to understand somewhat complex code is to strip out, one by one, all "unnecessary" code and leave the bare problem. How do you know what's unnecessary? Initially, when you're not fluent with the language, you'll be removing parts of the code at random, but very quickly you'll learn what's necessary and what is not.
Give it a try: my hint is to start removing or commenting out the "goto" statement. Recompile and, if there are no errors, see what changed when you run the program again.
Another suggestion would be: try to recreate the problem "from scratch": imagine you are working on a top-secret project and you cannot show any single line of code to anyone, let alone post on Stack Overflow. Now, try to replicate the problem by rewriting equivalent source code, that would show the same behaviour.
As they say, "asking the right question is often solving half the problem".
The i you print in this printf("2 i %d %p\n",i,&i); statement, is not the i which value was 10 in if statement, and as you skip this int i = 30; statement with goto you print garbage. This int i = 30; is actual definition of the i that would be printed, i.e. where compiler allocates room and value of i.
The problem is that your goto is skipping the assignment to the second i, which shadows (conceals) the first i whose value you've set, so you're printing out an uninitialized variable.
You'll get a similar wrong answer from this:
#include<stdio.h>
int main()
{
int i = 10; /* First "i" */
printf("0 i %d %p\n",i,&i);
{ /* New block scope */
int i; /* Second "i" shadows first "i" */
printf("2 i %d %p\n",i,&i);
}
return 0;
}
Three lessons: don't shadow variables; don't create blocks ({ ... }) for no reason; and turn on compiler warnings.
Just to clarify: variable scope is a compile-time concept based on where variables are declared, not something that is subject to what happens at runtime. The declaration of i#2 conceals i#1 inside the block that i#2 is declared in. It doesn't matter if the runtime control path jumps into the middle of the block — i#2 is the i that will be used and i#1 is hidden (shadowed). Runtime control flow doesn't carry scope around in a satchel.

Safe to pass pointer to auto variable to function?

Suppose I have a function that declares and initializes two local variables – which by default have the storage duration auto. This function then calls a second function, to which it passes the addresses of these two local variables. Can this second function safely use these pointers?
A trivial programmatic example, to supplement that description:
#include <stdio.h>
int adder(int *a, int *b)
{
return *a + *b;
}
int main()
{
auto int a = 5; // `auto' is redundant; included for clarity
auto int b = 3;
// adder() gets the addresses of two auto variables! is this an issue?
int result = adder(&a, &b);
printf("5 + 3 = %d\n", result);
return 0;
}
This program works as expected, printing 5 + 3 = 8.
Usually, when I have questions about C, I turn to the standard, and this was no exception. Specifically, I checked ISO/IEC 9899, §6.2.4. It says there, in part:
4
An object whose identifier is declared with no linkage and without
the storage-class specifier static has automatic storage duration.
5
For such an object that does not have a variable length array type,
its lifetime extends from entry into the block with which it is
associated until execution of that block ends in any way. (Entering an
enclosed block or calling a function suspends, but does not end,
execution of the current block.) If the block is entered recursively,
a new instance of the object is created each time. The initial value
of the object is indeterminate. If an initialization is specified for
the object, it is performed each time the declaration is reached in
the execution of the block; otherwise, the value becomes indeterminate
each time the declaration is reached.
Reading this, I reason the following points:
Variables a and b have storage duration auto, which I've made explicit using the auto keyword.
Calling the adder() function corresponds to the parenthetical in clause 5, in the partial quote above. That is, entering the adder() function "suspends, but does not end," the execution of the current block (which is main()).
Since the main() block is not "end[ed] in any way," storage for a and b is guaranteed. Thus, accessing them using the addresses &a and &b, even inside adder(), should be safe.
My question, then, is: am I correct in this? Or am I just getting "lucky," and accessing memory locations that, by happenstance, have not been overwritten?
P.S. I was unable to find an exact answer to this question through either Google or SO's search. If you can, mark this as a duplicate and I'll delete it.
Yes, it is safe and basically your assumptions are correct. The lifetime of an automatic object is from the entry in the block where it has been declared until the block terminates.
(C99, 6.2.4p5) "For such an object [...] its lifetime extends from entry into the block with which it is associated until execution of that block ends in any way.
Your reasoning is correct for your particular function call chain, and you have read and quoted the relevant portions of the standard. This is a perfectly valid use of pointers to local variables.
Where you have to be wary is if the function stores the pointer values in a structure that has a lifetime longer than its own call. Consider two functions, foo(), and bar():
int *g_ptr;
void bar (int *p) {
g_ptr = p;
}
void foo () {
int x = 10;
bar(&x);
}
int main () {
foo ();
/* ...do something with g_ptr? */
return 0;
}
In this case, the variable xs lifetime ends with foo() returns. However, the pointer to x has been stored in g_ptr by bar(). In this case, it was an error for foo() to pass a pointer to its local variable x to bar().
What this means is that in order to know whether or not it is valid to pass a pointer to a local variable to a function, you have to know what that function will do with it.
Those variables are allocated in the stack. As long as you do not return from the function that declared them, they remain valid.
As I'm not yet allowed to comment, I'd rather write another answer as amendment to jxh's answer above:
Please see my elaborate answer here for a similar question. This contains a real world example where the aliasing in the called function makes your code break even though it follows all the c-language rules.
Even though it is legal in the C-language I consider it as harmful to pass pointers to automatic variables in a function call. You never know (and often you don't want to know) what exactly the called function does with the passed values. When the called function establishes an alias, you get in big trouble.

Within the same function, is it UB to access—through indirection—a local variable not in scope?

After the second closing brace, b is accessible only through indirection through a.
int main() {
int *a;
{
int b = 42;
a = &b;
}
printf("%d", *a); // UB?
return 0;
}
Since b is not anymore in scope, is this UB? I know it's UB to dereference a pointer to a non-static local variable from a function that has already returned, but in this case everything is within the same function.
This is UB in C++, but I'm not sure about C.
Yes, it's undefined behaviour to access any variable that has reached it's end of life. Scope and storage duration are subtly different here. Scope is more "when is the variable identifier visible?" and storage duration is "when does the variable itself exist?".
You can have things in scope and enduring such as:
int main (void) {
int spoon = 42;
// spoon is both in scope and enduring here
return 0;
}
Or out of scope but enduring:
int main (void) {
int *pSpoon;
{
static int spoon = 42;
pSpoon = &spoon;
}
// spoon is out of scope but enduring here (use *pSpoon to get to it)
return 0;
}
You can also have variables out of scope and not enduring, such as with:
int main (void) {
// spoon is neither in scope nor enduring here ("there is no spoon")
return 0;
}
In fact, the only thing you can't have is a variable in scope but not enduring. The identifier is tied to the the storage since it makes little sense allowing a variable with no backing storage.
I'm not talking about pointers here, that's an extra level of indirection - an in-scope pointer variable always has storage for the pointer value itself, even if the thing it points to has ended, or not yet begun, its storage duration.
The fact that undefined behaviour may work in some situations in no way makes the behaviour defined, in fact that's one of the most annoying features of undefined behaviour in that it sometimes does work. Otherwise it would be much easier to detect.
In this particular case, the b variable storage duration ends at the inner closing brace so trying to access it after that point is not wise.
The controlling part of the standard is c11 6.2.4 Storage duration of objects (slightly paraphrased to remove unnecessary bits):
An object has a storage duration that determines its lifetime. There are four storage
durations: static, thread, automatic, and allocated.
An object whose identifier is declared with no linkage and without the storage-class specifier static has automatic storage duration.
For such an object, its lifetime extends from entry into the block with which it is associated until execution of that block ends in any way.
b is still within a different, nested, scope that has exited and thus destroyed b by the time you access it. So it IS undefined behavior. Everything being within the same function is immaterial.
For the purposes of variable lifetimes, you can think of "functions that have returned" as nested scopes, or vice versa.

Resources