I have this code that I'm using for something else, but I've boiled it down to what I think is the root problem. If I enter 5 at the scanf prompt when I run it, the printf output is 0, 16. I don't understand why this gives me 16 for *pScores.
#include <stdio.h>
int main(void) {
    int a=0;
    int sum=0;
    scanf("%d",&a);
    int scores[a];
    int *pScores = &scores[0];
    printf("%d, %d\n",scores[0],*pScores);
}
You are declaring an array
int scores[a];
and then printing out the value of scores[0] in two different ways. However, you have not stored anything into any of the elements of the scores array, so the values there are indeterminate.
Whether use of uninitialized (and therefore indeterminate) values in this way actually rises to the level of Undefined Behavior is a surprisingly deep and actually somewhat contentious question. (See the comment thread raging at the other answer.) Nevertheless, printing an uninitialized value like this isn't terribly useful. If you store a well-defined value in at least scores[0], I believe you'll find that both scores[0] and *pScores will print that same value.
Now, you might expect that the uninitialized value — whatever it is — would at least be consistent no matter how you print it (and I might agree with you), but when it comes to gray areas like this, and especially when a modern compiler starts leveraging every nuance of the rules in performing aggressive optimizations, the end results can be pretty surprising. When I tried your program, I got the same number printed twice (that is, I couldn't initially reproduce your result), but as suggested by Barmar in a comment, when I turned on optimization (with -O3), I started seeing conflicting results, also.
You have undefined behavior, caused by reading a variable with automatic storage duration whose value is indeterminate.
In 6.2.4 Storage durations of objects one finds the following rule
For such an object that does have a variable length array type, its lifetime extends from the declaration of the object until execution of the program leaves the scope of the declaration. If the scope is entered recursively, a new instance of the object is created each time. The initial value of the object is indeterminate.
Then in J.2 Undefined behavior:
The behavior is undefined in the following circumstances
...
The value of an object with automatic storage duration is used while it is indeterminate.
...
Among permitted very weird outcomes when dealing with indeterminate values is that they have a different value each time you read them. The Schroedinger wavefunction does not collapse!
Related
#include <stdio.h>
int main()
{
    int a;
    const int b = a;
    printf("%d %d\n", a, b);
    return 0;
}
I ran the same code on the onlinegdb.com compiler and on Ubuntu under WSL. On onlinegdb.com, both a and b were 0 on every run, whereas on WSL they were garbage values. I am not able to understand why garbage value is not coming on onlinegdb.com.
Using int a; inside a function is described by C 2018 6.2.4 5:
An object whose identifier is declared with no linkage and without the storage-class specifier static has automatic storage duration, as do some compound literals…
Paragraph 6 continues:
… The initial value of the object is indeterminate…
An indeterminate value is not an actual specific value but is a theoretical state used to describe the semantics of C. 3.19.2 says it is:
… either an unspecified value or a trap representation…
and 3.19.3 says an unspecified value is:
… valid value of the relevant type where this document imposes no requirements on which value is chosen in any instance…
That “any instance” part means that the program may behave as if a has a different value each time it is used. For example, these two statements may print different values for a:
printf("%d\n", a);
printf("%d\n", a);
Regarding const int b = a;, this is not covered explicitly by the C standard, but I have seen a committee response: When an indeterminate value is assigned (or initialized into) another object, the other object is also said to have an indeterminate value. So, after this declaration, b has an indeterminate value. The const is irrelevant; it means the source code of the program is not supposed to change b, but it cannot remedy the fact that b does not have a determined value.
Since the C standard permits any value to be used in each instance, onlinegdb.com conforms when it prints zero, and WSL conforms when it prints other values. Any int values printed for printf("%d %d\n", a, b); conform to the C standard.
Further, another provision in the C standard actually renders the entire behavior of the program undefined. C 2018 6.3.2.1 2 says:
… If the lvalue designates an object of automatic storage duration that could have been declared with the register storage class (never had its address taken), and that object is uninitialized (not declared with an initializer and no assignment to it has been performed prior to use), the behavior is undefined.
This applies to a: It could have been declared register because its address is never taken, and it is not initialized or assigned a value. So, using a has undefined behavior, and that extends to the entire behavior of the program on the code branch where a is used, which is the entire program execution.
I am not able to understand why garbage value is not coming
This is a very strange statement. I wonder: what kind of answer or explanation do you expect you might get? Something like:
Everyone who said that "uninitialized local variables start out containing random values" lied to you. WSL was wrong for giving you random values. You should have gotten 0, like you did with onlinegdb.com.
onlinegdb.com is buggy. It should have given truly random values.
The rules for const variables are special. When you say const int b = a;, it magically makes a's uninitialized value more predictable.
Are you expecting to get an answer like any of those? Because, no, none of those is true, none of those can possibly be true.
I'm sorry if it sounds like I'm teasing you here. I agree, it's surprising at first if an uninitialized local variable always starts out containing 0, because that's not very random, is it?
But the point is, the value of an uninitialized local variable is not defined. It is unspecified, indeterminate, and/or undefined. You cannot know what it is going to be. But that means that no value — no possible value — that it contains can ever be "wrong". In particular, onlinegdb.com is not wrong for not giving you random values: remember, it's not obligated to give you anything!
Think about it like this. Suppose you buy a carton of milk. Suppose it's printed on the label, "Contents may spoil — keep refrigerated." Suppose you leave the carton of milk on the counter overnight. That is, suppose you fail to properly refrigerate it. Suppose that, a day later, you realize your mistake. Horrified, you carefully open the milk carton and take a small taste, to see if it has spoiled. But you got lucky! It's still okay! It didn't spoil!
Now, at this point, what do you do?
Hastily put the milk in the refrigerator, and vow to be more careful next time.
March back to the store where you bought the milk, and accuse the shopkeeper of false advertising: "The label says 'contents may spoil', but it didn't!!" What do you think the shopkeeper is going to say?
This may seem like a silly analogy, but really, it's just like your C/C++ coding situation. The rules say you're supposed to initialize your local variables before you use them. You failed. Yet, somehow, you got predictable values anyway, at least under one compiler. But you can't complain about this, because it's not causing you a problem. And you can't depend on it, because as your experience with the other compiler showed you, it's not a guaranteed result.
Typically, local variables are stored on the stack. The stack gets used for all sorts of stuff. When you call a function, it receives a new stack frame where it can store its local variables, and where other stuff pertaining to that function call is stored, too. When a function returns, its stack frame is popped, but the memory is typically not cleared. That means that, when the next function is called, its newly-allocated stack frame may end up containing random bits of data left over from the previous function whose stack frame happened to occupy that part of stack memory.
So the question of what value an uninitialized local variable contains ends up depending on what the previous function might have been and what it might have left lying around on the stack.
In the case of main, it's quite possible that since it's the first function to be called, and the stack might start out empty, that main's stack frame always ends up being built on top of virgin, untouched, all-0 memory. That would mean that uninitialized variables in main might always seem to start out containing 0.
But this is not, not, not, not, not guaranteed!!!
Nobody said the stack was guaranteed to start out containing 0. Nobody said that there wasn't some startup code that ran before main that might have left some random garbage lying around on the stack.
If you want to enumerate possibilities, I can think of 3:
1. The function you're wondering about is one that always gets called first, or always gets called at a "deep leaf" of the call stack, meaning that it always gets a brand-new stack frame, and on a machine where the stack always starts out containing 0. Under these circumstances, uninitialized variables might always seem to start out containing 0.
2. The function you're wondering about does not always get a brand-new stack frame. It always gets a "dirty" stack frame with some previous function's random data lying around, and the program is such that, during every run, that previous function was doing something different and left something different on the stack, such that the next function's uninitialized local variables always seem to start out containing different, seemingly random values.
3. The function you're wondering about is always called right after a previous function that always does the same thing, meaning that it always leaves the same values lying around, meaning that the next function's uninitialized local variables always seem to start out with the same values every time, which aren't zero but aren't random.
But I hope it's obvious that you absolutely can't depend on any of this! And of course there's no reason to depend on any of this. If you want your local variables to have predictable values, you can simply initialize them. But if you're curious what happens when you don't, hopefully this explanation has helped you understand that.
Also, be aware that the explanation I've given here is somewhat of a simplification, and is not complete. There are systems that don't use a conventional stack at all, meaning that none of those possibilities 1, 2, or 3 could apply. There are systems that deliberately randomize the stack every time, either to help new programs not to accidentally become dependent on uninitialized variables, or to make sure that attackers can't exploit certain predictable results of a badly written program's undefined behavior.
When your operating system gives your program memory to work with, it will likely be zeroed to start (though this is not guaranteed). As your program calls functions it creates stack frames, and your program effectively goes from the _start assembly routine to the int main() C function, so when main is called, no earlier stack frame has written to the memory where its local variables are placed. Therefore, a and b are both likely to be 0 (and b will usually, though not necessarily, match a). However, they are not guaranteed to be 0, especially if you first call some functions that have local variables or lots of parameters. For instance, if your code was instead
#include <stdio.h>
void foo()
{
    int x = 42;
}
int main()
{
    foo();
    int a;
    const int b = a;
    printf("%d %d\n", a, b);
    return 0;
}
then a would PROBABLY have the value 42 (in unoptimized builds), but that would depend on the ABI (https://en.wikipedia.org/wiki/Application_binary_interface) that your compiler uses and probably a few other things.
Basically, don't do that.
I have a question regarding how uninitialized variables work in C. If I declare a variable and then print it, the program should print a random value, however my program almost always outputs 0. If I try to declare a second variable, the program always outputs 16 as the new assigned value.
#include <stdio.h>
int main(){
    int x,y,z,t;
    printf("%d %d",x,y);
}
This program outputs 0 and 16, however if I add the following line of code: y--; right after declaring the variables but before printing them, the program outputs x=15 and y =-1, totally different values from what I had before. Why does this happen? How does C handle uninitialized variables?
According to the standard (draft N1570) the following applies:
6.2.4
The initial value of the object is indeterminate.
This is further described in
6.7.9
If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate.
The definition of "indeterminate" says
3.19.2
indeterminate value
either an unspecified value or a trap representation
And finally
J.2 Undefined behavior
The behavior is undefined in the following circumstances:
...
The value of an object with automatic storage duration is used while it is indeterminate (6.2.4, 6.7.9, 6.8).
So all together: there is no way to know what value you get when reading an uninitialized variable, and further, it's not even certain that your program will continue to execute, as the value may be a trap representation.
One way of answering this is that when you read that "uninitialized variables start out containing random values", in this context, those values 0, 16, 15, and -1 that you saw are "random".
Now, it's true, they're certainly not random in the way that, say, the roll of a die is random. They seem to stay the same, then change for no seemingly sensible reason, like when you change some other part of your program.
But you obviously can't predict them. So you obviously can't use them, or depend on them in any way. They might as well be truly random.
Why do they have the specific values that they do? It's almost impossible to say. Whatever the reasons are, they tend not to be very interesting. Whatever the reasons are, they're usually not important to know, since in any sane program you're never going to be depending on those uninitialized values anyway.
It's kind of like asking, "I sawed three legs off my kitchen table. Everything fell to the floor and broke. Except one plate didn't break. Why didn't it break? Also there was an apple, and it rolled into the living room. I expected it would roll down the basement stairs. Why did it roll into the living room?" Most of the time, the answer is simply: "Don't saw legs off your table in the first place!"
The other part of your question had to do with the fact that your uninitialized variable y seemed to start out containing the value 16, but when you added the statement y--, the value went to -1. What's up with that?
And the answer is that we're talking about uninitialized variables and unpredictable behavior. When you fail to initialize a variable, meaning that your program has unpredictable behavior, what does not happen is that some oracle somewhere picks an initial value for your variable, and then arranges for everything else about your program to be predictable. No, what it means for your program to be unpredictable is that you can't predict it! If the unpredictable value seemed to be 16 yesterday, you cannot predict that subtracting 1 from it will give you 15 tomorrow.
(To continue my silly analogy, if you were to saw the legs off your kitchen table again tomorrow, you would not expect that you'd necessarily end up with exactly one unbroken plate again. And if you sawed the legs off in a different order, hoping for the table to fall at a different angle and direct the apple down the basement stairs like you expected, you probably wouldn't be too disappointed if it didn't work.)
If you really want to know what's going on, part of the answer is that local variables are typically stored on the stack, and they start out containing whatever "random" bit pattern was left in that location on the stack by whatever function last used that piece of the stack for its own variables. So anything that moves things around on the stack, that causes your uninitialized variable to occupy a different spot in the stack frame, or that causes the previous function to leave different garbage on the stack, will cause your uninitialized variable to start out holding a different "random" value. See this other answer for some more discussion on what happens when you try to use uninitialized stack-based variables.
See also xkcd 221.
I am learning about scoping of variable in C.
Can anyone please explain what is going on below?
int w;
printf("\nw=%d\n", w);
w =-1;
Despite the fact that I initialized variable 'w' after 'printf', it always gets the value of "-1". This confused me, as I expect it to run sequentially. Hence, it should have printed some random value.
*** I also tried changing the value there, and it always read the written value. Hence, it did not randomly show "-1"
For experiment, I again tried the code below.
int w;
printf("\nw=%d\n", w);
w =-9;
w =-1;
Now it reads a value of "2560", as I would expect, since it was not properly initialized beforehand.
In your code
int w;
printf("\nw=%d\n", w);
invokes undefined behavior as you're trying to read the value of an uninitialized (automatic local) variable. The content of w is indeterminate at this point, and the output result is, well, undefined.
Always initialize your local variable before reading (using) the value.
Related: Quoting C11, chapter §6.7.9, Initialization
If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate. [....]
and, related to Undefined behavior, annex §J.2
The value of an object with automatic storage duration is used while it is indeterminate
The variable is uninitialized. In C, this means its value is indeterminate. In reality, the variable generally gets a value based on what's "lying around" at the memory address to which it gets assigned. In this case, it's some value left on the stack.
It just so happens that often you will get consistent results across multiple runs simply due to external factors on which a program should not rely.
The compiler is optimizing the assignment of w in the first case. In the second case, it is deciding not to optimize.
In both cases, the compiler could choose to optimize out both assignments, since w is not used after they appear.
Initialize your variables before using them.
In both the above cases
int w;
printf("\nw=%d\n", w);
returns a random garbage value as we might call it which could be anything including -1 or 2560.
When you do not initialize a variable, it can contain a garbage value. Hence it's undefined behaviour, and in most cases it will print random numbers, as you experienced. By the way, as pointed out by others, it's up to the compiler, so it may print the expected value or it may not.
I'm reading this tutorial about debugging. I pasted the factorial code into my .c file:
#include <stdio.h>
int main()
{
    int i, num, j;
    printf ("Enter the number: ");
    scanf ("%d", &num );
    for (i=1; i<num; i++)
        j=j*i;
    printf("The factorial of %d is %d\n",num,j);
}
When I run the executable, it always prints 0; however, the author of the tutorial says that it returns garbage values. I've googled this and read that this is right, except for static variables. So it should return a garbage number instead of 0.
I thought that this might be due to a different version of C, but the guide is from 2010.
Why do I always see 0, instead of a garbage value?
Both the C99 draft standard and the C11 draft standard say the value of an uninitialized automatic variable is indeterminate. The draft C99 standard, section 6.2.4 Storage durations of objects, paragraph 5, says (emphasis mine):
For such an object that does not have a variable length array type, its lifetime extends from entry into the block with which it is associated until execution of that block ends in any way. (Entering an enclosed block or calling a function suspends, but does not end, execution of the current block.) If the block is entered recursively, a new instance of the object is created each time. The initial value of the object is indeterminate. If an initialization is specified for the object, it is performed each time the declaration is reached in the execution of the block; otherwise, the value becomes indeterminate each time the declaration is reached.
the draft standard defines indeterminate as:
either an unspecified value or a trap representation
and an unspecified value is defined as:
valid value of the relevant type where this International Standard imposes no requirements on which value is chosen in any instance
so the value can be anything. It can vary with the compiler and the optimization settings, and it can even vary from run to run. It cannot be relied upon, and thus any program that uses an indeterminate value is invoking undefined behavior.
The standard says this is undefined in one of the examples in section 6.5.2.5 Compound literals paragraph 17 which says:
Note that if an iteration statement were used instead of an explicit goto and a labeled statement, the lifetime of the unnamed object would be the body of the loop only, and on entry next time around p would have an indeterminate value, which would result in undefined behavior.
this is also covered in Annex J.2 Undefined behavior:
The value of an object with automatic storage duration is used while it is indeterminate (6.2.4, 6.7.8, 6.8).
In some very specific cases you can make some predictions about such behavior, the presentation Deep C goes into some of them. These types of examination should only be used as a tool to further understand how systems work and should never even come close to a production system.
You need to initialize j to 1. If j happens to be zero, the answer will always be zero (one type of garbage). If j happens to be non-zero, you'll get different garbage. Using uninitialized variables is undefined behaviour; 'undefined' does not exclude always being zero in the tests you've done so far.
Some systems hand out memory that is set to 0 (Mac OS, for example), so your variable will often contain 0 when you leave it uninitialised, but relying on that is bad practice that will lead to unstable results.
You can't say what should happen in this case because the language specification doesn't say what should happen. In fact it says that the values of uninitialised non-static variables are indeterminate.
That means they can be any value. They can be different values on different runs of your program, or when your code is compiled on a different compiler, or when compiled on the same compiler with different optimisation settings. Or on different days of the week, national holidays or after 6pm.
An uninitialised variable can even hold what's called a trap representation, which is a value which is not valid for that type. If you access such a value then you're into the scary world of undefined behaviour where literally anything can happen.
I thought that once a function returns, all the local variables declared within (barring those with static keyword) are garbage collected. But when I am trying out the following code, it still prints the value after the function has returned. Can anybody explain why?
int *fun();
main() {
    int *p;
    p = fun();
    printf("%d",*p); //shouldn't print 5, for the variable no longer exists at this address
}
int *fun() {
    int q;
    q = 5;
    return(&q);
}
There's no garbage collection in C. Once the lifetime of a variable has ended, accessing it by any means is invalid. What you see is undefined behaviour (UB).
It's undefined behavior, anything can happen, including appearing to work. The memory probably wasn't overwritten yet, but that doesn't mean you have the right to access it. Yet you did! I hope you're happy! :)
If you really want it to lose the value, perhaps call another function with at least a few lines of code in it before the printf that accesses the location. Most probably your value will have been overwritten by then.
But again as mentioned already this is undefined behavior. You can never predict when (or if at all) it crashes or changes. But you cannot rely upon it 'changing or remaining the same' and code an application with any of these assumptions.
What I am trying to illustrate is that when you make another function call after returning from the previous one, another activation record is pushed onto the stack, most likely overwriting the previous one, including the variable whose value you were accessing via the pointer.
Nobody is actually garbage collecting, or doing, say, a memset to 0, once a function and its data go out of scope.
C doesn't support garbage collection the way Java does.
Logically, q ceases to exist when fun exits.
Physically (for suitably loose definitions of "physical"), the story is a bit more complicated, and depends on the underlying platform. C does not do garbage collection (not that garbage collection applies in this case). That memory cell (virtual or physical) that q occupied still exists and contains whatever value was last written to it. Depending on the architecture / operating system / whatever, that cell may still be accessible by your program, but that's not guaranteed:
6.2.4 Storage durations of objects
2 The lifetime of an object is the portion of program execution during which storage is guaranteed to be reserved for it. An object exists, has a constant address,33) and retains its last-stored value throughout its lifetime.34) If an object is referred to outside of its lifetime, the behavior is undefined. The value of a pointer becomes indeterminate when the object it points to (or just past) reaches the end of its lifetime.
33) The term ‘‘constant address’’ means that two pointers to the object constructed at possibly different times will compare equal. The address may be different during two different executions of the same program.
34) In the case of a volatile object, the last store need not be explicit in the program.
"Undefined behavior" is the C language's way of dealing with problems by not dealing with them. Basically, the implementation is free to handle the situation any way it chooses to, up to ignoring the problem completely and letting the underlying OS kill the program for doing something naughty.
In your specific case, accessing that memory cell after fun had exited didn't break anything, and it had not yet been overwritten. That behavior is not guaranteed to be repeatable.