Behaviour of Uninitialized Variables in C

I have a question regarding how uninitialized variables work in C. If I declare a variable and then print it, the program should print a random value, however my program almost always outputs 0. If I try to declare a second variable, the program always outputs 16 as the new assigned value.
#include <stdio.h>
int main(){
int x,y,z,t;
printf("%d %d",x,y);
}
This program outputs 0 and 16, however if I add the following line of code: y--; right after declaring the variables but before printing them, the program outputs x=15 and y =-1, totally different values from what I had before. Why does this happen? How does C handle uninitialized variables?

According to the standard (draft N1570) the following applies:
6.2.4
The initial value of the object is indeterminate.
This is further described in
6.7.9
If an object that has automatic storage duration is not initialized explicitly, its value is
indeterminate.
The definition of "indeterminate" says
3.19.2
indeterminate value
either an unspecified value or a trap representation
And finally
J.2 Undefined behavior
The behavior is undefined in the following circumstances:
...
The value of an object with automatic storage duration is used while it is
indeterminate (6.2.4, 6.7.9, 6.8).
So, all together: there is no way to know what value you get when reading an uninitialized variable, and furthermore it is not even guaranteed that your program will continue to execute, since the value may be a trap representation.

One way of answering this is that when you read that "uninitialized variables start out containing random values", in this context, those values 0, 16, 15, and -1 that you saw are "random".
Now, it's true, they're certainly not random in the way that, say, the roll of a die is random. They seem to stay the same, then change for no seemingly sensible reason, like when you change some other part of your program.
But you obviously can't predict them. So you obviously can't use them, or depend on them in any way. They might as well be truly random.
Why do they have the specific values that they do? It's almost impossible to say. Whatever the reasons are, they tend not to be very interesting. Whatever the reasons are, they're usually not important to know, since in any sane program you're never going to be depending on those uninitialized values anyway.
It's kind of like asking, "I sawed three legs off my kitchen table. Everything fell to the floor and broke. Except one plate didn't break. Why didn't it break? Also there was an apple, and it rolled into the living room. I expected it would roll down the basement stairs. Why did it roll into the living room?" Most of the time, the answer is simply: "Don't saw legs off your table in the first place!"
The other part of your question had to do with the fact that your uninitialized variable y seemed to start out containing the value 16, but when you added the statement y--, the value went to -1. What's up with that?
And the answer is that we're talking about uninitialized variables and unpredictable behavior. When you fail to initialize a variable, meaning that your program has unpredictable behavior, what does not happen is that some oracle somewhere picks an initial value for your variable, and then arranges for everything else about your program to be predictable. No, what it means for your program to be unpredictable is that you can't predict it! If the unpredictable value seemed to be 16 yesterday, you cannot predict that subtracting 1 from it will give you 15 tomorrow.
(To continue my silly analogy, if you were to saw the legs off your kitchen table again tomorrow, you would not expect that you'd necessarily end up with exactly one unbroken plate again. And if you sawed the legs off in a different order, hoping for the table to fall at a different angle and direct the apple down the basement stairs like you expected, you probably wouldn't be too disappointed if it didn't work.)
If you really want to know what's going on, part of the answer is that local variables are typically stored on the stack, and they start out containing whatever "random" bit pattern was left in that location on the stack by whatever function last used that piece of the stack for its own variables. So anything that moves things around on the stack, that causes your uninitialized variable to occupy a different spot in the stack frame, or that causes the previous function to leave different garbage on the stack, will cause your uninitialized variable to start out holding a different "random" value. See this other answer for some more discussion on what happens when you try to use uninitialized stack-based variables.
See also xkcd 221.

Related

In C, I'm struggling with Pointers

I have this code that I'm using for something else, but I think I have boiled it down to the root problem. If I enter 5 for the scanf variable when I run it, the printf output is 0, 16. I don't understand why this is giving me 16 for *pScores.
#include <stdio.h>
int main(void) {
int a=0;
int sum=0;
scanf("%d",&a);
int scores[a];
int *pScores = &scores[0];
printf("%d, %d\n",scores[0],*pScores);
}
You are declaring an array
int scores[a];
and then printing out the value of scores[0] in two different ways. However, you have not stored anything into any of the elements of the scores array, so the values there are indeterminate.
Whether use of uninitialized (and therefore indeterminate) values in this way actually rises to the level of Undefined Behavior is a surprisingly deep and actually somewhat contentious question. (See the comment thread raging at the other answer.) Nevertheless, printing an uninitialized value like this isn't terribly useful. If you fill in a well-defined value to at least scores[0], I believe you'll find that both scores[0] and *pScores will print the same — that same — value.
Now, you might expect that the uninitialized value — whatever it is — would at least be consistent no matter how you print it (and I might agree with you), but when it comes to gray areas like this, and especially when a modern compiler starts leveraging every nuance of the rules in performing aggressive optimizations, the end results can be pretty surprising. When I tried your program, I got the same number printed twice (that is, I couldn't initially reproduce your result), but as suggested by Barmar in a comment, when I turned on optimization (with -O3), I started seeing conflicting results, also.
You have undefined behavior, caused by reading a variable with automatic storage duration whose value is indeterminate.
In 6.2.4 Storage durations of objects one finds the following rule
For such an object that does have a variable length array type, its lifetime extends from the declaration of the object until execution of the program leaves the scope of the declaration. If the scope is entered recursively, a new instance of the object is created each time. The initial value of the object is indeterminate.
Then in J.2 Undefined behavior:
The behavior is undefined in the following circumstances
...
The value of an object with automatic storage duration is used while it is indeterminate.
...
Among permitted very weird outcomes when dealing with indeterminate values is that they have a different value each time you read them. The Schroedinger wavefunction does not collapse!

What value does a const variable take when it's initialized by a non-const and non-initialized variable?

#include <stdio.h>
int main()
{
int a;
const int b = a;
printf("%d %d\n", a, b);
return 0;
}
I ran the same code on the onlinegdb.com compiler and on Ubuntu under WSL. On onlinegdb.com, both a and b were 0 on every run, whereas on WSL they held garbage values. I am not able to understand why garbage value is not coming on onlinegdb.com.
Using int a; inside a function is described by C 2018 6.2.4 5:
An object whose identifier is declared with no linkage and without the storage-class specifier static has automatic storage duration, as do some compound literals…
Paragraph 6 continues:
… The initial value of the object is indeterminate…
An indeterminate value is not an actual specific value but is a theoretical state used to describe the semantics of C. 3.19.2 says it is:
… either an unspecified value or a trap representation…
and 3.19.3 says an unspecified value is:
… valid value of the relevant type where this document imposes no requirements on which value is chosen in any instance…
That “any instance” part means that the program may behave as if a has a different value each time it is used. For example, these two statements may print different values for a:
printf("%d\n", a);
printf("%d\n", a);
Regarding const int b = a;, this is not covered explicitly by the C standard, but I have seen a committee response: When an indeterminate value is assigned (or initialized into) another object, the other object is also said to have an indeterminate value. So, after this declaration, b has an indeterminate value. The const is irrelevant; it means the source code of the program is not supposed to change b, but it cannot remedy the fact that b does not have a determined value.
Since the C standard permits any value to be used in each instance, onlinegdb.com conforms when it prints zero, and WSL conforms when it prints other values. Any int values printed for printf("%d %d\n", a, b); conform to the C standard.
Further, another provision in the C standard actually renders the entire behavior of the program undefined. C 2018 6.3.2.1 2 says:
… If the lvalue designates an object of automatic storage duration that could have been declared with the register storage class (never had its address taken), and that object is uninitialized (not declared with an initializer and no assignment to it has been performed prior to use), the behavior is undefined.
This applies to a: It could have been declared register because its address is never taken, and it is not initialized or assigned a value. So, using a has undefined behavior, and that extends to the entire behavior of the program on the code branch where a is used, which is the entire program execution.
I am not able to understand why garbage value is not coming
This is a very strange statement. I wonder: what kind of answer or explanation do you expect you might get? Something like:
Everyone who said that "uninitialized local variables start out containing random values" lied to you. WSL was wrong for giving you random values. You should have gotten 0, like you did with onlinegdb.com.
onlinegdb.com is buggy. It should have given truly random values.
The rules for const variables are special. When you say const int b = a;, it magically makes a's uninitialized value more predictable.
Are you expecting to get an answer like any of those? Because, no, none of those is true, none of those can possibly be true.
I'm sorry if it sounds like I'm teasing you here. I agree, it's surprising at first if an uninitialized local variable always starts out containing 0, because that's not very random, is it?
But the point is, the value of an uninitialized local variable is not defined. It is unspecified, indeterminate, and/or undefined. You cannot know what it is going to be. But that means that no value — no possible value — that it contains can ever be "wrong". In particular, onlinegdb.com is not wrong for not giving you random values: remember, it's not obligated to give you anything!
Think about it like this. Suppose you buy a carton of milk. Suppose it's printed on the label, "Contents may spoil — keep refrigerated." Suppose you leave the carton of milk on the counter overnight. That is, suppose you fail to properly refrigerate it. Suppose that, a day later, you realize your mistake. Horrified, you carefully open the milk carton and take a small taste, to see if it has spoiled. But you got lucky! It's still okay! It didn't spoil!
Now, at this point, what do you do?
Hastily put the milk in the refrigerator, and vow to be more careful next time.
March back to the store where you bought the milk, and accuse the shopkeeper of false advertising: "The label says 'contents may spoil', but it didn't!!" What do you think the shopkeeper is going to say?
This may seem like a silly analogy, but really, it's just like your C/C++ coding situation. The rules say you're supposed to initialize your local variables before you use them. You failed. Yet, somehow, you got predictable values anyway, at least under one compiler. But you can't complain about this, because it's not causing you a problem. And you can't depend on it, because as your experience with the other compiler showed you, it's not a guaranteed result.
Typically, local variables are stored on the stack. The stack gets used for all sorts of stuff. When you call a function, it receives a new stack frame where it can store its local variables, and where other stuff pertaining to that function call is stored, too. When a function returns, its stack frame is popped, but the memory is typically not cleared. That means that, when the next function is called, its newly-allocated stack frame may end up containing random bits of data left over from the previous function whose stack frame happened to occupy that part of stack memory.
So the question of what value an uninitialized local variable contains ends up depending on what the previous function might have been and what it might have left lying around on the stack.
In the case of main, it's quite possible that since it's the first function to be called, and the stack might start out empty, that main's stack frame always ends up being built on top of virgin, untouched, all-0 memory. That would mean that uninitialized variables in main might always seem to start out containing 0.
But this is not, not, not, not, not guaranteed!!!
Nobody said the stack was guaranteed to start out containing 0. Nobody said that there wasn't some startup code that ran before main that might have left some random garbage lying around on the stack.
If you want to enumerate possibilities, I can think of 3:
1. The function you're wondering about is one that always gets called first, or always gets called at a "deep leaf" of the call stack, meaning that it always gets a brand-new stack frame, and on a machine where the stack always starts out containing 0. Under these circumstances, uninitialized variables might always seem to start out containing 0.
2. The function you're wondering about does not always get a brand-new stack frame. It always gets a "dirty" stack frame with some previous function's random data lying around, and the program is such that, during every run, that previous function was doing something different and left something different on the stack, such that the next function's uninitialized local variables always seem to start out containing different, seemingly random values.
3. The function you're wondering about is always called right after a previous function that always does the same thing, meaning that it always leaves the same values lying around, meaning that the next function's uninitialized local variables always seem to start out with the same values every time, which aren't zero but aren't random.
But I hope it's obvious that you absolutely can't depend on any of this! And of course there's no reason to depend on any of this. If you want your local variables to have predictable values, you can simply initialize them. But if you're curious what happens when you don't, hopefully this explanation has helped you understand that.
Also, be aware that the explanation I've given here is somewhat of a simplification, and is not complete. There are systems that don't use a conventional stack at all, meaning that none of those possibilities 1, 2, or 3 could apply. There are systems that deliberately randomize the stack every time, either to help new programs not to accidentally become dependent on uninitialized variables, or to make sure that attackers can't exploit certain predictable results of a badly written program's undefined behavior.
When your operating system gives your program memory to work with, it will likely be zero to start with (though this is not guaranteed). As your program calls functions it creates stack frames, and your program effectively goes from the _start assembly routine to the int main() C function, so when main is called, it is quite possible that no earlier stack frame has written to the memory where its local variables are placed. Therefore, a and b are both likely to be 0 (and in practice b will usually be the same as a, although the standard does not guarantee even that). However, 0 is not guaranteed, especially if you call some functions that have local variables or lots of parameters. For instance, if your code was instead
#include <stdio.h>
void foo()
{
int x = 42;
}
int main()
{
foo();
int a;
const int b = a;
printf("%d %d\n", a, b);
return 0;
}
then a would PROBABLY have the value 42 (in unoptimized builds), but that would depend on the ABI (https://en.wikipedia.org/wiki/Application_binary_interface) that your compiler uses and probably a few other things.
Basically, don't do that.

why garbage value is stored during declaration?

I heard several times that if you do not initialise a variable then garbage value is stored in it.
Say
int i;
printf("%d",i);
The above code prints some garbage value, but I want to know why a garbage value needs to be stored at all when the variable is uninitialized.
The value of an uninitialized variable is not simply unknown or garbage, but indeterminate, and evaluating that variable may invoke undefined behavior or unspecified behavior.
One possible scenario (which is probably the scenario you are seeing) is that the variable, when evaluated, will return the value that was previously present in that memory address. Therefore, it's not like garbage is explicitly written to that variable.
It's worth noting that languages (or even C implementations) that do not exhibit the behavior you're seeing, do so by explicitly writing zeroes (or other initial values) to that area, before allowing you to use it.
It is not storing garbage, it prints whatever happens to be there in memory at that address when it is running. This is in the name of efficiency. You don't pay for what you didn't ask for.
EDIT
To answer why there is something in memory: all sorts of programs run and need to share memory. When memory is allocated to your process, it is not reset, again for performance reasons. Since the variable we are observing is declared on the stack, it could even be your own program that put the value there in a previous function call.
C only does what you tell it to. The standard defines reading an uninitialized variable as undefined behavior.
This question elaborates: (Why) is using an uninitialized variable undefined behavior?
The accepted answer has a very good explanation.
EDIT:
A funny sidenote though, if you declare the variable static it is guaranteed to be initialized to zero per the standard. Can't find a quote right now, working on it..
EDIT2:
I left my C reference at work and CBA to download one. This answer elaborates on the initial values of variables, whether they be local/auto, global, static or indeterminate: https://stackoverflow.com/a/1597491/700170
The other answers point out (correctly) that what's being printed is whatever's already in memory in the memory location that happens to have been assigned to i.
They don't, however, clarify why there are any values stored in these locations in the first place, which is perhaps what you're really asking.
There are two reasons for this: first, upon startup, we can't be sure exactly how the memory circuits will initialize themselves. So they could be set to any arbitrary value. The second (and, in general, more likely reason, unless you just restarted your computer) is that before you started your program, that memory location had been used by another program, which stored something there--something that wasn't garbage at the time, since it was stored intentionally. From the perspective of your program, however, it is garbage, since your program has no way of knowing why that particular value was stored there.
EDIT: As I mentioned in a comment on another answer, even if the value stored in memory under some uninitialized variable is actually 0, that's not the same thing as "not having a value stored." The value stored is 0, which is to say, the physical hardware that represents one bit of memory is faithfully storing the value 0. As long as a circuit is active (i.e. turned on), the memory cells must store something; for an explanation of why this is, look into flip-flop gates. (There's a decent overview here, assuming you already understand a little bit about NAND gates: http://computer.howstuffworks.com/boolean4.htm)
This happens only for local variables. Memory for local variables is allocated on the stack, and the runtime system does not clear that memory before giving it to the variable. Global and static variables, by contrast, live in static storage, which is zero-initialized before the program starts. Hence the default value of a local variable is whatever its stack memory already contained, while that of global and static variables is 0.

ANSI C (1989): is it possible to init a variable with an OR?

I got some code to analyze and I can't figure out why this expression always gives me the same result, the letter "b" in this example.
unsigned char ucVal2;
ucVal2 |= 0x62;
I always thought that when you don't initialize a variable, its value is non-deterministic.
So, in this case, I was supposing that the value of ucVal2 would have been SOMETHING OR 0x62, but the execution always shows me that ucVal2 is 0x62, as if SOMETHING were always 0x00.
That might just be luck. Or your particular compiler and/or OS might guarantee that memory is initialized to 0, even though the C language itself doesn't.
According to the standard, the value of ucVal2 is undefined and the above code is not portable. There is nothing special about using the |= operator like this that in some way affects the fact that it's undefined.
If ucVal2 is defined at file scope (ie a "global") then it will always be initialized to 0. If it's declared at function scope, then it won't be, and its value will be random.
Non-deterministic is not the same thing as random.
Just because there are no guarantees about the value of an uninitialized variable doesn't mean you cannot reasonably guess what the value is.
Most memory ends up being zero after a computer starts up. Until a process writes to it, it will continue to be zero. When your program declares this uninitialized variable, the value is most likely still zero. But you cannot be assured that it will be zero.
You absolutely cannot rely on the value being zero. Just because the variable starts at zero for one thousand tests in a row, for the thousand-and-first test, it may be something completely different.
Accessing the contents of an uninitialized variable is undefined behavior. Meaning it could be non-deterministic junk, deterministic junk or zero. The program is even free to crash and burn when you access it, although on most systems that is unlikely.
Think of it as blindly sticking your hand into a public garbage bin: it could be empty, it could contain more or less disgusting garbage, or there could be a gigantic rat inside biting your hand off. There is no guarantee of what you will come up with.
Trying to understand why a certain kind of undefined behavior gives a certain result isn't particularly meaningful, it usually yields little or no useful knowledge.
If we are to guess, then the junk you have there on the stack is a left-over from some program pre-initialization code, or from previously executed functions. In such a case, the junk will likely be deterministic.

Why local variable is initialized to zero

As far as I know, local variables are uninitialized, i.e. they contain garbage values.
But the following program gives 0 (zero) as output.
#include <stdio.h>
int main(void)
{
int i; /* never initialized */
printf("%d\n",i);
}
When I run the above program, it always gives 0.
I know that 0 can also be a garbage value, but I get zero as output every time.
Does anybody know the reason for it?
Garbage value means whatever happened to be in that memory location. In your case, the value happened to be zero. It may not be the case in another machine.
Note that some compilers fill uninitialized variables with a magic value for debugging purposes (like 0xA5A5), but it's usually not zero, either.
When I run the above program, it always gives 0. I know that 0 can also be a garbage value, but I get zero as output every time.
Whatever happened to cause a 0 to be written into the location where i is now probably happens every time the program runs. Computers are nice and reliable like that. "garbage" doesn't necessarily mean "random" or "always changing," it just means "not meaningful in any context that I care about."
I think it is just an accident. The local variable is indeed uninitialized, but the memory your compiler allocated for the variable (int i) has not been used previously by your current process, so there is no garbage value in it.
Luck! The behaviour is undefined, and so the answer depends on your compiler and system. This time, you happened to get lucky that the first four bytes in that area of memory were zero. But there is no guarantee that it will always do so, from one system to the next or even from one invocation to the next.
A possible reason for always printing 0 is that main is started in a well-defined state; more exactly, an ELF program starts with a well-defined stack (defined by the ELF specification) and registers, so its _start function (from crt*.o), which is the ELF executable's starting point, gets a well-defined stack and calls main.
Try giving your function some other name, and call it in various states (e.g. call it several times from main in more complex ways). Try also running your program with different program arguments and environments. You'll probably observe different values for i.
Your program exhibits undefined behavior (and with warnings enabled, e.g. gcc -Wall, the compiler warns you about it).
As far as I know, on Linux, memory that a process has never written to (which is where uninitialized variables can end up) is first backed by the Zero Page, a special page that contains only zeros.
Then, at the first write to the uninitialized variable, the kernel replaces that mapping with another page that is not write-protected.

Resources