How does an uninitialized variable get a garbage value? - c

When we create a variable and don't initialize it, some (random) number called a garbage value is assigned to it.
How is this value assigned to the variable?
What is the whole concept/mechanism behind this?
Does this happen only in C?

The garbage value is not assigned, rather the value is already there. When you allocate a variable you are reserving a piece of memory - until you overwrite it that memory will contain whatever "random" information was there before.
As a metaphor, think of allocating a variable like buying a piece of land - until you do something with it (like build a house) the land will just have whatever trash was already sitting there (like an old crumbling house).
Some languages will automatically fill newly allocated variables with zeros - this takes time to do. In more "do-it-yourself" languages like C this extra behavior is not guaranteed (though on some systems memory is cleared regardless of language, for example as a security measure).
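C does offer the zero-filling behavior when you ask for it explicitly. A minimal sketch, assuming a hosted C99 environment: calloc returns a block whose bytes are all zero, whereas malloc leaves the contents indeterminate. The helper name calloc_is_zeroed is made up for this illustration.

```c
#include <stdlib.h>

/* Hypothetical helper: returns 1 if every element of a freshly calloc'ed
   array of n ints is zero, 0 if any element is nonzero, and -1 if the
   allocation failed. */
int calloc_is_zeroed(size_t n)
{
    int *values = calloc(n, sizeof *values);  /* calloc zero-fills the block */
    if (values == NULL)
        return -1;

    int all_zero = 1;
    for (size_t i = 0; i < n; i++) {
        if (values[i] != 0)
            all_zero = 0;
    }

    free(values);
    return all_zero;
}
```

Swapping calloc for malloc here would make the loop read indeterminate values, which is exactly the situation the question describes.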

Memory is used and reused at various points in your application. For example, as the call stack of your application grows and shrinks the same location in memory may be overwritten many, many times. The thing to remember is that as a piece of memory is abandoned it is not zeroed out, so if you do not specify a new initial value for that place in memory when you use it again you will get the old, "garbage" value.
Some languages and structure implementations do default-initialize memory as it is used. Others do not, so it is important to read the documentation of your language carefully to know what to expect.
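The reuse described above can be imitated without reading any truly uninitialized memory. The sketch below is a toy model, not how a real call stack is implemented: a static array stands in for stack memory, and the names pool, toy_alloc, toy_release_all and leftover_after_reuse are all invented for this example. Releasing a slot does not clear it, so the next user of the slot sees the old value.

```c
#include <stddef.h>

#define POOL_SIZE 8

static int pool[POOL_SIZE];   /* stands in for the stack's memory */
static size_t pool_top = 0;   /* index of the next free slot */

/* Reserve one slot; its contents are deliberately left untouched. */
int *toy_alloc(void)
{
    if (pool_top == POOL_SIZE)
        return NULL;
    return &pool[pool_top++];
}

/* Abandon every slot without clearing anything, like a function returning. */
void toy_release_all(void)
{
    pool_top = 0;
}

/* Store a value, abandon the slot, reserve it again, and report what the
   "new" variable contains before anything has been written to it. */
int leftover_after_reuse(int stored)
{
    toy_release_all();            /* start from an empty pool */

    int *first = toy_alloc();
    if (first == NULL)
        return 0;
    *first = stored;
    toy_release_all();            /* slot abandoned, bits unchanged */

    int *second = toy_alloc();    /* same address as first */
    if (second == NULL)
        return 0;
    return *second;               /* the old value is still there */
}
```

Calling leftover_after_reuse(1234) returns 1234: nothing "assigned" that value to the second variable, it was simply never erased.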

Nobody explicitly assigns a garbage value. When you create a variable, only its location is determined, not its value. That's why we initialize it. The garbage value might come from some previous operation on the same memory by old processes! So it can hold anything. I think this holds for a pretty good number of languages. I am not sure about the list! :)

When we create a variable and don't initialize it, nothing happens. When you read the value of that variable you get the data from the memory where the variable is now located. It only looks like a garbage/random value because variables are placed in memory with some degree of randomness.

C standards say:
local variables: reading one before it is initialized is undefined behavior; see "(Why) is using an uninitialized variable undefined behavior?" (e.g. a segfault is legal)
global variables: initialized to zero; see "What happens to a declared, uninitialized variable in C? Does it have a value?"
Implementation: a detailed examination of one implementation is at https://stackoverflow.com/a/36725211/895245 Summary:
local: the address is never written to, so whatever was there previously gets used
global: placed in .bss, which is zero-filled when the program is loaded
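The split between the two cases can be sketched in a few lines of C. The function and variable names are illustrative only. The point is that objects with static storage duration (file-scope variables and static locals) start at zero without an initializer, while an automatic local must be given a value before it is read.

```c
int global_counter;            /* file scope: static storage duration */

int read_global_counter(void)
{
    return global_counter;     /* guaranteed to start at 0 */
}

int read_static_local(void)
{
    static int calls_seen;     /* static local: also zero-initialized */
    return calls_seen;
}

int read_initialized_local(void)
{
    int local = 0;             /* automatic: must be initialized by hand */
    return local;              /* without "= 0" this read would be invalid */
}
```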

Related

What does uninitialized read mean?

Someone said an uninitialized read is accessing an unwritten but allocated memory space.
Someone else said it is accessing an unallocated memory space. So I am here to double check the meaning, and BTW: could you briefly explain what "written" and "allocated" mean?
Hard to say without full context, but here are my best guesses --
uninitialized read -- you would say this when a variable or structure is read from memory without a value or default having been written to it. Thus you are reading uninitialized (random) data. If a hacker could write to that memory location they could cause your system to act unexpectedly.*
TO FIX: make sure all allocated data and structures have default values written to them.
unallocated memory -- this is memory that has not specifically been marked as used by your application. This means any application or system could write to this memory and impact your system (since you are reading from space that is not designated for your application).
TO FIX: make sure you allocate all memory you use using your memory management system of choice.
*It has been pointed out that the system might behave unexpectedly anyway, but my point was that the system could be controlled by an outside agency.
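One way to follow the "write default values" advice for structures in C is sketched below. The struct connection type and both function names are hypothetical, made up for this example. A partial initializer zeroes every member it does not name, and memset can reset an object that already exists.

```c
#include <string.h>

/* Hypothetical structure used only for this illustration. */
struct connection {
    int  port;
    int  retries;
    char host[32];
};

/* Members not named in the initializer (retries, host) are set to zero. */
struct connection default_connection(void)
{
    struct connection c = { .port = 8080 };
    return c;
}

/* Alternative for an object that already exists: overwrite all its bytes. */
void reset_connection(struct connection *c)
{
    memset(c, 0, sizeof *c);
}
```

After either call, every member has a value the program chose, so no later read can pick up leftover data.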
Could you briefly explain what "written" and "allocated" mean?
“Allocated” means the memory has been designated for a specific use.
When int x; appears inside a function in a C program, memory is automatically allocated for it. (It is automatic in that the compiler arranges for the memory to be reserved for x, so the author of this function does not have to do anything else to get that memory.) Memory can also be allocated in other ways, such as by explicit request, and C has rules for which declarations do or do not reserve memory that can be somewhat complicated.
When memory is automatically allocated in this way, it is not automatically initialized. This means the program has decided a certain part of memory will be used for x but it has not put any value into it. That memory could contain a value left over from prior use, or it could contain zero from when the operating system cleared it before assigning it to the program, or it could contain something else. (Additionally, due to the rules of the C standard and the complexities of modern compilers, memory that is not initialized can cause complications in your program. It may act in ways that are confusing to beginners.)
To ensure the memory has a defined value, you should initialize it. This can be done in the definition, as with int x = 3;, or it can be done later, as with x = 3;.
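Both forms mentioned above are sketched here with made-up function names. What matters in the second form is that every path through the function writes x before the first read.

```c
/* Initialize in the definition. */
int initialized_in_definition(void)
{
    int x = 3;
    return x;
}

/* Define first, then assign on every path before the first read. */
int assigned_before_first_read(int use_large)
{
    int x;               /* allocated, not yet initialized */
    if (use_large)
        x = 100;
    else
        x = 3;
    return x;            /* every path has written x by now */
}
```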
Setting an object to a value is also called writing to memory, storing to memory, storing to an object, and assigning a value. So, if you have written a value to an object, you have initialized it. (“Initialization” generally refers to the first time a value is written to a new object, but we can also say we are “reinitializing” something when we are resetting its value to a state we consider “earlier” in some sense.)
Someone said an uninitialized read is accessing an unwritten but allocated memory space. Someone else said it is accessing an unallocated memory space.
“Uninitialized read” is a somewhat crude term. Properly, we might say a “read of uninitialized memory,” and that is indeed reading memory that is uninitialized. Even if the memory assigned for a new object, say x, was previously used for something else, we refer to that memory as uninitialized once it has been newly designated for the new object and not yet written to.
“Uninitialized read” does not mean accessing unallocated memory.

How are garbage values for variables generated in C?

I mean to ask whether they follow some specific algorithm and are actually not junk.
In other words, how exactly do the "garbage" values come to be present? Leaving aside cases that invoke UB, if a garbage value is read, what is the source of that value?
The standard does not mention the term "garbage"; it speaks of "indeterminate values". The value can be anything. (See the Note below.)
From the user's point of view, a value we cannot predict never matches any expectation and is of no use, so calling it "garbage" is common.
The most common observation / implementation is that, for an automatic variable left uninitialized, only the storage is allocated and the content of that storage is not touched. So it probably still contains the last value that was stored there. That value, though probably valid in some other (previous) scenario, makes no sense in the present one, so it is "garbage" in the current scenario.
TL;DR The "garbage" value is not generated; in most cases, it's just the last value stored in that memory location.
Note:
Related quote from C11, chapter §6.7.9:
If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate. [....]
§§ Additional Read:
This is very closely related to the topic, so adding it as a footnote.
If a variable holds an indeterminate value and either
the data type can have trap representations, or
the variable is an automatic one whose address is never taken
then trying to read the value can cause undefined behavior. Be aware.
When C allocates memory on the stack or the heap, it does not modify what's stored at those memory addresses; it simply designates the space for your variable. An uninitialized variable will contain whatever was in memory at that location before you declared the variable. Sometimes the values will be leftovers from memory your own program used earlier and released, and sometimes they will be values from the OS, stack canaries, etc. There is no way to predict what will be there.
Garbage values exist for variables on the stack (also known as automatic variables) that are not initialized. The stack is pushed and popped constantly, so such a variable starts out with whatever a previous variable or a previous function call left at that address.
Every memory address (on either the stack or the heap) holds some data. Before that memory was assigned to your variable, it might have been used by another variable (maybe by the OS or other programs). So it might contain the last values assigned through those earlier variables, and those values are now useless to us. That's why they are garbage for us.

Why is a garbage value stored during declaration?

I heard several times that if you do not initialise a variable then a garbage value is stored in it.
Say
int i;
printf("%d",i);
The above code prints some garbage value, but I want to know: what is the need for storing a garbage value if the variable is uninitialized?
The value of an uninitialized variable is not simply unknown or garbage, but indeterminate, and evaluating that variable may invoke undefined behavior or yield an unspecified value.
One possible scenario (which is probably the scenario you are seeing) is that the variable, when evaluated, will return the value that was previously present in that memory address. Therefore, it's not like garbage is explicitly written to that variable.
It's worth noting that languages (or even C implementations) that do not exhibit the behavior you're seeing, do so by explicitly writing zeroes (or other initial values) to that area, before allowing you to use it.
It is not storing garbage, it prints whatever happens to be there in memory at that address when it is running. This is in the name of efficiency. You don't pay for what you didn't ask for.
EDIT
To answer why there is something in memory: all sorts of programs run and need to share memory. When memory is allocated to your process, it is not reset, again for performance reasons. Since the variable we are observing is declared on the stack, it could even be your own program that put the value there in a previous function call.
C only does what you tell it to. The standard defines reading an uninitialized variable as undefined behavior.
This question elaborates: (Why) is using an uninitialized variable undefined behavior?
The accepted answer has a very good explanation.
EDIT:
A funny sidenote though, if you declare the variable static it is guaranteed to be initialized to zero per the standard. Can't find a quote right now, working on it..
EDIT2:
I left my C reference at work and CBA to download one. This answer elaborates on the initial values of variables, whether they be local/auto, global, static or indeterminate: https://stackoverflow.com/a/1597491/700170
The other answers point out (correctly) that what's being printed is whatever's already in memory in the memory location that happens to have been assigned to i.
They don't, however, clarify why there are any values stored in these locations in the first place, which is perhaps what you're really asking.
There are two reasons for this: first, upon startup, we can't be sure exactly how the memory circuits will initialize themselves. So they could be set to any arbitrary value. The second (and, in general, more likely reason, unless you just restarted your computer) is that before you started your program, that memory location had been used by another program, which stored something there--something that wasn't garbage at the time, since it was stored intentionally. From the perspective of your program, however, it is garbage, since your program has no way of knowing why that particular value was stored there.
EDIT: As I mentioned in a comment on another answer, even if the value stored in memory under some uninitialized variable is actually 0, that's not the same thing as "not having a value stored." The value stored is 0, which is to say, the physical hardware that represents one bit of memory is faithfully storing the value 0. As long as a circuit is active (i.e. turned on), the memory cells must store something; for an explanation of why this is, look into flip-flop gates. (There's a decent overview here, assuming you already understand a little bit about NAND gates: http://computer.howstuffworks.com/boolean4.htm)
It happens with local variables, not with global or static ones. Memory for local variables is allocated on the stack, and the runtime system does not clear that memory before giving it to the variable. Global and static variables are different: they live in a static data area (not the stack) that is zero-filled before the program starts. Hence the default value of a local variable is whatever its stack memory contained, while that of global and static variables is 0.

How is an undefined value created? [duplicate]

Possible Duplicate:
How an uninitialised variable gets a garbage value?
So when an undefined (but declared) variable is used, it contains a strange value each time. How does it come to have such a value? Is it randomly generated on purpose?
It is not randomly generated, it is just residual memory.
It would be highly inefficient to clear all unused memory every time. So the memory is released to the OS and made available. When you request new memory, you get some of this memory, which is not owned by anyone but still has garbage in it since it was just freed, but not cleared.
When a local variable is declared, the compiler allocates a slot within the enclosing function's stack frame in which the variable will live. Whatever value was in that particular spot in memory before that stack frame was set up (usually from a previous function call whose stack frame occupied that space) becomes the initial contents of that variable.
In some cases, uninitialised variables are in fact deliberately set to some value, but it's rarely random. For instance, a debug malloc() might set every word of a newly allocated block to 0xbadf00d, to serve as a marker that the memory hasn't been initialised. Thus, struct members might be initialised to something other than whatever was there before. I don't know of any compilers that do this for stack variables, but they might exist.
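A debug allocator of the kind described above might look roughly like this. It is a sketch only: debug_malloc and looks_unwritten are hypothetical names, and the one-byte marker 0xCD is an arbitrary choice for this example rather than the word-sized 0xbadf00d pattern mentioned in the text.

```c
#include <stdlib.h>
#include <string.h>

#define FILL_BYTE 0xCD   /* arbitrary marker chosen for this sketch */

/* Hypothetical wrapper: allocate like malloc, then stamp every byte. */
void *debug_malloc(size_t size)
{
    void *block = malloc(size);
    if (block != NULL)
        memset(block, FILL_BYTE, size);
    return block;
}

/* Returns 1 if the first `size` bytes of block still carry the marker. */
int looks_unwritten(const void *block, size_t size)
{
    const unsigned char *bytes = block;
    for (size_t i = 0; i < size; i++) {
        if (bytes[i] != FILL_BYTE)
            return 0;
    }
    return 1;
}
```

A recognizable pattern makes a forgotten initialization stand out in a debugger, where a leftover value could easily pass for real data.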
C does not clean the memory it allocates. These values are the 'leftovers' in the allocated location in memory.
C as a principle does not hide operations from the programmer. It does only what it is asked to. Since you did not ask it to initialize the variable, it doesn't do it for you, so the same bits that were 'up'/'down' in the destination memory do not change.
A variable can have an undefined value to avoid the overhead of initializing it with some sensible value, so it is definitely not generated randomly (which is itself a nontrivial operation). The value a variable initially holds is just what happens to be in the variable's memory location at that time.
The value of an uninitialized variable in C depends on whatever value was stored at its memory address. It isn't randomized on purpose.
Whenever you declare a variable, it already has a memory space to hold its value. If you don't set anything explicitly, it will contain whatever value was previously stored at that location. So it's not randomly generated on purpose by the program; it's just the value that happens to be there.
A declaration of a variable is an indication to the compiler that there is one such variable of such-and-such type. A definition of a variable allocates memory for it. The allocated memory could be anywhere: the stack (auto variables), the heap (dynamically allocated memory), etc. Unless the variable is static, the memory it is given is not initialized. So the random values that you are seeing are nothing but the values that were stored in those memory locations earlier! So it is advised to initialize variables before using them (for the first time); in other words, do not use/dereference uninitialized variables/pointers.
Hope it helps!

In C if a variable is not assigned a value then why does it take a garbage value?

Why do the variables take garbage values?
I guess the rationale for this is that your program will be faster.
If the compiler automatically reset your variables (i.e. initialized them to 0, or to NaN for floats/doubles, etc.), it would take some time doing that (it would have to write to memory).
In many cases initializing variables could be unneeded: maybe you will never access the variable, or you will write to it the first time you access it.
Today this optimization is arguable: the overhead of initializing variables is maybe not worth the problems caused by variables left uninitialized by mistake, but when C was defined things were different.
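The "write on first access" case can be illustrated with a short sketch. The function names are invented for this example. The array is left uninitialized on purpose, and that is safe only because every element is written before any element is read.

```c
#include <stddef.h>

/* Every element of out[0..n-1] is written before anyone reads it,
   so zero-filling the array first would be wasted work. */
void fill_squares(int *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (int)(i * i);
}

int sum_of_first_squares(void)
{
    int squares[5];              /* intentionally not initialized */
    fill_squares(squares, 5);    /* the first access is a write */

    int sum = 0;
    for (size_t i = 0; i < 5; i++)
        sum += squares[i];
    return sum;                  /* 0 + 1 + 4 + 9 + 16 */
}
```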
Unassigned variables have a so-called indeterminate state that can be implemented in whatever way, usually by just leaving unchanged whatever data was in the memory now occupied by the variable.
It just takes whatever is in memory at the address where the variable is stored.
When you allocate a variable you are allocating some memory. If you don't overwrite it, the memory will contain whatever "random" information was there before, and that is called a garbage value.
Why would it not? A better question might be "Can you explain how it comes about that a member variable in C# which is not initialised has a known default value?"
When a variable is declared in C, memory is assigned to it and nothing else; there is no implicit assignment. Thus when you read its value, you get whatever is stored in that memory, interpreted as your variable's datatype. That value is what we call a garbage value. It stays that way because the memory management in C implementations does not initialize memory for you.
This happens with local variables and with memory allocated from the heap with malloc(). Local variables are the more typical mishap. They are stored in the stack frame of the function, which is created simply by adjusting the stack pointer by the amount of storage required for the local variables.
The values those variables have upon entry to the function are essentially random: whatever happened to be stored in those memory locations by a previous function call that happened to use the same stack area.
It is a nasty source of hard to diagnose bugs, not least because the values aren't really random. As long as the program has predictable call patterns, it is likely that the initial value repeats well. A compiler often has a debug feature that lets it inject code into the preamble of the function that initializes all local variables with a value that's likely to produce bizarre calculation results or a protected mode access violation.
Notable perhaps as well is that managed environments initialize local variables automatically. That isn't done to help the programmer fall into the pit of success, it's done because not initializing them is a security hazard. It lets code that runs in a sandbox access memory that was written by privileged code.
