why garbage value is stored during declaration? - c

I heard several times that if you do not initialise a variable then garbage value is stored in it.
Say
int i;
printf("%d",i);
The above code prints any garbage value, but I want to know that what is the need for storing garbage value if uninitialized?

The value of an uninitialized value is not simply unknown or garbage, but indeterminate and evaluating that variable may invoke undefined behavior or implementation-defined behavior.
One possible scenario (which is probably the scenario you are seeing) is that the variable, when evaluated, will return the value that was previously present in that memory address. Therefore, it's not like garbage is explicitly written to that variable.
It's worth noting that languages (or even C implementations) that do not exhibit the behavior you're seeing, do so by explicitly writing zeroes (or other initial values) to that area, before allowing you to use it.

It is not storing garbage, it prints whatever happens to be there in memory at that address when it is running. This is in the name of efficiency. You don't pay for what you didn't ask for.
EDIT
To answer why there is something in memory. All sort of program runs and need to share memory. When memory is allocated to your process, it is not reset, again for performance reason. Since the variable we are observing is declared on the stack, it could even be your program that put the value there in a previous function call.

C only does what you tell it to. The standard defines reading an uninitialized variable as undefined behavior.
This question elaborates: (Why) is using an uninitialized variable undefined behavior?
The accepted answer has a very good explanation.
EDIT:
A funny sidenote though, if you declare the variable static it is guaranteed to be initialized to zero per the standard. Can't find a quote right now, working on it..
EDIT2:
I left my C reference at work and CBA to download one. This answer elaborates on the initial values of variables, whether they be local/auto, global, static or indeterminate: https://stackoverflow.com/a/1597491/700170

The other answers point out (correctly) that what's being printed is whatever's already in memory in the memory location that happens to have been assigned to i.
They don't, however, clarify why there are any values stored in these locations in the first place, which is perhaps what you're really asking.
There are two reasons for this: first, upon startup, we can't be sure exactly how the memory circuits will initialize themselves. So they could be set to any arbitrary value. The second (and, in general, more likely reason, unless you just restarted your computer) is that before you started your program, that memory location had been used by another program, which stored something there--something that wasn't garbage at the time, since it was stored intentionally. From the perspective of your program, however, it is garbage, since your program has no way of knowing why that particular value was stored there.
EDIT: As I mentioned in a comment on another answer, even if the value stored in memory under some uninitialized variable is actually 0, that's not the same thing as "not having a value stored." The value stored is 0, which is to say, the physical hardware that represents one bit of memory is faithfully storing the value 0. As long as a circuit is active (i.e. turned on), the memory cells must store something; for an explanation of why this is, look into flip-flop gates. (There's a decent overview here, assuming you already understand a little bit about NAND gates: http://computer.howstuffworks.com/boolean4.htm)

It's happens only in case of local varibales. As memory for local variables are allocated on stack and while allocating the memory the runtime system does not clear the memory before allocating it to the variable unlike in case of allocating memory in heap for global and static variables. Hence the default value of local varibles beomes the content of its memory on stack while that of constant and static variables is 0.

Related

What is the origin of garbage value? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 5 years ago.
Improve this question
When we define a variable, and do not initialize it, the block of memory allocated to the variable still contains a value from previous programs, known as garbage value. But suppose, in a hypothetical case, a block of unused memory is present in the system, and when I declare and define a variable, that block of memory is allocated to the variable. If I do not initialize the variable, and try to print the value of the variable, the system doesn't have any garbage value to print. What will be the result? What will the system do?
When we define a variable, and do not initialize it, the block of memory allocated to the variable still contains a value from previous programs, known as garbage value.
If I do not initialize the variable, and try to print it, it doesn't have any garbage value to print.
C does not specify these behaviors. There is no specified garbage value.
If code attempts to print (or use) the value of an uninitialized object, the result is undefined behavior (UB). Anything may happen: a trap error occurs, the value is 42, code dies, anything.
There is a special case if the uninitialized object is an unsigned char in that a value will be there of indeterminate value, just something in the range [0...UCHAR_MAX], but no UB. This is the closest to garbage value C specifies.
Firstly, it isn't defined how an implementation behaves precisely when an uninitialised read is made, by the C standard. Merely that the value is not defined. The system may use whatever method it wishes to choose the value. Or it is even possible it is a trap representation and will crash the program I believe.
However on most real modern OSes. The data in reality is fresh pages that get mapped into the programs address space. For security reasons most kernels actually explicitly ensure these are zeroed out to avoid software spying on data from previous programs which were run and got left in memory.
However some OSes as you say will just leave this data in, meaning the page is either fresh and usually zeroed out or contains arbitrary data from previous programs (or even potentially arbitrary data defined by how the memory starts up, but at least with DRAM, that is generally in a zeroed state).
I think you need more of hardware perspective.
What is a memory? An example of memory: is made up of transistors and capacitors. A transistor and capacitor make a memory bit. A bit is either of value 0 or 1, a hypothetical scenario of non-existance of this bit value does not exist ;) as it has to hold either 0 or 1 and nothing else. If you think there is "nothing" in bit value, that means the hardware(transistor/capacitor) you are imagining is not working.
A bunch of bits makes a word or byte. A bunch of bytes holds an integer/ float or whatever variable you define. So even without initializing the variable, it contains 0's and 1's in each of the memory cells. When you access this - it's called garbage.
But suppose, in a hypothetical case, a block of unused memory is present in the system, and when I declare and define a variable, that block of memory is allocated to the variable. If I do not initialize the variable, and try to print it, it doesn't has any garbage value to print. What will it do?
Any given memory location has some value, no matter how it got there. The "garbage" value doesn't have to come from a program that ran in that space previously, it could just be the initial state of the memory location when the system starts up. The reason it's "garbage" is that you don't know what it is -- you didn't put it there, you don't have any idea how it got there, and you don't really care.

How are garbage values for variables generated in C?

I mean to ask if it follows some specific algorithm and actually are not junk.
In other words, how exactly the "garbage" values be present? Considering not invoking UB, if a garbage value is read, what is the source of that value?
The standard does not mention the term "garbage", it mentions "indeterministic / indeterminate values". The value can be anything.Note
From the user point of view, if we are unable to get a fix on a certain value (for any variable), then the "expectation" is not matched anytime and the value (if) we get is not of any use, thus terming them as "garbage" is common.
The most relatable and common observation / implementation is, for an automatic variable left uninitialized, only the storage is allocated, the content of that storage is not touched. So, probably it still contains the last stored value which was put there. Now, that value, probably being a valid one in other (previous) scenario, in present case, does not make any sense, so it is "garbage" in current scenario.
TL;DR The "garbage" value is not generated, most of the cases, it's just the last stored value in that memory location.
Note:
Related quoting from C11, chapter §6.7.9
If an object that has automatic storage duration is not initialized explicitly, its value is
indeterminate. [....]
§§ Additional Read:
This is very closely related to the topic, so adding it as a footnote.
In case, there exist a variable, which holds indeterministic value and
the data type can have trap representation
the address of the variable is not taken
then, trying to read the value actually causes undefined behavior. Be aware.
When C allocates memory on the stack and heap, it does not modify what's stored at the memory addresses, it simply designates the space for your variable. An un-initialized variable will contain whatever was in memory at that location before you declared the variable. Some times the values will be previous memory from your program that was used and cleared, and some times they will be values from the OS, stack canaries, etc. There is no way to predict what will be there.
The garbage values exist for variables on the stack (also known as automatic variable) if not initialized. They are pushed, popped, initialized with previous variable residing on the address in the stack, previous function call, etc.
Every specific memory addresses (either stack or heap) will be having some data. Before assigning that memory to your variable, it might be used by another variable (may be by OS or other programs). So, it might contain last values assigned by those last allocated variables. And those values are now useless to us. Thats why they are garbage for us

Garbage values in a multiprocess operating system

Does the allocated memory holds the garbage value since the start of the OS session? Does it have some significance before we name it as a garbage value in our program runtime session? If so then why?
I need some advice on study materials regarding linux kernel programming, device driver programming and also want to develop an understanding on how the computer devices actually work. I get stuck into the situations like the "garbage value" and feel like I have to study something else also for better understanding of the programming language. I am studying by myself and getting a lot of confusing situations. Any advice will be really helpful.
"Garbage value" is a slang term, meaning "I don't know what value is there, or why, and for that reason I will not use the value". It is "garbage" in the sense of "useless nonsense", and sometimes it is also "garbage" in the sense of "somebody else's leavings".
Formally, uninitialized memory in C takes "indeterminate values". This might be some special value written there by the C implementation, or it might be something "left over" by an earlier user of the same memory. So for examples:
A debug version of the C runtime might fill newly-allocated memory with an eye-catcher value, so that if you see it in the debugger when you were expecting your own stored data, you can reasonably conclude that either you forgot to initialize it or you're looking in the wrong place.
The kernel of a "proper" operating system will overwrite memory when it is first assigned to a process, to avoid one process seeing data that "belongs" to another process and that for security reasons should not leak across process boundaries. Typically it will overwrite it with some known value, like 0.
If you malloc memory, write something in it, then free it and malloc some more memory, you might get the same memory again with its previous contents largely intact. But formally your newly-allocated buffer is still "uninitialized" even though it happens to have the same contents as when you freed it, because formally it's a brand new array of characters that just so happens to have the same address as the old one.
One reason not to use an "indeterminate value" in C is that the standard permits it to be a "trap representation". Some machines notice when you load certain impossible values of certain types into a register, and you'd get a hardware fault. So if the memory was previously used for, say, an int, but then that value is read as a float, who is to say whether the left-over bit pattern represents a so-called "signalling NaN", that would halt the program? The same could happen if you read a value as a pointer and it's mis-aligned for the type. Even integer types are permitted to have "parity bits", meaning that reading garbage values as int could have undefined behavior. In practice, I don't think any implementation actually does have trap representations of int, and I doubt that any will check for mis-aligned pointers if you just read the pointer value -- although they might if you dereference it. But C programmers are nothing if not cautious.
What is garbage value?
When you encounter values at a memory location and cannot conclusively say what these values should be then those values are garbage value for you. i.e: The value is Indeterminate.
Most commonly, when you use a variable and do not initialize it, the variable has an Indeterminate value and is said to possess a garbage value. Note that using an Uninitialized variable leads to an Undefined Behavior, which means the program is not a valid C/C++ program and it may show(literally) any behavior.
Why the particular value exists at that location?
Most of the Operating systems of today use the concept of virtual memory. The memory address a user program sees is an virtual memory address and not the physical address. Implementations of virtual memory divide a virtual address space into pages, blocks of contiguous virtual memory addresses. Once done with usage these pages are usually at least 4 kilobytes. These pages are not explicitly wiped of their contents they are only marked as free for reuse and hence they still contain the old contents if not properly initialized.
On a typical OS, your userspace application only sees a range of virtual memory. It is up to the kernel to map this virtual memory to actual, physical memory.
When a process requests a piece of (virtual) memory, it will initially hold whatever is left in it -- it may be a reused piece of memory that another part of the process was using earlier, or it may be memory that a completely different process had been using... or it may never have been touched at all and be in whatever state it was when you powered on the machine.
Usually nobody goes and wipes a memory page with zeros (or any other equally arbitrary value) on your behalf, because there'd be no point. It's entirely up to your application to use the memory in whatever way you please, and if you're going to write to it anyway, then you don't care what was in it before.
Consequently, in C it is simply not allowed to read a variable before you have written to it, under pain of undefined behaviour.
If you declare a variable without initialising it to a particular value, it may contain a value which was previously assigned by a different program that has since released that piece of memory, or it may simply be a random value from when the computer was booted (iirc, PCs used to initialise all RAM to 0 on bootup because early versions of DOS required it, but new computers no longer do this). You can't assume the value will be zero, for instance.
Garbage value, e.g. in C, typically refers to the fact that if you just reserve memory, but never intialize it, it will hold random values, since it simply is not initialized yet (C doesn't do that for you automatically; it would just be overhead, and C is designed for as little overhead as possible).
The random values in the memory are leftovers from whatever was in there before.
These previous values are left in there, because usually there is not much use in going around setting memory to zero - or any other value - that will later be overwritten again anway. Because for the general case, there is no use in reading uninitialized memory (except if you e.g. want to exploit possible security issues - see the special cases where memory is actually zeroed: Kernel zeroes memory?).

In C if a variable is not assigned a value then why does it take garbage value?

Why do the variables take garbage values?
I guess the rationale for this is that your program will be faster.
If compiler automatically reset (ie: initialize to 0 or to NaN for float/doubles etc etc) your variables, it would take some time doing that (it'd have to write to memory).
In many cases initializing variables could be unneeded: maybe you will never access your variable, or will write on it the first time you access it.
Today this optimization is arguable: the overhead due to initializing variables is maybe not worth the problems caused by variables uninitialized by mistake, but when C has been defined things were different.
Unassigned variables has so-called indeterminate state that can be implemented in whatever way, usually by just keeping unchanged whatever data was in memory now occupied by the variable.
It just takes whatever is in memory at the address the variable is pointing to.
When you allocate a variable you are allocating some memory. if you dont overwrite it, memory will contain whatever "random" information was there before and that is called garbage value.
Why would it not? A better question might be "Can you explain how it comes about that a member variable in C# which is not initialised has a known default value?"
When variable is declared in C, it involves only assigning memory to variable and no implicit assignment. Thus when you get value from it, it has what is stored in memory cast to your variable datatype. That value we call as garbage value. It remains so, because C language implementations have memory management which does not handle this issue.
This happens with local variables and memory allocated from the heap with malloc(). Local variables are the more typical mishap. They are stored in the stack frame of the function. Which is created simply by adjusting the stack pointer by the amount of storage required for the local variables.
The values those variables will have upon entry of the function is essentially random, whatever happened to be stored in those memory locations from a previous function call that happened to use the same stack area.
It is a nasty source of hard to diagnose bugs. Not in the least because the values aren't really random. As long as the program has predictable call patterns, it is likely that the initial value repeats well. A compiler often has a debug feature that lets it inject code in the preamble of the function that initializes all local variables. A value that's likely to produce bizarre calculation results or a protected mode access violation.
Notable perhaps as well is that managed environments initialize local variables automatically. That isn't done to help the programmer fall into the pit of success, it's done because not initializing them is a security hazard. It lets code that runs in a sandbox access memory that was written by privileged code.

How does an uninitiliazed variable get a garbage value?

When we create a variable and don't initialize it, then some (random) number called garbage value is assigned to it.
How this value is assigned to the variable?
What is whole concept/mechanism behind this?
Does this happen only in C?
The garbage value is not assigned, rather the value is already there. When you allocate a variable you are reserving a piece of memory - until you overwrite it that memory will contain whatever "random" information was there before.
As a metaphor, think of allocating a variable like buying a piece of land - until you do something with it (like build a house) the land will just have whatever trash was already sitting there (like an old crumbling house).
Some languages will automatically fill newly allocated variables with zeros - this takes time to do. In more "do-it-yourself" languages like C this extra behavoir is not guarenteed (though on some systems memory is cleared regardless of language, for example as a security measure)
Memory is used and reused at various points in your application. For example, as the call stack of your application grows and shrinks the same location in memory may be overwritten many, many times. The thing to remember is that as a piece of memory is abandoned it is not zeroed out, so if you do not specify a new initial value for that place in memory when you use it again you will get the old, "garbage" value.
Some languages and structure implementations do default-initialize memory as it is used. Others do not, so it is important to read the documentation of your language carefully to know what to expect.
Nobody explicitly assigns a grabage value. If you create a variable, only location of the variable is determined, and not its value. Thats why we are initializing it. The garbage value might come from some previous operations over the same memory by old processes! So it can hold anything. I think it holds to pretty good number of languages. I am not sure about the list! :)
When we create a variable and don't initialize it, then nothing happens. When you read value from that variable you get data from memory where variable located now. It could looks like garbage/random value only because variables are placed in memory with some degree of randomness.
C standards say:
undefined behavior for local variables: (Why) is using an uninitialized variable undefined behavior? (e.g. segfault is legal)
zero for global variables: What happens to a declared, uninitialized variable in C? Does it have a value?
Implementation: detailed examination of an implementation at: https://stackoverflow.com/a/36725211/895245 Summary:
local: the address is never written to, so whatever was there previously gets used
global: .bss

Resources