What does uninitialized read mean? - c

Someone said uninitialized read is accessing an unwritten but allocated memory space.
And there’s also someone said it is accessing an unallicated memory space. So I am here to double check the meaning and BTW: Could you briefly explain what do "written" and "allocated" mean.

Hard to say without full context but here is best guesses --
uninitialized read -- you would say this when a variable or structure is read from memory without a value or default having been written to it. Thus you are reading unitialized (random) data. If a hacker could write to that memory location they could cause your system to act unexpectedly.*
TO FIX: make sure all allocated data and structures have default values written to them.
unallocated memory -- this is memory that has not specifically been marked as used by your application. This means any application or system could write to this memory and impact your system (since you are not reading from space that is designated for your application.
TO FIX: make sure you allocate all memory you use using your memory management system of choice.
*It has been pointed out that the system might behave unexpected anyway but the fact the system could be controlled by an outside agency was my point

Could you briefly explain what do "written" and "allocated" mean.
“Allocated” means the memory has been designated for a specific use.
When int x; appears inside a function in a C program, memory is automatically allocated for it. (It is automatic in that the compiler arranges for the memory to be reserved for x, so the author of this function does not have to do anything else to get that memory.) Memory can also be allocated in other ways, such as by explicit request, and C has rules for which declarations do or do not reserve memory that can be somewhat complicated.
When memory is automatically allocated in this way, it is not automatically initialized. This means the program has decided a certain part of memory will be used for x but it has not put any value into it. That memory could contain a value left over from prior use, or it could contain zero from when the operating system cleared it before assigning it to the program, or it could contain something else. (Additionally, due to the rules of the C standard and the complexities of modern compilers, memory that is not initialized can cause complications in your program. It may act in ways that are confusing to beginners.)
To ensure the memory has a defined value, you should initialize it. This can be done in the definition, as with int x = 3;, or it can be done later, as with x = 3;.
Setting an object to a value is also called writing to memory, storing to memory, storing to an object, and assigning a value. So, if you have written a value to an object, you have initialized it. (“Initialization” generally refers to the first time a value is written to a new object, but we can also say we are “reinitializing” something when we are resetting its value to a state we consider “earlier” in some sense.)
Someone said uninitialized read is accessing an unwritten but allocated memory space. And there’s also someone said it is accessing an unallicated memory space.
“Uninitialized read” is a somewhat crude term. Properly, we might say a “read of uninitialized memory,” and that is indeed reading memory that is uninitialized. Even if the memory assigned for a new object, say x, was previously used for something else, we refer to that memory as uninitialized once it has been newly designated for the new object and not yet written to.
“Uninitialized read” does not mean accessing unallocated memory.

Related

By printing "garbage values" in uninitialized data segment (bss) can we map out all values from previous program

I have a weird question, and i am not sure if i will be able to explain it but here we go. While learning C and using it you usually come across the term "trash" or "garbage" value, my first question to that is, is data left over data in that memory address from some different program or of anything or is it actually some 'random' value , if i take that it is true that is leftover value in that memory address why are we still able to read from such memory address, i mean lets assume we just declare int x; and it is now stored in bss on some memory address #, and we were to output its value we would get the value of that resides on that address, so if all the things i said are true, doesnt that allow for us to declare many many many variables but only declare and not initialize perhaps we can map all the values previously stored in bss from some program from before etc.
I am mostly likely sure that this would be a big security threat and thus i know there is probably some measure against it but i want to know what prevents this?
No, the contents of the .bss section are zeroed out before your program starts. This is to satisfy C's guarantee that global and static variables, if not explicitly initialized, will be initialized to zero.
Indeed, on a typical multitasking system, all memory allocated by your process will be zeroed by the operating system before you are given access to it. This is to avoid precisely the security hole you mention.
The values of local (auto) variables, on the stack, do typically contain "garbage" if not initialized, but it would be garbage left over from the execution of your own program up to this point. If your program happens not to have written anything to that particular location on the stack, then it will still contain zero (again on a typical OS); it will never contain memory contents from other programs.
The same goes for memory allocated by malloc. If it is coming straight from the OS, it contains zeros. If it happens to be a block that was previously allocated and freed, it might contain garbage from your previous use of that memory, or from malloc's internal data, but again it will never contain another program's data.
Nothing in the C language itself prevents you from doing almost exactly as you say. The only thing you said that was wrong, considering only the requirements of C standard, was talking about variables "in bss". Objects with static storage duration and no initializer (which is the standardese equivalent of variables in bss) are guaranteed to be initialized to zero at program startup, so you cannot access the data of no-longer-running programs that way. But, in an environment like good old-fashioned MS-DOS or CP/M, there was nothing whatsoever to stop you from setting a pointer to the base of physical RAM, scanning to the end, and finding data from previous programs.
All modern operating systems for full-featured computers, however, provide memory protection which means, among other things, that they guarantee that no process can read another process's memory, whether or not the other process is still running, except via well-defined APIs that enforce security policy. The "Spectre" family of hardware bugs are a big deal just because they break this guarantee.
The details of how memory protection work are too complex to fit into this answer box, but one of the things that's almost always done is, whenever you allocate more memory from the operating system, that memory is initialized, either to all-bits-zero or to the contents of a file on disk. Either way you can't get at "garbage".

How is the destination that an uninitialized pointer in c points to determined?

I know that if a pointer is declared in C (and not initialized), it will be pointing to a "random" memory address which could contain anything.
How is where it actually points to determined though? Presumably it's not truly random, since this would be inefficient and illogical.
If this pointer is defined outside of all functions (or is static), it will be initialized to NULL before main() gets control.
If this pointer is created in the heap via malloc(sizeof(sometype*)), it will contain whatever happens to be at its location. It can be the data from the previously allocated and freed buffer of memory. Or it can be some old control information that malloc() and free() use to manage the lists of the free and allocated blocks. Or it can be garbage if the OS (if any) does not clear program's memory or if its system calls for memory allocation return uninitialized memory and so those can contain code/data from previously run programs or just some garbage that your RAM chips had when the system was powered on.
If this pointer is local to a function (and is not static), it will contain whatever has been at its place on the stack. Or if a CPU register is allocated to this pointer instead of a memory cell, it will contain whatever value the preceding instructions have left in this register.
So, it won't be totally random, but you rarely have full control here.
Uninitialized is undefined. Generally speaking, when the pointer is allocated the memory space is not cleared, so whatever the memory contained is now a pointer. It is random, but it is also efficient in the sense that the memory location is not being changed in the operation.
http://en.wikipedia.org/wiki/Uninitialized_variable
Languages such as C use stack space for variables, and the collection
of variables allocated for a subroutine is known as a stack frame.
While the computer will set aside the appropriate amount of space for
the stack frame, it usually does so simply by adjusting the value of
the stack pointer, and does not set the memory itself to any new state
(typically out of efficiency concerns). Therefore, whatever contents
of that memory at the time will appear as initial values of the
variables which occupy those addresses.
Although I would imagine this is implementation specific.
Furthermore you should probably always initialize your pointers, see the answer provided at How do you check for an invalid pointer? and the link given on the first answer: -
http://www.lysator.liu.se/c/c-faq/c-1.html
As far as the C standard is concerned, an uninitialized pointer doesn't point anywhere. It is illegal to dereference it. Thus it is impossible in principle to observe its target, and thus the target simply doesn't exist for all intents and purposes.
If you want a trite analogy, asking for the value of an uninitialized pointer is like asking for the value of the smallest positive real number, or the value of the last digit of π.
(The corolloary is that only Chuck Norris can dereference an uninitialized pointer.)
It is implementation specific / undefined behavior. It's possible the pointer was automatically initialized to NULL... or it is just whatever value is in memory at the time.
A pointer is an address in memory that points to another address in memory. The fact that it's not initialized does not mean that the pointer itself doesn't have an address, it only means that the address of the thing it points to is not known. So, either the compiler initialized it to NULL by default, or the address is whatever was in memory in the pointer's variable space at the time.
In large part, it is like an unerased whiteboard when you enter class. If you draw a box on part of the board but do not erase what is in the box, then what is in the box?
It is whatever was left there previously.
Similarly, if you allocate space for a pointer but do not erase the space, then what is in the space?
Data may be left over from earlier parts of your program or from special code that runs before any normal part of your program (such as the main function) starts running. (The special code may load and link certain libraries, set up a stack, and otherwise prepare the environment needed by a C program.) On embedded systems, rather than typical multi-user systems, data might be left over from previous processes. Quite likely, the previous use of the space was for something other than a pointer, so the value in it does not make sense when interpreted as a pointer. Its value as a pointer might point somewhere but not be meaningful in any way.
However, you cannot rely on this. The C standard does not define the behavior when you use an uninitialized object with automatic storage duration. In many C implementations, an uninitialized pointer will simply contain left over data. But, in some C implementations, the system may detect that you are using an uninitialized object and cause your program to crash. (Other behaviors are possible too.)

Garbage values in a multiprocess operating system

Does the allocated memory holds the garbage value since the start of the OS session? Does it have some significance before we name it as a garbage value in our program runtime session? If so then why?
I need some advice on study materials regarding linux kernel programming, device driver programming and also want to develop an understanding on how the computer devices actually work. I get stuck into the situations like the "garbage value" and feel like I have to study something else also for better understanding of the programming language. I am studying by myself and getting a lot of confusing situations. Any advice will be really helpful.
"Garbage value" is a slang term, meaning "I don't know what value is there, or why, and for that reason I will not use the value". It is "garbage" in the sense of "useless nonsense", and sometimes it is also "garbage" in the sense of "somebody else's leavings".
Formally, uninitialized memory in C takes "indeterminate values". This might be some special value written there by the C implementation, or it might be something "left over" by an earlier user of the same memory. So for examples:
A debug version of the C runtime might fill newly-allocated memory with an eye-catcher value, so that if you see it in the debugger when you were expecting your own stored data, you can reasonably conclude that either you forgot to initialize it or you're looking in the wrong place.
The kernel of a "proper" operating system will overwrite memory when it is first assigned to a process, to avoid one process seeing data that "belongs" to another process and that for security reasons should not leak across process boundaries. Typically it will overwrite it with some known value, like 0.
If you malloc memory, write something in it, then free it and malloc some more memory, you might get the same memory again with its previous contents largely intact. But formally your newly-allocated buffer is still "uninitialized" even though it happens to have the same contents as when you freed it, because formally it's a brand new array of characters that just so happens to have the same address as the old one.
One reason not to use an "indeterminate value" in C is that the standard permits it to be a "trap representation". Some machines notice when you load certain impossible values of certain types into a register, and you'd get a hardware fault. So if the memory was previously used for, say, an int, but then that value is read as a float, who is to say whether the left-over bit pattern represents a so-called "signalling NaN", that would halt the program? The same could happen if you read a value as a pointer and it's mis-aligned for the type. Even integer types are permitted to have "parity bits", meaning that reading garbage values as int could have undefined behavior. In practice, I don't think any implementation actually does have trap representations of int, and I doubt that any will check for mis-aligned pointers if you just read the pointer value -- although they might if you dereference it. But C programmers are nothing if not cautious.
What is garbage value?
When you encounter values at a memory location and cannot conclusively say what these values should be then those values are garbage value for you. i.e: The value is Indeterminate.
Most commonly, when you use a variable and do not initialize it, the variable has an Indeterminate value and is said to possess a garbage value. Note that using an Uninitialized variable leads to an Undefined Behavior, which means the program is not a valid C/C++ program and it may show(literally) any behavior.
Why the particular value exists at that location?
Most of the Operating systems of today use the concept of virtual memory. The memory address a user program sees is an virtual memory address and not the physical address. Implementations of virtual memory divide a virtual address space into pages, blocks of contiguous virtual memory addresses. Once done with usage these pages are usually at least 4 kilobytes. These pages are not explicitly wiped of their contents they are only marked as free for reuse and hence they still contain the old contents if not properly initialized.
On a typical OS, your userspace application only sees a range of virtual memory. It is up to the kernel to map this virtual memory to actual, physical memory.
When a process requests a piece of (virtual) memory, it will initially hold whatever is left in it -- it may be a reused piece of memory that another part of the process was using earlier, or it may be memory that a completely different process had been using... or it may never have been touched at all and be in whatever state it was when you powered on the machine.
Usually nobody goes and wipes a memory page with zeros (or any other equally arbitrary value) on your behalf, because there'd be no point. It's entirely up to your application to use the memory in whatever way you please, and if you're going to write to it anyway, then you don't care what was in it before.
Consequently, in C it is simply not allowed to read a variable before you have written to it, under pain of undefined behaviour.
If you declare a variable without initialising it to a particular value, it may contain a value which was previously assigned by a different program that has since released that piece of memory, or it may simply be a random value from when the computer was booted (iirc, PCs used to initialise all RAM to 0 on bootup because early versions of DOS required it, but new computers no longer do this). You can't assume the value will be zero, for instance.
Garbage value, e.g. in C, typically refers to the fact that if you just reserve memory, but never intialize it, it will hold random values, since it simply is not initialized yet (C doesn't do that for you automatically; it would just be overhead, and C is designed for as little overhead as possible).
The random values in the memory are leftovers from whatever was in there before.
These previous values are left in there, because usually there is not much use in going around setting memory to zero - or any other value - that will later be overwritten again anway. Because for the general case, there is no use in reading uninitialized memory (except if you e.g. want to exploit possible security issues - see the special cases where memory is actually zeroed: Kernel zeroes memory?).

Why are the contents pointed to by a pointer not changed when memory is deallocated using free()?

I am a newbie when it comes to dynamic memory allocation. When we free the memory using void free(void *ptr) the memory is deallocated but the contents of the pointer are not deleted. Why is that? Is there any difference in more recent C compilers?
Computers don't "delete" memory as such, they just stop using all references to that memory cell and forget that anything of value is stored there. For example:
int* func (void)
{
int x = 5;
return &x;
}
printf("%d", *func()); // undefined behavior
Once the function has finished, the program stops reserving the memory location where x is stored, any other part of the program (or perhaps another program) is free to use it. So the above code could print 5, or it could print garbage, or it could even crash the program: referencing the contents of a memory cell that has ceased to be valid is undefined behavior.
Dynamic memory is no exception to this and works in the same manner. Once you have called free(), the contents of that part of the memory can be used by anyone.
Also, see this question.
The thing is that accessing memory after it has been freed is undefined behavior. It's not only that the memory contents are undefined, accessing them could lead to anything. At least some compilers when you build a debug version of the code, actually do change the contents of the memory to aid in debugging, but in release versions it's generally unnecessary to do that, so the memory is just left as is, but anyway, that is not something you can safely rely upon, don't access freed memory, it's unsafe!
In C, parameters are passed by value. So free just can't change the value of ptr.
Any change it would make would only change the value within the free function, and won't affect the caller's variable.
Also, changing it won't be so much help. There can be multiple pointers pointing to the same piece of memory, and they should all be reset when freeing. The language can't keep track of them all, so it leaves the programmer to handle the pointers.
This is very normal, because clearing the memory location after free is an overhead and generally not necessary. If you have security concerns, you can wrap the free call within a function which clears the region before freeing. You'll also notice that this requires the knowledge of the allocation size, which is another overhead.
Actually the C programming language specifies that after the lifetime of the object, even the value of any pointer pointing to it becomes indeterminate, i.e. you can't even depend on the pointer to even retain the original value.
That is because a good compiler will try to aggressively store all the variables into the CPU registers instead of memory. So after it sees that the program flow calls a function named free with the argument ptr, it can mark the register of the ptr free for other use, until it has been assigned to again, for example ptr = malloc(42);.
In between these two it could be seen changing the value, or comparing inequal against its original value, or other similar behaviour. Here's an example of what might happen.

How does an uninitiliazed variable get a garbage value?

When we create a variable and don't initialize it, then some (random) number called garbage value is assigned to it.
How this value is assigned to the variable?
What is whole concept/mechanism behind this?
Does this happen only in C?
The garbage value is not assigned, rather the value is already there. When you allocate a variable you are reserving a piece of memory - until you overwrite it that memory will contain whatever "random" information was there before.
As a metaphor, think of allocating a variable like buying a piece of land - until you do something with it (like build a house) the land will just have whatever trash was already sitting there (like an old crumbling house).
Some languages will automatically fill newly allocated variables with zeros - this takes time to do. In more "do-it-yourself" languages like C this extra behavoir is not guarenteed (though on some systems memory is cleared regardless of language, for example as a security measure)
Memory is used and reused at various points in your application. For example, as the call stack of your application grows and shrinks the same location in memory may be overwritten many, many times. The thing to remember is that as a piece of memory is abandoned it is not zeroed out, so if you do not specify a new initial value for that place in memory when you use it again you will get the old, "garbage" value.
Some languages and structure implementations do default-initialize memory as it is used. Others do not, so it is important to read the documentation of your language carefully to know what to expect.
Nobody explicitly assigns a grabage value. If you create a variable, only location of the variable is determined, and not its value. Thats why we are initializing it. The garbage value might come from some previous operations over the same memory by old processes! So it can hold anything. I think it holds to pretty good number of languages. I am not sure about the list! :)
When we create a variable and don't initialize it, then nothing happens. When you read value from that variable you get data from memory where variable located now. It could looks like garbage/random value only because variables are placed in memory with some degree of randomness.
C standards say:
undefined behavior for local variables: (Why) is using an uninitialized variable undefined behavior? (e.g. segfault is legal)
zero for global variables: What happens to a declared, uninitialized variable in C? Does it have a value?
Implementation: detailed examination of an implementation at: https://stackoverflow.com/a/36725211/895245 Summary:
local: the address is never written to, so whatever was there previously gets used
global: .bss

Resources