Why does Windows use stacks for storing local variables?

Why does C use a stack for storing local variables? Is this just to have an independent memory space, or to get automatic clearing of all local variables and objects once they go out of scope?
I have a few more questions along the same lines.
Question 1) How are local variables referenced from the instructions? Consider NewThreadFunc, a function that is passed to CreateThread.
DWORD WINAPI NewThreadFunc(PVOID p_pParam)
{
    int l_iLocalVar1 = 10;
    int l_iLocalVar2 = 20;
    int l_iSumLocalVar = l_iLocalVar1 + l_iLocalVar2;
    return 0; /* a DWORD-returning thread function must return a value */
}
The stack for this thread would look like this,
| p_pParam |
| NewThreadFunc()|
| 10 |
| 20 |
| 30 |
| |
.
.
.
Now my question is: while executing this function, how does the CPU know the addresses of the local variables (l_iSumLocalVar, l_iLocalVar1 and l_iLocalVar2)? These variables are not pointers that store the address from which the value has to be fetched. My question is with respect to the stack above.
Question 2) If this function calls another function, how does the stack behave? As far as I know, the stack gets divided further. If so, how do the local variables of the calling function get hidden from the called function? Basically, how do local variables maintain the scope rules?
I know these could be very basic questions, but somehow I could not think of an answer to them.

Firstly, it is not "Windows" that uses a stack for local variables. It has absolutely nothing to do with "Windows", or with any other OS for that matter. It is your compiler that does that. Nobody forces your compiler to use the system stack for that purpose, but normally this is the simplest and most efficient way to implement local variables.
Secondly, compilers use stacks to store local variables (be that a system-provided stack or a compiler-implemented stack) simply because stack-like storage matches the language-mandated semantics of local variables very precisely. The storage duration of local variables is defined by their declarative regions (blocks), which strictly nest into each other. This immediately means that the storage durations of local variables follow the LIFO principle: last in - first out. So, using a stack - a LIFO data structure - for allocating objects with LIFO storage duration is the first and the most natural thing that comes to mind.
Local variables are typically addressed by their offset from the beginning of the currently active stack frame. The compiler knows the exact offset of each local variable at compile time. The compiler generates the code that will allocate the stack frame for the current function by: 1) saving the current position of the stack pointer when the function is entered (let's say it is saved in register R1) and 2) moving the current stack pointer by the amount necessary to store all local variables of the function. Once the stack frame is allocated in this fashion, your local variables l_iLocalVar1, l_iLocalVar2 and l_iSumLocalVar will simply be accessed through addresses R1 + 6, R1 + 10 and R1 + 14 (I used arbitrary offsets). In other words, local variables are not accessed by specific address values, since these addresses are not known at compile time. Instead, local variables are accessed through calculated addresses. They are calculated as some run-time base address value + some compile-time offset value.
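The base-plus-offset idea can be imitated in plain C. This is only a sketch under assumptions: a hypothetical struct frame stands in for the layout the compiler would choose, and a byte pointer named base plays the role of register R1. The names struct frame, load_int, store_int and frame_demo are made up for illustration; a real compiler emits this arithmetic directly in machine code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the frame layout chosen at compile time. */
struct frame {
    int l_iLocalVar1;
    int l_iLocalVar2;
    int l_iSumLocalVar;
};

/* Read the int stored at run-time base + compile-time offset. */
static int load_int(const unsigned char *base, size_t offset)
{
    int value;
    memcpy(&value, base + offset, sizeof value);
    return value;
}

/* Write an int at run-time base + compile-time offset. */
static void store_int(unsigned char *base, size_t offset, int value)
{
    memcpy(base + offset, &value, sizeof value);
}

int frame_demo(void)
{
    struct frame f;                              /* the "allocated" frame */
    unsigned char *base = (unsigned char *)&f;   /* plays the role of R1 */

    /* The first member of a struct always sits at offset 0. */
    assert(offsetof(struct frame, l_iLocalVar1) == 0);

    store_int(base, offsetof(struct frame, l_iLocalVar1), 10);
    store_int(base, offsetof(struct frame, l_iLocalVar2), 20);
    store_int(base, offsetof(struct frame, l_iSumLocalVar),
              load_int(base, offsetof(struct frame, l_iLocalVar1)) +
              load_int(base, offsetof(struct frame, l_iLocalVar2)));

    return f.l_iSumLocalVar;
}
```

Calling frame_demo() returns 30. The value of base differs from call to call, while the offsets are constants fixed when the code is compiled.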

Normally the system calling convention reserves a register to be used as a "stack pointer". Local variable accesses are made relative to this register's value. Since every function must know how much stack space it uses, the compiler emits code to ensure the stack pointer is adjusted correctly for each function's requirements.
The scope of local variables is only enforced by the compiler, since it's a language construct, not anything to do with hardware. You can pass addresses of stack variables to other functions and they'll work correctly.
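A minimal sketch of that last point, with hypothetical function names: the callee cannot refer to the caller's variable by name, because scope is a compile-time rule, yet it can read and write the variable through its address.

```c
#include <assert.h>
#include <stddef.h>

/* The callee has no way to name the caller's variable, but it can use its address. */
static void set_to_42(int *target)
{
    assert(target != NULL);
    *target = 42;
}

int caller_demo(void)
{
    int local = 0;        /* visible by name only inside caller_demo */
    set_to_42(&local);    /* still reachable through its address */
    return local;
}
```

caller_demo() returns 42. This is safe only while the caller's frame is still alive, that is, until caller_demo returns.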

Why is the stack used for local variables?
Well, the stack is an easy-to-use structure for reserving space for temporary variables. It has the benefit that the space is released almost automatically when the function returns. An alternative would be to allocate memory from the OS (or a heap allocator) on every call, but that would be slower and could cause heavy memory fragmentation.
The stack can be easily allocated as well as freed again, so it is a natural choice.
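To illustrate why stack allocation is so cheap, here is a toy model. It is a sketch under assumptions, not how any real runtime is written: stack_area, frame_enter, frame_leave, frame_data and stack_used are hypothetical names, and this toy stack grows upward for simplicity.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_BYTES 256

static unsigned char stack_area[STACK_BYTES];
static size_t stack_top = 0;   /* grows upward here, purely for simplicity */

/* Reserve a frame: remember the old top, then move it. Returns the old top. */
size_t frame_enter(size_t bytes)
{
    size_t mark = stack_top;
    assert(bytes <= STACK_BYTES - stack_top);   /* no overflow handling in this sketch */
    stack_top += bytes;
    return mark;
}

/* Release a frame, and everything above it, with a single assignment. */
void frame_leave(size_t mark)
{
    assert(mark <= stack_top);
    stack_top = mark;
}

/* Address of the first byte of the frame that starts at mark. */
unsigned char *frame_data(size_t mark)
{
    assert(mark < STACK_BYTES);
    return &stack_area[mark];
}

/* Number of bytes currently reserved. */
size_t stack_used(void)
{
    return stack_top;
}
```

Both operations are a single addition or assignment, and frames released in reverse order never leave holes, so there is nothing to fragment.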

All the variable addresses are relative to the stack pointer, which is adjusted at each function call and return. This is a fast and easy way to allocate and clean up the memory used by these variables.

Related

Is the stack offset assigned to local stack variables ever reused, e.g. in case it becomes dead or goes out of scope?

In other words, will compilers allocate enough space in the program stack to store all variables at the deepest level of block nesting in the current function or do they look at liveness and the scope of variables too?
void zoo(int num) {
    if (num) {
        int a = foo();
        bar(a);
    } else {
        int b = foo();
        bar(b);
    }
}
For example, a and b in the code above may be assigned different offsets on the stack, even though assigning both the same offset (e.g. rbp - 8) would have been legal too. My question is whether compilers like gcc and clang will ever output assembly in which multiple variables are assigned the same static offset.
Is there anything in the specifications about this?
I want to know whether there is a unique mapping between source variables and the stack offsets present in a compiled assembly file.
There is, in general, no unique mapping between objects with automatic storage duration (“local” objects defined inside a function or block) and stack offsets. I have seen compiler-generated code reuse the same stack location for different objects, either because the use of one did not overlap the use of the other in the C code or because the compiler had moved one into a register for whatever purposes and no longer needed to use the stack location for it.
The C and C++ standards do not require implementations to implement their stack allocation in any particular way. They are free to reuse stack locations. They are also free to allocate all the stack space that might be needed [1] or to wait to see if particular blocks are entered or not before further allocating stack space for the objects inside those blocks.
Note
[1] Implementations that support variable-length arrays generally must wait until the size of the array can be determined before allocating space for it.
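One way to observe this on a particular compiler is to record the addresses of two variables whose lifetimes do not overlap. This is a sketch; the function name is hypothetical, and either outcome is permitted, so the result only describes the compiler and flags used.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if a and b were given the same address, 0 otherwise.
   Either answer is allowed; it depends on the compiler and optimization level. */
int sibling_blocks_share_slot(void)
{
    uintptr_t addr_a, addr_b;

    {
        int a = 1;
        addr_a = (uintptr_t)(void *)&a;   /* recorded while a is alive */
    }
    {
        int b = 2;
        addr_b = (uintptr_t)(void *)&b;   /* the lifetime of a has ended by now */
    }

    /* Expected to hold on mainstream platforms, where a valid object
       address does not convert to zero. */
    assert(addr_a != 0 && addr_b != 0);
    return addr_a == addr_b;
}
```

The addresses are stored as integers before each block ends, so no pointer to a dead object is ever used. Comparing the result across optimization levels is a quick experiment, but it says nothing about what other compilers do.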

Can we get the elements in the middle of "stack" directly?

Consider the following situation.
int main(void)
{
    ...
    int a=10; // a goes onto the stack.
    int b=20; // b goes onto the stack.
    int c=30; // c goes onto the stack.
    ...
}
As we know, the "stack segment" follows the storage discipline of the stack data structure, and the local variables a, b, and c are stored in memory in exactly that order, so in theory we can only access the element at the top of the stack.
But what if we do something like this?
printf("b = %d",b);
Local variable b is in the middle, between a and c, but we can still get it.
So... can we say that we can directly get an element in the middle of the stack?
(The original post included an image of a, b, and c stored on the stack.)
"the local variables a, b, and c are stored in memory in exactly that order"
I don't know from where you got this but this is not true, at least in modern compilers.
First of all, C itself doesn't specify anything about using a stack. How function calls are implemented is left to the implementation. Lots of common implementations use a stack-like data structure to implement function calls, in the sense that the last called function returns first.
But this doesn't mean that the local variables are stored in a stack-like structure. The compiler has lots of options, such as:
It can eliminate the variable completely if it is not needed at run time.
It can place the variables in registers.
It can reorder variables.
In all of these cases, the only thing the compiler guarantees is that the observable behavior of the code is not changed.
Since it doesn't store variables in a stack-like data structure, it has no problem accessing them in the middle.
First of all, the C standard never mentions a stack anywhere. Moving away from the standard, implementors may choose to store local variables in stack memory as part of a function's frame.
Now you are thinking: a stack is always accessed at the top, yet here we can access the variable b directly, even though it sits somewhere in the middle. That reasoning is wrong. It is the frames of the functions that are stored on the stack and popped off when done (an implementation can do it another way, but speaking in general). The variables are not the unit of operation here.
By accessing b we are not violating any rule of the stack data structure. The function frames are what is accessed in LIFO manner, not the variables inside those frames.
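The distinction can be modelled with an explicit stack of frames. This is a sketch with hypothetical names (struct frame, push_frame, pop_frame, read_middle_variable); it imitates the idea rather than any real calling convention.

```c
#include <assert.h>

#define MAX_FRAMES 8

/* Hypothetical frame holding the three locals from the question. */
struct frame {
    int a;
    int b;
    int c;
};

static struct frame frames[MAX_FRAMES];
static int depth = 0;                 /* number of live frames */

/* Push a whole frame: the frame is the LIFO unit. */
static struct frame *push_frame(void)
{
    assert(depth < MAX_FRAMES);
    return &frames[depth++];
}

/* Pop the most recently pushed frame. */
static void pop_frame(void)
{
    assert(depth > 0);
    depth--;
}

int read_middle_variable(void)
{
    struct frame *f = push_frame();
    int b;

    f->a = 10;
    f->b = 20;
    f->c = 30;
    b = f->b;       /* direct access by offset inside the top frame */
    pop_frame();    /* a, b and c disappear together */
    return b;
}
```

read_middle_variable() returns 20. Only push_frame and pop_frame follow the stack discipline; access to the members inside the top frame is ordinary random access.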
Also, it is a bit dated nowadays to segregate memory like that. We can simply say that these variables have automatic storage duration, and that's it. They can even be implemented in a group of registers (the standard won't stop that). The function frames are what is likely to show stack-like behavior.
Using the stack to store automatic variables is only an implementation detail; nothing is required by the standard. But at a lower level it is indeed the most common implementation. That is because processors have a special register (the stack pointer) that is used to store return addresses in function calls (the call and return instructions). When automatic variables are also stored on that stack, it is trivial to reclaim their storage at return time or at the end of the block. But they are not individually pushed onto a stack: a frame pointer marks the memory zone for the current frame (including a reference to the caller's frame), and the stack pointer is adjusted in one single operation by the size of the frame containing all the local variables. Those variables are then addressed by their offset from the current frame pointer. So they are known by their own addresses and not as elements of a stack.

Organization of Virtual Memory in C

For each of the following, where does it appear to be stored in memory, and in what order: global variables, local variables, static local variables, function parameters, global constants, local constants, the functions themselves (and is main a special case?), dynamically allocated variables.
How can I evaluate this experimentally, i.e., using C code?
I know that
global variables -- data
static variables -- data
constant data types -- code
local variables(declared and defined in functions) -- stack
variables declared and defined in main function -- stack
pointers(ex: char *arr,int *arr) -- data or stack
dynamically allocated space(using malloc,calloc) -- heap
You could write some code to create all of the above, and then print out their addresses. For example:
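#include <stdio.h>

void func(int a) {
    int i = 0;
    /* %p with a void * argument is the portable way to print an address */
    printf("local i address is %p\n", (void *) &i);
    printf("parameter a address is %p\n", (void *) &a);
}

int main(void) {
    func(1);
    /* converting a function pointer to void * is a common extension, not strict ISO C */
    printf("func address is %p\n", (void *) &func);
    return 0;
}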
Note that the function address is a bit tricky: you have to cast it to void *, and when you take the address of a function you omit the (). Compare the memory addresses and you will start to get a picture of where things are. Normally text (instructions) is at the bottom (closest to 0x0000), the heap is in the middle, and the stack starts at the top and grows down.
In theory
Pointers are no different from other variables as far as memory location is concerned.
Local variables and parameters might be allocated on the stack or directly in registers.
Constant strings will be stored in a special data section, but basically the same kind of location as data.
Numerical constants themselves will not be stored anywhere; they will be put into other variables or translated directly into CPU instructions.
For instance, int a = 5; will store the constant 5 into the variable a (the actual memory is tied to the variable, not the constant), but a *= 5 will generate the code necessary to multiply a by the constant 5.
main is just a function like any other as far as memory location is concerned. A local main variable is no different from any other local variable, main code is located somewhere in code section like any other function, argc and argv are just parameters like any others (they are provided by the startup code that calls the main), etc.
code generation
Now if you want to see where the compiler and runtime put all these things, a possibility is to write a small program that defines a few of each, and ask the compiler to produce an assembly listing. You will then see how each element is stored.
For heap data, you will see calls to malloc, which is responsible for interfacing with the dynamic memory allocator.
For stack data, you will see strange-looking references relative to the stack pointer or frame pointer (the esp and ebp registers on 32-bit x86), which are used both for parameters and for (automatic) local variables.
For global/static data, you will see labels named after your variables.
Constant strings will probably be labelled with an awful name, but you will notice they all go into a read-only data section (usually named .rodata with gcc, or .rdata with MSVC) that will be linked next to data. The bss section, by contrast, holds zero-initialized and uninitialized static data.
runtime addresses
Alternatively, you can run this program and ask it to print the addresses of each element. This, however, will not show you the register usage.
If you take the address of a variable, you will force the compiler to put it into memory, while it could otherwise have kept it in a register.
Note also that the memory organization is compiler and system dependent. The same code compiled with gcc and MSVC may have completely different addresses and elements in a completely different order.
The optimizer is likely to do strange things too, so I advise compiling your sample code with all optimizations disabled first.
Looking at what the compiler does to gain size and/or speed might be interesting though.
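A small program along those lines might look like the following sketch. The names g_initialized, g_uninitialized, g_text and print_layout are hypothetical, and no expected addresses are given because they vary by compiler, system and run.

```c
#include <stdio.h>
#include <stdlib.h>

int g_initialized = 1;            /* global with an initializer */
int g_uninitialized;              /* global without one */
const char *g_text = "literal";   /* the pointer is data; it points at a string literal */

/* Prints one address per kind of storage.
   Returns 0 on success, -1 if malloc fails. */
int print_layout(int param)
{
    static int s_local = 0;
    int local = 0;
    int *heap = malloc(sizeof *heap);

    if (heap == NULL)
        return -1;

    printf("initialized global    %p\n", (void *)&g_initialized);
    printf("uninitialized global  %p\n", (void *)&g_uninitialized);
    printf("static local          %p\n", (void *)&s_local);
    printf("string literal        %p\n", (void *)g_text);
    printf("parameter             %p\n", (void *)&param);
    printf("local                 %p\n", (void *)&local);
    printf("heap block            %p\n", (void *)heap);

    free(heap);
    return 0;
}
```

Calling print_layout(7) from main and sorting the printed addresses gives a rough picture of the layout on that system. Because taking an address forces a variable into memory, the output shows where things go when they must have an address, not what the compiler would do otherwise.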

How do pointers work "under the hood" in C?

Take a simple program like this:
int main(void)
{
    char p;
    char *q;
    q = &p;
    return 0;
}
How is &p determined? Does the compiler calculate all such references before-hand or is it done at runtime? If at runtime, is there some table of variables or something where it looks these things up? Does the OS keep track of them and it just asks the OS?
My question may not even make sense in the context of the correct explanation, so feel free to set me straight.
How is &p determined? Does the compiler calculate all such references before-hand or is it done at runtime?
This is an implementation detail of the compiler. Different compilers can choose different techniques depending on the kind of operating system they are generating code for and the whims of the compiler writer.
Let me describe for you how this is typically done on a modern operating system like Windows.
When the process starts up, the operating system gives the process a virtual address space, of, let's say 2GB. Of that 2GB, a 1MB section of it is set aside as "the stack" for the main thread. The stack is a region of memory where everything "below" the current stack pointer is "in use", and everything in that 1MB section "above" it is "free". How the operating system chooses which 1MB chunk of virtual address space is the stack is an implementation detail of Windows.
(Aside: whether the free space is at the "top" or "bottom" of the stack, whether the "valid" space grows "up" or "down" is also an implementation detail. Different operating systems on different chips do it differently. Let's suppose the stack grows from high addresses to low addresses.)
The operating system ensures that when main is invoked, the register ESP contains the address of the dividing line between the valid and free portions of the stack.
(Aside: again, whether the ESP is the address of the first valid point or the first free point is an implementation detail.)
The compiler generates code for main that moves the stack pointer by, let's say, five bytes, by subtracting from it if the stack is growing "down". It decreases by five because it needs one byte for p and four for q. So the stack pointer changes; there are now five more "valid" bytes and five fewer "free" bytes.
Let's say that q is the memory that is now in ESP through ESP+3 and p is the memory now in ESP+4. To assign the address of p to q, the compiler generates code that copies the four byte value ESP+4 into the locations ESP through ESP+3.
(Aside: Note that it is highly likely that the compiler lays out the stack so that everything that has its address taken is on an ESP+offset value that is divisible by four. Some chips have requirements that addresses be divisible by pointer size. Again, this is an implementation detail.)
If you do not understand the difference between an address used as a value and an address used as a storage location, figure that out. Without understanding that key difference you will not be successful in C.
That's one way it could work but like I said, different compilers can choose to do it differently as they see fit.
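The difference between an address used as a value and an address used as a storage location can be seen by extending the program from the question slightly. This is a sketch; address_demo is a hypothetical name, and the initial value 'a' is added only so there is something to read.

```c
#include <assert.h>

char address_demo(void)
{
    char p = 'a';
    char *q;

    q = &p;              /* the address of p, used as a value and stored in q */
    assert(*q == 'a');   /* the value of q, used as a location to read from */
    *q = 'z';            /* ... and as a location to write to */
    return p;
}
```

address_demo() returns 'z': writing through q changes p, because the value held in q is the location of p.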
The compiler cannot know the full address of p at compile time, because a function can be called multiple times by different callers, and p can then be at a different address each time.
Of course, the compiler has to know how to calculate the address of p at run-time, not only for the address-of operator, but simply in order to generate code that works with the p variable. On a regular architecture, local variables like p are allocated on the stack, i.e. in a position with fixed offset relative to the address of the current stack frame.
Thus, the line q = &p simply stores into q (another local variable allocated on the stack) the address p has in the current stack frame.
Note that in general, what the compiler does or doesn't know is implementation-dependent. For example, an optimizing compiler might very well optimize away your entire main after analyzing that its actions have no observable effect. The above is written under the assumption of a mainstream architecture and compiler, and a non-static function (other than main) that may be invoked by multiple callers.
This is actually an extraordinarily difficult question to answer in full generality because it's massively complicated by virtual memory, address space layout randomization and relocation.
The short answer is that the compiler basically deals in terms of offsets from some “base”, which is decided by the runtime loader when you execute your program. Your variables, p and q, will appear very close to the “bottom” of the stack (although the stack base is usually very high in VM and it grows “down”).
The address of a local variable cannot be completely calculated at compile time. Local variables are typically allocated on the stack. When called, each function allocates a stack frame: a single contiguous block of memory in which it stores all its local variables. The location of the stack frame in memory cannot be predicted at compile time. It only becomes known at run time. The beginning of each stack frame is typically stored at run time in a dedicated processor register, like ebp on x86.
Meanwhile, the internal memory layout of a stack frame is pre-determined by the compiler at compile-time, i.e. it is the compiler who decides how local variables will be laid out inside the stack frame. This means that the compiler knows the local offset of each local variable inside the stack frame.
Put this all together and we get that the exact absolute address of a local variable is the sum of the address of the stack frame itself (the run-time component) and the offset of this variable inside that frame (the compile-time component).
This is basically exactly what the compiled code for
q = &p;
will do. It will take the current value of the stack frame register, add some compile-time constant to it (offset of p) and store the result in q.
In any function, the function arguments and the local variables are allocated on the stack, after the return address (the position in the calling function at the point where it calls the current function). How these variables get allocated on the stack and then deallocated when returning from the function is arranged by the compiler at compile time.
For example, in this case p (1 byte) could be allocated first on the stack, followed by q (4 bytes on a 32-bit architecture). The code assigns the address of p to q. The address of p is then some small offset, such as 5, added to or subtracted from the last value of the stack pointer. Something like that; it depends on how the value of the stack pointer is updated and whether the stack grows upwards or downwards.
The return value is typically passed back to the calling function through a register rather than the stack (eax on 32-bit x86, for example). So when return is executed, the underlying assembly code deallocates p and q, places zero into that register, then returns to the position after the call in the calling function. Of course, in this case it is the main function, so it is more complicated in that returning eventually causes the process to terminate. But in other cases, it just goes back to the calling function.
In C89 (ANSI C), declarations must appear at the start of a block, before any statements; C99 and C++ allow them anywhere in a block. Conceptually, a local variable comes into existence when its block is entered and goes away when the block is left. In practice, most compilers reserve stack space for all of a function's locals once, on entry to the function, and release it on return.
In typical cases, the address of a local variable is a fixed number added to or subtracted from the stack pointer (as calculated by the compiler), and the size of the variable is determined by its type.
However, static local variables and global variables are different in C. These are allocated in fixed locations in the memory, and thus there's a fixed address for them (or a fixed offset relative to the process' boundary), which is calculated by the linker.
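A short sketch of the contrast, using hypothetical function names: a static local is one object at one address for the whole run, so its address and its value are the same on every call.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the address of a static local; it is the same object on every call. */
int *static_slot(void)
{
    static int slot = 0;
    return &slot;
}

/* Increments the static object through its address, showing that the
   value persists across calls. */
int bump_static(void)
{
    int *p = static_slot();
    assert(p != NULL);
    return ++*p;
}
```

Two calls to static_slot() return equal pointers, and successive calls to bump_static() return 1, then 2. Unlike with a local variable, returning the address of slot is safe, because the object outlives the call.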
Yet a third variety is memory allocated on the heap using malloc/new and free/delete. I think this discussion would be too lengthy if we include that as well.
That said, my description is only for a typical hardware architecture and OS. All of these are also dependent on a wide variety of things, as mentioned by Emmet.
p is a variable with automatic storage. It lives only as long as the function it is in is executing. Every time its function is called, memory for it is taken from the stack; therefore its address can change and is not known until run time.
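Recursion makes this visible: while several invocations of a function are active, each has its own instance of the local variable at its own address. The function below is a hypothetical sketch that checks this against the immediate caller at every level.

```c
#include <assert.h>
#include <stddef.h>

/* Recurses depth more times, checking that each invocation's local is a
   different object from its caller's. Returns 1 if every check passed. */
int locals_are_distinct(int depth, const int *callers_local)
{
    int local = depth;     /* a fresh instance for every invocation */

    assert(depth >= 0);
    if (callers_local != NULL && callers_local == &local)
        return 0;          /* would mean two live objects share an address */
    if (depth == 0)
        return 1;
    return locals_are_distinct(depth - 1, &local);
}
```

locals_are_distinct(5, NULL) returns 1. Two objects whose lifetimes overlap must have different addresses, so no single compile-time address could serve for local.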

Why are C variables stored in specific memory locations?

Yesterday I had an interview where the interviewer asked me about the storage classes where variables are stored.
My answer was:
Local variables are stored on the stack.
Register variables are stored in registers.
Global and static variables are stored in the data segment.
Dynamically allocated memory is stored on the heap.
The next question he asked me was: why are they stored in those specific memory areas? Why is a local variable not stored in a register (even though I need an auto variable that is used very frequently in my program)? Or why are global or static variables not stored on the stack?
At that point I was clueless. Please help me.
Because the storage area determines the scope and the lifetime of the variables.
You choose a storage specification depending on your requirements, i.e.:
Lifetime: how long you expect the particular variable to be alive and valid.
Scope: the areas of the program where you expect the variable to be accessible.
In short, each storage area provides different functionality, and you need various kinds of functionality, hence the different storage areas.
The C language does not define where any variables are stored, actually. It does, however, define storage durations: static, automatic, and allocated (often called dynamic). C11 adds a fourth, thread storage duration.
Static variables are created during program initialization (prior to main()) and remain in existence until program termination. File-scope ('global') and static variables fall under the category. While these commonly are stored in the data segment, the C standard does not require this to be the case, and in some cases (eg, C interpreters) they may be stored in other locations, such as the heap.
Automatic variables are local variables declared in a function body. They are created when or before program flow reaches their declaration, and destroyed when they go out of scope; new instances of these variables are created for recursive function invocations. A stack is a convenient way to implement these variables, but again, it is not required. You could implement automatics in the heap as well, if you chose, and they're commonly placed in registers as well. In many cases, an automatic variable will move between the stack and registers during its lifetime.
Note that the register annotation for automatic variables is a hint - the compiler is not obligated to do anything with it, and indeed many modern compilers ignore it completely.
Finally, dynamic objects (there is no such thing as a dynamic variable in C) refer to values created explicitly using malloc, calloc or other similar allocation functions. They come into existence when explicitly created, and are destroyed when explicitly freed. A heap is a convenient place to put these - or rather, one defines a heap based on the ability to do this style of allocation. But again, the compiler implementation is free to do whatever it wants. If the compiler can perform static analysis to determine the lifetime of a dynamic object, it might be able to move it to the data segment or stack (however, few C compilers do this sort of 'escape analysis').
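The three lifetimes described above can be contrasted in a few lines. This is a minimal sketch with hypothetical function names; it shows only how long each object exists, not where an implementation puts it.

```c
#include <stdlib.h>

/* Static storage duration: one instance for the whole program run. */
int next_id(void)
{
    static int id = 0;
    return ++id;
}

/* Automatic storage duration: a new instance on every call. */
int fresh_count(void)
{
    int n = 0;
    return ++n;
}

/* Allocated storage: the object lives until free() is called, so it can
   outlive the function that created it. Returns NULL if malloc fails. */
int *make_boxed_int(int value)
{
    int *box = malloc(sizeof *box);
    if (box != NULL)
        *box = value;
    return box;
}
```

next_id() returns 1, then 2, while fresh_count() returns 1 every time. The pointer returned by make_boxed_int stays valid after the function returns, until the caller passes it to free().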
The key takeaway here is that the C language standard only defines how long a given value is in existence for. And a minimum bound for this lifetime at that - it may remain longer than is required. Exactly how to place this in memory is a subject in which the language and library implementation is given significant freedom.
It is actually just an implementation detail that is convenient.
The compiler could, if it wanted to, create local variables on the heap.
It is just easier to create them on the stack, since when leaving a function you can adjust the stack pointer with a simple add or subtract, depending on the growth direction of the stack, and so automatically free the used space for the next function. Creating locals on the heap, however, would mean more housekeeping work.
Another point is that local variables need not be created on the stack; they can be stored and used just in a register if the compiler thinks that is more appropriate and has enough registers to do so.
Local variables are stored in registers in most cases. Because registers are pushed onto and popped from the stack when you make function calls, it looks like they are on the stack.
There is actually no such thing as register variables; register is just a rarely used keyword in C that asks the compiler to try to keep a variable in a register. I think most compilers just ignore this keyword.
That is why he asked you more: he was not sure whether you understand the topic deeply. The fact is that register variables are virtually on the stack.
In embedded systems we have different types of memory to use: read-only non-volatile (ROM, PROM), read-write non-volatile (EEPROM, NVRAM, flash), and volatile (RAM). We also have different requirements for our data: some must not change and must persist across power cycles, some can change and must persist across power cycles, and some can change at any time. We have different sections because we have to map the requirements of our data onto the available types of memory optimally.

Resources