Memory location of function argument - c

I'm preparing for my UNIX exam and there is a question about memory location of C variables.
Let's say we have code like this
char sth;
int some_function(int arg) {
int some_int;
// some code here
}
so I suppose that sth is located on the heap, some_int on the stack, but where is arg located?
Can somebody please explain how C variables are managed?
Thank you

Note that all of this is implementation-dependent. The C standard never mentions a stack or a heap. It only specifies the behavior expected of variables according to their storage class (static, extern, register, etc.).
That said, arg will usually be located in the stack frame created for the function call. Its scope is limited to the function, just like the scope of some_int.
By the way, sth is not on the heap; it has static (global) storage.

Everything here is totally platform dependent and really not about C the language, but about How My Compiler Does It.
sth has static (global) storage, so it's probably not on the heap, but rather in the global data segment. some_int is indeed in the local stack frame of some_function. The variable arg is populated within some_function, but where it lives is up to the compiler and what's usually known as the "calling convention": it may be allocated and cleaned up in the stack frame of the caller or the callee, and by the caller or the callee, depending on conventions, or it may be passed in a register and never go into memory at all.

arg will be located in the stack (for desktop platforms at least).
Read a document called "smashing the stack for fun and profit" and you will understand how the memory is managed in C.

sth is in the static memory, arg and some_int are in the stack. arg is copied ("pushed") to the stack when some_function is called. The heap is dynamic memory and contains data allocated at the run time (using malloc for example).

Arguments are often passed on the stack, although many ABIs (x86-64 and ARM, for example) pass the first few in registers. Taking the address of arg forces the compiler to give it a memory location anyway.
You should be able to check this on your computer architecture by doing:
printf("%p - %p\n", (void *)&arg, (void *)&some_int);
They should normally be within a few bytes of each other.
Edit: As others have noted, sth is not allocated on the heap, but in the program's data segment, i.e. the compiler has already reserved the memory at compile time.

sth is probably in the BSS segment (the name historically stands for "Block Started by Symbol"), depending on platform.
Again, this is entirely platform-dependent, but there are generally four regions ("segments") from which variable space can be allocated:
a) Heap: your language's runtime manages "heap" data, e.g. through calls to malloc() or new
b) Stack: this is where "automatic" variables live
c) BSS: uninitialized (zero-filled) static data
d) Data: initialized static data (read-only data often gets a section of its own)
http://en.wikipedia.org/wiki/Data_segment

It depends on the implementation; arguments may be pushed onto the stack frame, or they may be written to registers, or they may be passed by some other mechanism.
The language definition does not mandate where various objects should be stored; it only mandates how those objects should behave.

Related

Is the stack offset assigned to local stack variables ever reused, e.g. in case it becomes dead or goes out of scope?

In other words, will compilers allocate enough space in the program stack to store all variables at the deepest level of block nesting in the current function or do they look at liveness and the scope of variables too?
void zoo(int num) {
if (num) {
int a = foo();
bar(a);
} else {
int b = foo();
bar(b);
}
}
For example, the code above might be assigned different offsets on the stack for a and b, even though assigning them a single offset (e.g. rbp - 8) would have been legal too. My question is: will compilers like gcc and clang ever output assembly where multiple variables are assigned the same static offset?
Is there anything in the specifications about this?
I want to know if there is a unique mapping between source variables and the stack offsets present in a compiled assembly file.
There is, in general, no unique mapping between objects with automatic storage duration (“local” objects defined inside a function or block) and stack offsets. I have seen compiler-generated code reuse the same stack location for different objects, either because the use of one did not overlap the use of the other in the C code or because the compiler had moved one into a register for whatever purposes and no longer needed to use the stack location for it.
The C and C++ standards do not require implementations to implement their stack allocation in any particular way. They are free to reuse stack locations. They are also free to allocate all the stack space that might be needed [1] or to wait to see if particular blocks are entered or not before further allocating stack space for the objects inside those blocks.
Note
[1] Implementations that support variable-length arrays generally must wait until the size of the array can be determined before allocating space for it.

Can we get the elements in the middle of "stack" directly?

Think of the situation as follow.
int main(void)
{
...
int a=10; // a goes onto the stack.
int b=20; // b goes onto the stack.
int c=30; // c goes onto the stack.
...
}
As we know, the "stack segment" follows the storage discipline of the stack data structure; and here, the local variables a, b, and c are exactly stored in such a direction of memory, so in theory we can only access the element at the top of the stack.
But what if we do something like this?
printf("b = %d",b);
Local variable b is in the middle of a and c, but we can get it.
So...can we say that we can directly get the element in the middle of the stack?
[Image: a, b, and c stored in the stack]
the local variables a, b, and c are exactly stored in such a direction of memory
I don't know from where you got this but this is not true, at least in modern compilers.
First of all, C itself doesn't specify anything about using a stack. How function calls are implemented is left to the implementation. Many common implementations use a stack-like data structure for function calls, in the sense that the most recently called function returns first.
But this doesn't mean that local variables are stored in a stack-like structure. The compiler has many options, such as:
It can eliminate the variable completely if it is not needed at run time.
It can place the variables in registers.
It can reorder variables.
In all of these cases, the only thing the compiler guarantees is that the observable behavior of the code is unchanged.
Since it doesn't store variables in a stack-like data structure, it has no problem accessing them in the middle.
First of all, the C standard never mentions a stack anywhere. Moving beyond the standard, implementors may choose to store local variables in stack memory as part of the function's frame.
Now you are thinking: a stack is always accessed at the top, but here we can access the variable b directly even though it is somewhere in the middle. That reasoning is wrong. It is the function frames that are stored on the stack and popped off when done (an implementation could do it another way, but speaking in general). The variables are not the unit of operation here.
By accessing b we are not violating any rule of the stack data structure. The function frames are accessed in LIFO order, not the variables inside those frames.
Also, it's a bit dated nowadays to segregate storage like that. We can simply say that these variables have automatic storage duration, and that's it. They could also be kept in a group of registers (the standard won't stop that). The function frames are what will most likely show stack-like behavior.
Using the stack to store automatic variables is only an implementation detail; nothing in the standard requires it. But at a lower level it is indeed the most common implementation. The reason is that processors use a special register (the stack pointer) to track where return addresses are stored during function calls (the call and return instructions). When automatic variables are also stored on that stack, it is trivial to reclaim their storage at return time or at the end of the block. But they are not individually pushed onto a stack: a frame pointer marks the memory zone for the current function (including a reference to the caller's frame), and the stack pointer is adjusted in one single operation by the size of the frame containing all the local variables. Those variables are then located by their offset from the current frame pointer. So they are known by their own address and not as elements of a stack.

Are activation records created on stack or heap in C?

I am reading about memory allocation and activation records, and I have some doubts. Can anyone make the following crystal clear?
A) My first doubt is: are activation records created on the stack or the heap in C?
B) These are a few lines from an abstract I am referring to:
Even though memory on stack area is created during run time- the
amount of memory (activation record size) is determined at compile
time. Static and global memory area is compile time determined and
this is part of the binary. At run time, we cannot change this. Only
memory area freely available for the process to change during runtime
is heap.At compile time compiler only reserves the stack space for
activation record. This gets used (allocated on actual memory) only
during program run. Only DATA segment part of the program like static
variables, string literals etc. are allocated during compile time. For
heap area, how much memory to be allocated is also determined at run
time.
Can anyone please elaborate on these lines, as I am unable to understand them?
An explanation would be of great help to me.
As a quick answer, I don't even really know what an activation record is. The rest of the quote has very poor English and is quite misleading.
Honestly, the abstract talks in absolutes when, in reality, these things are not absolute at all. You do define a main stack at compile time, yes (though you can create many stacks at runtime as well).
Yes, when you want to allocate memory, you usually create a pointer to store that information, but where you place it is completely up to you. It can be on the stack, it can be in global memory, it can be in the heap from another allocation, or you can just leak memory and not store it anywhere at all if you wish. Perhaps this is what is meant by an activation record?
Or perhaps it means that when dynamic memory is created, somewhere in memory there has to be some sort of information that keeps track of used and unused memory. For many allocators, this is a list of pointers stored somewhere in the allocated memory, though others store it in a different piece of memory and some could even place it on the stack. It all depends on the needs of the memory system.
Finally, where dynamic memory is allocated from can vary as well. It can come from a call to the OS, though in some cases it can also just be overlaid onto existing global (or even stack) memory, which is not uncommon in embedded programming.
As you can see, this abstract is not even close to what dynamic memory represents.
Additional info:
Many are jumping all over me stating that 'C' has no stack in the standard. Correct. That said, how many people have truly coded in C without one? I'll leave that alone for now.
Defined memory, as you call it, is anything declared with the 'static' keyword within a function or any variable declared outside of a function without the 'extern' keyword in front of it. This is memory that the compiler knows about and can reserve space for without any additional help.
Allocated memory is not a good term, as defined memory can also be considered allocated. Instead, use the term dynamic memory. This is memory that you allocate from a heap at run time. An example:
#include <stdlib.h>
char *foo;
int my_value;
int main(void)
{
foo = malloc(10 * sizeof(char));
// Do stuff with foo
free(foo);
return 0;
}
foo is "defined", as you say, as a pointer. If nothing else were done, only the memory for the pointer itself would be reserved, but once the malloc in main() is reached, it also points to at least 10 bytes of dynamic memory (assuming the allocation succeeds). Once the free is reached, that memory is made available to the program for other uses. Its allocated size is 'dynamic'. Compare that to my_value, which will always be the size of an int and nothing else.
In C (given how it is almost universally implemented*) an activation record is exactly the same thing as a stack frame, which is the same thing as a call frame. They are always created on the stack.
The stack segment is a memory area the process gets "for free" from the OS when it is created. It does not need to malloc or free it. On x86, a machine register (e.g. RSP) points to the current top of the stack, and stack frames/activation records/call frames are "allocated" by decrementing the pointer in that register by the number of bytes to allocate. E.g.:
int my_func() {
int x = 123;
int y = 234;
int z = 345;
...
return 1;
}
An unoptimizing C compiler could generate assembly code for keeping those three variables in the stack frame like this:
my_func:
; "allocate" 24 bytes of stack space
sub rsp, 24
; Initialize the allocated stack memory
mov [rsp], 345 ; z = 345
mov [rsp+8], 234 ; y = 234
mov [rsp+16], 123 ; x = 123
...
; "free" the allocated stack space
add rsp, 24
; return 1
mov rax, 1
ret
In other contexts and languages activation records can be implemented differently. For example using linked lists. But as the language is C and the context is low-level programming I don't think it is useful to discuss that.
In theory, a C99 (or C11) compatible implementation (e.g. a C compiler and C standard library implementation) does not even need (in all cases) a call stack. For example, one could imagine a whole-program compiler (notably for a freestanding C implementation) that analyzes the entire program and decides that stack frames are unneeded (e.g. each local variable could be allocated statically, or fit in a register). Or one could imagine an implementation allocating the call frames as continuation frames (perhaps after a CPS transformation by the compiler) elsewhere (e.g. in some "heap"), using techniques similar to those described in Appel's old book Compiling with Continuations (describing an SML/NJ compiler).
(Remember that a programming language is a specification, not a piece of software, often written in English, perhaps with additional formalization, in some technical report or standard document. AFAIK, the C99 and C11 standards do not even mention a stack or an activation record. But in practice, most C implementations are made of a compiler and a standard library implementation.)
In practice, activation records are call frames (for C, they are synonyms; things are more complex with nested functions) and are allocated on a hardware-assisted call stack on all reasonable C implementations I know. On z/Architecture there is no hardware stack pointer register, so the stack is a convention (dedicating some register to play the role of the stack pointer).
So look first at the call stack Wikipedia page. It has a nice picture worth many words.
Are activation records created on stack or heap
In practice, they (activation records) are call frames on the call stack (allocated following calling conventions and ABIs). Of course the layout, slot usage, and size of a call frame is computed at compile-time by the compiler.
In practice, a local variable may correspond to some slot inside the call frame. But sometimes, the compiler would keep it only in a register, or reuse the same slot (which has a fixed offset in the call frame) for various usages, e.g. for several local variables in different blocks, etc.
But most C compilers are optimizing compilers. They are able to inline a function, or sometimes make a tail call to it (then the caller's call frame is reused as or overwritten by the callee call frame), so details are more complex.
See also the question "How was C ported to architectures that had no hardware stack?" on Retrocomputing Stack Exchange.

Organization of Virtual Memory in C

For each of the following, where does it appear to be stored in memory, and in what order: global variables, local variables, static local variables, function parameters, global constants, local constants, the functions themselves (and is main a special case?), dynamically allocated variables.
How can I evaluate this experimentally, i.e., using C code?
I know that
global variables -- data
static variables -- data
constant data types -- code
local variables(declared and defined in functions) -- stack
variables declared and defined in main function -- stack
pointers(ex: char *arr,int *arr) -- data or stack
dynamically allocated space(using malloc,calloc) -- heap
You could write some code to create all of the above, and then print out their addresses. For example:
#include <stdio.h>
void func(int a) {
    int i = 0;
    printf("local i address is %p\n", (void *)&i);
    printf("parameter a address is %p\n", (void *)&a);
}
int main(void) {
    func(0);
    printf("func address is %p\n", (void *)&func); /* common extension, not strict ISO C */
    return 0;
}
Note that the function address is a bit tricky: you have to cast it to void * (a conversion most compilers accept, though ISO C does not guarantee it), and when you take the address of a function you omit the (). Use %p rather than %x to print pointers. Compare the memory addresses and you will start to get a picture of where things are. Typically the text (instructions) is at the bottom (closest to 0x0000), the heap is in the middle, and the stack starts at the top and grows down.
In theory
Pointers are no different from other variables as far as memory location is concerned.
Local variables and parameters might be allocated on the stack or directly in registers.
constant strings will be stored in a special data section, but basically the same kind of location as data.
numerical constants themselves will not be stored anywhere, they will be put into other variables or translated directly into CPU instructions.
for instance int a = 5; will store the constant 5 into the variable a (the actual memory is tied to the variable, not the constant), but a *= 5 will generate the code necessary to multiply a by the constant 5.
main is just a function like any other as far as memory location is concerned. A local main variable is no different from any other local variable, main code is located somewhere in code section like any other function, argc and argv are just parameters like any others (they are provided by the startup code that calls the main), etc.
code generation
Now if you want to see where the compiler and runtime put all these things, a possibility is to write a small program that defines a few of each, and ask the compiler to produce an assembly listing. You will then see how each element is stored.
For heap data, you will see calls to malloc, which is responsible for interfacing with the dynamic memory allocator.
For stack data, you will see strange references to the stack and frame pointers (the esp and ebp registers on 32-bit x86), which are used for both parameters and (automatic) local variables.
For global/static data, you will see labels named after your variables.
Constant strings will probably be labelled with an awful name, but you will notice they all go into a section (usually named .rodata or similar) that will be linked next to data.
runtime addresses
Alternatively, you can run this program and ask it to print the addresses of each element. This, however, will not show you the register usage.
If you take the address of a variable, you will force the compiler to put it in memory, whereas it could otherwise have kept it in a register.
Note also that the memory organization is compiler- and system-dependent. The same code compiled with gcc and MSVC may have completely different addresses and elements in a completely different order.
The optimizer is likely to do strange things too, so I advise compiling your sample code with all optimizations disabled first.
Looking at what the compiler does to gain size and/or speed might be interesting though.

Runtime Memory allocation on stack

I want to know about runtime memory allocation on the stack and how it differs from runtime memory allocation on the heap.
I know how memory gets allocated using library functions:
#include <alloca.h>
void *alloca(size_t size); // for runtime memory on the stack
#include <stdlib.h>
void *malloc(size_t size); // for runtime memory on the heap
I also know that with alloca we don't need to free the memory explicitly: because it lives on the stack, it gets freed automatically.
I want to know which system calls are associated with alloca and malloc, and how they work in each case.
In short, they usually don't use system calls unless they run out of available memory.
The behavior is different for each, so I'll explain them separately.
malloc
Let's say your program initially has 1MB (for example) of memory available for allocation. malloc is a (standard) library function that takes this 1MB, looks at the amount of memory you want to allocate, cuts a piece out of the 1MB and gives it to you. For bookkeeping, it keeps a linked list of unallocated blocks. The free function then adds the block being freed back to the free list, effectively freeing the memory (even though the OS still doesn't get any of it back, unless free decides that you are holding far too much memory and actually gives some back to the OS).
Only when you run out of your 1MB does malloc actually ask the operating system for more memory. The system call itself is platform dependent. You can take a look at this answer for example.
alloca
This is not a standard function, and it could be implemented in various ways, none of which probably ever makes a system call (unless the implementation is nice enough to increase your stack size, but you never know).
What alloca does (or, equivalently, what C99 variable length arrays (VLAs) do) is enlarge the stack frame of the current function by adjusting the relevant registers (for example esp on x86). Any variable that happens to be in the same stack frame but located after the variable length array (or alloca'ed memory) would then be addressed by ebp + size_of_vla + constant instead of the good old simple ebp + constant.
Since the stack pointer is restored to the frame of the previous function upon function return (or, for a VLA, on exit from its enclosing {} block), any stack memory allocated this way is released automatically.
The alloca() function is typically implemented by the compiler vendor, and doesn't have to be a "system call" at all.
Since all it needs to do is allocate space on the local stack frame, it can be implemented very simply and thus be incredibly fast when compared to malloc().
The Linux manual page for it says:
The inlined code often consists of a single instruction adjusting the stack pointer, and does not check for stack overflow.
Also, I'm not sure you realize that the memory gets deallocated "automatically" when the function that called alloca() exits. This is very important, you can't use alloca() to do long-lived allocations.
The alloca function is, according to its manpage, a function that is inlined and treated specially by the compiler (at least by gcc).
Its behavior is implementation-dependent, so it is best avoided, since you cannot guarantee that it will always work the same way.

Resources