Function calls, the stack - c

I learned that when you call a function and pass an argument to it, the argument is handled on the stack (by the processor?), but I don't understand why.
Can someone please explain it?
And how does the function then change the values of variables, blocks of memory and so on?

There is no guarantee that parameters are passed on the stack, it's architecture and compiler dependent.
As to how values and memory get changed -- when you call a function that must make changes that are seen by the caller, it's normal that what you provide is not the actual value, but rather the address of (pointer to) that value. As long as the function knows the proper memory location it can make these changes.
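As a minimal sketch of the difference (the function names here are made up for illustration), compare a function that receives a copy of the value with one that receives its address:

```c
#include <assert.h>

/* Receives a copy of the caller's value; the caller's variable is untouched. */
void try_to_change(int value)
{
    value = 42;
    (void)value;
}

/* Receives the address of the caller's variable and writes through it. */
void change_through_pointer(int *value)
{
    *value = 42;
}

/* Hypothetical helper: returns the caller's variable after both calls. */
int demo_pointer_argument(void)
{
    int x = 1;
    try_to_change(x);
    assert(x == 1);              /* only the copy was changed, not x */
    change_through_pointer(&x);
    return x;                    /* now 42: the callee wrote to x's address */
}
```

This holds wherever the argument itself travels (stack or register), because the function only needs the memory location to write to.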

The stack is used in most cases to pass arguments to a function. The reason for using it is that your function does not depend on fixed memory locations for its arguments. If a function took its arguments from fixed memory, you could only run it when that memory was free, and you could only run one instance of it at a time. The stack lets you store the arguments in the current context of your program at any time. On x86 processors there is one register that points to the top of the stack (ESP/RSP, the most recently pushed item) and another that conventionally points to the base of the current function's stack frame (EBP/RBP). Those are actually just addresses in main memory where the stack resides.
There is a PUSH instruction that moves the stack pointer register to the next free slot (on x86 the stack grows toward lower addresses, so the register is decremented) and stores the specified data (a value from another register, a value at some address, or an immediate value) at the address the register now points to. The other instruction is POP, which does the reverse: it reads the value at the top of the stack and moves the register back. This way, if you stick to the plan and keep track of what you pushed onto the stack, your functions can work from any context.
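The behaviour of PUSH and POP can be imitated in C. This is only a toy model, assuming a small array in place of main memory and an index in place of the stack pointer register; the names memory, sp, push and pop are invented for the sketch:

```c
#include <assert.h>
#include <stddef.h>

#define STACK_WORDS 16

static int memory[STACK_WORDS];   /* stands in for main memory */
static size_t sp = STACK_WORDS;   /* index of the newest item; STACK_WORDS when empty */

/* PUSH: move the stack pointer down one slot, then store the value there. */
void push(int value)
{
    assert(sp > 0);               /* no room left would be a stack overflow */
    memory[--sp] = value;
}

/* POP: read the value at the stack pointer, then move the pointer back up. */
int pop(void)
{
    assert(sp < STACK_WORDS);     /* popping an empty stack is a bug */
    return memory[sp++];
}
```

Pushing 1 and then 2 and popping twice gives 2 and then 1, which is the last-in, first-out order that lets nested calls clean up after themselves.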
There are other options for passing arguments, such as registers (used, for example, by BIOS interrupts and by most 64-bit calling conventions for the first few arguments). If you want to know more about this, I suggest you read up on "calling conventions".

Let's start with this: suppose you have a function
int foo(int value) {
    int a = 10;
    return a;
}
So whenever a function call is made, the program needs some memory for the function's local variables (int a in this case) and for the arguments passed to it (int value in this case). This requirement is met by allocating memory on the stack. The stack is a memory region that the OS sets up for each thread of a process, and it behaves like the stack data structure (LIFO).
Now the question arises: what is stored on the stack when a function call is made? In a typical stack-based calling convention:
1. First, the arguments passed to the function are pushed, in reverse order (if more than one).
2. Then the return address in the function which called this function is pushed (because once foo completes execution, it should return to the place in the code from where it was called).
3. Finally, the local variables of the called function are pushed on the stack.
Once the called function completes executing the code it returns back to the return address previously stored on the stack and thus we say function call completes or returns.
In this case the function has a return value, which it passes back to the calling function.
The space is then free to use and can be overwritten in the subsequent function calls.
(Now if you connect the dots, you can see why local variables (automatic variables) in a function have a lifetime limited to the function call (you asked an SO question related to scope which was closed): once a call returns, the memory allocated for these local variables is released (the bytes are still there, but you must not access them once the function has returned). So the life of the automatic variable int a in this case lasts only until foo() returns to its caller.)
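A small sketch of the point that every call gets its own storage for its automatic variables (the function names are invented for illustration). Both functions declare a local called a, yet while both calls are active the two objects must live at different addresses:

```c
#include <assert.h>

/* Returns 1 if this call's local lives at a different address than the
   caller's local, which must hold while both variables are alive. */
static int inner(const int *callers_local)
{
    int a = 20;
    return &a != callers_local;
    /* Returning &a itself would hand the caller a dangling pointer,
       because a's lifetime ends when inner returns. */
}

int each_call_has_its_own_locals(void)
{
    int a = 10;
    int distinct = inner(&a);
    assert(a == 10);   /* inner's a never touched ours */
    return distinct;
}
```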
Side Note:: I have read many questions that you have posted in SO. I guess you are trying to learn C and basic working of the underlying hardware and OS in general and the confusion in between them is killing you.
Apart from the answer to this question, I would suggest some pointers to read and understand; they will give you a lot of insight into the questions you are facing.
For C, refer to K&R; it is the best book.
To begin with, read a little about OS concepts (memory handling, virtual memory in particular).
Try to imagine the working of a system in a broad sense, as in how the different components interact.
Some good links for understanding memory related stuff and system internals http://duartes.org/gustavo/blog/best-of
and if you want to dive into stack space for a function call try this link http://www.binarypirates.in/2011/02/17/understanding-function-stack-in-c/
Hope this helps

Related

Can I find how much stack memory is currently used in a Mac thread?

When I try to google this, all I find is stuff about getting and setting the stack limit, such as -[NSThread stackSize], but that's NOT what I want. I want to know how much memory is actually in use on the stack in the current thread, or equivalently how much stack space remains available.
I'm hoping to figure out a stack overflow in a crash report submitted by a user. In my previous experience, a stack overflow has usually been caused by an infinite recursion, but not this time. So I'm wondering if some of my C++ functions are really using a heck of a lot more stack space than they should.
A comment suggested that I get the stack pointer at the start of the thread, and compare its value later. I happened across the question Print out value of stack pointer. It has several answers:
1. (The accepted answer) Take the address of a local variable.
2. Use a little assembly language to get the value of the stack pointer register.
3. Use the function __builtin_frame_address(0) in GCC or Clang.
I tried those techniques (Apple Clang, macOS 11.2). Methods 2 and 3 produced similar results, but method 1 produced absurdly different results. For one thing, method 1 gives values that increase as you go deeper into a call chain, while the others give values that decrease. What's up with this, are there two different kinds of stacks?
If you are trying to do that, I guess you want to know how much memory you are using, in order to estimate the optimum number of threads of some kind that you can create.
The answer is not easy, as you normally don't have access to the stack pointer. But I'll try to devise a solution that does not require access to the stack pointer, though it does require one global variable per thread.
The idea is to force a parameter to be on the stack. Even if the ABI on your system uses registers to pass parameters, taking the address of a parameter (the actual parameter variable) forces the compiler to give it a memory location in the current stack frame. You save that address at the start of the thread, and later call a second function that takes a parameter of the same type (its value doesn't matter, only its address) and compares that address with the saved one:
#include <stddef.h>

static char *initial_stack_pseudo_addr;

void save_initial_stack(char dumb)
{
    /* the & operator forces dumb to be implemented in the stack */
    initial_stack_pseudo_addr = &dumb;
}

size_t how_much_stack(char dumb)
{
    /* assumes the stack grows toward lower addresses */
    return (size_t)(initial_stack_pseudo_addr - &dumb);
}
So when you start the thread, you call save_initial_stack(0);. When you want to know how much stack you have consumed, you can do the following:
size_t stack_size = how_much_stack(0);
printf("at this point I have used %zu bytes of stack\n", stack_size);
Basically, what you have done is calculate how many bytes lie between the address of the parameter in the call to save_initial_stack() and the address of the parameter in the call you make now to get the stack size. This is approximate, but the stack changes too quickly to get a precise figure.
The following example will illustrate the thing. A recursive function is called after setting the initial pointer value, then at each recursive call the current size of the stack (approximate) is computed and printed, and a new recursive call is made. The program should run until the process gets a stack overflow.
#include <stdio.h>

char *stack_at_start;

void save_stack_pointer(char dumb)
{
    stack_at_start = &dumb;
}

size_t get_stack_size(char dumb)
{
    /* assumes the stack grows toward lower addresses */
    return (size_t)(stack_at_start - &dumb);
}

void recursive(void)
{
    printf("Stack size: %zu\n", get_stack_size(0));
    recursive();   /* deliberately unbounded: runs until the stack overflows */
}

int main(void)
{
    save_stack_pointer(0);
    recursive();
}
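The subtraction above gives a sensible number only if the stack grows toward lower addresses. A possible variation, sketched here with invented names (address_distance, record_stack_base, approx_stack_used), stores the starting address as an integer and takes the absolute difference, so the direction of growth no longer matters. It is still an approximation, and an optimizing compiler may inline the helpers and shift the numbers slightly:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uintptr_t stack_base;   /* address recorded near the start of the thread */

/* Absolute distance in bytes between two addresses, whichever is larger. */
size_t address_distance(uintptr_t x, uintptr_t y)
{
    return (size_t)(x > y ? x - y : y - x);
}

/* Call once, as early as possible in the thread. */
void record_stack_base(void)
{
    char marker;
    stack_base = (uintptr_t)&marker;
}

/* Approximate number of stack bytes in use at the point of the call. */
size_t approx_stack_used(void)
{
    char marker;
    return address_distance(stack_base, (uintptr_t)&marker);
}
```

A thread would call record_stack_base() once at its start and approx_stack_used() wherever a reading is wanted; for several threads, stack_base would have to be thread-local.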

How does the stack work in recursion?

I don't understand how the stack works in recursion. During recursion, the parameters of the function get pushed on the stack, and the return address is also pushed on the stack. Are the return address and the parameters pushed on the same stack, or does the return address get pushed on another stack?
Either way is possible. The results are equivalent. The local variables of the function also need space which is managed like a stack, and this again can be the same stack or a different one. This is a possible way to implement a C function call:
1. Push the function parameters to the parameter stack.
2. Push the desired return address (generally the address of the instruction after the branch) to the return address stack.
3. Branch to the function's address.
4. Push the initial values of the function's local variables to the local variable stack.
And a corresponding way to return from the call:
1. Pop the local variables from the local variable stack.
2. Pop from the return address stack.
3. Branch to the address that was just read.
4. Pop the parameters from the parameter stack.
If instead of having three separate stacks, there's just one, the procedure above still works. Note that it works because the steps are ordered correctly: with a single stack, you need to do the pops in reverse order from the pushes, whereas with multiple stacks, the order only needs to be consistent within each stack.
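To see why recursion needs nothing more than a stack, here is a rough sketch that computes a factorial with a hand-managed stack of frames instead of real recursive calls. The struct frame type, MAX_DEPTH and factorial_explicit are invented for the example, and each simulated frame holds only the parameter of a pending call:

```c
#include <assert.h>

#define MAX_DEPTH 32

/* One simulated stack frame: the parameter of a pending call. */
struct frame {
    unsigned n;
};

/* Computes n! the way a recursive call chain would, using an explicit stack.
   The result wraps around once it no longer fits in unsigned long. */
unsigned long factorial_explicit(unsigned n)
{
    struct frame stack[MAX_DEPTH];
    int top = 0;                 /* number of frames currently on the stack */
    unsigned long result = 1;

    assert(n < MAX_DEPTH);       /* deeper would overflow our small stack */

    /* "Call" phase: push one frame per nested call until the base case. */
    while (n > 1) {
        stack[top++].n = n;
        n--;
    }

    /* "Return" phase: pop frames in reverse order, combining the results. */
    while (top > 0) {
        result *= stack[--top].n;
    }
    return result;
}
```

The frames are popped in the opposite order from the pushes, which is the ordering requirement described above for a single stack.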
In practice, most platforms use a single stack for everything, because it makes memory management easier. Before calling the function, the code creates a stack frame by pushing both the parameters and the return address to the single stack. Pushing the parameters before the return address is typically easier because it uses a relative addressing mode to obtain the return address:
push parameter_1
push parameter_2
…
push program_counter + 2
branch my_function
; first instruction after returning
The first thing the code of the function does is to extend the stack frame to make room for its local variables. Concretely, “extend the stack frame” typically means adding the needed space to the register that points to the top of the stack. Then, at the end of the function, the code loads the return address into a register, subtracts the length of the stack frame from the stack pointer, and branches to the return address.
There are a lot of possible variations and practical complications. The exact way to call a function is called a calling convention. Most platforms define a calling convention so that code compiled with one compiler can call a function compiled with another compiler. The calling convention can be different for functions with different prototypes: for example, often, some arguments are passed in registers, and the layout of the stack frame may be different for variadic and non-variadic functions. However, some platforms support multiple calling conventions, which requires an additional non-standard annotation on function prototypes (such as __cdecl vs __stdcall on Windows).
One of the possible complications is a shadow stack. Most platforms use a single stack because it's easier to implement and there's less memory management overhead. However, a single stack has the downside that a bug in a function, such as a buffer overflow in an array that is stored on the stack, can easily cause it to overwrite the return address. A shadow stack is an extra copy of the return address in a return address stack which is separate from the main stack. When returning from a function, the code checks that the two copies of the return address are the same, and jumps to an error handler if they aren't. The reason to keep the return address in the main stack is for compatibility. It's the called function that pushes the return address to the return address stack, not the caller; that way the caller doesn't need to know whether the function that it's calling was compiled with shadow stack support or not. The called function gets the return address from the main stack.
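The check that a shadow stack performs can be modelled in a few lines. This is a simulation only, with two arrays standing in for the main stack and the shadow stack, plain integers standing in for return addresses, and invented function names (enter, corrupt_return_address, leave); real implementations do this in compiler-generated code or in hardware:

```c
#include <assert.h>
#include <stddef.h>

#define DEPTH 8

static unsigned long main_stack[DEPTH];     /* return addresses among other data */
static unsigned long shadow_stack[DEPTH];   /* only copies of return addresses */
static size_t main_top, shadow_top;

/* On function entry: the return address is stored on both stacks. */
void enter(unsigned long return_address)
{
    assert(main_top < DEPTH && shadow_top < DEPTH);
    main_stack[main_top++] = return_address;
    shadow_stack[shadow_top++] = return_address;
}

/* Simulates a buffer overflow that overwrites the saved return address. */
void corrupt_return_address(unsigned long bogus)
{
    assert(main_top > 0);
    main_stack[main_top - 1] = bogus;
}

/* On return: pop both copies; 1 means they match, 0 means tampering. */
int leave(void)
{
    assert(main_top > 0 && shadow_top > 0);
    main_top--;
    shadow_top--;
    return main_stack[main_top] == shadow_stack[shadow_top];
}
```

An undisturbed enter/leave pair reports a match, while a call whose saved address was overwritten in between is reported as a mismatch, which is where a real system would jump to its error handler.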

Examining local variables of a function that has returned

I have a coredump of a process that has crashed (hard to reproduce).
I have figured out that something goes wrong in a function that has just returned (it returned a NULL pointer rather than a non-NULL pointer).
It would be of great help for me to know the contents of the stack variables in that function. I think on most architectures, returning from a function just means changing the stack pointer. In other words, those values are still there (below the stack pointer then if we take x86 as an example).
Can anyone confirm my reasoning is correct and maybe provide an example of how to do this with gdb?
Does my reasoning also hold for MIPS ?
Local variables might have been stored on stack, but not necessarily. If there is only a small number of variables that fit into registers and code is optimized, then local variables were never saved on stack.
Depending on calling convention used, final values of local variables may still persist in registers.
Disassemble the function in question (you can use objdump -dS to do this, so you can easily correlate source). See how local variables were accessed. Were they stored in memory or in registers? Had the registers already been restored to the values relevant for the caller?
If original register value was not restored, you can just examine the register that was used to store local. If it was already restored, then it's probably lost.
If local values were stored on the stack, then the function prologue (its first instructions) should tell you how the stack and frame pointers were manipulated. Taking into account that on x86 the call itself also pushed the return address (the saved PC) onto the stack, you can calculate the value the stack/frame pointer had inside that function. Then use x to examine the memory locations.
Depending on called function, you could also be able to examine its arguments (when called) and recalculate the value of local variables.
You may see local variables that haven't been optimised out using:
info locals
It may not work for a function that has already returned, though. If you can run the program again, try putting a breakpoint just before the function returns.
Otherwise, you can manually inspect the stack using x/x, and use info registers to find the stack pointer address.
You may then browse the stack frames using up and down.

Does C destroy return variable before the closing brace?

Somebody told me that in C (and C++) the variable present in a return statement is destroyed before the closing brace of the function.
Ex -
int func() {
    int a = 10;
    return a; // I was told that a is destroyed here
}
Does it really happen that way? If yes, how does the function return value to the calling function?
My intuition tells me that the variable's value is pushed onto the stack as the return value, and when control goes back to the calling function, the top of the stack is popped, thereby getting the return value. Not sure if I'm correct.
Does C destroy return variable before the closing brace?
Yes ... sort of.
Local variables go out of scope at the end of a function, and after that, they cease to be accessible.
In C, that simply means that the storage for the variable itself becomes available for other uses. But there is no active "destruction" of the variable.
In C++, the variable's destructor (if there is one) will be invoked when the variable goes out of scope.
At an implementation level, the storage space for local variables is typically managed using a stack. But I don't think this is mandated by the respective language specifications.
It is also important to note that we are talking about variables, not values. In your example, the value of the variable is going to be returned to the caller (vide the return statement) and will continue to exist beyond the } ...
how does the function return value to the calling function?
Your intuition is partially right. On some architectures the value is stored on the stack and gets popped when returning to the caller; on many others it is returned in a register. But keep in mind that it is a value that is returned from a function, not the variable itself.
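A short sketch of the value-versus-variable distinction. The struct point type and the helper names other than func are made up for the example. The caller ends up with its own copy of the value, so it does not matter that the callee's variable is gone:

```c
#include <assert.h>

struct point { int x, y; };

int func(void)
{
    int a = 10;
    return a;            /* the value 10 is copied out; a itself ends here */
}

struct point make_point(void)
{
    struct point p = { 3, 4 };
    return p;            /* the whole struct is returned by value */
}

/* Hypothetical caller: keeps using the returned values after the callees ended. */
int demo_return_by_value(void)
{
    int r = func();
    struct point q = make_point();
    assert(q.x == 3 && q.y == 4);
    return r + q.x + q.y;
}
```

How the copy travels (in a register, on the stack, or through a hidden pointer for larger structs) is up to the calling convention; the C code is the same either way.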
C: How to Program: Ch-5: C Functions:
When a program calls a function, the called function must know how to return to its caller, so the return address of the calling function is pushed onto the program execution stack (sometimes referred to as the function call stack).
The program execution stack also contains the memory for the local variables used in each invocation of a function during a program’s execution. This data, stored as a portion of the program execution stack, is known as the activation record or stack frame of the function call. When a function call is made, the activation record for that function call is pushed onto the program execution stack. When the function returns to its caller, the activation record for this function call is popped off the stack and those local variables are no longer known to the program.
EDIT: As others mentioned in the comments, this is implementation specific, so I changed my mind.
For x86 wiki says:
Calling conventions describe the interface of called code:
1. The order in which atomic (scalar) parameters, or individual parts of a complex parameter, are allocated
2. How parameters are passed (pushed on the stack, placed in registers, or a mix of both)
3. Which registers the callee must preserve for the caller
4. How the task of preparing the stack for, and restoring after, a function call is divided between the caller and the callee
There are often subtle differences in how various compilers implement these conventions.

In C, are variables always pushed from their registers to the stack before they go out of scope?

After a function A calls a function B, can the code in B trash all the registers (aside from those that hold the stack pointers and B's parameters) without affecting variables local to A? Accordingly, after function B returns to function A, does function A pop all its locals back off the stack (reasoning that the register states might have changed while function B was executed)?
What about global variables? Does function B need to worry at all about any register operations affecting the state of global variables?
(The main reason I ask this, is that I feel like experimenting with injecting machine code at runtime as function B by using mprotect to make an array executable, and then casting the array pointer to function pointer and calling it. With the above questions I hope to figure out what the extent of B's playground is.)
This is determined by the calling convention, which is architecture, operating system, and compiler dependent.
Edit 0:
One more link for you: application binary interface. Drill down for your particular hardware/OS/compiler combination. You'll find what registers are used for parameters/return values, which are reserved for specific things, and which are free for any given function to clobber.
It's up to the functions how they handle calling other functions. It's normal to store all your local variables on the stack before branching to another function, but if you know for a fact that the other function only uses two specific registers, and you avoid using those two anywhere, then you wouldn't need to store anything (other than the address to branch back to afterwards, of course) on the stack before branching to that function.
It is really just a low level implementation design decision (which is usually decided by a compiler) so you might find that some functions will trust B with what's currently in the registers, while other functions won't.
