The stack frame of a caller function can be easily obtained via __builtin_frame_address(1), but what about the stack frame size?
Is there a function that will tell me how big the stack frame of the caller function is?
My first reaction would have been: why would anybody want this? It should be considered bad practice for a C function to dynamically determine the size of the stack frame. The whole point of cdecl (the classic C calling convention) is that the function itself (the 'callee') has no knowledge of the size of the stack frame. Any deviation from that philosophy may cause your code to break when switching over to a different platform, a different address size (e.g. from 32-bit to 64-bit), a different compiler or even different compiler settings (in particular optimizations).
On the other hand, since gcc already offers this function __builtin_frame_address, it will be interesting to see how much information can be derived from there.
From the documentation:
The frame address is normally the address of the first word pushed on to the stack by the function.
On x86, a function typically starts with:
push ebp ; bp for 16-bit, ebp for 32-bit, rbp for 64-bit
In other words, __builtin_frame_address returns the base pointer of the caller's stack frame.
Unfortunately, the base pointer says little or nothing about where any stack frame starts or ends;
the base pointer points to a location that is somewhere in the middle of the stack frame (between the parameters and the local variables).
If you are only interested in the part of the stack frame that holds the local variables, then the function itself has all the knowledge. The size of that part is the difference between the stack pointer and the base pointer.
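register char * const basepointer asm("ebp");   /* 32-bit x86 register name */
register char * const stackpointer asm("esp");  /* 32-bit x86 register name */
/* inside the function whose locals you want to measure: */
size_t size_localvars = basepointer - stackpointer;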
Please keep in mind that gcc seems to allocate space on the stack right from the beginning that is used to hold parameters for other functions called from inside the callee. Strictly speaking, that space belongs to the stack frames of those other functions, but the boundary is unclear. Whether this is a problem depends on your purpose: what are you going to do with the calculated stack frame size?
As for the other part (the parameters), that depends. If your function has a fixed number of parameters, then you could simply measure the size of the (formal) parameters. It does not guarantee that the caller actually pushed the same number of parameters on the stack, but assuming the caller compiled without warnings against callee's prototype, it should be OK.
void callee(int a, int b, int c, int d)
{
    /* assumes the parameters lie contiguously on the stack, a at the lowest address */
    size_t size_params = sizeof d + (char *)&d - (char *)&a;
}
You can combine the two techniques to get the full stack frame (including return address and saved base pointer):
register char * const stackpointer asm("esp");
void callee(int a, int b, int c, int d)
{
    /* from the end of the last parameter down to the current stack pointer */
    size_t total_size = sizeof d + (char *)&d - stackpointer;
}
If, however, your function has a variable number of parameters (an 'ellipsis', like printf has), then the size of the parameters is known only to the caller. Unless the callee has a way to derive the size and number of parameters (in case of a printf-style function, by analyzing the format string), you would have to let the caller pass that information on to the callee.
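As a minimal sketch of the last point, the caller can state the argument count explicitly and the callee can rely on it. The function name sum_ints and its count-first signature are assumptions made for illustration, not something from the question:

```c
#include <stdarg.h>

/* Hypothetical example: the caller tells the callee how many ints follow. */
int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;

    va_start(args, count);
    for (int i = 0; i < count; i++) {
        total += va_arg(args, int);  /* each extra argument is read as an int */
    }
    va_end(args);
    return total;
}
```

A call such as sum_ints(3, 1, 2, 3) only works because the caller passes the count truthfully; the callee has no independent way to check it.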
EDIT:
Please note, this only works to let a function measure its own stack frame. A callee cannot calculate its caller's stack frame size; callee will have to ask caller for that information.
However, callee can make an educated guess about the size of caller's local variables. This block starts where callee's parameters end (sizeof d + (char *)&d), and ends at caller's base pointer (__builtin_frame_address(1)). The start address may be slightly inaccurate due to address alignment imposed by the compiler; the calculated size may include a piece of unused stack space.
void callee(int a, int b, int c, int d)
{
    /* caller's base pointer minus the address just past callee's last parameter */
    size_t size_localvars_of_caller =
        (char *)__builtin_frame_address(1) - ((char *)&d + sizeof d);
}
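The idea that the callee has to ask the caller can be sketched by having the caller hand over its own frame address. This is only an illustration under assumptions: it relies on the GCC/Clang builtin __builtin_frame_address(0) and the noinline attribute, and the function names frame_gap and measure_gap are hypothetical.

```c
#include <stddef.h>

/* Callee: receives the caller's frame address and returns the distance,
   in bytes, between the caller's frame address and its own. */
__attribute__((noinline))
ptrdiff_t frame_gap(const void *caller_frame)
{
    const char *own_frame = __builtin_frame_address(0);
    return (const char *)caller_frame - own_frame;
}

/* Caller: passes its own frame address down to the callee. */
__attribute__((noinline))
ptrdiff_t measure_gap(void)
{
    /* the volatile store keeps the call from being turned into a tail call */
    volatile ptrdiff_t gap = frame_gap(__builtin_frame_address(0));
    return gap;
}
```

On a downward-growing stack the result is positive. The exact number depends on the compiler, the optimization level and the ABI, so it is a rough indication rather than a precise frame size.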
I have some questions about stack appearance while calling functions, and I have some small examples that will help explain my confusion:
1)
Let's say we have these two functions
int h(int k){
    k = k+3;
    return k;
}
int f(int x, int y){
    int q;
    int z = 10; //Checkpoint_1
    q = h(x); // Checkpoint_2
    return z;
}
First question: what will the stack look like when we reach the line marked Checkpoint_1? Will the stack hold all the local variables (x, y, q, z)? How would they be laid out inside the stack?
Second question:
What will the stack look like after we reach Checkpoint_2, enter the function h(x), and put x+3 in q? Will q change to x+3 in the stack frame, or will it stay the same q (having x+3 as value)?
Third question:
What registers will these two functions use? I know that there's a register that will have the return value for each function in each frame in the stack (I think it's called %eax), but my confusion is, let's say in function h(int k), will %eax have value of k or k+3, or say in function f(int x, int y) will %eax have z or 10.
I would really appreciate any help and tips regarding my questions.
I'll preface this by saying that the compiler is free to do whatever it wants in some capacity, but all modern compilers are going to do it very similarly. I'm also going to assume we're talking about x86-64, using the System V AMD64 calling convention that's common on Linux. It's different for 32-bit (where cdecl is typical), and it's different for Windows.
Modern compilers usually allocate all space for stack variables at the beginning of the function call, regardless of scope. Usually this is done simply by decreasing the value in rsp. The stack grows down, which means decreasing the value is the same as growing the stack, hence, an allocation. In this case, you have two 4 byte values initialized on the stack, so the stack will have 8 bytes allocated on it. The two variables x and y will be stored in %rdi and %rsi respectively on 64-bit. (on 32-bit, they're stored above the return address on the stack.)
After the call to h(x), yes, the value of q in memory will change to (x+3).
The compiler will use whatever registers it feels like using. In this case, it's actually very likely for everything to be in registers, and for the stack to not be used at all except for the return address for h. When function h returns, the value of k (which is now k+3) will be stored in %eax and returned. When function f returns, the value in z will be stored in %eax and returned. Registers have no concept of variable names (like z), they simply have values. It is the responsibility of compilers to give those values meaning by associating that register with the concept of z.
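A small sketch can confirm the behaviour described in the answers above without looking at the stack directly. The helper names q_after_call and x_after_call are hypothetical; h is the function from the question.

```c
/* h works on its own copy of the argument, named k. */
int h(int k)
{
    k = k + 3;
    return k;   /* the result travels back to the caller as the return value */
}

/* Hypothetical helper: returns what q holds after the call to h. */
int q_after_call(int x)
{
    int q = h(x);
    return q;
}

/* Hypothetical helper: returns x after the call, to show it is untouched. */
int x_after_call(int x)
{
    (void)h(x);
    return x;
}
```

To see which registers a particular compiler actually picks, compiling with gcc -S and reading the generated assembly is more reliable than reasoning from the C source.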
My question is if i have some function
void func1(){
char * s = "hello";
char * c;
int b;
c = (char *) malloc(15);
strcpy(c,s);
}
I think the s pointer is allocated on the stack, but where is the data "hello" stored? Does that go in the data segment of the program? As for c and b, they are uninitialized, and since 'c = some memory address' and it doesn't have one yet, how does that work? And b also has no contents, so it can't be stored on the stack?
Then when we allocate memory for c on the heap with malloc, c now has some memory address. How is this uninitialized c variable given the address of the first byte for that string on the heap?
We need to consider what memory location a variable has and what its contents are. Keep this in mind.
For an int, the variable has a memory address and has a number as its contents.
For a char pointer, the variable has a memory address and its contents is a pointer to a string--the actual string data is at another memory location.
To understand this, we need to consider two things:
(1) the memory layout of a program
(2) the memory layout of a function when it's been called
Program layout [typical]. Lower memory address to higher memory address:
code segment -- where instructions go:
...
machine instructions for func1
...
data segment -- where initialized global variables and constants go:
...
int myglobal_inited = 23;
...
"hello"
...
bss segment -- for uninitialized globals:
...
int myglobal_tbd;
...
heap segment -- where malloc data is stored (grows upward towards higher memory
addresses):
...
stack segment -- starts at top memory address and grows downward toward end
of heap
Now here's a stack frame for a function. It will be within the stack segment somewhere. Note, this is higher memory address to lower:
function arguments [if any]:
arg2
arg1
arg0
function's return address [where it will go when it returns]
function's stack/local variables:
char *s
char *c
int b
char buf[20]
Note that I've added a "buf". If we changed func1 to return a string pointer (e.g. "char *func1(arg0,arg1,arg2)") and we added "strcpy(buf,s)" or "strcpy(buf,c)", buf would be usable by func1. func1 could return either c or s, but not buf.
That's because with "s" the data is stored in the data segment and persists after func1 returns. Likewise, c can be returned because the data is in the heap segment.
But, buf would not work (e.g. return buf) because the data is stored in func1's stack frame and that is popped off the stack when func1 returns [meaning it would appear as garbage to caller]. In other words, data in the stack frame of a given function is available to it and any function that it may call [and so on ...]. But, this stack frame is not available to a caller of that function. That is, the stack frame data only "persists" for the lifetime of the called function.
Here's the fully adjusted sample program:
int myglobal_initialized = 23;
int myglobal_tbd;
char *
func1(int arg0,int arg1,int arg2)
{
char *s = "hello";
char *c;
int b;
char buf[20];
char *ret;
c = malloc(15);
strcpy(c,s);
strcpy(buf,s);
// ret can be c, s, but _not_ buf
ret = ...;
return ret;
}
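The three lifetimes can be sketched as three separate functions. The function names here are hypothetical, and the third one shows a common alternative to returning buf: letting the caller own the buffer.

```c
#include <stdlib.h>
#include <string.h>

/* The literal lives in static storage, so the pointer stays valid. */
const char *literal_string(void)
{
    return "hello";
}

/* The copy lives on the heap until the caller frees it. */
char *heap_copy(void)
{
    const char *s = "hello";
    char *c = malloc(strlen(s) + 1);
    if (c != NULL) {
        strcpy(c, s);
    }
    return c;   /* may be NULL if malloc failed */
}

/* Instead of returning a local array, write into storage the caller owns. */
void fill_caller_buffer(char *buf, size_t size)
{
    if (size == 0) {
        return;
    }
    strncpy(buf, "hello", size - 1);
    buf[size - 1] = '\0';   /* strncpy does not always terminate the string */
}
```

The caller of heap_copy is responsible for calling free on the result, while the buffer passed to fill_caller_buffer can itself be a local array in the caller's frame, since that frame outlives the call.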
Let's divide this answer in two points of view of the same stuff, because the standards only complicate understanding of this topic, but they're standards anyway :).
Subject common to both parts
void func1() {
char *s = "hello";
char *c;
int b;
c = (char*)malloc(15);
strcpy(c, s);
}
Part I: From a standardese point of view
According to the standards, there's this useful concept known as automatic storage duration, in which a variable's space is reserved automatically upon entering a given scope (with uninitialized values, a.k.a. garbage!), it may or may not be set/accessed during that scope, and the space is freed for future use when the scope ends. Note: In C++, this also involves construction and destruction of objects.
So, in your example, you have three automatic variables:
char *s, which gets initialized to whatever the address of "hello" happens to be.
char *c, which holds garbage until it's initialized by a later assignment.
int b, which holds garbage all of its lifetime.
BTW, how storage works with functions is unspecified by the standards.
Part II: From a real-world point of view
On any decent computer architecture you will find a data structure known as the stack. The stack's purpose is to hold space that can be used and recycled by automatic variables, as well as some space for some stuff needed for recursion/function calling, and can serve as a place to hold temporary values (for optimization purposes) if the compiler decides to.
The stack works in a PUSH/POP fashion and, on most architectures, grows downwards. Let me explain it a little better. Imagine an empty stack like this:
[Top of the Stack]
[Bottom of the Stack]
If you, for example, PUSH an int of value 5, you get:
[Top of the Stack]
5
[Bottom of the Stack]
Then, if you PUSH -2:
[Top of the Stack]
5
-2
[Bottom of the Stack]
And, if you POP, you retrieve -2, and the stack looks as before -2 was PUSHed.
The bottom of the stack is a barrier that moves upon PUSHing and POPing. On most architectures, the bottom of the stack is recorded by a processor register known as the stack pointer. Think of it as an unsigned char*. You can decrease it, increase it, do pointer arithmetic on it, etcetera. Everything with the sole purpose of doing black magic on the stack's contents.
Reserving (space for) automatic variables in the stack is done by decreasing the stack pointer (remember, the stack grows downwards), and releasing them is done by increasing it. Based on this, the previous theoretical PUSH -2 is shorthand for something like this in pseudo-assembly:
SUB %SP, $4 # Subtract sizeof(int) from the stack pointer
MOV $-2, (%SP) # Copy the value `-2` to the address pointed by the stack pointer
POP whereToPop is merely the inverse
MOV (%SP), whereToPop # Get the value
ADD %SP, $4 # Free the space
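The same PUSH and POP steps can be sketched in C with an array standing in for stack memory and a pointer standing in for the stack pointer. This is a toy model with hypothetical names (stack_memory, stack_push, stack_pop), not how the hardware stack is accessed from C:

```c
enum { STACK_SLOTS = 16 };

static int stack_memory[STACK_SLOTS];
/* An empty stack: the pointer sits one past the highest slot. */
static int *sp = stack_memory + STACK_SLOTS;

/* PUSH: move the pointer down, then store. Returns -1 if the stack is full. */
int stack_push(int value)
{
    if (sp == stack_memory) {
        return -1;
    }
    sp -= 1;        /* like SUB %SP, $4 */
    *sp = value;    /* like MOV value, (%SP) */
    return 0;
}

/* POP: load, then move the pointer back up. Returns -1 if the stack is empty. */
int stack_pop(int *where_to_pop)
{
    if (sp == stack_memory + STACK_SLOTS) {
        return -1;
    }
    *where_to_pop = *sp;   /* like MOV (%SP), whereToPop */
    sp += 1;               /* like ADD %SP, $4 */
    return 0;
}
```

Pushing 5 and then -2 and popping twice yields -2 first and 5 second, matching the pictures above.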
Now, compiling func1() may yield the following pseudo-assembly (Note: you are not expected to understand this at its fullest):
.rodata # Read-only data goes here!
.STR0 = "hello" # The string literal goes here
.text # Code goes here!
func1:
SUB %SP, $12 # sizeof(char*) + sizeof(char*) + sizeof(int)
LEA .STR0, (%SP) # Copy the address (LEA, load effective address) of `.STR0` (the string literal) into the first 4-byte space in the stack (a.k.a `char *s`)
PUSH $15 # Pass argument to `malloc()` (note: arguments are pushed last to first)
CALL malloc
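ADD %SP, $4 # The caller cleans up the stack/pops arguments
MOV %RV, 4(%SP) # Move the return value of `malloc()` (%RV) to the second 4-byte variable allocated (`4(%SP)`, a.k.a `char *c`)
PUSH (%SP) # Second argument to `strcpy()` (`char *s`)
PUSH 8(%SP) # First argument to `strcpy()` (`char *c`, now at `8(%SP)` because the previous PUSH moved %SP down by 4)
CALL strcpy
ADD %SP, $8 # The caller pops the two arguments to `strcpy()`
ADD %SP, $12 # Release the space reserved for `s`, `c` and `b`
RET # Return with no value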
I hope this has shed some light on the matter!
I disassembled a small program that asks the user for their name then outputs "Hello + [user's_name]"
This is the disassembled output:
Main function:
Say hello function:
I noticed that for the main() function, the ESP register is decremented by 0x10 and for the say_hello() function, the ESP register is decremented by 0x20. Why is this the case?
FYI: My processor is an 1.4 GHz Intel Core i5 and I'm running OSX
Original C code:
#include <stdio.h>
void say_hello (void);
int main (){
    printf("Enter your name\n");
    say_hello();
    return 0;
}
void say_hello (void) {
    char name[5];
    gets(name); //this is an unsafe function to use. It can overflow the stack buffer
    printf("Hello %s\n", name);
}
It allocates space on the stack for local variables. First BP is set to the current value of SP, then SP is decremented to make room for the local variables used by the function. As you can see, later [ss:rbp+???] is used to access parts of memory of this reserved space.
This is basically the same as PUSHing some dummy value a repeated number of times onto the stack.
Before the function leaves, it is crucial that the exact amount is added back to SP, otherwise a wrong return address will be used by the RET instruction, and the program will most likely crash.
The stack is "implemented" by means of the stack pointer, which points into the stack segment. Every time something is pushed on the stack (by means of pushl, call, or a similar stack opcode), the stack pointer is decremented (the stack grows downwards, i.e. towards smaller addresses) and the value is written to the address the stack pointer now points to. When you pop something off the stack (popl, ret), the value is read from the address the stack pointer points to and the stack pointer is then incremented.
For different function calls, we reserve space for local variables in the stack, so we decrement it and get the space. This is usually done using prologue and epilogue.
Prologue
A function prologue typically does the following actions if the architecture has a base pointer (also known as frame pointer) and a stack pointer (the following actions may not be applicable to those architectures that are missing a base pointer or stack pointer) :
Pushes the old base pointer onto the stack, such that it can be restored later (by getting the new base pointer value which is set in the next step and is always pointed to this location).
Assigns the value of stack pointer (which is pointed to the saved base pointer and the top of the old stack frame) into base pointer such that a new stack frame will be created on top of the old stack frame (i.e. the top of the old stack frame will become the base of the new stack frame).
Moves the stack pointer further by decreasing or increasing its value, depending on whether the stack grows down or up. On x86, the stack pointer is decreased to make room for variables (i.e. the function's local variables).
Epilogue
Function epilogue reverses the actions of the function prologue and returns control to the calling function. It typically does the following actions (this procedure may differ from one architecture to another):
Replaces the stack pointer with the current base (or frame) pointer, so the stack pointer is restored to its value before the prologue
Pops the base pointer off the stack, so it is restored to its value before the prologue
Returns to the calling function, by popping the previous frame's program counter off the stack and jumping to it
As far as I remember, such decrements are mostly used to "reserve" space on the stack or to guarantee memory alignment.
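The alignment part can be sketched as a rounding step: a compiler that keeps the stack aligned to 16 bytes would round the space it reserves up to the next multiple of 16. The helper name align_up is hypothetical, and the sketch does not claim to reproduce the exact 0x10 and 0x20 values from the question, which also depend on what else the compiler decides to keep in the frame.

```c
#include <stddef.h>

/* Round size up to the next multiple of alignment.
   Assumption: alignment is a power of two (e.g. 16). */
size_t align_up(size_t size, size_t alignment)
{
    return (size + alignment - 1) & ~(alignment - 1);
}
```

For instance, a 5-byte array such as name[5] cannot get a 5-byte reservation under a 16-byte alignment rule; align_up(5, 16) gives 16.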
What does it mean to align the stack?
I am trying to estimate the span of my program's stack range. My strategy was to assume that since the stack grows downwards, I can create a local variable in the current stack frame and then use its address as a reference.
#include <stdint.h>
#include <stdio.h>
#include <sys/resource.h>
int main()
{
    //Now we are in the main frame.
    //Define a local variable which would be lying in the top of the stack
    char a;
    //Now define another variable
    int b; //address should be lower assuming stack grows downwards
    //Now estimate the stack size by rlimit
    struct rlimit stack_size;
    getrlimit(RLIMIT_STACK,&stack_size);
    //A crude estimate would be stack goes from &a to &a - stack_size.rlim_cur
    printf("%p \n",(void *)&a);
    printf("%p \n",(void *)&b);
    //pointers are printed with %p; the lower bound is computed as an integer
    printf("stack spans from %p to %p",(void *)&a,(void *)((uintptr_t)&a - stack_size.rlim_cur));
    return 0;
}
Interestingly, when I use gdb to inspect the addresses of a and b, the address of b has a higher value than that of a. Also, the stack pointer always remains in the same place.
0xbfca65f4
0xbfca660f
Stack spans from 0xbfca65f4 to 0xbbca65f4.
ebx 0xb7faeff4 -1208291340
esp 0xbffff670 0xbffff670
Can anybody help me understand where I am going wrong?
Thanks in advance!
This approach mostly works; your mistake is just examining both a and b in the same call frame. There's no reason for the compiler to order automatic variables the way you expect on the stack; it's likely to choose their order for data locality or alignment purposes.
If you compare the address of one automatic object in main and another in a separate call frame (make sure it's not one that might get inlined into main!) then you should get results closer to what you expect.
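A minimal sketch of that suggestion, assuming GCC or Clang (for the noinline attribute) and using hypothetical function names, compares a local in one frame with a local in a deeper, non-inlined frame:

```c
#include <stdint.h>

/* Deeper frame: reports the address of one of its own locals. */
__attribute__((noinline))
void record_local_address(uintptr_t *out)
{
    volatile char marker = 0;
    *out = (uintptr_t)&marker;
}

/* Returns 1 if the callee's local sits at a lower address than ours. */
__attribute__((noinline))
int stack_grows_down(void)
{
    volatile char marker = 0;
    uintptr_t here = (uintptr_t)&marker;
    uintptr_t deeper = 0;

    record_local_address(&deeper);
    return deeper < here;
}
```

The addresses are converted to integers before comparison because comparing pointers to unrelated objects with < is not defined by the C standard. On common platforms with a downward-growing stack, stack_grows_down() returns 1.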
If the program counter points to the address of the next instruction to be executed, what do frame pointers do?
It's like a more stable version of the stack pointer
Storage for some local variables and parameters is generally allocated in stack frames that are automatically freed simply by popping the stack pointer back to its original level after a function call.
However, the stack pointer is frequently being adjusted in order to push arguments on to the stack for new call levels and at least once on entry to a method in order to allocate its own local variables. There are other more obscure reasons to adjust the stack pointer.
All of this adjusting complicates the use of offsets to get to the parameters, locals, and in some languages, intermediate lexical scopes. It is perhaps not too hard for the compiler to keep track but if the program is being debugged, then a debugger (human or program) must also keep track of the changing offset.
It is simpler, if technically an unnecessary overhead, to just allocate a register to point to the current frame. On x86 this is %ebp. On entry to a function it may have a fixed relationship to the stack pointer.
Besides debugging, this simplifies exception management and may even pay for itself by eliminating or optimizing some adjustments to the stack pointer.
You mentioned the program counter, so it's worth noting that generally the frame pointer is an entirely software construct, and not something that the hardware implements except to the extent that virtually every machine can do a register + offset addressing mode. Some machines like x86 do provide some hardware support in the form of addressing modes and macro instructions for creating and restoring frames. However, sometimes it is found that the core instructions are faster and the macro ops end up deprecated.
This isn't really a C question since it's totally dependent on the compiler.
However, stack frames are a useful way to think about the current function and its parent function. Typically a frame pointer points to a specific location on the stack (for the given stack depth) from which you can locate parameters that were passed in as well as local variables.
Here's an example, let's say you call a function which takes one argument and returns the sum of all numbers between 1 and that argument. The C code would be something like:
unsigned int x = sumOf (7);
: :
unsigned int sumOf (unsigned int n) {
unsigned int total = 0;
while (n > 0) {
total += n;
n--;
}
return total;
}
In order to call this function, the caller will push 7 onto the stack then call the subroutine. The function itself sets up the frame pointer and allocates space for local variables, so you may see the code:
mov r1,7 ; fixed value
push r1 ; push it for subroutine
call sumOf ; then call
retLoc: mov [x],r1 ; move return value to variable
: :
sumOf: mov fp,sp ; Set frame pointer to known location
sub sp,4 ; Allocate space for total.
: :
At that point (following the sub sp,4), you have the following stack area:
+--------+
| n(7) |
+--------+
| retLoc |
+--------+
fp -> | total |
+--------+
sp -> | |
+--------+
and you can see that you can find passed-in parameters by using addresses 'above' the frame pointer and local variables 'below' the frame pointer.
The function can access the passed in value (7) by using [fp+8], the contents of memory at fp+8 (each of those cells is four bytes in this example). It can also access its own local variable (total) with [fp-0], the contents of memory at fp-0. I've used the fp-0 nomenclature even though subtracting zero has no effect since other locals will have corresponding lower addresses like fp-4, fp-8 and so on.
As you move up and down the stack, the frame pointer also moves and it's typical that the previous frame pointer is pushed onto the stack before calling a function, to give easy recovery when leaving that function. But, whereas the stack pointer may move wildly while within a function, the frame pointer typically stays constant so you can always find your relevant variables.
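The frame-pointer addressing from the diagram can be emulated in C with an array of 4-byte cells. This is a toy model with hypothetical names (stack_mem, emulate_sum_of); it follows the diagram's convention that sp indexes the next free cell, so [fp+8] becomes index fp + 2 and [fp-0] becomes index fp.

```c
#include <stdint.h>

enum { STACK_WORDS = 16 };
static uint32_t stack_mem[STACK_WORDS];

/* Emulates the call sequence for sumOf(n) using fp-relative accesses only. */
uint32_t emulate_sum_of(uint32_t n)
{
    int sp = STACK_WORDS - 1;   /* index of the next free cell */
    int fp;

    stack_mem[sp--] = n;        /* push r1: the argument */
    stack_mem[sp--] = 0;        /* call: placeholder cell for retLoc */
    fp = sp;                    /* mov fp,sp */
    sp -= 1;                    /* sub sp,4: one cell reserved for total */

    stack_mem[fp] = 0;                       /* total = 0, at [fp-0] */
    while (stack_mem[fp + 2] > 0) {          /* n, at [fp+8] */
        stack_mem[fp] += stack_mem[fp + 2];  /* total += n */
        stack_mem[fp + 2]--;                 /* n-- */
    }
    return stack_mem[fp];
}
```

Whatever happens to sp after the frame is set up, the argument and the local stay reachable at fixed offsets from fp, which is the point the answer makes.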
Good discussion here, with examples and all.
In short: the FP points to a fixed spot within the function's frame on the stack (and does not change during function execution), so all passed-arguments and the function's local ("auto") variables can be accessed by offsets from the FP (while the SP can change during a function's execution, and the PC definitely does;-).
Usually the return address (but sometimes just past last argument, for example). The point is that the frame pointer is fixed during the life of a method while the stack pointer could move during execution.
This is very implementation dependent (and more a machine concept, not really a language concept).
Lifted from a comment you provided to another answer:
Woh... Stack Pointer?... is that synonymous to Program Counter?
Read about the call stack. Basically the call stack stores data local to a current method (local variables, parameters to the method and return address to the caller). The stack pointer points to the top of that structure which is where new space is allocated (by moving the stack pointer "higher").
The frame pointer points to an area of memory in the current frame (current local function), typically it points to the return address of the current local function.
Since no one has responded to this yet I'll give it a try. A frame pointer (if memory serves) is part of the stack along with the stack pointer. The stack is composed of stack frames (sometimes called activation records). The stack pointer points to the top of the stack while the frame pointer typically points to some fixed point in a frame structure, such as the location of the return address. There's a more detailed description along with a picture on Wikipedia.