I am trying to verify my understanding of the stack memory layout in C by compiling the following code and inspecting the addresses in gdb. I only record the least significant digits; the higher ones are the same. The outputs were generated using the command
print /u &a
Here is the simple test code:
#include <stdio.h>
void test(int a,int b)
{
int c = a;
int d = b;
printf("%d,%d\n",c,d);
}
int main()
{
int x = 1;
int y = 2;
test(x,y);
return 0;
}
If I look at the frame of the test function, I get the following results:
&b: 6808
&a: 6812
&c: 6824
&d: 6828
$rbp: 6832 (frame pointer).
I am confused. Shouldn't function parameters sit at higher memory addresses than the local variables? Can someone explain this in detail, please? Thanks.
Edit:
If I print the addresses like this:
printf("&a:%p,&b:%p\n",(void *)&a,(void *)&b);
printf("&c:%p,&d:%p\n",(void *)&c,(void *)&d);
I get
&a:0x7fff4737687c,&b:0x7fff47376878
&c:0x7fff47376888,&d:0x7fff4737688c
They turn out to be in b, a, c, d order. There is an 8-byte gap between the end of a and the beginning of c. I guess that is the return address?
In the flow of a function call, the arguments are allocated first and then the local variables.
Your concern is based on the presumption that the stack grows upwards, which is not necessarily the case.
Please follow the link below for more detail:
Does stack grow upward or downward?
This is the code:
#include <stdio.h>
#include <stdlib.h>
int main() {
int a = 10;
int b = 20;
//printf("\n&a value %p", &a);
int* x = &b;
x = x + 1;
*x = 5;
printf("\nb value %d", b);
printf("\na value %d", a);
}
I want to overwrite a through the address of b, to test overflow in C. But when I comment out the printf call that prints the address of a, I can't write 5 into a, whereas if I do print the address of a, I can write 5 into a.
Why?
Sorry for my English, and thank you.
The reason this occurred is that all normal compilers store objects with automatic storage duration (objects declared inside a block that are not static or extern) on a stack. Your compiler “pushed” a onto the stack, which means it wrote a to the memory location where the stack pointer was pointing and then decremented the pointer. (Decrementing the pointer adds to the stack, because the stack grows in the direction of decreasing memory addresses. Stacks can be oriented in the other direction, but the behavior you observed strongly suggests your system uses the common direction of growing downward.) Then your compiler pushed b onto the stack. So b ended up at a memory address just below a.
When you took the address of b and added one, that produced the memory address where a is. When you used that address to assign 5, that value was written to where a is.
None of this behavior is defined by the C standard. It is a consequence of the particular compiler you used and the switches you compiled with.
You probably compiled with little or no optimization. With optimization turned on, many compilers would simplify the code by removing unnecessary steps (essentially replacing them with shortcuts), so that 20 and 10 are not actually stored on the stack. A possible result with optimization is that “20” and “10” are printed, and your assignment to *x has no effect. However, the C standard does not say what the behavior must be when you use *x in this way, so the results are determined only by the particular compiler you are using, along with the input switches you give it.
After x = x + 1;, x contains an address that you do not own. By doing *x = 5; you are trying to write to a location that might not be accessible to you, which causes undefined behavior (UB). Nothing more can be reasoned about.
I want to ask how variables are stored in C.
To be more clear consider the following code:
int main() {
int a = 1, b;
b = a + 2;
return 0;
}
For example, in what memory does C store the names of the variables here?
E.g. if &a=0x12A7 (suppose) and &b=0x123B1, then how and where does C store the variable names? In which memory is the name a stored?
Variable names need not be stored at all! The compiler can get rid of them entirely. Imagine, if the compiler is quite clever, it can reduce your entire program to this:
int main(){
return 0;
}
Note that the effect of this program is exactly the same as your original, and now there are no variables at all! No need to name them now, is there?
Even if the variables in your code were actually used, their names are purely a convenient notation when you write the program, but aren't needed by the processor when it executes your code. As far as a microprocessor is concerned, a function like this:
int foo(int x, int y) {
int z = x + y;
return z * 2;
}
Might result in compiled code that does this, in some hypothetical simple instruction set architecture (ISA):
ADD # consumes top two values on stack (x and y), pushes result (z)
PUSH 2 # pushes 2 on stack
MULT # consumes top two values on stack (z and 2), pushes result
RET
The longer story is that variable names are sometimes stored for debugging purposes. For example if you're using GCC you can pass the -g option to emit a "symbol table" which contains things like variable names for debugging. But it isn't needed simply to run a program, and it isn't covered by the language standard--it's an implementation feature which differs by platform.
C doesn't store the names of variables. It's the compiler that stores variable names, in the compiler's symbol table.
This data structure is created and maintained by the compiler.
An example of a symbol table for the snippet
// Declare an external function
extern double bar(double x);
// Define a public function
double foo(int count)
{
double sum = 0.0;
// Sum all the values bar(1) to bar(count)
for (int i = 1; i <= count; i++)
sum += bar((double) i);
return sum;
}
may contain at least the following symbol:
OK, first off, if you are just getting your head on straight with C, this is where to start:
http://condor.cc.ku.edu/~grobe/intro-to-C.shtml
But that is more practical than your question. To answer that, we first ask why variables have addresses. The why here is the stack. For a program to operate, returns from calls must be directed to the appropriate buffer so that the pieces all fit together as designed. Now to what I believe was the original question, namely how the actual address is decided: to answer that, you would have to understand how the processor is implementing the heap.
https://en.wikipedia.org/wiki/Memory_management
"Since the precise location of the allocation is not known in advance, the memory is accessed indirectly, usually through a pointer reference. The specific algorithm used to organize the memory area and allocate and deallocate chunks is interlinked with the kernel..."
Which brings us back to the practical side of things with the abstraction to pointers:
https://en.wikipedia.org/wiki/C_dynamic_memory_allocation
Hope this gives you a slightly clearer picture of what's under the hood : )
Happy coding.
I wrote a program in C that has a dangling pointer.
#include<stdio.h>
int *func(void)
{
int num;
num = 100;
return &num;
}
int func1(void)
{
int x,y,z;
scanf("%d %d",&y,&z);
x=y+z;
return x;
}
int main(void)
{
int *a = func();
int b;
b = func1();
printf("%d\n",*a);
return 0;
}
I am getting the output as 100 even though the pointer is dangling.
I made a single change in the above function func1(). Instead of reading the values of y and z from standard input as in the program above, I now assign them directly in the code.
I redefined the func1() as follows:
int func1(void)
{
int x,y,z;
y=100;
z=100;
x=y+z;
return x;
}
Now the output is 200.
Can somebody please explain to me the reason for these two outputs?
Undefined behavior means anything can happen, including the program doing what you expect. Your stack variables weren't overwritten in this case.
void func3() {
int a=0, b=1, c=2;
}
If you include a call to func3() in between func1 and printf you'll get a different result.
EDIT: What actually happens on some platforms.
int *func(void)
{
int num;
num = 100;
return &num;
}
Let's assume, for simplicity, that the stack pointer is 10 before you call this function, and that the stack grows upwards.
When you call the function, the return address is pushed on stack (at position 10) and the stack pointer is incremented to 14 (yes, very simplified). The variable num is then created on stack at position 14, and the stack pointer is incremented to 18.
When you return, you return a pointer to address 14 - return address is popped from stack and the stack pointer is back to 10.
void func2() {
int y = 1;
}
Here, the same thing happens. The return address is pushed at position 10, y is created at position 14, you assign 1 to y (which writes to address 14), you return, and the stack pointer is back at position 10.
Now, your old int * returned from func points to address 14, and the last modification made to that address was func2's local variable assignment. So, you have a dangling pointer (nothing above position 10 in stack is valid) that points to a left-over value from the call to func2.
It's because of the way the memory gets allocated.
After calling func and returning a dangling pointer, the part of the stack where num was stored still has the value 100 (which is what you are seeing afterwards). We can reach that conclusion based on the observed behavior.
After the change, it looks like what happens is that the func1 call overwrites the memory location that a points to with the result of the addition inside func1 (the stack space previously used for func is reused now by func1), so that's why you see 200.
Of course, all of this is undefined behavior so while this might be a good philosophical question, answering it doesn't really buy you anything.
It's undefined behavior. It could work correctly on your computer right now, 20 minutes from now, might crash in an hour, etc. Once another object takes the same place on the stack as num, you will be doomed!
Dangling pointers (pointers to locations that have been disassociated) induce undefined behavior, i.e. anything can happen.
In particular, the memory locations get reused by chance in func1. The result depends on the stack layout, compiler optimization, architecture, calling conventions and stack security mechanisms.
With dangling pointers, the result of a program is undefined. It depends on how the stack and the registers are used. With different compilers, different compiler versions and different optimization settings, you'll get a different behavior.
Returning a pointer to a local variable yields undefined behaviour, which means that anything the program does (anything at all) is valid. If you are getting the expected result, that's just dumb luck.
Please study functions from basic C. Your concept is flawed...main should be
int main(void)
{
int *a = func();
int b;
b = func1();
printf("%d\n%d",*a,func1());
return 0;
}
This will output 100 200
How can I tell whether the stack is increasing or decreasing using a C program?
Right, in C usually variables in function scope are realized by means of a stack. But this model is not imposed by the C standard; a compiler could realize this any way it pleases. The word "stack" isn't even mentioned in the standard, let alone whether it grows up or down. You should never try to work with assumptions about that.
False dichotomy. There are plenty of options other than increasing or decreasing, one of which is that each function call performs the equivalent of malloc to obtain memory for the callee's automatic storage, calls the callee, and performs the equivalent of free after it returns. A more sophisticated version of this would allocate large runs of "stack" at a time and only allocate more when it's about to be exhausted.
I would call both of those very bad designs on modern machines with virtual memory, but they might make sense when implementing a multiprocess operating system on MMU-less microprocessors where reserving a range of memory for the stack in each process would waste a lot of address space.
How about:
int stack_direction(void *pointer_to_local)
{
int other_local;
return (&other_local > pointer_to_local) ? 1 : -1;
}
...
int local;
printf("direction: %i", stack_direction(&local));
So you're comparing the address of a variable at one location on the call stack with one at an outer location.
If you only want to know whether the stack has changed, you can keep the last object inserted into the stack, peek at the top of it, and compare the two.
EDIT
Read the comments. It doesn't seem to be possible to determine the stack direction using my method.
END EDIT
Declare an array variable on the stack and compare the addresses of consecutive elements.
#include <stdio.h>
#include <stdlib.h>
int
main(void)
{
char buf[16];
printf("&buf[0]: %x\n&buf[1]: %x\n", &buf[0], &buf[1]);
return 0;
}
The output is:
misha@misha-K42Jr:~/Desktop/stackoverflow$ ./a.out
&buf[0]: d1149980
&buf[1]: d1149981
So the stack is growing down, as expected.
You can also monitor the ESP register with inline assembly. ESP holds the address of the current top of the stack, so when something is pushed onto the stack ESP decreases, and when something is popped ESP increases. (Other instructions modify the stack too, for example function call and return.)
For example, this shows what happens to the stack when we compute a recursive function such as a Fibonacci number (Visual Studio):
#include <stdio.h>
int FibonacciNumber(int n) {
int stackpointer = 0;
__asm {
mov stackpointer, esp
}
printf("stack pointer: %i\n", stackpointer);
if (n < 2)
return n;
else
return FibonacciNumber(n-1) + FibonacciNumber(n-2);
}
int main () {
FibonacciNumber(10);
return 0;
}
Consider the following short C program:
#include <stdio.h>
void foo(int a, int b) {
printf("a = %p b = %p\n", (void *)&a, (void *)&b);
}
int main(void) {
foo(1, 2);
return 0;
}
OK, now I used gdb to examine this program. I got this output:
a = 0x7fff5fbff9ac b = 0x7fff5fbff9a8
and stopped execution after the output (in foo()). Then I examined 0x7fff5fbff9ac, and the content was:
1....correct
then 0x7fff5fbff9a8 and the content:
2...correct
Now I wanted to view the return address of the function and examined (a + 4 bytes) with:
x/g 0x7fff5fbff9b1 (an 8-byte address, therefore "g" (giant word))
and its content was:
(gdb) x/g 0x7fff5fbff9b1
0x7fff5fbff9b1: 0xd700007fff5fbff9
BUT: THIS IS NOT THE RETURN ADDRESS FROM MAIN! Where is my mistake?
There are a whole bunch of faulty assumptions in your question.
You're assuming that integer arguments are passed on the stack immediately above the return address (as they are in many--not all--x86 ABIs under the default calling conventions). If this were the case, then immediately following the call, your stack would look like this:
// stack frame of main( )
// ...
value of b
value of a
return address <--- stack pointer
However, your assumption is incorrect. You have compiled your code into a 64-bit executable (as evidenced by the size of the pointer you are printing). Per the OS X ABI, in a 64-bit Intel executable, the first few integer arguments are passed in registers, not on the stack. Thus, immediately following the call, the stack actually looks like this:
// stack frame of main( )
// ...
return address <--- stack pointer
Since you take the address of a and b, they will be written to the stack at some point before the call to printf( ) (unless the compiler is really clever, and realizes that it doesn't actually need to hand printf( ) valid pointers because it won't use the value pointed to, but that would be pretty evil as optimizations go), but you really don't know where they will be relative to the return address; in fact, because the 64-bit ABI provides a red zone, you don't even know whether they're above or below the stack pointer. Thus, at the time that you print out the address of a and b, your stack looks like this:
// stack frame of main( )
// ...
return address |
// ... |
// a and b are somewhere down here | <-- stack pointer points somewhere in here
// ... |
In general, the C language standard says nothing about stack layouts, or even that there needs to be a stack at all. You cannot get this sort of information in any portable fashion from C code.
Firstly, &a + 4 is 0x7FFF5FBFF9B0, so you are looking one byte offset from where you think you are.
Secondly, the saved frame pointer lies between a and the return address, and this is the value you are seeing.
What you are doing wrong is making a whole bunch of incorrect and random assumptions about the layout of the stack frame on your given platform. Where did you get that weird idea about "a + 4 bytes" location supposedly holding the return address?
If you really want to do this, get the documentation for your platform (or do some reverse engineering) to find out where and how exactly the return address is stored. Making random guesses and then asking other people why your random guesses do not produce results you for some reason expect is not exactly a productive way to do it.