How the stack is allocated for the process in linux - c

Can someone help me to understand the output of these program.
int* fun1();
void fun2();
int main()
{
int *p=fun1();
fun2();
printf("%d\n",*p);
return 0;
}
int* fun1()
{
int i=10;
return &i;
}
void fun2()
{
int a=100;
printf("%d\n",a);
}
It is 100 100 on windows and 100 10 on Linux. Windows output I am able to justify due to the fact that local variables are allocated on stack. but how come it is 100 10 in Linux.

Returning a pointer to a stack-allocated variable that went out of scope and using that pointer is undefined behavior, pure and simple.
But I'm guessing the answer "anything can happen" won't cut it for you.
What happens is that on *nix the memory isn't recycled so it's not overwritten yet, and on win it is. But that's just a guess, your best course of option is to use a debugger and walk through the assembler code.

Your problem relies on undefined behaviour [1], so anything can happen. You shouldn't even be expecting consistency on a given OS: factors such as changes to compiler options can alter the behaviour.
[1] fun1() returns the address of a variable on the stack, which is subsequently dereferenced.

Dangling pointer Problem , hence undefined behavior.

In a Linux (or other Operating System) process when a subroutine is called, the memory for local variables comes from stack area of the process. Any dynamically allocated memory (using malloc, new, etc.) comes from the heap area of the process. During recursion local memory is allocated from stack area during function call and get cleared when the function execution is done.
The memory is being represented with lowest address being at the bottom and highest being at the top. Here are the steps to find the direction of stack growth in recursion using a quick C code.
#include <stdio.h>
void test_stack_growth_direction(recursion_depth) {
int local_int1;
printf("%p\n", &local_int1);
if (recursion_depth < 10) {
test_stack_growth_direction(recursion_depth + 1);
}
}
main () {
test_stack_growth_direction(0);
}
out put on MAC
0x7fff6e9e19ac
0x7fff6f9e89a8
0x7fff6f9e8988
0x7fff6f9e8968
0x7fff6f9e8948
0x7fff6f9e8928
0x7fff6f9e8908
0x7fff6f9e88e8
0x7fff6f9e88c8
0x7fff6f9e88a8
0x7fff6f9e8888
output on ubuntu
0x7ffffeec790c
0x7ffffeec78dc
0x7ffffeec78ac
0x7ffffeec787c
0x7ffffeec784c
0x7ffffeec781c
0x7ffffeec77ec
0x7ffffeec77bc
0x7ffffeec778c
0x7ffffeec775c
0x7ffffeec772c
The stack is growing downwards on these specific setups as memory addresses are reducing. This depends on the architecture of the system and may have different behavior for other architectures. 0x7fff6f9e8868

Related

How come the following function works? [duplicate]

Consider the following code.
#include<stdio.h>
int *abc(); // this function returns a pointer of type int
int main()
{
int *ptr;
ptr = abc();
printf("%d", *ptr);
return 0;
}
int *abc()
{
int i = 45500, *p;
p = &i;
return p;
}
Output:
45500
I know according to link this type of behavior is undefined. But why i am getting correct value everytime i run the program.
Every time you call abc it "marks" a region at the top of the stack as the place where it will write all of its local variables. It does that by moving the pointer that indicates where the top of stack is. That region is called the stack frame. When the function returns, it indicates that it does not want to use that region anymore by moving the stack pointer to where it was originally. As a result, if you call other functions afterwards, they will reuse that region of the stack for their own purposes. But in your case, you haven't called any other functions yet. So that region of the stack is left in the same state.
All the above explain the behavior of your code. It is not necessary that all C compilers implement functions that way and therefore you should not rely on that behavior.
Well, undefined behavior is, undefined. You can never rely on UB (or on an output of a program invoking UB).
Maybe, just maybe in your environment and for your code, the memory location allocated for the local variable is not reclaimed by the OS and still accessible, but there's no guarantee that it will have the same behavior for any other platform.

Accessing array beyond it's size [duplicate]

A program accessing illegal pointer to pointer does not crash with SIGSEGV. This is not a good thing, but I’m wondering how this could be and how the process survived for many days in production. It is bewildering to me.
I have given this program a go in Windows, Linux, OpenVMS, and Mac OS and they have never complained.
#include <stdio.h>
#include <string.h>
void printx(void *rec) { // I know this should have been a **
char str[1000];
memcpy(str, rec, 1000);
printf("%*.s\n", 1000, str);
printf("Whoa..!! I have not crashed yet :-P");
}
int main(int argc, char **argv) {
void *x = 0; // you could also say void *x = (void *)10;
printx(&x);
}
I am not surprised by the lack of a memory fault. The program is not dereferencing an uninitialized pointer. Instead, it is copying and printing the contents of memory beginning at a pointer variable, and the 996 (or 992) bytes beyond it.
Since the pointer is a stack variable, it is printing memory near the top of stack for a ways down. That memory contains the stack frame of main(): possibly some saved register values, a count of program arguments, a pointer to the program arguments, a pointer to a list of environment variables, and a saved instruction register for main() to return, usually in the C runtime library startup code. In all implementations I have investigated, the stack frames below that has copies of the environment variables themselves, an array of pointers to them, and an array of pointers to the program arguments. In Unix environments (which you hint you are using) the program argument strings will be below that.
All of this memory is "safe" to print, except some non-printable characters will appear which might mess up a display terminal.
The chief potential problem is whether there is enough stack memory allocated and mapped to prevent a SIGSEGV during access. A segment fault could happen if there is too little environment data. Or if the implementation puts that data elsewhere so that there are only a few words of stack here. I suggest confirming that by cleaning out the environment variables and re-running the program.
This code would not be so harmless if any of the C runtime conventions are not true:
The architecture uses a stack
A local variable (void *x) is allocated on the stack
The stack grows toward lower numbered memory
Parameters are passed on the stack
Whether main() is called with arguments. (Some light duty environments, like embedded processors, invoke main() without parameters.)
In all mainstream modern implementations, all of these are generally true.
Illegal memory access is undefined behaviour. This means that your program might crash, but is not guaranteed to, because exact behaviour is undefined.
(A joke among developers, especially when facing coworkers that are careless about such things, is that "invoking undefined behaviour might format your hard drive, it's just not guaranteed to". ;-) )
Update: There's some hot discussion going on here. Yes, system developers should know what actually happens on a given system. But such knowledge is tied to the CPU, the operating system, the compiler etc., and generally of limited usefulness, because even if you make the code work, it would still be of very poor quality. That's why I limited my answer to the most important point, and the actual question asked ("why doesn't this crash"):
The code posted in the question does not have well-defined behaviour, but that does just mean that you can't really rely on what it does, not that it should crash.
If you dereference an invalid pointer, you are invoking undefined behaviour. Which means, the program can crash, it can work, it could cook some coffee, whatever.
When you have
int main(int argc, char **argv) {
void *x = 0; // you could also say void *x = (void *)10;
printx(&x);
}
You are declaring x as a pointer with value 0, and that pointer lives in the stack since it's a local variable. Now, you are passing to printx the address of x, which means that with
memcpy(str, rec, 1000);
you are copying data from above the stack (or in fact from the stack itself), to the stack (because the stack pointer address decreases on each push). The source data is likely to be covered by the same page table entry as you are copying just 1000 bytes, so you get no segmentation fault. However, ultimately, as already written, we are talking about undefined behavior.
It would be crashed with great probability if you write to unacceed area. But you are reading, it can be ok. But the behaviour will be still undefined.

why do i get the same value of the address?

#include<stdio.h>
#include<conio.h>
void vaibhav()
{
int a;
printf("%u\n",&a);
}
int main()
{
vaibhav();
vaibhav();
vaibhav();
getch();
return 0;
}
Every time I get the same value for the address of variable a.
Is this compiler dependent? I am using dev c++ ide.
Try to call it from within different stack depths, and you'll get different addresses:
void func_which_calls_vaibhav()
{
vaibhav();
}
int main()
{
vaibhav();
func_which_calls_vaibhav();
return 0;
}
The address of a local variable in a function depends on the state of the stack (the value of the SP register) at the point in execution when the function is called.
In your example, the state of the stack is simply identical each time function vaibhav is called.
It is not necessary. You may or may not get the same value of the address. And use %p instead.
printf("%p\n", (void *)&a);
You should use %p format specifier to print address of a variable. %u & %d are used to display integer values. And the address on stack can be same or not every time you call function vaibhav().
"a" in the function vaibhav() is an automatic variable, which means it's auto-created in the stack once this fun is called, and auto-released(invalid to the process) once the fun is returned.
When funA(here is main) calls another funB(here is vaibhav), a stack frame of funB will be allocated for funB. When funB returns, stack frame for funB is released.
In this case, the stack for funB(vaibhav) is called 3 times sequentially, exactly one by one. In each call, the stack frame for funB is allocated and released. Then re-allocated and recycled for times.
The same memory block in stack memory is re-used for 3 times. In that block, the exact memory for "a" is also re-used for 3 times. Thus, the same memory and you get the same address of it.
This is definitely compiler dependent. It depends on the compiler implementation. But I believe almost every C compiler will produce the same result, though I bet there is no specific requirement in C standard to define the expected output for this case.

Why does a program accessing illegal pointer to pointer not crash?

A program accessing illegal pointer to pointer does not crash with SIGSEGV. This is not a good thing, but I’m wondering how this could be and how the process survived for many days in production. It is bewildering to me.
I have given this program a go in Windows, Linux, OpenVMS, and Mac OS and they have never complained.
#include <stdio.h>
#include <string.h>
void printx(void *rec) { // I know this should have been a **
char str[1000];
memcpy(str, rec, 1000);
printf("%*.s\n", 1000, str);
printf("Whoa..!! I have not crashed yet :-P");
}
int main(int argc, char **argv) {
void *x = 0; // you could also say void *x = (void *)10;
printx(&x);
}
I am not surprised by the lack of a memory fault. The program is not dereferencing an uninitialized pointer. Instead, it is copying and printing the contents of memory beginning at a pointer variable, and the 996 (or 992) bytes beyond it.
Since the pointer is a stack variable, it is printing memory near the top of stack for a ways down. That memory contains the stack frame of main(): possibly some saved register values, a count of program arguments, a pointer to the program arguments, a pointer to a list of environment variables, and a saved instruction register for main() to return, usually in the C runtime library startup code. In all implementations I have investigated, the stack frames below that has copies of the environment variables themselves, an array of pointers to them, and an array of pointers to the program arguments. In Unix environments (which you hint you are using) the program argument strings will be below that.
All of this memory is "safe" to print, except some non-printable characters will appear which might mess up a display terminal.
The chief potential problem is whether there is enough stack memory allocated and mapped to prevent a SIGSEGV during access. A segment fault could happen if there is too little environment data. Or if the implementation puts that data elsewhere so that there are only a few words of stack here. I suggest confirming that by cleaning out the environment variables and re-running the program.
This code would not be so harmless if any of the C runtime conventions are not true:
The architecture uses a stack
A local variable (void *x) is allocated on the stack
The stack grows toward lower numbered memory
Parameters are passed on the stack
Whether main() is called with arguments. (Some light duty environments, like embedded processors, invoke main() without parameters.)
In all mainstream modern implementations, all of these are generally true.
Illegal memory access is undefined behaviour. This means that your program might crash, but is not guaranteed to, because exact behaviour is undefined.
(A joke among developers, especially when facing coworkers that are careless about such things, is that "invoking undefined behaviour might format your hard drive, it's just not guaranteed to". ;-) )
Update: There's some hot discussion going on here. Yes, system developers should know what actually happens on a given system. But such knowledge is tied to the CPU, the operating system, the compiler etc., and generally of limited usefulness, because even if you make the code work, it would still be of very poor quality. That's why I limited my answer to the most important point, and the actual question asked ("why doesn't this crash"):
The code posted in the question does not have well-defined behaviour, but that does just mean that you can't really rely on what it does, not that it should crash.
If you dereference an invalid pointer, you are invoking undefined behaviour. Which means, the program can crash, it can work, it could cook some coffee, whatever.
When you have
int main(int argc, char **argv) {
void *x = 0; // you could also say void *x = (void *)10;
printx(&x);
}
You are declaring x as a pointer with value 0, and that pointer lives in the stack since it's a local variable. Now, you are passing to printx the address of x, which means that with
memcpy(str, rec, 1000);
you are copying data from above the stack (or in fact from the stack itself), to the stack (because the stack pointer address decreases on each push). The source data is likely to be covered by the same page table entry as you are copying just 1000 bytes, so you get no segmentation fault. However, ultimately, as already written, we are talking about undefined behavior.
It would be crashed with great probability if you write to unacceed area. But you are reading, it can be ok. But the behaviour will be still undefined.

how to find stack is increasing or decreasing in C?

stack is increasing or decreasing using C program ?
Right, in C usually variables in function scope are realized by means of a stack. But this model is not imposed by the C standard, a compiler could realize this any way it pleases. The word "stack" isn't even mentioned in the standard, and even less if it is in- or decreasing. You should never try to work with assumptions about that.
False dichotomy. There are plenty of options other than increasing or decreasing, one of which is that each function call performs the equivalent of malloc to obtain memory for the callee's automatic storage, calls the callee, and performs the equivalent of free after it returns. A more sophisticated version of this would allocate large runs of "stack" at a time and only allocate more when it's about to be exhausted.
I would call both of those very bad designs on modern machines with virtual memory, but they might make sense when implementing a multiprocess operating system on MMU-less microprocessors where reserving a range of memory for the stack in each process would waste a lot of address space.
How about:
int stack_direction(void *pointer_to_local)
{
int other_local;
return (&other_local > pointer_to_local) ? 1 : -1;
}
...
int local;
printf("direction: %i", stack_direction(&local);
So you're comparing the address of a variable at one location on the call stack with one at an outer location.
If you only like to know if the stack has been changed you can keep the last inserted object to the stack, peek at the top of it and compare the two.
EDIT
Read the comments. It doesn't seem to be possible to determine the stack direction using my method.
END EDIT
Declare an array variable on the stack and compare the addresses of consecutive elements.
#include <stdio.h>
#include <stdlib.h>
int
main(void)
{
char buf[16];
printf("&buf[0]: %x\n&buf[1]: %x\n", &buf[0], &buf[1]);
return 0;
}
The output is:
misha#misha-K42Jr:~/Desktop/stackoverflow$ ./a.out
&buf[0]: d1149980
&buf[1]: d1149981
So the stack is growing down, as expected.
You can also monitor ESP register with inline assembly. ESP register holds address to unallocated stack. So if something is pushed to stack - ESP decreases and if pop'ed - ESP increases. (There are other commands which modifies stack, for example function call/return).
For example what is going on with stack when we try to compute recursive function such as Fibonacci Number (Visual Studio):
#include <stdio.h>
int FibonacciNumber(int n) {
int stackpointer = 0;
__asm {
mov stackpointer, esp
}
printf("stack pointer: %i\n", stackpointer);
if (n < 2)
return n;
else
return FibonacciNumber(n-1) + FibonacciNumber(n-2);
}
int main () {
FibonacciNumber(10);
return 0;
}

Resources