This simple C program rarely terminates at the same call depth:
#include <stdio.h>
#include <stdlib.h>
void recursive(unsigned int rec);
int main(void)
{
recursive(1);
return 0;
}
void recursive(unsigned int rec) {
printf("%u\n", rec);
recursive(rec + 1);
}
What could be the reasons behind this chaotic behavior?
I am using Fedora (16 GiB RAM, stack size limit of 8192 KiB as reported by ulimit -s), and compiled using cc without any options.
EDIT
I am aware that this program will cause a stack overflow.
I know that enabling some compiler optimizations will change the behavior and that the program will reach integer overflow.
I am aware that this is undefined behavior, the purpose of this question is to understand/get an overview of the implementation specific internal behaviors that might explain what we observe there.
The question is rather: given that on Linux the thread stack size is fixed and given by ulimit -s, what would influence the available stack size so that the stack overflow does not always occur at the same call depth?
EDIT 2
@BlueMoon always sees the same output on his CentOS machine, while on my Fedora, with an 8 MiB stack, I see different outputs (last printed integer 261892, or 261845, or 261826, or ...)
Change the printf call to:
printf("%u %p\n", rec, &rec);
This forces gcc to put rec on the stack and gives you its address which is a good indication of what's going on with the stack pointer.
Run your program a few times and note what's going on with the address that's being printed at the end. A few runs on my machine show this:
261958 0x7fff82d2878c
261778 0x7fffc85f379c
261816 0x7fff4139c78c
261926 0x7fff192bb79c
The first thing to note is that the stack address always ends in 78c or 79c. Why is that? We should crash when crossing a page boundary. Pages are 0x1000 bytes long and each function call eats 0x20 bytes of stack, so the address should end with 00X or 01X. But looking at this more closely, we crash in libc. So the stack overflow happens somewhere inside libc, and from this we can conclude that calling printf and everything else it calls needs at least 0x78c = 1932 (possibly plus X*4096) bytes of stack to work.
The second question is why does it take a different number of iterations to reach the end of the stack? A hint is in the fact that the addresses we get are different on every run of the program.
1 0x7fff8c4c13ac
1 0x7fff0a88f33c
1 0x7fff8d02fc2c
1 0x7fffbc74fd9c
The position of the stack in memory is randomized. This is done to prevent a whole family of buffer overflow exploits. But since memory allocations, especially at this level, can only be done in multiples of pages (4096 bytes), all initial stack pointers would be aligned at 0x1000. This would reduce the number of random bits in the randomized stack address, so additional randomness is added by just wasting a random number of bytes at the top of the stack.
The operating system can only account the amount of memory you use, including the limit on the stack, in whole pages. So even though the stack starts at a random address, the last accessible address on the stack will always be an address ending in 0xfff.
The short answer is: to increase the amount of randomness in the randomized memory layout a bunch of bytes on the top of the stack are deliberately wasted, but the end of the stack has to end on a page boundary.
You won't have the same behaviour between executions because it depends on the current memory available. The more memory you have available, the further you'll go in this recursive function.
Your program runs infinitely as there is no base condition in your recursive function. Stack will grow continuously by each function call and will result in stack overflow.
Unless tail-recursion optimization applies (for example with option -O2), a stack overflow will occur for sure. It invokes undefined behavior.
what would influence the available stack size so that the stackoverflow does not always occur at the same call depth?
When stack overflow occurs it invokes undefined behavior. Nothing can be said about the result in this case.
In practice, your recursive call is not necessarily going to cause undefined behaviour due to stack overflow (but it will due to integer overflow). An optimizing compiler could simply turn your function into an infinite "loop" with a jump instruction:
void recursive(int rec) {
loop:
printf("%i\n", rec);
rec++;
goto loop;
}
Note that this is going to cause undefined behaviour since it's going to overflow rec (signed int overflow is UB). If rec is an unsigned int, on the other hand, then the code is valid and, in theory, should run forever.
The above code can cause two issues:
Stack Overflow.
Integer overflow.
Stack Overflow: When a recursive function is called, all its variables are pushed onto the call stack, along with its return address. As there is no base condition to terminate the recursion and the stack memory is limited, the stack will be exhausted, resulting in a stack overflow. The call stack may consist of a limited amount of address space, often determined at the start of the program. The size of the call stack depends on many factors, including the programming language, machine architecture, multi-threading, and amount of available memory. When a program attempts to use more space than is available on the call stack (that is, when it attempts to access memory beyond the call stack's bounds, which is essentially a buffer overflow), the stack is said to overflow, typically resulting in a program crash.
Note that every time a function exits or returns, all of the variables pushed onto the stack by that function are freed (that is to say, they are deleted). Once a stack variable is freed, that region of memory becomes available for other stack variables. But for a recursive function, the return addresses stay on the stack until the recursion terminates. Moreover, automatic local variables are allocated as a single block, and the stack pointer is advanced far enough to account for the sum of their sizes. You may be interested in Recursive Stack in C.
Integer overflow: As every recursive call of recursive() increments rec by 1, there is a chance that integer overflow can occur. For that, your machine would need a huge stack, as the range of unsigned int is 0 to 4,294,967,295. See reference here.
There is a gap between the stack segment and the heap segment. Because the size of the heap is variable (it keeps changing during execution), the extent to which your stack can grow before a stack overflow occurs is also variable, and this is the reason why your program rarely terminates at the same call depth.
When a process loads a program from an executable, typically it allocates areas of memory for the code, the stack, the heap, initialised and uninitialised data.
The stack space allocated is typically not that large, (10s of megabytes probably) and so you would imagine that physical RAM exhaustion would not be an issue on a modern system and the stack overflow would always happen at the same depth of recursion.
However, for security reasons, the stack isn't always in the same place. Address Space Layout Randomisation ensures that the base of the stack's location varies between invocations of the program. This means that the program may be able to do more (or fewer) recursions before the top of the stack hits something inaccessible like the program code.
That's my guess as to what is happening, anyway.
Related
I recently learned about stacks, so I was experimenting to see what the stack size is and what happens when it overflows. I found out that on Unix the default stack size is 8 MiB, and that supports my findings since I cannot declare a string having size greater than or equal to 8 MiB in my main function. However, when I declare a variable in main() it affects other functions. For example:
#include <stdio.h>
void foo(void)
{
long int size = 1024*1024*2;
char str[size];
str[size - 1] = 'a';
printf("%c\n", str[size - 1]);
}
int main(int argc, char** argv)
{
long int size = 1024*1024*6;
char str[size];
str[size - 1] = 'a';
printf("%c\n", str[size - 1]);
foo();
return 0;
}
This code results in segmentation fault but if I make the string size 5 MiB in main() then there is no segmentation fault. Does that mean my C program cannot allocate more than 8 MiB of RAM for local variables (of all functions)? If so, what IS the point of stacks?
No, each function doesn't get its own independent stack space. There's only one stack in your program and there's a limited amount of stack space available to you.
How Stack works
This LIFO behavior is exactly what a function does when returning
to the function that called it.
Flow in the Stack
The caller pushes the return address onto the stack
When the called function finishes its execution, it pops the return
address off the call stack (this popped element is also known as
stack frame) and transfers control to that address.
If a called function calls on to yet another function, it will push
another return address onto the top of the same call stack, and
so on, with the information stacking up and unstacking as the
program dictates.
All of the above process happens in the same stack memory. Each function does have its own space in the stack but every function gets its space allocated in the same stack. This is called the Global Call Stack of your program.
It is used to store local variables which are used inside the function.
However, dynamically allocated space is stored on the heap. The heap is used to store dynamic variables. It is a region of the process's memory. The library functions malloc(), calloc() and realloc() are generally used to allocate dynamic memory.
As for the stack overflow issue, the call stack size is limited. Only a certain amount of memory can be used. If many function calls happen, the stack space would eventually run out which would give you a stack overflow error which would most likely cause your program to crash.
If there are a lot of variables in your function, or some variables which need a huge amount of space, then the stack space will eventually run out and cause a stack overflow. E.g. the following would probably give a stack overflow in most cases and cause your program to crash:
int main() {
int A[100000][100000];
}
Hope this clears your doubt !
NOTE:
In a multi-threaded environment, each thread gets its own call stack space separately instead of sharing the same Global Call Stack. So, in a multi-threaded environment, the answer to your question will be YES.
Does that mean my c program cannot allocate more than 8MB of ram for local variables (of all functions) ?
Yes and no. Yes, your program can't use more space for local variables than the available stack space, whatever that is. But no, you're not limited to 8MB for all functions, you're only limited to that much total stack space for functions that are currently executing. A program might contain thousands of functions, but only a relative handful of those will be invoked at any given moment.
When your program calls a function, space is reserved on the stack for the function's return value and its local variables. If that function calls another function, space will then be reserved for that next function's return value and local variables. When each function returns, the return value is read and the local variables and return value are popped off the stack. So functions only use stack space while they're executing.
If so, what's the point of stacks ?
The point is to provide the space needed for local variables, to facilitate returning a value to the caller, and to make allocating that space fast and efficient. Functions don't typically need huge amounts of storage for local variables, so 8MB is typically more than enough.
If you find that you need to allocate a large amount of memory, there are memory allocation functions that make that easy. Let's say you need to create a multi-megabyte string as in your example. You'd typically use a function like malloc() or calloc() to create that object on the heap instead of on the stack, and the only local variable you'd need is a pointer to the allocated memory.
The "stack" is one shared space in memory, and true to its name, every nested function invocation "pushes" a new "frame" (set of space for local variables) onto that shared stack. Yes, the total size of the stack's space in memory is shared between all functions which are (currently) executing, and if the total used space during the run of your program exceeds what the OS has set aside for it, you will cause an (ahem) "stack overflow" crash.
The point is to provide work space for each function's invocation. Typically, the amount of space used by any particular function on the stack is quite small-- perhaps some integers or a couple smallish arrays, etc. Think tens or hundreds of bytes, not usually kilobytes or certainly megabytes. This is mostly just idiomatic and you get used to what makes sense to have on the stack and what doesn't when you've worked with enough code of your own and others'. It would be exceptionally unusual in production code to have something megabytes large as an actual local variable.
In practice, the primary cause of stack overflow errors in the real world is accidental infinite recursion-- when you end up calling through into the same functions over and over without a recursive base case. Those stack frames may each be small, but if the call chain is unbounded eventually you'll overflow.
When you want to use actual larger pieces of memory, large string buffers, etc, you'll typically allocate them from a different shared chunk of memory referred to as "the heap". You can allocate (with malloc and its cousins) what you need and then free that when done. The heap's memory space is global to your program, and is not constrained or related to particular function invocations.
The C language standard does not know anything about the stack. How functions are called, how parameters are passed and where automatic storage objects are stored is up to the implementation.
Most implementations will actually have only one stack, but I will give you some very common exceptions.
RTOSes. Many RTOS-es implement tasks as normal functions. Functions which are separate tasks will have separate stacks.
Many multitasking libraries (like pthread) will give threads (which are functions) separate stacks.
Many hardware designs have more than one stack - for example very popular ARM Cortex uCs - having two separate hardware stacks.
etc etc.
I call a function recursively, and on the 147th call the program's executable stops (Code::Blocks).
Before invoking the function again, it copies one global int variable, one global two-dimensional int array and one global string into local variables. So could 146 of those copies be a very heavy load for the program?
The function is:
It seems your stack is overflowing by recursive calls.
Quoting from above wiki page
In software, a stack overflow occurs when the stack pointer exceeds
the stack bound. The call stack may consist of a limited amount of
address space, often determined at the start of the program. The size
of the call stack depends on many factors, including the programming
language, machine architecture, multi-threading, and amount of
available memory. When a program attempts to use more space than is
available on the call stack (that is, when it attempts to access
memory beyond the call stack's bounds, which is essentially a buffer
overflow), the stack is said to overflow, typically resulting in a
program crash
Very deep recursion, and large stack variables combined with recursion, are common and easy-to-hit causes of stack overflow.
You may want to write a smarter code to get away from recursions.
Below links may help you get there.
Way to go from recursion to iteration
Replace Recursion with Iteration
Each time you invoke your function, you allocate:
int visitedS[2416] = 2416 * 32 bits = 9.4KB
char pathS[4500] = 4500 * 8 bits = 4.4KB
So that's almost 14KB that gets placed on the stack every time you recurse.
After 147 recursions, you've put 1.98MB on the stack. That's not so huge - a typical Linux stack limit is 8MB.
I would check - through using a debugger or even adding debug print statements - your assumption that this is truly happening after 147 recursions. Perhaps there is a bug causing more invocations than you believed.
Even so, it may well be worth thinking about ways to reduce the memory footprint of each invocation. You seem to be creating local arrays which are copies of a global. Why not just use the data in the global. If your function must make changes to that data, keep a small set of deltas locally.
So I wrote a toy C program that would intentionally cause a stack overflow, just to play around with the limits of my system:
#include <stdio.h>
int kefladhen(int i) {
int j = i + 1;
printf("j is %d\n",j);
kefladhen(j);
}
int main() {
printf("Hello!:D\n");
kefladhen(0);
}
I was surprised to find that the last line printed before a segmentation fault was "j is 174651". Of course the exact number it got to varied a little each time I ran it, but in general I'm surprised that 174-thousand odd stack frames are enough to exhaust the memory for a process on my 4GB linux laptop. I thought that maybe printf was incurring some overhead, but printf returns before I call kefladhen() recursively so the stack pointer should be back where it was before. I'm storing exactly one int per call, so each stack frame should only be 8 bytes total, right? So 174-thousand odd of them is only about a megabyte and a half of actual memory used, which seems way low to me. What am I misunderstanding here?
...but in general I'm surprised that 174-thousand odd stack frames are enough to exhaust the memory for a process on my 4GB linux laptop...
Note that the stack isn't the general memory pool. The stack is a chunk pre-allocated for the purpose of providing the stack. It could be just 1MB out of those 4GB of memory on the machine. My guess is your stack size is about 1.3MB; that would be enough for 174,651 eight-byte frames (four bytes for return address, four bytes for the int).
I think that the key misunderstanding here is that the stack does not grow dynamically by itself. It is set statically to a relatively small number, but you can change it at runtime (here is a link to an answer explaining how it is done with a setrlimit call).
Others have already discussed the size and allocation of the stack. The reason why "the exact number it got to varied a little each time I ran it" has to do with cache performance on multi-threaded systems.
In general, you can expect the memory pre-allocated for the stack to be page aligned. However, the starting stack pointer will vary from thread-to-thread/process-to-process/task-to-task. This is to help avoid cache line flushes, invalidations and loads. If all the tasks/threads/processes had the same virtual address for the stack pointer, you would expect that there would be more cache collisions whenever a context switch occurs. In an attempt to reduce this likelihood, many OSes will have the starting stack pointer somewhere within the starting stack page, but not necessarily at the very top, or at the same position. Thus, when a context switch occurs and a subsequent stack access occurs, there is ...
a better chance the variable will already be in the cache
a better chance that there will not be a cache collision
Hope this helps.
On Linux, using C, assume I have a dynamically determined n giving the number of elements I have to store in an array (int my_array[n]) just for a short period of time, say, one function call, where the called function uses only a little memory (a few hundred bytes).
Mostly n is small, some tens. But sometimes n may be big, as much as 1000 or 1'000'000.
How do I calculate, whether my stack can hold n*o + p bytes without overflowing?
Basically: How many bytes are left on my stack?
Indeed, the checking available stack question gives a good answer.
But a more pragmatic answer is: don't allocate big data on the call stack.
In your case, you could handle the case n<100 differently (allocating on the stack, perhaps through alloca, makes sense then) from the case n>=100 (then allocate on the heap with malloc (or calloc) and don't forget to free it). Make the threshold 100 a #define-d constant.
A typical call frame on the call stack should be, on current laptops or desktops, a few kilobytes at most (and preferably less if you have recursion or threads). The total stack space is ordinarily at most a few megabytes (and sometimes much less: inside the kernel, stacks are typically 4Kbytes each!).
If you are not using threads, or if you know that your code executes on the main stack, then:
1. Record the current stack pointer when entering main.
2. In your routine, get the current stack limit (see man getrlimit).
3. Compare the difference between the current stack pointer and the one recorded in step 1 with the limit from step 2.
If you are using threads and could be executing on a thread other than main, see man pthread_getattr_np
My system (Linux kernel 2.6.32-24) implements a feature named Address Space Layout Randomization (ASLR). ASLR seems to change the stack size:
#include <stdio.h>
void f(int n)
{
printf(" %d ", n);
f(n + 1); /* no base case: recurses until the stack overflows */
}
int main(void)
{
f(0);
return 0;
}
Obviously if you execute the program you'll get a stack overflow. The problem is that segmentation fault happens on different values of "n" on each execution. This is clearly caused by the ASLR (if you disable it the program exits always at the same value of "n").
I have two questions:
Does it mean that ASLR makes the stack size slightly variable?
If so, do you see a problem with this? Could it be a kernel bug?
It might mean that in one instance the stack happens to flow into some other allocated block, and in the other instance, it trips over unallocated address-space.
ASLR stands for "address space layout randomization". What it does is change various section/segment start addresses on each run, and yes, this includes the stack.
It's not a bug; it's by design. Its purpose, in part, is to make it harder to gain access by overflowing buffers, since in order to execute arbitrary code, you need to trick the CPU into "returning" to a certain point on the stack, or in the runtime libraries. Legitimate code would know where to return to, but some canned exploit wouldn't -- it could be a different address every time.
As for why the apparent stack size changes, stack space is allocated in pages, not bytes. Tweaking the stack pointer, especially if it's not by a multiple of the page size, changes the amount of space you see available.