Recursive function, passing arguments - segmentation fault - c

I am writing a program for multiplying big numbers using a karatsuba algorithm.
There is a recursive function.
Right before the recursive call I print the string arguments and they are fine. Then, at the beginning of the called function, I print the passed arguments again (I should get exactly the same output as the earlier printf() before the recursive call), and I get a segmentation fault.
This happens not on the first execution of the function, but only after many recursive calls.
My code:
void karatsuba(char *result, char *first, char *second)
{
printf(" %s %s\n", first, second);
<somewhere here conditional return to end recursion>
...
...
printf(" %s %s\n", temp_first, temp_second);
karatsuba(temp, temp_first, temp_second);
...
...
}
What can cause a segmentation fault in this case?
UPDATE:
Thank you all for your answers. Stack overflow is probably the reason.
I created a static counter, incremented at the start of the recursive function and decremented at each of its exits, and printed it. At the segmentation fault its value indicated a depth of 46778.
Then, as Graham Borland suggested, I increased the stack size to 32MB. The counter then indicated a depth of 159126 calls, so increasing the stack size helped.
The local data in one call of this function totals 140B. Multiplying this by the stack depth gives about 21MB, which is less than 32MB.
Still, this number of recursive calls is far too big. Working through the calculation on paper, I go at most about 10 recursive calls deep for my data. Probably infinite recursion. :(

You are likely to be busting the stack. Depending on the platform, you may be able to increase the stack available to your process. For example, on a Unix-like platform, entering this at the bash prompt before running your program:
ulimit -s 32768
will increase the stack to 32MB.

The most probable reason is a stack overflow, likely caused by infinite recursion.


What's the memory structure of a very simple process, and what's the detail of the crash? [closed]

I'm trying to trigger a stack-overflow crash.
Most probably the crashing value is not 1864 in your sandbox, but you will find your own value when running it.
I just want to know a way to figure these things out.
The program is compiled on CentOS 8, using the GNU compiler.
#include <stdio.h>

int main() {
    int a[3];
    for (int i = 4; i < 100000; i++) {
        printf("a[%d]=%d\n", i, a[i]);
        a[i] = 0;
    }
    return 0;
}
Then 1864 is the first value of i which causes a crash.
I know that a stack overflow causes undefined behavior.
I just want to understand the memory structure of this process, and why accessing a[1864] causes a crash.
You crash when you access an address that's not mapped into your address space, which happens when you run off the top of the stack.
You can get some insight by looking at the address of a as well as your program's memory mappings, which you can see by reading /proc/self/maps from within your program. See https://godbolt.org/z/1e5bvK for an example. In my test, a is 0x7ffcd39df7a0, and the stack is mapped into the address range 7ffcd39c0000-7ffcd39e1000, beyond which is a gap of unmapped addresses. Subtracting, a is 6240 bytes from the top of the stack. Each int is 4 bytes, so that's 1560 ints. Thus accessing a[1560] should crash, and that's exactly what happened.
The addresses and offsets will change from run to run due to ASLR and variation in how the startup code uses the stack.
(Just to be clear for other readers: accessing beyond the end of the stack is what will cause an immediate segfault. But as soon as you write even one element beyond the declared size of your array, even though that write instruction may not itself segfault, it is still overwriting other data that is potentially important. That is very likely to cause some sort of misbehavior further along, maybe very soon or maybe much later in your program's execution. The result could eventually be a segfault, if you overwrote a pointer or a return address or something of the kind, or it could be something worse: data corruption, granting access to an intruder, blowing up the machine you're controlling, becoming Skynet, etc.)
why a[1864] will cause a crash.
This particular value is by no means guaranteed or reliable. It will depend on the compiler and libc version, compilation flags, and a host of other variables.
In general, the memory layout looks like this (on machines where stack grows down, which is the majority of current machines):
<- inaccessible memory
0x...C000 <- top of stack (page-aligned)
0x...B... <- stack frame for _start
... <- other stack frames, such as __libc_start_main
0x...A... <- stack frame for main (where a[] is located).
You start overwriting stack from &a[0], and continue going until you hit the top of the stack and step into inaccessible memory. The number of ints between &a[0] and top of stack depends on the factors I listed.
Using a powerful debugging tool, I was able to find the source of your troubles.
lstrand@styx:~/tmp$ insure gcc -g -o simple simple.c
lstrand@styx:~/tmp$ ./simple >/dev/null
[simple.c:8] **READ_BAD_INDEX**
>> printf("a[%d]=%d\n", i, a[i]);
Reading array out of range: a[i]
Index used : 4
Valid range: 0 thru 2 (inclusive)
Stack trace where the error occurred:
main() simple.c, 8
[simple.c:9] **WRITE_BAD_INDEX**
>> a[i] = 0;
Writing array out of range: a[i]
Index used : 4
Valid range: 0 thru 2 (inclusive)
Stack trace where the error occurred:
main() simple.c, 9
**Memory corrupted. Program may crash!!**
**Insure trapped signal: 11**
Stack trace where the error occurred:
main() simple.c, 8
Segmentation violation
** TCA log data will be merged with tca.log **
Segmentation fault (core dumped)
lstrand@styx:~/tmp$
Edit:
To clarify: your program seems to assume that you can grow an array in C/C++ simply by writing past its end, as in JavaScript. There is also a subtle suggestion that you are coming from a Fortran background: the first past-the-end array location is at index 3, not 4.
In other words, your test program declares a fixed-size array on the stack, of size 3, with valid elements a[0], a[1], and a[2]. From the very first iteration of the loop, you are corrupting the stack.
The proper way to cause a stack overflow, as people on this site should well know, is to do something like this:
void recurse()
{
    char buf[1024];
    recurse();
}

int main()
{
    recurse();
    return 0;
}
On Linux/x86_64, this still produces SIGSEGV when you run out of stack, but on other platforms (e.g., Windows) you will get a stack overflow violation.
(gdb) r
Starting program: /home/lstrand/tmp/so
Program received signal SIGSEGV, Segmentation fault.
0x0000000000400df2 in recurse () at so.c:5
5 recurse();
(gdb) where
#0 0x0000000000400df2 in recurse () at so.c:5
#1 0x0000000000400e79 in recurse () at so.c:5
#2 0x0000000000400e79 in recurse () at so.c:5
#3 0x0000000000400e79 in recurse () at so.c:5
[...]

return value 3221225725 while experimenting with simple recursive function

While studying pointers, I was experimenting with the following simple recursive code.
#include <stdio.h>

void test(void);

int main(void)
{
    test();
}

void test(void)
{
    int i = 4;
    printf("%i %i\n", &i, i);
    main();
}
output
...
4345292 4
4345228 4
--------------------------------
Process exited after 7.041 seconds with return value 3221225725
Press any key to continue . . .
Why does it return the value 3221225725? Does C run out of memory?
I was also wondering why it only breaks after exactly 32455 iterations.
(I am still in a learning phase. This is an experiment, not code for actual use; I broke it on purpose.)
The return value 3221225725 (C00000FD in hexadecimal) is the error code for stack overflow on Windows.
The stack overflow happens because your recursion never stops. main is calling test which is calling main which is calling test and so on indefinitely until the process' stack is full and then Windows kills the process and returns the error code to the caller.
If you want to use recursion, you need a stop condition so the recursion will end at some point. Google "factorial recursion" for a simple example.
Be aware that recursion can be abused and often an iterative approach is more efficient.
First of all, in the line
printf("%i %i\n", &i, i);
supplying a pointer as the argument for %i invokes undefined behaviour.
Change the code to
printf("%p %i\n", (void *)&i, i);
That said, calling main() recursively is not a very good idea. Moreover, this is infinite recursion, so you will eventually run into a stack overflow.
Use a loop if you want some actions to be repeated, with a proper terminating condition.

Array & segmentation fault

I'm creating the below array:
int p[100];

int
main ()
{
  int i = 0;
  while (1)
    {
      p[i] = 148;
      i++;
    }
  return (0);
}
The program aborts with a segmentation fault after writing about 1000 positions of the array, instead of 100. I know that C doesn't check whether the program writes out of bounds; that is left to the OS. I'm running it on Ubuntu, and the stack size is 8MB (ulimit -s). Why does it abort only around position 1000? How can I check how much memory the OS allocates for an array?
Sorry if it's been asked before, I've been googling this but can't seem to find a specific explanation for this.
Accessing an invalid memory location leads to undefined behavior, which means that anything can happen; a segmentation fault is not guaranteed to occur.
...the size of the stack is 8MB (ulimit -s)...
The variable int p[100]; is not on the stack but in the data area, because it is defined as a global. Since it is not initialized, it is placed in the BSS section and filled with zeros. You can check that by printing the array values at the very beginning of main().
As others said, with p[i] = 148; you produced undefined behaviour. After filling roughly 1000 positions you most probably reached the end of the BSS area and got a segmentation fault.
It appears that you clearly run past the 100 elements you defined (int p[100];), since your loop has no bound (while (1)).
I would suggest using a for loop instead:
for (i = 0; i < 100; i++) {
    // do your stuff...
}
Regarding your more specific question about memory: any access outside the valid range (in your case, beyond the 100 elements of the array) can produce an error. The fact that it happened at 1000 in your run can vary depending on how memory is used by the rest of the program.
It will fail once the CPU says:
HEY! That's not your memory, leave it!
The fact that a location is not inside the array does not mean it is not the application's memory to manipulate.
The program aborts with a segmentation fault after writing 1000 positions of the array, instead of the 100.
You cannot reason about undefined behavior. It's like asking: if a coconut smacks each of the heads of 1000 people standing under a coconut tree, will exactly 700 of them always fall unconscious?

A couple of questions on recursive functions in C language

This is a function to get sum of the digits of a number:
int sumOfDigits(int n)
{
    int sum = 0;                          //line 1
    if (n == 0)
        return sum;
    else
    {
        sum = (n%10) + sumOfDigits(n/10); //line 2
        // return sum;                    //line 3
    }
}
While writing this code, I realized the scope of the local variables is local to each individual recursion of the function. So am I right in saying that if n=11111, 5 sum variables are created and pushed on the stack with each recursion? If this is correct then what is the benefit of using recursion when I can do it in normal function using loops, thus overwriting only one memory location? If I use pointers, recursion will probably take similar memory as a normal function.
Now my second question, even though this function gives me the correct result each time, I don't see how the recursions (other than the last one which returns 0) return values without uncommenting line 3. (using geany with gcc)
I'm new to programming, so please pardon any mistakes
So am I right in saying that if n=11111, 5 sum variables are created and pushed on the stack with each recursion?
Conceptually, but compilers may turn some forms of recursion into jumps/loops. E.g. a compiler that does tail call optimization may turn
void rec(int i)
{
    if (i > 0) {
        printf("Hello, level %d!\n", i);
        rec(i - 1);
    }
}
into the equivalent of
void loop(int i)
{
    for (; i > 0; i--)
        printf("Hello, level %d!\n", i);
}
because the recursive call is in tail position: when the call is made, the current invocation of rec has no more work to do except a return to its caller, so it might as well reuse its stack frame for the next recursive call.
If this is correct then what is the benefit of using recursion when I can do it in normal function using loops, thus overwriting only one memory location? If I use pointers, recursion will probably take similar memory as a normal function.
For this problem, recursion is a pretty bad fit, at least in C, because a loop is much more readable. There are problems, however, where recursion is easier to understand. Algorithms on tree structures are the prime example.
(Although every recursion can be emulated by a loop with an explicit stack, and then stack overflows can be more easily caught and handled.)
I don't understand the remark about pointers.
I don't see how the recursions (other than the last one which returns 0) return values without uncommenting line 3.
By chance. The program exhibits undefined behavior, so it may do anything, even return the correct answer.
So am I right in saying that if n=11111, 5 sum variables are created and pushed on the stack with each recursion?
The recursion is 5 levels deep, so traditionally 5 stack frames will be eventually created (but read below!), each one of which will have space to hold a sum variable. So this is mostly correct in spirit.
If this is correct then what is the benefit of using recursion when I can do it in a normal function using loops, thus overwriting only one memory location?
There are several reasons, which include:
it might be more natural to express an algorithm recursively; if the performance is acceptable, maintainability counts for a lot
simple recursive solutions typically do not keep state, which means they are trivially parallelizable, which is a major advantage in the multicore era
compiler optimizations frequently negate the drawbacks of recursion
I don't see how the recursions (other than the last one which returns 0) return values without uncommenting line 3.
It's undefined behavior to comment out line 3. Why would you do that?
Yes, the parameters and local variables are local to each invocation, and this is usually achieved by creating a copy of each invocation's variable set on the program stack. Yes, that consumes more memory compared to an implementation with a loop, but only if the problem can be solved with a loop and constant memory usage at all. Consider traversing a tree: you will have to store the tree elements somewhere, be it on the stack or in some other structure. Recursion's advantage is that it is easier to implement (but not always easier to debug).
If you comment out return sum; in the second branch, the behavior is undefined: anything can happen, the expected behavior included. That's not what you should do.

Is there any hard-wired limit on recursion depth in C

The program under discussion attempts to compute sum-of-first-n-natural-numbers using recursion. I know this can be done using a simple formula n*(n+1)/2 but the idea here is to use recursion.
The program is as follows:
#include <stdio.h>

unsigned long int add(unsigned long int n)
{
    return (n == 0) ? 0 : n + add(n-1);
}

int main()
{
    printf("result : %lu \n", add(1000000));
    return 0;
}
The program worked well for n = 100,000 but when the value of n was increased to 1,000,000 it resulted in a Segmentation fault (core dumped)
The following was taken from the gdb message.
Program received signal SIGSEGV, Segmentation fault.
0x00000000004004cc in add (n=Cannot access memory at address 0x7fffff7feff8
) at k.c:4
My question(s):
Is there any hard-wired limit on recursion depth in C? or does the recursion depth depends on the available stack memory?
What are the possible reasons why a program would receive a SIGSEGV signal?
Generally the limit will be the size of the stack. Each time you call a function, a certain amount of stack is consumed (the amount usually depends on the function). This consumed amount is the stack frame, and it is recovered when the function returns. The stack size is almost always fixed when the program starts, either specified by the operating system (and often adjustable there), or even hardcoded in the program.
Some implementations may have a technique where they can allocate new stack segments at run time. But in general, they don't.
Some functions will consume stack in slightly more unpredictable ways, such as when they allocate a variable-length array there.
Some functions may be compiled to use tail-calls in a way that will preserve stack space. Sometimes you can rewrite your function so that all calls (Such as to itself) happen as the last thing it does, and expect your compiler to optimise it.
It's not that easy to see exactly how much stack space is needed for each call to a function, and it will be subject to the optimisation level of the compiler. A cheap way to do that in your case would be to print &n each time it's called; n will likely be on the stack (especially since the program needs to take its address; otherwise it could be in a register), and the distance between successive locations of it will indicate the size of the stack frame.
Stack consumption can be reduced by rewriting the function in tail-recursive form, so that the compiler can apply tail-call optimization:
gcc -O3 prog.c
#include <stdio.h>

unsigned long long int add(unsigned long int n, unsigned long long int sum)
{
    return (n == 0) ? sum : add(n-1, n+sum); // tail-recursive form
}

int main()
{
    printf("result : %llu \n", add(1000000, 0)); // OK
    return 0;
}
There is no theoretical limit to recursion depth in C. The only limits are those of your implementation, generally limited stack space.
(Note that the C standard doesn't actually require a stack-based implementation. I don't know of any real-world implementations that aren't stack based, but keep that in mind.)
A SIGSEGV can be caused by any number of things, but exceeding your stack limit is a relatively common one. Dereferencing a bad pointer is another.
The C standard does not define a minimum supported depth for function calls. If it did (which would be quite hard to guarantee anyway), it would be mentioned in section 5.2.4 Environmental limits.
