Count number of digits recursively - c

In order to learn recursion, I want to count the number of decimal digits in an integer. For didactic purposes, I would like to avoid the functions from math.h, as presented in:
Finding the length of an integer in C
How do I determine the number of digits of an integer in C? .
I tried two ways, based on the assumption that the division of an integer by 10 will, at a certain point, result in 0.
The first works correctly. count2(1514, 1) returns 4:
int count2(int n, int i){
if(n == 0)
return 0;
else
return i + count2(n / 10, i);
}
But I would like to comprehend the behavior of this one:
int count3(int n, int i){
if(n / 10 != 0)
return i + count3(n / 10, i);
}
For example, from count3(1514, 1); I expect this:
1514 / 10 = 151; # i = 1 + 1
151 / 10 = 15; # i = 2 + 1
15 / 10 = 1; # i = 3 + 1
1 / 10 = 0; # Stop!
Unexpectedly, the function returns 13 instead of 4. Should the function not recurse only 3 times? Why is a base case like the one in count2() actually necessary?

If control reaches the end of a non-void function without a return statement and the caller uses the result, the behavior is undefined.
On most architectures that means your function returns whatever data happens to be on the stack or in the return-value register.
So, your count3() function returns garbage when n / 10 == 0, because there is no corresponding return statement.
Edit: it must be stressed that most modern compilers are able to warn when a typed function does not cover all exit points with a return statement.
For example, GCC 4.9.2 will silently accept the missing return. But if you provide it the -Wreturn-type compiler switch you will get a 'warning: control reaches end of non-void function [-Wreturn-type]' warning message. Clang 3.5.0, by comparison, will by default give you a similar warning message: 'warning: control may reach end of non-void function [-Wreturn-type]'. Personally I try to work using -Wall -pedantic unless some required 3rd party forces me to disable some specific switch.

A recursion needs a base case, which is the building block of a recursive solution. Your count3 does not return any value when n / 10 == 0, so the returned value is indeterminate. That is why count3 fails.

Not returning a value from a value-returning function and then using the result is undefined behavior. Your compiler should warn you about it.
Your logic is also wrong. You must return 1 when (n >= 0 && n / 10 == 0), and 0 otherwise:
if(n / 10 != 0)
return i + count3(n / 10, i);
else if (n >= 0) return 1;
else return 0;
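A complete function along these lines might look like the sketch below. The name count_digits is hypothetical (it does not appear in the posts above), and the sketch assumes that 0 counts as one digit and that the sign is ignored. Every path returns a value, so nothing is left indeterminate.

```c
/* Hypothetical helper, not from the original posts: counts the decimal
   digits of n recursively. Assumes 0 has one digit and ignores the sign. */
int count_digits(int n)
{
    if (n / 10 == 0)        /* base case: only one digit is left */
        return 1;
    return 1 + count_digits(n / 10);
}
```

Called as count_digits(1514), this unwinds as 1 + (1 + (1 + 1)).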

I don't think you need that i+count() in the recursion. Just 1+count() can work fine...
#include <stdio.h>
#include <stdlib.h>
static int count(int num), BASE = 10;
int main ( int argc, char *argv[] ) {
int num = (argc>1?atoi(argv[1]):9999);
BASE= (argc>2?atoi(argv[2]):BASE);
printf(" #digits in %d(base%d) is %d\n", num,BASE,count(num)); }
int count ( int num ) { return ( num>0? 1+count(num/BASE) : 0 ); }
...seems to work fine for me. For example,
bash-4.3$ ./count 987654
#digits in 987654(base10) is 6
bash-4.3$ ./count 123454321
#digits in 123454321(base10) is 9
bash-4.3$ ./count 1024 2
#digits in 1024(base2) is 11
bash-4.3$ ./count 512 2
#digits in 512(base2) is 10

Related

Question regarding tail call optimization

As far as I know, a prerequisite for performing tail call optimization is that the recursive call should be the last statement in the function, and the result of the recursive call should be returned immediately. But why?
Here is a valid example for TCO:
int factorial(int num) {
if (num == 1 || num == 0)
return 1;
return num * factorial(num - 1);
}
So, with the rule, can the below code be optimized too? Why not?
#include <stdio.h>
int factorial(int num) {
if (num == 1 || num == 0)
return 1;
int temp = num * factorial(num - 1);
printf("%d", temp);
return temp;
}
I want to know how to explain to others why the above rule is necessary for TCO, rather than just following it blindly.
the result of the recursive call should be returned immediately. But why?
That's because in order to optimize a tail call you need to convert the final recursive call into a simple "jump" instruction. When you do this, you are merely "replacing" function arguments and then re-starting the function.
This is only possible if you can "throw away" the current stack frame and re-use it for the same function again, possibly overwriting it. If you need to remember a value to do more calculations and then return, you cannot use the same stack frame for the recursive call (i.e. cannot turn the "call" into a "jump"), as it could possibly erase/modify the value you wanted to remember before returning.
Furthermore, if your function is very simple (like yours) chances are that it could be written without using the stack at all (except for the return address maybe), and only store data in registers. Even in this case, you don't want to make a jump to the same function (that uses the same registers) if you need to remember one of the values before returning.
Here is a valid example for TCO:
int factorial(int num) {
if (num == 1 || num == 0)
return 1;
return num * factorial(num - 1);
}
This is not valid for TCO! You are doing return num * <recursive-call>. The recursive call is not the last thing that the function does: there is a multiplication before returning. It's the same as writing:
int factorial(int num) {
if (num == 1 || num == 0)
return 1;
int tmp = factorial(num - 1);
tmp *= num;
return tmp;
}
can the below code be optimized too?
Nope! Again there simply is no tail call there, and it's even more obvious. You are first doing the recursive call, then some other stuff (multiplication and printf), and then returning. This cannot be optimized as a tail call by the compiler.
On the other hand, the following code can be optimized as a tail call:
int factorial(int n, int x) {
if (n <= 1) /* also covers n == 0, which would otherwise never reach the base case */
return x;
int tmp = factorial(n - 1, n * x);
return tmp;
}
You don't necessarily have to make the recursive call right on the last line of the function. The important thing is that you don't do work in the middle (between the recursive call and the return statement), like for example calling other functions or doing additional calculations.
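To make the "call becomes a jump" idea concrete, here is a rough sketch. The names factorial_acc and factorial_loop are made up for illustration, and the loop is a hand-written approximation of what an optimizer may produce, not actual compiler output. Note also that the C standard does not require compilers to perform this optimization.

```c
/* Tail-recursive form: the recursive call is the very last action,
   so nothing in the current frame is needed after it. */
int factorial_acc(int n, int acc)
{
    if (n <= 1)
        return acc;
    return factorial_acc(n - 1, n * acc);
}

/* Hand-written equivalent: the "call" is replaced by overwriting the
   arguments and jumping back to the top of the function. */
int factorial_loop(int n)
{
    int acc = 1;
    while (n > 1) {
        acc = n * acc;   /* new value of the acc argument */
        n = n - 1;       /* new value of the n argument */
    }
    return acc;
}
```

Both factorial_acc(5, 1) and factorial_loop(5) evaluate to 120; the second form makes it visible that a single frame is enough.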
IMPORTANT: note that just the fact that a classical TCO cannot be performed does not mean that the compiler will not be able to optimize your code in some other way. In fact, your first function is so simple that when compiled with GCC on x86-64 with at least -O2 it just gets converted from recursive to iterative (it basically becomes a single loop). The same goes for my example above, the compiler just doesn't care to do TCO, it sees an even better optimization to make in this case.
Here's the assembler dump of your first function generated by GCC 11 on x86-64 (Godbolt link if you want to play with it). In case you are not familiar with x86: the num argument is in edi, and eax is used for the return value.
factorial:
mov eax, 1
cmp edi, 1
jbe .L1
.L2:
mov edx, edi
sub edi, 1
imul eax, edx
cmp edi, 1
jne .L2
.L1:
ret
Each invocation of a function creates a stack frame with any data passed into that function via arguments. If a function calls another function (including itself) a new stack frame is pushed onto the stack. When a function is completely finished, its frame is popped off the stack.
Stack memory is limited. If we try to push too many frames onto the stack, we get a stack overflow error.
Where tail call optimization comes into play is to recognize that a function is complete if there is no work left to be done after the tail call.
Consider a way of recursively summing a range of numbers.
int sum(int start, int stop) {
if (start == stop) {
return start;
}
else {
return start + sum(start + 1, stop);
}
}
If we call sum(1, 5) the recursion looks something like:
sum(1, 5)
1 + sum(2, 5)
1 + 2 + sum(3, 5)
1 + 2 + 3 + sum(4, 5)
1 + 2 + 3 + 4 + sum(5, 5)
1 + 2 + 3 + 4 + 5
Several stack frames have to be created to hold this.
Typically tail-call optimization for something that requires building up a value involves an accumulator argument passed to the function.
int sum_tco(int start, int stop, int acc) {
if (start == stop) {
return start + acc;
}
else {
return sum_tco(start + 1, stop, start + acc);
}
}
Now consider what the recursion looks like:
sum_tco(1, 5, 0)
sum_tco(2, 5, 1 + 0)
sum_tco(3, 5, 2 + 1 + 0)
sum_tco(4, 5, 3 + 2 + 1 + 0)
sum_tco(5, 5, 4 + 3 + 2 + 1 + 0)
5 + 4 + 3 + 2 + 1 + 0
We don't need to know what the result of sum_tco(1, 5, 0) or sum_tco(3, 5, 2 + 1 + 0) is to know what the result of sum_tco(5, 5, 4 + 3 + 2 + 1 + 0) is, and neither does your computer.
A smart compiler realizes this and removes all of those previous stack frames as it goes. With TCO, no matter how many times this function recursively calls itself, it will never overflow the stack.
(Descriptions of how the stack behaves have been generalized and are not intended to be technically in-depth but rather to demonstrate the generalized concept of TCO.)

How do I find the second largest element in a given collection of numbers?

I am trying to do this without using an array. What is wrong with my code?
n is the number of elements, a is the first element (assumed to be the maximum initially), b stores each new element, and sec stores the second-largest element. The numbers are all positive. This is from an online contest.
#include<stdio.h>
int main() {
int i,a,b,max,n,sec;
scanf("%d",&n);
scanf("%d",&a);
max=a;
while(n-1!=0) {
scanf("%d",&b);
if(b>max) {
sec=max;
max=b;
}
else if(b<max && b>sec)
sec=b;
else{}
n--;
}
printf("%d",sec);
return 0;
}
I am getting wrong answers in some test cases (I don't know which).
Consider sequence 2, 12, 10 (leaving out surrounding code):
int sec; // uninitialised!!!
max = a; // 12
if(b > max) // b got 10, so false!
{
sec = max; // this code is not hit! (sec remains uninitialised)
max = b;
}
else if(b < max && b > sec)
// ^ comparing against uninitialised
// -> UNDEFINED BEHAVIOUR
You need to initialise sec appropriately, e.g. with INT_MIN (defined in <limits.h>); this is the minimal allowed value, which with a 32-bit int is -2^31, i.e. -2 147 483 648. It is pretty unlikely anybody would enter that value, so you could use it as a sentinel.
You could even initialise max with that value; then you wouldn't need special handling for the first value:
int sec = INT_MIN, max = INT_MIN;
int n;
scanf("%d", &n); // you should check the return value, which is number of
// successfully scanned values, i. e. 1 in given case,
// to catch invalid user input!
// you might check value of n for being out of valid range, at very least < 0
while(n--) // you can do the decrement inside loop header already...
{
// keep scope of variables as local as possible:
int a;
// scanf and comparisons as you had already
// again don't forget to check scanf's return value
}
if(sec == INT_MIN)
{
// likely hasn't been modified -> error, no second largest element
}
else
{
// ...
}
Now what if you do expect user to give you the value of INT_MIN as input?
You could have a separate counter, initialised to 0, that you increment in both of the two if branches inside the loop; if this counter is < 2 after the loop, you didn't get at least two distinct numbers.
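One possible reading of the counter idea is sketched below. It is an interpretation, not the answerer's own code: the name second_largest, its signature, and the use of an array as a stand-in for values arriving one at a time from scanf are all assumptions made so the logic can be tested in isolation. The counter also decides when max and sec hold meaningful values, so no sentinel such as INT_MIN is needed.

```c
#include <stddef.h>

/* Hypothetical helper: scans values[0..n-1] once, as if they arrived one
   at a time. Returns 1 and stores the second largest distinct value in
   *out, or returns 0 if fewer than two distinct values were seen. */
int second_largest(const int *values, size_t n, int *out)
{
    int max = 0, sec = 0;   /* placeholders; updates tells when they are valid */
    int updates = 0;        /* how often max or sec has been assigned */

    for (size_t i = 0; i < n; ++i) {
        int b = values[i];
        if (updates == 0 || b > max) {
            sec = max;      /* the old maximum becomes the runner-up */
            max = b;
            ++updates;
        } else if (b < max && (updates == 1 || b > sec)) {
            sec = b;
            ++updates;
        }
    }
    if (updates < 2)
        return 0;           /* no second largest value exists */
    *out = sec;
    return 1;
}
```

With the inputs 12 and 10 from the example above, the function reports 10; with 3 and 3 it reports that no second largest value exists.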
Let's look at the input
2 4 3
Two is the number of inputs.
4 ends up in max.
3 ends up in b.
b is not greater than max, so the if does not do anything.
b is less than max, but b is not necessarily greater than sec,
because sec at this point can be anything: whatever currently is inside that uninitialised variable. For example, sec is not guaranteed to be 0 at this point. So the else if may not trigger, and we can end up in else {}.
In that case we execute the printf() at the end of the program with a still uninitialised sec. And that is unlikely to satisfy the judge.
To solve the problem, you need to initialise sec. Initialising to 0 might work, but actually you need to use the lowest possible input value.
Since you chose int, instead of unsigned int, I assume that 0 is NOT the lowest possible value. But you would have to quote the assignment/challenge to allow determining the lowest possible value. So you need to find that out yourself in order to make a solution code.
Alternatively, you can analyse the first input values to initialise max and sec (you need to watch them coming in until you get two distinct values; credits to Aconcagua).
Usually, however, it is easier to determine the lowest possible value from the requirements, or the lowest possible int value from your environment.
At some level of nitpicking, you need to know the lowest possible value anyway, in order to select the correct data type for your implementation. I.e. even when analysing the first two values, you might still fail by selecting too narrow a data type.
In case you "successfully" (as judged by the challenge) use 0 to initialise sec, try the input 2 1 -1.
It should fail.
Then try to find in your challenge/assignment description a reason why using 0 is allowed. It should be there, otherwise find a different challenge site to improve your coding skills.
I liked how OP initialized max with the first input value.
This brought me to the idea that the same can be done for sec.
(Having sec equal to max is a nice indicator that sec could not be determined, whatever max contains. In the regular case, max and sec can never be equal.)
Hence, one possibility is to initialize max and sec with the first input
and use max != sec as indicator whether sec has been written afterwards at all.
Demo:
#include <stdio.h>
int main()
{
/* read number of inputs */
int n;
if (scanf("%d", &n) != 1 || n < 1) {
fprintf(stderr, "ERROR!\n");
return -1;
}
/* read 1st input */
int max;
if (scanf("%d", &max) != 1) {
fprintf(stderr, "ERROR!\n");
return -1;
}
--n;
int sec = max;
/* read other input */
while (n--) {
int a;
if (scanf("%d", &a) != 1) {
fprintf(stderr, "ERROR!\n");
return -1;
}
if (max < a) { sec = max; max = a; }
else if (sec == max || (sec < a && a < max)) sec = a;
}
/* evaluate result */
if (sec == max) {
puts("No second largest value occurred!\n");
} else printf("%d\n", sec);
/* done */
return 0;
}
Output:
$ gcc -std=c11 -O2 -Wall -pedantic main.c
$ echo -e "3 3 4 5" | ./a.out
4
$ echo -e "3 3 5 4" | ./a.out
4
$ echo -e "3 4 3 5" | ./a.out
4
$ echo -e "3 4 5 3" | ./a.out
4
$ echo -e "3 5 3 4" | ./a.out
4
$ echo -e "3 5 4 3" | ./a.out
4
$ # edge case:
$ echo -e "2 3 3" | ./a.out
No second largest value occurred!

I am getting unexpected output

I expected the output of the following code to be '0', but I am getting '3'.
#include<stdio.h>
int num_digit(int n);
int num_digit(int n)
{
if (n == 0)
return 0;
else
return 1 + num_digit(n/10);
}
int main() {
int k = num_digit(123);
printf("%d\n",k);
return 0;
}
The following link provides an excellent source for learning recursion in C and, as @MFisherKDX pointed out, helped resolve my confusion.
https://www.programiz.com/c-programming/c-recursion
Each recursive call returns a value.
Adding up all the values:
0+1 = 1
1+1 = 2
2+1 = 3
gives the answer as 3.
This is basic recursion. Try drawing a recursion tree for the program you have written and you should be able to figure out why the output is 3.
You are expecting 0 as the answer based only on the last recursive call (the terminating condition), but every recursive call has its own activation record, and these records are maintained on a stack.
The recursion tree looks something like this:
num_digit(123) = 1 + num_digit(12)
num_digit(12) = 1 + num_digit(1)
num_digit(1) = 1 + num_digit(0)
num_digit(0) = 0
Using substitution:
num_digit(123) = 1 + (1 + (1 + (0)))
Follow the parentheses above carefully and you should be able to understand the output of your code.
Recursion stack for your code is like below
1 + num_digit(123/10);
1 + num_digit(12/10);
1 + num_digit(1/10); //at this point your code will return 0 for num_digit(1/10)
and backtracking is like below
1+0=1
1+1=2
1+2=3
Hence the final answer is 3
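If drawing the tree by hand is tedious, an instrumented copy of the function can print it. The sketch below is only an illustration: num_digit_traced and its extra depth parameter are hypothetical additions used for indentation, and the arithmetic is the same as in num_digit().

```c
#include <stdio.h>

/* Hypothetical instrumented copy of num_digit(). The depth parameter is
   an addition used only to indent the trace by recursion level. */
int num_digit_traced(int n, int depth)
{
    int result;

    printf("%*scall num_digit(%d)\n", depth * 2, "", n);
    if (n == 0)
        result = 0;                                   /* base case */
    else
        result = 1 + num_digit_traced(n / 10, depth + 1);
    printf("%*sreturn %d\n", depth * 2, "", result);  /* printed while unwinding */
    return result;
}
```

Calling num_digit_traced(123, 0) prints four nested "call" lines followed by the "return" lines in reverse order, which is the backtracking described above.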

C function output

Can anyone tell me why I get 0 1 2 0 as the output of the program below?
#include <stdio.h>
main() {
e(3);
}
void e(int n) {
if(n>0) {
e(--n);
printf("%d",n);
e(--n);
}
}
Output is 0 1 2 0
Here's the flow of execution after e(3) is called from main.
e(3)
  e(2)
    e(1)
      e(0) ... return
      n is now 0
      print n. results in output 0
      e(-1) ... return
    n is now 1
    print n. results in output 1
    e(0) ... return
  n is now 2
  print n. results in output 2
  e(1)
    e(0) ... return
    n is now 0
    print n. results in output 0
    e(-1) ... return
return
And you see the output
0 1 2 0
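The same flow can be checked mechanically. The sketch below is a hypothetical variant of e() (the name e_collect and the buffer parameters are not from the question) that keeps the call structure but appends each digit to a string instead of printing it. It assumes 0 < n <= 10, so every value is a single digit, and that the buffer is large enough.

```c
#include <stddef.h>

/* Hypothetical variant of e(): same two recursive calls, but each value
   the original would print is appended to out. Assumes 0 < n <= 10 and
   a sufficiently large buffer. */
void e_collect(int n, char *out, size_t *len)
{
    if (n > 0) {
        e_collect(--n, out, len);         /* first recursion, n already decremented */
        out[(*len)++] = (char)('0' + n);  /* stands in for printf("%d", n) */
        out[*len] = '\0';
        e_collect(--n, out, len);         /* second recursion, n decremented again */
    }
}
```

Starting with an empty buffer and len set to 0, e_collect(3, buf, &len) leaves the string 0120 in buf, matching the trace.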
I'm assuming the following is what you want:
#include <stdio.h>
void e(int);
int main()
{
e(3);
return 0;
}
void e(int n)
{
if(n > 0) {
e(--n);
printf("%d", n);
e(--n);
}
}
This is an example of a recursive function - a function calling itself. Here, at each call the parameter is decremented and the function is again called until the condition n > 0 is not met. Then, the printf("%d", 0) happens. Now the second e(--n) will have no effect until n is at least 2, since the if condition cannot be passed with a value of n less than 1. Further printf()s happen in the reverse order of the call as the function calls are removed from the stack. When the value gets to 2, the second e(--n) gets a chance to make an effect thus printing 0.
You need to learn about recursion (if you haven't yet) and then you can get a good picture of how things happen. It will also help if you learn more about how the stack is set up when a function is called and torn down when it returns.
The 'flow' goes as follows:
main -> e(3)
e(3) -> IF(3>0)
{
// n is pre-decremented to 2
e(2) -> IF(2>0)
{
// n is pre-decremented to 1
e(1) -> IF(1>0)
{
// n is pre-decremented to 0
e(0) -> 0 is not > 0 so this call does nothing.
// n is still 0 in this function call so...
printf(0) <-- The first '0' in the output
// n is pre-decremented to -1
e(-1) -> -1 is not > 0 so this call does nothing.
}
// n is still 1 in this function call so...
printf(1) <-- The '1' in the output
// n is pre-decremented to 0
e(0) -> 0 is not > 0 so this call does nothing
}
// n is still 2 in this function call so...
printf(2) <-- The '2' in the output
// n is pre-decremented to 1
e(1) -> (1 is > 0)
{
// n is pre-decremented to 0
e(0) -> 0 is not > 0 so this call does nothing
// n is still 0 in this function call so...
printf(0) <-- The second '0' in the output
// n is pre-decremented to -1
e(-1) -> -1 is not > 0 so this call does nothing
}
}
It helps if you set the code out more clearly:
#include<stdio.h>
main()
{
e(3);
}
void e(int n)
{
if(n>0)
{
e(--n); // First recursion here, but with n=n-1 on entry to the call.
printf("%d",n); // outputs (original value of n) - 1.
e(--n); // Second recursion here, now with n=n-2 on entry to the call.
}
}
After denesting the code the reason for the results can be deduced in a single run in a debugger.
e() is recursive and called once before the print and once after. So before you hit your print statement you'll have to go through e again, and again, and again till it finally hits 0.
After that things start unwinding and you'll see prints popping up, but it's still a big recursive mess because of the second call to e(--n), in which n dips into the negative. I was rather grateful n was signed, because if it were unsigned it would wrap around to 2^32 - 1 (with a 32-bit unsigned int) and the program would get stuck in, pretty much, an infinite loop.
So yeah, TL;DR: run it through a debugger and learn from the FUBAR a recursion like this can cause.
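The remark about an unsigned n can be checked in isolation. The helper below is hypothetical and only demonstrates the wraparound itself: decrementing an unsigned zero is well defined in C and yields UINT_MAX rather than a negative number, so a test like n > 0 would keep succeeding.

```c
#include <limits.h>

/* Hypothetical helper: returns 1 if decrementing an unsigned zero wraps
   around to UINT_MAX, which the C standard guarantees. */
int decrement_wraps_to_max(void)
{
    unsigned int n = 0;
    --n;                    /* wraps modulo UINT_MAX + 1 */
    return n == UINT_MAX;   /* 4294967295 when unsigned int is 32 bits wide */
}
```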

How does recursion work in C?

I'm trying to understand how recursion works in C. Can anyone give me an explanation of the control flow?
#include <stdio.h>
/* printd: print n in decimal */
void printd(int n)
{
if (n < 0)
{
putchar('-');
n = -n;
}
if (n / 10) printd(n / 10);
putchar(n % 10 + '0');
}
int main()
{
printd(123);
return 0;
}
The control flow looks like this (where -> is a function call)
main()
└─> printd(123)
├─> printd(12)
│ ├─> printd(1)
│ │ └─> putchar('1')
│ └─> putchar('2')
└─> putchar('3')
Call printd(123)
(123 / 10) != 0, so Call printd(12)
(12 / 10) != 0, so Call printd(1)
(1 / 10) == 0, so Call putchar "1"
Call putchar "2"
Call putchar "3"
return 0 (from main())
To understand recursion, you need to understand the storage model. Though there are several variations, basically "automatic" storage, the storage used to contain automatic variables, parameters, compiler temps, and call/return information, is arranged as a "stack". This is a storage structure starting at some location in process storage and "growing" either "up" (increasing addresses) or "down" (decreasing addresses) as procedures are called.
One might start out with a couple of variables:
00 -- Variable A -- 27
01 -- Variable B -- 45
Then we decide to call procedure X, so we generate a parameter of A+B:
02 -- Parameter -- 72
We need to save the location where we want control to return. Say instruction 104 is the call, so we'll make 105 the return address:
03 -- Return address -- 105
We also need to save the size of the above "stack frame" -- four words, 5 with the frame size itself:
04 -- Frame size -- 5
Now we begin executing in X. It needs a variable C:
05 -- Variable C -- 123
And it needs to reference the parameter. But how does it do that? Well, on entry a stack pointer was set to point at the "bottom" of X's "stack frame". We could make the "bottom" be any of several places, but let's make it the first variable in X's frame.
05 -- Variable C -- 123 <=== (Stack frame pointer = 5)
But we still need to reference the parameter. We know that "below" our frame (where the stack frame pointer is pointing) are (in decreasing address order) the frame size, return address, and then our parameter. So if we subtract 3 (for those 3 values) from 5 we get 2, which is the location of the parameter.
Note that at this point we don't really care if our frame pointer is 5 or 55555 -- we just subtract to reference parameters, add to reference our local variables. If we want to make a call we "stack" parameters, return address, and frame size, as we did with the first call. We could make call after call after call and just continue "pushing" stack frames.
To return, we load the frame size and the return address into registers. We subtract the frame size from the stack frame pointer and put the return address into the instruction counter, and we're back in the calling procedure.
Now this is an over-simplification, and there are numerous different ways to handle the stack frame pointer, parameter passing, and keeping track of frame size. But the basics apply regardless.
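To connect the stack model with printd without depending on any particular calling convention, the sketch below replaces the recursion with an explicit stack. It is an illustration under stated assumptions: format_decimal is a hypothetical name, it writes into a caller-supplied buffer instead of calling putchar, and it relies on C99 or later, where % truncates toward zero. Digits are pushed last digit first and popped in reverse, which is the same order the recursive calls unwind in.

```c
#include <stddef.h>

/* Hypothetical helper: writes the decimal form of n into out as a
   NUL-terminated string, using an explicit stack instead of recursion.
   out must have room for a sign, all digits, and the terminator. */
void format_decimal(int n, char *out)
{
    char stack[3 * sizeof(int) + 1];  /* more than enough digit slots */
    size_t top = 0;
    size_t pos = 0;
    int m = n < 0 ? n : -n;           /* work on a non-positive value so
                                         negating INT_MIN is never needed */
    if (n < 0)
        out[pos++] = '-';
    do {
        stack[top++] = (char)('0' - m % 10);  /* push the last digit */
        m /= 10;
    } while (m != 0);
    while (top > 0)
        out[pos++] = stack[--top];            /* pop: first digit comes out first */
    out[pos] = '\0';
}
```

For 123 the pushes are '3', '2', '1' and the pops are '1', '2', '3', mirroring printd(123) calling printd(12) and printd(1) before any putchar runs.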
Recursion in C (or any other programming language) works by breaking a problem into smaller problems of the same kind.
Your example, printing a number, can be broken into these 2 parts:
1. print the first part, if it exists
2. print the last digit
To print "123", the simpler problems are then to print "12" (12 is 123 / 10) and to print "3".
To print "12", the simpler problems are then to print "1" (1 is 12 / 10) and to print "2".
To print "1", ... just print "1".
#include <stdio.h>
#define putd(d) (printf("%d", d))
#define RECURSIVE
void rprint(int n)
{
#ifndef RECURSIVE
int i = n < 0 ? -n : n;
for (; i / 10; i /= 10)
putd(i % 10);
putd(i % 10);
if (n < 0)
putchar('-');
/* Don't forget to reverse :D */
#else
if (n < 0) {
n = -n;
putchar('-');
}
int i = n / 10;
if (i)
rprint(i);
putd(n % 10);
#endif
}
int main()
{
rprint(-321);
return 0;
}
Recursion works on a stack, i.e. first in, last out.
Recursion is the process of a function calling itself with different parameters until a base condition is reached. A stack overflow occurs when too many recursive calls are nested.
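Code:
#include <stdio.h>
int main(void)
{
    printf("start");
    main(); /* no base case: the recursion never stops, so the stack eventually overflows */
    printf("end"); /* never reached */
    return 0;
}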
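Code:
#include <stdio.h>
int fact(int n);
int main(void)
{
    int n, res;
    printf("enter n value: ");
    if (scanf("%d", &n) != 1 || n < 0)
        return 1; /* reject invalid or negative input: fact() has no base case for n < 0 */
    res = fact(n);
    printf("%d\n", res);
    return 0;
}
int fact(int n)
{
    int res;
    if (n == 0) {
        res = 1; /* base case */
    } else {
        res = n * fact(n - 1); /* recursive case */
    }
    return res;
}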

Resources