How to work out how many bits the result of a factorial should take up as a number? - factorial

The factorial function could return a very large number as a result.
How could I work out the size of the data which must return as a result of the factorial? Is there a function which can give me the size of the data quickly based upon the number n for which we are computing the factorial?
For example, factorial (5) = 5 * 4 * 3 * 2 = 120
The number 120 is 0b1111000 in binary, where the 0b prefix indicates a binary number. I need at least 7 bits to represent the result, and I would probably like to fit that into 8 bits, i.e. one byte.

You need floor(log2(factorial(N))) + 1 bits to represent the result. Rounding log2 up to the next integer gives the same count except when the factorial is an exact power of two, which only happens for N = 0, 1 and 2. If you are not sure whether you can calculate or represent the factorial itself with your current setup, you can instead calculate the sum of log2(i) for all i in the range from 2 to N inclusive.
As an example, let's calculate the number of bits for factorial(5):
log2(120) = 6.907, so floor(6.907) + 1 = 7 bits
Alternatively,
log2(2) + log2(3) + log2(4) + log2(5) = 6.907, which gives the same result.

Related

K&R C Book: Question on Malloc implementation p187

Can anyone explain why the following statement on Page 187 of Edition 2 for malloc implementation is used:
nunits = (nbytes+sizeof(Header)-1)/sizeof(Header)+1;
Specifically, why the offsets -1 and +1 are used to calculate nunits.
It rounds up the size of the allocation requested to the next unit of sizeof(Header) and divides by sizeof(Header) to give the number of units of headers needed to store the data requested, and adds one to give it a header to use for the control information that will be wrecked when you write outside the bounds of the allocated memory.
If the header size is 16 bytes, for example, then the request sizes produce:
Request (bytes)   nunits
1..16             2
17..32            3
33..48            4
Etc.
(n-1)/d + 1 calculates n divided by d with any non-zero fraction rounded up.
In C, with positive integer operands, n/d calculates n divided by d with rounding down. If we compute (n-1)/d + 1, then:
If there is any fraction in n/d, then (n-1)/d has the same value as n/d, so (n-1)/d + 1 is one greater, so it is the result of rounding up a fraction when dividing n by d.
If there is no fraction in n/d, then (n-1)/d is one less than n/d, so (n-1)/d + 1 is the same as n/d, so it is the desired result of calculating n/d when there is no fraction.
This is the desired function to calculate for the number of units needed because, if you need a space for a fraction of a unit, then you need a whole unit to keep that fraction in.
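The rounding trick can be checked with a small sketch. The function names and the header_size parameter (standing in for sizeof(Header)) are assumptions made for illustration; they are not the K&R code itself.

```c
#include <stddef.h>

/* n divided by d with any fraction rounded up; assumes n >= 1 and d >= 1. */
size_t div_round_up(size_t n, size_t d)
{
    return (n - 1) / d + 1;
}

/* Units needed for a request of nbytes (nbytes >= 1): enough header-sized
   units to hold the data, plus one unit for the header itself. */
size_t units_needed(size_t nbytes, size_t header_size)
{
    return div_round_up(nbytes, header_size) + 1;
}
```

With header_size set to 16 this reproduces the values listed earlier: requests of 1 to 16 bytes need 2 units, 17 to 32 bytes need 3, and so on.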

Integer division which results in less than 1

How can we use a scale factor, for example 1000, so that a does not come out as 0 when we work with integers? This is on a 32-bit microcontroller.
Example:
uint32 a;
a = 211/555 * x;
Should we just multiply everything on the right by 1000, and then divide the final result by 1000?
You may apply the scale factor before doing the division.
In your example you are effectively doing (assuming that x=1000)
a = (211/555) * x;
which will turn out to be
a = 0*x;
If you change it around to
a =(x*211)/555;
you can force the multiplication first, creating a numerator larger than 555 which will allow a to be greater than 0.
You cannot then divide this result by 1000, though, because the quotient would be less than 1, which truncates to 0 in an integer data type.
You need to keep it in this form and always treat that number as having a 1000 multiplier (for example if the units were originally kilometers, the new number is in meters) or you will have to use a type which can handle numbers less than 1 (like a float or double).
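A minimal sketch of the multiply-first approach is shown below. The helper name mul_div_u32 is hypothetical, and widening to 64 bits is an extra precaution not mentioned in the answer: it assumes the target's compiler supports uint64_t and that the final quotient fits in 32 bits.

```c
#include <stdint.h>

/* Computes x * num / den with the multiplication done first. The product
   is formed in 64 bits so it cannot overflow for 32-bit inputs.
   Assumes den != 0 and that the quotient fits in 32 bits. */
uint32_t mul_div_u32(uint32_t x, uint32_t num, uint32_t den)
{
    return (uint32_t)(((uint64_t)x * num) / den);
}
```

With x = 1000, mul_div_u32(x, 211, 555) gives 380, whereas 211 / 555 * x gives 0 because the division truncates first.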

segmentation fault (core dumped) error in a c program for combination function

#include <stdio.h>
#include <stdlib.h>
int factorial(int i) {
if(i == 1) {
return 1;
}
else {
return i*factorial(i - 1);
}
}
int combination(int l, int m) {
return factorial(l)/(factorial(l-m)*factorial(m));
}
int main() {
int n,r;
printf("Input taken in form of nCr\n");
printf("Enter n: ");
scanf("%d", &n);
printf("Enter r: ");
scanf("%d", &r);
int y = combination(n, r);
printf("Result: %d", y);
return 0;
}
Tried to make a simple code for calculating the combination function in maths. It worked for small values and basically works till n = 12, and gives wrong values from n = 13 and onwards.
Also for n = 15 and r = 2, it returns the result -4.
And it gives the error
segmentation fault (core dumped)
for n = 40 and r = 20.
I would like to know how to solve this problem and why exactly is this happening.
The value of 13! is 6227020800, which is too large to fit into a 32-bit integer. Attempting to calculate this factorial, or any larger one, overflows a 32-bit int. Signed integer overflow invokes undefined behavior.
In some cases this undefined behavior manifests as outputting the wrong value, while in others it manifests as a crash. In the cases where it crashes, the factorial function is most likely being passed a value less than 1, so the recursive calls try to count all the way down to INT_MIN but fill up the stack before that can happen.
Even changing to long long isn't enough to fix this, as the intermediate results will overflow that. So how do you fix this? If you were calculating these values by hand you wouldn't multiply out all of the numbers together then divide two huge numbers. You'd write out the factors and cancel out terms from the top and bottom of the equation. For example, suppose you wanted to calculate 12C7. You would write it out like this:
12 * 11 * 10 * 9 * 8 * 7 * 6 * 5 * 4 * 3 * 2 * 1
------------------------------------------------
( 5 * 4 * 3 * 2 * 1 ) * (7 * 6 * 5 * 4 * 3 * 2 * 1)
Then cancel out 7! from the top and bottom:
12 * 11 * 10 * 9 * 8
---------------------
5 * 4 * 3 * 2
Then cancel out other terms:
(12 * 11 * 10 * 9 * 8) / (5 * 4 * 3 * 2)
= (12 * 11 * 2 * 9 * 8) / (4 * 3 * 2)
= (12 * 11 * 2 * 9) / 3
= 4 * 11 * 2 * 9
Then multiply what's left:
4 * 11 * 2 * 9 = 792
Now do this in code. :) Be sure to change all of your datatypes to long long, which is guaranteed to be at least 64 bits, as the result of 40C20 is still larger than what a 32-bit int can hold.
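One way to do the cancellation in code is the multiplicative formula, which multiplies and divides alternately so the running value stays small. This is a sketch of that swapped-in technique rather than a literal transcription of the hand calculation; the name choose is hypothetical, and the intermediate product can still overflow for sufficiently large n.

```c
/* Hypothetical helper: n choose r using the multiplicative formula.
   After step i the running value equals C(n - r + i, i), so every
   division is exact. */
unsigned long long choose(unsigned n, unsigned r)
{
    if (r > n)
        return 0;
    if (r > n - r)
        r = n - r;                  /* C(n, r) == C(n, n - r) */
    unsigned long long result = 1;
    for (unsigned i = 1; i <= r; i++)
        result = result * (n - r + i) / i;
    return result;
}
```

For the example worked by hand, choose(12, 7) gives 792, and choose(40, 20) gives 137846528820 without any factorial ever being computed.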
This is an overflow problem. Your result is above the maximum int value.
13! = 6227020800
which is more than INT_MAX (2147483647). If you want to handle larger numbers you should either use other variable types (for example unsigned long long), or handle the overflow in your function to avoid memory crashes.
There is also a related topic about overflow checking in C that could be interesting.
Also for n = 15 and r = 2, it returns the result -4
When a variable overflows, its value typically wraps around, so it can cycle between negative and positive values. This is why you are getting negative values. I'm not sure, but I think this is related. If somebody can validate this it would be great.
I guess there are 2 effects interacting:
Your integers overflow, that is the value of factorial(i) will become negative for sufficiently big i leading to
Your recursion (having factorial call itself) consumes all your stack space.
Try to change the condition in factorial from if(i == 1):
int factorial(int i) {
if(1 == i) {
return 1;
} else if(1 > i) {
return -1;
}
return i * factorial(i - 1);
}
This should get rid of the segfault.
For the integer overflow, the only possible solution would be to not rely on C integer arithmetic but to use some bignum library (or write the code on your own).
Some explanation for what is probably going on:
As @WhozCraig pointed out, integers can only keep a range of numbers up to INT_MAX. However, factorial(i) just explodes even for relatively small numbers.
C however does not capture this exception and your integers will silently overflow to negative numbers.
This means at some point you start feeding factorial with negative numbers.
However, for each function call, some data has to be pushed onto the stack (usually the return address and local variables, possibly including the function arguments).
This memory will be released only after the function returns.
This means, if you call factorial(40), if everything works integer wise, you will eat up 40 times the amount of memory for 1 call to factorial.
Since your factorial does not handle negative numbers correctly, it will end up calling itself endlessly, overflowing from time to time, until the condition i == 1 at some point is randomly hit.
Ostensibly in most cases, this does not happen before your stack is exhausted.
When I run your program in a debugger with n = 40 and r = 20 on a 32-bit binary compiled with Microsoft Visual Studio, then I don't get a segmentation fault, but I get a division by zero error in the following line:
return factorial(l)/(factorial(l-m)*factorial(m));
factorial(l-m) and factorial(m) both evaluate to factorial(20). The true value of 20! is 2,432,902,008,176,640,000; keeping only its low 32 bits gives 2,192,834,560.
Assuming that sizeof(int) == 4 (32-bit), this number cannot be represented by a signed int. Therefore, the int overflows, which, according to the official C standard, causes undefined behavior.
However, even if the behavior is undefined, I can reasonably speculate that the following happens:
Due to the overflow, the number 2,192,834,560 will become -2,102,132,736. This is because the second number corresponds to the first number in Two's complement binary representation.
Since this number is multiplied with itself in your code (assuming n = 40 and r = 20), then the result of the multiplication will be 4,418,962,039,762,845,696. This number certainly does not fit into a signed int, so that an overflow occurs again.
The hexadecimal representation of this number is 0x3D534E9000000000.
Since this large number does not fit into a 32-bit integer, all the excess bits are stripped off, which is equivalent to subjecting the result to modulo UINT_MAX + 1 (4,294,967,296). The result of this modulo operation is 0.
Therefore, the expression
factorial(l-m)*factorial(m)
evaluates to 0.
This means that the line
return factorial(l)/(factorial(l-m)*factorial(m));
will cause a division by zero exception.
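The arithmetic in this explanation can be reproduced without relying on undefined behavior by doing the same calculation in unsigned 32-bit arithmetic, which wraps by definition. The helper names below are hypothetical, and the sketch only models the wraparound that the answer speculates about; it does not show what a given compiler must do with signed overflow.

```c
#include <stdint.h>

/* n! reduced modulo 2^32. Unsigned arithmetic wraps by definition,
   so this is well defined, unlike the signed overflow in the question. */
uint32_t factorial_mod32(unsigned n)
{
    uint32_t result = 1;
    for (unsigned i = 2; i <= n; i++)
        result = (uint32_t)((uint64_t)result * i);
    return result;
}

/* Product of two 32-bit values, keeping only the low 32 bits. */
uint32_t mul_mod32(uint32_t a, uint32_t b)
{
    return (uint32_t)((uint64_t)a * b);
}
```

Here factorial_mod32(20) is 2192834560 (0x82B40000), and mul_mod32 of that value with itself is 0, which matches the zero divisor described above.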
One way of solving the problem of handling large numbers is to use floating point numbers instead of integers. These can handle very large numbers without overflowing, but you may lose precision. If you use double instead of float, you will not so easily lose precision and, even if you do, the precision loss will be smaller.

how to convert float to fix16_14?

How to convert the floating point number to fixed point format fix16_14 in C. fix16_14 means 2-bit integer and 14-bit fraction?
Consider an example: -0.99633 = c03c in hex (two's complement representation). Please help me with this C code logic.
The conversion is done by multiplying the float by 16384.0. Be sure to round the result. Also, since there are only 2 integer bits, the number must be in the range -2 <= x < 2. Otherwise the calculation will overflow.
Here's example code:
#include <stdio.h>
#include <inttypes.h>
#include <math.h>
int main(void)
{
float x = -0.99633;
short int y = round(x * 16384.0);
printf("%#04hx\n", (unsigned short)y);
}
The output from the code is: 0xc03c
Fixed Point Number
The shifting process above is the key to understand fixed point number representation. To represent a real number in computers (or any hardware in general), we can define a fixed point number type simply by implicitly fixing the binary point to be at some position of a numeral. We will then simply adhere to this implicit convention when we represent numbers.
To define a fixed point type conceptually, all we need are two parameters:
width of the number representation, and
binary point position within the number
We will use the notation fixed<w,b> for the rest of this article, where w denotes the number of bits used as a whole (the Width of a number), and b denotes the position of the binary point counting from the least significant bit (counting from 0).
For example, fixed<8,3> denotes a 8-bit fixed point number, of which 3 right most bits are fractional. Therefore, the bit pattern:
0 0 0 1 0 1 1 0
represents a real number:
00010.110 (base 2)
= 1 * 2^1 + 1 * 2^-1 + 1 * 2^-2
= 2 + 0.5 + 0.25
= 2.75
Note that on a computer, a bit pattern can represent anything. Therefore the same bit pattern, if we "cast" it to another type, such as a fixed<8,5> type, will represent the number:
000.10110 (base 2)
= 1 * 2^-1 + 1 * 2^-3 + 1 * 2^-4
= 0.5 + 0.125 + 0.0625
= 0.6875
If we treat this bit pattern as an integer, it represents the number:
10110 (base 2)
= 1 * 2^4 + 1 * 2^2 + 1 * 2^1
= 16 + 4 + 2
= 22
Using Fixed Point Number in C
C does not have a native "type" for fixed point numbers. However, due to the nature of fixed point representation, we simply don't need one. Since all arithmetic on fixed point numbers is the same as on integers, we can simply reuse the integer type int in C to perform fixed point arithmetic. The position of the binary point only matters when we print a number on screen or perform arithmetic between different "types" (such as when adding an int to a fixed<32,6>).
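The three readings of the bit pattern 00010110 can be checked with a one-line helper. The name fixed_to_double is hypothetical, and the sketch assumes a non-negative binary point position smaller than 31 so that the shift is well defined.

```c
/* Hypothetical helper: value of the integer `raw` when read as a
   fixed<w,b> number, i.e. with b fractional bits. Assumes 0 <= b < 31. */
double fixed_to_double(int raw, int b)
{
    return (double)raw / (double)(1 << b);
}
```

With raw = 0x16 (binary 00010110), b = 3 gives 2.75, b = 5 gives 0.6875, and b = 0 gives 22, matching the worked examples.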

Fixed point code division understanding

Code for division by 9 in fixed point.
1. q = 0; // quotient
2. y = (x << 3) - x; // y = x * 7
3. while(y) { // until nothing significant
4. q += y; // add (effectively) binary 0.000111
5. y >>= 6; // realign
6. }
7. q >>= 6; // align
Line 2 through 5 in the FIRST execution of while loop is effectively doing
x*.000111 (in decimal representation x*0.1), what it is trying to achieve in subsequent while loops?
Should it not be again multiplying that with 7 and again shifting instead
of doing only shifting to take care of recurrence?
Explanation with respect to plain decimal number multiplication as to what is being achieved with only shifting would be nice.
Detailed code explanation here:
Divide by 9 without using division or multiplication operator
Let's denote 7/64 by the letter F. 7/64 is represented in binary as 0.000111 and is very close to 1/9. But very close is not enough. We want to use F to get exactly to 1/9.
It is done in the following way:
F + (F/64) + (F/64^2) + (F/64^3) + (F/64^4) + (F/64^5) + ...
As we add more elements to this sequence, the result gets closer to 1/9.
Note that each element in the sequence is exactly 1/64 of the previous element.
A fast way to divide by 64 is >>6.
So effectively you want to build a loop which sums this sequence. You start from F and in each iteration do F>>6 and add it to the sum.
In the limit the sum is exactly 1/9, because F * (1 + 1/64 + 1/64^2 + ...) = F * 64/63 = 7/63 = 1/9.
OK, now you are ready to understand the code.
Instead of using F (which is a fraction and cannot be stored in an integer), the code multiplies F by x.
So the sum of the sequence will be X/9 instead of 1/9.
Moreover, in order to work with fixed point it is better to store 64*X*F, and the result will be 64*X/9.
Later, after the summation, we can divide by 64 to get X/9.
The code stores in the variable y the value of F*x*64.
The variable q stores the sum of the sequence. In each loop iteration we generate the next element in the sequence by dividing the previous element by 64 (y>>=6).
Finally, after the loop we divide the sum by 64 (q>>=6) and get the result X/9.
Regarding your question: we should not multiply by 7 each time, or else we would get a sum of the sequence of
F+ (F^2) + (F^3) + (F^4) + (F^5)...
This would yield a result of ~X/(8+1/7) instead of X/9.
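The loop from the question can be written as a self-contained function to experiment with. The name div9_approx and the use of uint32_t are assumptions for illustration. Note that every right shift discards low bits, so the result can come out one below the exact quotient; for example, an input of 9 yields 0.

```c
#include <stdint.h>

/* Approximates x / 9 using only shifts, additions and subtractions.
   Assumes x is small enough that 7 * x and the running sum fit in 32 bits. */
uint32_t div9_approx(uint32_t x)
{
    uint32_t q = 0;               /* running sum, scaled by 64 */
    uint32_t y = (x << 3) - x;    /* y = 7 * x = 64 * x * F, with F = 7/64 */
    while (y) {
        q += y;                   /* add the current term of the series */
        y >>= 6;                  /* next term is the previous one / 64 */
    }
    return q >> 6;                /* remove the factor of 64 */
}
```

For x = 1000 the terms added are 7000, 109 and 1, giving 7110, and 7110 >> 6 is 111, which matches 1000 / 9 rounded down.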
Shifting by one to the left multiplies by two, shifting by one to the right divides by two. Why?
Shifting is the action of taking all the bits from your number and moving them n bits to the left/right. For example:
00101010 is 42 in Binary
42 << 2 means "shift 42 2 bits to the left" the result is
10101000 which is 168 in Binary
we multiplied 42 by 4.
42 >> 2 means "shift 42 2 bits to the right" the result is
00001010 which is 10 in binary (Notice the rightmost bits have been discarded)
we divided 42 by 4, rounding down.
Similarly : (x << 3) is x * 8, so (x << 3) - x is (x * 8) - x => x * 7

Resources