Factorial Using FFT


I'm trying to implement a program in C that calculates the factorial of a very large n (up to a million), using FFT multiplication and the binary splitting method.
I've implemented a simple library to represent arbitrary-precision integers.
To calculate the FFT and the inverse FFT, I use the twofft.c and four1.c routines from "Numerical Recipes in C".
Up to a certain n everything works, but when the numbers (floating-point arrays) get too big, the inverse FFT (calculated with four1) contains wrong values after normalization and rounding.
For example, if I multiply two 2000-digit numbers that each end with 40 zeros (using the FFT), some of the trailing zeros of the result come out as ones.
This happens because some of these "zeros" come back from the inverse FFT as values such as 0.50009, which round to one.
I don't know whether my implementation is wrong or whether I have to round these numbers in a different way.
I've tried both the binary splitting method and prime factorization, but for n >= 9000 the result is wrong.
Is there a way to resolve this?
Thanks for your attention, and sorry for my bad English.

How do you represent arbitrary precision integers?
I mean what type are you actually using?
Can you please show us your code?
If you feel really lazy you can clone this project i've made few months ago:
https://github.com/nomadster/ESP
Edit:
On reading your post again, I suppose from this statement
"this happens because when i rounded one of this "zeros", (0,50009 for examples), they became "one""
that you are still unaware that FFT multiplication only works when the roundoff error in each output coefficient is smaller than 0.5.
So it seems to me (if and only if I've correctly interpreted your cryptic message) that you are using a floating-point type that doesn't have the required precision.
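The 0.5 rule above can be checked at run time. The following is only a sketch, not the asker's code: the function name round_coefficients and its parameters are made up for illustration, and it assumes the inverse FFT output has already been normalized and that every coefficient is greater than -0.5 (true for products of non-negative digits, up to roundoff).

```c
#include <stddef.h>

/* Hypothetical helper: rounds each normalized inverse-FFT coefficient to the
 * nearest integer and returns the largest distance between a coefficient and
 * the integer it was rounded to. Assumes every x[i] > -0.5. */
double round_coefficients(const double *x, long long *out, size_t n)
{
    double worst = 0.0;
    for (size_t i = 0; i < n; i++) {
        long long r = (long long)(x[i] + 0.5);  /* round to nearest for x > -0.5 */
        double err = x[i] - (double)r;
        if (err < 0.0)
            err = -err;
        if (err > worst)
            worst = err;
        out[i] = r;
    }
    return worst;
}
```

If the returned value gets close to 0.5 (a value like 0.50009 rounding to 1 leaves an error of about 0.49991), the product cannot be trusted. Common remedies are storing fewer digits per array element or using a wider floating-point type.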

For the record:
I also noticed wrong values returned by the inverse FFT in four1.c from Numerical Recipes. I only tested it with N=256 complex values as input, assembled so that they should result in a real-only time-domain signal.
The resulting time-domain vector has to be mirrored (end to start and vice versa) and shifted by one to correspond with the IFFTs of other implementations. (I tested numpy.fft.ifft, Octave's ifft and an inverse discrete Fourier transform without any optimisation, simply based on the IDFT formula, which should definitely be correct.)
There has to be a fundamental algorithm fault in the version provided by Numerical Recipes. Nothing related to this problem is described in their books.

Related

Determine if a given integer number is element of the Fibonacci sequence in C without using float

I recently had an interview, which I failed; in the end I was told that I did not have enough experience to work for them.
The position was embedded C software developer. The target platform was some kind of very simple 32-bit architecture whose processor does not support floating-point numbers or operations on them. Therefore double and float cannot be used.
The task was to develop a C routine for this architecture that takes one integer and returns whether or not it is a Fibonacci number. However, only an additional 1K of temporary memory may be used during execution. That means: even if I simulate very large integers, I can't just build up the sequence and iterate through it.
As far as I know, a positive integer n is a Fibonacci number exactly when one of
5n² + 4
or
5n² − 4
is a perfect square. So I answered: it is simple, the routine just has to determine whether or not that is the case.
They responded then: on the current target architecture no floating-point-like operations are supported, therefore no square root numbers can be retrieved by using the stdlib's sqrt function. It was also mentioned that basic operations like division and modulus may also not work because of the architecture's limitations.
Then I said, okay, we may build an array with the square numbers till 256. Then we could iterate through and compare them to the numbers given by the formulas (see above). They said: this is a bad approach, even if it would work. Therefore they did not accept that answer.
Finally I gave up, since I had no other ideas. I asked what the solution would be; they said they would not tell me, but advised me to look for it myself. My first approach (the two formulas) should be the key, but the square root has to be computed in some alternative way.
I googled a lot at home, but never found any "alternative" square root algorithms. Everywhere, floating-point numbers were permitted.
For operations like division and modulus, so-called "integer division" may be used. But what can be used for the square root?
Even though I failed the interview test, working on architectures where no floating-point operations are allowed is a very interesting topic for me.
Therefore my questions:
How can floating-point numbers be simulated (if only integers may be used)?
What would be a possible solution in C for the problem described above? Code examples are welcome.

The point of this type of interview is to see how you approach new problems. If you happen to already know the answer, that is undoubtedly to your credit but it doesn't really answer the question. What's interesting to the interviewer is watching you grapple with the issues.
For this reason, it is common that an interviewer will add additional constraints, trying to take you out of your comfort zone and seeing how you cope.
I think it's great that you knew that fact about recognising Fibonacci numbers. I wouldn't have known it without consulting Wikipedia. It's an interesting fact but does it actually help solve the problem?
Apparently, it would be necessary to compute 5n²±4, compute the square roots, and then verify that one of them is an integer. With access to a floating point implementation with sufficient precision, this would not be too complicated. But how much precision is that? If n can be an arbitrary 32-bit signed number, then n² is obviously not going to fit into 32 bits. In fact, 5n²+4 could be as big as 65 bits, not including a sign bit. That's far beyond the precision of a double (normally 53 bits) and even of a long double, if available. So computing the precise square root will be problematic.
Of course, we don't actually need a precise computation. We can start with an approximation, square it, and see if it is either four more or four less than 5n². And it's easy to see how to compute a good guess: it will be very close to n×√5. By using a good precomputed approximation of √5, we can easily do this computation without the need for floating point, without division, and without a sqrt function. (If the approximation isn't accurate, we might need to adjust the result up or down, but that's easy to do using the identity (n+1)² = n²+2n+1; once we have n², we can compute (n+1)² with only addition.)
We still need to solve the problem of precision, so we'll need some way of dealing with 66-bit integers. But we only need to implement addition and multiplication of positive integers, which is considerably simpler than a full-fledged bignum package. Indeed, if we can prove that our square root estimation is close enough, we could safely do the verification modulo 2³¹.
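For comparison, here is a sketch of a different integer-only route to the same test: a bit-by-bit integer square root that uses only shifts, additions and comparisons. This is not the estimate-and-adjust method described above, and all function names are made up for illustration. It also sidesteps the precision problem rather than solving it: it assumes a compiler-provided uint64_t and is only valid while 5n²+4 fits in 64 bits (n up to roughly 1.9 billion), which covers every Fibonacci number below 2³¹ but not every 32-bit input.

```c
#include <stdint.h>

/* Largest r with r*r <= x, computed with shifts, adds and compares only. */
uint32_t isqrt64(uint64_t x)
{
    uint64_t r = 0;
    uint64_t bit = (uint64_t)1 << 62;   /* highest power of four in 64 bits */

    while (bit > x)
        bit >>= 2;
    while (bit != 0) {
        if (x >= r + bit) {
            x -= r + bit;
            r = (r >> 1) + bit;
        } else {
            r >>= 1;
        }
        bit >>= 2;
    }
    return (uint32_t)r;
}

int is_perfect_square(uint64_t x)
{
    uint64_t r = isqrt64(x);
    return r * r == x;
}

/* Assumption: 5*n*n + 4 must not overflow 64 bits. */
int is_fibonacci_by_squares(uint32_t n)
{
    uint64_t t = 5 * (uint64_t)n * n;
    return is_perfect_square(t + 4) || (t >= 4 && is_perfect_square(t - 4));
}
```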
So the analytic solution can be made to work, but before diving into it, we should ask whether it's the best solution. One very common category of suboptimal programming is clinging desperately to the first idea you come up with even as its complications become increasingly evident. That will be one of the things the interviewer wants to know about you: how flexible are you when presented with new information or new requirements.
So what other ways are there to know if n is a Fibonacci number? One interesting fact is that if n is Fib(k), then k is the floor of logφ(n×√5 + 0.5). Since logφ is easily computed from log2, which in turn can be approximated by a simple bitwise operation, we could try finding an approximation of k and verifying it using the classic O(log k) recursion for computing Fib(k). None of the above involves numbers bigger than the capacity of a 32-bit signed type.
Even more simply, we could just run through the Fibonacci series in a loop, checking to see if we hit the target number. Only 47 loops are necessary. Alternatively, these 47 numbers could be precalculated and searched with binary search, using far less than the 1k bytes you are allowed.
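The loop described above might look like the following sketch. The function name is made up, and it uses uint32_t rather than a signed type so that the overflow check is well defined; it needs only addition and comparison.

```c
#include <stdint.h>

/* Returns 1 if n is a Fibonacci number, 0 otherwise.
 * Uses only addition and comparison; no division, no floating point. */
int is_fibonacci(uint32_t n)
{
    uint32_t a = 0, b = 1;              /* consecutive Fibonacci numbers */

    while (a < n) {
        uint32_t next = a + b;          /* may wrap around on overflow */
        if (next < a) {
            /* The sum no longer fits in 32 bits, so b is the last
             * representable Fibonacci number. */
            return b == n;
        }
        a = b;
        b = next;
    }
    return a == n;
}
```

For a signed 32-bit input the loop body runs at most 47 times, and the whole routine needs a handful of bytes of temporary storage.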

It is unlikely an interviewer for a programming position would be testing for knowledge of a specific property of the Fibonacci sequence. Thus, unless they present the property to be tested, they are examining the candidate’s approaches to problems of this nature and their general knowledge of algorithms. Notably, the notion to iterate through a table of squares is a poor response on several fronts:
At a minimum, binary search should be the first thought for table look-up. Some calculated look-up approaches could also be proposed for discussion, such as using find-first-set-bit instruction to index into a table.
Hashing might be another idea worth considering, especially since an efficient customized hash might be constructed.
Once we have decided to use a table, it is likely a direct table of Fibonacci numbers would be more useful than a table of squares.

How does one divide a big integer by another big integer?

I've been researching this for the last few days and I have been unable to come up with an answer. I have come up with one algorithm that works if the divisor is only one word. But if the divisor is multiple words, I get some strange answers. I know this question has been asked a few times on here, but there has been no definitive answer except to use the schoolbook method or go get a book on the subject. I have been able to get every function in my big integer library to work except division. It seems that some individuals think big integer division is an NP-hard problem, and with the trouble that I'm having with it, I'm inclined to agree.
The data is stored in a structure that contains a pointer to an array of either uint16_t or uint32_t based on if the long long data type is supported or not. If long long is not supported, then uint16_t is used for the capture of any carry/overflow from multiplication and addition operations. The current functions that I have are addition, subtraction, multiply, 2's complement negation, comparison, and, or, xor, not, shift left, shift right, rotate left, rotate right, bit reversal (reflection), a few conversion routines, a random number fill routine, and some other utility routines. All these work correctly (I checked the results on a calculator) except division.
    typedef struct bn_data_t bn_t;
    struct bn_data_t
    {
        uint32 sz1;    /* Bit Size */
        uint32 sz8;    /* Byte Size */
        uint32 szw;    /* Word Count */
        bnint *dat;    /* Data Array */
        uint32 flags;  /* Operational Flags */
    };
This is related to another question that I asked about inline assembler as this is what it was for.
What I have found so far:
Algorithm for dividing very large numbers
What is the fastest algorithm for division of crazy large integers?
https://en.wikipedia.org/wiki/Division_algorithm
Newton-Raphson Division With Big Integers
And a bunch of academic papers on the subject.
What I have tried so far:
I have a basic routine working, but it divides a multi-word big integer number by a single word. I have tried to implement a Newton-Raphson algorithm, but that's not working as I have gotten some really strange results. I know about Newton's method from Calculus on which it is based, but this is integer math and not floating point. I understand the math behind the Goldschmidt division algorithm, but I am not clear on how to implement it with integer math. Part of the problem with some of these algorithms is that they call for a base 2 logarithm function. I know how to implement a logarithm function using floating point and a Taylor series, but not when using integer math.
I have tried looking at the GMP library, but the division algorithm is not very well documented and it kinda goes over my head. It seems that they are using different algorithms at different points which adds to the confusion.
For the academic papers, I mostly understand the math (I have cleared basic calculus math, multi-variable calculus, and ordinary differential equations), but once again, there is a disconnect between my mathematical knowledge and implementation using integer math. I have seen the grade school method being suggested which from what I can ascertain is something similar to a shift-subtract method, but I'm not too sure how to implement that one either. Any ideas? Code would be nice.
EDIT:
This is for my own personal learning experience. I want to learn how it is done.
EDIT: 4-JUN-2016
It has been a while since I have worked on this as I had other irons in the fire and other projects to work on. Now that I have revisited this project, I have finally implemented big integer division using two different algorithms. The basic one is the shift-subtract method outlined here. The high speed algorithm which uses the CPU divide instruction is called only when the divisor is one word. Both algorithms have been confirmed to work properly, as the results that they produce have been checked with an online big number calculator. So now, all basic math and logic functions have been implemented. Those functions include add, subtract, multiply, divide, divide with modulus, modulus, and, or, not, xor, negate, reverse (reflection), shift left, shift right, rotate left, and rotate right. I may add additional functions as their need comes up. Thank you to everyone who responded.

The schoolbook division (long division) algorithm, commonly used for base-10 operands, can be used for arbitrarily large operands too. I will assume we are implementing the large numbers as arrays of digits in base B.
When we perform long division manually for decimal operands, we usually depend on trial and error to find each quotient digit d. But this trial and error can be replaced with an efficient method (due to D. A. Pope and M. L. Stein) when using long division for large operands in base B.
To guess d, we can use the first digit (e) of the divisor and the first two digits (yz) of the "current remainder" (resulting from a subtraction step of long division). Say d1 is the estimate for d obtained by dividing the number yz by e. It can be proved that, if the divisor has certain properties (which are always achievable, refer to the link below), one of d1, d1-1 or d1-2 must be the required digit d. Each of these three candidates can be checked for the desired properties of d one by one.
Thus finding each quotient digit becomes efficient, and for the rest we can follow the iterative long-division process. Please refer to the article below (written by me) for details about this algorithm and its implementation in C:
https://mathsanew.com/articles/implementing_large_integers_division.pdf
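The shift-subtract method mentioned in the question is the same long division carried out in base 2, where every quotient digit is 0 or 1 and so needs no estimation. The following is only a sketch of that simpler variant, not the Pope and Stein method from the article and not the asker's library: NWORDS, the little-endian word layout and all function names are assumptions made for illustration, and it relies on uint64_t being available.

```c
#include <stdint.h>
#include <string.h>

#define NWORDS 4   /* hypothetical fixed size: 4 x 32 bits, least significant word first */

/* Returns <0, 0 or >0 when a is less than, equal to or greater than b. */
static int big_cmp(const uint32_t *a, const uint32_t *b)
{
    for (int i = NWORDS - 1; i >= 0; i--)
        if (a[i] != b[i])
            return a[i] < b[i] ? -1 : 1;
    return 0;
}

/* a -= b (modulo 2^(32*NWORDS)). */
static void big_sub(uint32_t *a, const uint32_t *b)
{
    uint32_t borrow = 0;
    for (int i = 0; i < NWORDS; i++) {
        uint64_t d = (uint64_t)a[i] - b[i] - borrow;
        a[i] = (uint32_t)d;
        borrow = (uint32_t)(d >> 63);   /* 1 if the subtraction wrapped */
    }
}

/* a = (a << 1) | bit; returns the bit shifted out of the top word. */
static uint32_t big_shl1(uint32_t *a, uint32_t bit)
{
    for (int i = 0; i < NWORDS; i++) {
        uint32_t out = a[i] >> 31;
        a[i] = (a[i] << 1) | bit;
        bit = out;
    }
    return bit;
}

/* q = n / d and r = n % d. Returns 0 on success, -1 if d is zero. */
int big_divmod(const uint32_t *n, const uint32_t *d, uint32_t *q, uint32_t *r)
{
    static const uint32_t zero[NWORDS] = {0};

    if (big_cmp(d, zero) == 0)
        return -1;
    memset(q, 0, NWORDS * sizeof q[0]);
    memset(r, 0, NWORDS * sizeof r[0]);

    /* Bring down one bit of n at a time, most significant first. */
    for (int i = NWORDS * 32 - 1; i >= 0; i--) {
        uint32_t bit = (n[i / 32] >> (i % 32)) & 1u;
        uint32_t carry = big_shl1(r, bit);
        if (carry || big_cmp(r, d) >= 0) {
            big_sub(r, d);              /* still correct when carry is set */
            q[i / 32] |= (uint32_t)1 << (i % 32);
        }
    }
    return 0;
}
```

This costs one compare and possibly one subtract per bit of the dividend, so it is much slower than a word-based long division, but it is short and easy to verify against a calculator.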

Interview : Hash function: sine function

I was asked this interview question. I am not sure what the correct answer for it is (and the reasoning behind the answer):
Is sin(x) a good hash function?

If you mean sin(), it's not a good hashing function because:
it's quite predictable and for some x it's no better than just x itself. There should be no seemingly apparent relationship between the key and the hash of the key.
it does not produce an integer value. You cannot index/subscript arrays with floating-point indices and there must be some kind of array in the hash table.
floating-point is very implementation-specific and even if you make a hash function out of sin(), it may not work with a different compiler or on a different kind of CPU/computer.
sin() may be much slower than some simpler integer-arithmetic function.

Not really.
It's horribly slow.
You'll need to convert the result to some integer type anyway to avoid the insanity of floating-point equality comparisons. (Not actually the usual precision problems that are endemic to FP equality comparisons and which arise from calculating two things slightly different ways; I mean specifically the problems caused by things like the fact that 387-derived FPUs store extra bits of precision in their registers, so if a comparison is done between two freshly-calculated values in registers you could get a different answer than if exactly one of the operands was loaded into a register from memory.)
It's almost flat near the peaks and troughs, so the quantisation step (multiplying by some large number and rounding to an integer) will produce many hash values near the min and max, rather than an even distribution.
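The clustering near the peaks can be quantified. As a sketch, assuming the keys are spread uniformly over a full period of the sine (an assumption that is not part of the answer above), the value y = sin(x) follows the arcsine distribution:

```latex
f(y) = \frac{1}{\pi\sqrt{1 - y^{2}}}, \qquad -1 < y < 1,
\qquad
P\bigl(|y| > 0.9\bigr) = 1 - \frac{2}{\pi}\arcsin(0.9) \approx 0.29 .
```

Under that assumption about 29% of the keys fall into the outer 10% of the output range, and the density grows without bound as y approaches ±1.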

Based off of mathematical knowledge:
Sine(x) is periodic so it's going to reach the same number from different values of x, so Sine(x) would be awful as a hashing function because you will get multiple values hashing to the exact same point. There are **infinitely many values between 0 and pi for the return value, but then past that the values will repeat. So 0 & pi & 2*pi will all hash to the same point.
If you could make the increment small enough and have Sine(x) multiplied by say x^2 or something of that nature it'd be mediocre at best, but then again, if you were to do that why not just use x^2 anyway and toss out the periodic function all together.
**infinitely: a large enough number that I'm not willing to count.
NOTE: Sine(x) will have values that are small and could be affected by rounding error.
NOTE: Any value taken from a sine function should be multiplied by an integer and then either modded or the floor or ceiling taken so that the value can be used as an array offset, etc.

sin(x) is a trigonometric function which repeats itself every 360 degrees, so it's going to be a poor hash function, as the hash will be repeated too often.
A simple refutation:
sin(0) == sin(360) == sin(720) == sin(..)
This is not a property of a good hash function.
Even if you decide to use it, it's difficult to represent the value returned by sin.
Sin function:
sin x = x - x^3/3! + x^5/5! - ...
This can't be represented accurately due to floating-point precision issues, which means that the same value may produce two different hashes!

Another point to note:
For sine(x) as hash function - Keys in a given close range will have hash values in close range too, it is not desirable. A good hash function evenly distributes hash values irrespective of the nature of the keys.

Hash values generally have to be integers to be useful. Since sin doesn't generate integers it wouldn't be appropriate.

Let's say we have a string s. It can be expressed as a number in hexadecimal and fed to the function. If you added 2 pi it would cease to be a valid input, as it wouldn't be an integer anymore (only non-negative integers are accepted by the function). You have to find a string that gives a collision, not just add 2 pi to the hex expression of the string. And adding (concatenating?) 2 pi directly to the string wouldn't help in finding a collision. There might be another way, but it is not that trivial.

I think sin(x) can make an excellent cryptographic hash function,
if used wisely. The input should be a natural number in radians
and never contain pi. We must use arbitrary-precision arithmetic.
For every natural number x (radians), sin(x)
is always a transcendental irrational number and there is no other
natural number with the same sine. But there's a catch: An attacker could gain
information about the input, by computing the arcsin of the hash.
In order to prevent this, we ignore the decimal part and some of the
first digits from the fractional part, keeping only the next n (say 100) digits,
making such an attack computationally infeasible.
It seems that a small change in the input gives a completely different result,
which is a desirable property.
The result of the function seems statistically random, again a good property.
I'm not sure how to prove that it is collision-resistant but I can't see why
it couldn't be. Also, I can't think of a way to find a specific input that results
in a specific hash. I'm not saying that we should blindly believe that it is
certainly a good cryptographic hash function. I just think that it seems like a
good candidate to be one. We should give it a chance
and focus on proving that it is. And it might be a very good one.
To those that might say it is slow: Yes, it is. And that's good when hashing passwords.
Here I'm attaching some Perl code for this idea. It runs on Linux with bash and bc.
(bc is a command-line arbitrary-precision calculator, included in most distros)
I'll be checking this page for any answers, since this interests me a lot.
Don't be harsh though, I'm just a CS undergrad, willing to learn more.
    use warnings;
    use strict;

    my $input = '5AFF36B7';   # Input for bc (as a hex number)
    $input = '1' . $input;    # Put '1' in front of input, so that 0x0, 0x00, 0x1, 0x01, etc.
                              # all give different nonzero results
    my $a = `bc -l -q <<< "scale=256;obase=16;ibase=16;s($input)"`;   # Call bc, keep result in $a
    # Keep only fractional part
    $a =~ tr/a-zA-Z0-9//cd;   # Clean up string, keep only alphanumerics
    my @m = $a =~ /./g;       # Convert string to array of chars

    # PRINT OUTPUT
    # We ignore some digits, for security reasons:
    # If we don't ignore any of the first digits, an attacker could gain information
    # about the input by computing the inverse of sin (the arcsin of the hash).
    # By ignoring enough of the first digits, it becomes computationally
    # infeasible to compute arcsin.
    # Also, to avoid problems with roundoff error, we ignore some of the last digits.
    for (my $c = 100; $c < 200; $c++) {
        print $m[$c];
    }

Efficiency of arcsin computation from sine lookup table

I have implemented a lookup table to compute sine/cosine values in my system. I now need inverse trigonometric functions (arcsin/arccos).
My application is running on an embedded device on which I can't add a second lookup table for arcsin as I am limited in program memory. So the solution I had in mind was to browse over the sine lookup table to retrieve the corresponding index.
I am wondering if this solution will be more efficient than using the standard implementation coming from the math standard library.
Has someone already experimented on this?
The current implementation of the LUT is an array of the sine values from 0 to PI/2. The values stored in the table are multiplied by 4096 so that they stay integers with enough precision for my application. The lookup table has a resolution of 1/4096 radian, which gives an array of 6434 values.
Then I have two functions, sine and cosine, that take an angle in radians multiplied by 4096 as their argument. Those functions convert the given angle to the corresponding angle in the first quadrant and read the corresponding value from the table.
My application runs on a dsPIC33F at 40 MIPS and I use the C30 compiler suite.

It's pretty hard to say anything with certainty since you have not told us about the hardware, the compiler or your code. However, a priori, I'd expect the standard library from your compiler to be more efficient than your code.

It is perhaps unfortunate that you have to use the C30 compiler which does not support C++, otherwise I'd point you to Optimizing Math-Intensive Applications with Fixed-Point Arithmetic and its associated library.
However the general principles of the CORDIC algorithm apply, and the memory footprint will be far smaller than your current implementation. The article explains the generation of arctan() and the arccos() and arcsin() can be calculated from that as described here.
Of course that suggests also that you will need square-root and division also. These may be expensive though PIC24/dsPIC have hardware integer division. The article on math acceleration deals with square-root also. It is likely that your look-up table approach will be faster for the direct look-up, but perhaps not for the reverse search, but the approaches explained in this article are more general and more precise (the library uses 64bit integers as 36.28 bit fixed point, you might get away with less precision and range in your application), and certainly faster than a standard library implementation using software-floating-point.

You can use a "halfway" approach, combining a coarse-grained lookup table to save memory, and a numeric approximation for the intermediate values (e.g. Maclaurin Series, which will be more accurate than linear interpolation.)
Some examples here.
This question also has some related links.

A binary search of 6434 entries will take ~13 lookups to find the value, followed by an interpolation if more accuracy is needed. Due to the nature of the sine curve, you will get much more accuracy at one end than the other. If you can spare the memory, making your own inverse table evenly spaced on the inputs is likely a better bet for speed and accuracy.
In terms of comparison to the built-in version, you'll have to test that. When you do, pay attention to how much the size of your image increases. The standard library implementations can be pretty hefty on some systems.
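A binary search over an ascending table, as suggested above, could be sketched like this. The function name, the int16_t element type and the parameter names are assumptions for illustration; the asker's actual table layout was not shown.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reverse lookup: returns the largest index i such that
 * table[i] <= value, clamped to the ends of the table.
 * Assumes the table is sorted in ascending order and len >= 1. */
size_t lut_inverse(const int16_t *table, size_t len, int16_t value)
{
    size_t lo = 0, hi = len - 1;

    if (value <= table[0])
        return 0;
    if (value >= table[hi])
        return hi;

    /* Invariant: table[lo] <= value < table[hi]. */
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (table[mid] <= value)
            lo = mid;
        else
            hi = mid;
    }
    return lo;
}
```

With a table indexed by angle × 4096, as in the question, the returned index is the arcsine in the same fixed-point units. Near PI/2 many neighbouring entries hold the same value, so the result is coarse there unless it is refined by interpolation.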

Why there is a discrepancy in the result?

If I apply Binet's formula and the recursive formula to compute the Fibonacci series, there is a discrepancy between the results. Why?
I am a student and our assignment is to implement the Fibonacci series. I came across this situation while experimenting.
Thanks in advance

The recursive Fibonacci number is generated using integer arithmetic. The Binet formula uses floating-point arithmetic. Floating-point calculations will always have these small inaccuracies because not every real number can be represented exactly.
Specifically, an 8-byte float in SQL Server only has about 15 significant decimal digits of precision. It cannot be any more precise than that. Not coincidentally, the errors you are seeing occur at the 15th digit. I would hazard a guess that Fibonacci numbers with an index below 70 are accurate, because they are within the precision limits of a float.
In other words, this behaviour is by design. There is a limit to the precision you can achieve with floating-point math, and you've hit it. In order to go beyond that, you'd have to use an arbitrary-precision math library, and I'm not aware of any available within the SQL Server environment (although that doesn't necessarily mean they don't exist).
P.S. Recursion is a very inefficient method of generating a Fibonacci number, especially within a database. If this is more than an academic exercise then I would recommend switching to an iterative solution.
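The iterative approach recommended above is shown here as a sketch in C rather than in SQL Server's own language; the function name is made up. With 64-bit unsigned integers it is exact up to Fib(93), the largest Fibonacci number that fits in 64 bits, which is well beyond the point where a 15-digit float starts to lose digits.

```c
#include <stdint.h>

/* Iterative Fibonacci in exact integer arithmetic.
 * Valid for 0 <= k <= 93; Fib(94) no longer fits in 64 bits. */
uint64_t fib_iterative(unsigned k)
{
    uint64_t a = 0, b = 1;              /* Fib(0), Fib(1) */

    for (unsigned i = 0; i < k; i++) {
        uint64_t next = a + b;
        a = b;
        b = next;
    }
    return a;
}
```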
