I have to implement an algorithm to evaluate a sum of Poisson terms, each one weighted by a constant:
sum_{k=0}^{cut} C(k) * lambda^k * exp(-lambda) / k!
where the C(k) are positive constants smaller than 1, cut is a cutoff because in principle the sum runs over infinitely many k, and lambda is a number that in my case may vary from 20 to 100. I've tried a straightforward implementation in my code:
#include<quadmath.h>
... //Definitions of lambda and c[k]...
long double sum=0;
for(int k=0;k<cutoff;++k)
{
sum = sum + c[k] * powq(lambda,k)/tgamma(k+1) * (1.0/expq(lambda));
}
But I am not quite satisfied. I've searched "Numerical Recipes" for a good approach to evaluating a Poisson distribution, but I didn't find anything about it.
Are there better ways to do this?
Edit: to be clear, I'm looking for the most precise way to approximate the probability of large events, given a Poisson distribution, without computing awkward (lambda^k)/k! factors.
Well, a simple improvement would be to compute lambda^k and k! incrementally and cache them, since their values from the previous iteration can be used to calculate the current ones with an O(1) update.
Also, since 1.0/exp(lambda) is a constant, you should calculate it once in advance:
#include<quadmath.h>
... //Definitions of lambda and c[k]...
const long double e_coeff = 1.0 / expq(lambda);
long double inv_k_factorial = 1.0l;
long double lambda_pow_k = 1.0l;
long double sum = 0.0l;
for(int k=0; k < cutoff; ++k)
{
sum += (c[k] * lambda_pow_k * inv_k_factorial);  // term k: c[k] * lambda^k / k!
lambda_pow_k *= lambda;                          // advance to lambda^(k+1)
inv_k_factorial /= (k + 1);                      // advance to 1/(k+1)!
}
sum *= e_coeff;
So now the three function calls and their respective overhead are completely gone from your loop.
I've used the same data types as you did in your question. Since your comment indicates that lambda is greater than 1.0, lambda_pow_k does not shrink toward zero, so I don't expect the relative error to grow because of it. Any significance lost here depends on the limits of long double, which may or may not be acceptable, depending on your concrete needs.
Compilers are clever nowadays, so it might get optimized like this anyway, but I think it's best to leave the less obvious optimizations to the optimizer. Your code shouldn't suffer in performance even when handed to a non-optimizing compiler.
Since the Poisson probabilities obey the simple recurrence
P(k,lam) = lam/k * P(k-1,lam)
one possibility would be to use something like Horner's rule for polynomials. That is:
Sum_k C[k]*P(k,lam) = exp(-lam) * (C[0] + (lam/1)*(C[1] + (lam/2)*(C[2] + ...)))
or
P = C[cut]
for k = cut-1 .. 0
    P = P*lam/(k+1) + C[k]
P *= exp(-lam)
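For completeness, a minimal C sketch of that pseudocode might look like this (the function name is mine; c[0..cut] are assumed to hold the coefficients and cut is the last index included in the sum):
#include <math.h>

/* Evaluates sum_{k=0}^{cut} c[k] * lambda^k * exp(-lambda) / k!
   Horner-style, using the recurrence P(k,lam) = lam/k * P(k-1,lam). */
long double poisson_weighted_sum(const long double *c, int cut, long double lambda)
{
    long double p = c[cut];
    for (int k = cut - 1; k >= 0; --k)
        p = p * lambda / (k + 1) + c[k];
    return p * expl(-lambda);
}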
Related
I have to calculate in C the binomial coefficients of the expression
(x+y)**n, with n very large (order of 500-1000). The first algorithm for calculating binomial coefficients that came to my mind was the multiplicative formula, so I coded it into my program as
/* computes the binomial coefficient C(k, m) as a long double */
long double binomial(int k, int m)
{
int i,j;
long double num=1, den=1;
j=m<(k-m)?m:(k-m);
for(i=1;i<=j;i++)
{
num*=(k+1-i);
den*=i;
}
return num/den;
}
This code is really fast on a single thread, compared for example to the recursive formula, although the latter is less subject to rounding errors since it involves only sums and not divisions.
So I wanted to test these algorithms for large values and tried to evaluate 500 choose 250 (order 10^160). I have found that the "relative error" is less than 10^(-19), so basically they are the same number, although they differ by something like 10^141.
So I'm wondering: is there a way to estimate the order of the error of the calculation? And is there some fast way to calculate binomial coefficients that is more precise than the multiplicative formula? Since I don't know the precision of my algorithm, I don't know where to truncate Stirling's series to get better results.
I've googled for some tables of binomial coefficients so I could copy from those, but the best one I've found stops at n=100...
If you're just computing individual binomial coefficients C(n,k) with n fairly large but no larger than about 1750, then your best bet with a decent C library is to use the tgammal standard library function:
tgammal(n+1) / (tgammal(n-k+1) * tgammal(k+1))
Tested with the GNU implementation of libm, this consistently produced results within a few ULP of the exact value, and generally better than solutions based on multiplying and dividing.
If k is small (or large) enough that the binomial coefficient does not overflow 64 bits of precision, then you can get a precise result by alternately multiplying and dividing.
If n is so large that tgammal(n+1) exceeds the range of a long double (more than 1754) but not so large that the numerator overflows, then a multiplicative solution is the best you can get without a bignum library. However, you could also use
expl(lgammal(n+1) - lgammal(n-k+1) - lgammal(k+1))
which is less precise but easier to code. (Also, if the logarithm of the coefficient is useful to you, the above formula will work over quite a large range of n and k. Not having to use expl will improve the accuracy.)
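As an illustration, the log-gamma form above could be wrapped in a small helper like this (the function name is mine):
#include <math.h>

/* Approximates C(n, k) via log-gamma; less precise than the ratio of tgammal
   calls, but usable for much larger n as long as the result itself fits. */
long double binomial_lgamma(unsigned n, unsigned k)
{
    return expl(lgammal((long double)n + 1)
              - lgammal((long double)(n - k) + 1)
              - lgammal((long double)k + 1));
}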
If you need a range of binomial coefficients with the same value of n, then your best bet is iterative addition:
void binoms(unsigned n, long double* res) {
// res must have (n+3)/2 elements; on return res[k] == C(n, k) for k = 0 .. n/2
res[0] = 1;
for (unsigned i = 2, half = 0; i <= n; ++i) {
res[half + 1] = res[half] * 2;
for (int k = half; k > 0; --k)
    res[k] += res[k-1];   // Pascal's rule: C(i,k) = C(i-1,k) + C(i-1,k-1)
if (i % 2 == 0)
++half;
}
}
The above produces only the coefficients with k from 0 to n/2. It has a slightly larger round-off error than the multiplicative algorithm (at least when k is getting close to n/2), but it's a lot quicker if you need all the coefficients and it has a larger range of acceptable inputs.
To get exact integer results for small k and m, a better solution might be (a slight variation of your code) :
unsigned long binomial(int k, int m)
{
int i,j; unsigned long num=1;
j=m<(k-m)?m:(k-m);
for(i=1;i<=j;i++)
{
num*=(k+1-i);
num/=i;
}
return num;
}
After each division num/=i, num is again a binomial coefficient, so the division is exact and nothing gets truncated. To get approximate results for bigger k and m, your solution might be good. But beware that long double multiplication is already much slower than integer multiplication and division (unsigned long or size_t). If you want exact results for bigger numbers, you probably need to write or include a big-integer library. You can also search for fast factorial algorithms for extremely large n; those may help with combinatorics too. Stirling's formula is a good approximation for ln(n!) when n is large. It all depends on how accurate you want to be.
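Since Stirling's formula is mentioned, here is a rough sketch of how it could be used for ln(n!) and, from that, for ln C(n,k); the function names are mine and the 0.5*ln(2*pi*n) and 1/(12n) terms are the standard correction terms:
#include <math.h>

/* Stirling's approximation for ln(n!); reasonable for moderate and large n. */
double ln_factorial_stirling(double n)
{
    return n * log(n) - n + 0.5 * log(6.283185307179586 * n) + 1.0 / (12.0 * n);
}

/* ln C(n, k) = ln n! - ln k! - ln (n-k)!, approximated with Stirling's formula. */
double ln_binomial_stirling(double n, double k)
{
    return ln_factorial_stirling(n) - ln_factorial_stirling(k)
         - ln_factorial_stirling(n - k);
}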
If you really want to use the multiplicative formula, I would recommend an exception-based approach:
1. Implement the formula with large integers (long long, for example).
2. Attempt division operations as soon as possible (as suggested by Zhuoran).
3. Add code to check the correctness of every division and multiplication.
4. Resolve incorrect divisions or multiplications, e.g.:
   - try the division in the loop proposed by Zhuoran, but if it fails, fall back to the initial algorithm (accumulating the product of the divisors in den)
   - store the unresolved multipliers and divisors in additional long integers and try to resolve them in later iterations
If you really use large numbers, then your result might not fit in a long integer; in that case you can switch to long double or use your own LongInteger storage.
This is a skeleton code, to give you an idea:
long long binomial_l(int k, int m)
{
int i,j;
long long num=1, den=1;
j=m<(k-m)?m:(k-m);
for(i=1;i<=j;i++)
{
int multiplier=(k+1-i);
int divisor=i;
long long candidate_num=num*multiplier;
//check multiplication
if((candidate_num/multiplier)!=num)
{
//resolve exception...
}
else
{
num=candidate_num;
}
candidate_num=num/divisor;
//check division
if((candidate_num*divisor)==num)
{
num=candidate_num;
}
else
{
//resolve exception
den*=divisor;
//this multiplication should also be checked...
}
}
long long candidate_result= num/den;
if((candidate_result*den)==num)
{
return candidate_result;
}
// you should not get here if all exceptions are resolved
return 0;
}
This may not be what OP is looking for, but one can analytically approximate nCr for large n with binary entropy function. It is mentioned in
Page 10 of http://www.inference.phy.cam.ac.uk/mackay/itprnn/ps/5.16.pdf
https://math.stackexchange.com/questions/835017/using-binary-entropy-function-to-approximate-logn-choose-k
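As an illustration of that approximation, a sketch in C could look like the following (the function name is mine; it estimates log2(nCr) via the binary entropy function and assumes 0 < k < n):
#include <math.h>

/* Entropy estimate of log2(C(n, k)): log2(C(n,k)) is approximately n * H2(k/n),
   where H2(x) = -x*log2(x) - (1-x)*log2(1-x). Reasonable when k and n-k are large. */
double log2_binomial_entropy(double n, double k)
{
    double x = k / n;
    double h2 = -x * log2(x) - (1.0 - x) * log2(1.0 - x);
    return n * h2;
}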
I have a task to make the following function as precise as possible (speed is not the aim). I have to use float and the method of middle rectangles (the midpoint rule). Could you suggest something? Actually, I think it's all about minimizing float rounding errors. That's what I've done:
typedef float T;
T integrate(T left, T right, long N, T (*func)(T)) {
long i = 0;
T result = 0.0;
T interval = right - left;
for(i = 0; i < N; i++) {
result += func(left + interval * (i + 0.5) / N) * interval / N;
}
return result;
}
There are lots of ways you could avoid or compensate for floating-point rounding (MM's suggestion, using Kahan summation, etc...). However, there's no reason to do so, because the rounding errors are absolutely dwarfed by the error of the integration scheme; you won't get a more accurate integral, you'll get a more accurate approximation of the incorrect result computed by the midpoint rule. Any such effort is entirely wasted except in extremely specialized circumstances.
I have a task to make the following function as precise as possible
You say that you have to use float, so I assume the question isn't about rounding, but rather about computing the integral more accurately.
I also assume that simply increasing N is not an option.
Instead of using the mid-point rule, my suggestion is to consider using a higher-order quadrature rule (trapezoid, Simpson's etc).
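For illustration only (this is not the asker's code), a composite Simpson's rule with the same signature as the original routine might look like this sketch:
typedef float T;

/* Composite Simpson's rule: h/3 * (f0 + 4*f1 + 2*f2 + ... + 4*f(N-1) + fN).
   N is treated as the number of subintervals and bumped up to an even count. */
T integrate_simpson(T left, T right, long N, T (*func)(T)) {
    if (N % 2) N++;
    T h = (right - left) / N;
    T sum = func(left) + func(right);
    for (long i = 1; i < N; i++) {
        T w = (i % 2) ? 4.0f : 2.0f;
        sum += w * func(left + i * h);
    }
    return sum * h / 3.0f;
}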
Try this:
T integrate(T left, T right, long N, T (*func)(T)) {
long i = 0;
T result = 0.0;
T interval = right - left;
for(i = 0; i < N; i++) {
result += func(left + interval * (i + 0.5) / N);
}
return result * interval / N;
}
If you want to compute an integral precisely, go read up on integration schemes. Some home-knit routine won't give you any kind of precision.
The book "Numerical recipes" (there are several versions, one for C) is highly regarded. Haven't looked at it personally.
For the classic interview question "How do you perform integer multiplication without the multiplication operator?", the easiest answer is, of course, the following linear-time algorithm in C:
int mult(int multiplicand, int multiplier)
{
    int product = 0;
    for (int i = 0; i < multiplier; i++)
    {
        product += multiplicand;
    }
    return product;
}
Of course, there is a faster algorithm. If we take advantage of the property that bit shifting to the left is equivalent to multiplying by 2 to the power of the number of bits shifted, we can bit-shift up to the nearest power of 2, and use our previous algorithm to add up from there. So, our code would now look something like this:
#include <math.h>

int ilog2(double n)
{
    return (int)(log(n) / log(2));
}

int mult(int multiplicand, int multiplier)
{
    int exponent = ilog2(multiplier);
    int nearest_power = 1 << exponent;      /* largest power of 2 that is <= multiplier */
    int product = multiplicand << exponent; /* multiplicand * nearest_power, via shifting */
    for (int i = nearest_power; i < multiplier; i++)
    {
        product += multiplicand;
    }
    return product;
}
I'm having trouble determining what the time complexity of this algorithm is. I don't believe that O(n - 2^(floor(log2(n)))) is the correct way to express this, although (I think?) it's technically correct. Can anyone provide some insight on this?
multiplier - nearest_power can be as large as half of multiplier, and as multiplier tends towards infinity the constant 0.5 there doesn't matter (not to mention we get rid of constants in Big O). The loop is therefore O(multiplier). I'm not sure about the bit-shifting.
Edit: I took more of a look around on the bit-shifting. As gbulmer says, it can be O(n), where n is the number of bits shifted. However, it can also be O(1) on certain architectures. See: Is bit shifting O(1) or O(n)?
However, it doesn't matter in this case: the number of bits shifted is at most log2(multiplier), and multiplier > log2(multiplier) for all valid inputs. So we have O(log2(multiplier)) + O(multiplier), which is a subset of O(2*multiplier), and thus the whole algorithm is O(multiplier).
The point of finding the nearest power of 2 is so that your function's runtime could get close to O(1). This happens when nearest_power is very close to multiplier.
Behind the scenes, the "power of 2" part is done with bit shifting.
So, to answer your question, the second version of your code is still worst-case linear time: O(multiplier).
Your answer, O(n - 2^(floor(log2(n)))), is also not incorrect; it's just very precise and might be hard to evaluate quickly in your head to find the bounds.
Edit
Let's look at the second posted algorithm, starting with:
int exponent = ilog2(multiplier);
int nearest_power = 1 << exponent;
I believe calculating log2 is, rather pleasingly, O(log2(multiplier)).
Then nearest_power lands in the interval [multiplier/2, multiplier], whose width is multiplier/2. This is the same as finding the highest set bit of a positive number.
So the for loop is O(multiplier/2); the constant 1/2 comes out, so it is O(multiplier).
On average, the starting point is half that interval away, which would be O(multiplier/4). But that is just the constant 1/4 times multiplier, so it is still O(multiplier); the constant is smaller, but it is still linear.
A faster algorithm.
Our intuition is that we can multiply by an n-digit number in n steps.
In binary, this uses a 1-bit shift, a 1-bit test, and a binary add to construct the whole answer. Each of those operations is O(1). This is long multiplication, one digit at a time.
If we use O(1) operations per bit of n, an x-bit number, the algorithm is O(log2(n)), or equivalently O(x), where x is the number of bits in the number.
This is an O(log2(n)) algorithm:
int mult(int multiplicand, int multiplier) {
int product = 0;
while (multiplier) {
if (multiplier & 1) product += multiplicand;
multiplicand <<= 1;
multiplier >>= 1;
}
return product;
}
It is essentially how we do long multiplication.
Of course, the wise thing to do is use the smaller number as the multiplier. (I'll leave that as an exercise for the reader :-)
This only works for positive values, but by testing and remembering the signs of the input, operating on positive values, and then adjusting the sign, it works for all numbers.
I'm trying to make a program to calculate the cos(x) function using its Taylor series; so far I've got this:
#include <stdio.h>
#include <math.h>

int factorial(int a){
if(a < 0)
return 0;
else if(a==0 || a==1)
return 1;
else
return a*(factorial(a-1));
}
double Tserie(float angle, int repetitions){
double series = 0.0;
float i;
for(i = 0.0; i < repetitions; i++){
series += (pow(-1, i) * pow(angle, 2*i))/factorial(2*i);
printf("%f\n", (pow(-1, i) * pow(angle, 2*i))/factorial(2*i));
}
return series;
}
For my example I'm using angle = 90 and repetitions = 20 to calculate cos(90), but it's useless; I just keep getting values close to infinity. Any help will be greatly appreciated.
For one thing, the angle is in radians, so for a 90 degree angle, you'd need to pass M_PI/2.
Also, you should avoid recursive functions for something as trivial as factorials; it would take a quarter of the effort to write it iteratively, and it would perform a lot better. You don't actually even need it: you can keep the factorial in a temporary variable and just multiply it by 2*i*(2*i-1) at each step. Keep in mind that you'll hit a representability/precision wall really quickly with this step.
You also don't need to actually call pow for -1 to the power of i; a simple i % 2 ? -1 : 1 would suffice. This way it's faster and it won't lose precision as you increase i.
Oh, and don't make i a float; it's an integer, so make it an int. You're leaking precision a lot as it is, why make it worse?
And to top it all off, you're approximating cos around 0, but are calling it for pi/2. You'll get really high errors doing that.
The Taylor series is for the mathematical cosine function, whose arguments is in radians. So 90 probably doesn't mean what you thought it meant here.
Furthermore, the series requires more terms the farther the argument is from 0. Generally, the number of terms needs to be comparable to the size of the argument before you even begin to see the successive terms becoming smaller, and many more than that are needed in order to get convergence. 20 is pitifully few terms to use for x=90.
Another problem is then that you compute the factorial as an int. The factorial function grows very fast -- already for 13! an ordinary C int (on a 32-bit machine) will overflow, so your terms beyond the sixth will be completely wrong anyway.
In fact the factorials and the powers of 90 quickly become too large to be represented even as doubles. If you want any chance of seeing the series converge, you must not compute each term from scratch but derive it from the previous one using a formula such as
nextTerm = - prevTerm * x * x / (2*i-1) / (2*i);
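Putting those pieces together, a sketch (not the asker's original code) that works in radians and derives each term from the previous one could be:
#include <stdio.h>
#include <math.h>

/* Taylor series for cos(x), x in radians; each term is obtained from the
   previous one via term *= -x*x / ((2*i-1)*(2*i)), so no factorials or
   large powers are ever formed explicitly. */
double cos_taylor(double x, int terms)
{
    double term = 1.0;   /* the i = 0 term */
    double sum  = 1.0;
    for (int i = 1; i < terms; i++) {
        term = -term * x * x / ((2.0 * i - 1.0) * (2.0 * i));
        sum += term;
    }
    return sum;
}

int main(void)
{
    const double pi = acos(-1.0);
    /* 90 degrees is pi/2 radians; the result should be very close to 0. */
    printf("%g\n", cos_taylor(pi / 2, 20));
    return 0;
}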
For one of my course projects I started implementing a "Naive Bayesian classifier" in C. My project is to implement a document classifier application (especially for spam) using huge training data.
Now I have a problem implementing the algorithm because of the limitations of C's data types.
( Algorithm I am using is given here, http://en.wikipedia.org/wiki/Bayesian_spam_filtering )
PROBLEM STATEMENT:
The algorithm involves taking each word in a document and calculating the probability of it being a spam word. If p1, p2, p3, ..., pn are the probabilities of word-1, 2, 3, ..., n, the probability of the doc being spam is calculated using
p = (p1 * p2 * ... * pn) / (p1 * p2 * ... * pn + (1 - p1) * (1 - p2) * ... * (1 - pn))
Here, a probability value can very easily be around 0.01, so even if I use the datatype "double" my calculation will go for a toss. To confirm this I wrote the sample code given below.
#define PROBABILITY_OF_UNLIKELY_SPAM_WORD (0.01)
#define PROBABILITY_OF_MOSTLY_SPAM_WORD (0.99)
int main()
{
int index;
long double numerator = 1.0;
long double denom1 = 1.0, denom2 = 1.0;
long double doc_spam_prob;
/* Simulating FEW unlikely spam words */
for(index = 0; index < 162; index++)
{
numerator = numerator*(long double)PROBABILITY_OF_UNLIKELY_SPAM_WORD;
denom2 = denom2*(long double)PROBABILITY_OF_UNLIKELY_SPAM_WORD;
denom1 = denom1*(long double)(1 - PROBABILITY_OF_UNLIKELY_SPAM_WORD);
}
/* Simulating lot of mostly definite spam words */
for (index = 0; index < 1000; index++)
{
numerator = numerator*(long double)PROBABILITY_OF_MOSTLY_SPAM_WORD;
denom2 = denom2*(long double)PROBABILITY_OF_MOSTLY_SPAM_WORD;
denom1 = denom1*(long double)(1- PROBABILITY_OF_MOSTLY_SPAM_WORD);
}
doc_spam_prob= (numerator/(denom1+denom2));
return 0;
}
I tried float, double and even long double datatypes, but still the same problem.
Hence, say in a 100K-word document I am analyzing, if just 162 words have a 1% spam probability and the remaining 99838 are conspicuously spam words, my app will still classify it as a Not Spam doc because of precision error (the numerator easily goes to ZERO)!
This is the first time I am hitting such an issue. So how exactly should this problem be tackled?
This happens often in machine learning. AFAIK, there's nothing you can do about the loss in precision itself. So to bypass this, we use the log function and convert divisions and multiplications into subtractions and additions, respectively.
So I decided to do the math.
The original equation is:
p = (p1 * p2 * ... * pn) / (p1 * p2 * ... * pn + (1 - p1) * (1 - p2) * ... * (1 - pn))
I slightly modify it:
1/p = 1 + [(1 - p1) * (1 - p2) * ... * (1 - pn)] / (p1 * p2 * ... * pn)
Taking logs on both sides:
ln(1/p - 1) = sum_{i=1..n} [ln(1 - p_i) - ln(p_i)]
Let
eta = sum_{i=1..n} [ln(1 - p_i) - ln(p_i)]
Substituting,
1/p - 1 = exp(eta)
Hence the alternate formula for computing the combined probability:
p = 1 / (1 + exp(eta))
If you need me to expand on this, please leave a comment.
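As a concrete illustration of the resulting formula, a sketch in C (the function name is mine; every p_i is assumed to be strictly between 0 and 1) could be:
#include <math.h>

/* Combines per-word spam probabilities in log space:
   eta = sum_i (ln(1 - p[i]) - ln(p[i])),  p = 1 / (1 + exp(eta)). */
long double combined_spam_probability(const double *p, int n)
{
    long double eta = 0.0L;
    for (int i = 0; i < n; i++)
        eta += logl(1.0L - (long double)p[i]) - logl((long double)p[i]);
    return 1.0L / (1.0L + expl(eta));
}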
Here's a trick:
for the sake of readability, let S := p_1 * ... * p_n and H := (1-p_1) * ... * (1-p_n),
then we have:
p = S / (S + H)
p = 1 / ((S + H) / S)
p = 1 / (1 + H / S)
Let's expand again:
p = 1 / (1 + ((1-p_1) * ... * (1-p_n)) / (p_1 * ... * p_n))
p = 1 / (1 + (1-p_1)/p_1 * ... * (1-p_n)/p_n)
So basically, you will obtain a product of quite large numbers (each between 0 and, for p_i = 0.01, 99). The idea is not to multiply tons of small numbers with one another and obtain, well, 0, but to multiply ratios of comparable magnitude instead. For example, if n = 1000000 and p_i = 0.5 for all i, the original method will give you 0/(0+0), which is NaN, whereas the proposed method will give you 1/(1+1*...*1), which is 0.5.
You can get even better results, when all p_i are sorted and you pair them up in opposed order (let's assume p_1 < ... < p_n), then the following formula will get even better precision:
p = 1 / (1 + (1-p_1)/p_n * ... * (1-p_n)/p_1)
That way you divide big numerators (from small p_i) by big denominators (from big p_(n+1-i)), and small numerators by small denominators.
Edit: MSalter proposed a useful further optimization in his answer. Using it, the formula reads as follows:
p = 1 / (1 + (1-p_1)/p_n * (1-p_2)/p_(n-1) * ... * (1-p_(n-1))/p_2 * (1-p_n)/p_1)
Your problem is caused by collecting too many terms without regard for their size. One solution is to take logarithms. Another is to sort your individual terms. First, let's rewrite the equation as 1/p = 1 + ∏((1-p_i)/p_i). Now your problem is that some of the terms are small, while others are big. If you have too many small terms in a row, you'll underflow, and with too many big terms you'll overflow the intermediate result.
So, don't put too many terms of the same order in a row. Sort the terms (1-p_i)/p_i. As a result, the first will be the smallest term, the last the biggest. Now, if you multiplied them straight away you would still have an underflow. But the order of calculation doesn't matter. Use two iterators into your temporary collection. One starts at the beginning (i.e. (1-p_0)/p_0), the other at the end (i.e. (1-p_n)/p_n), and your intermediate result starts at 1.0. Now, when your intermediate result is >= 1.0, you take a term from the front, and when your intermediate result is < 1.0 you take a term from the back.
The result is that as you take terms, the intermediate result will oscillate around 1.0. It will only go up or down as you run out of small or big terms. But that's OK. At that point, you've consumed the extremes on both ends, so the intermediate result will slowly approach the final result.
There's of course a real possibility of overflow. If the input is completely unlikely to be spam (p=1E-1000) then 1/p will overflow, because ∏((1-p_i)/p_i) overflows. But since the terms are sorted, we know that the intermediate result will overflow only if ∏((1-p_i)/p_i) overflows. So, if the intermediate result overflows, there's no subsequent loss of precision.
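A sketch of that two-iterator scheme in C (function and variable names are mine; the ratios (1-p_i)/p_i are assumed to have been precomputed into r[]):
#include <stdlib.h>

static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Multiplies the sorted ratios r[i] = (1 - p_i) / p_i, taking a term from the
   small end while the running product is >= 1 and from the big end while it
   is < 1, so the intermediate value stays close to 1. Returns 1/(1 + prod). */
double combined_probability_balanced(double *r, int n)
{
    qsort(r, n, sizeof *r, cmp_double);
    double prod = 1.0;
    int lo = 0, hi = n - 1;
    while (lo <= hi) {
        if (prod >= 1.0)
            prod *= r[lo++];   /* smallest remaining term */
        else
            prod *= r[hi--];   /* biggest remaining term */
    }
    return 1.0 / (1.0 + prod);
}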
Try computing the inverse 1/p. That gives you an equation of the form 1/p = 1 + ((1-p1)*(1-p2)*...)/(p1*p2*...).
If you then count the occurrences of each probability value--it looks like you have a small number of values that recur--you can use the pow() function--pow(1-p, occurrences_of_p)*pow(1-q, occurrences_of_q)--and avoid individual round-off with each multiplication.
You can use the probability expressed in percent or per mille:
doc_spam_prob= (numerator*100/(denom1+denom2));
or
doc_spam_prob= (numerator*1000/(denom1+denom2));
or use some other coefficient
I am not strong in math so I cannot comment on possible simplifications to the formula that might eliminate or reduce your problem. However, I am familiar with the precision limitations of long double types and am aware of several arbitrary and extended precision math libraries for C. Check out:
http://www.nongnu.org/hpalib/
and
http://www.tc.umn.edu/~ringx004/mapm-main.html