How do I use Metropolis Sampling in MATLAB to calculate an integral? - sampling

I am trying to write a MATLAB function to evaluate a test integral using the Metropolis method. My function is listed below.
The integral is from 0 to infinity of x*exp(-x^2), divided by the integral from 0 to infinity of exp(-x^2).
This function converges to ~0.5 (it fluctuates a little about this value); however, the analytic solution is 1/sqrt(pi), or ~0.5642.
The code I use to run the function is also below.
What have I done wrong? How do I use Metropolis to correctly solve this test integral?
% Metropolis Method for Integration
% Written by John Furness - Computational Physics, KTH
function [I S1 S2] = metropolis(f,a,b,n,sig)
% This function calculates an integral using the Metropolis Method
% Only takes input as a function, f, on an interval between a and b,
% where n is the number of points.
%Defining burnin
%burnin = n/20;
burnin = 0;
% Finding maximum point
x = linspace(a,b,1000);
f1 = f(x);
max1 = max(f1);
%Setting Up x-vector and mu
x(1) = rand(1);
mu=0;
% Generating Random Points for x with Gaussian distribution.
% Proposal Distribution will be the normal distribution
strg = 'exp(-1*((x-mu)/sig).^2)';
norm = inline(strg,'x','mu','sig');
for i = 2:n
% This loop generates a new state from the proposal distribution.
y = x(i-1) + sig*randn(1);
% generate a uniform for comparison
u = rand(1);
% Alpha is the acceptance probability
alpha = min([1, (f(y))/((f(x(i-1))))]);
if u <= alpha
x(i) = y;
else
x(i) = x(i-1);
end
end
%Discarding Burnin
%x(1:burnin) = [];
%I = ((inside)/length(x))*max1*(b-a);
I = (1/length(f(x)))*((sum(f(x))))/sum(norm(x,mu,sig));
%My investigation variables to see what's happening
%S1 = sum(f(x));
%S2 = sum(norm1(x,mu,sig));
S1 = min(x);
S2 = max(x);
end
Code used to run the above function:
% Code for Running Metropolis Method
% Written by John Furness - Computational Physics
% Clearing Workspace
clear all
close all
clc
% Equation 1
% Changing Parameters for Equation 1
a1 = 0;
b1 = 10;
n1 = 10000;
sig = 2;
N1 = @(x)(x.*exp(-x.^2));
D1 = @(x)(exp(-x.^2));
denom = metropolis(D1,a1,b1,n1,sig);
numer = metropolis(N1,a1,b1,n1,sig);
solI1 = numer/denom
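
For reference, the ratio of the two integrals is the mean of x under the density proportional to exp(-x^2) on x >= 0, so a single Metropolis chain that samples from that weight and averages x is one way to estimate it. The following is a minimal Python sketch under that assumption, not a corrected version of the MATLAB function above; the names metropolis_mean and weight, the step size, the burn-in length and the seed are all hypothetical choices made for illustration.

```python
import math
import random

def metropolis_mean(g, weight, n, step, x0=0.5, burn_in=1000, seed=0):
    """Estimate the mean of g(x) under the density proportional to weight(x)."""
    rng = random.Random(seed)
    x = x0
    wx = weight(x)
    total = 0.0
    for i in range(n + burn_in):
        # Symmetric random-walk proposal around the current state.
        y = x + step * rng.gauss(0.0, 1.0)
        wy = weight(y)
        # Accept with probability min(1, wy/wx); a zero-weight proposal is never accepted.
        if rng.random() * wx < wy:
            x, wx = y, wy
        # A rejected proposal repeats the current state in the average.
        if i >= burn_in:
            total += g(x)
    return total / n

def weight(x):
    # exp(-x^2) restricted to the integration range x >= 0.
    return math.exp(-x * x) if x >= 0.0 else 0.0

estimate = metropolis_mean(lambda x: x, weight, n=200_000, step=1.0)
print(estimate, 1.0 / math.sqrt(math.pi))
```

The two printed numbers should agree to within the statistical noise of the chain. The design choice here is that the weight function is never integrated on its own: the chain only ever needs ratios of weights, which is why the normalising integral in the denominator drops out.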


Erroneous result using inverse Vincenty's formula in C

I have written a C script to implement the inverse Vincenty's formula to calculate the distance between two sets of GPS coordinates based on the equations shown at https://en.wikipedia.org/wiki/Vincenty%27s_formulae
However, my results are different to the results given by this online calculator https://www.cqsrg.org/tools/GCDistance/ and Google maps. My results are consistently around 1.18 times the result of the online calculator.
My function is below, any tips on where I could be going wrong would be very much appreciated!
double get_distance(double lat1, double lon1, double lat2, double lon2)
{
double rad_eq = 6378137.0; //Radius at equator
double flattening = 1 / 298.257223563; //flattenig of earth
double rad_pol = (1 - flattening) * rad_eq; //Radius at poles
double U1,U2,L,lambda,old_lambda,sigma,sin_sig,cos_sig,alpha,cos2sigmam,A,B,C,u_sq,delta_s,dis;
//Convert to radians
lat1=M_PI*lat1/180.0;
lat2=M_PI*lat2/180.0;
lon1=M_PI*lon1/180.0;
lon2=M_PI*lon2/180.0;
//Calculate U1 and U2
U1=atan((1-flattening)*tan(lat1));
U2=atan((1-flattening)*tan(lat2));
L=lon2-lon1;
lambda=L;
double tolerance=pow(10.,-12.);//iteration tollerance should give 0.6mm
double diff=1.;
while (abs(diff)>tolerance)
{
sin_sig=sqrt(pow(cos(U2)*sin(lambda),2.)+pow(cos(U1)*sin(U2)-(sin(U1)*cos(U2)*cos(lambda)),2.));
cos_sig=sin(U1)*cos(U2)+cos(U1)*cos(U2)*cos(lambda);
sigma=atan(sin_sig/cos_sig);
alpha=asin((cos(U1)*cos(U2)*sin(lambda))/(sin_sig));
cos2sigmam=cos(sigma)-(2*sin(U1)*sin(U2))/((pow(cos(alpha),2.)));
C=(flattening/16)*pow(cos(alpha),2.)*(4+(flattening*(4-(3*pow(cos(alpha),2.)))));
old_lambda=lambda;
lambda=L+(1-C)*flattening*sin(alpha)*(sigma+C*sin_sig*(cos2sigmam+C*cos_sig*(-1+2*pow(cos2sigmam,2.))));
diff=abs(old_lambda-lambda);
}
u_sq=pow(cos(alpha),2.)*((pow(rad_eq,2.)-pow(rad_pol,2.))/(pow(rad_pol,2.)));
A=1+(u_sq/16384)*(4096+(u_sq*(-768+(u_sq*(320-(175*u_sq))))));
B=(u_sq/1024)*(256+(u_sq*(-128+(u_sq*(74-(47*u_sq))))));
delta_s=B*sin_sig*(cos2sigmam+(B/4)*(cos_sig*(-1+(2*pow(cos2sigmam,2.)))-(B/6)*cos2sigmam*(-3+(4*pow(sin_sig,2.)))*(-3+(4*pow(cos2sigmam,2.)))));
dis=rad_pol*A*(sigma-delta_s);
//Returns distance in metres
return dis;
}
This formula is not symmetric in U1 and U2:
cos_sig = sin(U1)*cos(U2)
+ cos(U1)*cos(U2) * cos(lambda);
and it turns out to be wrong: a sin is missing (the first term should be sin(U1)*sin(U2)).
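
The symmetry argument can be checked numerically. The sketch below, in Python rather than C, compares the expression from the question with the symmetric form sin(U1)*sin(U2) + cos(U1)*cos(U2)*cos(lambda); the function names and the sample angles are assumptions chosen only for illustration.

```python
import math

def cos_sigma_fixed(u1, u2, lam):
    # Symmetric form: swapping u1 and u2 leaves the value unchanged.
    return math.sin(u1) * math.sin(u2) + math.cos(u1) * math.cos(u2) * math.cos(lam)

def cos_sigma_buggy(u1, u2, lam):
    # Expression from the question: the first term uses cos(u2) instead of sin(u2).
    return math.sin(u1) * math.cos(u2) + math.cos(u1) * math.cos(u2) * math.cos(lam)

u1, u2, lam = math.radians(80.0), math.radians(0.0), math.radians(80.0)

print(cos_sigma_fixed(u1, u2, lam), cos_sigma_fixed(u2, u1, lam))
print(cos_sigma_buggy(u1, u2, lam), cos_sigma_buggy(u2, u1, lam))
```

For these angles the buggy expression changes when the two points are swapped and even exceeds 1, which is impossible for a cosine, while the symmetric form gives the same value both ways.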
Another style of formatting (one including some whitespace) would also help.
Besides replacing abs with fabs and that cos with a sin, I also changed the loop: there were two abs() calls, and diff had to be preset for the while loop, so I turned it into a do-while loop.
I inserted a printf to see how the value progresses.
Some parentheses can be left out. These formulas are really difficult to implement; a few more helper variables would be useful in this jungle of nested math operations.
do {
sin_sig = sqrt(pow( cos(U2) * sin(lambda), 2)
+ pow(cos(U1)*sin(U2)
- (sin(U1)*cos(U2) * cos(lambda))
, 2)
);
cos_sig = sin(U1) * sin(U2)
+ cos(U1) * cos(U2) * cos(lambda);
sigma = atan2(sin_sig, cos_sig);
alpha = asin(cos(U1) * cos(U2) * sin(lambda)
/ sin_sig
);
double cos2alpha = cos(alpha)*cos(alpha); // helper var.
cos2sigmam = cos(sigma) - 2*sin(U1)*sin(U2) / cos2alpha;
C = (flattening/16) * cos2alpha * (4 + flattening * (4 - 3*cos2alpha));
old_lambda = lambda;
lambda = L + (1-C) * flattening * sin(alpha)
*(sigma + C*sin_sig
*(cos2sigmam + C*cos_sig
*(2 * pow(cos2sigmam, 2) - 1)
)
);
diff = fabs(old_lambda - lambda);
printf("%.12f\n", diff);
} while (diff > tolerance);
For the points (80, 80) and (0, 0) the output is the successive values of diff, followed by the distance in km:
0.000885870048
0.000000221352
0.000000000055
0.000000000000
9809.479224
which matches the WGS-84 reference value to the millimetre.

Number of recursive calls in gcd() function

Recently I was given a gcd() function, written in C, which takes two arguments n and m and computes the GCD of these two numbers using recursion. I have been asked: "How many recursive calls are made by the function if n>=m?" Can anyone provide a solution with an explanation, as I am unable to figure it out?
Here is the source code of the function :
int gcd(int n, int m)
{
    if (n%m==0)
        return m;
    else
        n=n%m;
    return gcd(m, n);
}
The Euclidean algorithm gives #steps =
T(a, b) = 1 + T(b, r_0) = 2 + T(r_0, r_1) = … = N + T(r_(N-2), r_(N-1)) = N + 1
where a and b are the inputs, and r_i are the successive remainders. We used that T(x, 0) = 0.
Running an example on paper will help you get a better grasp of the equation above:
gcd(1071, 462) is calculated from the equivalent gcd(462, 1071 mod 462) = gcd(462, 147). The latter GCD is calculated from the gcd(147, 462 mod 147) = gcd(147, 21), which in turn is calculated from the gcd(21, 147 mod 21) = gcd(21, 0) = 21
So a = 1071 and b = 462, and we have:
T(a, b) =
1 + T(b, a % b) = 1 + T(b, r_0) = (1)
2 + T(r_0, b % r_0) = 2 + T(r_0, r_1) =
3 + T(r_1, r_0 % r_1) = 3 + T(r_1, r_2) = (2)
3 + T(r_1, 0) =
3 + 0 =
3
which says that we needed to take 3 steps to compute gcd(1071, 462).
(1): notice that the 1 is the step already done before, i.e. T(a, b)
(2): r_2 is equal to 0 in this example
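
The count can also be checked by instrumenting the function. Below is a small Python sketch that mirrors the C function from the question and returns the number of invocations alongside the result; gcd_with_count is a hypothetical helper name, not part of the original code.

```python
def gcd_with_count(n, m):
    """Mirror of the question's gcd(); returns (gcd, number of invocations)."""
    if n % m == 0:
        return m, 1
    g, calls = gcd_with_count(m, n % m)
    return g, calls + 1

g, calls = gcd_with_count(1071, 462)
print(g, calls, calls - 1)  # gcd, total invocations, recursive calls
```

For gcd(1071, 462) this reports 3 invocations, in line with T(a, b) = 3 above. Because the question's code stops as soon as n % m == 0, the first invocation is the caller's own, so the number of recursive calls is one less than the number of invocations.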
You could run a plethora of examples on paper and see how this unfolds; eventually you will be able to see the pattern, if you don't see it already.
Note: While @Ian Abott's comments are also correct, I decided to present this approach, since it's more generic and can be applied to any similar recursive method.

Cython: slow numpy arrays

I am trying to speed up my code using Cython. After translating the code from Python into Cython I am seeing that I have not gained any speed-up. I think the origin of the problem is the poor performance I am getting when using NumPy arrays in Cython.
I have come up with a very simple program to show this:
############### test.pyx #################
import numpy as np
cimport numpy as np
cimport cython
def func1(long N):
    cdef double sum1,sum2,sum3
    cdef long i
    sum1 = 0.0
    sum2 = 0.0
    sum3 = 0.0
    for i in range(N):
        sum1 += i
        sum2 += 2.0*i
        sum3 += 3.0*i
    return sum1,sum2,sum3
def func2(long N):
    cdef np.ndarray[np.float64_t,ndim=1] sum_arr
    cdef long i
    sum_arr = np.zeros(3,dtype=np.float64)
    for i in range(N):
        sum_arr[0] += i
        sum_arr[1] += 2.0*i
        sum_arr[2] += 3.0*i
    return sum_arr
def func3(long N):
    cdef double sum_arr[3]
    cdef long i
    sum_arr[0] = 0.0
    sum_arr[1] = 0.0
    sum_arr[2] = 0.0
    for i in range(N):
        sum_arr[0] += i
        sum_arr[1] += 2.0*i
        sum_arr[2] += 3.0*i
    return sum_arr
##########################################
################## test.py ###############
import time
import test as test
N = 1000000000
for i in xrange(10):
    start = time.time()
    sum1,sum2,sum3 = test.func1(N)
    print 'Time taken = %.3f'%(time.time()-start)
print '\n'
for i in xrange(10):
    start = time.time()
    sum_arr = test.func2(N)
    print 'Time taken = %.3f'%(time.time()-start)
print '\n'
for i in xrange(10):
    start = time.time()
    sum_arr = test.func3(N)
    print 'Time taken = %.3f'%(time.time()-start)
############################################
And from python test.py I get:
Time taken = 1.445
Time taken = 1.433
Time taken = 1.434
Time taken = 1.428
Time taken = 1.449
Time taken = 1.425
Time taken = 1.421
Time taken = 1.451
Time taken = 1.483
Time taken = 1.418
Time taken = 2.623
Time taken = 2.603
Time taken = 2.977
Time taken = 3.237
Time taken = 2.748
Time taken = 2.798
Time taken = 2.811
Time taken = 2.783
Time taken = 2.585
Time taken = 2.595
Time taken = 1.503
Time taken = 1.529
Time taken = 1.509
Time taken = 1.543
Time taken = 1.427
Time taken = 1.425
Time taken = 1.423
Time taken = 1.415
Time taken = 1.414
Time taken = 1.418
My question is: why is func2 almost 2x slower than func1 and func3?
Is there a way to improve this?
Thanks!
######## UPDATE
My real problem is as follows. I am calling a function that accepts a 3D array (say P[i,j,k]). The function loops through each element and computes several quantities: one that depends on the value of the array at that position (say A=f(P[i,j,k])) and others that depend only on the position itself (B=g(i,j,k)). Schematically things look like this:
for i in xrange(N):
    corr1 = h(i,val)
    for j in xrange(N):
        corr2 = h(j,val)
        for k in xrange(N):
            corr3 = h(k,val)
            A = f(P[i,j,k])
            B = g(i,j,k)
            Arr[B] += A*corr1*corr2*corr3
where val is a property of the 3D array represented by a number. This number can be different for different fields.
Since I have to do this operation over many 3D arrays, I thought it would be better to create a new routine that accepts many different input 3D arrays, leaving the number of arrays unknown a priori. The idea is that since B will be exactly the same for all arrays, I can avoid computing it for each array and compute it only once. The problem is that corr1, corr2 and corr3 above become arrays.
If the number of 3D arrays is num_3D_arrays, I am doing something like:
for i in xrange(N):
    for p in xrange(num_3D_arrays):
        corr1[p] = h(i,val[p])
    for j in xrange(N):
        for p in xrange(num_3D_arrays):
            corr2[p] = h(j,val[p])
        for k in xrange(N):
            for p in xrange(num_3D_arrays):
                corr3[p] = h(k,val[p])
            B = g(i,j,k)
            for p in xrange(num_3D_arrays):
                A[p] = f(P[i,j,k])
                Arr[p,B] += A[p]*corr1[p]*corr2[p]*corr3[p]
So the fact that I am changing the variables corr1, corr2, corr3 and A from scalars to arrays is killing the performance gain that I would expect from avoiding the repeated big loop.
There are a couple of things you can do to speed up array indexing in Cython:
Turn off bounds checking and wraparound.
Use typed memoryviews.
Declare the array as contiguous.
So for your function:
@cython.boundscheck(False)
@cython.wraparound(False)
def func2(long N):
    cdef np.float64_t[::1] sum_arr
    cdef long i
    sum_arr = np.zeros(3,dtype=np.float64)
    for i in range(N):
        sum_arr[0] += i
        sum_arr[1] += 2.0*i
        sum_arr[2] += 3.0*i
    return sum_arr
For the original code Cython produced the following C code for the line sum_arr[0] += i:
__pyx_t_12 = 0;
__pyx_t_6 = -1;
if (__pyx_t_12 < 0) {
__pyx_t_12 += __pyx_pybuffernd_sum_arr.diminfo[0].shape;
if (unlikely(__pyx_t_12 < 0)) __pyx_t_6 = 0;
} else if (unlikely(__pyx_t_12 >= __pyx_pybuffernd_sum_arr.diminfo[0].shape)) __pyx_t_6 = 0;
if (unlikely(__pyx_t_6 != -1)) {
__Pyx_RaiseBufferIndexError(__pyx_t_6);
{__pyx_filename = __pyx_f[0]; __pyx_lineno = 13; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
}
*__Pyx_BufPtrStrided1d(__pyx_t_5numpy_float64_t *, __pyx_pybuffernd_sum_arr.rcbuffer->pybuffer.buf, __pyx_t_12, __pyx_pybuffernd_sum_arr.diminfo[0].strides) += __pyx_v_i;
With the improvements above:
__pyx_t_8 = 0;
*((double *) ( /* dim=0 */ ((char *) (((double *) __pyx_v_sum_arr.data) + __pyx_t_8)) )) += __pyx_v_i;
why func2 is almost 2x slower than func1?
It's because indexing causes an indirection, so you double the number of elementary operations. Calculate the sums as in func1, then assign the result with
sum=array([sum1,sum2,sum3])
How to speed up Python code?
NumPy is the first good idea: it reaches nearly C speed with no effort.
Numba can fill the gap with no effort too, and is very simple.
Cython for critical cases.
Here is some illustration of that:
# python way
def func1(N):
    sum1 = 0.0
    sum2 = 0.0
    sum3 = 0.0
    for i in range(N):
        sum1 += i
        sum2 += 2.0*i
        sum3 += 3.0*i
    return sum1,sum2,sum3
# numpy way
from numpy import arange
def func2(N):
    aran=arange(float(N))
    sum1=aran.sum()
    sum2=(2.0*aran).sum()
    sum3=(3.0*aran).sum()
    return sum1,sum2,sum3
#numba way
import numba
func3 =numba.njit(func1)
"""
In [609]: %timeit func1(10**6)
1 loop, best of 3: 710 ms per loop
In [610]: %timeit func2(1e6)
100 loops, best of 3: 22.2 ms per loop
In [611]: %timeit func3(10e6)
100 loops, best of 3: 2.87 ms per loop
"""
Look at the html produced by cython -a ...pyx.
For func1, the sum1 += i line expands to :
+15: sum1 += i
__pyx_v_sum1 = (__pyx_v_sum1 + __pyx_v_i);
for func3, with a C array
+45: sum_arr[0] += i
__pyx_t_3 = 0;
(__pyx_v_sum_arr[__pyx_t_3]) = ((__pyx_v_sum_arr[__pyx_t_3]) + __pyx_v_i);
Slightly more complicated, but straightforward C.
But for func2:
+29: sum_arr[0] += i
__pyx_t_12 = 0;
__pyx_t_6 = -1;
if (__pyx_t_12 < 0) {
__pyx_t_12 += __pyx_pybuffernd_sum_arr.diminfo[0].shape;
if (unlikely(__pyx_t_12 < 0)) __pyx_t_6 = 0;
} else if (unlikely(__pyx_t_12 >= __pyx_pybuffernd_sum_arr.diminfo[0].shape)) __pyx_t_6 = 0;
if (unlikely(__pyx_t_6 != -1)) {
__Pyx_RaiseBufferIndexError(__pyx_t_6);
__PYX_ERR(0, 29, __pyx_L1_error)
}
*__Pyx_BufPtrStrided1d(__pyx_t_5numpy_float64_t *, __pyx_pybuffernd_sum_arr.rcbuffer->pybuffer.buf, __pyx_t_12, __pyx_pybuffernd_sum_arr.diminfo[0].strides) += __pyx_v_i;
Much more complicated, with references to NumPy buffer helpers (e.g. __Pyx_BufPtrStrided1d). Even initializing the array is complicated:
+26: sum_arr = np.zeros(3,dtype=np.float64)
__pyx_t_1 = __Pyx_GetModuleGlobalName(__pyx_n_s_np); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 26, __pyx_L1_error)
__Pyx_GOTREF(__pyx_t_1);
....
I expect that moving the sum_arr creation to the calling Python, and passing it as an argument to func2 would save some time.
Have you read this guide to using memoryviews?
http://cython.readthedocs.io/en/latest/src/userguide/memoryviews.html
You'll get the best cython performance if you focus on writing the low level operations so they translate into simple c. In
for k in xrange(N):
    corr3 = h(k,val)
    A = f(P[i,j,k])
    B = g(i,j,k)
    Arr[B] += A*corr1*corr2*corr3
It's not the loops on i,j,k that will slow you down. It's evaluating h, f, and g each time, as well as the Arr[B] +=.... Those functions should be tightly coded cython, not general Python functions. Look at the compiled simplicity of the sum3d function in the memoryview guide.

Matlab: Help understanding sinusoidal curve fit

I have an unknown sine wave with some noise that I am trying to reconstruct. The ultimate goal is to come up with a C algorithm to find the amplitude, dc offset, phase, and frequency of a sine wave but I am prototyping in Matlab (Octave actually) first. The sine wave is of the form
y = a + b*sin(c + 2*pi*d*t)
a = dc offset
b = amplitude
c = phase shift (rad)
d = frequency
I have found this example and in the comments John D'Errico presents a method for using Least Squares to fit a sine wave to data. It is a neat little algorithm and works remarkably well but I am having difficulties understanding one aspect. The algorithm is as follows:
Algorithm
Suppose you have a sine wave of the form:
(1) y = a + b*sin(c+d*x)
Using the identity
(2) sin(u+v) = sin(u)*cos(v) + cos(u)*sin(v)
We can rewrite (1) as
(3) y = a + b*sin(c)*cos(d*x) + b*cos(c)*sin(d*x)
Since b*sin(c) and b*cos(c) are constants, these can be wrapped into constants b1 and b2.
(4) y = a + b1*cos(d*x) + b2*sin(d*x)
This is the equation that is used to fit the sine wave. A function is created to generate regression coefficients and a sum-of-squares residual error.
(5) cfun = @(d) [ones(size(x)), sin(d*x), cos(d*x)] \ y;
(6) sumerr2 = @(d) sum((y - [ones(size(x)), sin(d*x), cos(d*x)] * cfun(d)) .^ 2);
Next, sumerr2 is minimized for the frequency d using fminbnd with lower limit l1 and upper limit l2.
(7) dopt = fminbnd(sumerr2, l1, l2);
Now a, b, and c can be computed. The coefficients to compute a, b, and c are given from (4) at dopt
(8) abb = cfun(dopt);
The dc offset is simply the first value
(9) a = abb(1);
A trig identity is used to find b
(10) sin(u)^2 + cos(u)^2 = 1
(11) b = sqrt(b1^2 + b2^2)
(12) b = norm(abb([2 3]));
Finally the phase offset is found
(13) b1 = b*cos(c)
(14) c = acos(b1 / b);
(15) c = acos(abb(2) / b);
Question
What is going on in (5) and (6)? Can someone break down what is happening in pseudo-code or perhaps perform the same function in a more explicit way?
(5) cfun = @(d) [ones(size(x)), sin(d*x), cos(d*x)] \ y;
(6) sumerr2 = @(d) sum((y - [ones(size(x)), sin(d*x), cos(d*x)] * cfun(d)) .^ 2);
Also, given (4) shouldn't it be:
[ones(size(x)), cos(d*x), sin(d*x)]
Code
Here is the Matlab code in full. Blue line is the actual signal. Green line is the reconstructed signal.
close all
clear all
y = [111,140,172,207,243,283,319,350,383,414,443,463,483,497,505,508,503,495,479,463,439,412,381,347,311,275,241,206,168,136,108,83,63,54,45,43,41,45,51,63,87,109,137,168,204,239,279,317,348,382,412,439,463,479,496,505,508,505,495,483,463,441,414,383,350,314,278,245,209,175,140,140,110,85,63,51,45,41,41,44,49,63,82,105,135,166,200,236,277,313,345,379,409,438,463,479,495,503,508,503,498,485,467,444,415,383,351,318,281,247,211,174,141,111,87,67,52,45,42,41,45,50,62,79,104,131,163,199,233,273,310,345,377,407,435,460,479,494,503,508,505,499,486,467,445,419,387,355,319,284,249,215,177,143,113,87,67,55,46,43,41,44,48,63,79,102,127,159,191,232,271,307,343,373,404,437,457,478,492,503,508,505,499,488,470,447,420,391,360,323,287,254,215,182,147,116,92,70,55,46,43,42,43,49,60,76,99,127,159,191,227,268,303,339,371,401,431,456,476,492,502,507,507,500,488,471,447,424,392,361,326,287,287,255,220,185,149,119,92,72,55,47,42,41,43,47,57,76,95,124,156,189,223,258,302,337,367,399,428,456,476,492,502,508,508,501,489,471,451,425,396,364,328,294,259,223,188,151,119,95,72,57,46,43,44,43,47,57,73,95,124,153,187,222,255,297,335,366,398,426,451,471,494,502,507,508,502,489,474,453,428,398,367,332,296,262,227,191,154,124,95,75,60,47,43,41,41,46,55,72,94,119,150,183,215,255,295,331,361,396,424,447,471,489,500,508,508,502,492,475,454,430,401,369,335,299,265,228,191,157,126,99,76,59,49,44,41,41,46,55,72,92,118,147,179,215,252,291,328,360,392,422,447,471,488,499,507,508,503,493,477,456,431,403]';
fs = 100e3;
N = length(y);
t = (0:1/fs:N/fs-1/fs)';
cfun = @(d) [ones(size(t)), sin(2*pi*d*t), cos(2*pi*d*t)]\y;
sumerr2 = @(d) sum((y - [ones(size(t)), sin(2*pi*d*t), cos(2*pi*d*t)] * cfun(d)) .^ 2);
dopt = fminbnd(sumerr2, 2300, 2500);
abb = cfun(dopt);
a = abb(1);
b = norm(abb([2 3]));
c = acos(abb(2) / b);
d = dopt;
y_reconstructed = a + b*sin(2*pi*d*t - c);
figure(1)
hold on
title('Signal Reconstruction')
grid on
plot(t*1000, y, 'b')
plot(t*1000, y_reconstructed, 'g')
ylim = get(gca, 'ylim');
xlim = get(gca, 'xlim');
text(xlim(1), ylim(2) - 15, [num2str(b) ' cos(2\pi * ' num2str(d) 't - ' ...
num2str(c * 180/pi) ') + ' num2str(a)]);
hold off
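(5) and (6) define anonymous functions that can be used within the optimisation code. cfun takes the parameter d (the optimisation variable that will be varied) and captures x and y from the workspace. The backslash operator solves the linear least-squares problem for the matrix whose columns are 1, sin(d*x) and cos(d*x), so cfun returns the three regression coefficients for that d. Because the column order is [ones, sin, cos], abb(2) multiplies the sine term and abb(3) the cosine term. Similarly, sumerr2 is another anonymous function of d, this time returning a scalar: the sum of squared residuals of that fit. That scalar is the error that is minimised by fminbnd.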
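
To make (5) and (6) more explicit, here is a rough Python sketch of the same computation for one fixed frequency. It is written under several assumptions: it solves the 3x3 normal equations by hand instead of using MATLAB's backslash (which uses a more robust factorisation), the function name fit_at_frequency is hypothetical, and the test signal is synthetic rather than the data from the question. The phase is recovered with atan2, which keeps the sign that acos discards.

```python
import math

def fit_at_frequency(t, y, d):
    """Fit y ~ a + p*sin(2*pi*d*t) + q*cos(2*pi*d*t) for a fixed frequency d.

    Returns ([a, p, q], sum of squared residuals), i.e. cfun(d) and sumerr2(d).
    """
    rows = [(1.0, math.sin(2 * math.pi * d * ti), math.cos(2 * math.pi * d * ti))
            for ti in t]
    # Normal equations (X^T X) beta = X^T y, stored as an augmented 3x4 matrix.
    m = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(3)]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        piv = max(range(col, 3), key=lambda k: abs(m[k][col]))
        m[col], m[piv] = m[piv], m[col]
        for k in range(col + 1, 3):
            factor = m[k][col] / m[col][col]
            for j in range(col, 4):
                m[k][j] -= factor * m[col][j]
    # Back substitution.
    beta = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        beta[i] = (m[i][3] - sum(m[i][j] * beta[j] for j in range(i + 1, 3))) / m[i][i]
    sse = sum((yi - sum(bc * xc for bc, xc in zip(beta, r))) ** 2
              for r, yi in zip(rows, y))
    return beta, sse

# Synthetic signal with known parameters (hypothetical values).
fs = 100e3
t = [k / fs for k in range(500)]
a0, b0, c0, d0 = 275.0, 233.0, 0.7, 2400.0
y = [a0 + b0 * math.sin(c0 + 2 * math.pi * d0 * ti) for ti in t]

(a, p, q), sse = fit_at_frequency(t, y, d0)
b = math.hypot(p, q)   # amplitude, as in (11)
c = math.atan2(q, p)   # phase: p = b*cos(c), q = b*sin(c)
print(a, b, c, sse)
```

In the original algorithm fminbnd calls sumerr2 repeatedly with different trial values of d; the sketch only evaluates the fit at the true frequency, where the residual is essentially zero and a, b and c are recovered.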

Calculate (x exponent 0.19029) with low memory using lookup table?

I'm writing a C program for a PIC micro-controller which needs to do a very specific exponential function. I need to calculate the following:
A = k . (1 - (p/p0)^0.19029)
k and p0 are constant, so it's all pretty simple apart from finding x^0.19029
(p/p0) ratio would always be in the range 0-1.
It works well if I add in math.h and use the power function, except that uses up all of the available 16 kB of program memory. Talk about bloatware! (Rest of program without power function = ~20% flash memory usage; add math.h and power function, =100%).
I'd like the program to do some other things as well. I was wondering if I can write a special case implementation for x^0.19029, maybe involving iteration and some kind of lookup table.
My idea is to generate a look-up table for the function x^0.19029, with perhaps 10-100 values of x in the range 0-1. The code would find a close match, then (somehow) iteratively refine it by re-scaling the lookup table values. However, this is where I get lost because my tiny brain can't visualise the maths involved.
Could this approach work?
Alternatively, I've looked at using Exp(x) and Ln(x), which can be implemented with a Taylor expansion. b^x can then be found with:
b^x = (e^(ln b))^x = e^(x.ln(b))
(See: Wikipedia - Powers via Logarithms)
This looks a bit tricky and complicated to me, though. Am I likely to get the implementation smaller than the compiler's math library, and can I simplify it for my special case (i.e. base = 0-1, exponent always 0.19029)?
Note that RAM usage is OK at the moment, but I've run low on Flash (used for code storage). Speed is not critical. Somebody has already suggested that I use a bigger micro with more flash memory, but that sounds like profligate wastefulness!
[EDIT] I was being lazy when I said "(p/p0) ratio would always be in the range 0-1". Actually it will never reach 0, and I did some calculations last night and decided that in fact a range of 0.3 - 1 would be quite adequate! This means that some of the simpler solutions below should be suitable. Also, the "k" in the above is 44330, and I'd like the error in the final result to be less than 0.1. I guess that means the error in (p/p0)^0.19029 needs to be less than 1/443300 or 2.256e-6.
Use splines. The relevant part of the function is shown in the figure below. It varies approximately like the 5th root, so the problematic zone is close to p / p0 = 0. There is mathematical theory how to optimally place the knots of splines to minimize the error (see Carl de Boor: A Practical Guide to Splines). Usually one constructs the spline in B form ahead of time (using toolboxes such as Matlab's spline toolbox - also written by C. de Boor), then converts to Piecewise Polynomial representation for fast evaluation.
In C. de Boor, PGS, the function g(x) = sqrt(x + 1) is actually taken as an example (Chapter 12, Example II). This is exactly what you need here. The book comes back to this case a few times, since it is admittedly a hard problem for any interpolation scheme due to the infinite derivatives at x = -1. All software from PGS is available for free as PPPACK in netlib, and most of it is also part of SLATEC (also from netlib).
Edit (Removed)
(Multiplying by x once does not significantly help, since it only regularizes the first derivative, while all other derivatives at x = 0 are still infinite.)
Edit 2
My feeling is that optimally constructed splines (following de Boor) will be best (and fastest) for relatively low accuracy requirements. If the accuracy requirements are high (say 1e-8), one may be forced to get back to the algorithms that mathematicians have been researching for centuries. At this point, it may be best to simply download the sources of glibc and copy (provided GPL is acceptable) whatever is in
glibc-2.19/sysdeps/ieee754/dbl-64/e_pow.c
Since we don't have to include the whole math.h, there shouldn't be a problem with memory, but we will only marginally profit from having a fixed exponent.
Edit 3
Here is an adapted version of e_pow.c from netlib, as found by @Joni. This seems to be the grandfather of glibc's more modern implementation mentioned above. The old version has two advantages: (1) It is public domain, and (2) it uses a limited number of constants, which is beneficial if memory is a tight resource (glibc's version defines over 10000 lines of constants!). The following is completely standalone code, which calculates x^0.19029 for 0 <= x <= 1 to double precision (I tested it against Python's power function and found that at most 2 bits differed):
#define __LITTLE_ENDIAN
#ifdef __LITTLE_ENDIAN
#define __HI(x) *(1+(int*)&x)
#define __LO(x) *(int*)&x
#else
#define __HI(x) *(int*)&x
#define __LO(x) *(1+(int*)&x)
#endif
static const double
bp[] = {1.0, 1.5,},
dp_h[] = { 0.0, 5.84962487220764160156e-01,}, /* 0x3FE2B803, 0x40000000 */
dp_l[] = { 0.0, 1.35003920212974897128e-08,}, /* 0x3E4CFDEB, 0x43CFD006 */
zero = 0.0,
one = 1.0,
two = 2.0,
two53 = 9007199254740992.0, /* 0x43400000, 0x00000000 */
/* poly coefs for (3/2)*(log(x)-2s-2/3*s**3 */
L1 = 5.99999999999994648725e-01, /* 0x3FE33333, 0x33333303 */
L2 = 4.28571428578550184252e-01, /* 0x3FDB6DB6, 0xDB6FABFF */
L3 = 3.33333329818377432918e-01, /* 0x3FD55555, 0x518F264D */
L4 = 2.72728123808534006489e-01, /* 0x3FD17460, 0xA91D4101 */
L5 = 2.30660745775561754067e-01, /* 0x3FCD864A, 0x93C9DB65 */
L6 = 2.06975017800338417784e-01, /* 0x3FCA7E28, 0x4A454EEF */
P1 = 1.66666666666666019037e-01, /* 0x3FC55555, 0x5555553E */
P2 = -2.77777777770155933842e-03, /* 0xBF66C16C, 0x16BEBD93 */
P3 = 6.61375632143793436117e-05, /* 0x3F11566A, 0xAF25DE2C */
P4 = -1.65339022054652515390e-06, /* 0xBEBBBD41, 0xC5D26BF1 */
P5 = 4.13813679705723846039e-08, /* 0x3E663769, 0x72BEA4D0 */
lg2 = 6.93147180559945286227e-01, /* 0x3FE62E42, 0xFEFA39EF */
lg2_h = 6.93147182464599609375e-01, /* 0x3FE62E43, 0x00000000 */
lg2_l = -1.90465429995776804525e-09, /* 0xBE205C61, 0x0CA86C39 */
ovt = 8.0085662595372944372e-0017, /* -(1024-log2(ovfl+.5ulp)) */
cp = 9.61796693925975554329e-01, /* 0x3FEEC709, 0xDC3A03FD =2/(3ln2) */
cp_h = 9.61796700954437255859e-01, /* 0x3FEEC709, 0xE0000000 =(float)cp */
cp_l = -7.02846165095275826516e-09, /* 0xBE3E2FE0, 0x145B01F5 =tail of cp_h*/
ivln2 = 1.44269504088896338700e+00, /* 0x3FF71547, 0x652B82FE =1/ln2 */
ivln2_h = 1.44269502162933349609e+00, /* 0x3FF71547, 0x60000000 =24b 1/ln2*/
ivln2_l = 1.92596299112661746887e-08; /* 0x3E54AE0B, 0xF85DDF44 =1/ln2 tail*/
double pow0p19029(double x)
{
double y = 0.19029e+00;
double z,ax,z_h,z_l,p_h,p_l;
double y1,t1,t2,r,s,t,u,v,w;
int i,j,k,n;
int hx,hy,ix,iy;
unsigned lx,ly;
hx = __HI(x); lx = __LO(x);
hy = __HI(y); ly = __LO(y);
ix = hx&0x7fffffff; iy = hy&0x7fffffff;
ax = x;
/* special value of x */
if(lx==0) {
if(ix==0x7ff00000||ix==0||ix==0x3ff00000){
z = ax; /*x is +-0,+-inf,+-1*/
return z;
}
}
s = one; /* s (sign of result -ve**odd) = -1 else = 1 */
double ss,s2,s_h,s_l,t_h,t_l;
n = ((ix)>>20)-0x3ff;
j = ix&0x000fffff;
/* determine interval */
ix = j|0x3ff00000; /* normalize ix */
if(j<=0x3988E) k=0; /* |x|<sqrt(3/2) */
else if(j<0xBB67A) k=1; /* |x|<sqrt(3) */
else {k=0;n+=1;ix -= 0x00100000;}
__HI(ax) = ix;
/* compute ss = s_h+s_l = (x-1)/(x+1) or (x-1.5)/(x+1.5) */
u = ax-bp[k]; /* bp[0]=1.0, bp[1]=1.5 */
v = one/(ax+bp[k]);
ss = u*v;
s_h = ss;
__LO(s_h) = 0;
/* t_h=ax+bp[k] High */
t_h = zero;
__HI(t_h)=((ix>>1)|0x20000000)+0x00080000+(k<<18);
t_l = ax - (t_h-bp[k]);
s_l = v*((u-s_h*t_h)-s_h*t_l);
/* compute log(ax) */
s2 = ss*ss;
r = s2*s2*(L1+s2*(L2+s2*(L3+s2*(L4+s2*(L5+s2*L6)))));
r += s_l*(s_h+ss);
s2 = s_h*s_h;
t_h = 3.0+s2+r;
__LO(t_h) = 0;
t_l = r-((t_h-3.0)-s2);
/* u+v = ss*(1+...) */
u = s_h*t_h;
v = s_l*t_h+t_l*ss;
/* 2/(3log2)*(ss+...) */
p_h = u+v;
__LO(p_h) = 0;
p_l = v-(p_h-u);
z_h = cp_h*p_h; /* cp_h+cp_l = 2/(3*log2) */
z_l = cp_l*p_h+p_l*cp+dp_l[k];
/* log2(ax) = (ss+..)*2/(3*log2) = n + dp_h + z_h + z_l */
t = (double)n;
t1 = (((z_h+z_l)+dp_h[k])+t);
__LO(t1) = 0;
t2 = z_l-(((t1-t)-dp_h[k])-z_h);
/* split up y into y1+y2 and compute (y1+y2)*(t1+t2) */
y1 = y;
__LO(y1) = 0;
p_l = (y-y1)*t1+y*t2;
p_h = y1*t1;
z = p_l+p_h;
j = __HI(z);
i = __LO(z);
/*
* compute 2**(p_h+p_l)
*/
i = j&0x7fffffff;
k = (i>>20)-0x3ff;
n = 0;
if(i>0x3fe00000) { /* if |z| > 0.5, set n = [z+0.5] */
n = j+(0x00100000>>(k+1));
k = ((n&0x7fffffff)>>20)-0x3ff; /* new k for n */
t = zero;
__HI(t) = (n&~(0x000fffff>>k));
n = ((n&0x000fffff)|0x00100000)>>(20-k);
if(j<0) n = -n;
p_h -= t;
}
t = p_l+p_h;
__LO(t) = 0;
u = t*lg2_h;
v = (p_l-(t-p_h))*lg2+t*lg2_l;
z = u+v;
w = v-(z-u);
t = z*z;
t1 = z - t*(P1+t*(P2+t*(P3+t*(P4+t*P5))));
r = (z*t1)/(t1-two)-(w+z*w);
z = one-(r-z);
__HI(z) += (n<<20);
return s*z;
}
Clearly, 50+ years of research have gone into this, so it's probably very hard to do any better. (One has to appreciate that there are 0 loops, only 2 divisions, and only 6 if statements in the whole algorithm!) The reason for this is, again, the behavior at x = 0, where all derivatives diverge, which makes it extremely hard to keep the error under control: I once had a spline representation with 18 knots that was good up to x = 1e-4, with absolute and relative errors < 5e-4 everywhere, but going to x = 1e-5 ruined everything again.
So, unless the requirement to go arbitrarily close to zero is relaxed, I recommend using the adapted version of e_pow.c given above.
Edit 4
Now that we know that the domain 0.3 <= x <= 1 is sufficient, and that we have very low accuracy requirements, Edit 3 is clearly overkill. As @MvG has demonstrated, the function is so well behaved that a polynomial of degree 7 is sufficient to satisfy the accuracy requirements, which can be considered a single spline segment. @MvG's solution minimizes the integral error, which already looks very good.
The question arises as to how much better we can still do. It would be interesting to find the polynomial of a given degree that minimizes the maximum error in the interval of interest. The answer is the minimax polynomial, which can be found using Remez' algorithm, which is implemented in the Boost library. I like @MvG's idea to clamp the value at x = 1 to 1, which I will do as well. Here is minimax.cpp:
#include <iostream>
#include <iomanip>
#define TARG_PREC 64
#define WORK_PREC (TARG_PREC*2)
#include <boost/multiprecision/cpp_dec_float.hpp>
typedef boost::multiprecision::number<boost::multiprecision::cpp_dec_float<WORK_PREC> > dtype;
using boost::math::pow;
#include <boost/math/tools/remez.hpp>
boost::shared_ptr<boost::math::tools::remez_minimax<dtype> > p_remez;
dtype f(const dtype& x) {
static const dtype one(1), y(0.19029);
return one - pow(one - x, y);
}
void out(const char *descr, const dtype& x, const char *sep="") {
std::cout << descr << boost::math::tools::real_cast<double>(x) << sep << std::endl;
}
int main() {
dtype a(0), b(0.7); // range to optimise over
bool rel_error(false), pin(true);
int orderN(7), orderD(0), skew(0), brake(50);
int prec = 2 + (TARG_PREC * 3010LL)/10000;
std::cout << std::scientific << std::setprecision(prec);
p_remez.reset(new boost::math::tools::remez_minimax<dtype>(
&f, orderN, orderD, a, b, pin, rel_error, skew, WORK_PREC));
out("Max error in interpolated form: ", p_remez->max_error());
p_remez->set_brake(brake);
unsigned i, count(50);
for (i = 0; i < count; ++i) {
std::cout << "Stepping..." << std::endl;
dtype r = p_remez->iterate();
out("Maximum Deviation Found: ", p_remez->max_error());
out("Expected Error Term: ", p_remez->error_term());
out("Maximum Relative Change in Control Points: ", r);
}
boost::math::tools::polynomial<dtype> n = p_remez->numerator();
for(i = n.size(); i--; ) {
out("", n[i], ",");
}
}
Since all parts of boost that we use are header-only, simply build with:
c++ -O3 -I<path/to/boost/headers> minimax.cpp -o minimax
We finally get the coefficients, which after multiplication by 44330 are (highest power first, in powers of 1 - x):
24538.3409, -42811.1497, 34300.7501, -11284.1276, 4564.5847, 3186.7541, 8442.5236, 0.
The following error plot demonstrates that this is really the best possible degree-7 polynomial approximation, since all extrema are of equal magnitude (0.06659):
Should the requirements ever change (while still keeping well away from 0!), the C++ program above can be simply adapted to spit out the new optimal polynomial approximation.
Instead of a lookup table, I'd use a polynomial approximation:
1 - x^0.19029 ≈ - 1073365.91783 x^15 + 8354695.40833 x^14 - 29422576.6529 x^13 + 61993794.537 x^12 - 87079891.4988 x^11 + 86005723.842 x^10 - 61389954.7459 x^9 + 32053170.1149 x^8 - 12253383.4372 x^7 + 3399819.97536 x^6 - 672003.142815 x^5 + 91817.6782072 x^4 - 8299.75873768 x^3 + 469.530204564 x^2 - 16.6572179869 x + 0.722044145701
Or in code:
double f(double x) {
double fx;
fx = - 1073365.91783;
fx = fx*x + 8354695.40833;
fx = fx*x - 29422576.6529;
fx = fx*x + 61993794.537;
fx = fx*x - 87079891.4988;
fx = fx*x + 86005723.842;
fx = fx*x - 61389954.7459;
fx = fx*x + 32053170.1149;
fx = fx*x - 12253383.4372;
fx = fx*x + 3399819.97536;
fx = fx*x - 672003.142815;
fx = fx*x + 91817.6782072;
fx = fx*x - 8299.75873768;
fx = fx*x + 469.530204564;
fx = fx*x - 16.6572179869;
fx = fx*x + 0.722044145701;
return fx;
}
I computed this in Sage using the least squares approach, i.e. by solving the normal equations for the monomial basis over [0, 1]:
f(x) = 1-x^(19029/100000) # your function
d = 16 # number of terms, i.e. degree + 1
A = matrix(d, d, lambda r, c: integrate(x^r*x^c, (x, 0, 1)))
b = vector([integrate(x^r*f(x), (x, 0, 1)) for r in range(d)])
A.solve_right(b).change_ring(RDF)
Here is a plot of the error this will entail:
Blue is the error from my 16 term polynomial, while red is the error you'd get from piecewise linear interpolation with 16 equidistant values. As you can see, both errors are quite small for most parts of the range, but will become really huge close to x=0. I actually clipped the plot there. If you can somehow narrow the range of possible values, you could use that as the domain for the integration, and obtain an even better fit for the relevant range. At the cost of worse fit outside, of course. You could also increase the number of terms to obtain a closer fit, although that might also lead to higher oscillations.
I guess you can also combine this approach with the one Stefan posted: use his to split the domain into several parts, then use mine to find a close low degree polynomial for each part.
Update
Since you updated the specification of your question, with regard to both the domain and the error, here is a minimal solution to fit those requirements:
44330 (1 - x^0.19029) ≈ + 23024.9160933 (1-x)^7 - 39408.6473636 (1-x)^6 + 31379.9086193 (1-x)^5 - 10098.7031260 (1-x)^4 + 4339.44098317 (1-x)^3 + 3202.85705860 (1-x)^2 + 8442.42528906 (1-x)
double f(double x) {
double fx, x1 = 1. - x;
fx = + 23024.9160933;
fx = fx*x1 - 39408.6473636;
fx = fx*x1 + 31379.9086193;
fx = fx*x1 - 10098.7031260;
fx = fx*x1 + 4339.44098317;
fx = fx*x1 + 3202.85705860;
fx = fx*x1 + 8442.42528906;
fx = fx*x1;
return fx;
}
For the least squares fit I integrated over x from 0.293 to 1, or equivalently over 1 - x from 0 to 0.707, to keep the worst oscillations outside the relevant domain. I also omitted the constant term, to ensure an exact result at x=1. The maximal error for the range [0.3, 1] now occurs at x=0.3260 and amounts to 0.0972 < 0.1. Here is an error plot, which of course has bigger absolute errors than the one above due to the scale factor k=44330 which has been included here.
I can also state that the first three derivatives of the function will have constant sign over the range in question, so the function is monotonic, convex, and in general pretty well-behaved.
Not meant to answer the question, but it illustrates the Road Not To Go, and thus may be helpful:
This quick-and-dirty C code calculates pow(x, 0.19029) for x from 0.00 to 1.01 in steps of 0.01 and stores the results in a table. The first half displays the ratio, in percent, between each exact value and the same value stored in units of 1/65536 (as that theoretically provides slightly over 4 decimals of precision). The second half shows both interpolated and calculated values in steps of 0.001, and the ratio between these two in percent.
It kind of looks okay if you read from the bottom up, all 100s and 99.99s there, but about the first 20 values from 0.001 to 0.020 are worthless.
#include <stdio.h>
#include <math.h>
float powers[102];
int main (void)
{
int i, as_int;
double as_real, low, high, delta, approx, calcd, diff;
printf ("calculating and storing:\n");
for (i=0; i<=101; i++)
{
as_real = pow(i/100.0, 0.19029);
as_int = (int)round(65536*as_real);
powers[i] = as_real;
diff = 100*as_real/(as_int/65536.0);
printf ("%.5f %.5f %.5f ~ %.3f\n", i/100.0, as_real, as_int/65536.0, diff);
}
printf ("\n");
printf ("-- interpolating in 1/10ths:\n");
for (i=0; i<1000; i++)
{
as_real = i/1000.0;
low = powers[i/10];
high = powers[1+i/10];
delta = (high-low)/10.0;
approx = low + (i%10)*delta;
calcd = pow(as_real, 0.19029);
diff = 100.0*approx/calcd;
printf ("%.5f ~ %.5f = %.5f +/- %.5f%%\n", as_real, approx, calcd, diff);
}
return 0;
}
You can find a complete, correct standalone implementation of pow in fdlibm. It's about 200 lines of code, about half of which deal with special cases. If you remove the code that deals with special cases you're not interested in I doubt you'll have problems including it in your program.
LutzL's answer is a really good one: Calculate your power as (x^1.52232)^(1/8), computing the inner power by spline interpolation or another method. The eighth root, taken as three nested square roots, deals with the pathological non-differentiable behavior near zero. I took the liberty of mocking up an implementation this way. The code below, however, only does a linear interpolation for x^1.52232, and you'd need to get the full coefficients using your favorite numerical mathematics tools. You'll be adding scarcely 40 lines of code to get your needed power, plus however many knots you choose to use for your spline, as dictated by your required accuracy.
Don't be scared by the #include <math.h>; it's just for benchmarking the code.
#include <stdio.h>
#include <math.h>
double my_sqrt(double x) {
/* Newton's method for a square root. */
int i = 0;
double res = 1.0;
if (x > 0) {
for (i = 0; i < 10; i++) {
res = 0.5 * (res + x / res);
}
} else {
res = 0.0;
}
return res;
}
double my_152232(double x) {
/* Cubic spline interpolation for x ** 1.52232. */
int i = 0;
double res = 0.0;
/* coefs[i] will give the cubic polynomial coefficients between x =
i and x = i+1. Out of laziness, the below numbers give only a
linear interpolation. You'll need to do some work and research
to get the spline coefficients. */
double coefs[3][4] = {{0.0, 1.0, 0.0, 0.0},
{-0.872526, 1.872526, 0.0, 0.0},
{-2.032706, 2.452616, 0.0, 0.0}};
if ((x >= 0) && (x < 3.0)) {
i = (int) x;
/* Horner's method cubic. */
res = ((coefs[i][3] * x + coefs[i][2]) * x + coefs[i][1]) * x
+ coefs[i][0];
} else if (x >= 3.0) {
/* Scaled x ** 1.5 once you go off the spline. */
res = 1.024824 * my_sqrt(x * x * x);
}
return res;
}
double my_019029(double x) {
return my_sqrt(my_sqrt(my_sqrt(my_152232(x))));
}
int main() {
int i;
double x = 0.0;
for (i = 0; i < 1000; i++) {
x = 1e-2 * i;
printf("%f %f %f \n", x, my_019029(x), pow(x, 0.19029));
}
return 0;
}
EDIT: If you're just interested in a small region like [0,1], even simpler is to peel off one sqrt(x) and compute x^1.02232, which is quite well behaved, using a Taylor series:
double my_152232(double x) {
double part_050000 = my_sqrt(x);
double part_102232 = 1.02232 * x + 0.0114091 * x * x - 3.718147e-3 * x * x * x;
return part_102232 * part_050000;
}
This gets you within 1% of the exact power for approximately [0.1,6], though getting the singularity exactly right is always a challenge. Even so, this three-term Taylor series gets you within 2.3% for x = 0.001.
