I am trying to implement the Haversine Formula in a little GPS program I'm writing. The distance calculations appear to be spot-on. However, I believe the bearing is being computed in radians, and I don't know how to properly convert the result to compass directions (0 for North, 90 for East, etc).
Any help would be greatly appreciated, all this talk of cosigns and arctangents is giving me a major headache! I'm a coder, not a mathematician!
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
void FindDistance (double latHome, double lonHome, double latDest, double lonDest)
{
double pi=3.141592653589793;
int R=6371; //Radius of the Earth in kilometers
//Keep the parameters passed to the function immutable
double latHomeTmp=(pi/180)*(latHome);
double latDestTmp=(pi/180)*(latDest);
double differenceLon= (pi/180)*(lonDest- lonHome);
double differenceLat=(pi/180)*(latDest- latHome);
double a= sin (differenceLat/2.)*sin (differenceLat/2.)+cos (latHomeTmp)*cos (latDestTmp)*sin (differenceLon/2.)*sin (differenceLon/2.);
double c=2*atan2 (sqrt (a), sqrt (1-a));
double Distance=R*c;
printf ("Distance is %f\n", Distance);
double RadBearing=atan2 (sin (differenceLon)*cos (latDestTmp), cos (latHomeTmp)*sin (latDestTmp)-sin (latHomeTmp)*cos (latDestTmp)*cos (differenceLon));
double DegBearing=RadBearing*57.2958;
if (DegBearing<0) DegBearing=360+DegBearing;
printf ("Bearing is %f\n", DegBearing);
} //Function FindDistance
int main (void) {
puts ("LA to NY");
FindDistance (34.052235, -118.243683, 40.748817, -73.985428);
puts ("NY to LA");
FindDistance (40.748817, -73.985428, 34.052235, -118.243683);
} //Function main
gcc -o gps -lm gps.c
It returns a bearing of 65 from LA to NY, and a bearing of 273 from NY to LA.
If we add the bearings together, we get 338 which can't be right - shouldn't it equal 360?
Or am I completely out to lunch?
Anyway, as you can see, I always compute both distance and bearing at the same time. If you could also suggest a way to clean up the code so it doesn't perform unnecessary calculations, that would be so very outstanding! I'm running this on a small microprocessor where I like to make every cycle count!
Not a problem.
Consider 2 locations at the same latitude, yet differ in longitude. One is not due east (90) of the other when going in the shorting great circle route. Nor is the first due (west 270). The bearings are not necessarily complementary.
FindDistance(34.0, -118.0, 34.0, -73.0);
FindDistance(34.0, -73.0, 34.0, -118.0);
Distance is 4113.598081
Bearing is 76.958824
Distance is 4113.598081
Bearing is 283.041176
#user3386109 adds more good information.
Per the site suggest by #M Oehm your code is about correct.
Per OP request, some mods that may have slight speed improvement.
void FindDistance(double latHome, double lonHome, double latDest,
double lonDest) {
// A few extra digits sometimes is worth it - rare is adding more digits a problem
// double pi=3.141592653589793;
static const double pi_d180 = 3.1415926535897932384626433832795 / 180;
static const double d180_pi = 180 / 3.1415926535897932384626433832795;
// int R=6371; //Radius of the Earth in kilometers
// (Compiler may do this all ready)
static const double R = 6371.0; // better to make FP to avoid the need to convert
//Keep the parameters passed to the function immutable
double latHomeTmp = pi_d180 * (latHome);
double latDestTmp = pi_d180 * (latDest);
double differenceLon = pi_d180 * (lonDest - lonHome);
double differenceLat = pi_d180 * (latDest - latHome);
double a = sin(differenceLat / 2.) * sin(differenceLat / 2.)
+ cos(latHomeTmp) * cos(latDestTmp) * sin(differenceLon / 2.)
* sin(differenceLon / 2.);
double c = 2 * atan2(sqrt(a), sqrt(1 - a));
double Distance = R * c;
printf("Distance is %f\n", Distance);
double RadBearing = atan2(sin(differenceLon) * cos(latDestTmp),
cos(latHomeTmp) * sin(latDestTmp)
- sin(latHomeTmp) * cos(latDestTmp) * cos(differenceLon));
// double DegBearing = RadBearing * 57.2958;
double DegBearing = RadBearing * d180_pi;
// Why is this even needed?
if (DegBearing < 0) DegBearing = 360 + DegBearing;
printf("Bearing is %f\n", DegBearing);
} //Function FindDistance
Related
I have written a C script to implement the inverse Vincenty's formula to calculate the distance between two sets of GPS coordinates based on the equations shown at https://en.wikipedia.org/wiki/Vincenty%27s_formulae
However, my results are different to the results given by this online calculator https://www.cqsrg.org/tools/GCDistance/ and Google maps. My results are consistently around 1.18 times the result of the online calculator.
My function is below, any tips on where I could be going wrong would be very much appreciated!
double get_distance(double lat1, double lon1, double lat2, double lon2)
{
double rad_eq = 6378137.0; //Radius at equator
double flattening = 1 / 298.257223563; //flattenig of earth
double rad_pol = (1 - flattening) * rad_eq; //Radius at poles
double U1,U2,L,lambda,old_lambda,sigma,sin_sig,cos_sig,alpha,cos2sigmam,A,B,C,u_sq,delta_s,dis;
//Convert to radians
lat1=M_PI*lat1/180.0;
lat2=M_PI*lat2/180.0;
lon1=M_PI*lon1/180.0;
lon2=M_PI*lon2/180.0;
//Calculate U1 and U2
U1=atan((1-flattening)*tan(lat1));
U2=atan((1-flattening)*tan(lat2));
L=lon2-lon1;
lambda=L;
double tolerance=pow(10.,-12.);//iteration tollerance should give 0.6mm
double diff=1.;
while (abs(diff)>tolerance)
{
sin_sig=sqrt(pow(cos(U2)*sin(lambda),2.)+pow(cos(U1)*sin(U2)-(sin(U1)*cos(U2)*cos(lambda)),2.));
cos_sig=sin(U1)*cos(U2)+cos(U1)*cos(U2)*cos(lambda);
sigma=atan(sin_sig/cos_sig);
alpha=asin((cos(U1)*cos(U2)*sin(lambda))/(sin_sig));
cos2sigmam=cos(sigma)-(2*sin(U1)*sin(U2))/((pow(cos(alpha),2.)));
C=(flattening/16)*pow(cos(alpha),2.)*(4+(flattening*(4-(3*pow(cos(alpha),2.)))));
old_lambda=lambda;
lambda=L+(1-C)*flattening*sin(alpha)*(sigma+C*sin_sig*(cos2sigmam+C*cos_sig*(-1+2*pow(cos2sigmam,2.))));
diff=abs(old_lambda-lambda);
}
u_sq=pow(cos(alpha),2.)*((pow(rad_eq,2.)-pow(rad_pol,2.))/(pow(rad_pol,2.)));
A=1+(u_sq/16384)*(4096+(u_sq*(-768+(u_sq*(320-(175*u_sq))))));
B=(u_sq/1024)*(256+(u_sq*(-128+(u_sq*(74-(47*u_sq))))));
delta_s=B*sin_sig*(cos2sigmam+(B/4)*(cos_sig*(-1+(2*pow(cos2sigmam,2.)))-(B/6)*cos2sigmam*(-3+(4*pow(sin_sig,2.)))*(-3+(4*pow(cos2sigmam,2.)))));
dis=rad_pol*A*(sigma-delta_s);
//Returns distance in metres
return dis;
}
This formula is not symmetric:
cos_sig = sin(U1)*cos(U2)
+ cos(U1)*cos(U2) * cos(lambda);
And turns out to be wrong, a sin is missing.
Another style of formatting (one including some whitespace) could also help.
Besides the fabs for abs and one sin for that cos I also changed the loop; there were two abs()-calls and diff had to be preset with the while-loop.
I inserted a printf to see how the value progresses.
Some parentheses can be left out. These formulas are really difficult to realize. Some more helper variables could be useful in this jungle of nested math operations.
do {
sin_sig = sqrt(pow( cos(U2) * sin(lambda), 2)
+ pow(cos(U1)*sin(U2)
- (sin(U1)*cos(U2) * cos(lambda))
, 2)
);
cos_sig = sin(U1) * sin(U2)
+ cos(U1) * cos(U2) * cos(lambda);
sigma = atan2(sin_sig, cos_sig);
alpha = asin(cos(U1) * cos(U2) * sin(lambda)
/ sin_sig
);
double cos2alpha = cos(alpha)*cos(alpha); // helper var.
cos2sigmam = cos(sigma) - 2*sin(U1)*sin(U2) / cos2alpha;
C = (flat/16) * cos2alpha * (4 + flat * (4 - 3*cos2alpha));
old_lambda = lambda;
lambda = L + (1-C) * flat * sin(alpha)
*(sigma + C*sin_sig
*(cos2sigmam + C*cos_sig
*(2 * pow(cos2sigmam, 2) - 1)
)
);
diff = fabs(old_lambda - lambda);
printf("%.12f\n", diff);
} while (diff > tolerance);
For 80,80, 0,0 the output is (in km):
0.000885870048
0.000000221352
0.000000000055
0.000000000000
9809.479224
which corresponds to the millimeter with WGS-84.
I started to learn to code last year.Not very good though.
I tried to plot the trajectory of cannonball shot from the ground using c language. (I didn't code this alone anyway).
(# sorry I have to delete the original code for a while. If you would like to see my original code, please let me know. but I'm sure it'd be totally enough if you just see the selected answer below. )
and if I excute it, the result is like this.
the speed is?
6
the angle is
32
the spring is
0.4
X=0.000000, Y=0.000000
X=0.000000, Y=-0.049000
X=0.000000, Y=-0.049000
X=0.000000, Y=-0.049000
X=0.000000, Y=-0.049000
X=0.000000, Y=-0.049000
X=0.000000, Y=-0.049000
X=0.000000, Y=-0.049000
X=0.000000, Y=-0.049000
X=0.000000, Y=-0.049000
I tried to figure out what I did wrong, but I have no idea at all.
Please help me!
The problem with this approach is that the Y-coordinate can become negative. And once it does, if the initial speed is not high enough, the ball can become stuck underneath the surface, as demonstrated by #RetiredNinja's results (the coordinate becomes stuck at -0.049 at some point which is not the correct behavior).
Erroneous results (not to scale) from original code, with modified parameters:
The ball seems to "tunnel" through the ground (some values dip below zero).
How to resolve this? We need to resolve the collision properly, making sure that the ball bounces instead of crossing the surface boundary. To do so, let's examine the behavior of the ball during a timestep in which a bounce occurs.
A collision occurs when both the vertical velocity and vertical position are negative. To find the collision time and velocity, use the equations of motion:
Once we have the collision velocity, we can simply update the new vertical speed to -spring * vc (spring should have a better name, e.g. coef_rest); an additional advantage is that we no longer need calls to pow.
This may occur several times during a timestep, so we need to perform this in a loop. One more thing to watch out for is that as the vertical velocity decays, the bounces become infinitely more frequent - so we need a "cut-off" velocity at which to stop the ball from bouncing.
Code:
#include <stdio.h>
#include <math.h>
#ifndef M_PI
#define M_PI 3.141592654
#endif
int main()
{
const double g = 9.81; // gravity
const double dt = 0.025; // time step
const double maxtime = 5.0; // max time
const double spring = 0.95; // coefficient of restitution
const double cutoff = 1e-4; // cut-off velocity
double speed = 6;
double angle = 32;
angle = angle * M_PI / 180.0;
double init_vx = speed * cos(angle);
double init_vy = speed * sin(angle);
int springnumber = 0;
printf("0.0,0.0\n");
for (double ts = 0.0, vs = init_vy, time = dt; time <= maxtime;)
{
// positions *after* this time step
double px = time * init_vx;
double elapse = time - ts;
double py = 0.0, vy = 0.0;
if (vs >= cutoff)
{
py = (vs - 0.5 * g * elapse) * elapse;
vy = vs - g * elapse;
}
// check for bounce
if (vy < 0.0 && py < 0.0)
{
// collision time
double tc = 2.0 * vs / g;
// update speed after bounce and time of collision
springnumber++;
vs *= spring;
ts += tc;
continue;
}
// print
printf("%f,%f\n", px, py);
// timestep
time += dt;
}
return 0;
}
Test results, with the same parameters as before:
The ball no longer "tunnels", which is the correct behavior.
I'm getting the following error message after trying to do the a numerical integration on a infinte interval [0,inf) using GSL in C.
gsl: qags.c:553: ERROR: bad integrand behavior found in the integration interval
Default GSL error handler invoked.
Command terminated by signal 6
Here is the function I'm integrating
$
double dI2dmu(double x, void * parametros){
double *p,Ep,mu,M,T;
p=(double *) parametros;
M=p[0];
T=p[1];
mu=p[2];
Ep=sqrt(x*x+M*M);
double fplus= -((exp((Ep - mu)/T)/(pow(1 + exp((Ep - mu)/T),2)*T) - exp((Ep + \
mu)/T)/(pow(1 + exp((Ep + mu)/T),2)*T))*pow(x,2))/(2.*Ep*pow(PI,2));
return fplus;
}
And the code for the integration procedure
params[0]=0.007683; //M
params[1]=0.284000;// T
params[2]=0.1; //mu
gsl_function dI2mu_u;
dI2mu_u.function = &dI2dmu;
dI2mu_u.params = ¶ms;
gsl_integration_qagiu (&dI2mu_u, 0, 0, 1e-7, 100000,
w, &resultTest2, &error1Test2);
The fucntion has the following aspect:
Which, to my eyes, has a very well behavior. So, instead of performing an infinite integration, I perform the integration up to an upper limit that I consider rezonable, like in:
gsl_function G;
G.function = &dI2dmu;
G.params = ¶ms;
gsl_integration_qags (&G, 0, 1e2*A, 0, 1e-7, 100000,
w, &result1, &error1);
Getting a result that agrees with the result of Mathematica for infinite integration
result definite up to 10*A = 0.005065263943958745
result up to infinity = nan
Mathematica result up to infinity = 0.005065260000000000
But the GSL infinite integral keps being "nan". Any ideas? I thanks in advance for the help.
As #yonatan zuleta ochoa points out correctly, the problem is in exp(t)/pow(exp(t)+1,2). exp(t) can overflow an ieee754 DBL_MAX for values of t as low as nextafter(log(DBL_MAX), INFINITY), which is ~7.09783e2.
When exp(t) == INFINITY,
exp(t)/pow(exp(t)+1,2) == ∞/pow(∞+1,2) == ∞/∞ == NAN
Yonatan's proposed solution is to use logarithms, which can be done as follows:
exp(t)/pow(exp(t)+1,2) == exp(log(exp(t)) - log(pow(exp(t)+1,2)))
== exp(t - 2*log(exp(t)+1))
== exp(t - 2*log1p(exp(t))) //<math.h> function avoiding loss of precision for log(exp(t)+1)) if exp(t) << 1.0
This is an entirely reasonable approach, avoiding NAN up to very high values of t. However, in your code, t == (Ep ± mu)/T can be INFINITY if abs(T) < 1.0 for values of x close to DBL_MAX, even if x is not infinity. In this case, the subtraction t - 2*log1p(exp(t)) turns into ∞ - ∞, which is NAN again.
A different approach is to replace exp(x)/pow(exp(x)+1,2) with 1.0/(pow(exp(x)+1,2)*pow(exp(x), -1)) by dividing both denominator and numerator by exp(x) (which is not zero for any finite x). This simplifies to 1.0/(exp(x)+exp(-x)+2.0).
Here is an implementation of the function avoiding NAN for values of x up to and including DBL_MAX:
static double auxfun4(double a, double b, double c, double d)
{
return 1.0/(a*b+2.0+c*d);
}
double dI2dmu(double x, void * parametros)
{
double *p = (double *) parametros;
double invT = 1.0/p[1];
double Ep = hypot(x, p[0]);
double muexp = exp(p[2]*invT);
double Epexp = exp(Ep*invT);
double muinv = 1.0/muexp;
double Epinv = 1.0/Epexp;
double subterm = auxfun4(Epexp, muinv, Epinv, muexp);
subterm -= auxfun4(Epexp, muexp, Epinv, muinv);
double fminus = subterm*(x/Ep)*invT*(0.5/(M_PI*M_PI))*x;;
return -fminus;
}
This implementation also uses hypot(x,M), rather than sqrt(x*x, M*M), and avoids calculating x*x by rearranging the order of multiplications/divisions to group x/Ep together. Since hypot(x,M) will be abs(x) for abs(x) >> abs(M), the term x/Ep approaches 1.0 for large x.
I think the problem here is that unlike Mathematica, C does not use arbitrary precision in computing. Then, at some point when Exp [Ep] is calculated numerical computation overflows.
Now, GSL uses the transformation x = (1-t)/t, to map onto interval (0,1].
So, for t<<0 is posible to get nan results since the behavior of your function tends to indeterminations (0/0 or inf/inf,etc) for extreme values.
Maybe if you write out the terms
Exp[ ( Ep(x) - \Mu)/T ] / { 1 + Exp[( Ep(x) - \Mu )/T] }^2
using A/B = Exp[ Ln A - Ln B], you could get a better numerical behavior.
I will try if and I have nice results, then I'll tell you.
The solution
As I said before, you must take care the problems arising with indeterminate forms. So, lets write out the problematic terms using the logarithmic version:
double dIdmu(double x, void * parametros){
double *p,Ep,mu,M,T;
p=(double *) parametros;
M=p[0];
T=p[1];
mu=p[2];
Ep=sqrt(x*x+M*M);
double fplus= - ( exp( (Ep - mu)/T -2.0*log(1.0 + exp((Ep - mu)/T) ) ) - exp( (Ep + mu)/T -2.0*log(1.0 + exp((Ep + mu)/T) ) ) ) * pow(x,2) / (2.* T * Ep*pow(M_PI,2));
return fplus;
}
and with this main function
int main()
{
double params[3];
double resultTest2, error1Test2;
gsl_integration_workspace * w
= gsl_integration_workspace_alloc (10000);
params[0]=0.007683; //M
params[1]=0.284000;// T
params[2]=0.1; //mu
gsl_function dI2mu_u;
dI2mu_u.function = &dIdmu;
dI2mu_u.params = ¶ms;
gsl_integration_qagiu (&dI2mu_u, 0.0, 1e-7, 1e-7, 10000, w, &resultTest2, &error1Test2);
printf("%e\n", resultTest2);
gsl_integration_workspace_free ( w);
return 0;
}
you get the answer:
-5.065288e-03.
I am curious... This is how I define the function in Mathematica
So comparing the answers:
GSL -5.065288e-03
Mathematica -0.005065287633739702
I have recently implemented a library of CORDIC functions to reduce the required computational power (my project is based on a PowerPC and is extremely strict in its execution time specifications). The language is ANSI-C.
The other functions (sin/cos/atan) work within accuracy limits both in 32 and in 64 bit implementations.
Unfortunately, the asin() function fails systematically for certain inputs.
For testing purposes I have implemented an .h file to be used in a simulink S-Function. (This is only for my convenience, you can compile the following as a standalone .exe with minimal changes)
Note: I have forced 32 iterations because I am working in 32 bit precision and the maximum possible accuracy is required.
Cordic.h:
#include <stdio.h>
#include <stdlib.h>
#define FLOAT32 float
#define INT32 signed long int
#define BIT_XOR ^
#define CORDIC_1K_32 0x26DD3B6A
#define MUL_32 1073741824.0F /*needed to scale float -> int*/
#define INV_MUL_32 9.313225746E-10F /*needed to scale int -> float*/
INT32 CORDIC_CTAB_32 [] = {0x3243f6a8, 0x1dac6705, 0x0fadbafc, 0x07f56ea6, 0x03feab76, 0x01ffd55b, 0x00fffaaa, 0x007fff55,
0x003fffea, 0x001ffffd, 0x000fffff, 0x0007ffff, 0x0003ffff, 0x0001ffff, 0x0000ffff, 0x00007fff,
0x00003fff, 0x00001fff, 0x00000fff, 0x000007ff, 0x000003ff, 0x000001ff, 0x000000ff, 0x0000007f,
0x0000003f, 0x0000001f, 0x0000000f, 0x00000008, 0x00000004, 0x00000002, 0x00000001, 0x00000000};
/* CORDIC Arcsine Core: vectoring mode */
INT32 CORDIC_asin(INT32 arc_in)
{
INT32 k;
INT32 d;
INT32 tx;
INT32 ty;
INT32 x;
INT32 y;
INT32 z;
x=CORDIC_1K_32;
y=0;
z=0;
for (k=0; k<32; ++k)
{
d = (arc_in - y)>>(31);
tx = x - (((y>>k) BIT_XOR d) - d);
ty = y + (((x>>k) BIT_XOR d) - d);
z += ((CORDIC_CTAB_32[k] BIT_XOR d) - d);
x = tx;
y = ty;
}
return z;
}
/* Wrapper function for scaling in-out of cordic core*/
FLOAT32 asin_wrap(FLOAT32 arc)
{
return ((FLOAT32)(CORDIC_asin((INT32)(arc*MUL_32))*INV_MUL_32));
}
This can be called in a manner similar to:
#include "Cordic.h"
#include "math.h"
void main()
{
y1 = asin_wrap(value_32); /*my implementation*/
y2 = asinf(value_32); /*standard math.h for comparison*/
}
The results are as shown:
Top left shows the [-1;1] input over 2000 steps (0.001 increments), bottom left the output of my function, bottom right the standard output and top right the difference of the two outputs.
It is immediate to see that the error is not within 32 bit accuracy.
I have analysed the steps performed (and the intermediate results) by my code and it seems to me that at a certain point the value of y is "close enough" to the initial value of arc_in and what could be related to a bit-shift causes the solution to diverge.
My questions:
I am at a loss, is this error inherent in the CORDIC implementation or have I made a mistake in the implementation? I was expecting the decrease of accuracy near the extremes, but those spikes in the middle are quite unexpected. (the most notable ones are just beyond +/- 0.6, but even removed these there are more at smaller values, albeit not as pronounced)
If it is something part of the CORDIC implementation, are there known workarounds?
EDIT:
Since some comment mention it, yes, I tested the definition of INT32, even writing
#define INT32 int32_T
does not change the results by the slightest amount.
The computation time on the target hardware has been measured by hundreds of repetitions of block of 10.000 iterations of the function with random input in the validity range. The observed mean results (for one call of the function) are as follows:
math.h asinf() 100.00 microseconds
CORDIC asin() 5.15 microseconds
(apparently the previous test had been faulty, a new cross-test has obtained no better than an average of 100 microseconds across the validity range)
I apparently found a better implementation. It can be downloaded in matlab version here and in C here. I will analyse more its inner workings and report later.
To review a few things mentioned in the comments:
The given code outputs values identical to another CORDIC implementation. This includes the stated inaccuracies.
The largest error is as you approach arcsin(1).
The second largest error is that the values of arcsin(0.60726) to arcsin(0.68514) all return 0.754805.
There are some vague references to inaccuracies in the CORDIC method for some functions including arcsin. The given solution is to perform "double-iterations" although I have been unable to get this to work (all values give a large amount of error).
The alternate CORDIC implemention has a comment /* |a| < 0.98 */ in the arcsin() implementation which would seem to reinforce that there is known inaccuracies close to 1.
As a rough comparison of a few different methods consider the following results (all tests performed on a desktop, Windows7 computer using MSVC++ 2010, benchmarks timed using 10M iterations over the arcsin() range 0-1):
Question CORDIC Code: 1050 ms, 0.008 avg error, 0.173 max error
Alternate CORDIC Code (ref): 2600 ms, 0.008 avg error, 0.173 max error
atan() CORDIC Code: 2900 ms, 0.21 avg error, 0.28 max error
CORDIC Using Double-Iterations: 4700 ms, 0.26 avg error, 0.917 max error (???)
Math Built-in asin(): 200 ms, 0 avg error, 0 max error
Rational Approximation (ref): 250 ms, 0.21 avg error, 0.26 max error
Linear Table Lookup (see below) 100 ms, 0.000001 avg error, 0.00003 max error
Taylor Series (7th power, ref): 300 ms, 0.01 avg error, 0.16 max error
These results are on a desktop so how relevant they would be for an embedded system is a good question. If in doubt, profiling/benchmarking on the relevant system would be advised. Most solutions tested don't have very good accuracy over the range (0-1) and all but one are actually slower than the built-in asin() function.
The linear table lookup code is posted below and is my usual method for any expensive mathematical function when speed is desired over accuracy. It simply uses a 1024 element table with linear interpolation. It seems to be both the fastest and most accurate of all methods tested, although the built-in asin() is not much slower really (test it!). It can easily be adjusted for more or less accuracy by changing the size of the table.
// Please test this code before using in anything important!
const size_t ASIN_TABLE_SIZE = 1024;
double asin_table[ASIN_TABLE_SIZE];
int init_asin_table (void)
{
for (size_t i = 0; i < ASIN_TABLE_SIZE; ++i)
{
float f = (float) i / ASIN_TABLE_SIZE;
asin_table[i] = asin(f);
}
return 0;
}
double asin_table (double a)
{
static int s_Init = init_asin_table(); // Call automatically the first time or call it manually
double sign = 1.0;
if (a < 0)
{
a = -a;
sign = -1.0;
}
if (a > 1) return 0;
double fi = a * ASIN_TABLE_SIZE;
double decimal = fi - (int)fi;
size_t i = fi;
if (i >= ASIN_TABLE_SIZE-1) return Sign * 3.14159265359/2;
return Sign * ((1.0 - decimal)*asin_table[i] + decimal*asin_table[i+1]);
}
The "single rotate" arcsine goes badly wrong when the argument is just greater than the initial value of 'x', where that is the magical scaling factor -- 1/An ~= 0.607252935 ~= 0x26DD3B6A.
This is because, for all arguments > 0, the first step always has y = 0 < arg, so d = +1, which sets y = 1/An, and leaves x = 1/An. Looking at the second step:
if arg <= 1/An, then d = -1, and the steps which follow converge to a good answer
if arg > 1/An, then d = +1, and this step moves further away from the right answer, and for a range of values a little bigger than 1/An, the subsequent steps all have d = -1, but are unable to correct the result :-(
I found:
arg = 0.607 (ie 0x26D91687), relative error 7.139E-09 -- OK
arg = 0.608 (ie 0x26E978D5), relative error 1.550E-01 -- APALLING !!
arg = 0.685 (ie 0x2BD70A3D), relative error 2.667E-04 -- BAD !!
arg = 0.686 (ie 0x2BE76C8B), relative error 1.232E-09 -- OK, again
The descriptions of the method warn about abs(arg) >= 0.98 (or so), and I found that somewhere after 0.986 the process fails to converge and the relative error jumps to ~5E-02 and hits 1E-01 (!!) at arg=1 :-(
As you did, I also found that for 0.303 < arg < 0.313 the relative error jumps to ~3E-02, and reduces slowly until things return to normal. (In this case step 2 overshoots so far that the remaining steps cannot correct it.)
So... the single rotate CORDIC for arcsine looks rubbish to me :-(
Added later... when I looked even closer at the single rotate CORDIC, I found many more small regions where the relative error is BAD...
...so I would not touch this as a method at all... it's not just rubbish, it's useless.
BTW: I thoroughly recommend "Software Manual for the Elementary Functions", William Cody and William Waite, Prentice-Hall, 1980. The methods for calculating the functions are not so interesting any more (but there is a thorough, practical discussion of the relevant range-reductions required). However, for each function they give a good test procedure.
The additional source I linked at the end of the question apparently contains the solution.
The proposed code can be reduced to the following:
#define M_PI_2_32 1.57079632F
#define SQRT2_2 7.071067811865476e-001F /* sin(45°) = cos(45°) = sqrt(2)/2 */
FLOAT32 angles[] = {
7.8539816339744830962E-01F, 4.6364760900080611621E-01F, 2.4497866312686415417E-01F, 1.2435499454676143503E-01F,
6.2418809995957348474E-02F, 3.1239833430268276254E-02F, 1.5623728620476830803E-02F, 7.8123410601011112965E-03F,
3.9062301319669718276E-03F, 1.9531225164788186851E-03F, 9.7656218955931943040E-04F, 4.8828121119489827547E-04F,
2.4414062014936176402E-04F, 1.2207031189367020424E-04F, 6.1035156174208775022E-05F, 3.0517578115526096862E-05F,
1.5258789061315762107E-05F, 7.6293945311019702634E-06F, 3.8146972656064962829E-06F, 1.9073486328101870354E-06F,
9.5367431640596087942E-07F, 4.7683715820308885993E-07F, 2.3841857910155798249E-07F, 1.1920928955078068531E-07F,
5.9604644775390554414E-08F, 2.9802322387695303677E-08F, 1.4901161193847655147E-08F, 7.4505805969238279871E-09F,
3.7252902984619140453E-09F, 1.8626451492309570291E-09F, 9.3132257461547851536E-10F, 4.6566128730773925778E-10F};
FLOAT32 arcsin_cordic(FLOAT32 t)
{
INT32 i;
INT32 j;
INT32 flip;
FLOAT32 poweroftwo;
FLOAT32 sigma;
FLOAT32 sign_or;
FLOAT32 theta;
FLOAT32 x1;
FLOAT32 x2;
FLOAT32 y1;
FLOAT32 y2;
flip = 0;
theta = 0.0F;
x1 = 1.0F;
y1 = 0.0F;
poweroftwo = 1.0F;
/* If the angle is small, use the small angle approximation */
if ((t >= -0.002F) && (t <= 0.002F))
{
return t;
}
if (t >= 0.0F)
{
sign_or = 1.0F;
}
else
{
sign_or = -1.0F;
}
/* The inv_sqrt() is the famous Fast Inverse Square Root from the Quake 3 engine
here used with 3 (!!) Newton iterations */
if ((t >= SQRT2_2) || (t <= -SQRT2_2))
{
t = 1.0F/inv_sqrt(1-t*t);
flip = 1;
}
if (t>=0.0F)
{
sign_or = 1.0F;
}
else
{
sign_or = -1.0F;
}
for ( j = 0; j < 32; j++ )
{
if (y1 > t)
{
sigma = -1.0F;
}
else
{
sigma = 1.0F;
}
/* Here a double iteration is done */
x2 = x1 - (sigma * poweroftwo * y1);
y2 = (sigma * poweroftwo * x1) + y1;
x1 = x2 - (sigma * poweroftwo * y2);
y1 = (sigma * poweroftwo * x2) + y2;
theta += 2.0F * sigma * angles[j];
t *= (1.0F + poweroftwo * poweroftwo);
poweroftwo *= 0.5F;
}
/* Remove bias */
theta -= sign_or*4.85E-8F;
if (flip)
{
theta = sign_or*(M_PI_2_32-theta);
}
return theta;
}
The following is to be noted:
It is a "Double-Iteration" CORDIC implementation.
The angles table thus differs in construction from the old table.
And the computation is done in floating point notation, this will cause a major increase in computation time on the target hardware.
A small bias is present in the output, removed via the theta -= sign_or*4.85E-8F; passage.
The following picture shows the absolute (left) and relative errors (right) of the old implementation (top) vs the implementation contained in this answer (bottom).
The relative error is obtained only by dividing the CORDIC output with the output of the built-in math.h implementation. It is plotted around 1 and not 0 for this reason.
The peak relative error (when not dividing by zero) is 1.0728836e-006.
The average relative error is 2.0253509e-007 (almost in accordance to 32 bit accuracy).
For convergence of iterative process it is necessary that any "wrong" i-th
iteration could be "corrected" in the subsequent (i+1)-th, (i+2)-th, (i+3)-th,
etc. etc. iterations. Or, in other words, at least a half of the "wrong"
i-th iteration could be corrected in the next (i+1)-th iteration.
For atan(1/2^i) this condition is satisfied, i.e.:
atan(1/2^(i+1)) > 1/2*atan(1/2^i)
Read more at
http://cordic-bibliography.blogspot.com/p/double-iterations-in-cordic.html
and:
http://baykov.de/CORDIC1972.htm
(note I'm the author of those pages)
I have implemented the newton raphson algorithm for finding roots in C. I want to print out the most accurate approximation of the root as possible without going into nan land. My strategy for this is while (!(isnan(x0)) { dostuff(); } But this continues to print out the result multiple times. Ideally I would like to setup a range so that the difference between each computed x intercept approximation would stop when the previous - current is less than some range .000001 in my case. I have a possible implementation below. When I input 2.999 It takes only one step, but when I input 3.0 it takes 20 steps, this seems incorrect to me.
(When I input 3.0)
λ newton_raphson 3
2.500000
2.250000
2.125000
2.062500
2.031250
2.015625
2.007812
2.003906
2.001953
2.000977
2.000488
2.000244
2.000122
2.000061
2.000031
2.000015
2.000008
2.000004
2.000002
2.000001
Took 20 operation(s) to approximate a proper root of 2.000002
within a range of 0.000001
(When I input 2.999)
λ newton_raphson 2.999
Took 1 operation(s) to approximate a proper root of 2.000000
within a range of 0.000001
My code:
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#define RANGE 0.000001
double absolute(double number)
{
if (number < 0) return -number;
else return number;
}
double newton_raphson(double (*func)(double), double (*derivative)(double), double x0){
int count;
double temp;
count = 0;
while (!isnan(x0)) {
temp = x0;
x0 = (x0 - (func(x0)/derivative(x0)));
if (!isnan(x0))
printf("%f\n", x0);
count++;
if (absolute(temp - x0) < RANGE && count > 1)
break;
}
printf("Took %d operation(s) to approximate a proper root of %6f\nwithin a range of 0.000001\n", count, temp);
return x0;
}
/* (x-2)^2 */
double func(double x){ return pow(x-2.0, 2.0); }
/* 2x-4 */
double derivative(double x){ return 2.0*x - 4.0; }
int main(int argc, char ** argv)
{
double x0 = atof(argv[1]);
double (*funcPtr)(double) = &func; /* this is a user defined function */
double (*derivativePtr)(double) = &derivative; /* this is the derivative of that function */
double result = newton_raphson(funcPtr, derivativePtr, x0);
return 0;
}
You call trunc(x0) which turns 2.999 into 2.0. Naturally, when you start at the right answer, no iteration is needed! In other words, although you intended to use 2.999 as your starting value, you actually used 2.0.
Simply remove the call to trunc().
Worth pointing out: taking 20 steps to converge is also anomalous; because you are converging to a multiple root, the convergence is only linear instead of the typical quadratic convergence that Newton-Raphson gives in the general case. You can see this in the fact that your error is halved with each iteration (with the usual quadratic convergence, you would get twice as many correct digits on each iteration, and converge much, much faster).