I am writing a raycaster, and I am trying to speed it up by making lookup tables for my most commonly called trig functions, namely sin, cos, and tan. This first snippet is my table lookup code. In order to avoid making a lookup table for each, I am just making one sin table, and defining cos(x) as sin(half_pi - x) and tan(x) as sin(x) / cos(x).
#include <math.h>
#include <time.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h> // for int64_t used by millis() and benchmark()
const float two_pi = M_PI * 2, half_pi = M_PI / 2;
typedef struct {
int fn_type, num_vals;
double* vals, step;
} TrigTable;
static TrigTable sin_table;
TrigTable init_trig_table(const int fn_type, const int num_vals) {
double (*trig_fn) (double), period;
switch (fn_type) {
case 0: trig_fn = sin, period = two_pi; break;
case 1: trig_fn = cos, period = two_pi; break;
case 2: trig_fn = tan, period = M_PI; break;
}
TrigTable table = {fn_type, num_vals,
calloc(num_vals, sizeof(double)), period / num_vals};
for (double x = 0; x < period; x += table.step)
table.vals[(int) round(x / table.step)] = trig_fn(x);
return table;
}
double _lookup(const TrigTable table, const double x) {
return table.vals[(int) round(x / table.step)];
}
double lookup_sin(double x) {
const double orig_x = x;
if (x < 0) x = -x;
if (x > two_pi) x = fmod(x, two_pi);
const double result = _lookup(sin_table, x);
return orig_x < 0 ? -result : result;
}
double lookup_cos(double x) {
return lookup_sin(half_pi - x);
}
double lookup_tan(double x) {
return lookup_sin(x) / lookup_cos(x);
}
Here is how I went about benchmarking my code: my function for the current time in milliseconds is from here. The problem arises here: when timing my lookup_sin vs math.h's sin, my variant takes around three times longer: Table time vs default: 328 ms, 108 ms.
Here is the timing for cos:
Table time vs default: 332 ms, 109 ms
Here is the timing for tan:
Table time vs default: 715 ms, 153 ms
What makes my code so much slower? I would think that precomputing sin values would greatly accelerate my code. Perhaps it's the fmod in the lookup_sin function? Please provide whatever insight that you have. I am compiling with clang with no optimizations enabled, so that the calls to each trig function are not removed (I am ignoring the return value).
const int64_t millis() {
struct timespec now;
timespec_get(&now, TIME_UTC);
return ((int64_t) now.tv_sec) * 1000 + ((int64_t) now.tv_nsec) / 1000000;
}
const int64_t benchmark(double (*trig_fn) (double)) {
const int64_t before = millis();
for (double i = 0; i < 10000; i += 0.001)
trig_fn(i);
return millis() - before;
}
int main() {
sin_table = init_trig_table(0, 15000);
const int64_t table_time = benchmark(lookup_sin), default_time = benchmark(sin);
printf("Table time vs default: %lld ms, %lld ms\n", table_time, default_time);
free(sin_table.vals);
}
Reduce the floating point math.
OP's code is doing excessive FP math in what should be a scale and lookup.
Scale the radians per a pre-computed factor into an index.
The number of entries in the lookup table should be an unsigned power-of-2 so the mod is a simple &.
At first, let us simplify and have [0 ... 2*pi) map to indexes [0 ... number_of_entries) to demo the idea.
double lookup_sin_alt(double x) {
long scaled_x = lround(x * scale_factor); // This should be the _only_ line of FP code
// All following code is integer code.
scaled_x += number_of_entries/4 ; // If we are doing cosine
unsigned index = scaled_x & (number_of_entries - 1); // This & replaces fmod
double result = table.vals[index];
return result;
}
Later we can use a quarter size table [0 ... pi/2] and steer selection/reconstruction with integer operations.
Given OP's low precision requirements, consider using float instead of double throughout including float functions like lroundf().
The teacher asks to remove the pi subtraction cycle in the main function. I don’t know how to write the program so that the correct results will come out for any values.
#include <stdio.h>
#include <math.h>
double sinus(double x);
int main(void) {
double a, x;
scanf("%le", & x);
a = x;
while (fabs(x) > 2 * (M_PI)) {
x = fabs(x) - 2 * (M_PI);
}
if (a > 0)
a = sinus(x);
else a = (-1) * sinus(x);
printf("%le", (double) a);
return 0;
}
double sinus(double x) {
double sum = 0, h, eps = 1.e-16;
int i = 2;
h = x;
do {
sum += h;
h *= -((x * x) / (i * (i + 1)));
i += 2;
}
while (fabs(h) > eps);
return sum;
return 0;
}
#include <stdio.h>
#include <math.h>
double sinus(double x);
int main(void)
{
double a,x;
scanf("%le",&x);
a=x;
x=fmod(fabs(x),2*(M_PI));
if(a>0)
a=sinus(x);
else a=(-1)*sinus(x);
printf("%le",(double)a);
return 0;}
double sinus(double x)
{
double sum=0, h, eps=1.e-16; int i=2;
h=x;
do{
sum+=h;
h*=-((x*x)/(i*(i+1)));
i+=2;}
while( fabs(h)>eps );
return sum;
return 0;
}
… how to write the program so that the correct results will come out for any values.
OP's loop is slow with large x and an infinite loop with very large x:
while (fabs(x) > 2 * (M_PI)) {
x = fabs(x) - 2 * (M_PI);
}
A simple, though not high-quality, solution is to use fmod() in the function itself. @Damien:
#ifndef M_PI
#define M_PI 3.1415926535897932384626433832795
#endif
double sinus(double x) {
x = fmod(x, 2*M_PI); // Reduce to [-2*M_PI ... 2*M_PI]
...
Although the function fmod() is not expected to inject any error, the problem is that M_PI (a rational number) is an approximation of π (an irrational number). Using that approximation injects error, especially for x near multiples of π. This is likely OK for modest-quality code.
Good range reduction is a problem as challenging as the trigonometric functions themselves.
See K.C. Ng's "ARGUMENT REDUCTION FOR HUGE ARGUMENTS: Good to the Last Bit" .
OP's sinus() should use additional range reduction and trigonometric properties to get x in range [-M_PI/4 ... M_PI/4] (example) before attempting the power series solution. Otherwise, convergence is slow and errors accumulate.
I am trying to slowly decelerate based on a percentage.
Basically: if percentage is 0 the speed should be speed_max, if the percentage hits 85 the speed should be speed_min, continuing with speed_min until the percentage hits 100%. At percentages between 0% and 85%, the speed should be calculated with the percentage.
I started writing the code already, though I am not sure how to continue:
// Target
int degrees = 90;
// Making sure we're at 0
resetGyro(0);
int speed_max = 450;
int speed_min = 150;
float currentDeg = 0;
float percentage = 0;
while(percentage < 100)
{
//??
getGyroDeg(&currentDeg);
percentage = (degrees/100)*currentDeg;
}
killMotors(1);
Someone in the comments asked why I am doing this.
Unfortunately, I am working with very limited hardware and a pretty bad gyroscope, all while trying to guarantee +- 1 degree precision.
To do this, I am starting at speed_max, slowly decreasing to speed_min (this is to have better control over the motors) when nearing the target value (90).
Why does it stop decelerating at 85%? This is to really be precise and hit the target value successfully.
Assuming speed is calculated linearly from percentages 0 to 85 (and stays at speed_min when percentage is greater than 85), this is your formula for calculating speed:
if (percentage >= 85)
{
speed = speed_min;
}
else
{
speed = speed_max - (((speed_max - speed_min)*percentage)/85);
}
Linear interpolation is fairly straightforward.
At percentage 0, the speed should be speed_max.
At percentage 85, the speed should be speed_min.
At percentage values greater than 85, the speed should still be speed_min.
Between 0 and 85, the speed should be linearly interpolated between speed_max and speed_min, so percentage is an 'amount of drop from maximum speed'.
Assuming percentage is of type float:
float speed_from_percentage(float percent)
{
if (percent <= 0.0)
return speed_max;
if (percent >= 85.0)
return speed_min;
return speed_min + (speed_max - speed_min) * (85.0 - percent) / 85.0;
}
You can also replace the final return with the equivalent:
return speed_max - (speed_max - speed_min) * percent / 85.0;
If you're truly pedantic, all the constants should be suffixed with F to indicate float and hence use float arithmetic instead of double arithmetic. And hence you should probably also use float for speed_min and speed_max. If everything is meant to be integer arithmetic, you can change float to int and drop the .0 from the expressions.
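Putting the pieces above together, a self-contained float version might look like this. It is a sketch: passing speed_max and speed_min as parameters and the helper percent_of_target are my assumptions, not part of the question's code.

```c
/* Hypothetical helper: how far along we are, in percent of the target angle. */
float percent_of_target(float current_deg, float target_deg) {
    return 100.0f * current_deg / target_deg;
}

/* Linear drop from speed_max at 0% to speed_min at 85%, then flat. */
float speed_from_percentage(float percent, float speed_max, float speed_min) {
    if (percent <= 0.0f)
        return speed_max;
    if (percent >= 85.0f)
        return speed_min;
    return speed_max - (speed_max - speed_min) * percent / 85.0f;
}
```

Inside the loop one would then call something like speed_from_percentage(percent_of_target(currentDeg, degrees), speed_max, speed_min) and pass the result to the motor routine, whatever that is called on the actual hardware.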
Assuming getGyroDeg is input from the controller, what you are describing is a proportional control. A constant response curve, ie, 0 to 85 has an output of 450 to 150, and 150 after that, is an ad-hoc approach, based on experience. However, a properly initialised PID controller generally attains a faster time to set-point and greater stability.
#include <stdio.h>
#include <time.h>
#include <assert.h>
#include <stdlib.h>
static float sim_current = 0.0f;
static float sim_dt = 0.01f;
static float sim_speed = 0.0f /* 150.0f */;
static void getGyroDeg(float *const current) {
assert(current);
sim_current += sim_speed * sim_dt;
/* Simulate measurement error. */
*current = sim_current + 3.0 * ((2.0 * rand() / RAND_MAX) - 1.0);
}
static void setGyroSpeed(const float speed) {
assert(speed >= /*150.0f*/-450.0f && speed <= 450.0f);
sim_speed = speed;
}
int main(void) {
/* https://en.wikipedia.org/wiki/PID_controller
u(t) = K_p e(t) + K_i \int_0^t e(\theta)d\theta + K_d de(t)/dt */
const float setpoint = 90.0f;
const float max = 450.0f;
const float min = -450.0f/* 150.0f */;
/* Random value; actually get this number. */
const float dt = 1.0f;
/* Tune these. */
const float kp = 30.0f, ki = 4.0f, kd = 2.0f;
float current, last = 0.0f, integral = 0.0f;
float t = 0.0f;
float e, p, i, d, pid;
size_t count;
for(count = 0; count < 40; count++) {
getGyroDeg(&current);
e = setpoint - current;
p = kp * e;
i = ki * integral * dt;
d = kd * (e - last) / dt;
last = e;
pid = p + i + d;
if(pid > max) {
pid = max;
} else if(pid < min) {
pid = min;
} else {
integral += e;
}
setGyroSpeed(pid);
printf("%f\t%f\t%f\n", t, sim_current, pid);
t += dt;
}
return EXIT_SUCCESS;
}
Here, instead of the speed linearly decreasing, it calculates the speed in a control loop. However, if the minimum is 150, then it's not going to achieve greater stability; if you go over 90, then you have no way of getting back.
If the controls are [-450, 450], it goes through zero and it is much nicer; I think this might be what you are looking for. It actively corrects for errors.
The following complete program compares the speed of the fast inverse square root with 1/sqrt(). According to Wikipedia, "The algorithm was approximately four times faster than computing the square root with another method and calculating the reciprocal via floating point division."
But here is why I am asking: in my test it is slower than 1/sqrt(). Is something wrong in my code?
#include <stdio.h>
#include <time.h>
#include <math.h>
float FastInvSqrt (float number);
int
main ()
{
float x = 1.0e+100;
int N = 100000000;
int i = 0;
clock_t start2 = clock ();
do
{
float z = 1.0 / sqrt (x);
i++;
}
while (i < N);
clock_t end2 = clock ();
double time2 = (end2 - start2) / (double) CLOCKS_PER_SEC;
printf ("1/sqrt() spends %13f sec.\n\n", time2);
i = 0;
clock_t start1 = clock ();
do
{
float y = FastInvSqrt (x);
i++;
}
while (i < N);
clock_t end1 = clock ();
double time1 = (end1 - start1) / (double) CLOCKS_PER_SEC;
printf ("FastInvSqrt() spends %f sec.\n\n", time1);
printf ("fast inverse square root is faster %f times than 1/sqrt().\n", time2/time1);
return 0;
}
float
FastInvSqrt (float x)
{
float xhalf = 0.5F * x;
int i = *(int *) &x; // store floating-point bits in integer
i = 0x5f3759df - (i >> 1); // initial guess for Newton's method
x = *(float *) &i; // convert new bits into float
x = x * (1.5 - xhalf * x * x); // One round of Newton's method
//x = x * (1.5 - xhalf * x * x); // One round of Newton's method
//x = x * (1.5 - xhalf * x * x); // One round of Newton's method
//x = x * (1.5 - xhalf * x * x); // One round of Newton's method
return x;
}
The result is as follows:
1/sqrt() spends 0.850000 sec.
FastInvSqrt() spends 0.960000 sec.
fast inverse square root is faster 0.885417 times than 1/sqrt().
A function that reduces the domain in which it computes with precision will have less computational complexity (meaning that it can be computed faster). This can be thought of as optimizing the computation of a function's shape for a subset of its definition, or like search algorithms which each are best for a particular kind of input (No Free Lunch theorem).
As such, using this function for inputs outside the interval [0, 1] (which I suppose it was optimized / designed for) means using it in the subset of inputs where its complexity is worse (higher) than other possibly specialized variants of functions that compute square roots.
The sqrt() function you are using from the library was itself (likely) also optimized, as it has pre-computed values in a sort of LUT (which act as initial guesses for further approximations). Such a more "general function" (one that covers more of the domain and tries to make it efficient by precomputation, by eliminating redundant computation, which is limited, or by maximizing data reuse at run-time) has its complexity limitations: the more choices there are between which precomputation to use for an interval, the more decision overhead there is. So knowing at compile-time that all your inputs to sqrt are in the interval [0, 1] would help reduce the run-time decision overhead, as you would know ahead of time which specialized approximation function to use (or you could generate specialized functions for each interval of interest at compile-time; see meta-programming for this).
I corrected my code as follows:
1. use random numbers instead of a fixed number.
2. measure the time inside the while loop and sum it up.
#include <stdio.h>
#include <time.h>
#include <math.h>
#include <stdlib.h>
float FastInvSqrt (float number);
int
main ()
{
float x=0;
time_t t;
srand((unsigned) time(&t));
int N = 1000000;
int i = 0;
double sum_time2=0.0;
do
{
x=(float)(rand() % 10000)*0.22158;
clock_t start2 = clock ();
float z = 1.0 / sqrt (x);
clock_t end2 = clock ();
sum_time2=sum_time2+(end2-start2);
i++;
}
while (i < N);
printf ("1/sqrt() spends %13f sec.\n\n", sum_time2/(double)CLOCKS_PER_SEC);
double sum_time1=0.0;
i = 0;
do
{
x=(float)(rand() % 10000)*0.22158;
clock_t start1 = clock ();
float y = FastInvSqrt (x);
clock_t end1 = clock ();
sum_time1=sum_time1+(end1-start1);
i++;
}
while (i < N);
printf ("FastInvSqrt() spends %f sec.\n\n", sum_time1/(double)CLOCKS_PER_SEC);
printf ("fast inverse square root is faster %f times than 1/sqrt().\n", sum_time2/sum_time1);
return 0;
}
float
FastInvSqrt (float x)
{
float xhalf = 0.5F * x;
int i = *(int *) &x; // store floating-point bits in integer
i = 0x5f3759df - (i >> 1); // initial guess for Newton's method
x = *(float *) &i; // convert new bits into float
x = x * (1.5 - xhalf * x * x); // One round of Newton's method
//x = x * (1.5 - xhalf * x * x); // One round of Newton's method
//x = x * (1.5 - xhalf * x * x); // One round of Newton's method
//x = x * (1.5 - xhalf * x * x); // One round of Newton's method
return x;
}
But the fast inverse square root is still slower than 1/sqrt().
1/sqrt() spends 0.530000 sec.
FastInvSqrt() spends 0.540000 sec.
fast inverse square root is faster 0.981481 times than 1/sqrt().
I have a problem that, after much head scratching, I think is to do with very small numbers in a long-double.
I am trying to implement Planck's law equation to generate a normalised blackbody curve at 1nm intervals between a given wavelength range and for a given temperature. Ultimately this will be a function accepting inputs, for now it is main() with the variables fixed and outputting by printf().
I see examples in matlab and python, and they are implementing the same equation as me in a similar loop with no trouble at all.
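This is the equation (Planck's law, spectral radiance as a function of wavelength λ and temperature T):
B(λ, T) = (2hc² / λ⁵) · 1 / (exp(hc / (λkT)) − 1)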
My code generates an incorrect blackbody curve:
I have tested key parts of the code independently. After trying to test the equation by breaking it into blocks in excel I noticed that it does result in very small numbers and I wonder if my implementation of large numbers could be causing the issue? Does anyone have any insight into using C to implement equations? This a new area to me and I have found the maths much harder to implement and debug than normal code.
#include <stdio.h>
#include <math.h>
#include <stdlib.h>
//global variables
const double H = 6.626070040e-34; //Planck's constant (Joule-seconds)
const double C = 299800000; //Speed of light in vacume (meters per second)
const double K = 1.3806488e-23; //Boltzmann's constant (Joules per Kelvin)
const double nm_to_m = 1e-6; //conversion between nm and m
const int interval = 1; //wavelength interval to caculate at (nm)
//typedef structure to hold results
typedef struct {
int *wavelength;
long double *radiance;
long double *normalised;
} results;
int main() {
int min = 100 , max = 3000; //wavelength bounds to caculate between, later to be swaped to function inputs
double temprature = 200; //temprature in kelvin, later to be swaped to function input
double new_valu, old_valu = 0;
static results SPD_data, *SPD; //setup a static results structure and a pointer to point to it
SPD = &SPD_data;
SPD->wavelength = malloc(sizeof(int) * (max - min)); //allocate memory based on wavelength bounds
SPD->radiance = malloc(sizeof(long double) * (max - min));
SPD->normalised = malloc(sizeof(long double) * (max - min));
for (int i = 0; i <= (max - min); i++) {
//Fill wavelength vector
SPD->wavelength[i] = min + (interval * i);
//Computes radiance for every wavelength of blackbody of given temprature
SPD->radiance[i] = ((2 * H * pow(C, 2)) / (pow((SPD->wavelength[i] / nm_to_m), 5))) * (1 / (exp((H * C) / ((SPD->wavelength[i] / nm_to_m) * K * temprature))-1));
//Copy SPD->radiance to SPD->normalised
SPD->normalised[i] = SPD->radiance[i];
//Find largest value
if (i <= 0) {
old_valu = SPD->normalised[0];
} else if (i > 0){
new_valu = SPD->normalised[i];
if (new_valu > old_valu) {
old_valu = new_valu;
}
}
}
//for debug perposes
printf("wavelength(nm) radiance(Watts per steradian per meter squared) normalised radiance\n");
for (int i = 0; i <= (max - min); i++) {
//Normalise SPD
SPD->normalised[i] = SPD->normalised[i] / old_valu;
//for debug perposes
printf("%d %Le %Lf\n", SPD->wavelength[i], SPD->radiance[i], SPD->normalised[i]);
}
return 0; //later to be swaped to 'return SPD';
}
/*********************UPDATE Friday 24th Mar 2017 23:42*************************/
Thank you for the suggestions so far, lots of useful pointers especially understanding the way numbers are stored in C (IEEE 754) but I don't think that is the issue here as it only applies to significant digits. I implemented most of the suggestions but still no progress on the problem. I suspect Alexander in the comments is probably right, changing the units and order of operations is likely what I need to do to make the equation work like the matlab or python examples, but my knowledge of maths is not good enough to do this. I broke the equation down into chunks to take a closer look at what it was doing.
//global variables
const double H = 6.6260700e-34; //Planck's constant (Joule-seconds) 6.626070040e-34
const double C = 299792458; //Speed of light in vacume (meters per second)
const double K = 1.3806488e-23; //Boltzmann's constant (Joules per Kelvin) 1.3806488e-23
const double nm_to_m = 1e-9; //conversion between nm and m
const int interval = 1; //wavelength interval to caculate at (nm)
const int min = 100, max = 3000; //max and min wavelengths to caculate between (nm)
const double temprature = 200; //temprature (K)
//typedef structure to hold results
typedef struct {
int *wavelength;
long double *radiance;
long double *normalised;
} results;
//main program
int main()
{
//setup a static results structure and a pointer to point to it
static results SPD_data, *SPD;
SPD = &SPD_data;
//allocate memory based on wavelength bounds
SPD->wavelength = malloc(sizeof(int) * (max - min));
SPD->radiance = malloc(sizeof(long double) * (max - min));
SPD->normalised = malloc(sizeof(long double) * (max - min));
//break equasion into visible parts for debuging
long double aa, bb, cc, dd, ee, ff, gg, hh, ii, jj, kk, ll, mm, nn, oo;
for (int i = 0; i < (max - min); i++) {
//Computes radiance at every wavelength interval for blackbody of given temprature
SPD->wavelength[i] = min + (interval * i);
aa = 2 * H;
bb = pow(C, 2);
cc = aa * bb;
dd = pow((SPD->wavelength[i] / nm_to_m), 5);
ee = cc / dd;
ff = 1;
gg = H * C;
hh = SPD->wavelength[i] / nm_to_m;
ii = K * temprature;
jj = hh * ii;
kk = gg / jj;
ll = exp(kk);
mm = ll - 1;
nn = ff / mm;
oo = ee * nn;
SPD->radiance[i] = oo;
}
//for debug perposes
printf("wavelength(nm) | radiance(Watts per steradian per meter squared)\n");
for (int i = 0; i < (max - min); i++) {
printf("%d %Le\n", SPD->wavelength[i], SPD->radiance[i]);
}
return 0;
}
Equation variable values during runtime in xcode:
I notice a couple of things that are wrong and/or suspicious about the current state of your program:
You have defined nm_to_m as 10^-9, yet you divide by it. If your wavelength is measured in nanometers, you should multiply it by 10^-9 to get it in meters. To wit, if hh is supposed to be your wavelength in meters, it is on the order of several light-hours.
The same is obviously true for dd as well.
mm, being the exponential expression minus 1, is zero, which gives you infinity in the results deriving from it. This is apparently because you don't have enough digits in a double to represent the significant part of the exponential. Instead of using exp(...) - 1 here, try using the expm1() function instead, which implements a well-defined algorithm for calculating exponentials minus 1 without cancellation errors.
Since interval is 1, it doesn't currently matter, but you can probably see that your results wouldn't match the meaning of the code if you set interval to something else.
Unless you plan to change something about this in the future, there shouldn't be a need for this program to "save" the values of all calculations. You could just print them out as you run them.
On the other hand, you don't seem to be in any danger of underflow or overflow. The largest and smallest numbers you use don't seem to be far from 10^±60, which is well within what ordinary doubles can deal with, let alone long doubles. That being said, it might not hurt to use more normalized units, but at the magnitudes you currently display, I wouldn't worry about it.
Thanks for all the pointers in the comments. For anyone else running into a similar problem with implementing equations in C, I had a few silly errors in the code:
writing a 6 not a 9
dividing when I should be multiplying
an off by one error with the size of my array vs the iterations of for() loop
200 when I meant 2000 in the temperature variable
As a result of the last one particularly I was not getting the results I expected (my wavelength range was not right for plotting the temperature I was calculating) and this was leading me to the assumption that something was wrong in the implementation of the equation, specifically I was thinking about big/small numbers in C because I did not understand them. This was not the case.
In summary, I should have made sure I knew exactly what my equation should be outputting for given test conditions before implementing it in code. I will work on getting more comfortable with maths, particularly algebra and dimensional analysis.
Below is the working code, implemented as a function, feel free to use it for anything but obviously no warranty of any kind etc.
blackbody.c
//
// Computes radiance for every wavelength of blackbody of given temperature
//
// INPUTS: int min wavelength to begin calculation from (nm), int max wavelength to end calculation at (nm), int temperature (kelvin)
// OUTPUTS: pointer to structure containing:
// - spectral radiance (Watts per steradian per meter squared per wavelength at 1nm intervals)
// - normalised radiance
//
//include & define
#include "blackbody.h"
//global variables
const double H = 6.626070040e-34; //Planck's constant (Joule-seconds) 6.626070040e-34
const double C = 299792458; //Speed of light in vacuum (meters per second)
const double K = 1.3806488e-23; //Boltzmann's constant (Joules per Kelvin) 1.3806488e-23
const double nm_to_m = 1e-9; //conversion between nm and m
const int interval = 1; //wavelength interval to calculate at (nm), to change this line 45 also need to be changed
bbresults* blackbody(int min, int max, double temperature) {
double new_valu, old_valu = 0; //variables for normalising result
bbresults *SPD;
SPD = malloc(sizeof(bbresults));
//allocate memory based on wavelength bounds
SPD->wavelength = malloc(sizeof(int) * (max - min));
SPD->radiance = malloc(sizeof(long double) * (max - min));
SPD->normalised = malloc(sizeof(long double) * (max - min));
for (int i = 0; i < (max - min); i++) {
//Computes radiance for every wavelength of blackbody of given temperature
SPD->wavelength[i] = min + (interval * i);
SPD->radiance[i] = ((2 * H * pow(C, 2)) / (pow((SPD->wavelength[i] * nm_to_m), 5))) * (1 / (expm1((H * C) / ((SPD->wavelength[i] * nm_to_m) * K * temperature))));
//Copy SPD->radiance to SPD->normalised
SPD->normalised[i] = SPD->radiance[i];
//Find largest value
if (i <= 0) {
old_valu = SPD->normalised[0];
} else if (i > 0){
new_valu = SPD->normalised[i];
if (new_valu > old_valu) {
old_valu = new_valu;
}
}
}
for (int i = 0; i < (max - min); i++) {
//Normalise SPD
SPD->normalised[i] = SPD->normalised[i] / old_valu;
}
return SPD;
}
blackbody.h
#ifndef blackbody_h
#define blackbody_h
#include <stdio.h>
#include <math.h>
#include <stdlib.h>
//typedef structure to hold results
typedef struct {
int *wavelength;
long double *radiance;
long double *normalised;
} bbresults;
//function declarations
bbresults* blackbody(int, int, double);
#endif /* blackbody_h */
main.c
#include <stdio.h>
#include "blackbody.h"
int main() {
bbresults *TEST;
int min = 100, max = 3000, temp = 5000;
TEST = blackbody(min, max, temp);
printf("wavelength | normalised radiance | radiance |\n");
printf(" (nm) | - | (W per meter squr per steradian) |\n");
for (int i = 0; i < (max - min); i++) {
printf("%4d %Lf %Le\n", TEST->wavelength[i], TEST->normalised[i], TEST->radiance[i]);
}
//free the arrays first, then the structure that points to them
free(TEST->wavelength);
free(TEST->radiance);
free(TEST->normalised);
free(TEST);
return 0;
}
Plot of output:
I need to write my own asin() function without the math.h library, using a Taylor series. It works fine for numbers between <-0.98;0.98>, but when I am close to the limits it stops after 1604 iterations and is therefore inaccurate.
I don't know how to make it more accurate. Any suggestions are very much appreciated!
The code is as follows:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define EPS 0.000000000001
double my_arcsin(double x)
{
long double a, an, b, bn;
a = an = 1.0;
b = bn = 2.0;
long double n = 3.0;
double xn;
double xs = x;
double xp = x;
int iterace = 0;
xn = xs + (a/b) * (my_pow(xp,n) / n);
while (my_abs(xn - xs) >= EPS)
{
n += 2.0;
an += 2.0;
bn += 2.0;
a = a * an;
b = b * bn;
xs = xn;
xn = xs + (a/b) * (my_pow(xp,n) / n);
iterace++;
}
//printf("%d\n", iterace);
return xn;
}
int main(int argc, char* argv[])
{
double x = 0.0;
if (argc > 2)
x = strtod(argv[2], NULL);
if (strcmp(argv[1], "--asin") == 0)
{
if (x < -1 || x > 1)
printf("nan\n");
else
{
printf("%.10e\n", my_arcsin(x));
//printf("%.10e\n", asin(x));
}
return 0;
}
}
And also a short list of my values and expected ones:
My values Expected values my_asin(x)
5.2359877560e-01 5.2359877560e-01 0.5
1.5567132089e+00 1.5707963268e+00 1 //problem
1.4292568534e+00 1.4292568535e+00 0.99 //problem
1.1197695150e+00 1.1197695150e+00 0.9
1.2532358975e+00 1.2532358975e+00 0.95
Even though the convergence radius of the series expansion you are using is 1, therefore the series will eventually converge for -1 < x < 1, convergence is indeed painfully slow close to the limits of this interval. The solution is to somehow avoid these parts of the interval.
I suggest that you
use your original algorithm for |x| <= 1/sqrt(2),
use the identity arcsin(x) = pi/2 - arcsin(sqrt(1-x^2)) for 1/sqrt(2) < x <= 1.0,
use the identity arcsin(x) = -pi/2 + arcsin(sqrt(1-x^2)) for -1.0 <= x < -1/sqrt(2).
This way you can transform your input x into [-1/sqrt(2),1/sqrt(2)], where convergence is relatively fast.
PLEASE NOTICE: In this case I strongly recommend @Bence's method, since you can't expect a slowly convergent method with low data accuracy to obtain arbitrary precision.
However I'm willing to show you how to improve the result using your current algorithm.
The main problem is that a and b grow too fast and soon become inf (after merely about 150 iterations). Another, similar problem is that my_pow(xp,n) grows fast as n grows; however, this doesn't matter much in this case, since we can assume the input lies in the range [-1, 1].
So I've just changed the method you deal with a/b by introducing ab_ratio, see my edited code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define EPS 0.000000000001
#include <math.h>
#define my_pow powl
#define my_abs fabsl
double my_arcsin(double x)
{
#if 0
long double a, an, b, bn;
a = an = 1.0;
b = bn = 2.0;
#endif
unsigned long _n = 0;
long double ab_ratio = 0.5;
long double n = 3.0;
long double xn;
long double xs = x;
long double xp = x;
int iterace = 0;
xn = xs + ab_ratio * (my_pow(xp,n) / n);
long double step = EPS;
#if 0
while (my_abs(step) >= EPS)
#else
while (1) /* manually stop it */
#endif
{
n += 2.0;
#if 0
an += 2.0;
bn += 2.0;
a = a * an;
b = b * bn;
#endif
_n += 1;
ab_ratio *= (1.0 + 2.0 * _n) / (2.0 + 2.0 * _n);
xs = xn;
step = ab_ratio * (my_pow(xp,n) / n);
xn = xs + step;
iterace++;
if (_n % 10000000 == 0)
printf("%lu %.10g %g %g %g %g\n", _n, (double)xn, (double)ab_ratio, (double)step, (double)xn, (double)my_pow(xp, n));
}
//printf("%d\n", iterace);
return xn;
}
int main(int argc, char* argv[])
{
double x = 0.0;
if (argc > 2)
x = strtod(argv[2], NULL);
if (strcmp(argv[1], "--asin") == 0)
{
if (x < -1 || x > 1)
printf("nan\n");
else
{
printf("%.10e\n", my_arcsin(x));
//printf("%.10e\n", asin(x));
}
return 0;
}
}
For 0.99 (and even 0.9999999) it soon gives correct results with more than 10 significant digits. However it gets slow when getting near to 1.
Actually the process has been running for nearly 12 minutes on my laptop calculating --asin 1, and the current result is 1.570786871 after 3560000000 iterations.
UPDATED: It's been 1h51min now and the result 1.570792915 and iteration count is 27340000000.