I'm trying to evaluate a^n, where a and n are rational numbers.
I don't want to use any predefined functions like sqrt() or pow(), so I'm trying to use Newton's method to get an approximate solution using this approach:
3^0.2 = 3^(1/5), so if x = 3^0.2, then x^5 = 3.
Probably the best way to solve that (without a calculator but still
using the basic arithmetic operations) is to use "Newton's method".
Newton's method for solving the equation f(x) = 0 is to set up a
sequence of numbers xn defined by taking x0 as some initial "guess"
and then xn+1 = xn - f(xn)/f'(xn), where f'(x) is the derivative of f.
Posted on physicsforums
The problem with that method is that if I want to compute 5.2^0.33333, I'll need to find the roots of the equation x^100000 - 5.2^33333 = 0. I end up with huge numbers, and get inf and nan errors most of the time.
Can someone give me advice on how to solve this problem? Or, can someone provide another algorithm to compute a^n?
It seems your task is to calculate
(xN / xD)^(aN / aD),    where xN, xD, aN, aD ∈ ℤ and xD, aD ≠ 0
using only multiplications, divisions, additions, and subtractions, with Newton's method as the suggested method to implement.
The equation we're trying to solve (for y) is
y = (xN / xD)^(aN / aD),    where y ∈ ℝ
Newton's method finds a root of a function. If we want to use it to solve the above, we subtract the right side from the left side, to get a function whose zero gives us the y we want:
f(y) = y - (xN / xD)^(aN / aD) = 0
Not much help. I guess this is as far as you got? The point here is to not form that function just yet, because we don't have a way to calculate a rational power of a rational number!
First, let's decide that aD and xD are both positive. We can do that simply by negating both aN and aD if aD was negative (so the sign of aN/aD does not change), and negating both xN and xD if xD was negative. Remember, by definition neither xD nor aD is zero. Then, we can simply raise both sides to the aD'th power:
y^aD = (xN / xD)^aN = xN^aN / xD^aN
We can even eliminate the division by multiplying both sides by the denominator xD^aN:
y^aD × xD^aN = xN^aN
Now, this looks quite promising! The function we get from this is
f(y) = y^aD × xD^aN - xN^aN
Newton's method also requires the derivative, which is obviously
df(y) = d f(y) / dy = aD × y^(aD-1) × xD^aN
Newton's method itself relies on iterating
y[i+1] = y[i] - f(y[i]) / df(y[i])
If you work out the math, you'll find that the iteration is just
y[i+1] = y[i] - y[i]/aD + (y[i] × xN^aN) / (aD × y[i]^aD × xD^aN)
You don't need to keep all the y values in memory; it is enough to remember the last one, and stop iterating when their difference is small enough.
You do still have exponentiations above, but now they are integer exponentiations only, e.g.
xN^aN = xN × xN × … × xN    (aN times)
which you can do very simply, for example just by multiplying the argument by itself the desired number of times, e.g. in C,
double ipow(const double base, const int exponent)
{
    double result = 1.0;
    int i;
    for (i = 0; i < exponent; i++)
        result *= base;
    return result;
}
There are more efficient methods to do integer exponentiation, but the above function should be perfectly acceptable for this.
The final problem is to pick the initial y so that you get convergence. You cannot use 0, because (a power of) y is used as a denominator in the division; you'd get division by zero error. Personally, I'd check whether the result ought to be positive or negative, and smaller than or greater than one in magnitude; two rules overall to pick a safe initial y.
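Putting the pieces together, a minimal sketch of the whole scheme might look like the following (the starting guess of 1.0 and the 1e-12 stopping tolerance are arbitrary choices, and it assumes xN, xD, aN, aD are all positive; negative signs would be handled by the rules described above):

/* Sketch: compute y = (xN/xD)^(aN/aD) by Newton iteration on
   f(y) = y^aD * xD^aN - xN^aN, using only +, -, *, /.
   Assumes xN, xD, aN, aD > 0; guess and tolerance are arbitrary. */
static double rational_pow(int xN, int xD, int aN, int aD)
{
    const double num = ipow(xN, aN);   /* xN^aN */
    const double den = ipow(xD, aN);   /* xD^aN */
    double y = 1.0;                    /* arbitrary positive starting guess */
    double diff;

    do {
        /* y[i+1] = y - y/aD + y*xN^aN / (aD * y^aD * xD^aN) */
        double next = y - y / aD + y * num / (aD * ipow(y, aD) * den);
        diff = next - y;
        y = next;
    } while (diff > 1e-12 || diff < -1e-12);

    return y;
}

For example, rational_pow(3, 1, 1, 5) should settle near 3^(1/5) ≈ 1.2457.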
Questions?
You can use the generalized binomial theorem for (x + y)^n: substitute y = 1 and x = a - 1. You would want to truncate the infinite series after enough terms, based on the desired accuracy. To be able to link the number of terms to the accuracy, you need to ensure that the successive terms are decreasing in absolute value. So, depending on the values of a and n, you should apply the formula to compute one of a^n and a^(-n) and use that to get your desired result.
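For illustration, a sketch of that series in C (the name binom_pow, the fixed term cap, and the 1e-9 cutoff are my own choices; it assumes |a - 1| < 1 so that the terms actually shrink):

/* Sketch: a^r via the generalized binomial series (1 + x)^r with x = a - 1.
   Assumes |a - 1| < 1; term cap and cutoff are illustrative. */
static double binom_pow(double a, double r)
{
    double x = a - 1.0;     /* substitute y = 1, x = a - 1 */
    double term = 1.0;      /* k = 0 term of the series */
    double sum = 1.0;
    int k;
    for (k = 1; k < 200; k++) {
        term *= (r - (k - 1)) * x / k;   /* C(r,k) * x^k, built incrementally */
        sum += term;
        if (term < 1e-9 && term > -1e-9) /* stop once terms are negligible */
            break;
    }
    return sum;
}

For instance, binom_pow(1.2, 0.5) should come out near sqrt(1.2) ≈ 1.0954; per the answer, when |a - 1| is not small you would instead work with a^(-n) or otherwise rearrange so the terms shrink.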
A solution for raising an integer number to a power is:
int poweri (int x, unsigned int y)
{
    int temp;
    if (y == 0)
        return 1;
    temp = poweri (x, y / 2);
    if ((y % 2) == 0)
        return temp * temp;
    else
        return x * temp * temp;
}
However, the square root doesn't have as clean a closed-form solution. There is a good bit of background to be found at Wikipedia's square root article and at Wolfram MathWorld's square root algorithms page. Both provide several methods that will meet your needs; you just have to choose the one that fits your purpose.
With slight modification, this routine from wikipedia (modified to return the square root and refine accuracy) returns a surprisingly accurate square root. Yes, there will be howls about the use of a union, and it is only valid where integer and float storage are equivalent, but if you are hacking your own square root, this is relatively efficient:
float sqrt_f (float x)
{
    float xhalf = 0.5f*x;
    union
    {
        float x;
        int i;
    } u;
    u.x = x;
    u.i = 0x5f3759df - (u.i >> 1);
    /* The next line can be repeated any number of times to increase accuracy */
    // u.x = u.x * (1.5f - xhalf * u.x * u.x);
    int i = 10;
    while (i--)
        u.x *= 1.5f - xhalf * u.x * u.x;
    return 1.0f / u.x;
}
Related
I am new to C, and my task is to create a function
f(x) = sqrt[(x^2)+1]-1
that can handle very large numbers and very small numbers. I am submitting my script on an online interface that checks my answers.
For very large numbers I simplify the expression to:
f(x) = x-1
By just using the highest power. This was the correct answer.
The same logic does not work for smaller numbers. For small numbers (on the order of 1e-7), they are very quickly truncated to zero, even before they are squared. I suspect that this has to do with floating point precision in C. In my textbook, it says that the float type has smallest possible value of 1.17549e-38, with 6 digit precision. So although 1e-7 is much larger than 1.17e-38, it has a higher precision, and is therefore rounded to zero. This is my guess, correct me if I'm wrong.
As a solution, I am thinking that I should convert x to a long double when x < 1e-6. However when I do this, I still get the same error. Any ideas? Let me know if I can clarify. Code below:
#include <math.h>
#include <stdio.h>

double feval(double x) {
    /* Insert your code here */
    if (x > 1e299)
    {
        return x-1;
    }
    if (x < 1e-6)
    {
        long double g;
        g = x;
        printf("x = %Lf\n", g);
        long double a;
        a = pow(x,2);
        printf("x squared = %Lf\n", a);
        return sqrt(g*g+1.)- 1.;
    }
    else
    {
        printf("x = %f\n", x);
        printf("Used third \n");
        return sqrt(pow(x,2)+1.)-1;
    }
}

int main(void)
{
    double x;
    printf("Input: ");
    scanf("%lf", &x);
    double b;
    b = feval(x);
    printf("%f\n", b);
    return 0;
}
int main(void)
{
double x;
printf("Input: ");
scanf("%lf", &x);
double b;
b = feval(x);
printf("%f\n", b);
return 0;
}
For small inputs, you're getting truncation error when you do 1+x^2. If x=1e-7f, x*x will happily fit into a 32 bit floating point number (with a little bit of error due to the fact that 1e-7 does not have an exact floating point representation), but x*x will be so much smaller than 1 that floating point precision will not be sufficient to represent 1+x*x.
It would be more appropriate to do a Taylor expansion of sqrt(1+x^2), which to lowest order would be
sqrt(1+x^2) = 1 + 0.5*x^2 + O(x^4)
Then, you could write your result as
sqrt(1+x^2)-1 = 0.5*x^2 + O(x^4),
avoiding the scenario where you add a very small number to 1.
As a side note, you should not use pow for integer powers. For x^2, you should just do x*x. Arbitrary integer powers are a little trickier to do efficiently; the GNU scientific library for example has a function for efficiently computing arbitrary integer powers.
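Putting that together, a minimal sketch of feval along these lines might be (the 1e-4 crossover and the large-x cutoff are arbitrary illustrative choices, not prescribed values):

#include <math.h>

double feval(double x)
{
    double ax = fabs(x);
    if (ax < 1e-4)                      /* tiny x: Taylor expansion, avoids 1 + x*x rounding to 1 */
        return 0.5 * x * x;             /* sqrt(1 + x^2) - 1 ~= x^2/2 + O(x^4) */
    if (ax > 1e150)                     /* huge x: avoid overflow of x*x; sqrt(x^2 + 1) ~= |x| */
        return ax - 1.0;
    return sqrt(x * x + 1.0) - 1.0;     /* moderate x: the direct formula is fine */
}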
There are two issues here when implementing this in the naive way: overflow or underflow in the intermediate computation when computing x * x, and subtractive cancellation during the final subtraction of 1. The second issue is an accuracy issue.
ISO C has a standard math function hypot (x, y) that performs the computation sqrt (x * x + y * y) accurately while avoiding underflow and overflow in intermediate computation. A common approach to fix issues with subtractive cancellation is to rewrite the computation algebraically so that it turns into multiplications and / or divisions.
Combining these two fixes leads to the following implementation for float argument. It has an error of less than 3 ulps across all possible inputs according to my testing.
/* Compute sqrt(x*x+1)-1 accurately and without spurious overflow or underflow */
float func (float x)
{
    return (x / (1.0f + hypotf (x, 1.0f))) * x;
}
A trick that is often useful in these cases is based on the identity
(a+1)*(a-1) = a*a-1
In this case
sqrt(x*x+1) - 1 = (sqrt(x*x+1) - 1) * (sqrt(x*x+1) + 1) / (sqrt(x*x+1) + 1)
                = (x*x + 1 - 1) / (sqrt(x*x+1) + 1)
                = x*x / (sqrt(x*x+1) + 1)
The last formula can be used as an implementation. For very small x, sqrt(x*x+1)+1 will be close to 2 (for small enough x it will be exactly 2), but we don't lose precision in evaluating it.
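As a sketch, a direct translation of that last formula (the function name feval_alg is just illustrative; for very large x the intermediate x*x can still overflow, which the hypotf version above avoids):

#include <math.h>

double feval_alg(double x)
{
    /* sqrt(x*x + 1) - 1 rewritten to avoid subtractive cancellation */
    return x * x / (sqrt(x * x + 1.0) + 1.0);
}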
The problem isn't with running into the minimum value, but with the precision.
As you said yourself, float on your machine has about 7 digits of precision. So let's take x = 1e-7, so that x^2 = 1e-14. That's still well within the range of float, no problems there. But now add 1. The exact answer would be 1.00000000000001. But if we only have 7 digits of precision, this gets rounded to 1.0000000, i.e. exactly 1. So you end up computing sqrt(1.0)-1 which is exactly 0.
One approach would be to use the linear approximation of sqrt around x=1 that sqrt(x) ~ 1+0.5*(x-1). That would lead to the approximation f(x) ~ 0.5*x^2.
I have a bit of code that finds a point on a unit sphere. Recall, for a unit sphere:
1 = sqrt( x^2 + y^2 + z^2 )
The algorithm picks two random coordinates (x and y) between minus one and one. Provided their magnitude is less than one we have room to define a third coordinate by solving the above equation for z.
void pointOnSphere(double *point){
    double x, y;
    do {
        x = 2*randf() - 1;
        y = 2*randf() - 1;
    } while (x*x + y*y > 1);
    double mag = sqrt(fabs(1 - x*x - y*y));
    point[0] = 2*(x*mag);
    point[1] = 2*(y*mag);
    point[2] = 1 - 2*(mag*mag);
}
Technically, I inherited this code. The previous owner compiled using -Ofast which "Disregards strict standards compliance". TL;DR it means your code doesn't need to follow strict IEEE standards. So when I tried to compile without optimization I ran into an error.
undefined reference to `sqrt'
What are IEEE standards? Well, because computers can't store floating point numbers to infinite precision, rounding errors pop up during certain calculations if you're not careful.
After some googling I ran into this question which got me on the right track about using proper IEEE stuff. I even read this article about floating point numbers (which I recommend). Unfortunately it didn't answer my questions.
I'd like to use sqrt() in my function as opposed to something like Newton Iteration. I understand the issue in my algorithm probably comes from the fact I could potentially (even though not really) pass a negative number to the sqrt() function. I'm just not quite sure how to remedy the issue. Thanks for all the help!
Oh, and if it's relevant I'm using a Mersenne Twister number generator.
Just to clarify, I am linking libm with -lm! I have also confirmed it is pointing to the correct library.
As for the undefined reference to sqrt, you need to link with libm, usually with the -lm option or similar.
Also note that
Provided their magnitude is less than one we have room to define a third coordinate by solving the above equation for z.
is wrong. The x and y must satisfy x * x + y * y <= 1 in order for there to be a solution for z.
I'd use spherical coordinates
theta = randf()*M_PI;
phi = randf()*2*M_PI;
r = 1.0;
x = r*sin(theta)*cos(phi);
y = r*sin(theta)*sin(phi);
z = r*cos(theta);
To ensure the points meet a condition, test for the condition itself as part of the while loop, rather than something derived from it.
// Functions like `sqrt(), hypot()` benefit from a declaration before use,
// and without one may generate "undefined reference to `sqrt'".
// Some functions like `sqrt()` are understood and optimized out by a smart compiler.
// Still, best to always declare them.
#include <math.h>

void pointOnSphere(double *point){
    double x, y, z;
    do {
        x = 2*randf() - 1;
        y = 2*randf() - 1;
        double h = hypot(x, y);                // sqrt(x*x + y*y)
        double zz = (1.0 - h) * (1.0 + h);     // 1 - (x*x + y*y), so that x*x + y*y + z*z == 1
        if (zz < 0.) { z = 2.0; continue; }    // Outside the unit disk: force another pass
        z = sqrt(zz);
        if (rand() % 2) z = -z;                // Flip z half the time
    } while (x*x + y*y + z*z > 1);             // Must meet this condition
    point[0] = x;
    point[1] = y;
    point[2] = z;
}
I'm working via a basic 'Programming in C' book.
I have written the following code based off of it in order to calculate the square root of a number:
#include <stdio.h>

float absoluteValue (float x)
{
    if(x < 0)
        x = -x;
    return (x);
}

float squareRoot (float x, float epsilon)
{
    float guess = 1.0;
    while(absoluteValue(guess * guess - x) >= epsilon)
    {
        guess = (x/guess + guess) / 2.0;
    }
    return guess;
}

int main (void)
{
    printf("SquareRoot(2.0) = %f\n", squareRoot(2.0, .00001));
    printf("SquareRoot(144.0) = %f\n", squareRoot(144.0, .00001));
    printf("SquareRoot(17.5) = %f\n", squareRoot(17.5, .00001));
    return 0;
}
An exercise in the book has said that the current criteria used for termination of the loop in squareRoot() is not suitable for use when computing the square root of a very large or a very small number.
Instead of comparing the difference between the value of x and the value of guess^2, the program should compare the ratio of the two values to 1. The closer this ratio gets to 1, the more accurate the approximation of the square root.
If the ratio is just guess^2/x, shouldn't my code inside of the while loop:
guess = (x/guess + guess) / 2.0;
be replaced by:
guess = ((guess * guess) / x ) / 1 ; ?
This compiles but nothing is printed out into the terminal. Surely I'm doing exactly what the exercise is asking?
To calculate the ratio, just do (guess * guess / x); it could be either higher or lower than 1 depending on your implementation. Similarly, your margin of error (in percent) would be absoluteValue((guess * guess / x) - 1) * 100.
All they want you to check is how close the square root is. By squaring the number you get and dividing it by the number you took the square root of you are just checking how close you were to the original number.
Example:
sqrt(4) = 2
2 * 2 / 4 = 1 (this is exact, so we get exactly 1)
margin of error = (1 - 1) * 100 = 0% margin of error
Another example:
sqrt(4) = 1.999 (let's just say you got this)
1.999 * 1.999 = 3.996
3.996/4 = .999 (so we are close but not exact)
To check margin of error:
.999 - 1 = -.001
absoluteValue(-.001) = .001
.001 * 100 = .1% margin of error
How about applying a little algebra? Your current criterion is:
|guess^2 - x| >= epsilon
You are elsewhere assuming that guess is nonzero, so it is algebraically safe to convert that to
|1 - x / guess^2| >= epsilon / guess^2
epsilon is just a parameter governing how close the match needs to be, and the above reformulation shows that it must be expressed in terms of the floating-point spacing near guess^2 to yield equivalent precision for all evaluations. But of course that's not possible because epsilon is a constant. This is, in fact, exactly why the original criterion gets less effective as x diverges from 1.
Let us instead write the alternative expression
|1 - x / guess^2| >= delta
Here, delta expresses the desired precision in terms of the spacing of floating point values in the vicinity of 1, which is related to a fixed quantity sometimes called the "machine epsilon". You can directly select the required precision via your choice of delta, and you will get the same precision for all x, provided that no arithmetic operations overflow.
Now just convert that back into code.
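For reference, one way that conversion might look (a sketch: delta plays the role described above, the x == 0 guard is my addition, and absoluteValue is the helper from the question):

float squareRoot (float x, float delta)
{
    if (x == 0.0f)
        return 0.0f;                 /* avoid an endless loop on zero input */
    float guess = 1.0;
    /* stop when the ratio x / guess^2 is within delta of 1 */
    while (absoluteValue(1.0f - x / (guess * guess)) >= delta)
    {
        guess = (x/guess + guess) / 2.0;
    }
    return guess;
}

With delta around 1e-5 this behaves like the original for x near 1, but keeps roughly the same relative accuracy for very large and very small x.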
Suggest a different point of view.
With this method, guess_next = (x/guess + guess) / 2.0;, once the initial approximation is in the neighborhood, the number of correct bits roughly doubles each iteration. For example, log2(FLT_EPSILON) is about -23, so 6 iterations are needed (think 23, 12, 6, 3, 2, 1).
The trouble with using guess * guess is that it may underflow to 0.0 or overflow to infinity for a non-zero x.
To form a quality initial guess:
assert(x > 0.0f);
int expo;
float signif = frexpf(x, &expo);
float guess = ldexpf(signif, expo/2);
Now iterate N times (e.g. 6; N based on FLT_EPSILON, FLT_DECIMAL_DIG or FLT_DIG):
for (i=0; i<N; i++) {
    guess = (x/guess + guess) / 2.0f;
}
The cost of perhaps an extra iteration is saved by avoiding an expensive termination condition calculation.
If code wants to test whether a/b is near 1.0f, simply use some epsilon factor like 1 or 2.
float a = guess;
float b = x/guess;
assert(b);
float q = a/b;
#define FACTOR (1.0f /* some value 1.0f to maybe 2,3 or 4 */)
if (q >= 1.0f - FLT_EPSILON*FACTOR && q <= 1.0f + FLT_EPSILON*FACTOR) {
    close_enough();
}
First lesson in numerical analysis: for floating point numbers x+y has the potential for large relative errors, especially when the sum is near zero, but x*y has very limited relative errors.
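A tiny demonstration of that lesson (a sketch; the specific literals are arbitrary):

#include <stdio.h>

int main(void)
{
    float x = 1e-8f;
    /* Addition near cancellation: x is completely lost in (1 + x) - 1 */
    printf("(1 + x) - 1 = %g\n", (1.0f + x) - 1.0f);   /* prints 0 */
    /* Multiplication keeps the relative error tiny */
    printf("x * x       = %g\n", x * x);               /* prints roughly 1e-16 */
    return 0;
}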
When implementing "Carmack's Inverse Square Root" algorithm I noticed that the results seem biased. The following code seems to give better results:
float InvSqrtF(float x)
{
    // Initial approximation by Greg Walsh.
    int i = * ( int* ) &x;
    i = 0x5f3759df - ( i >> 1 );
    float y = * ( float * ) &i;
    // Two iterations of Newton-Raphson's method to refine the initial estimate.
    x *= 0.5f;
    float f = 1.5F;
    y = y * ( f - ( x * y * y ) );
    y = y * ( f - ( x * y * y ) );
    * ( int * )(&y) += 0x13; // More magic.
    return y;
}
The key difference is in the penultimate "more magic" line. Since the initial results were too low by a fairly constant factor, this adds 19 * 2^(exponent(y)-bias) to the result with just a single instruction. It seems to give me about 3 extra bits, but am I overlooking something?
Newton's method produces a bias. The function whose zero is to be found,
f(y) = x - 1/y²
is concave, so - unless you start with a y ≥ √(3/x) - the Newton method only produces approximations ≤ 1/√x (and strictly smaller, unless you start with the exact result) with exact arithmetic.
Floating point arithmetic occasionally produces too large approximations, but typically not in the first two iterations (since the initial guess usually isn't close enough).
So yes, there is a bias, and adding a small quantity generally improves the result. But not always. In the region around 1.25 or 0.85 for example, the results without the adjustment are better than with. In other regions, the adjustment yields one bit of additional precision, in yet others more.
In any case, the magic constant to add should be adjusted to the region from which x is most often taken for the best results.
As this method is an approximation, the result will be overestimated some times and underestimated some others. You can find in McEniry's paper some nice figures about how this error is distributed for different configurations, and the math behind them.
So, unless you have solid proofs that in your domain of application the result is clearly biased, I would prefer tuning the magic constant as suggested in Lomont's document :-)
I'm looking for implementations of the log() and exp() functions provided in the C library <math.h>. I'm working with 8 bit microcontrollers (OKI 411 and 431). I need to calculate Mean Kinetic Temperature. The requirement is that we should be able to calculate MKT as fast as possible and with as little code memory as possible. The compiler comes with log() and exp() functions in <math.h>. But calling either function and linking with the library causes the code size to increase by 5 Kilobytes, which will not fit in one of the micros we work with (OKI 411), because our code has already consumed ~12K of the available ~15K code memory.
The implementation I'm looking for should not use any other C library functions (like pow(), sqrt() etc). This is because all library functions are packed in one library and even if one function is called, the linker will bring whole 5K library to code memory.
EDIT
The algorithm should be correct up to 3 decimal places.
Using a Taylor series is neither the simplest nor the fastest way of doing this. Most professional implementations use approximating polynomials. I'll show you how to generate one in Maple (a computer algebra program), using the Remez algorithm.
For 3 digits of accuracy execute the following commands in Maple:
with(numapprox):
Digits := 8
minimax(ln(x), x = 1 .. 2, 4, 1, 'maxerror')
maxerror
Its response is the following polynomial:
-1.7417939 + (2.8212026 + (-1.4699568 + (0.44717955 - 0.056570851 * x) * x) * x) * x
With the maximal error of: 0.000061011436
We generated a polynomial which approximates the ln(x), but only inside the [1..2] interval. Increasing the interval is not wise, because that would increase the maximal error even more. Instead of that, do the following decomposition:
So first find the highest power of 2 which is still smaller than the number (see: What is the fastest/most efficient way to find the highest set bit (msb) in an integer in C?). The exponent of that power is the integer part of the base-2 logarithm. Divide by that power of 2, and the result falls into the [1, 2] interval. At the end we will have to add n*ln(2) to get the final result (where n is that integer exponent).
An example implementation for numbers >= 1:
float ln(float y) {
    int log2;
    float divisor, x, result;

    log2 = msb((int)y); // See: https://stackoverflow.com/a/4970859/6630230
    divisor = (float)(1 << log2);
    x = y / divisor;    // normalized value between [1.0, 2.0]

    result = -1.7417939 + (2.8212026 + (-1.4699568 + (0.44717955 - 0.056570851 * x) * x) * x) * x;
    result += ((float)log2) * 0.69314718; // ln(2) = 0.69314718

    return result;
}
Although if you plan to use it only in the [1.0, 2.0] interval, then the function is like:
float ln(float x) {
    return -1.7417939 + (2.8212026 + (-1.4699568 + (0.44717955 - 0.056570851 * x) * x) * x) * x;
}
The Taylor series for e^x converges extremely quickly, and you can tune your implementation to the precision that you need. (http://en.wikipedia.org/wiki/Taylor_series)
The Taylor series for log is not as nice...
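For instance, a minimal Taylor-series e^x sketch in C (the term cap and the 1e-9 cutoff are arbitrary; for large negative x you would want to compute e^|x| and return its reciprocal to avoid cancellation):

double exp_taylor(double x)
{
    /* e^x = sum over n of x^n / n!; each term is the previous one times x/n */
    double term = 1.0, sum = 1.0;
    int n;
    for (n = 1; n < 100; n++) {
        term *= x / n;
        sum += term;
        if (term < 1e-9 && term > -1e-9)   /* terms have become negligible */
            break;
    }
    return sum;
}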
If you don't need floating-point math for anything else, you may compute an approximate fractional base-2 log pretty easily. Start by shifting your value left until it's 32768 or higher and store the number of times you did that in count. Then, repeat some number of times (depending upon your desired scale factor):
n = (mult(n,n) + 32768u) >> 16; // If a function is available for 16x16->32 multiply
count<<=1;
if (n < 32768) n*=2; else count+=1;
If the above loop is repeated 8 times, then the log base 2 of the number will be count/256. If ten times, count/1024. If eleven, count/2048. Effectively, this function works by computing the integer power-of-two logarithm of n**(2^reps), but with intermediate values scaled to avoid overflow.
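As a concrete fleshing-out of that idea, here is one possible packaging (a sketch: the function name, the 32-bit types, and the exponent bookkeeping are my own additions; it returns log2(v) in Q8 fixed point, i.e. scaled by 256, for v >= 1):

#include <stdint.h>

static int32_t log2_q8(uint32_t v)
{
    int32_t expo = 15;      /* exponent such that v ~= (m / 32768) * 2^expo */
    uint32_t m = v;
    uint32_t frac = 0;
    int i;

    if (v == 0)
        return INT32_MIN;   /* log2(0) is undefined */

    while (m >= 65536u) { m >>= 1; expo++; }   /* large inputs: drop low bits */
    while (m <  32768u) { m <<= 1; expo--; }   /* small inputs: normalize up */

    /* Extract 8 fractional bits of log2(m / 32768) by repeated squaring. */
    for (i = 0; i < 8; i++) {
        m = (m * m + 32768u) >> 16;            /* square the 0.16 fixed-point mantissa */
        frac <<= 1;
        if (m < 32768u)
            m <<= 1;                           /* fell below 0.5: fractional bit is 0 */
        else
            frac |= 1;                         /* stayed at or above 0.5: bit is 1 */
    }
    return expo * 256 + (int32_t)frac;
}

For example, log2_q8(3) comes out around 405, and 405/256 ≈ 1.58 versus log2(3) ≈ 1.585.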
Would a basic table with interpolation between values work? If the range of values is limited (which is likely for your case - I doubt temperature readings have a huge range) and high precision is not required, it may work. Should be easy to test on a normal machine.
Here is one of many topics on table representation of functions: Calculating vs. lookup tables for sine value performance?
Necromancing.
I had to implement logarithms on rational numbers.
This is how I did it:
According to Wikipedia, there is the Halley-Newton approximation method,
which can be used for very high precision.
Using Newton's method, the iteration simplifies to the implementation below, which has cubic convergence to ln(x), which is way better than what the Taylor series offers.
// Using Newton's method, the iteration simplifies to (implementation)
// which has cubic convergence to ln(x).
public static double ln(double x, double epsilon)
{
    double yn = x - 1.0d; // using the first term of the taylor series as initial-value
    double yn1 = yn;
    do
    {
        yn = yn1;
        yn1 = yn + 2 * (x - System.Math.Exp(yn)) / (x + System.Math.Exp(yn));
    } while (System.Math.Abs(yn - yn1) > epsilon);
    return yn1;
}
This is not C, but C#, but I'm sure anybody capable to program in C will be able to deduce the C-Code from that.
Furthermore, since
logN(x) = ln(x) / ln(N)
you have therefore just implemented logN as well.
public static double log(double x, double n, double epsilon)
{
    return ln(x, epsilon) / ln(n, epsilon);
}
where epsilon (error) is the minimum precision.
Now as to speed, you're probably better off using the ln cast in hardware, but as I said, I used this as a base to implement logarithms on a rational numbers class working with arbitrary precision.
Arbitrary precision might be more important than speed, under certain circumstances.
Then, use the logarithmic identities for rational numbers:
logB(x/y) = logB(x) - logB(y)
In addition to Crouching Kitten's answer which gave me inspiration, you can build a pseudo-recursive (at most 1 self-call) logarithm to avoid using polynomials. In pseudo code
ln(x) :=
  If (x <= 0)
    return NaN
  Else if (!(1 <= x < 2))
    return LN2 * b + ln(a)
  Else
    return taylor_expansion(x - 1)
This is pretty efficient and precise since on [1; 2) the taylor series converges A LOT faster, and we get such a number 1 <= a < 2 with the first call to ln if our input is positive but not in this range.
You can find 'b' as your unbiased exponent from the data held in the float x, and 'a' from the mantissa of the float x (a is exactly the same float as x, but now with exponent biased_0 rather than exponent biased_b). LN2 should be kept as a macro in hexadecimal floating point notation IMO. You can also use http://man7.org/linux/man-pages/man3/frexp.3.html for this.
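A sketch of that scheme in C (taylor_ln1p, the term cap, and the constants are illustrative choices; only frexp from <math.h> is used, as suggested above, and the plain alternating series does get slow as a approaches 2):

#include <math.h>   /* only for frexp(); no log() or exp() */

#define LN2 0x1.62e42fefa39efp-1   /* ln(2) as a hexadecimal floating point constant */

static double taylor_ln1p(double t)    /* ln(1 + t) for 0 <= t < 1 */
{
    double term = t, sum = 0.0;
    int n;
    for (n = 1; n <= 60; n++) {
        sum += term / n;               /* t^n / n, with alternating sign */
        term = -term * t;
    }
    return sum;
}

static double ln_recursive(double x)
{
    int b;
    double a;
    if (x <= 0.0)
        return NAN;
    if (x >= 1.0 && x < 2.0)
        return taylor_ln1p(x - 1.0);   /* already in [1, 2): no reduction needed */
    a = 2.0 * frexp(x, &b);            /* x = a * 2^(b-1) with 1 <= a < 2 */
    return LN2 * (b - 1) + taylor_ln1p(a - 1.0);
}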
Also, the trick
unsigned long tmp = *(ulong*)(&d);
for "memory-casting" double to unsigned long, rather than "value-casting", is very useful to know when dealing with floats memory-wise, as bitwise operators will cause warnings or errors depending on the compiler.
Possible computation of ln(x) and expo(x) in C without <math.h> :
static double expo(double n) {
    int a = 0, b = n > 0;
    double c = 1, d = 1, e = 1;
    for (b || (n = -n); e + .00001 < (e += (d *= n) / (c *= ++a)););
    // approximately 15 iterations
    return b ? e : 1 / e;
}

static double native_log_computation(const double n) {
    // Basic logarithm computation.
    static const double euler = 2.7182818284590452354 ;
    unsigned a = 0, d;
    double b, c, e, f;
    if (n > 0) {
        for (c = n < 1 ? 1 / n : n; (c /= euler) > 1; ++a);
        c = 1 / (c * euler - 1), c = c + c + 1, f = c * c, b = 0;
        for (d = 1, c /= 2; e = b, b += 1 / (d * c), b - e/* > 0.0000001 */;)
            d += 2, c *= f;
    } else b = (n == 0) / 0.;
    return n < 1 ? -(a + b) : a + b;
}

static inline double native_ln(const double n) {
    // Returns the natural logarithm (base e) of N.
    return native_log_computation(n) ;
}

static inline double native_log_base(const double n, const double base) {
    // Returns the logarithm (base b) of N.
    return native_log_computation(n) / native_log_computation(base) ;
}
Building off Crouching Kitten's great natural log answer above, if you need it to be accurate for inputs < 1 you can add a simple scaling factor. Below is an example in C++ that I've used in microcontrollers. It has a scaling factor of 256 and it's accurate for inputs down to 1/256 = ~0.04, and up to 2^32/256 = 16777215 (due to overflow of a uint32 variable).
It's interesting to note that even on an STM32F103 (Arm Cortex-M3) with no FPU, the float implementation below is significantly faster (e.g. 3x or better) than the 16 bit fixed-point implementation in libfixmath (that being said, this float implementation still takes a few thousand cycles, so it's still not ~fast~).
#include <float.h>
#include <stdint.h>

float TempSensor::Ln(float y)
{
    // Algo from: https://stackoverflow.com/a/18454010
    // Accurate between (1 / scaling factor) < y < (2^32 / scaling factor).
    // Read the comments below for more info on how to extend this range.

    float divisor, x, result;
    const float LN_2 = 0.69314718; // pre-calculated constant used in calculations
    uint32_t log2 = 0;

    // handle inputs that are less than or equal to zero
    if (y <= 0)
    {
        return -FLT_MAX;
    }

    // Scaling factor. The polynomial below is accurate when the input y > 1, therefore
    // using a scaling factor of 256 (aka 2^8) extends this down to 1/256 or ~0.04.
    // Given the use of uint32_t, the input y must stay below 2^24 or 16777216 (aka 2^(32-8)),
    // otherwise uint_y used below will overflow. Increasing the scaling factor will reduce
    // the lower accuracy bound and also reduce the upper overflow bound. If you need the
    // range to be wider, consider changing uint_y to a uint64_t.
    const uint32_t SCALING_FACTOR = 256;
    const float LN_SCALING_FACTOR = 5.545177444; // natural log of the scaling factor, precalculated

    y = y * SCALING_FACTOR;

    uint32_t uint_y = (uint32_t)y;
    while (uint_y >>= 1) // Convert the number to an integer and find the location of the MSB.
    {                    // This is the integer portion of Log2(y). See: https://stackoverflow.com/a/4970859/6630230
        log2++;
    }

    divisor = (float)(1 << log2);
    x = y / divisor; // Find the remainder value between [1.0, 2.0], then calculate the
                     // natural log of this remainder using a polynomial approximation.
    result = -1.7417939 + (2.8212026 + (-1.4699568 + (0.44717955 - 0.056570851 * x) * x) * x) * x; // approximates ln(x) on [1,2]

    // Using the log product rule Log(A) + Log(B) = Log(AB) and the log base change rule
    // log_x(A) = log_y(A)/log_y(x), calculate all the components in base e and then sum them:
    // Ln(x_remainder) + (log_2(y_integer) * ln(2)) - ln(SCALING_FACTOR)
    result = result + ((float)log2) * LN_2 - LN_SCALING_FACTOR;

    return result;
}
}