How will you implement pow(a,b) in C? Conditions follow

Without using multiplication or division operators.
You can use only add/subtract operators.

A pointless problem, but solvable with the properties of logarithms:
pow(a,b) = exp( b * log(a) )
= exp( exp( log(b) + log(log(a)) ) )
This form needs a > 1 and b > 0, so that both log(b) and log(log(a)) are defined. Take care to ensure that your exponential and logarithm functions use the same base.
Yes, I know how to use a slide rule. Learning that trick will change your perspective of logarithms.

If they are integers, it's simple to turn pow (a, b) into b multiplications of a.
pow(a, b) = a * a * a * a ... ; // do this b times
And simple to turn a * a into additions
a * a = a + a + a + a + ... ; // do this a times
If you combine them, you can make pow.
First, make mult(int a, int b), then use it to make pow.

A recursive solution :
#include<stdio.h>
int multiplication(int a1, int b1)
{
if(b1)
return (a1 + multiplication(a1, b1-1));
else
return 0;
}
int pow(int a, int b)
{
if(b)
return multiplication(a, pow(a, b-1));
else
return 1;
}
int main()
{
printf("\n %d", pow(5, 4));
}

You've already gotten answers purely for FP and purely for integers. Here's one for a FP number raised to an integer power:
double power(double x, int y) {
double z = 1.0;
while (y > 0) {
while (!(y&1)) {
y >>= 1; /* halve the exponent once for each squaring of x */
x *= x;
}
--y;
z = x * z;
}
return z;
}
At the moment this uses multiplication. You can implement multiplication using only bit shifts, a few bit comparisons, and addition. For integers it looks like this:
int mul(int x, int y) {
int result = 0;
while (y) {
if (y&1)
result += x;
x <<= 1;
y >>= 1;
}
return result;
}
Floating point is pretty much the same, except you have to normalize your results -- i.e., in essence, a floating point number is 1) a significand expressed as a (usually fairly large) integer, and 2) a scale factor. If you want to produce normal IEEE floating point numbers a few parts get a bit ugly though -- for example, the scale factor is stored as a "bias" number instead of any of the usual 1's complement, 2's complement, etc., so working with it is clumsy (basically, each operation you subtract off the bias, do the operation, check for overflow, and (assuming it hasn't overflowed) add the bias back on again).
Doing the job without any kind of logical tests sounds (to me) like it probably wasn't really intended. For quite a few computer architecture classes, it's interesting to reduce a problem to primitive operations you can express directly in hardware (e.g., bit shifts, bitwise-AND, -OR and -NOT, etc.). The implementation shown above fits that reasonably well (if you want to get technical, an adder takes a few gates, but it's included in things like VHDL and Verilog anyway).
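Putting the two pieces of this answer together for the integer case might look like the sketch below. The names mul_shift_add and ipow are invented for this example, and unsigned operands are an assumption made here to keep the shifts well defined; results wrap around on overflow.

```c
/* Shift-and-add multiply: no * operator, only +, shifts and bit tests. */
static unsigned mul_shift_add(unsigned x, unsigned y)
{
    unsigned result = 0;
    while (y) {
        if (y & 1u)        /* lowest bit of y set: add the current x */
            result += x;
        x <<= 1;           /* x doubles for the next bit of y */
        y >>= 1;
    }
    return result;
}

/* Exponentiation by squaring; every product goes through mul_shift_add. */
unsigned ipow(unsigned base, unsigned exp)
{
    unsigned result = 1;
    while (exp) {
        if (exp & 1u)
            result = mul_shift_add(result, base);
        base = mul_shift_add(base, base);
        exp >>= 1;
    }
    return result;
}
```

For example, ipow(5, 4) evaluates to 625 using additions and shifts only.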

Related

Underflow error in floating point arithmetic in C

I am new to C, and my task is to create a function
f(x) = sqrt[(x^2)+1]-1
that can handle very large numbers and very small numbers. I am submitting my script on an online interface that checks my answers.
For very large numbers I simplify the expression to:
f(x) = x-1
By just using the highest power. This was the correct answer.
The same logic does not work for smaller numbers. For small numbers (on the order of 1e-7), they are very quickly truncated to zero, even before they are squared. I suspect that this has to do with floating point precision in C. In my textbook, it says that the float type has smallest possible value of 1.17549e-38, with 6 digit precision. So although 1e-7 is much larger than 1.17e-38, it has a higher precision, and is therefore rounded to zero. This is my guess, correct me if I'm wrong.
As a solution, I am thinking that I should convert x to a long double when x < 1e-6. However when I do this, I still get the same error. Any ideas? Let me know if I can clarify. Code below:
#include <math.h>
#include <stdio.h>
double feval(double x) {
/* Insert your code here */
if (x > 1e299)
{
return x-1;
}
if (x < 1e-6)
{
long double g;
g = x;
printf("x = %Lf\n", g);
long double a;
a = pow(x,2);
printf("x squared = %Lf\n", a);
return sqrt(g*g+1.)- 1.;
}
else
{
printf("x = %f\n", x);
printf("Used third \n");
return sqrt(pow(x,2)+1.)-1;
}
}
int main(void)
{
double x;
printf("Input: ");
scanf("%lf", &x);
double b;
b = feval(x);
printf("%f\n", b);
return 0;
}
For small inputs, you're getting truncation error when you do 1+x^2. If x=1e-7f, x*x will happily fit into a 32 bit floating point number (with a little bit of error due to the fact that 1e-7 does not have an exact floating point representation), but x*x will be so much smaller than 1 that floating point precision will not be sufficient to represent 1+x*x.
It would be more appropriate to do a Taylor expansion of sqrt(1+x^2), which to lowest order would be
sqrt(1+x^2) = 1 + 0.5*x^2 + O(x^4)
Then, you could write your result as
sqrt(1+x^2)-1 = 0.5*x^2 + O(x^4),
avoiding the scenario where you add a very small number to 1.
As a side note, you should not use pow for integer powers. For x^2, you should just do x*x. Arbitrary integer powers are a little trickier to do efficiently; the GNU scientific library for example has a function for efficiently computing arbitrary integer powers.
There are two issues here when implementing this in the naive way: overflow or underflow in intermediate computation when computing x * x, and subtractive cancellation during the final subtraction of 1. The second issue is an accuracy issue.
ISO C has a standard math function hypot (x, y) that performs the computation sqrt (x * x + y * y) accurately while avoiding underflow and overflow in intermediate computation. A common approach to fix issues with subtractive cancellation is to transform the computation algebraically such that it is transformed into multiplications and / or divisions.
Combining these two fixes leads to the following implementation for float argument. It has an error of less than 3 ulps across all possible inputs according to my testing.
/* Compute sqrt(x*x+1)-1 accurately and without spurious overflow or underflow */
float func (float x)
{
return (x / (1.0f + hypotf (x, 1.0f))) * x;
}
A trick that is often useful in these cases is based on the identity
(a+1)*(a-1) = a*a-1
In this case
sqrt(x*x+1)-1 = (sqrt(x*x+1)-1)*(sqrt(x*x+1)+1)
/(sqrt(x*x+1)+1)
= (x*x+1-1) / (sqrt(x*x+1)+1)
= x*x/(sqrt(x*x+1)+1)
The last formula can be used as an implementation. For very small x, sqrt(x*x+1)+1 will be close to 2 (for small enough x it will be exactly 2), but we don't lose precision in evaluating it.
The problem isn't with running into the minimum value, but with the precision.
As you said yourself, float on your machine has about 7 digits of precision. So let's take x = 1e-7, so that x^2 = 1e-14. That's still well within the range of float, no problems there. But now add 1. The exact answer would be 1.00000000000001. But if we only have 7 digits of precision, this gets rounded to 1.0000000, i.e. exactly 1. So you end up computing sqrt(1.0)-1 which is exactly 0.
One approach would be to use the linear approximation of sqrt around x=1 that sqrt(x) ~ 1+0.5*(x-1). That would lead to the approximation f(x) ~ 0.5*x^2.

How to compare two complex numbers?

In C, complex numbers are built from float or double and have the same problem as the real floating-point types:
#include <stdio.h>
#include <complex.h>
int main(void)
{
double complex a = 0 + I * 0;
double complex b = 1 + I * 1;
for (int i = 0; i < 10; i++) {
a += .1 + I * .1;
}
if (a == b) {
puts("Ok");
}
else {
printf("Fail: %f + i%f != %f + i%f\n", creal(a), cimag(a), creal(b), cimag(b));
}
return 0;
}
The result:
$ clang main.c
$ ./a.out
Fail: 1.000000 + i1.000000 != 1.000000 + i1.000000
I tried this syntax:
a - b < DBL_EPSILON + I * DBL_EPSILON
But the compiler rejects it:
main.c:24:15: error: invalid operands to binary expression ('_Complex double' and '_Complex double')
if (a - b < DBL_EPSILON + I * DBL_EPSILON) {
~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This last version works fine, but it's a little tedious:
fabs(creal(a) - creal(b)) < DBL_EPSILON && fabs(cimag(a) - cimag(b)) < DBL_EPSILON
Comparing 2 complex floating point numbers is much like comparing 2 real floating point numbers.
Comparing for exact equivalences often is insufficient as the numbers involved contain small computational errors.
So rather than if (a == b) code needs to be if (nearlyequal(a,b))
The usual approach is double diff = cabs(a - b) and then comparing diff to some small constant value like DBL_EPSILON.
This fails when a,b are large numbers, as their difference may be many orders of magnitude larger than DBL_EPSILON even though a,b differ only in their least significant bit.
This fails for small numbers too, as the difference between a,b may be relatively great yet many orders of magnitude smaller than DBL_EPSILON, so the test returns true when the values are relatively quite different.
Complex numbers literally add another dimensional problem to the issue as the real and imaginary components themselves may be greatly different. Thus the best answer for nearlyequal(a,b) is highly dependent on the code's goals.
For simplicity, let us use the magnitude of the difference as compared to the average magnitude of a,b. A control constant ULP_N approximates the number of binary digits of least significance that a,b are allowed to differ.
#define ULP_N 4
bool nearlyequal(complex double a, complex double b) {
double diff = cabs(a - b);
double mag = (cabs(a) + cabs(b))/2;
return diff <= (mag * DBL_EPSILON * (1ull << ULP_N));
}
Instead of comparing the complex number components, you can compute the complex absolute value (also known as norm, modulus or magnitude) of their difference, which is the distance between the two on the complex plane:
if (cabs(a - b) < DBL_EPSILON) {
// complex numbers are close
}
Small complex numbers will appear to be close to zero even if there is no precision issue, a separate issue that is also present for real numbers.
Since complex numbers are represented as floating point numbers, you have to deal with their inherent imprecision. Floating point numbers are "close enough" if they're within the machine epsilon.
The usual way is to subtract them, take the absolute value, and see if it's close enough.
#include <complex.h>
#include <stdbool.h>
#include <float.h>
static inline bool ceq(double complex a, double complex b) {
return cabs(a-b) < DBL_EPSILON;
}

Approximation of arcsin in C

I've got a program that calculates the approximation of an arcsin value based on Taylor's series.
My friend and I have come up with an algorithm which has been able to return the almost "right" values, but I don't think we've done it very crisply. Take a look:
double my_asin(double x)
{
double a = 0;
int i = 0;
double sum = 0;
a = x;
for(i = 1; i < 23500; i++)
{
sum += a;
a = next(a, x, i);
}
}
double next(double a, double x, int i)
{
return a*((my_pow(2*i-1, 2)) / ((2*i)*(2*i+1)*my_pow(x, 2)));
}
I checked if my_pow works correctly so there's no need for me to post it here as well. Basically I want the loop to end once the difference between the current and next term is more or equal to my EPSILON (0.00001), which is the precision I'm using when calculating a square root.
This is how I would like it to work:
while(my_abs(prev_term - next_term) >= EPSILON)
But the function double next is dependent on i, so I guess I'd have to increment it in the while statement too. Any ideas how I should go about doing this?
Example output for -1:
$ -1.5675516116e+00
Instead of:
$ -1.5707963268e+00
Thanks so much guys.
Issues with your code and question include:
Your image file showing the Taylor series for arcsin has two errors: There is a minus sign on the x^5 term instead of a plus sign, and the power of x is shown as x^n but should be x^(2n+1).
The x factor in the terms of the Taylor series for arcsin increases by x^2 in each term, but your formula a*((my_pow(2*i-1, 2)) / ((2*i)*(2*i+1)*my_pow(x, 2))) divides by x^2 in each term. This does not matter for the particular value -1 you ask about, but it will produce wrong results for other values, except 1.
You ask how to end the loop once the difference in terms is “more or equal to” your epsilon, but, for most values of x, you actually want less than (or, conversely, you want to continue, not end, while the difference is greater than or equal to, as you show in code).
The Taylor series is a poor way to evaluate functions because its error increases as you get farther from the point around which the series is centered. Most math library implementations of functions like this use a minimax series or something related to it.
Evaluating the series from low-order terms to high-order terms causes you to add larger values first, then smaller values later. Due to the nature of floating-point arithmetic, this means that accuracy from the smaller terms is lost, because it is “pushed out” of the width of the floating-point format by the larger values. This effect will limit how accurate any result can be.
Finally, to get directly to your question, the way you have structured the code, you directly update a, so you never have both the previous term and the next term at the same time. Instead, create another double b so that you have an object b for a previous term and an object a for the current term, as shown below.
Example:
double a = x, b, sum = a;
int i = 0;
do
{
b = a;
a = next(a, x, ++i);
sum += a;
} while (fabs(b-a) > threshold); // fabs from <math.h>; abs() only takes int in C
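Combining the points above (multiplying by x^2 instead of dividing, and stopping on a threshold) could look like this sketch. The name asin_series, the eps parameter and the iteration cap of 100000 are choices made for this example, not part of the original code; the loop here stops on the size of the latest term rather than on the difference of two terms.

```c
#include <math.h>

/* Taylor series for arcsin around 0:
   term_i = term_(i-1) * (2i-1)^2 * x^2 / ((2i) * (2i+1)).
   Stops once the latest term is smaller than eps, or at the cap. */
double asin_series(double x, double eps)
{
    double term = x;   /* first term of the series */
    double sum = x;
    for (int i = 1; i < 100000 && fabs(term) >= eps; i++) {
        term *= ((2.0 * i - 1) * (2.0 * i - 1) * x * x)
              / ((2.0 * i) * (2.0 * i + 1));
        sum += term;
    }
    return sum;
}
```

Near |x| = 1 the terms shrink very slowly, so the cap is what ends the loop there and the result stays inaccurate, which matches the convergence caveats above.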
Using the Taylor series for arcsin is extremely imprecise, as it converges very badly and there will be relatively big differences from the true value for a finite number of terms. Also, using pow with integer exponents is neither very precise nor efficient.
However using arctan for this is OK
arcsin(x) = arctan(x/sqrt(1-(x*x)));
as its Taylor series converges OK on the <0.0,0.8> range, and all the other parts of the range can be computed through it (using trigonometric identities). So here is my C++ implementation (from my arithmetics template):
T atan (const T &x) // = atan(x)
{
bool _shift=false;
bool _invert=false;
bool _negative=false;
T z,dz,x1,x2,a,b; int i;
x1=x; if (x1<0.0) { _negative=true; x1=-x1; }
if (x1>1.0) { _invert=true; x1=1.0/x1; }
if (x1>0.7) { _shift=true; b=::sqrt(3.0)/3.0; x1=(x1-b)/(1.0+(x1*b)); }
x2=x1*x1;
for (z=x1,a=x1,b=1,i=1;i<1000;i++) // if x1>0.8 convergence is slow
{
a*=x2; b+=2; dz=a/b; z-=dz;
a*=x2; b+=2; dz=a/b; z+=dz;
if (::abs(dz)<zero) break;
}
if (_shift) z+=pi/6.0;
if (_invert) z=0.5*pi-z;
if (_negative) z=-z;
return z;
}
T asin (const T &x) // = asin(x)
{
if (x<=-1.0) return -0.5*pi;
if (x>=+1.0) return +0.5*pi;
return ::atan(x/::sqrt(1.0-(x*x)));
}
Where T is any floating point type (float,double,...). As you can see you need sqrt(x), pi=3.141592653589793238462643383279502884197169399375105, zero=1e-20 and +,-,*,/ operations implemented. The zero constant is the target precision.
So just replace T with float/double and ignore the :: ...
so I guess I'd have to increment it in the while statement too
Yes, this might be a way. And what stops you?
int i=0;
while(condition){
//do something
i++;
}
Another way would be using the for condition:
for(i = 1; i < 23500 && my_abs(prev_term - next_term) >= EPSILON; i++)
Your formula is wrong. Here is the correct formula: http://scipp.ucsc.edu/~haber/ph116A/taylor11.pdf.
P.S. Also note that your formula and your series do not correspond to each other.
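You can use a do-while loop like this (fabs comes from <math.h>):
do
{
sum_prev = sum;
sum += a;
a = next(a, x, i++); // advance i together with the term
} while( fabs(sum_prev - sum) > 1e-15 ); // continue while the sum still changes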

Performing arithmetic operations in binary using only bitwise operators [duplicate]

I have to write functions to perform binary subtraction, multiplication, and division without using any arithmetic operators except for loop control. I've only written code in Java before now, so I'm having a hard time wrapping my head around this.
Starting with subtraction, I need to write a function with prototype
int bsub(int x, int y)
I know I need to convert y to two's complement in order to make it negative and add it to x, but I only know how to do this by using one's complement ~ operator and adding 1, but I can't use the + operator.
The badd function was provided, and I will be able to implement it in bsub if I can figure out how to make y a negative number. The code for badd is shown below. Thanks in advance for any tips.
int badd(int x,int y){
int i;
char sum;
char car_in=0;
char car_out;
char a,b;
unsigned int mask=0x00000001;
int result=0;
for(i=0;i<32;i++){
a=(x&mask)!=0;
b=(y&mask)!=0;
car_out=car_in & (a|b) |a&b;
sum=a^b^car_in;
if(sum) {
result|=mask;
}
if(i!=31) {
car_in=car_out;
} else {
if(car_in!=car_out) {
printf("Overflow occurred\n");
}
}
mask<<=1;
}
return result;
}
Well, subtracting with bitwise operations, without the + or - operators, is slightly tricky, but it can be done. You have the basic idea with the complement; the missing piece is adding 1 without using +.
You can do it by first implementing addition with bitwise operations only, and then using that addition both to form the two's complement and to do the subtraction. The code looks like this:
int badd(int n1, int n2){
int carry, sum;
carry = (n1 & n2) << 1; // Find bits that are used for carry
sum = n1 ^ n2; // Add each bit, discard carry.
if (sum & carry) // If bits match, add current sum and carry.
return badd(sum, carry);
else
return sum ^ carry; // Return the sum.
}
int bsub(int n1, int n2){
// Add two's complement and return.
return badd(n1, badd(~n2, 1));
}
And then if we use the above code in an example:
int main(){
printf("%d\n", bsub(53, 17));
return 0;
}
Which ends up returning 36. And that is how subtraction works with bitwise only operations.
Afterwards multiplication and division get more complicated, but can be done; for those two operations, use shifts along with addition and/or subtraction to get the job done. You may also want to read this question and this article on how to do it.
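As a rough sketch of how multiplication and division can be built from shifts plus the bitwise addition, consider the following. The names u_add, u_sub, u_mul and u_div are invented here, and unsigned operands are an assumption of this sketch; u_div expects a nonzero divisor that does not use the top bit.

```c
/* Bitwise addition: XOR gives the sum without carries, AND plus a shift
   gives the carries; repeat until no carry is left. */
unsigned u_add(unsigned a, unsigned b)
{
    while (b) {
        unsigned carry = (a & b) << 1;
        a ^= b;
        b = carry;
    }
    return a;
}

/* Subtraction: add the two's complement of b. */
unsigned u_sub(unsigned a, unsigned b)
{
    return u_add(a, u_add(~b, 1u));
}

/* Shift-and-add multiplication. */
unsigned u_mul(unsigned a, unsigned b)
{
    unsigned result = 0;
    while (b) {
        if (b & 1u)
            result = u_add(result, a);
        a <<= 1;
        b >>= 1;
    }
    return result;
}

/* Long division, one quotient bit per step, from the top bit down. */
unsigned u_div(unsigned n, unsigned d)
{
    unsigned q = 0, r = 0;
    for (unsigned mask = ~(~0u >> 1); mask; mask >>= 1) {
        r = (r << 1) | ((n & mask) ? 1u : 0u);  /* bring down the next bit */
        if (r >= d) {                           /* divisor fits: subtract it */
            r = u_sub(r, d);
            q |= mask;
        }
    }
    return q;
}
```

The relational test r >= d is a comparison rather than an arithmetic operator; whether that is allowed depends on the exact rules of the assignment.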
You have to implement the binary addition first:
Example with 4 bits:
a = 1101
b = 1011
mask will range from 0001 to 1000
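carry = 0; result = 0; mask = 1;
for (i=0;i<4;i++) {
x = (a & mask) != 0; //bit i of a, as 0 or 1
y = (b & mask) != 0; //bit i of b, as 0 or 1
z = x ^ y ^ carry; //XOR of both bits and the previous carry gives the sum bit
if (z) result |= mask; //store the sum bit
carry = (x & y) | (x & carry) | (y & carry); //new carry: at least two of the three bits are set
mask <<= 1; //move to the next bit
}
This is pseudocode. The mask selects one bit at a time, from the least significant bit to the most significant. The sum bits z are collected in result.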
Once you have the addition, you'll be able to implement subtraction by taking the one's complement and adding 1.
Multiplication goes the same way, but is slightly more difficult. Basically it's the same long multiplication method you learned at school, using masks to select bits conveniently and adding the intermediate results using the addition above.
Division is a bit more complicated; it would take some more time to explain, but basically it's the same principle.

Efficient implementation of natural logarithm (ln) and exponentiation

I'm looking for implementation of log() and exp() functions provided in C library <math.h>. I'm working with 8 bit microcontrollers (OKI 411 and 431). I need to calculate Mean Kinetic Temperature. The requirement is that we should be able to calculate MKT as fast as possible and with as little code memory as possible. The compiler comes with log() and exp() functions in <math.h>. But calling either function and linking with the library causes the code size to increase by 5 Kilobytes, which will not fit in one of the micro we work with (OKI 411), because our code already consumed ~12K of available ~15K code memory.
The implementation I'm looking for should not use any other C library functions (like pow(), sqrt() etc). This is because all library functions are packed in one library and even if one function is called, the linker will bring whole 5K library to code memory.
EDIT
The algorithm should be correct up to 3 decimal places.
Using a Taylor series is neither the simplest nor the fastest way of doing this. Most professional implementations use approximating polynomials. I'll show you how to generate one in Maple (a computer algebra program), using the Remez algorithm.
For 3 digits of accuracy execute the following commands in Maple:
with(numapprox):
Digits := 8
minimax(ln(x), x = 1 .. 2, 4, 1, 'maxerror')
maxerror
Its response is the following polynomial:
-1.7417939 + (2.8212026 + (-1.4699568 + (0.44717955 - 0.056570851 * x) * x) * x) * x
With the maximal error of: 0.000061011436
We generated a polynomial which approximates ln(x), but only inside the [1..2] interval. Increasing the interval is not wise, because that would increase the maximal error even more. Instead of that, do the following decomposition:
ln(y) = ln(2^n * x) = n*ln(2) + ln(x), where 1 <= x < 2
So first find the highest power of 2, which is still smaller than the number (See: What is the fastest/most efficient way to find the highest set bit (msb) in an integer in C?). That number is actually the base-2 logarithm. Divide with that value, then the result gets into the 1..2 interval. At the end we will have to add n*ln(2) to get the final result.
An example implementation for numbers >= 1:
float ln(float y) {
int log2;
float divisor, x, result;
log2 = msb((int)y); // See: https://stackoverflow.com/a/4970859/6630230
divisor = (float)(1 << log2);
x = y / divisor; // normalized value between [1.0, 2.0]
result = -1.7417939 + (2.8212026 + (-1.4699568 + (0.44717955 - 0.056570851 * x) * x) * x) * x;
result += ((float)log2) * 0.69314718; // ln(2) = 0.69314718
return result;
}
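The msb helper called above is not shown in the answer. A minimal sketch, assuming it should return the bit position of the highest set bit of a positive integer (so msb(1) is 0 and msb(8) is 3), could be:

```c
/* Hypothetical helper: position of the highest set bit of v, for v >= 1.
   This equals the integer part of log2(v). */
int msb(int v)
{
    int pos = 0;
    while (v >>= 1) {   /* count how many times v can be halved */
        pos++;
    }
    return pos;
}
```

The linked answer discusses faster alternatives; this loop variant is simply the smallest one in code size.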
Although if you plan to use it only in the [1.0, 2.0] interval, then the function is like:
float ln(float x) {
return -1.7417939 + (2.8212026 + (-1.4699568 + (0.44717955 - 0.056570851 * x) * x) * x) * x;
}
The Taylor series for e^x converges extremely quickly, and you can tune your implementation to the precision that you need. (http://en.wikipedia.org/wiki/Taylor_series)
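A small sketch of a Taylor-based e^x without any library calls is shown below. The name exp_taylor, the choice of six series terms and the halving-based range reduction are assumptions made for this example; it targets roughly three decimal places for moderate, finite inputs, and the error grows with each squaring step.

```c
/* e^x from the Taylor series, with range reduction:
   e^x = (e^(x / 2^k))^(2^k), where |x / 2^k| <= 0.5. */
float exp_taylor(float x)
{
    int k = 0;
    while (x > 0.5f || x < -0.5f) {   /* halve x until it is small */
        x *= 0.5f;
        k++;
    }
    float term = 1.0f;                /* x^n / n!, starting at n = 0 */
    float sum = 1.0f;
    for (int n = 1; n <= 6; n++) {
        term *= x / n;
        sum += term;
    }
    while (k-- > 0) {                 /* undo the halving by squaring */
        sum *= sum;
    }
    return sum;
}
```

Tuning the number of terms trades accuracy against code size and run time, which is the point made above.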
The Taylor series for log is not as nice...
If you don't need floating-point math for anything else, you may compute an approximate fractional base-2 log pretty easily. Start by shifting your value left until it's 32768 or higher and store the number of times you did that in count. Then, repeat some number of times (depending upon your desired scale factor):
n = (mult(n,n) + 32768u) >> 16; // If a function is available for 16x16->32 multiply
count<<=1;
if (n < 32768) n*=2; else count+=1;
If the above loop is repeated 8 times, then the log base 2 of the number will be count/256. If ten times, count/1024. If eleven, count/2048. Effectively, this function works by computing the integer power-of-two logarithm of n**(2^reps), but with intermediate values scaled to avoid overflow.
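The steps above can be wrapped into one function roughly as follows. This is a sketch under a few assumptions: the name log2_fixed is made up, the input is a nonzero 16-bit integer, a plain 32-bit multiply stands in for mult, and count starts from the integer part of the logarithm (15 minus the number of shifts) so that the return value is log2(v) times 256.

```c
#include <stdint.h>

/* Fixed-point base-2 logarithm with 8 fractional bits.
   Returns log2(v) * 256 for v in 1..65535; v must not be 0. */
uint16_t log2_fixed(uint16_t v)
{
    uint32_t n = v;
    uint16_t count = 15;              /* integer part if no shift is needed */
    while (n < 32768u) {              /* normalize so that n >= 32768 */
        n <<= 1;
        count--;
    }
    for (int rep = 0; rep < 8; rep++) {
        n = (n * n + 32768u) >> 16;   /* square, rescaled and rounded */
        count <<= 1;
        if (n < 32768u)
            n *= 2;                   /* square below 2: next bit is 0 */
        else
            count += 1;               /* square at least 2: next bit is 1 */
    }
    return count;
}
```

Multiplying the result by ln(2)/256 converts it to a natural logarithm if that is what the formula needs.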
Would a basic table with interpolation between values work? If the range of values is limited (which is likely in your case; I doubt temperature readings have a huge range) and high precision is not required, it may work. It should be easy to test on a normal machine.
Here is one of many topics on table representation of functions: Calculating vs. lookup tables for sine value performance?
Necromancing.
I had to implement logarithms on rational numbers.
This is how I did it:
According to Wikipedia, there is the Halley-Newton approximation method,
which can be used for very high precision.
Using Halley's method on exp(y) - x = 0, the iteration simplifies to y_next = y + 2 * (x - exp(y)) / (x + exp(y)) (see the implementation), which has cubic convergence to ln(x). That is far better than what the Taylor series offers.
// Halley's method applied to exp(y) - x = 0 gives the iteration below,
// which has cubic convergence to ln(x).
public static double ln(double x, double epsilon)
{
double yn = x - 1.0d; // using the first term of the taylor series as initial-value
double yn1 = yn;
do
{
yn = yn1;
yn1 = yn + 2 * (x - System.Math.Exp(yn)) / (x + System.Math.Exp(yn));
} while (System.Math.Abs(yn - yn1) > epsilon);
return yn1;
}
This is C# rather than C, but I'm sure anybody capable of programming in C will be able to derive the C code from it.
Furthermore, since
logn(x) = ln(x)/ln(n).
You have therefore just implemented logN as well.
public static double log(double x, double n, double epsilon)
{
return ln(x, epsilon) / ln(n, epsilon);
}
where epsilon (error) is the minimum precision.
Now as to speed, you're probably better off using the ln that is cast in hardware, but as I said, I used this as a base to implement logarithms on a rational numbers class working with arbitrary precision.
Arbitrary precision might be more important than speed, under certain circumstances.
Then, use the logarithmic identities for rational numbers:
logB(x/y) = logB(x) - logB(y)
In addition to Crouching Kitten's answer which gave me inspiration, you can build a pseudo-recursive (at most 1 self-call) logarithm to avoid using polynomials. In pseudo code
ln(x) :=
If (x <= 0)
return NaN
Else if (!(1 <= x < 2))
return LN2 * b + ln(a)
Else
return taylor_expansion(x - 1)
This is pretty efficient and precise since on [1; 2) the taylor series converges A LOT faster, and we get such a number 1 <= a < 2 with the first call to ln if our input is positive but not in this range.
You can find 'b' as your unbiased exponent from the data held in the float x, and 'a' from the mantissa of the float x (a is exactly the same float as x, but now with exponent biased_0 rather than exponent biased_b). LN2 should be kept as a macro in hexadecimal floating point notation IMO. You can also use http://man7.org/linux/man-pages/man3/frexp.3.html for this.
Also, the trick
unsigned long tmp = *(ulong*)(&d);
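for "memory-casting" a double to an unsigned long, rather than "value-casting" it, is very useful to know when dealing with floats memory-wise, as applying bitwise operators directly to a floating-point value is rejected by the compiler. Note that the pointer cast breaks C's strict-aliasing rule and assumes unsigned long is as wide as double; copying the bytes with memcpy into a 64-bit unsigned integer avoids both problems.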
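Both ideas can be sketched in a few lines of C. The names double_bits and split_for_ln are invented for this example; frexp comes from <math.h>, and the bit pattern shown in the comment assumes IEEE 754 binary64 doubles.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Copies the object representation of d into an integer.
   With IEEE 754 doubles, double_bits(1.0) is 0x3FF0000000000000. */
uint64_t double_bits(double d)
{
    uint64_t u;
    memcpy(&u, &d, sizeof u);
    return u;
}

/* Splits x > 0 into x = a * 2^b with 1 <= a < 2.
   frexp returns a mantissa in [0.5, 1), so it is doubled here
   and the exponent is reduced by one to compensate. */
void split_for_ln(double x, double *a, int *b)
{
    int e;
    double m = frexp(x, &e);
    *a = 2.0 * m;
    *b = e - 1;
}
```

With these values, ln(x) can be assembled as LN2 * b + ln(a), as in the pseudocode above.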
Possible computation of ln(x) and expo(x) in C without <math.h> :
static double expo(double n) {
int a = 0, b = n > 0;
double c = 1, d = 1, e = 1;
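for (b || (n = -n); ; ) {
double prev = e; // keep the old sum: reading and modifying e in one expression is undefined behavior
e += (d *= n) / (c *= ++a); // add the next term n^a / a!
if (!(prev + .00001 < e)) break; // stop once a term adds less than 0.00001
}
// approximately 15 iterations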
return b ? e : 1 / e;
}
static double native_log_computation(const double n) {
// Basic logarithm computation.
static const double euler = 2.7182818284590452354 ;
unsigned a = 0, d;
double b, c, e, f;
if (n > 0) {
for (c = n < 1 ? 1 / n : n; (c /= euler) > 1; ++a);
c = 1 / (c * euler - 1), c = c + c + 1, f = c * c, b = 0;
for (d = 1, c /= 2; e = b, b += 1 / (d * c), b - e/* > 0.0000001 */;)
d += 2, c *= f;
} else b = (n == 0) / 0.;
return n < 1 ? -(a + b) : a + b;
}
static inline double native_ln(const double n) {
// Returns the natural logarithm (base e) of N.
return native_log_computation(n) ;
}
static inline double native_log_base(const double n, const double base) {
// Returns the logarithm (base b) of N.
return native_log_computation(n) / native_log_computation(base) ;
}
Building off @Crouching Kitten's great natural log answer above, if you need it to be accurate for inputs <1 you can add a simple scaling factor. Below is an example in C++ that I've used in microcontrollers. It has a scaling factor of 256 and it's accurate for inputs down to 1/256 = ~0.004, and up to 2^32/256 = 16777216 (due to overflow of a uint32 variable).
It's interesting to note that even on an STMF103 Arm M3 with no FPU, the float implementation below is significantly faster (eg 3x or better) than the 16 bit fixed-point implementation in libfixmath (that being said, this float implementation still takes a few thousand cycles so it's still not ~fast~)
#include <float.h>
#include <stdint.h>
float TempSensor::Ln(float y)
{
// Algo from: https://stackoverflow.com/a/18454010
// Accurate between (1 / scaling factor) < y < (2^32 / scaling factor). Read comments below for more info on how to extend this range
float divisor, x, result;
const float LN_2 = 0.69314718; //pre calculated constant used in calculations
uint32_t log2 = 0;
//handle input that is less than or equal to zero
if (y <= 0)
{
return -FLT_MAX;
}
//scaling factor. The polynomial below is accurate when the input y>1, therefore using a scaling factor of 256 (aka 2^8) extends this to 1/256 or ~0.004. Given use of uint32_t, the input y must stay below 2^24 or 16777216 (aka 2^(32-8)), otherwise uint_y used below will overflow. Increasing the scaling factor will reduce the lower accuracy bound and also reduce the upper overflow bound. If you need the range to be wider, consider changing uint_y to a uint64_t
const uint32_t SCALING_FACTOR = 256;
const float LN_SCALING_FACTOR = 5.545177444; //this is the natural log of the scaling factor and needs to be precalculated
y = y * SCALING_FACTOR;
uint32_t uint_y = (uint32_t)y;
while (uint_y >>= 1) // Convert the number to an integer and then find the location of the MSB. This is the integer portion of Log2(y). See: https://stackoverflow.com/a/4970859/6630230
{
log2++;
}
divisor = (float)(1UL << log2); // unsigned shift: 1 << 31 would overflow a signed int
x = y / divisor; // Find the remainder value between [1.0, 2.0] then calculate the natural log of this remainder using a polynomial approximation
result = -1.7417939 + (2.8212026 + (-1.4699568 + (0.44717955 - 0.056570851 * x) * x) * x) * x; //This polynomial approximates ln(x) between [1,2]
result = result + ((float)log2) * LN_2 - LN_SCALING_FACTOR; // Using the log product rule Log(A) + Log(B) = Log(AB) and the log base change rule log_x(A) = log_y(A)/Log_y(x), calculate all the components in base e and then sum them: = Ln(x_remainder) + (log_2(x_integer) * ln(2)) - ln(SCALING_FACTOR)
return result;
}

Resources