How to compute the digits of an irrational number one by one?

How to compute the digits of an irrational number one by one? - c

I want to read digit by digit the decimals of the sqrt of 5 in C.
The square root of 5 is 2,23606797749979..., so this'd be the expected output:
2
3
6
0
6
7
9
7
7
...
I've found the following code:
#include<stdio.h>
void main()
{
int number;
float temp, sqrt;
printf("Provide the number: \n");
scanf("%d", &number);
// store the half of the given number e.g from 256 => 128
sqrt = number / 2;
temp = 0;
// Iterate until sqrt is different of temp, that is updated on the loop
while(sqrt != temp){
// initially 0, is updated with the initial value of 128
// (on second iteration = 65)
// and so on
temp = sqrt;
// Then, replace values (256 / 128 + 128 ) / 2 = 65
// (on second iteration 34.46923076923077)
// and so on
sqrt = ( number/temp + temp) / 2;
}
printf("The square root of '%d' is '%f'", number, sqrt);
}
But this approach stores the result in a float variable, and I don't want to depend on the limits of the float types, as I would like to extract like 10,000 digits, for instance. I also tried to use the native sqrt() function and casting it to string number using this method, but I faced the same issue.

What you've asked about is a very hard problem, and whether it's even possible to do "one by one" (i.e. without working space requirement that scales with how far out you want to go) depends on both the particular irrational number and the base you want it represented in. For example, in 1995 when a formula for pi was discovered that allows computing the nth binary digit in O(1) space, this was a really big deal. It was not something people expected to be possible.
If you're willing to accept O(n) space, then some cases like the one you mentioned are fairly easy. For example, if you have the first n digits of the square root of a number as a decimal string, you can simply try appending each digit 0 to 9, then squaring the string with long multiplication (same as you learned in grade school), and choosing the last one that doesn't overshoot. Of course this is very slow, but it's simple. The easy way to make it a lot faster (but still asymptotically just as bad) is using an arbitrary-precision math library in place of strings. Doing significantly better requires more advanced approaches and in general may not be possible.

As already noted, you need to change the algorithm into a digit-by-digit one (there are some examples in the Wikipedia page about the methods of computing of the square roots) and use an arbitrary precision arithmetic library to perform the calculations (for instance, GMP).
In the following snippet I implemented the before mentioned algorithm, using GMP (but not the square root function that the library provides). Instead of calculating one decimal digit at a time, this implementation uses a larger base, the greatest multiple of 10 that fits inside an unsigned long, so that it can produce 9 or 18 decimal digits at every iteration.
It also uses an adapted Newton method to find the actual "digit".
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <gmp.h>
unsigned long max_ul(unsigned long a, unsigned long b)
{
return a < b ? b : a;
}
int main(int argc, char *argv[])
{
// The GMP functions accept 'unsigned long int' values as parameters.
// The algorithm implemented here can work with bases other than 10,
// so that it can evaluate more than one decimal digit at a time.
const unsigned long base = sizeof(unsigned long) > 4
? 1000000000000000000
: 1000000000;
const unsigned long decimals_per_digit = sizeof(unsigned long) > 4 ? 18 : 9;
// Extract the number to be square rooted and the desired number of decimal
// digits from the command line arguments. Fallback to 0 in case of errors.
const unsigned long number = argc > 1 ? atoi(argv[1]) : 0;
const unsigned long n_digits = argc > 2 ? atoi(argv[2]) : 0;
// All the variables used by GMP need to be properly initialized before use.
// 'c' is basically the remainder, initially set to the original number
mpz_t c;
mpz_init_set_ui(c, number);
// At every iteration, the algorithm "move to the left" by two "digits"
// the reminder, so it multplies it by base^2.
mpz_t base_squared;
mpz_init_set_ui(base_squared, base);
mpz_mul(base_squared, base_squared, base_squared);
// 'p' stores the digits of the root found so far. The others are helper variables
mpz_t p;
mpz_init_set_ui(p, 0UL);
mpz_t y;
mpz_init(y);
mpz_t yy;
mpz_init(yy);
mpz_t dy;
mpz_init(dy);
mpz_t dx;
mpz_init(dx);
mpz_t pp;
mpz_init(pp);
// Timing, for testing porpuses
clock_t start = clock(), diff;
unsigned long x_max = number;
// Each "digit" correspond to some decimal digits
for (unsigned long i = 0,
last = (n_digits + decimals_per_digit) / decimals_per_digit + 1UL;
i < last; ++i)
{
// Find the greatest x such that: x * (2 * base * p + x) <= c
// where x is in [0, base), using a specialized Newton method
// pp = 2 * base * p
mpz_mul_ui(pp, p, 2UL * base);
unsigned long x = x_max;
for (;;)
{
// y = x * (pp + x)
mpz_add_ui(yy, pp, x);
mpz_mul_ui(y, yy, x);
// dy = y - c
mpz_sub(dy, y, c);
// If y <= c we have found the correct x
if ( mpz_sgn(dy) <= 0 )
break;
// Newton's step: dx = dy/y' where y' = 2 * x + pp
mpz_add_ui(yy, yy, x);
mpz_tdiv_q(dx, dy, yy);
// Update x even if dx == 0 (last iteration)
x -= max_ul(mpz_get_si(dx), 1);
}
x_max = base - 1;
// The actual format of the printed "digits" is up to you
if (i % 4 == 0)
{
if (i == 0)
printf("%lu.", x);
putchar('\n');
}
else
printf("%018lu", x);
// p = base * p + x
mpz_mul_ui(p, p, base);
mpz_add_ui(p, p, x);
// c = (c - y) * base^2
mpz_sub(c, c, y);
mpz_mul(c, c, base_squared);
}
diff = clock() - start;
long int msec = diff * 1000L / CLOCKS_PER_SEC;
printf("\n\nTime taken: %ld.%03ld s\n", msec / 1000, msec % 1000);
// Final cleanup
mpz_clear(c);
mpz_clear(base_squared);
mpz_clear(p);
mpz_clear(pp);
mpz_clear(dx);
mpz_clear(y);
mpz_clear(dy);
mpz_clear(yy);
}
You can see the outputted digits here.

Your title says:
How to compute the digits of an irrational number one by one?
Irrational numbers are not limited to most square roots. They also include numbers of the form log(x), exp(z), sin(y), etc. (transcendental numbers). However, there are some important factors that determine whether or how fast you can compute a given irrational number's digits one by one (that is, from left to right).
Not all irrational numbers are computable; that is, no one has found a way to approximate them to any desired length (whether by a closed form expression, a series, or otherwise).
There are many ways numbers can be expressed, such as by their binary or decimal expansions, as continued fractions, as series, etc. And there are different algorithms to compute a given number's digits depending on the representation.
Some formulas compute a given number's digits in a particular base (such as base 2), not in an arbitrary base.
For example, besides the first formula to extract the digits of π without computing the previous digits, there are other formulas of this type (known as BBP-type formulas) that extract the digits of certain irrational numbers. However, these formulas only work for a particular base, not all BBP-type formulas have a formal proof, and most importantly, not all irrational numbers have a BBP-type formula (essentially, only certain log and arctan constants do, not numbers of the form exp(x) or sqrt(x)).
On the other hand, if you can express an irrational number as a continued fraction (which all real numbers have), you can extract its digits from left to right, and in any base desired, using a specific algorithm. What is more, this algorithm works for any real number constant, including square roots, exponentials (e and exp(x)), logarithms, etc., as long as you know how to express it as a continued fraction. For an implementation see "Digits of pi and Python generators". See also Code to Generate e one Digit at a Time.

Related

How to calculate the log2 of integer in C as precisely as possible with bitwise operations

I need to calculate the entropy and due to the limitations of my system I need to use restricted C features (no loops, no floating point support) and I need as much precision as possible. From here I figure out how to estimate the floor log2 of an integer using bitwise operations. Nevertheless, I need to increase the precision of the results. Since no floating point operations are allowed, is there any way to calculate log2(x/y) with x < y so that the result would be something like log2(x/y)*10000, aiming at getting the precision I need through arithmetic integer?

You will base an algorithm on the formula
log2(x/y) = K*(-log(x/y));
where
K = -1.0/log(2.0); // you can precompute this constant before run-time
a = (y-x)/y;
-log(x/y) = a + a^2/2 + a^3/3 + a^4/4 + a^5/5 + ...
If you write the loop correctly—or, if you prefer, unroll the loop to code the same sequence of operations looplessly—then you can handle everything in integer operations:
(y^N*(1*2*3*4*5*...*N)) * (-log(x/y))
= y^(N-1)*(2*3*4*5*...*N)*(y-x) + y^(N-2)*(1*3*4*5*...*N)*(y-x)^2 + ...
Of course, ^, the power operator, binding tighter than *, is not a C operator, but you can implement that efficiently in the context of your (perhaps unrolled) loop as a running product.
The N is an integer large enough to afford desired precision but not so large that it overruns the number of bits you have available. If unsure, then try N = 6 for instance. Regarding K, you might object that that is a floating-point number, but this is not a problem for you because you are going to precompute K, storing it as a ratio of integers.
SAMPLE CODE
This is a toy code but it works for small values of x and y such as 5 and 7, thus sufficing to prove the concept. In the toy code, larger values can silently overflow the default 64-bit registers. More work would be needed to make the code robust.
#include <stddef.h>
#include <stdlib.h>
// Your program will not need the below headers, which are here
// included only for comparison and demonstration.
#include <math.h>
#include <stdio.h>
const size_t N = 6;
const long long Ky = 1 << 10; // denominator of K
// Your code should define a precomputed value for Kx here.
int main(const int argc, const char *const *const argv)
{
// Your program won't include the following library calls but this
// does not matter. You can instead precompute the value of Kx and
// hard-code its value above with Ky.
const long long Kx = lrintl((-1.0/log(2.0))*Ky); // numerator of K
printf("K == %lld/%lld\n", Kx, Ky);
if (argc != 3) exit(1);
// Read x and y from the command line.
const long long x0 = atoll(argv[1]);
const long long y = atoll(argv[2]);
printf("x/y == %lld/%lld\n", x0, y);
if (x0 <= 0 || y <= 0 || x0 > y) exit(1);
// If 2*x <= y, then, to improve accuracy, double x repeatedly
// until 2*x > y. Each doubling offsets the log2 by 1. The offset
// is to be recovered later.
long long x = x0;
int integral_part_of_log2 = 0;
while (1) {
const long long trial_x = x << 1;
if (trial_x > y) break;
x = trial_x;
--integral_part_of_log2;
}
printf("integral_part_of_log2 == %d\n", integral_part_of_log2);
// Calculate the denominator of -log(x/y).
long long yy = 1;
for (size_t j = N; j; --j) yy *= j*y;
// Calculate the numerator of -log(x/y).
long long xx = 0;
{
const long long y_minus_x = y - x;
for (size_t i = N; i; --i) {
long long term = 1;
size_t j = N;
for (; j > i; --j) {
term *= j*y;
}
term *= y_minus_x;
--j;
for (; j; --j) {
term *= j*y_minus_x;
}
xx += term;
}
}
// Convert log to log2.
xx *= Kx;
yy *= Ky;
// Restore the aforementioned offset.
for (; integral_part_of_log2; ++integral_part_of_log2) xx -= yy;
printf("log2(%lld/%lld) == %lld/%lld\n", x0, y, xx, yy);
printf("in floating point, this ratio of integers works out to %g\n",
(1.0*xx)/(1.0*yy));
printf("the CPU's floating-point unit computes the log2 to be %g\n",
log2((1.0*x0)/(1.0*y)));
return 0;
}
Running this on my machine with command-line arguments of 5 7, it outputs:
K == -1477/1024
x/y == 5/7
integral_part_of_log2 == 0
log2(5/7) == -42093223872/86740254720
in floating point, this ratio of integers works out to -0.485279
the CPU's floating-point unit computes the log2 to be -0.485427
Accuracy would be substantially improved by N = 12 and Ky = 1 << 20, but for that you need either thriftier code or more than 64 bits.
THRIFTIER CODE
Thriftier code, wanting more effort to write, might represent numerator and denominator in prime factors. For example, it might represent 500 as [2 0 3], meaning (22)(30)(53).
Yet further improvements might occur to your imagination.
AN ALTERNATE APPROACH
For an alternate approach, though it might not meet your requirements precisely as you have stated them, #phuclv has given the suggestion I would be inclined to follow if your program were mine: work the problem in reverse, guessing a value c/d for the logarithm and then computing 2^(c/d), presumably via a Newton-Raphson iteration. Personally, I like the Newton-Raphson approach better. See sect. 4.8 here (my original).
MATHEMATICAL BACKGROUND
Several sources including mine already linked explain the Taylor series underlying the first approach and the Newton-Raphson iteration of the second approach. The mathematics unfortunately is nontrivial, but there you have it. Good luck.

Upper bound for number of digits of big integer in different base

I want to create a big integer from string representation and to do that efficiently I need an upper bound on the number of digits in the target base to avoid reallocating memory.
Example:
A 640 bit number has 640 digits in base 2, but only ten digits in base 2^64, so I will have to allocate ten 64 bit integers to hold the result.
The function I am currently using is:
int get_num_digits_in_different_base(int n_digits, double src_base, double dst_base){
return ceil(n_digits*log(src_base)/log(dst_base));
}
Where src_base is in {2, ..., 10 + 26} and dst_base is in {2^8, 2^16, 2^32, 2^64}.
I am not sure if the result will always be correctly rounded though. log2 would be easier to reason about, but I read that older versions of Microsoft Visual C++ do not support that function. It could be emulated like log2(x) = log(x)/log(2) but now I am back where I started.
GMP probably implements a function to do base conversion, but I may not read the source or else I might get GPL cancer so I can not do that.

I imagine speed is of some concern, or else you could just try the floating point-based estimate and adjust if it turned out to be too small. In that case, one can sacrifice tightness of the estimate for speed.
In the following, let dst_base be 2^w, src_base be b, and n_digits be n.
Let k(b,w)=max {j | b^j < 2^w}. This represents the largest power of b that is guaranteed to fit within a w-wide binary (non-negative) integer. Because of the relatively small number of source and destination bases, these values can be precomputed and looked-up in a table, but mathematically k(b,w)=[w log 2/log b] (where [.] denotes the integer part.)
For a given n let m=ceil( n / k(b,w) ). Then the maximum number of dst_base digits required to hold a number less than b^n is:
ceil(log (b^n-1)/log (2^w)) ≤ ceil(log (b^n) / log (2^w) )
≤ ceil( m . log (b^k(b,w)) / log (2^w) ) ≤ m.
In short, if you precalculate the k(b,w) values, you can quickly get an upper bound (which is not tight!) by dividing n by k, rounding up.

I'm not sure about float point rounding in this case, but it is relatively easy to implement this using only integers, as log2 is a classic bit manipulation pattern and integer division can be easily rounded up. The following code is equivalent to yours, but using integers:
// Returns log2(x) rounded up using bit manipulation (not most efficient way)
unsigned int log2(unsigned int x)
{
unsigned int y = 0;
--x;
while (x) {
y++;
x >>= 1;
}
return y;
}
// Returns ceil(a/b) using integer division
unsigned int roundup(unsigned int a, unsigned int b)
{
return (a + b - 1) / b;
}
unsigned int get_num_digits_in_different_base(unsigned int n_digits, unsigned int src_base, unsigned int log2_dst_base)
{
return roundup(n_digits * log2(src_base), log2_dst_base);
}
Please, note that:
This function return different results compared to yours! However, in every case I looked, both were still correct (the smaller value was more accurate, but your requirement is just an upper bound).
The integer version I wrote receives log2_dst_base instead of dst_base to avoid overflow for 2^64.
log2 can be made more efficient using lookup tables.
I've used unsigned int instead of int.

How can you easily calculate the square root of an unsigned long long in C?

I was looking at another question (here) where someone was looking for a way to get the square root of a 64 bit integer in x86 assembly.
This turns out to be very simple. The solution is to convert to a floating point number, calculate the sqrt and then convert back.
I need to do something very similar in C however when I look into equivalents I'm getting a little stuck. I can only find a sqrt function which takes in doubles. Doubles do not have the precision to store large 64bit integers without introducing significant rounding error.
Is there a common math library that I can use which has a long double sqrt function?

There is no need for long double; the square root can be calculated with double (if it is IEEE-754 64-bit binary). The rounding error in converting a 64-bit integer to double is nearly irrelevant in this problem.
The rounding error is at most one part in 253. This causes an error in the square root of at most one part in 254. The sqrt itself has a rounding error of less than one part in 253, due to rounding the mathematical result to the double format. The sum of these errors is tiny; the largest possible square root of a 64-bit integer (rounded to 53 bits) is 232, so an error of three parts in 254 is less than .00000072.
For a uint64_t x, consider sqrt(x). We know this value is within .00000072 of the exact square root of x, but we do not know its direction. If we adjust it to sqrt(x) - 0x1p-20, then we know we have a value that is less than, but very close to, the square root of x.
Then this code calculates the square root of x, truncated to an integer, provided the operations conform to IEEE 754:
uint64_t y = sqrt(x) - 0x1p-20;
if (2*y < x - y*y)
++y;
(2*y < x - y*y is equivalent to (y+1)*(y+1) <= x except that it avoids wrapping the 64-bit integer if y+1 is 232.)

Function sqrtl(), taking a long double, is part of C99.
Note that your compilation platform does not have to implement long double as 80-bit extended-precision. It is only required to be as wide as double, and Visual Studio implements is as a plain double. GCC and Clang do compile long double to 80-bit extended-precision on Intel processors.

Yes, the standard library has sqrtl() (since C99).

If you only want to calculate sqrt for integers, using divide and conquer should find the result in max 32 iterations:
uint64_t mysqrt (uint64_t a)
{
uint64_t min=0;
//uint64_t max=1<<32;
uint64_t max=((uint64_t) 1) << 32; //chux' bugfix
while(1)
{
if (max <= 1 + min)
return min;
uint64_t sqt = min + (max - min)/2;
uint64_t sq = sqt*sqt;
if (sq == a)
return sqt;
if (sq > a)
max = sqt;
else
min = sqt;
}
Debugging is left as exercise for the reader.

Here we collect several observations in order to arrive to a solution:
In standard C >= 1999, it is garanted that non-netative integers have a representation in bits as one would expected for any base-2 number.
----> Hence, we can trust in bit manipulation of this type of numbers.
If x is a unsigned integer type, tnen x >> 1 == x / 2 and x << 1 == x * 2.
(!) But: It is very probable that bit operations shall be done faster than their arithmetical counterparts.
sqrt(x) is mathematically equivalent to exp(log(x)/2.0).
If we consider truncated logarithms and base-2 exponential for integers, we could obtain a fair estimate: IntExp2( IntLog2(x) / 2) "==" IntSqrtDn(x), where "=" is informal notation meaning almost equatl to (in the sense of a good approximation).
If we write IntExp2( IntLog2(x) / 2 + 1) "==" IntSqrtUp(x), we obtain an "above" approximation for the integer square root.
The approximations obtained in (4.) and (5.) are a little rough (they enclose the true value of sqrt(x) between two consecutive powers of 2), but they could be a very well starting point for any algorithm that searchs for the square roor of x.
The Newton algorithm for square root could be work well for integers, if we have a good first approximation to the real solution.
http://en.wikipedia.org/wiki/Integer_square_root
The final algorithm needs some mathematical comprobations to be plenty sure that always work properly, but I will not do it right now... I will show you the final program, instead:
#include <stdio.h> /* For printf()... */
#include <stdint.h> /* For uintmax_t... */
#include <math.h> /* For sqrt() .... */
int IntLog2(uintmax_t n) {
if (n == 0) return -1; /* Error */
int L;
for (L = 0; n >>= 1; L++)
;
return L; /* It takes < 64 steps for long long */
}
uintmax_t IntExp2(int n) {
if (n < 0)
return 0; /* Error */
uintmax_t E;
for (E = 1; n-- > 0; E <<= 1)
;
return E; /* It takes < 64 steps for long long */
}
uintmax_t IntSqrtDn(uintmax_t n) { return IntExp2(IntLog2(n) / 2); }
uintmax_t IntSqrtUp(uintmax_t n) { return IntExp2(IntLog2(n) / 2 + 1); }
int main(void) {
uintmax_t N = 947612934; /* Try here your number! */
uintmax_t sqrtn = IntSqrtDn(N), /* 1st approx. to sqrt(N) by below */
sqrtn0 = IntSqrtUp(N); /* 1st approx. to sqrt(N) by above */
/* The following means while( abs(sqrt-sqrt0) > 1) { stuff... } */
/* However, we take care of subtractions on unsigned arithmetic, just in case... */
while ( (sqrtn > sqrtn0 + 1) || (sqrtn0 > sqrtn+1) )
sqrtn0 = sqrtn, sqrtn = (sqrtn0 + N/sqrtn0) / 2; /* Newton iteration */
printf("N==%llu, sqrt(N)==%g, IntSqrtDn(N)==%llu, IntSqrtUp(N)==%llu, sqrtn==%llu, sqrtn*sqrtn==%llu\n\n",
N, sqrt(N), IntSqrtDn(N), IntSqrtUp(N), sqrtn, sqrtn*sqrtn);
return 0;
}
The last value stored in sqrtn is the integer square root of N.
The last line of the program just shows all the values, with comprobation purposes.
So, you can try different values of Nand see what happens.
If we add a counter inside the while-loop, we'll see that no more than a few iterations happen.
Remark: It is necessary to verify that the condition abs(sqrtn-sqrtn0)<=1 is always achieved when working in the integer-number setting. If not, we shall have to fix the algorithm.
Remark2: In the initialization sentences, observe that sqrtn0 == sqrtn * 2 == sqrtn << 1. This avoids us some calculations.

// sqrt_i64 returns the integer square root of v.
int64_t sqrt_i64(int64_t v) {
uint64_t q = 0, b = 1, r = v;
for( b <<= 62; b > 0 && b > r; b >>= 2);
while( b > 0 ) {
uint64_t t = q + b;
q >>= 1;
if( r >= t ) {
r -= t;
q += b;
}
b >>= 2;
}
return q;
}
The for loop may be optimized by using the clz machine code instruction.

Optimized way to handle extremely large number without using external library

Optimized way to handle the value of n^n (1 ≤ n ≤ 10^9)
I used long long int but it's not good enough as the value might be (1000^1000)
Searched and found the GMP library http://gmplib.org/ and BigInt class but don't wanna use them. I am looking for some numerical method to handle this.
I need to print the first and last k (1 ≤ k ≤ 9) digits of n^n
For the first k digits I am getting it like shown below (it's bit ugly way of doing it)
num = pow(n,n);
while(num){
arr[i++] = num%10;
num /= 10;
digit++;
}
while(digit > 0){
j=digit;
j--;
if(count<k){
printf("%lld",arr[j]);
count++;
}
digit--;
}
and for last k digits am using num % 10^k like below.
findk=pow(10,k);
lastDigits = num % findk;
enter code here
maximum value of k is 9. so i need only 18 digits at max.
I am think of getting those 18 digits without really solving the complete n^n expression.
Any idea/suggestion??

// note: Scope of use is limited.
#include <stdio.h>
long long powerMod(long long a, long long d, long long n){
// a ^ d mod n
long long result = 1;
while(d > 0){
if(d & 1)
result = result * a % n;
a = (a * a) % n;
d >>=1;
}
return result;
}
int main(void){
long long result = powerMod(999, 999, 1000000000);//999^999 mod 10^9
printf("%lld\n", result);//499998999
return 0;
}

Finding the Least Significant Digits (last k digits) are easy because of the property of modular arithmetic, which says: (n*n)%m == (n%m * n%m)%m, so the code shown by BLUEPIXY which followed exponentiation by squaring method will work well for finding k LSDs.
Now, Most Significant Digits (1st k digits) of N^N can be found in this way:
We know,
N^N = 10^(N log N)
So if you calculate N log (N) you will get a number of this format xxxx.yyyy, now we have to use this number as a power of 10, it is easily understandable that xxxx or integer part of the number will add xxxx zeros after 10, which is not important for you! That means, if you calculate 10^0.yyyy, you will get those significants digits you are looking for.
So the solution will be something like this:
double R = N * log10 (N);
R = R - (long long) R; //so taking only the fractional part
double V = pow(10, R);
int powerK = 1;
for (int i=0; i<k; i++) powerK *=10;
V *= powerK;
//Now Print the 1st K digits from V

Why don't you want to use bigint libraries?
bignum arithmetic is very hard to do right and efficiently. You could still get a PhD by working on that subject.
Fist, bigint arithmetic have non-trivial algorithmics
Then, bigint implementations usually need some machine instructions (like add with carry) which are not easily accessible in plain C.
For your specific problem (first and last few digits of NN) you'll better also reason on paper (using arithmetic theorems) to lower the complexity. I am not an expert, but I guess that still remains intractable, perhaps with a complexity worse than O(N)

Efficient implementation of natural logarithm (ln) and exponentiation

I'm looking for implementation of log() and exp() functions provided in C library <math.h>. I'm working with 8 bit microcontrollers (OKI 411 and 431). I need to calculate Mean Kinetic Temperature. The requirement is that we should be able to calculate MKT as fast as possible and with as little code memory as possible. The compiler comes with log() and exp() functions in <math.h>. But calling either function and linking with the library causes the code size to increase by 5 Kilobytes, which will not fit in one of the micro we work with (OKI 411), because our code already consumed ~12K of available ~15K code memory.
The implementation I'm looking for should not use any other C library functions (like pow(), sqrt() etc). This is because all library functions are packed in one library and even if one function is called, the linker will bring whole 5K library to code memory.
EDIT
The algorithm should be correct up to 3 decimal places.

Using Taylor series is not the simplest neither the fastest way of doing this. Most professional implementations are using approximating polynomials. I'll show you how to generate one in Maple (it is a computer algebra program), using the Remez algorithm.
For 3 digits of accuracy execute the following commands in Maple:
with(numapprox):
Digits := 8
minimax(ln(x), x = 1 .. 2, 4, 1, 'maxerror')
maxerror
Its response is the following polynomial:
-1.7417939 + (2.8212026 + (-1.4699568 + (0.44717955 - 0.056570851 * x) * x) * x) * x
With the maximal error of: 0.000061011436
We generated a polynomial which approximates the ln(x), but only inside the [1..2] interval. Increasing the interval is not wise, because that would increase the maximal error even more. Instead of that, do the following decomposition:
So first find the highest power of 2, which is still smaller than the number (See: What is the fastest/most efficient way to find the highest set bit (msb) in an integer in C?). That number is actually the base-2 logarithm. Divide with that value, then the result gets into the 1..2 interval. At the end we will have to add n*ln(2) to get the final result.
An example implementation for numbers >= 1:
float ln(float y) {
int log2;
float divisor, x, result;
log2 = msb((int)y); // See: https://stackoverflow.com/a/4970859/6630230
divisor = (float)(1 << log2);
x = y / divisor; // normalized value between [1.0, 2.0]
result = -1.7417939 + (2.8212026 + (-1.4699568 + (0.44717955 - 0.056570851 * x) * x) * x) * x;
result += ((float)log2) * 0.69314718; // ln(2) = 0.69314718
return result;
}
Although if you plan to use it only in the [1.0, 2.0] interval, then the function is like:
float ln(float x) {
return -1.7417939 + (2.8212026 + (-1.4699568 + (0.44717955 - 0.056570851 * x) * x) * x) * x;
}

The Taylor series for e^x converges extremely quickly, and you can tune your implementation to the precision that you need. (http://en.wikipedia.org/wiki/Taylor_series)
The Taylor series for log is not as nice...

If you don't need floating-point math for anything else, you may compute an approximate fractional base-2 log pretty easily. Start by shifting your value left until it's 32768 or higher and store the number of times you did that in count. Then, repeat some number of times (depending upon your desired scale factor):
n = (mult(n,n) + 32768u) >> 16; // If a function is available for 16x16->32 multiply
count<<=1;
if (n < 32768) n*=2; else count+=1;
If the above loop is repeated 8 times, then the log base 2 of the number will be count/256. If ten times, count/1024. If eleven, count/2048. Effectively, this function works by computing the integer power-of-two logarithm of n**(2^reps), but with intermediate values scaled to avoid overflow.

Would basic table with interpolation between values approach work? If ranges of values are limited (which is likely for your case - I doubt temperature readings have huge range) and high precisions is not required it may work. Should be easy to test on normal machine.
Here is one of many topics on table representation of functions: Calculating vs. lookup tables for sine value performance?

Necromancing.
I had to implement logarithms on rational numbers.
This is how I did it:
Occording to Wikipedia, there is the Halley-Newton approximation method
which can be used for very-high precision.
Using Newton's method, the iteration simplifies to (implementation), which has cubic convergence to ln(x), which is way better than what the Taylor-Series offers.
// Using Newton's method, the iteration simplifies to (implementation)
// which has cubic convergence to ln(x).
public static double ln(double x, double epsilon)
{
double yn = x - 1.0d; // using the first term of the taylor series as initial-value
double yn1 = yn;
do
{
yn = yn1;
yn1 = yn + 2 * (x - System.Math.Exp(yn)) / (x + System.Math.Exp(yn));
} while (System.Math.Abs(yn - yn1) > epsilon);
return yn1;
}
This is not C, but C#, but I'm sure anybody capable to program in C will be able to deduce the C-Code from that.
Furthermore, since
logn(x) = ln(x)/ln(n).
You have therefore just implemented logN as well.
public static double log(double x, double n, double epsilon)
{
return ln(x, epsilon) / ln(n, epsilon);
}
where epsilon (error) is the minimum precision.
Now as to speed, you're probably better of using the ln-cast-in-hardware, but as I said, I used this as a base to implement logarithms on a rational numbers class working with arbitrary precision.
Arbitrary precision might be more important than speed, under certain circumstances.
Then, use the logarithmic identities for rational numbers:
logB(x/y) = logB(x) - logB(y)

In addition to Crouching Kitten's answer which gave me inspiration, you can build a pseudo-recursive (at most 1 self-call) logarithm to avoid using polynomials. In pseudo code
ln(x) :=
If (x <= 0)
return NaN
Else if (!(1 <= x < 2))
return LN2 * b + ln(a)
Else
return taylor_expansion(x - 1)
This is pretty efficient and precise since on [1; 2) the taylor series converges A LOT faster, and we get such a number 1 <= a < 2 with the first call to ln if our input is positive but not in this range.
You can find 'b' as your unbiased exponent from the data held in the float x, and 'a' from the mantissa of the float x (a is exactly the same float as x, but now with exponent biased_0 rather than exponent biased_b). LN2 should be kept as a macro in hexadecimal floating point notation IMO. You can also use http://man7.org/linux/man-pages/man3/frexp.3.html for this.
Also, the trick
unsigned long tmp = *(ulong*)(&d);
for "memory-casting" double to unsigned long, rather than "value-casting", is very useful to know when dealing with floats memory-wise, as bitwise operators will cause warnings or errors depending on the compiler.

Possible computation of ln(x) and expo(x) in C without <math.h> :
static double expo(double n) {
int a = 0, b = n > 0;
double c = 1, d = 1, e = 1;
for (b || (n = -n); e + .00001 < (e += (d *= n) / (c *= ++a)););
// approximately 15 iterations
return b ? e : 1 / e;
}
static double native_log_computation(const double n) {
// Basic logarithm computation.
static const double euler = 2.7182818284590452354 ;
unsigned a = 0, d;
double b, c, e, f;
if (n > 0) {
for (c = n < 1 ? 1 / n : n; (c /= euler) > 1; ++a);
c = 1 / (c * euler - 1), c = c + c + 1, f = c * c, b = 0;
for (d = 1, c /= 2; e = b, b += 1 / (d * c), b - e/* > 0.0000001 */;)
d += 2, c *= f;
} else b = (n == 0) / 0.;
return n < 1 ? -(a + b) : a + b;
}
static inline double native_ln(const double n) {
// Returns the natural logarithm (base e) of N.
return native_log_computation(n) ;
}
static inline double native_log_base(const double n, const double base) {
// Returns the logarithm (base b) of N.
return native_log_computation(n) / native_log_computation(base) ;
}
Try it Online

Building off #Crouching Kitten's great natural log answer above, if you need it to be accurate for inputs <1 you can add a simple scaling factor. Below is an example in C++ that i've used in microcontrollers. It has a scaling factor of 256 and it's accurate to inputs down to 1/256 = ~0.04, and up to 2^32/256 = 16777215 (due to overflow of a uint32 variable).
It's interesting to note that even on an STMF103 Arm M3 with no FPU, the float implementation below is significantly faster (eg 3x or better) than the 16 bit fixed-point implementation in libfixmath (that being said, this float implementation still takes a few thousand cycles so it's still not ~fast~)
#include <float.h>
float TempSensor::Ln(float y)
{
// Algo from: https://stackoverflow.com/a/18454010
// Accurate between (1 / scaling factor) < y < (2^32 / scaling factor). Read comments below for more info on how to extend this range
float divisor, x, result;
const float LN_2 = 0.69314718; //pre calculated constant used in calculations
uint32_t log2 = 0;
//handle if input is less than zero
if (y <= 0)
{
return -FLT_MAX;
}
//scaling factor. The polynomial below is accurate when the input y>1, therefore using a scaling factor of 256 (aka 2^8) extends this to 1/256 or ~0.04. Given use of uint32_t, the input y must stay below 2^24 or 16777216 (aka 2^(32-8)), otherwise uint_y used below will overflow. Increasing the scaing factor will reduce the lower accuracy bound and also reduce the upper overflow bound. If you need the range to be wider, consider changing uint_y to a uint64_t
const uint32_t SCALING_FACTOR = 256;
const float LN_SCALING_FACTOR = 5.545177444; //this is the natural log of the scaling factor and needs to be precalculated
y = y * SCALING_FACTOR;
uint32_t uint_y = (uint32_t)y;
while (uint_y >>= 1) // Convert the number to an integer and then find the location of the MSB. This is the integer portion of Log2(y). See: https://stackoverflow.com/a/4970859/6630230
{
log2++;
}
divisor = (float)(1 << log2);
x = y / divisor; // FInd the remainder value between [1.0, 2.0] then calculate the natural log of this remainder using a polynomial approximation
result = -1.7417939 + (2.8212026 + (-1.4699568 + (0.44717955 - 0.056570851 * x) * x) * x) * x; //This polynomial approximates ln(x) between [1,2]
result = result + ((float)log2) * LN_2 - LN_SCALING_FACTOR; // Using the log product rule Log(A) + Log(B) = Log(AB) and the log base change rule log_x(A) = log_y(A)/Log_y(x), calculate all the components in base e and then sum them: = Ln(x_remainder) + (log_2(x_integer) * ln(2)) - ln(SCALING_FACTOR)
return result;
}

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight