Related
I am trying to write a program that outputs the number of the digits in the decimal portion of a given number (0.128).
I made the following program:
#include <stdio.h>
#include <math.h>
int main(){
float result = 0;
int count = 0;
int exp = 0;
for(exp = 0; int(1+result) % 10 != 0; exp++)
{
result = 0.128 * pow(10, exp);
count++;
}
printf("%d \n", count);
printf("%f \n", result);
return 0;
}
What I had in mind was that exp keeps being incremented until int(1+result) % 10 outputs 0. So for example when result = 0.128 * pow(10,4) = 1280, result mod 10 (int(1+result) % 10) will output 0 and the loop will stop.
I know that on a bigger scale this method is still inefficient since if result was a given input like 1.1208 the program would basically stop at one digit short of the desired value; however, I am trying to first find out the reason why I'm facing the current issue.
My Issue: The loop won't just stop at 1280; it keeps looping until its value reaches 128000000.000000.
Here is the output when I run the program:
10
128000000.000000
Apologies if my description is vague, any given help is very much appreciated.
I am trying to write a program that outputs the number of the digits in the decimal portion of a given number (0.128).
This task is basically impossible, because on a conventional (binary) machine the goal is not meaningful.
If I write
float f = 0.128;
printf("%f\n", f);
I see
0.128000
and I might conclude that 0.128 has three digits. (Never mind about the three 0's.)
But if I then write
printf("%.15f\n", f);
I see
0.128000006079674
Wait a minute! What's going on? Now how many digits does it have?
It's customary to say that floating-point numbers are "not accurate" or that they suffer from "roundoff error". But in fact, floating-point numbers are, in their own way, perfectly accurate — it's just that they're accurate in base two, not the base 10 we're used to thinking about.
The surprising fact is that most decimal (base 10) fractions do not exist as finite binary fractions. This is similar to the way that the number 1/3 does not even exist as a finite decimal fraction. You can approximate 1/3 as 0.333 or 0.3333333333 or 0.33333333333333333333, but without an infinite number of 3's it's only an approximation. Similarly, you can approximate 1/10 in base 2 as 0b0.00011 or 0b0.000110011 or 0b0.000110011001100110011001100110011, but without an infinite number of 0011's it, too, is only an approximation. (That last rendition, with 33 bits past the binary point, works out to about 0.0999999999767.)
And it's the same with most decimal fractions you can think of, including 0.128. So when I wrote
float f = 0.128;
what I actually got in f was the binary number 0b0.00100000110001001001101111, which in decimal is exactly 0.12800000607967376708984375.
Once a number has been stored as a float (or a double, for that matter) it is what it is: there is no way to rediscover that it was initially initialized from a "nice, round" decimal fraction like 0.128. And if you try to "count the number of decimal digits", and if your code does a really precise job, you're liable to get an answer of 26 (that is, corresponding to the digits "12800000607967376708984375"), not 3.
P.S. If you were working with computer hardware that implemented decimal floating point, this problem's goal would be meaningful, possible, and tractable. And implementations of decimal floating point do exist. But the ordinary float and double values any of is likely to use on any of today's common, mass-market computers are invariably going to be binary (specifically, conforming to IEEE-754).
P.P.S. Above I wrote, "what I actually got in f was the binary number 0b0.00100000110001001001101111". And if you count the number of significant bits there — 100000110001001001101111 — you get 24, which is no coincidence at all. You can read at single precision floating-point format that the significand portion of a float has 24 bits (with 23 explicitly stored), and here, you're seeing that in action.
float vs. code
A binary float cannot encode 0.128 exactly as it is not a dyadic rational.
Instead, it takes on a nearby value: 0.12800000607967376708984375. 26 digits.
Rounding errors
OP's approach incurs rounding errors in result = 0.128 * pow(10, exp);.
Extended math needed
The goal is difficult. Example: FLT_TRUE_MIN takes about 149 digits.
We could use double or long double to get us somewhat there.
Simply multiply the fraction by 10.0 in each step.
d *= 10.0; still incurs rounding errors, but less so than OP's approach.
#include <stdio.h>
#include <math.h> int main(){
int count = 0;
float f = 0.128f;
double d = f - trunc(f);
printf("%.30f\n", d);
while (d) {
d *= 10.0;
double ipart = trunc(d);
printf("%.0f", ipart);
d -= ipart;
count++;
}
printf("\n");
printf("%d \n", count);
return 0;
}
Output
0.128000006079673767089843750000
12800000607967376708984375
26
Usefulness
Typically, past FLT_DECMAL_DIG (9) or so significant decimal places, OP’s goal is usually not that useful.
As others have said, the number of decimal digits is meaningless when using binary floating-point.
But you also have a flawed termination condition. The loop test is (int)(1+result) % 10 != 0 meaning that it will stop whenever we reach an integer whose last digit is 9.
That means that 0.9, 0.99 and 0.9999 all give a result of 2.
We also lose precision by truncating the double value we start with by storing into a float.
The most useful thing we could do is terminate when the remaining fractional part is less than the precision of the type used.
Suggested working code:
#include <math.h>
#include <float.h>
#include <stdio.h>
int main(void)
{
double val = 0.128;
double prec = DBL_EPSILON;
double result;
int count = 0;
while (fabs(modf(val, &result)) > prec) {
++count;
val *= 10;
prec *= 10;
}
printf("%d digit(s): %0*.0f\n", count, count, result);
}
Results:
3 digit(s): 128
Suppose I have a floating-point value of type float or double (i.e. 32 or 64 bits on typical machines). I want to print this value as text (e.g. to the standard output stream), and then later, in some other process, scan it back in - with fscanf() if I'm using C, or perhaps with istream::operator>>() if I'm using C++. But - I need the scanned float to end up being exactly, identical to the original value (up to equivalent representations of the same value). Also, the printed value should be easily readable - to a human - as floating-point, i.e. I don't want to print 0x42355316 and reinterpret that as a 32-bit float.
How should I do this? I'm assuming the standard library of (C and C++) won't be sufficient, but perhaps I'm wrong. I suppose that a sufficient number of decimal digits might be able to guarantee an error that's underneath the precision threshold - but that's not the same as guaranteeing the rounding/truncation will happen just the way I want it.
Notes:
The scanning does not having to be perfectly accurate w.r.t. the value it scans, only the original value.
If it makes it easier, you may assume the value is a number and is not infinity.
denormal support is desired but not required; still if we get a denormal, failure should be conspicuous.
First, you should use the %a format with fprintf and fscanf. This is what it was designed for, and the C standard requires it to work (reproduce the original number) if the implementation uses binary floating-point.
Failing that, you should print a float with at least FLT_DECIMAL_DIG significant digits and a double with at least DBL_DECIMAL_DIG significant digits. Those constants are defined in <float.h> and are defined:
… number of decimal digits, n, such that any floating-point number with p radix b digits can be rounded to a floating-point number with n decimal digits and back again without change to the value,… [b is the base used for the floating-point format, defined in FLT_RADIX, and p is the number of base-b digits in the format.]
For example:
printf("%.*g\n", FLT_DECIMAL_DIG, 1.f/3);
or:
#define QuoteHelper(x) #x
#define Quote(x) QuoteHelper(x)
…
printf("%." Quote(FLT_DECIMAL_DIG) "g\n", 1.f/3);
In C++, these constants are defined in <limits> as std::numeric_limits<Type>::max_digits10, where Type is float or double or another floating-point type.
Note that the C standard only recommends that such a round-trip through a decimal numeral work; it does not require it. For example, C 2018 5.2.4.2.2 15 says, under the heading “Recommended practice”:
Conversion from (at least) double to decimal with DECIMAL_DIG digits and back should be the identity function. [DECIMAL_DIG is the equivalent of FLT_DECIMAL_DIG or DBL_DECIMAL_DIG for the widest floating-point format supported in the implementation.]
In contrast, if you use %a, and FLT_RADIX is a power of two (meaning the implementation uses a floating-point base that is two, 16, or another power of two), then C standard requires that the result of scanning the numeral produced with %a equals the original number.
I need the scanned float to end up being exactly, identical to the original value.
As already pointed out in the other answers, that can be achieved with the %a format specifier.
Also, the printed value should be easily readable - to a human - as floating-point, i.e. I don't want to print 0x42355316 and reinterpret that as a 32-bit float.
That's more tricky and subjective. The first part of the string that %a produces is in fact a fraction composed by hexadecimal digits, so that an output like 0x1.4p+3 may take some time to be parsed as 10 by a human reader.
An option could be to print all the decimal digits needed to represent the floating-point value, but there may be a lot of them. Consider, for example the value 0.1, its closest representation as a 64-bit float may be
0x1.999999999999ap-4 == 0.1000000000000000055511151231257827021181583404541015625
While printf("%.*lf\n", DBL_DECIMAL_DIG, 01); (see e.g. Eric's answer) would print
0.10000000000000001 // If DBL_DECIMAL_DIG == 17
My proposal is somewhere in the middle. Similarly to what %a does, we can exactly represent any floating-point value with radix 2 as a fraction multiplied by 2 raised to some integer power. We can transform that fraction into a whole number (increasing the exponent accordingly) and print it as a decimal value.
0x1.999999999999ap-4 --> 1.999999999999a16 * 2-4 --> 1999999999999a16 * 2-56
--> 720575940379279410 * 2-56
That whole number has a limited number of digits (it's < 253), but the result it's still an exact representation of the original double value.
The following snippet is a proof of concept, without any check for corner cases. The format specifier %a separates the mantissa and the exponent with a p character (as in "... multiplied by two raised to the Power of..."), I'll use a q instead, for no particular reason other than using a different symbol.
The value of the mantissa will also be reduced (and the exponent raised accordingly), removing all the trailing zero-bits. The idea beeing that 5q+1 (parsed as 510 * 21) should be more "easily" identified as 10, rather than 2814749767106560q-48.
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void to_my_format(double x, char *str)
{
int exponent;
double mantissa = frexp(x, &exponent);
long long m = 0;
if ( mantissa ) {
exponent -= 52;
m = (long long)scalbn(mantissa, 52);
// A reduced mantissa should be more readable
while (m && m % 2 == 0) {
++exponent;
m /= 2;
}
}
sprintf(str, "%lldq%+d", m, exponent);
// ^
// Here 'q' is used to separate the mantissa from the exponent
}
double from_my_format(char const *str)
{
char *end;
long long mantissa = strtoll(str, &end, 10);
long exponent = strtol(str + (end - str + 1), &end, 10);
return scalbn(mantissa, exponent);
}
int main(void)
{
double tests[] = { 1, 0.5, 2, 10, -256, acos(-1), 1000000, 0.1, 0.125 };
size_t n = (sizeof tests) / (sizeof *tests);
char num[32];
for ( size_t i = 0; i < n; ++i ) {
to_my_format(tests[i], num);
double x = from_my_format(num);
printf("%22s%22a ", num, tests[i]);
if ( tests[i] != x )
printf(" *** %22a *** Round-trip failed\n", x);
else
printf("%58.55g\n", x);
}
return 0;
}
Testable here.
Generally, the improvement in readability is admitedly little to none, surely a matter of opinion.
You can use the %a format specifier to print the value as hexadecimal floating point. Note that this is not the same as reinterpreting the float as an integer and printing the integer value.
For example:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
float x;
scanf("%f", &x);
printf("x=%.7f\n", x);
char str[20];
sprintf(str, "%a", x);
printf("str=%s\n", str);
float y;
sscanf(str, "%f", &y);
printf("y=%.7f\n", y);
printf("x==y: %d\n", (x == y));
return 0;
}
With an input of 4, this outputs:
x=4.0000000
str=0x1p+2
y=4.0000000
x==y: 1
With an input of 3.3, this outputs:
x=3.3000000
str=0x1.a66666p+1
y=3.3000000
x==y: 1
As you can see from the output, the %a format specifier prints in exponential format with the significand in hex and the exponent in decimal. This format can then be converted directly back to the exact same value as demonstrated by the equality check.
I've made a program in C that takes two inputs, x and n, and raises x to the power of n. 10^10 doesn't work, what happened?
#include <cs50.h>
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
float isEven(int n)
{
return n % 2 == 0;
}
float isOdd(int n)
{
return !isEven(n);
}
float power(int x, int n)
{
// base case
if (n == 0)
{
return 1;
}
// recursive case: n is negative
else if (n < 0)
{
return (1 / power(x, -n));
}
// recursive case: n is odd
else if (isOdd(n))
{
return x * power(x, n-1);
}
// recursive case: n is positive and even
else if (isEven(n))
{
int y = power(x, n/2);
return y * y;
}
return true;
}
int displayPower(int x, int n)
{
printf("%d to the %d is %f", x, n, power(x, n));
return true;
}
int main(void)
{
int x = 0;
printf("What will be the base number?");
scanf("%d", &x);
int n = 0;
printf("What will be the exponent?");
scanf("%d", &n);
displayPower(x, n);
}
For example, here is a pair of inputs that works:
./exponentRecursion
What will be the base number?10
What will be the exponent?9
10 to the 9 is 1000000000.000000
But this is what I get for 10^10:
./exponentRecursion
What will be the base number?10
What will be the exponent?10
10 to the 10 is 1410065408.000000
Why does this write such a weird number?
BTW, 10^11 returns 14100654080.000000, exactly ten times the above.
Perhaps it may be that there is some "Limit" to the data type that I am using? I am not sure.
Your variable x is an int type. The most common internal representation of that is 32 bits. That a signed binary number, so only 31 bits are available for representing a magnitude, with the usual maximum positive int value being 2^31 - 1 = 2,147,483,647. Anything larger that that will overflow, giving a smaller magnitude and possibly a negative sign.
For a greater range, you can change the type of x to long long (usually 64 bits--about 18 digits) or double (usually 64 bits, with 51 bits of precision for about 15 digits).
(Warning: Many implementations use the same representation for int and long, so using long might not be an improvement.)
A float only has enough precision for about 7 decimal digits. Any number with more digits than that will only be an approximations.
If you switch to double you'll get about 16 digits of precision.
When you start handling large numbers with the basic data types in C, you can run into trouble.
Integral types have a limited range of values (such as 4x109 for a 32-bit unsigned integer). Floating point type haver a much larger range (though not infinite) but limited precision. For example, IEEE754 double precision can give you about 16 decimal digits of precision in the range +/-10308
To recover both of these aspects, you'll need to use a bignum library of some sort, such as MPIR.
If you are mixing different data types in a C program, there are several implicit casts done by the compiler. As there are strong rules how the compiler works one can exactly figure out, what happens to your program and why.
As I do not know all of this casting rules, I did the following: Estimating the maximum of precision needed for the biggest result. Then casting explicit every variable and funktion in the process to this precision, even if it is not necessary. Normally this will work like a workarount.
What's going on here:
#include <stdio.h>
#include <math.h>
int main(void) {
printf("17^12 = %lf\n", pow(17, 12));
printf("17^13 = %lf\n", pow(17, 13));
printf("17^14 = %lf\n", pow(17, 14));
}
I get this output:
17^12 = 582622237229761.000000
17^13 = 9904578032905936.000000
17^14 = 168377826559400928.000000
13 and 14 do not match with wolfram alpa cf:
12: 582622237229761.000000
582622237229761
13: 9904578032905936.000000
9904578032905937
14: 168377826559400928.000000
168377826559400929
Moreover, it's not wrong by some strange fraction - it's wrong by exactly one!
If this is down to me reaching the limits of what pow() can do for me, is there an alternative that can calculate this? I need a function that can calculate x^y, where x^y is always less than ULLONG_MAX.
pow works with double numbers. These represent numbers of the form s * 2^e where s is a 53 bit integer. Therefore double can store all integers below 2^53, but only some integers above 2^53. In particular, it can only represent even numbers > 2^53, since for e > 0 the value is always a multiple of 2.
17^13 needs 54 bits to represent exactly, so e is set to 1 and hence the calculated value becomes even number. The correct value is odd, so it's not surprising it's off by one. Likewise, 17^14 takes 58 bits to represent. That it too is off by one is a lucky coincidence (as long as you don't apply too much number theory), it just happens to be one off from a multiple of 32, which is the granularity at which double numbers of that magnitude are rounded.
For exact integer exponentiation, you should use integers all the way. Write your own double-free exponentiation routine. Use exponentiation by squaring if y can be large, but I assume it's always less than 64, making this issue moot.
The numbers you get are too big to be represented with a double accurately. A double-precision floating-point number has essentially 53 significant binary digits and can represent all integers up to 2^53 or 9,007,199,254,740,992.
For higher numbers, the last digits get truncated and the result of your calculation is rounded to the next number that can be represented as a double. For 17^13, which is only slightly above the limit, this is the closest even number. For numbers greater than 2^54 this is the closest number that is divisible by four, and so on.
If your input arguments are non-negative integers, then you can implement your own pow.
Recursively:
unsigned long long pow(unsigned long long x,unsigned int y)
{
if (y == 0)
return 1;
if (y == 1)
return x;
return pow(x,y/2)*pow(x,y-y/2);
}
Iteratively:
unsigned long long pow(unsigned long long x,unsigned int y)
{
unsigned long long res = 1;
while (y--)
res *= x;
return res;
}
Efficiently:
unsigned long long pow(unsigned long long x,unsigned int y)
{
unsigned long long res = 1;
while (y > 0)
{
if (y & 1)
res *= x;
y >>= 1;
x *= x;
}
return res;
}
A small addition to other good answers: under x86 architecture there is usually available x87 80-bit extended format, which is supported by most C compilers via the long double type. This format allows to operate with integer numbers up to 2^64 without gaps.
There is analogue of pow() in <math.h> which is intended for operating with long double numbers - powl(). It should also be noticed that the format specifier for the long double values is other than for double ones - %Lf. So the correct program using the long double type looks like this:
#include <stdio.h>
#include <math.h>
int main(void) {
printf("17^12 = %Lf\n", powl(17, 12));
printf("17^13 = %Lf\n", powl(17, 13));
printf("17^14 = %Lf\n", powl(17, 14));
}
As Stephen Canon noted in comments there is no guarantee that this program should give exact result.
How do I convert 30.8365146 into two integers, 30 and 8365146, for example, in Arduino or C?
This problem faces me when I try to send GPS data via XBee series 1 which don't allow to transmit fraction numbers, so I decided to split the data into two parts. How can I do this?
I have tried something like this:
double num=30.233;
int a,b;
a = floor(num);
b = (num-a) * pow(10,3);
The output is 30 and 232! The output is not 30 and 233. Why and how can I fix it?
double value = 30.8365146;
int left_part, right_part;
char buffer[50];
sprintf(buffer, "%lf", value);
sscanf(buffer, "%d.%d", &left_part, &right_part);
and you will get left/right parts separately stored in integers.
P.S. the other solution is to just multiply your number by some power of 10 and send as an integer.
You can output the integer to a char array using sprintf, then replace the '.' with a space and read back two integers using sscanf.
I did it for float, using double as temporary:
int fract(float raw) {
static int digits = std::numeric_limits<double>::digits10 - std::numeric_limits<float>::digits10 - 1;
float intpart;
double fract = static_cast<double>(modf(raw, &intpart));
fract = fract*pow(10, digits - 1);
return floor(fract);
}
I imagine that you could use quadruple-precision floating-point format to achieve the same for double: libquadmath.
The 30 can just be extracted by rounding down (floor(x) in math.h).
The numbers behind the decimal point are a bit more tricky, since the number is most likely stored as a binary number internally, this might not translate nicely into the number you're looking for, especially if floating point-math is involved. You're best bet would probably be to convert the number to a string, and then extract the data from that string.
As in the comments, you need to keep track of the decimal places. You can't do a direct conversion to integer. A bit of code that would do something like this:
#include <stdio.h>
#include <math.h>
#define PLACES 3
void extract(double x)
{
char buf[PLACES+10];
int a, b;
sprintf(buf, "%.*f", PLACES, x);
sscanf(buf, "%d.%d", &a, &b);
int n = (int) pow(10, PLACES);
printf("Number : %.*f\n", PLACES, x);
printf(" Integer : %d\n", a);
printf(" Fractional part: %d over %d\n", b, n);
}
int main()
{
extract(1.1128);
extract(20.0);
extract(300.000512);
}
Produces:
Number : 1.113
Integer : 1
Fractional part: 113 over 1000
Number : 20.000
Integer : 20
Fractional part: 0 over 1000
Number : 300.001
Integer : 300
Fractional part: 1 over 1000
What about using floor() to get the integer value and
num % 1 (modulo arithmetic) to get the decimal component?
Then you could multiply the decimal component by a multiple of 10 and round.
This would also give you control over how many decimal places you send, if that is limited in your comm. standard.
Would that work?
#include <math.h>
integer_part = floor(num);
decimal_part = fmod(num,1)*10^whatever;