Display decimal number - c

somebody know how to represent the digits to the left of the decimal point?
I want to display the number 5 digits left to the point and 4 digits right to it/
for the exercise 12345/100 i want to get 00123.4500

printf("%010.4f", (double)12345/100);
man 3 printf says:
The overall syntax of a conversion specification is:
%[$][flags][width][.precision][length modifier]conversion
.4 means the floating point precision to print 4 decimals.
0 is a flag that means to pad with 0s.
10 is the width. If at the right of the decimal there are 4, and at the left there are 5, the total is 10 (with the dot).

While the answer of alinsoar is correct and the most common case, I would like to mention another possibility which sometimes is useful: fixed-point representation.
An uint32_t integer holds 9 decimal digits. We may choose to imagine a decimal point before the 4th digit from the right. The example would then look like this:
#include <stdint.h>
#include <stdio.h>
#include <inttypes.h>
int main(void)
{
uint32_t a = 123450000; // 12345.0000
uint32_t b = a / 100; // 123.4500
uint32_t i = b / 10000; // integer part
uint32_t f = b % 10000; // fractional part
printf("%05" PRIu32 ".%04" PRIu32, i, f); // 5 integer digits, 4 fractional digits
}
Using fixed-point one has to pay special attention to the value range and e.g. multiplication needs special handling, but there are cases where it is preferable over the floating-point representation.

Related

Determining the number of decimal digits in a floating number

I am trying to write a program that outputs the number of the digits in the decimal portion of a given number (0.128).
I made the following program:
#include <stdio.h>
#include <math.h>
int main(){
float result = 0;
int count = 0;
int exp = 0;
for(exp = 0; int(1+result) % 10 != 0; exp++)
{
result = 0.128 * pow(10, exp);
count++;
}
printf("%d \n", count);
printf("%f \n", result);
return 0;
}
What I had in mind was that exp keeps being incremented until int(1+result) % 10 outputs 0. So for example when result = 0.128 * pow(10,4) = 1280, result mod 10 (int(1+result) % 10) will output 0 and the loop will stop.
I know that on a bigger scale this method is still inefficient since if result was a given input like 1.1208 the program would basically stop at one digit short of the desired value; however, I am trying to first find out the reason why I'm facing the current issue.
My Issue: The loop won't just stop at 1280; it keeps looping until its value reaches 128000000.000000.
Here is the output when I run the program:
10
128000000.000000
Apologies if my description is vague, any given help is very much appreciated.
I am trying to write a program that outputs the number of the digits in the decimal portion of a given number (0.128).
This task is basically impossible, because on a conventional (binary) machine the goal is not meaningful.
If I write
float f = 0.128;
printf("%f\n", f);
I see
0.128000
and I might conclude that 0.128 has three digits. (Never mind about the three 0's.)
But if I then write
printf("%.15f\n", f);
I see
0.128000006079674
Wait a minute! What's going on? Now how many digits does it have?
It's customary to say that floating-point numbers are "not accurate" or that they suffer from "roundoff error". But in fact, floating-point numbers are, in their own way, perfectly accurate — it's just that they're accurate in base two, not the base 10 we're used to thinking about.
The surprising fact is that most decimal (base 10) fractions do not exist as finite binary fractions. This is similar to the way that the number 1/3 does not even exist as a finite decimal fraction. You can approximate 1/3 as 0.333 or 0.3333333333 or 0.33333333333333333333, but without an infinite number of 3's it's only an approximation. Similarly, you can approximate 1/10 in base 2 as 0b0.00011 or 0b0.000110011 or 0b0.000110011001100110011001100110011, but without an infinite number of 0011's it, too, is only an approximation. (That last rendition, with 33 bits past the binary point, works out to about 0.0999999999767.)
And it's the same with most decimal fractions you can think of, including 0.128. So when I wrote
float f = 0.128;
what I actually got in f was the binary number 0b0.00100000110001001001101111, which in decimal is exactly 0.12800000607967376708984375.
Once a number has been stored as a float (or a double, for that matter) it is what it is: there is no way to rediscover that it was initially initialized from a "nice, round" decimal fraction like 0.128. And if you try to "count the number of decimal digits", and if your code does a really precise job, you're liable to get an answer of 26 (that is, corresponding to the digits "12800000607967376708984375"), not 3.
P.S. If you were working with computer hardware that implemented decimal floating point, this problem's goal would be meaningful, possible, and tractable. And implementations of decimal floating point do exist. But the ordinary float and double values any of is likely to use on any of today's common, mass-market computers are invariably going to be binary (specifically, conforming to IEEE-754).
P.P.S. Above I wrote, "what I actually got in f was the binary number 0b0.00100000110001001001101111". And if you count the number of significant bits there — 100000110001001001101111 — you get 24, which is no coincidence at all. You can read at single precision floating-point format that the significand portion of a float has 24 bits (with 23 explicitly stored), and here, you're seeing that in action.
float vs. code
A binary float cannot encode 0.128 exactly as it is not a dyadic rational.
Instead, it takes on a nearby value: 0.12800000607967376708984375. 26 digits.
Rounding errors
OP's approach incurs rounding errors in result = 0.128 * pow(10, exp);.
Extended math needed
The goal is difficult. Example: FLT_TRUE_MIN takes about 149 digits.
We could use double or long double to get us somewhat there.
Simply multiply the fraction by 10.0 in each step.
d *= 10.0; still incurs rounding errors, but less so than OP's approach.
#include <stdio.h>
#include <math.h> int main(){
int count = 0;
float f = 0.128f;
double d = f - trunc(f);
printf("%.30f\n", d);
while (d) {
d *= 10.0;
double ipart = trunc(d);
printf("%.0f", ipart);
d -= ipart;
count++;
}
printf("\n");
printf("%d \n", count);
return 0;
}
Output
0.128000006079673767089843750000
12800000607967376708984375
26
Usefulness
Typically, past FLT_DECMAL_DIG (9) or so significant decimal places, OP’s goal is usually not that useful.
As others have said, the number of decimal digits is meaningless when using binary floating-point.
But you also have a flawed termination condition. The loop test is (int)(1+result) % 10 != 0 meaning that it will stop whenever we reach an integer whose last digit is 9.
That means that 0.9, 0.99 and 0.9999 all give a result of 2.
We also lose precision by truncating the double value we start with by storing into a float.
The most useful thing we could do is terminate when the remaining fractional part is less than the precision of the type used.
Suggested working code:
#include <math.h>
#include <float.h>
#include <stdio.h>
int main(void)
{
double val = 0.128;
double prec = DBL_EPSILON;
double result;
int count = 0;
while (fabs(modf(val, &result)) > prec) {
++count;
val *= 10;
prec *= 10;
}
printf("%d digit(s): %0*.0f\n", count, count, result);
}
Results:
3 digit(s): 128

How do I print a floating-point value for later scanning with perfect accuracy?

Suppose I have a floating-point value of type float or double (i.e. 32 or 64 bits on typical machines). I want to print this value as text (e.g. to the standard output stream), and then later, in some other process, scan it back in - with fscanf() if I'm using C, or perhaps with istream::operator>>() if I'm using C++. But - I need the scanned float to end up being exactly, identical to the original value (up to equivalent representations of the same value). Also, the printed value should be easily readable - to a human - as floating-point, i.e. I don't want to print 0x42355316 and reinterpret that as a 32-bit float.
How should I do this? I'm assuming the standard library of (C and C++) won't be sufficient, but perhaps I'm wrong. I suppose that a sufficient number of decimal digits might be able to guarantee an error that's underneath the precision threshold - but that's not the same as guaranteeing the rounding/truncation will happen just the way I want it.
Notes:
The scanning does not having to be perfectly accurate w.r.t. the value it scans, only the original value.
If it makes it easier, you may assume the value is a number and is not infinity.
denormal support is desired but not required; still if we get a denormal, failure should be conspicuous.
First, you should use the %a format with fprintf and fscanf. This is what it was designed for, and the C standard requires it to work (reproduce the original number) if the implementation uses binary floating-point.
Failing that, you should print a float with at least FLT_DECIMAL_DIG significant digits and a double with at least DBL_DECIMAL_DIG significant digits. Those constants are defined in <float.h> and are defined:
… number of decimal digits, n, such that any floating-point number with p radix b digits can be rounded to a floating-point number with n decimal digits and back again without change to the value,… [b is the base used for the floating-point format, defined in FLT_RADIX, and p is the number of base-b digits in the format.]
For example:
printf("%.*g\n", FLT_DECIMAL_DIG, 1.f/3);
or:
#define QuoteHelper(x) #x
#define Quote(x) QuoteHelper(x)
…
printf("%." Quote(FLT_DECIMAL_DIG) "g\n", 1.f/3);
In C++, these constants are defined in <limits> as std::numeric_limits<Type>::max_digits10, where Type is float or double or another floating-point type.
Note that the C standard only recommends that such a round-trip through a decimal numeral work; it does not require it. For example, C 2018 5.2.4.2.2 15 says, under the heading “Recommended practice”:
Conversion from (at least) double to decimal with DECIMAL_DIG digits and back should be the identity function. [DECIMAL_DIG is the equivalent of FLT_DECIMAL_DIG or DBL_DECIMAL_DIG for the widest floating-point format supported in the implementation.]
In contrast, if you use %a, and FLT_RADIX is a power of two (meaning the implementation uses a floating-point base that is two, 16, or another power of two), then C standard requires that the result of scanning the numeral produced with %a equals the original number.
I need the scanned float to end up being exactly, identical to the original value.
As already pointed out in the other answers, that can be achieved with the %a format specifier.
Also, the printed value should be easily readable - to a human - as floating-point, i.e. I don't want to print 0x42355316 and reinterpret that as a 32-bit float.
That's more tricky and subjective. The first part of the string that %a produces is in fact a fraction composed by hexadecimal digits, so that an output like 0x1.4p+3 may take some time to be parsed as 10 by a human reader.
An option could be to print all the decimal digits needed to represent the floating-point value, but there may be a lot of them. Consider, for example the value 0.1, its closest representation as a 64-bit float may be
0x1.999999999999ap-4 == 0.1000000000000000055511151231257827021181583404541015625
While printf("%.*lf\n", DBL_DECIMAL_DIG, 01); (see e.g. Eric's answer) would print
0.10000000000000001 // If DBL_DECIMAL_DIG == 17
My proposal is somewhere in the middle. Similarly to what %a does, we can exactly represent any floating-point value with radix 2 as a fraction multiplied by 2 raised to some integer power. We can transform that fraction into a whole number (increasing the exponent accordingly) and print it as a decimal value.
0x1.999999999999ap-4 --> 1.999999999999a16 * 2-4 --> 1999999999999a16 * 2-56
--> 720575940379279410 * 2-56
That whole number has a limited number of digits (it's < 253), but the result it's still an exact representation of the original double value.
The following snippet is a proof of concept, without any check for corner cases. The format specifier %a separates the mantissa and the exponent with a p character (as in "... multiplied by two raised to the Power of..."), I'll use a q instead, for no particular reason other than using a different symbol.
The value of the mantissa will also be reduced (and the exponent raised accordingly), removing all the trailing zero-bits. The idea beeing that 5q+1 (parsed as 510 * 21) should be more "easily" identified as 10, rather than 2814749767106560q-48.
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void to_my_format(double x, char *str)
{
int exponent;
double mantissa = frexp(x, &exponent);
long long m = 0;
if ( mantissa ) {
exponent -= 52;
m = (long long)scalbn(mantissa, 52);
// A reduced mantissa should be more readable
while (m && m % 2 == 0) {
++exponent;
m /= 2;
}
}
sprintf(str, "%lldq%+d", m, exponent);
// ^
// Here 'q' is used to separate the mantissa from the exponent
}
double from_my_format(char const *str)
{
char *end;
long long mantissa = strtoll(str, &end, 10);
long exponent = strtol(str + (end - str + 1), &end, 10);
return scalbn(mantissa, exponent);
}
int main(void)
{
double tests[] = { 1, 0.5, 2, 10, -256, acos(-1), 1000000, 0.1, 0.125 };
size_t n = (sizeof tests) / (sizeof *tests);
char num[32];
for ( size_t i = 0; i < n; ++i ) {
to_my_format(tests[i], num);
double x = from_my_format(num);
printf("%22s%22a ", num, tests[i]);
if ( tests[i] != x )
printf(" *** %22a *** Round-trip failed\n", x);
else
printf("%58.55g\n", x);
}
return 0;
}
Testable here.
Generally, the improvement in readability is admitedly little to none, surely a matter of opinion.
You can use the %a format specifier to print the value as hexadecimal floating point. Note that this is not the same as reinterpreting the float as an integer and printing the integer value.
For example:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
float x;
scanf("%f", &x);
printf("x=%.7f\n", x);
char str[20];
sprintf(str, "%a", x);
printf("str=%s\n", str);
float y;
sscanf(str, "%f", &y);
printf("y=%.7f\n", y);
printf("x==y: %d\n", (x == y));
return 0;
}
With an input of 4, this outputs:
x=4.0000000
str=0x1p+2
y=4.0000000
x==y: 1
With an input of 3.3, this outputs:
x=3.3000000
str=0x1.a66666p+1
y=3.3000000
x==y: 1
As you can see from the output, the %a format specifier prints in exponential format with the significand in hex and the exponent in decimal. This format can then be converted directly back to the exact same value as demonstrated by the equality check.

double representing values upto 16 significant digits

I was working on reading values in variables from a byte positioned file and I had to represent some value read into decimal with 6 digits representing fractional part, total no. of digits 20.
So the value could be for example be 99999999999999999999 (20 9s) and it is to be used as float considering the last six 9s representing fractional part.
Now when I was tring to do it with the method employed:
#include<stdio.h>
#include<stdlib.h>
int main(void)
{
char c[]="999999999.999999"; //9 nines to the left of decimal and 6 nines to the right
double d=0;
sscanf(c,"%lf",&d);
printf("%f\n",d);
return 0;
}
OUTPUT:999999999.999999 same as input
Now I increased the number of 9s to the left of decimal by 1 (making 10 nines to the left of decimal)
the output became 9999999999.999998.
On further increase of one more 9 to the left of decimal the outcome became rounded off to 100000000000.000000
For my usage it is possible that values with 14 digits to the left of decimal and 6 to the right of it can come in the variable - I want it to be converted precisely just like the input itself without and truncation or rounding off. Also I read somewhere that double can be used to represent a value with up to 16 significant digits but here when I used only 9999999999.999999 (10 nines to the left and 6 to the right) it produced outcome as 9999999999.999998 which contradicts this *represent a value with up to 16 significant digits` statement.
What should be done in this case?
The nature of floating point numbers is they are inaccurate. The bigger the number, the more inaccurate.
What should be done in this case?
You could try a long double, but even that's not guaranteed to be precise enough.
long double z = 99999999999999.999999L;
printf("%Lf\n", z); // 100000000000000.000000
You could store it as an integer and remember it's actually 1,000,000 times smaller. Unfortunately, 20 digits is a bit too large for even an unsigned 64 bit integer. It's 3 bits too large.
#include <inttypes.h>
int main() {
uint64_t x = 99999999999999999999ULL;
}
test.c:4:18: error: integer literal is too large to be represented in any integer type
uint64_t x = 99999999999999999999ULL;
You could make a struct that stores the pieces separately.
#include <inttypes.h>
#include <stdbool.h>
typedef struct {
bool positive;
uint64_t integer;
uint64_t decimal;
} bignum;
int main() {
bignum num = {
.positive = true,
.integer = 99999999999999,
.decimal = 999999
};
printf("%s" "%"PRIu64 "." "%"PRIu64 "\n",
num.positive ? "" : "-", num.integer, num.decimal
);
}
At which point you're building your own arbitrary-precision arithmetic library. Instead, use an existing one such as GMP. This can store numbers of any size. The trade off is speed and you have to use special types and functions.
#include <gmp.h>
int main() {
mpf_t y;
mpf_init_set_str(y, "99999999999999.999999", 10);
gmp_printf("%Ff\n", y);
}

Why does my computer arbitrarily change the decimal digits at the end of a float number when that number is the actual parameter of a function?

I was trying to satisfy this question: Write a function print_dig_float(float f) which prints the value of each digit of a floating point number f. For example, if f is 2345.1234 the print_dig_float(f) will print integer values of digits 2, 3, 4, 5, 1, 2, 3, and 4 in succession.
What I did is: given a number with some decimals, I try to move the digits to the left (Ex: 3.45 -> 345) by multiplying it with 10. After that, I store each digit in an array by taking the remainder and put it in an element. Then, I print them out.
So my program looks like this:
#include <stdio.h>
void print_dig_float(float f);
int main(int argc, char const *argv[]) {
print_dig_float(23432.214122);
return 0;
}
void print_dig_float(float f) {
printf("%f\n", f);
int i = 0, arr[50], conv;
//move digits to the left
do {
f = f * 10;
conv = f;
printf("%i\n", conv);
} while (conv % 10 != 0);
conv = f / 10;
//store digits in an array
while (conv > 1) {
arr[i] = conv % 10;
conv = conv / 10;
i++;
}
for (int j = i - 1; j >= 0; j--) {
printf("%i ", arr[j]);
}
printf("\n");
}
When I tested it with the number: 23432.214122, this is what I get (according to Linux terminal):
23432.214844
234322
2343221
23432216
234322160
2 3 4 3 2 2 1 6
The problem is that, as you can see above, the computer arbitrarily changes the decimal digits at the end of the number even before I do anything with it. I don't know if this is my fault or the computer's fault for this problem.
Per C 2018 5.2.4.2.2, a floating-point number is represented with a sign, a fixed base to some exponent, and a numeral formed of digits in that base. Most commonly, two is used as the base.
When the base is two, 23432.214122 cannot be represented in floating-point, because every representable number is necessarily some integer multiple of a power of the base (possible a negative power). 23432.214122 is not a multiple of ½, ¼, ⅛, 1/24, 1/25, or any other power of two.
When 23432.214122 is used in source code, it is converted to a value that is representable. In good C implementations, the nearest representable value is used, but the C standard permits either the representable value that is the nearest larger or nearest smaller value to be used. Other than this, the digits that appear are not arbitrary; they are a consequence of the mathematics.
When IEEE-754 binary32 is used for float, the representable value nearest to 23432.214122 is exactly 23432.21484375.
Because, when a C implementation uses base two for floating-point numbers, floating-point numbers have binary digits and do not have decimal digits. It is generally not meaningful to attempt to extract decimal digits from a thing that does not have decimal digits. It is possible to determine the decimal digits that were in an original numeral up to some limit affected by the floating-point format. However, “23432.214122” has too many digits to do this with a 32-bit floating-point type. With a 64-bit type, as is commonly used for double, you could recover the original digits providing you knew how many decimal digits there were to start with. It is not generally possible to recover the original numeral without that information—as you have seen the trailing digits will be different, and there is no indication in the floating-point number itself of where the differences start.

Why float is more precise than it ought to be?

#include <stdio.h>
#include <float.h>
int main(int argc, char** argv)
{
long double pival = 3.14159265358979323846264338327950288419716939937510582097494459230781640628620899L;
float pival_float = pival;
printf("%1.80f\n", pival_float);
return 0;
}
The output I got on gcc is :
3.14159274101257324218750000000000000000000000000000000000000000000000000000000000
The float uses 23 bits mantisa. So the maximum fraction that can be represented is 2^23 = 8388608 = 7 decimal digits of precision.
But the above output shows 23 decimal digits of precision (3.14159274101257324218750). I expected it print 3.1415927000000000000....)
What did I miss to understand ?
You only got 7 digits of precision. Pi is
3.1415926535897932384626433832795028841971693993751058209...
But the output you got from printing your float approximation to Pi was
3.14159274101257324218750000...
As you can see the values diverge starting from the 7th digit after the decimal point.
If you ask printf() for 80 digits after the decimal place, it will print out that many digits of the decimal representation of the binary value stored in the float, even if that many digits is far more than the precision allowed by the float representation.
A binary floating-point value can't represent 3.1415927 exactly (since that's not an exact binary fraction). The nearest value that it can represent is 3.1415927410125732421875, so that's the actual value of your pival_float. When you print pival_float with eighty digits, you see its exact value, plus a bunch of zeroes for good measure.
The closest float value to pi has binary encoding...
0 10000000 10010010000111111011011
...in which I've inserted spaces between the sign, exponent and mantissa. The exponent is biased, so the bits above encode a multiplier of 2^1 == 2, and the mantissa encodes a fraction above 1, with the first bit being worth a half, and each bit thereafter being worth half as much as the bit before.
Therefore, the mantissa bits above are worth:
1 x 0.5
0 x 0.25
0 x 0.125
1 x 0.0625
0 x 0.03125
0 x 0.015625
1 x 0.0078125
0 x 0.00390625
0 x 0.001953125
0 x 0.0009765625
0 x 0.00048828125
1 x 0.000244140625
1 x 0.0001220703125
1 x 0.00006103515625
1 x 0.000030517578125
1 x 0.0000152587890625
1 x 0.00000762939453125
0 x 0.000003814697265625
1 x 0.0000019073486328125
1 x 0.00000095367431640625
0 x 0.000000476837158203125
1 x 0.0000002384185791015625
1 x 0.00000011920928955078125
So, the least significant bit after multiplying by the exponent-encoded value "2" is worth...
0.000 000 238 418 579 101 562 5
I added spaces to make it easier to count that the last non-0 digit is in the 22nd decimal place.
The value the question says printf() displayed appears below alongside the contribution of the least significant bit in the mantissa:
3.14159274101257324218750000000000000000000000000000000000000000000000000000000000
0.0000002384185791015625
Clearly the least significant digits line up properly. If you added up all the mantissa contributions above, added the implicit 1, then multiplied by 2, you'd get the exact value printf displayed. That explains how the float value is precisely (in the mathematical sense of zero randomness) the value shown by printf, but the comparison below against pi shows only the first 6 decimal places are accurate given the particular value we want it to store.
3.14159274101257324218750000000000000000000000000000000000000000000000000000000000
3.14159265358979323846264338327950288419716939937510582097494459230781640628620899
^
In computing, it's common to refer to the precision of floating point types when we're actually interested in the accuracy we can rely on. I suppose you could argue that while taken in isolation the precision of floats and doubles is infinite, the rounding necessary when using them to approximate numbers that they can't encode perfectly is for most practical purposes random, and in that sense they offer finite significant digits of precision at encoding such numbers.
So, printf isn't wrong to display so many digits; some application might be using a float to encode that exact number (almost certainly because the nature of the app's calculations involve sums of 1/2^n values), but that'd be the exception rather than the rule.
Carrying on from Tony's answer, one way to prove this limitation on decimal precision to yourself in a practical way is simply to declare pi to as many decimals points as you like while assigning the value to a float. Then look at how it is stored in memory.
What you find, is no matter how many decimal points you give it, the 32-bit value in memory will always be the equivalent of the unsigned value 1078530011 or 01000000010010010000111111011011 in binary. That is due, as others explained, to the IEEE-754 Single Precision Floating Point Format Below is a simple bit of code that will allow you to prove to yourself that this limitation means pi, as a float, is limited to six decimal precision:
#include <stdio.h>
#include <stdlib.h>
#if defined (__LP64__) || defined (_LP64)
# define BUILD_64 1
#endif
#ifdef BUILD_64
# define BITS_PER_LONG 64
#else
# define BITS_PER_LONG 32
#endif
char *binpad (unsigned long n, size_t sz);
int main (void) {
float fPi = 3.1415926535897932384626433;
printf ("\n fPi : %f, in memory : %s unsigned : %u\n\n",
fPi, binpad (*(unsigned*)&fPi, 32), *(unsigned*)&fPi);
return 0;
}
char *binpad (unsigned long n, size_t sz)
{
static char s[BITS_PER_LONG + 1] = {0};
char *p = s + BITS_PER_LONG;
register size_t i;
for (i = 0; i < sz; i++)
*(--p) = (n>>i & 1) ? '1' : '0';
return p;
}
Output
$ ./bin/ieee754_pi
fPi : 3.141593, in memory : 01000000010010010000111111011011 unsigned : 1078530011

Resources