#include <stdio.h>

int main()
{
    float x = 3.4e2;
    printf("%f", x);
    return 0;
}
Output:
340.000000 // It's ok.
But if I write x=3.1234e2 the output is 312.339996, and if x=3.12345678e2 the output is 312.345673.
Why are the outputs like these? I think if I write x=3.1234e2 the output should be 312.340000, but the actual output is 312.339996 using the GCC compiler.
Not all fractional numbers have an exact binary equivalent, so they are rounded to the nearest representable value.
Simplified example: if you have 3 bits for the fraction, you can have:
0
0.125
0.25
0.375
...
0.5 has an exact representation, but 0.1 will be shown as 0.125.
Of course the real differences are much smaller.
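To see the same effect with a real float (a minimal sketch; the exact trailing digits may vary slightly by platform), print 0.1 with far more digits than the type actually carries:

#include <stdio.h>

int main(void)
{
    float f = 0.1f;
    /* Ask for far more digits than a float stores; the trailing digits
       expose the nearest representable value actually held in f. */
    printf("%.20f\n", f);   /* typically 0.10000000149011611938 */
    return 0;
}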
Floating-point numbers are normally represented as binary fractions times a power of two, for efficiency. This is about as accurate as base-10 representation, except that there are decimal fractions that cannot be exactly represented as binary fractions. They are, instead, represented as approximations.
Moreover, a float is normally 32 bits long, which means that it doesn't have all that many significant digits. You can see in your examples that they're accurate to about 8 significant digits.
You are, however, printing the numbers to slightly beyond their significance, and therefore you're seeing the difference. Look at your printf format string documentation to see how to print fewer digits.
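For example, a minimal sketch (the %.4f precision is just one possible choice) applied to the number from the question:

#include <stdio.h>

int main(void)
{
    float x = 3.1234e2f;
    printf("%f\n", x);     /* 312.339996 - the default six places expose the error */
    printf("%.4f\n", x);   /* 312.3400   - fewer digits round back to the intended value */
    return 0;
}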
You may need to represent decimal numbers exactly; this often happens in financial applications. In that case, you need to use a special library to represent numbers, or simply calculate everything as integers (such as representing amounts as cents rather than as dollars and fractions of a dollar).
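A minimal sketch of the integer-cents idea (the amounts and names are only illustrative):

#include <stdio.h>

int main(void)
{
    /* Keep money in whole cents so every amount is exact. */
    long price_cents = 1999;   /* $19.99 */
    long tax_cents   = 160;    /* $1.60  */
    long total_cents = price_cents + tax_cents;
    printf("$%ld.%02ld\n", total_cents / 100, total_cents % 100);   /* $21.59 */
    return 0;
}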
The standard reference is What Every Computer Scientist Should Know About Floating-Point Arithmetic, but it looks like that would be very advanced for you. Alternatively, you could Google floating-point formats (particularly IEEE standard formats) or look them up on Wikipedia, if you wanted the details.
I am adding the ability to print a decimal fixed-point number as hexadecimal in my general-purpose library and realized I wasn't 100% sure how I should represent the fraction part of the number. A quick Google search suggests I should:
Multiply by 16
Convert the integer part to hex and add it to the buffer
Get rid of the integer part
Repeat
As suggested here https://bytes.com/topic/c/answers/219928-how-convert-float-hex
This method is for floating-point (IEEE 754 binary formats) and it works fine for that. However, I tried to adapt this to my decimal fixed-point (scaled by 8) format, and after testing this approach on paper I noticed that for some fractions (e.g. .7) it produces a repeating pattern of .B3333... and so on.
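For reference, here is a sketch of that multiply-by-16 loop as I read it, assuming the fraction is held as an integer count of 10^-8 units (the scale and the digit limit are only illustrative):

#include <stdio.h>
#include <stdint.h>

#define SCALE 100000000ULL   /* 10^8, i.e. an assumed 8 decimal fraction digits */

/* Print the fractional part frac/SCALE in hexadecimal, one digit per pass. */
static void print_frac_hex(uint64_t frac, int max_digits)
{
    int i;
    for (i = 0; i < max_digits && frac != 0; i++) {
        frac *= 16;                               /* shift one hex digit into the integer part */
        printf("%X", (unsigned)(frac / SCALE));   /* emit that digit */
        frac %= SCALE;                            /* drop the integer part and repeat */
    }
}

int main(void)
{
    printf(".");
    print_frac_hex(70000000, 12);   /* 0.7 -> .B33333333333 (the 3 repeats forever) */
    printf("\n");
    return 0;
}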
To me this looks very undesirable. I also wonder if this would cause a loss in precision if I were to try to read this back from a string into my fixed-point format.
Is there any reason why someone wouldn't print the fraction part like any other two's-complement hexadecimal number? i.e. where 17535.564453 is printed as 447F.89CE5
While this is targeted at decimal fixed-point, I'm looking for a solution that can also be used by other real number formats such as IEEE 754 binary.
Perhaps there's another alternative to these two methods. Any ideas?
Although the question asks about fixed-point, the C standard has some useful information in its specification of the %a format for floating-point. C 2018 7.21.6.1 8 says:
… if the [user-requested] precision is missing and FLT_RADIX is not a power of 2, then the precision is sufficient to distinguish values of type double, except that trailing zeros may be omitted; …
Footnote 285 says:
The precision p is sufficient to distinguish values of the source type if 16^(p−1) > b^n, where b is FLT_RADIX and n is the number of base-b digits in the significand of the source type…
To see this intuitively, visualize the decimal fixed-point numbers on the real number line from 0 to 1. For each such number x, visualize a segment starting halfway toward the previous fixed-point number and ending halfway toward the next fixed-point number. All the points in that segment are closer to x than they are to the previous or next numbers, except for the endpoints. Now, consider where all the single-hexadecimal-digit numbers j/16 are. They lie in some of those segments. But, if there are 100 segments (from two-digit decimal numbers), most of the segments do not contain one of those single-hexadecimal-digit numbers. If you increase the number of hexadecimal digits, p, until 16^(p−1) > b^n, then the spacing between the hexadecimal numbers is less than the width of the segments, and every segment contains a hexadecimal number.
This shows that using p hexadecimal digits is sufficient to distinguish numbers made with n base-b (here decimal) digits. (This is sufficient, but it may be one more than necessary.) This means all the information needed to recover the original decimal number is present, and avoiding any loss of accuracy in recovering the original decimal number is a matter of programming the conversion from hexadecimal to decimal correctly.
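As a rough illustration, here is a small sketch (assuming b = 10, i.e. a decimal fixed-point fraction with n digits) that finds the smallest p satisfying 16^(p−1) > 10^n:

#include <stdio.h>

int main(void)
{
    /* Smallest p with 16^(p-1) > 10^n: enough hex digits to give every
       n-digit decimal fraction its own hexadecimal representative. */
    int n;
    for (n = 1; n <= 8; n++) {
        double pow10 = 1.0, pow16 = 1.0;   /* 10^n and 16^(p-1) */
        int i, p = 1;
        for (i = 0; i < n; i++)
            pow10 *= 10.0;
        while (pow16 <= pow10) {
            pow16 *= 16.0;
            p++;
        }
        printf("n = %d decimal digits -> p = %d hex digits\n", n, p);
    }
    return 0;
}

If "scaled by 8" in the question means eight decimal fraction digits, this gives p = 8 hexadecimal digits.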
Printing the fraction "like any other hexadecimal number" is inadequate if leading zeros are not accounted for. The decimal numbers "3.7" and "3.007" are different, so the fraction part cannot be formatted merely as "7". If a convention is adopted to convert the decimal part, including trailing zeros, to hexadecimal, then this could work. For example, if the decimal fixed-point number has four decimal digits after the decimal point, then treating the fraction parts of 3.7 and 3.007 as 7000 and 0070 and converting those to hexadecimal will preserve the required information. When converting back, one would convert the hexadecimal to decimal, format it in four digits, and insert it into the decimal fixed-point number. This could be a suitable solution where speed is desired, but it will not be a good representation for human use.
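A sketch of that fixed-width round trip, assuming four decimal fraction digits (the buffer and names are only illustrative):

#include <stdio.h>

int main(void)
{
    /* Fraction of 3.7 held in a fixed four-digit decimal field: 7000. */
    unsigned frac = 7000;
    unsigned back;
    char buf[16];

    /* Decimal fraction field -> hexadecimal text. */
    sprintf(buf, "%X", frac);                          /* "1B58" */
    printf("3.%04u -> 3.%s (hex field)\n", frac, buf);

    /* Hexadecimal text -> back to the four-digit decimal field. */
    sscanf(buf, "%X", &back);
    printf("recovered fraction field: %04u\n", back);  /* 7000 */
    return 0;
}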
Of course, if one merely wishes to preserve the information in the number so that it can be transmitted or stored and later recovered, one might as well simply transmit the bits representing the number with whatever conversion is easiest to compute, such as formatting all the raw bits as hexadecimal.
I am working on software that, among other things, converts measured numbers between text and internal (double) representation. A necessary part of the process is to produce text representations with the correct decimal precision based on the statistical uncertainty of the measurement. The needed precision varies with the number, and the least-significant digit in it can be anywhere, including left of the (decimal) units place.
Correct rounding is essential for this process, where "correct" means according to the floating-point rounding mode in effect at the time, or at least in a well-defined rounding mode. As such, I need to be careful about (read: avoid) performing intermediate arithmetic on the numbers being handled, because rounding can be sensitive even to the least-significant bit in the internal representation of a number.
I think I can do almost all the needed formatting reasonably well with the printf family of functions if I first compute the number of significant digits in the required representation:
sprintf(buffer, "%.*e", num_sig_figs - 1, number);
There is one class of corner cases that has so far defeated me, however: the one where the most significant (decimal) digit in the measured number is one place right of the least significant digit of the desired-precision representation. In that case, rounding should yield the least (and only) significant digit in the desired result as either 0 or 1, but I haven't been able to devise a way to perform the rounding in a portable(*) way without risk of changing the result. This is similar to what the MPFR function mpfr_prec_round() could do, except that it works in binary precision, whereas I need to use decimal precision.
For example, in the default rounding mode (round-to-nearest with ties rounded to even):
0.5 expressed to unit (10^0) precision should be "0" or "0e+00"
654 expressed to thousands (10^3) precision should be "1e+03"
0.03125 expressed to tenths (10^-1) precision should be "0" or "0e-01" or even "0e+00"
(*) "Portable" here means that the code accurately expresses the computation in standard, portable C99 (or better, C90). It is understood that the actual result may depend on machine details, and it should depend (and be consistent with) the floating-point rounding mode in effect.
What options do I have?
One simple (albeit fairly inefficient) approach that will always work is to print the full exact decimal value as a string, then do your rounding in decimal manually. This can be achieved by something like
snprintf(buf, sizeof buf, "%.*f", DBL_MANT_DIG-DBL_MIN_EXP, x);
I hope I got that precision right. The idea is that each additional mantissa bit, and each additional negative power of two, takes up one extra decimal place.
You avoid the issue of double rounding by the fact that the decimal value obtained is exact.
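For illustration, a minimal sketch of that first step (the buffer size here is only a rough bound suited to a value with a one-digit integer part; a general routine would also allow room for more integer digits):

#include <stdio.h>
#include <float.h>

int main(void)
{
    /* Room for "0." plus every fractional decimal place a double can need,
       with a little slack (this example has a zero integer part). */
    static char buf[DBL_MANT_DIG - DBL_MIN_EXP + 16];

    /* Print the exact decimal expansion; no binary-to-decimal rounding occurs
       because this many places can hold every bit of the value. */
    snprintf(buf, sizeof buf, "%.*f", DBL_MANT_DIG - DBL_MIN_EXP, 0.03125);
    printf("%s\n", buf);   /* 0.03125000...0 - exact, ready for manual decimal rounding */
    return 0;
}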
Note that double rounding only matters in the default rounding mode (nearest). In other modes, double rounding obtains the same result that would be obtained by a single rounding step, so you can take lots of shortcuts if you like.
There are probably better solutions which I'll post later if I think of them. Note that the above solution will only work on high-quality implementations where the printf family of functions is capable of printing exact decimals. It will fail horribly, for example, on MSVCRT and other low-quality implementations, even some conforming ones.
I was quite surprised when I tried to multiply a float in C (with GCC 3.2) and it did not do what I expected. As a sample:
#include <stdio.h>

int main() {
    float nb = 3.11f;
    nb *= 10;
    printf("%f\n", nb);
    return 0;
}
Displays: 31.099998
I am curious about how floats are implemented and why they produce this unexpected behavior.
First off, you can multiply floats. The problem you have is not the multiplication itself, but the original number you used. Multiplication can lose some precision, but here the original number you multiplied had already lost precision before the multiplication.
This is actually expected behavior. Floats are implemented using a binary representation, which means they can't accurately represent all decimal values.
See MSDN for more information.
You can also see in the description of float that it has 6-7 significant digits accuracy. In your example if you round 31.099998 to 7 significant digits you will get 31.1 so it still works as expected here.
The double type would of course be more accurate, but it still has rounding error due to its binary representation, while the number you wrote is decimal.
If you want complete accuracy for decimal numbers, you should use a decimal type. This type exists in languages like C#. http://msdn.microsoft.com/en-us/library/system.decimal.aspx
You can also use a rational-number representation. Using two integers will give you complete accuracy as long as you can represent the number as a ratio of two integers.
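A minimal sketch of that idea (the names are illustrative; a real implementation would also reduce by the GCD and guard against overflow):

#include <stdio.h>

/* A rational number num/den kept as two integers. */
struct rational { long num, den; };

static struct rational rat_mul(struct rational a, struct rational b)
{
    struct rational r = { a.num * b.num, a.den * b.den };
    return r;
}

int main(void)
{
    struct rational nb  = { 311, 100 };       /* exactly 3.11 */
    struct rational ten = { 10, 1 };
    struct rational r   = rat_mul(nb, ten);   /* exactly 3110/100 = 31.1 */
    printf("%ld/%ld = %.6f\n", r.num, r.den, (double)r.num / r.den);
    return 0;
}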
This is working as expected. Computers have finite precision: a floating-point value is stored in a fixed number of binary digits, and most decimal fractions fall between the representable values. This leads to floating-point inaccuracies.
The Floating point wikipedia page goes into far more detail on the representation and resulting accuracy problems than I could here :)
Interesting real-world side-note: this is partly why a lot of money calculations are done using integers (cents) - don't let the computer lose money with lack of precision! I want my $0.00001!
The number 3.11 cannot be represented in binary. The closest you can get with 24 significant bits is 11.0001110000101000111101, which works out to 3.1099998950958251953125 in decimal.
If your number 3.11 is supposed to represent a monetary amount, then you need to use a decimal representation.
In the Python communities we often see people surprised at this, so there are well-tested-and-debugged FAQs and tutorial sections on the issue (of course they're phrased in terms of Python, not C, but since Python delegates float arithmetic to the underlying C and hardware anyway, all the descriptions of float's mechanics still apply).
It's not the multiplication's fault, of course -- remove the statement where you multiply nb and you'll see similar issues anyway.
From the Wikipedia article:
The fact that floating-point numbers cannot precisely represent all real numbers, and that floating-point operations cannot precisely represent true arithmetic operations, leads to many surprising situations. This is related to the finite precision with which computers generally represent numbers.
Floating-point numbers are not precise in decimal terms because they use base 2 (binary: either 0 or 1) instead of base 10. Converting between base 10 and base 2, as many have stated before, causes rounding precision issues.
Why is this C program giving the "wrong" output?
#include <stdio.h>
#include <conio.h>

int main()
{
    float f = 12345.054321;
    printf("%f", f);
    getch();
    return 0;
}
Output:
12345.054688
But the output should be 12345.054321.
I am using VC++ in VS2008.
It's giving the "wrong" answer simply because not all real values are representable by floats (or doubles, for that matter). What you'll get is an approximation based on the underlying encoding.
In order to represent every real value, even between 1.0×10^-100 and 1.1×10^-100 (a truly minuscule range), you still require an infinite number of bits.
Single-precision IEEE754 values have only 32 bits available (some of which are tasked to other things such as exponent and NaN/Inf representations) and cannot therefore give you infinite precision. They actually have 23 bits available, giving precision of about 2^24 (there's an extra implicit bit) or just over 7 decimal digits (log10(2^24) is roughly 7.2).
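You can query these limits directly from <float.h> (a minimal sketch; the values noted in the comments assume IEEE 754 single- and double-precision types):

#include <stdio.h>
#include <float.h>

int main(void)
{
    /* Significand bits and the decimal digits they can faithfully round-trip. */
    printf("float : %d significand bits, %d decimal digits\n", FLT_MANT_DIG, FLT_DIG);
    printf("double: %d significand bits, %d decimal digits\n", DBL_MANT_DIG, DBL_DIG);
    /* Typically 24 bits / 6 digits for float and 53 bits / 15 digits for double. */
    return 0;
}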
I enclose the word "wrong" in quotes because it's not actually wrong. What's wrong is your understanding about how computers represent numbers (don't be offended though, you're not alone in this misapprehension).
Head on over to http://www.h-schmidt.net/FloatApplet/IEEE754.html and type your number into the "Decimal representation" box to see this in action.
If you want a more accurate number, use doubles instead of floats - these have double the number of bits available for representing values (assuming your C implementation is using IEEE754 single and double precision data types for float and double respectively).
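For example, here is a minimal sketch comparing the two types on the number from the question:

#include <stdio.h>

int main(void)
{
    float  f = 12345.054321f;
    double d = 12345.054321;
    printf("float : %f\n", f);   /* 12345.054688 - only ~7 significant digits survive */
    printf("double: %f\n", d);   /* 12345.054321 - the extra bits carry all 11 digits */
    return 0;
}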
If you want arbitrary precision, you'll need to use a "bignum" library like GMP although that's somewhat slower than native types so make sure you understand the trade-offs.
The decimal number 12345.054321 cannot be represented accurately as a float on your platform. The result that you are seeing is a decimal approximation to the closest number that can be represented as a float.
floats are about convenience and speed, and use a binary representation - if you care about precision use a decimal type.
To understand the problem, read What Every Computer Scientist Should Know About Floating-Point Arithmetic:
http://docs.sun.com/source/806-3568/ncg_goldberg.html
For a solution, see the Decimal Arithmetic FAQ:
http://speleotrove.com/decimal/decifaq.html
It's all to do with precision. Your number cannot be stored accurately in a float.
Single-precision floating point values can only represent about eight to nine significant (decimal) digits. Beyond that point, you're seeing quantization error.
Why, when I save a value of say 40.54 in SQL Server to a column of type Real, does it return to me a value that is more like 40.53999878999 instead of 40.54? I've seen this a few times but have never figured out quite why it happens. Has anyone else experienced this issue and, if so, what causes it?
Have a look at What Every Computer Scientist Should Know About Floating Point Arithmetic.
Floating point numbers in computers don't represent decimal fractions exactly. Instead, they represent binary fractions. Most fractional numbers don't have an exact representation as a binary fraction, so there is some rounding going on. When such a rounded binary fraction is translated back to a decimal fraction, you get the effect you describe.
For storing money values, SQL databases normally provide a DECIMAL type that stores exact decimal digits. This format is slightly less efficient for computers to deal with, but it is quite useful when you want to avoid decimal rounding errors.
Floating point numbers use binary fractions, and they don't correspond exactly to decimal fractions.
For money, it's better to either store number of cents as integer, or use a decimal number type. For example, Decimal(8,2) stores 8 digits including 2 decimals (xxxxxx.xx), i.e. to cent precision.
In a nutshell, it's for pretty much the same reason that one-third cannot be exactly expressed in decimal. Have a look at David Goldberg's classic paper "What Every Computer Scientist Should Know About Floating-Point Arithmetic" for details.
To add a clarification, a floating-point number stored in a computer behaves as described by the other posts here because, as described, it is stored in binary format.
This means that unless its value (both the mantissa and exponent components of the value) can be expressed in powers of two, it cannot be represented exactly.
Some systems, on the other hand, store fractional numbers in decimal (the SQL Server Decimal and Numeric data types, and the Oracle Number datatype, for example), so their internal representation is exact for any number that can be written with a limited number of decimal digits. But then numbers that cannot (such as 1/3) cannot be represented exactly.