I have come across two programs in C, both comparing floating-point numbers but producing different outputs.
1)
#include<stdio.h>
int main()
{
float x = 0.1;
if (x == 0.1)
printf("IF");
else if (x == 0.1f)
printf("ELSE IF");
else
printf("ELSE");
}
Output : ELSE IF
2)
#include<stdio.h>
int main()
{
float x = 0.5;
if (x == 0.5)
printf("IF");
else if (x == 0.5f)
printf("ELSE IF");
else
printf("ELSE");
}
Output : IF
Why is 0.5 not promoted to double whereas 0.1 is?
Since double is wider than float, x == 0.1 is interpreted as (double) x == 0.1.
This works for 0.5 because 0.5 is exactly representable in binary, so (double) 0.5f produces precisely 0.5. On the other hand, 0.1 has an infinite-digit representation in binary, and 0.1f and 0.1 end up being rounded to numbers that differ in how many initial digits of the sequence they hold.
In an analogy with decimal numbers, you can think of the above situation as trying to write down the fraction 1/3 by rounding it to a fixed number of decimal digits. Using a 5-significant-digit representation, you get 0.33333; choosing a 10-digit one results in 0.3333333333. Now, "casting" the five-digit number to ten digits results in 0.3333300000, which is a different number than 0.3333333333. In the same analogy, 0.5 in binary is like 1/10 in decimal, which would be represented as 0.10000 and 0.1000000000 respectively, so one could convert it to the other representation and back without changing its meaning.
If the contents of x is a marker value set from code, then simply compare it to 0.1f instead of to 0.1. If it is the result of a calculation, see Paul's answer for the correct way to compare floating-point quantities.
The proper way of comparing one floating-point number with another is by using a precision value, for example
#define EPS 0.00001
#define ABS(a) ((a)<0?-(a):(a))
if (ABS(a-b)<EPS)
...
This is derived from:
if (a == b) // "equal" of fp numbers is not well defined
if (a-b == 0) // so comparing with zero is also ill defined
if (ABS(a-b) < EPS) // and so we compare with this small precision
I have a problem with floating-point rounding. I want to calculate with floating-point numbers and round the results to a given number N of decimals. In this example I want to round to 1 decimal place.
The calculation 37.1-28.75 results in the floating-point value 8.349998 (instead of 8.35), which printf then rounds to 8.3 instead of 8.4 for 1 decimal place.
The actual mathematical result is 37.10-28.75=8.35000000, but due to floating-point imprecision it becomes 8.349998, which is then rounded to 8.3 instead of 8.4 when using 1 decimal place.
Minimum reproducible example:
float a = 37.10;
float b = 28.75;
//a-b = 8.35 = 8.4
printf("%.1f\n", a - b); //outputs 8.3 instead of 8.4
Is it valid to add following to the result:
float result = a - b;
if (result > 0.0f)
{
result += powf(10, -nr_of_decimals - 1) / 2;
}
else
{
result -= powf(10, -nr_of_decimals - 1) / 2;
}
EDIT: corrected that I want 1 decimal place rounded output, not 2 decimal places
EDIT2: negative results are needed as well (28.75-37.1 = -8.4)
On my system I do actually get 8.35. It's possible that you have to set the rounding direction to "nearest" first, try this (compile with e.g. gcc ... -lm):
#include <fenv.h>
#include <stdio.h>
int main()
{
float a = 37.10;
float b = 28.75;
float res = a - b;
fesetround(FE_TONEAREST);
printf("%.2f\n", res);
}
Binary floating point is, after all, binary, and if you do care about the correct decimal rounding this much, then your choices would be:
decimal floating point, or
fixed point.
I'd say the solution is to use fixed point, especially if you're on embedded, and forget about everything else.
With
int32_t a = 3710;
int32_t b = 2875;
the result of
a - b
will exactly be
835
every time; and then you just need to have a simple fixed point printing routine for the desired precision, and check the following digit after the last digit to see if it needs to be rounded up.
If you want to round to 2 decimals, you can add 0.005 to the result and then offset it with floorf:
float f = 37.10f - 28.75f;
float r = floorf((f + 0.005f) * 100.f) / 100.f;
printf("%f\n", r);
The output is 8.350000
Why are you using floats instead of doubles?
Regarding your question:
Is it valid to add following to the result:
float result = a - b;
if (result > 0.0f)
{
result += powf(10, -nr_of_decimals - 1) / 2;
}
else
{
result -= powf(10, -nr_of_decimals - 1) / 2;
}
It doesn't seem so; on my computer I get 8.350498 instead of 8.350000.
After your edit:
Calculation 37.1-28.75 will result into floating point 8.349998, which will result printf rounding to 8.3 instead of 8.4.
Then
float r = roundf((f + (f < 0.f ? -0.05f : +0.05f)) * 10.f) / 10.f;
is what you are looking for.
In C, 0.55 == 0.55f is false while 0.5 == 0.5f is true. Why is it different?
Comparing 0.55:
#include <stdio.h>
int main() {
if (0.55 == 0.55f)
printf("Hi");
else
printf("Hello");
}
Outputs Hello.
Comparing 0.5:
#include <stdio.h>
int main() {
if (0.5 == 0.5f)
printf("Hi");
else
printf("Hello");
}
Outputs Hi.
For both the code snippets, I expected Hello.
Why this difference?
0.5 is a dyadic rational and of an appropriate magnitude so 0.5 is exactly one-half either as a float or a double.
The same cannot be said for 0.55. A double will store that number with no less precision than a float, and most likely more.
In both cases, the float is implicitly converted to a double prior to ==, but by then any rounding to float precision has already taken place.
You are comparing two different types of values which are double and float. Think about the limitations of size with inexact numbers.
Exact values (decimal)
A -> 1/2 with 5 decimals is 0.50000
B -> 1/2 with 10 decimals is 0.5000000000
A == B will always return true
Inexact values (decimal)
A -> 1/3 with 5 decimals is 0.33333
B -> 1/3 with 10 decimals is 0.3333333333
A == B -> will always return false because they aren't the same.
Similarly, 0.55 cannot be represented exactly in binary but 0.5 can be.
The binary representation of 0.55d -> 0.10001100110011001101...
So they will not be equal
The binary representation of 0.5d -> 0.1
So they will be equal
Hope It clears your doubt
#include <stdio.h>
int main(void)
{
float me = 1.1;
double you = 1.1;
if ( me == you ) {
printf("I love U");
} else {
printf("I hate U");
}
}
This prints "I hate U". Why?
Floats use binary fractions. Converting 1.1 to float produces a binary representation.
Each bit to the right of the binary point halves the weight of the digit, just as each decimal place divides it by ten. Bits to the left of the point double it (times ten for decimal).
in decimal: ... 0*2 + 1*1 + 0*0.5 + 0*0.25 + 0*0.125 + 1*0.0625 + ...
binary: 0 1 . 0 0 0 1 ...
2's exp: 1 0 -1 -2 -3 -4
(exponent to the power of 2)
The problem is that 1.1 cannot be converted exactly to a binary representation. A double, however, has more significant digits than a float.
If you compare the values, the float is first converted to double. But as the computer does not know the original decimal value, it simply fills the trailing digits of the new double with zeros, while the double value is more precise. So the two compare unequal.
This is a common pitfall when using floats. For this and other reasons (e.g. rounding errors), you should not use exact comparison for equal/unequal, but a ranged compare using a small tolerance such as the machine epsilon (the difference between 1 and the next larger representable value):
#include <float.h>
#include <math.h>
...
// check for "almost equal"
if ( fabs(fval - dval) <= FLT_EPSILON )
...
Note the usage of FLT_EPSILON, which is the machine epsilon for single-precision float values. Also note the <=, not <, as the latter would actually require an exact match.
If you compare two doubles, you might use DBL_EPSILON, but be careful with that.
Depending on intermediate calculations, the tolerance has to be increased (you cannot reduce it further than epsilon), as rounding errors, etc. will sum up. Floats in general are not forgiving with wrong assumptions about precision, conversion and rounding.
Edit:
As suggested by @chux, this might not work as expected for larger values, as you have to scale EPSILON according to the exponents. This conforms to what I stated: float comparison is not as simple as integer comparison. Think before comparing.
In short, you should NOT use == to compare floating points.
for example
float i = 1.1; // or double
float j = 1.1; // or double
This argument
(i==j) == true // is not always valid
For a correct comparison you should use an epsilon (a very small number):
(fabs(i-j)<epsilon)== true // this argument is valid
The question simplifies to why do me and you have different values?
Usually, C floating point is based on a binary representation. Many compilers & hardware follow IEEE 754 binary32 and binary64. Rare machines use a decimal, base-16 or other floating point representation.
OP's machine certainly does not represent 1.1 exactly as 1.1, but to the nearest representable floating point number.
Consider the below which prints out me and you to high precision. The previous representable floating point numbers are also shown. It is easy to see me != you.
#include <math.h>
#include <stdio.h>
int main(void) {
float me = 1.1;
double you = 1.1;
printf("%.50f\n", nextafterf(me,0)); // previous float value
printf("%.50f\n", me);
printf("%.50f\n", nextafter(you,0)); // previous double value
printf("%.50f\n", you);
}
1.09999990463256835937500000000000000000000000000000
1.10000002384185791015625000000000000000000000000000
1.09999999999999986677323704498121514916420000000000
1.10000000000000008881784197001252323389053300000000
But it is more complicated: C allows code to use higher precision for intermediate calculations depending on FLT_EVAL_METHOD. So on another machine, where FLT_EVAL_METHOD==1 (evaluate all FP to double), the compare test may pass.
Comparing for exact equality is rarely used in floating point code, aside from comparison to 0.0. More often code uses an ordered compare a < b. Comparing for approximate equality involves another parameter to control how near. @R.. has a good answer on that.
Because you are comparing two floating-point values!
Floating-point comparison is not exact because of rounding errors. Simple values like 1.1 or 0.1 cannot be represented precisely using binary floating-point numbers, and the limited precision means that slight changes in the order of operations can change the result. Different compilers and CPU architectures store temporary results at different precisions, so results can differ depending on the details of your environment. For example:
double a = 0.15 + 0.15;
double b = 0.1 + 0.2;
if(a == b) // can be false!
if(a >= b) // can also be false!
Even
if(fabs(a-b) < 0.0001) // wrong - don't do this
This is a bad way to do it because a fixed epsilon (0.0001), chosen because it “looks small”, could actually be far too large when the numbers being compared are very small themselves.
I personally use the following method; maybe this will help you:
#include <iostream> // std::cout
#include <cmath> // std::abs
#include <algorithm> // std::min
using namespace std;
#define MIN_NORMAL 1.17549435E-38f
#define MAX_VALUE 3.4028235E38f
bool nearlyEqual(float a, float b, float epsilon) {
float absA = std::abs(a);
float absB = std::abs(b);
float diff = std::abs(a - b);
if (a == b) {
return true;
} else if (a == 0 || b == 0 || diff < MIN_NORMAL) {
return diff < (epsilon * MIN_NORMAL);
} else {
return diff / std::min(absA + absB, MAX_VALUE) < epsilon;
}
}
This method passes tests for many important special cases, for different a, b and epsilon.
And don't forget to read What Every Computer Scientist Should Know About Floating-Point Arithmetic!
Typically, rounding to 2 decimal places is very easy with
printf("%.2lf",<variable>);
However, the rounding system usually rounds halves to the nearest even digit. For example,
2.554 -> 2.55
2.555 -> 2.56
2.565 -> 2.56
2.566 -> 2.57
And what I want to achieve is that
2.555 -> 2.56
2.565 -> 2.57
In fact, rounding half-up is doable in C, but for Integer only;
int a = (int)(b+0.5);
So I'm asking how to do the same thing with 2 decimal places, on positive values, to achieve the printing behaviour described earlier.
It is not clear whether you actually want to "round half-up", or rather "round half away from zero", which requires different treatment for negative values.
Single-precision binary float is precise to at least 6 significant decimal digits, and double to at least 15, so nudging an FP value by DBL_EPSILON (defined in float.h) will cause a round-up to the next 100th by printf( "%.2lf", x ) for n.nn5 values, without affecting the displayed value for values that are not n.nn5.
double x2 = x * (1 + DBL_EPSILON) ; // round half-away from zero
printf( "%.2lf", x2 ) ;
For different rounding behaviours:
double x2 = x * (1 - DBL_EPSILON) ; // round half-toward zero
double x2 = x + DBL_EPSILON ; // round half-up
double x2 = x - DBL_EPSILON ; // round half-down
Following is precise code to round a double to the nearest 0.01 double.
The code functions like x = round(100.0*x)/100.0; except that it uses manipulations to ensure that scaling by 100.0 is done exactly, without precision loss.
Likely this is more code than OP is interested in, but it does work.
It works for the entire double range -DBL_MAX to DBL_MAX. (still should do more unit testing).
It depends on FLT_RADIX == 2, which is common.
#include <float.h>
#include <stdio.h>
#include <math.h>
void r100_best(const char *s) {
double x;
sscanf(s, "%lf", &x);
// Break x into whole number and fractional parts.
// Code only needs to round the fractional part.
// This preserves the entire `double` range.
double xi, xf;
xf = modf(x, &xi);
// Multiply the fractional part by N (256).
// Break into whole and fractional parts.
// This provides the needed extended precision.
// N should be >= 100 and a power of 2.
// The multiplication by a power of 2 will not introduce any rounding.
double xfi, xff;
xff = modf(xf * 256, &xfi);
// Multiply both parts by 100.
// *100 incurs 7 more bits of precision of which the preceding code
// insures the 8 LSbit of xfi, xff are zero.
int xfi100, xff100;
xfi100 = (int) (xfi * 100.0);
xff100 = (int) (xff * 100.0); // Cast here will truncate (towards 0)
// sum the 2 parts.
// sum is the exact truncate-toward-0 version of xf*256*100
int sum = xfi100 + xff100;
// add in half N
if (sum < 0)
sum -= 128;
else
sum += 128;
xf = sum / 256;
xf /= 100;
double y = xi + xf;
printf("%6s %25.22f ", "x", x);
printf("%6s %25.22f %.2f\n", "y", y, y);
}
int main(void) {
r100_best("1.105");
r100_best("1.115");
r100_best("1.125");
r100_best("1.135");
r100_best("1.145");
r100_best("1.155");
r100_best("1.165");
return 0;
}
[Edit] OP clarified that only the printed value needs rounding to 2 decimal places.
OP's observation about rounding "half-way" numbers per "round to even" or "round away from zero" is misleading. Of the 100 "half-way" numbers like 0.005, 0.015, 0.025, ... 0.995, typically only 4 are exactly "half-way": 0.125, 0.375, 0.625, 0.875. This is because floating-point formats use base 2, and numbers like 2.565 cannot be represented exactly.
Instead, a sample number like 2.565 has as its closest double the value 2.564999999999999947..., assuming binary64. Rounding that number to the nearest 0.01 should give 2.56 rather than the 2.57 desired by OP.
Thus only numbers ending in 0.125 and 0.625 are exactly half-way and round down rather than up as desired by OP. Suggest accepting that and using:
printf("%.2lf",variable); // This should be sufficient
To get close to OP's goal, numbers could be A) tested against ending with 0.125 or 0.625 or B) increased slightly. The smallest increase would be
#include <math.h>
printf("%.2f", nextafter(x, 2*x));
Another nudge method is given by @Clifford.
[Former answer that rounds a double to the nearest double multiple of 0.01]
Typical floating-point uses formats like binary64, which employs base 2. "Rounding to the nearest mathematical 0.01 with ties away from 0.0" is challenging.
As @Pascal Cuoq mentions, floating-point numbers like 2.555 typically are only near 2.555 and have a more precise value like 2.555000000000000159872... which is not half-way.
@BLUEPIXY's solution below is the best and most practical.
x = round(100.0*x)/100.0;
"The round functions round their argument to the nearest integer value in floating-point format, rounding halfway cases away from zero, regardless of the current rounding direction." C11dr §7.12.9.6.
The ((int)(100 * (x + 0.005)) / 100.0) approach has 2 problems: it may round in the wrong direction for negative numbers (OP did not specify) and integers typically have a much smaller range (INT_MIN to INT_MAX) than double.
There are still some cases, such as double x = atof("1.115");, which end up near 1.12 when the result really should be 1.11, because 1.115 as a double is really closer to 1.11 and not "half-way".
string   x                              rounded x
1.115    1.1149999999999999911182e+00   1.1200000000000001065814e+00
OP has not specified rounding of negative numbers, assuming y = -f(-x).
I have a problem with the precision of the double format.
Sample example:
double K=0, L=0, M=0;
scanf("%lf %lf %lf", &K, &L, &M);
if((K+L) <= M) printf("Incorrect input");
else printf("Right, K=%f, L=%f, M=%f", K, L, M);
My test input:
K = 0.1, L = 0.2, M = 0.3 -> the condition should be true, but execution goes to the 'else' branch.
How can I correct this difference? Is there another way to do the summation?
In the world of Double Precision IEEE 754 binary floating-point format (the ones used on Intel and other processors) 0.1 + 0.2 == 0.30000000000000004 :-) And 0.30000000000000004 != 0.3 (and note that in the marvelous world of doubles, 0.1, 0.2 and 0.3 don't exist as "exact" quantities. There are some double numbers that are very near them, but if you printed them with full precision, they wouldn't be 0.1, 0.2 and 0.3)
To laugh a little, try this: http://pages.cs.wisc.edu/~rkennedy/exact-float
Insert a decimal number and look at the second and third row, it shows how the number is really represented in memory. It's for Delphi, but Double and Single are the same for Delphi and for probably all the C compilers for Intel processors (they are called double and float in C)
And if you want to try for yourself, look at this http://ideone.com/WEL7h
#include <stdio.h>
int main()
{
double d1 = (0.1 + 0.2);
double d2 = 0.3;
printf("%.20e\n%.20e", d1, d2);
return 0;
}
output:
3.00000000000000044409e-01
2.99999999999999988898e-01
(be aware that the output is compiler dependent. Depending on the options, 0.1 + 0.2 could be compiled and rounded to 0.3)
Unlike integer values, floating-point values are not stored exactly the way you assign values to them. Let's consider the following code:
int i = 1; // this is and always will be 1
float j = 0.03; // this gets stored, at least on my machine, as something like 0.029999999
Why is this so? Well, how many real numbers exist in the interval between 0.1 and 0.2?
An infinite number! A float has only finitely many bit patterns, so some values will get stored exactly as you intended, but a hell of a lot of values will be stored with a small error.
This is the reason why comparing floating point values for equality is not a good idea. Try something like this instead:
float a = 0.3f;
float b = 0.301f;
float threshold = 1e-6;
if( fabsf(a-b) < threshold )
return true;
else
return false;
There are infinitely many real numbers between any two distinct real numbers. If we were to be able to represent every one of those, we would need infinite memory. Since we only have finite memory, floating-point numbers need to be stored with only finite precision. Up to that finite precision, it might not be true that 0.1 + 0.2 <= 0.3.
Now, you really should go read what's at the other end of the excellent link provided by Paul R.