Similar codes output different results - c

The output of the following code is 0.000000:
#include <stdio.h>
int main() {
float x;
x = (float)3.3 == 3.3;
printf("%f", x);
return 0;
}
Whereas this code outputs 1.000000:
int main() {
float x;
x = (float)3.5 == 3.5;
printf("%f", x);
return 0;
}
The only difference between the two programs is the value in the comparison, yet the results are not the same. Why is this?

This line:
x=(float)3.3==3.3;
Compares (float)3.3 and 3.3 for equality and assigns the result to x. The left side of the comparison has type float because of the cast, while the right side has type double which is the default type for floating point constants.
The value 3.3 cannot be represented exactly in binary floating point, so the actual value stored is an approximation. This approximated value will be different for types float and double due to their differing precisions, so the equality will evaluate to false, i.e. 0. This is the value that gets assigned to x.
Regarding your comment on why x is 1 when the number you're checking is 3.5, that is because 3.5 can be represented exactly in binary, and both types have the precision to store that value, so they compare equal.
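One way to check this for any value is to round-trip it through float and compare the result with the original double. The following is only a sketch: the name survives_float_roundtrip is made up for illustration, and it assumes float is narrower than double, as on IEEE 754 platforms.

```c
#include <stdbool.h>

/* Hypothetical helper: returns true when `value` comes back unchanged
   after being narrowed to float, i.e. when (float)value == value holds. */
bool survives_float_roundtrip(double value)
{
    /* volatile forces the value to be stored with float precision,
       even if the compiler keeps extra precision in registers. */
    volatile float narrowed = (float)value;

    /* narrowed is promoted back to double for the comparison;
       that promotion is exact, so any difference comes from the cast. */
    return (double)narrowed == value;
}
```

Under those assumptions, survives_float_roundtrip(3.5) is true and survives_float_roundtrip(3.3) is false, which matches the two outputs in the question.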

The constant 3.3 has type double, but you are comparing it with a float (produced by the cast, which loses precision).
The value of 3.3 as a double is 3.299999999999999822, whereas the same value as a float is 3.299999952, and these are clearly unequal. The result will be true (i.e. 1.000000) if you cast the other 3.3 to float as well.
Rather than:
x = (float) 3.3 == 3.3; // float != double precision (precision loss)
If you do this:
x = (float) 3.3 == (float) 3.3; // converting both to make precision equal
Or,
x = (double) 3.3 == (double) 3.3; // converting . . . (same)
In other words, the comparison will be true if you convert both expressions to the same type.
Also, notice that 3.5 is exactly 3.50000000... in both float and double, so no precision is lost in the cast and you get 1.000000. That is not the case with 3.3.

float has less precision than double. The constant 3.3 defaults to double and has the value 3.299999999999999822; the same constant converted to float is 3.29999995231628418.
The comparison 3.299999999999999822 == 3.29999995231628418 is false, i.e. 0.
Given the precedence rules, the expression amounts to x = ((float)3.5 == 3.5);: the comparison is evaluated first and the result is assigned to x.
When there is no cast, both constants default to double, so naturally the result of the comparison between them is true, i.e. 1.
As for the comparison between 3.5 as double and as float being true, it has to do with the binary conversion. 3.3 is subject to an approximation because its binary expansion goes on indefinitely, so the exact value cannot be represented in either double or float, whereas 3.5 is exactly representable in both.

To see loss of precision with float:
#include <stdio.h>
int main(){
float x = 3.3;
double d = 3.3;
printf("%12.9f %12.9lf\n",x, d);
return 0;
}
Output:
3.299999952 3.300000000

3.3 in binary is
11.0100110011001100110011001100110011001100110011
When converted to float some bits of precision are truncated
11.0100110011001100110011001100110
When converted back to double, computer just uses 0's
11.0100110011001100110011001100110000000000000000
So
3.3                   11.0100110011001100110011001100110011001100110011
(float)3.3            11.0100110011001100110011001100110
(double)((float)3.3)  11.0100110011001100110011001100110000000000000000
3.5 for comparison
3.5                   11.1000000000000000000000000000000000000000000000
(float)3.5            11.1000000000000000000000000000000
(double)((float)3.5)  11.1000000000000000000000000000000000000000000000
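The zero-filling can be observed on the raw bits. This is a rough sketch that assumes IEEE 754 binary32 and binary64 layouts (23 and 52 stored significand bits, so a double has 29 extra low bits); the function names are made up for this example.

```c
#include <stdint.h>
#include <string.h>

/* Copies the raw bits of a float into an integer
   (assumes a 32-bit IEEE 754 float). */
uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Copies the raw bits of a double into an integer
   (assumes a 64-bit IEEE 754 double). */
uint64_t double_bits(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits;
}

/* Returns the 29 lowest significand bits of a double, i.e. the bits
   that have no counterpart in a float's 23-bit significand field. */
uint64_t extra_double_bits(double d)
{
    return double_bits(d) & ((UINT64_C(1) << 29) - 1);
}
```

With these assumptions, extra_double_bits((double)3.3f) is 0 because the widened float was padded with zeros, while extra_double_bits(3.3) is nonzero. For 3.5 the extra bits are zero either way, which is why the comparison succeeds.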

Values like 3.3 cannot be represented exactly in a finite number of bits, just like the result of 1/3 cannot be represented exactly in a finite number of decimal digits, so you wind up storing an approximation of the value.
The float approximation is different from the double approximation, so the comparison (float) 3.3 == 3.3 fails.
By contrast, 3.5 can be represented exactly in both float and double types, so the comparison (float) 3.5 == 3.5 succeeds.
Just like with integer types, the significand of a floating-point type is a sum of powers of 2: the value 3.5 is represented as 1.75 * 2^1, and the binary representation of the significand 1.75 is 1.11 (base 2), i.e. 1 * 2^0 + 1 * 2^-1 + 1 * 2^-2, or 1 + 0.5 + 0.25.
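The decomposition into a significand and a power of 2 can be reproduced with the standard frexp function. A small sketch follows; split_normalized is a hypothetical name, and it is only meant for positive, finite, nonzero inputs.

```c
#include <math.h>

/* Splits `value` into significand * 2^exponent with the significand
   in [1, 2), the normalized form used in the explanation above.
   Intended for positive, finite, nonzero values only. */
void split_normalized(double value, double *significand, int *exponent)
{
    int e;
    double m = frexp(value, &e); /* frexp returns m in [0.5, 1) */

    *significand = m * 2.0;      /* rescale to [1, 2); exact, power of 2 */
    *exponent = e - 1;           /* compensate for the rescaling */
}
```

For 3.5 this yields a significand of 1.75 and an exponent of 1, and 1.75 is exactly 1 + 0.5 + 0.25.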

The issue is that you lose precision when converting the value 3.3, which is a literal of type double, to a float value with the cast (float)3.3. This loss of precision is irreversible, even though the comparison operator == will promote the left operand back to type double.
So the issue is that
(double)((float)3.3) == (double)3.3
will be false since the cast of 3.3 to float loses precision. For 3.5, in contrast, the result will be true, because 3.5 can be represented exactly as a float, just as it can as a double.
Actually, the situation can be compared to casting a value from a higher rank to a lower and then back like in the following snippet:
unsigned int x = 257;
unsigned char y = x;
unsigned int x2 = y;
printf("%d\n", x==x2); // 0
x = 255;
y = x;
x2 = y;
printf("%d\n", x==x2); // 1

Related

why if(a==2.3) evaluates false when float a=2.3 [duplicate]

#include<stdio.h>
int main(void)
{
float a = 2.3;
if(a == 2.3) {
printf("hello");
}
else {
printf("hi");
}
}
It prints "hi", which means the if condition evaluates to false.
#include<stdio.h>
int main(void)
{
float a = 2.5;
if(a == 2.5)
printf("Hello");
else
printf("Hi");
}
It prints "Hello".
The variable a is a float that holds some value close to the mathematical value 2.3.
The literal 2.3 is a double that also holds some value close to the mathematical value 2.3, but because double has greater precision than float, this may be a different value from the value of a. Both float and double can only represent a finite number of values, so there are necessarily mathematical real numbers that cannot be represented exactly by either of those two types.
In the comparison a == 2.3, the left operand is promoted from float to double. This promotion is exact and preserves the value (as all promotions do), but as discussed above, that value may be a different one from that of the 2.3 literal.
To make a comparison between floats, you can use an appropriate float literal:
assert(a == 2.3f);
//             ^
2.3 in binary representation is 01000000000100110011001100110011...
so you are not able to set a float to exactly 2.3.
Printed with double precision, the stored value is something similar: 2.299999952316284.
You converted a double to float when you wrote:
float a = 2.3;
The if checks whether the float a (2.299999952316284) is equal to the double 2.3, which it is not.
You should write:
float a = 2.3f;
and you can check:
if (a == 2.3f) {
...
}
I would rather test with:
if (fabs(a - 2.3f) < 0.00001) {
...
}
The value 2.5 represented with bits is: 01000000001000000000000000000000
EDIT: fabs is declared in <math.h> (or <cmath> in C++).
Read this: article
Comparing floating point values is not as easy as it might seem, have a look at Most effective way for float and double comparison.
It all boils down to the fact that floating point numbers are not exact (well, most are not). Usually you compare 2 floats by allowing a small error window (epsilon):
if( fabs(a - 2.3f) < epsilon) { ... }
where epsilon is small enough for your calculation, but not too small (bigger than machine epsilon).
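As a minimal sketch of the error-window idea, the comparison can be wrapped in a small function. The name approx_equal and the tolerance used in the usage note are assumptions chosen for illustration, not recommended values.

```c
#include <math.h>
#include <stdbool.h>

/* Absolute-tolerance comparison: true when a and b differ by less
   than epsilon. The caller picks an epsilon that suits the magnitude
   of the values being compared. */
bool approx_equal(double a, double b, double epsilon)
{
    return fabs(a - b) < epsilon;
}
```

With float a = 2.3;, the test a == 2.3 is false, but approx_equal(a, 2.3, 1e-5) is true because the two values differ by only about 5e-8.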

behaviour of float in C

#include<stdio.h>
int main()
{
float f = 0.1;
double d = 0.1;
printf("%zu %zu %zu %zu\n", sizeof(f), sizeof(0.1f), sizeof(0.1), sizeof(d));
return 0;
}
Output
$ ./a.out
4 4 8 8
As the code above shows, sizeof(0.1) and sizeof(0.1f) are not the same.
sizeof(0.1) is 8 bytes, while sizeof(0.1f) is 4 bytes.
But when the value is assigned to the float variable f, it is automatically truncated to 4 bytes.
In the code below, however, comparing with float x does not truncate: the 4 bytes of the float are compared with the 8 bytes of 0.1. The value of float x matches 0.1f, as both are 4 bytes.
#include<stdio.h>
int main()
{
float x = 0.1;
if (x == 0.1)
printf("IF");
else if (x == 0.1f)
printf("ELSE IF");
else
printf("ELSE");
}
Output
$ ./a.out
ELSE IF
Why and how does it truncate when assigning but not when comparing?
A floating point literal without a suffix is of type double. Suffixing it with an f makes a literal of type float.
When assigning to a variable, the right operand to = is converted to the type of the left operand, thus you observe truncation.
When comparing, the operands to == are converted to a common type, the wider of the two operand types, so x == 0.1 is like (double)x == 0.1, which is false since (double)(float)0.1 is not equal to 0.1 due to rounding. In x == 0.1f, both operands have type float, which results in equality on your machine.
Floating point math is tricky, read the standard for more details.
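The types involved can be inspected with C11 _Generic. This is just an illustrative sketch: TYPE_NAME and comparison_type_name are hypothetical names, and x + 0.1 is used as a stand-in because + applies the same usual arithmetic conversions to its operands as ==.

```c
/* Maps an expression to the name of its type (requires C11 _Generic). */
#define TYPE_NAME(expr) _Generic((expr), \
    float: "float",                      \
    double: "double",                    \
    long double: "long double",          \
    default: "other")

/* Reports the common type that a float and the constant 0.1 are
   converted to before a binary operator such as == or + is applied. */
const char *comparison_type_name(void)
{
    float x = 0.1;             /* the double 0.1 is converted to float here */
    return TYPE_NAME(x + 0.1); /* float combined with double gives double */
}
```

TYPE_NAME(0.1) gives "double", TYPE_NAME(0.1f) gives "float", and comparison_type_name() gives "double", which is the type in which x == 0.1 is evaluated.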
A floating point constant like 0.1 is a double unless specified as a float like 0.1f. The line
float f = 0.1;
means create a double with value 0.1, convert it to float, and lose precision in the process. The lines
float x = 0.1;
if (x == 0.1)
will cause x to be implicitly converted to double, but it will have a slightly different value than, e.g., double x = 0.1;
The f suffix in 0.1f tells the compiler to store the constant as a float and not as a double.
So the float 0.1 is not equal to 0.1; it is equal to 0.1f.
When you write 0.1, it is considered a double by default. The suffix f explicitly makes it a float.
As for the second question: floats are stored according to the IEEE standard, so execution goes to the else if branch because 0.1f converted to double is not the same as the double 0.1.
https://en.wikipedia.org/wiki/Floating_point
0.1 is a double value whereas 0.1f is a float value.
The reason we can write float x=0.1 as well as double x=0.1 is due to implicit conversions.
But by using the suffix f you make it a float type.
In this:
if(x == 0.1)
the condition is false because x does not hold exactly 0.1; it differs at some places after the decimal point. There is also a conversion of x to the higher type, i.e. double.
Converting to float and then to double loses information, since double has higher precision than float, so the values differ.

float vs double comparison [duplicate]

#include <stdio.h>
int main(void)
{
float me = 1.1;
double you = 1.1;
if ( me == you ) {
printf("I love U");
} else {
printf("I hate U");
}
}
This prints "I hate U". Why?
Floats use binary fractions. If you convert 1.1 to float, this will result in a binary representation.
Each bit to the right of the binary point halves the weight of the digit, just as each decimal digit to the right divides it by ten. Each bit to the left of the point doubles it (times ten for decimal).
in decimal: ... 0*2 + 1*1 + 0*0.5 + 0*0.25 + 0*0.125 + 1*0.0625 + ...
binary:         0     1  .  0       0        0         1          ...
2's exp:        1     0     -1      -2       -3        -4
(exponent to the power of 2)
The problem is that 1.1 cannot be converted exactly to a binary representation. A double, however, has more significant digits than a float.
When you compare the values, the float is first converted to double. But as the computer does not know the original decimal value, it simply fills the trailing digits of the new double with zeros, while the double value is more precise. So the two do not compare equal.
This is a common pitfall when using floats. For this and other reasons (e.g. rounding errors), you should not use exact comparison for equality or inequality, but a ranged compare using a small tolerance such as the machine epsilon:
#include <float.h>
#include <math.h>
...
// check for "almost equal"
if ( fabs(fval - dval) <= FLT_EPSILON )
...
Note the usage of FLT_EPSILON, the machine epsilon for single precision float values (the difference between 1 and the next representable float). Also note the <=, not < (as the latter would actually require an exact match).
If you compare two doubles, you might use DBL_EPSILON, but be careful with that.
Depending on intermediate calculations, the tolerance has to be increased (you cannot reduce it further than epsilon), as rounding errors, etc. will sum up. Floats in general are not forgiving with wrong assumptions about precision, conversion and rounding.
Edit:
As suggested by @chux, this might not work as expected for larger values, as you have to scale EPSILON according to the exponents. This conforms to what I stated: float comparison is not as simple as integer comparison. Think before comparing.
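One common way to scale the tolerance is to multiply FLT_EPSILON by the magnitude of the larger operand. The sketch below is only one possible approach under that assumption: the name nearly_equal_scaled and the ulps factor are invented for illustration, and values near zero, infinities and NaN are not treated specially.

```c
#include <float.h>
#include <math.h>
#include <stdbool.h>

/* Relative comparison: the allowed difference grows with the magnitude
   of the operands. `ulps` is a hypothetical tuning factor that says how
   many machine epsilons (relative to the larger operand) are tolerated. */
bool nearly_equal_scaled(float a, float b, float ulps)
{
    float diff = fabsf(a - b);
    float larger = fabsf(a) > fabsf(b) ? fabsf(a) : fabsf(b);

    return diff <= FLT_EPSILON * larger * ulps;
}
```

For example, 1.0e10f and 1.0e10f + 1024.0f are adjacent floats whose difference is far greater than FLT_EPSILON, so an unscaled check rejects them, while nearly_equal_scaled with ulps set to 4.0f accepts them.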
In short, you should NOT use == to compare floating point values.
For example:
float i = 1.1; // or double
float j = 1.1; // or double
This condition
(i==j) == true // is not always valid
For a correct comparison you should use an epsilon (a very small number):
(fabs(i-j)<epsilon)== true // this condition is valid
The question simplifies to why do me and you have different values?
Usually, C floating point is based on a binary representation. Many compilers & hardware follow IEEE 754 binary32 and binary64. Rare machines use a decimal, base-16 or other floating point representation.
OP's machine certainly does not represent 1.1 exactly as 1.1, but to the nearest representable floating point number.
Consider the below which prints out me and you to high precision. The previous representable floating point numbers are also shown. It is easy to see me != you.
#include <math.h>
#include <stdio.h>
int main(void) {
float me = 1.1;
double you = 1.1;
printf("%.50f\n", nextafterf(me,0)); // previous float value
printf("%.50f\n", me);
printf("%.50f\n", nextafter(you,0)); // previous double value
printf("%.50f\n", you);
return 0;
}
Output:
1.09999990463256835937500000000000000000000000000000
1.10000002384185791015625000000000000000000000000000
1.09999999999999986677323704498121514916420000000000
1.10000000000000008881784197001252323389053300000000
But it is more complicated: C allows code to use higher precision for intermediate calculations depending on FLT_EVAL_METHOD. So on another machine, where FLT_EVAL_METHOD==1 (evaluate all FP to double), the compare test may pass.
Comparing for exact equality is rarely used in floating point code, aside from comparison to 0.0. More often code uses an ordered compare a < b. Comparing for approximate equality involves another parameter to control how near. @R.. has a good answer on that.
Because you are comparing two floating point values!
Floating point comparison is not exact because of rounding errors. Simple values like 1.1 or 0.1 cannot be precisely represented using binary floating point numbers, and the limited precision of floating point numbers means that slight changes in the order of operations can change the result. Different compilers and CPU architectures store temporary results at different precisions, so results will differ depending on the details of your environment. For example:
float a = 0.1 + 0.2;
double b = 0.3;
if(a == b) // false on typical IEEE 754 systems
if(a <= b) // also false: a was rounded up when stored as a float
Even
if(fabs(a-b) < 0.0001) // wrong - don't do this
This is a bad way to do it because a fixed epsilon (0.0001), chosen because it “looks small”, could actually be way too large when the numbers being compared are very small as well.
I personally use the following method; maybe this will help you:
#include <iostream> // std::cout
#include <cmath> // std::abs
#include <algorithm> // std::min
using namespace std;
#define MIN_NORMAL 1.17549435E-38f
#define MAX_VALUE 3.4028235E38f
bool nearlyEqual(float a, float b, float epsilon) {
float absA = std::abs(a);
float absB = std::abs(b);
float diff = std::abs(a - b);
if (a == b) {
return true;
} else if (a == 0 || b == 0 || diff < MIN_NORMAL) {
return diff < (epsilon * MIN_NORMAL);
} else {
return diff / std::min(absA + absB, MAX_VALUE) < epsilon;
}
}
This method passes tests for many important special cases, for different a, b and epsilon.
And don't forget to read What Every Computer Scientist Should Know About Floating-Point Arithmetic!

Why storing a double expression in a variable before a cast to int can lead to different results than casting it directly?

I wrote this short program to test the conversion from double to int:
#include <stdio.h>
int main() {
int a;
int d;
double b = 0.41;
/* Cast from variable. */
double c = b * 100.0;
a = (int)(c);
/* Cast expression directly. */
d = (int)(b * 100.0);
printf("c = %f \n", c);
printf("a = %d \n", a);
printf("d = %d \n", d);
return 0;
}
Output:
c = 41.000000
a = 41
d = 40
Why do a and d have different values even though they are both the product of b and 100?
The C standard allows a C implementation to compute floating-point operations with more precision than the nominal type. For example, the Intel 80-bit floating-point format may be used when the type in the source code is double, for the IEEE-754 64-bit format. In this case, the behavior can be completely explained by assuming the C implementation uses long double (80 bit) whenever it can and converts to double when the C standard requires it.
I conjecture what happens in this case is:
In double b = 0.41;, 0.41 is converted to double and stored in b. The conversion results in a value slightly less than .41.
In double c = b * 100.0000;, b * 100.0000 is evaluated in long double. This produces a value slightly less than 41.
That expression is used to initialize c. The C standard requires that it be converted to double at this point. Because the value is so close to 41, the conversion produces exactly 41. So c is 41.
a = (int)(c); produces 41, as normal.
In d = (int)(b * 100.000);, we have the same multiplication as before. The value is the same as before, something slightly less than 41. However, this value is not assigned to or used to initialize a double, so no conversion to double occurs. Instead, it is converted to int. Since the value is slightly less than 41, the conversion produces 40.
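The conjectured behavior can be imitated explicitly by doing the multiplication in long double by hand. This is only a sketch under stated assumptions: the function names are invented here, the rounding mode is assumed to be round-to-nearest, and the two functions only differ on platforms where long double is wider than double (for example the x87 80-bit format).

```c
/* Truncates b * 100.0 after the product has been rounded to double,
   like the path through the variable c in the question. */
int truncate_via_double(double b)
{
    /* volatile forces a real store, which rounds the product to double. */
    volatile double product = b * 100.0;
    return (int)product;
}

/* Truncates b * 100 computed in long double without ever rounding the
   product to double, like the direct cast that produced d. */
int truncate_via_long_double(double b)
{
    long double product = (long double)b * 100.0L;
    return (int)product;
}
```

With b = 0.41, truncate_via_double gives 41 under round-to-nearest. truncate_via_long_double gives 40 where long double carries extra precision, and 41 where long double has the same format as double.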
The compiler can infer that c has to be initialized with 0.41 * 100.0 and does that better than the calculation of d.
The crux of the problem is that 0.41 is not exactly representable in IEEE 754 64-bit binary floating point. The actual value (with only enough precision to show the relevant part) is 0.409999999999999975575..., while 100 can be represented exactly. Multiplying these together should yield 40.9999999999999975575..., which is again not quite representable. In the likely case that the rounding mode is towards nearest, zero, or negative infinity, this should be rounded to 40.9999999999999964.... When cast to an int, this is rounded to 40.
The compiler is allowed to do calculations with higher precision, however, and in particular may replace the multiplication in the assignment of c with a direct store of the computed value.
Edit: I miscalculated the largest representable number less than 41, the correct value is approximately 40.99999999999999289.... As both Eric Postpischil and Daniel Fischer correctly point out, even the value calculated as a double should be rounded to 41 unless the rounding mode is towards zero or negative infinity. Do you know what the rounding mode is? It makes a difference, as this code sample shows:
#include <stdio.h>
#include <fenv.h>
#pragma STDC FENV_ACCESS ON
int main(void)
{
int roundMode = fegetround( );
volatile double d1;
volatile double d2;
volatile double result;
volatile int rounded;
fesetround(FE_TONEAREST);
d1 = 0.41;
d2 = 100;
result = d1 * d2;
rounded = result;
printf("nearest rounded=%i\n", rounded);
fesetround(FE_TOWARDZERO);
d1 = 0.41;
d2 = 100;
result = d1 * d2;
rounded = result;
printf("zero rounded=%i\n", rounded);
fesetround(roundMode);
return 0;
}
Output:
nearest rounded=41
zero rounded=40

Dividing integers in C rounds the value down / gives zero as a result

I'm trying to do some arithmetic on integers. The problem is when I'm trying to do division to get a double as a result, the result is always 0.00000000000000000000, even though this is obviously not true for something like ((7 * 207) / 6790). I have tried type-casting the formulas, but I still get the same result.
What am I doing wrong and how can I fix it?
int o12 = 7, o21 = 207, numTokens = 6790;
double e11 = ((o12 * o21) / numTokens);
printf("%.20lf", e11); // prints 0.00000000000000000000
Regardless of the actual values, the following holds:
int / int = int
The output will not be cast to a non-int type automatically.
So the output will be floored to an int when doing division.
What you want to do is force any of these to happen:
double / int = double
float / int = float
int / double = double
int / float = float
The above involves an automatic widening conversion - note that only one needs to be a floating point value.
You can do this by either:
Putting a (double) or (float) before one of your values to cast it to the corresponding type or
Changing one or more of the variables to double or float
Note that a cast like (double)(int / int) will not work, as this first does the integer division (which returns an int, and thus floors the value) and only then casts the result to double (this will be the same as simply trying to assign it to a double without any casting, as you've done).
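The difference between casting the result and casting an operand can be seen side by side. The two function names below are made up for this sketch.

```c
/* The integer division happens first, so the fractional part is
   already gone by the time the cast is applied. */
double ratio_cast_too_late(int numerator, int denominator)
{
    return (double)(numerator / denominator);
}

/* Casting one operand first makes the other operand convert to double
   as well, so the division is done in floating point. */
double ratio_cast_first(int numerator, int denominator)
{
    return (double)numerator / denominator;
}
```

With the values from the question, ratio_cast_too_late(7 * 207, 6790) is 0.0, while ratio_cast_first(7 * 207, 6790) is roughly 0.2134.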
It is certainly true for an expression such as ((7 * 207) / 6790) that the result is 0, or 0.0 if you think in double.
The expression only has integers, so it will be computed as an integer multiplication followed by an integer division.
You need to cast to a floating-point type to change that, e.g. ((7 * 207) / 6790.0).
Many people seem to expect the right-hand side of an assignment to be automatically "adjusted" by the type of the target variable: this is not how it works. The result is converted, but that doesn't affect any "inner" operations in the right-hand expression. In your code:
e11 = ((o12 * o21) / numTokens);
All of o12, o21 and numTokens are integer, so that expression is evaluated as integer, then converted to floating-point since e11 is double.
This is like doing
const double a_quarter = 1 / 4;
This is just a simpler case of the same problem: the expression is evaluated first, then the result (the integer 0) is converted to double and stored. That's how the language works.
The fix is to cast:
e11 = ((o12 * o21) / (double) numTokens);
You must cast these numbers to double before division. When you perform division on int the result is also an integer rounded towards zero, e.g. 1 / 2 == 0, but 1.0 / 2.0 == 0.5.
If the operands are integer, C will perform integer arithmetic. That is, 1/4 == 0. However, if you force an operand to be double, then the arithmetic will take fractional parts into account. So:
int a = 1;
int b = 4;
double c = 1.0;
double d = a/b; // d == 0.0
double e = c/b; // e == 0.25
double f = (double)a/b; // f == 0.25
