This question already has answers here:
Is floating point math broken?
(31 answers)
Closed 2 years ago.
I wrote this piece of code on my computer and the result is 7 instead of 8 (the correct result ... I think).
I don't know why... Can somebody help me?
#include <stdio.h>
int main() {
    int num;
    num = (68/10.0 - 68/10)*10;
    printf("the result %d", num);
    return 0;
}
A double can typically represent exactly about 2^64 different numbers. 68/10.0 is not one of them.
As a binary64, 68/10.0 is about
6.7999999999999998223643161..., the closest representable value to 6.8 (every finite double is a dyadic rational). @AntonH
68/10 is an integer division with a quotient of 6.
(68/10.0 - 68/10)*10 is thus about 7.9999999999999982236431606...
Assigning that to an int yields 7, not 8: the fractional part is discarded even though the value is very close to 8.
When converting a floating-point value to an integer, consider rounding to the closest value rather than truncating (lround is declared in <math.h>).
num = lround((68/10.0 - 68/10)*10);
This question already has answers here:
Why are floating point numbers inaccurate?
(5 answers)
Closed 3 months ago.
#include <stdio.h>
typedef struct ElapsedTime_struct {
    int decadesVal;
    int yearsVal;
} ElapsedTime;
ElapsedTime ConvertToDecadesAndYears(int totalYears) {
    ElapsedTime tempVal;
    tempVal.decadesVal = totalYears/10;
    tempVal.yearsVal = (((double) totalYears/10) - tempVal.decadesVal) * 10;
    return tempVal;
}
int main(void) {
    ElapsedTime elapsedYears;
    int totalYears;
    scanf("%d", &totalYears);
    elapsedYears = ConvertToDecadesAndYears(totalYears);
    printf("%d decades and %d years\n", elapsedYears.decadesVal, elapsedYears.yearsVal);
    return 0;
}
My logic is that if you take the total number of years (26) and divide the integer value by 10, you get the number of decades: int (26/10) = 2.
For the number of years, if you take the double value of 26/10, you get 2.6000.
Subtracting the number of decades (2.6 - 2) then gives the fractional part for the years (0.6).
I then multiply by 10 to get a whole value (6), so together it's 2 decades and 6 years.
However, when I run the code for some inputs (34 total years), I get 3 decades and 3 years, whereas I should get 3 decades and 4 years.
I am confused as to why this is happening.
Floating point math is inexact. Values such as 2.6 and 3.3 cannot be exactly represented in binary floating point. You instead end up with a value that is either slightly larger or slightly smaller.
The latter case is what you're seeing. 3.4 is stored as roughly 3.39999999999999991. Subtracting 3 from that and multiplying by 10 gives you 3.9999999999999991, then when you convert that to an int the fractional part is truncated giving you 3.
That being said, you don't need floating point operations here. You can instead use the modulus operator % to get the remainder when dividing by 10.
tempVal.yearsVal = totalYears%10;
This question already has answers here:
Why are floating point numbers inaccurate?
(5 answers)
Is floating point math broken?
(31 answers)
Closed 4 months ago.
I want to extract the decimal part of a float variable by subtracting the whole part, but I get a wrong value:
#include <stdio.h>
int main(){
    int k;
    float a=12.36,i;
    k = (int)a;
    i = a - k;
    i*=10;
    printf("%f",i);
    return 0;
}
The output is 3.599997, not 3.6. Is there a way to solve this?
Edit: I know it's because of the binary conversion. I'm asking whether there is a way to get the right result, not what causes it. Thanks for the replies anyway.
Edit 2: Sadly it's not a matter of display. I want the stored value to be 3.6 (in this case) because I need it in other calculations.
This question already has answers here:
Why does dividing two int not yield the right value when assigned to double?
(10 answers)
Closed 6 years ago.
I have an array of double:
double theoretical_distribution[] = {1/21, 2/21, 3/21, 4/21, 5/21, 6/21};
And I am trying to compute its entropy as:
double entropy = 0;
for (int i = 0; i < sizeof(theoretical_distribution)/sizeof(*theoretical_distribution); i++) {
    entropy -= (theoretical_distribution[i] * (log10(theoretical_distribution[i])/log10(arity)));
}
However, I am getting NaN. I have checked the part
(theoretical_distribution[i] * (log10(theoretical_distribution[i])/log10(arity)))
and found that it returns NaN itself, so I assume it's the culprit. But all it's supposed to be is a simple base conversion of the log. Am I missing some detail about the maths of it?
Why is it evaluating to NaN?
You are passing 0 to the log10 function. log10(0) returns negative infinity, and multiplying that by 0 gives NaN.
This is because your array theoretical_distribution is being populated with constant values that result from integer computations, all of which have a denominator larger than the numerator.
You probably intended floating computations, so make at least one of the numerator or denominator a floating constant.
This question already has answers here:
Floating point inaccuracy examples
(7 answers)
C++ floating point precision [duplicate]
(5 answers)
Closed 8 years ago.
I found this code snippet on Page 174, A Book on C -Al Kelley, Ira Pohl .
int main()
{
    int cnt=0; double sum=0.0,x;
    for( x=0.0 ;x!= 9.9 ;x+=0.1)
    {
        sum=sum +x;
        printf("cnt = %5d\n",cnt++);
    }
    return 0;
}
and it became an infinite loop, as the book said it would. It didn't mention the precise reason except saying that it had to do with the accuracy of the machine.
I modified the code to check if
x=9.9
would ever become true, i.e. x was attaining 9.9 by adding the following lines
diff=x-9.9;
printf("cnt =10%d \a x =%10.10lf dif=%10.10lf \n",++cnt,x,diff);
and I got the following lines among the output:
cnt =1098 x =9.7000000000 dif=-0.2000000000
cnt =1099 x =9.8000000000 dif=-0.1000000000
cnt =10100 x =9.9000000000 dif=-0.0000000000
cnt =10101 x =10.0000000000 dif=0.1000000000
cnt =10102 x =10.1000000000 dif=0.2000000000
If x is attaining the value 9.9 exactly, why is it still an infinite loop?
You are simply printing the number with too few digits to notice that it isn't exact. Try something like this:
#include <stdio.h>
int main()
{
    double d = 9.9;
    if(d == 9.9)
    {
        printf("Equal!");
    }
    else
    {
        printf("Not equal! %.20f", d);
    }
}
Output on my machine:
Not equal! 9.90000000000000035527
The book is likely trying to teach you to never use == or != operators to compare floating point variables. Also for the same reason, never use floats as loop iterators.
The problem is that most floating-point implementations are based on IEEE 754. See http://en.wikipedia.org/wiki/IEEE_floating_point
The consequence is that numbers are built in base 2 (binary formats).
The number 9.9 can never be represented exactly in base 2.
David Goldberg's "What Every Computer Scientist Should Know About Floating-Point Arithmetic" (reprinted in the "Numerical Computation Guide") gives an exact statement about it:
Several different representations of real numbers have been proposed,
but by far the most widely used is the floating-point representation.
Floating-point representations have a base b (which is always assumed to
be even) and a precision p. If b = 10 and p = 3, then the number 0.1 is
represented as 1.00 × 10^-1. If b = 2 and p = 24, then the decimal
number 0.1 cannot be represented exactly, but is approximately
1.10011001100110011001101 × 2^-4.
It is safest to assume that two floating-point numbers are never 'exactly' equal (unless one is a copy of the other).
Computers work in binary, in other words in base 2, and that includes floating point. Just like base 10, base 2 has numbers that it cannot write with a finite number of digits. For example, try to write the fraction 10/3 in base 10: you'll end up with infinitely many 3s. In binary, you cannot even write 0.1 (decimal) exactly; you get the recurring pattern 0.0001100110011... (binary).
This video will do better to explain http://www.youtube.com/watch?v=PZRI1IfStY0
This question already has answers here:
Why adding these two double does not give correct answer? [duplicate]
(2 answers)
Closed 8 years ago.
I'm a bit of a C newbie, but this problem is really confusing me.
I have a double variable with the value 436553940.0000000000 (it was cast from an int) and another double variable with the value 0.095832496.
When I add them, my result should be 436553940.0958324*96*, however I get 436553940.0958324*67*.
Why does this happen and how can I prevent it from happening?
The number you expect is simply not representable by a double. The value you receive is instead a close approximation based on rounding results:
In [9]: 436553940.095832496
Out[9]: 436553940.09583247
In [18]: 436553940.095832496+2e-8
Out[18]: 436553940.09583247
In [19]: 436553940.095832496+3e-8
Out[19]: 436553940.0958325
In [20]: 436553940.095832496-2e-8
Out[20]: 436553940.09583247
In [21]: 436553940.095832496-3e-8
Out[21]: 436553940.0958324
You've just run out of significand bits.
Doubles are not able to represent every number. We can write some C++ code (that implements doubles in the same way) to show this.
#include <cstdio>
#include <cmath>
int main() {
    double x = 436553940;
    double y = 0.095832496;
    double sum = x + y;
    printf("prev: %50.50lf\n", std::nextafter(sum, 0));
    printf("sum: %50.50lf\n", sum);
    printf("next: %50.50lf\n", std::nextafter(sum, 500000000));
}
This code computes the sum of the two numbers you are talking about, and stores it as sum. We then compute the next representable double before that number, and after that number.
Here's the output:
[11:43am][wlynch@watermelon /tmp] ./foo
prev: 436553940.09583240747451782226562500000000000000000000000000
sum: 436553940.09583246707916259765625000000000000000000000000000
next: 436553940.09583252668380737304687500000000000000000000000000
So, we are not able to have the calculation equal 436553940.0958324_96_, because that number is not a valid double. So the IEEE-754 standard (and your compiler) defines some rules that tell us how the number should be rounded, to reach the nearest representable double.