C and casting - why does this program stop at 6? - c

I edited a C program for my assignment, previously there wasn't typecasting and the iteration stopped at i=1, now with the typecasting it stops at i=6.
Any ideas why? Thanks in advance!
#include <stdio.h>
#include <conio.h> /* non-standard header that declares getch() */
int main(void)
{
int i = 0;
double d = 0.0;
while ( (i == (int) (d * 10)) && (i < 10) )
{
i = i + 1;
d = (double) (d + 0.1);
printf("%d %lf\n", i, d);
}
printf("%d %lf\n", i, d);
getch();
return 0;
}

Floating point arithmetic is inexact. The value 0.1 is not exactly representable in binary floating point. The recommended reading here is: What Every Computer Scientist Should Know About Floating-Point Arithmetic.
At some point in the program, d becomes slightly less than i/10 due to rounding error, and so your loop terminates.

In addition to the other answers, I'd like to answer the question why the loop terminates earlier with the condition i == (d * 10) than with i == (int) (d * 10).
In the first case, int value at the left side of == is promoted to double, so the inequality happens when the accumulated error in d*10 is either positive or negative (e.g. 0.999999 or 1.000001).
In the 2nd case, the right side is truncated to int, so the inequality happens only when the error is negative (e.g. 5.999999). Therefore, the 1st version would fail earlier.

As has been stated many times before, the reason this doesn't work is that binary floating point numbers cannot represent all decimal fractions exactly; it just isn't possible. To read more, check out this really great article:
What Every Programmer Should Know About Floating-Point Arithmetic
Now, on the more practical side of things, when using floating point and comparing it to another number, you should almost always round the value or use an epsilon value, like this:
if (fabs(doubleValue - intValue) < 0.00001) // 0.00001 is a margin of error for floating point arithmetic; fabs is declared in <math.h>
// the two numbers are equal (or close to it)

Related

Series: 1 + 1/3 + 1/5 +...upto N terms

I was recently asked this question in a programming test. I can't seem to understand why I am getting the answer '1'. I am a beginner in the C programming language.
Here is my code:
#include<stdio.h>
int main()
{
float c = 0;
int n, i = 1;
printf("Enter the number here: ");
n = getchar();
while (i <= 2*n - 1)
{
c = c + (1/i);
i = i + 2;
}
printf("%f", c);
}
I have already tried using a for loop, but the answer remains the same. Any help would be appreciated!
The problem in your code lies on this line:
c = c + (1/i);
Here, the operation performed inside the parentheses is integer division! So, when i has any value greater than 1, the result will be zero. This zero is then converted to a float value.
To force the compiler to use floating point division, use this:
c = c + (1.0/i);
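As a sketch of the fix in context, the series can be wrapped in a small function that takes the number of terms as a parameter. The name odd_reciprocal_sum is an assumption made for this example. Note also that getchar() in the question returns a character code rather than the number typed, so reading n with scanf("%d", &n) is probably what was intended.

```c
/* Hypothetical helper: returns 1 + 1/3 + 1/5 + ... for n terms.
   The literal 1.0 forces floating point division. */
double odd_reciprocal_sum(int n)
{
    double sum = 0.0;
    for (int i = 1; i <= 2 * n - 1; i += 2)
        sum += 1.0 / i;
    return sum;
}
```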
I agree with Adrian's answer.
Another issue comes from the way floating point numbers are represented: when they are added in arbitrary order, precision can be lost.
For better precision, floating point numbers of the same sign should be added from smallest to largest.
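A rough way to see the effect of summation order is to add the same terms in both directions in float and compare against a double reference. The function names below are invented for this sketch, and the harmonic series is used only because its terms shrink steadily.

```c
#include <math.h>

/* Sums 1/1 + 1/2 + ... + 1/n in float, largest term first. */
float harmonic_forward(int n)
{
    float sum = 0.0f;
    for (int k = 1; k <= n; k++)
        sum += 1.0f / (float)k;
    return sum;
}

/* Same terms, smallest term first. */
float harmonic_backward(int n)
{
    float sum = 0.0f;
    for (int k = n; k >= 1; k--)
        sum += 1.0f / (float)k;
    return sum;
}

/* Reference value computed in double, smallest term first. */
double harmonic_reference(int n)
{
    double sum = 0.0;
    for (int k = n; k >= 1; k--)
        sum += 1.0 / k;
    return sum;
}

/* Absolute error of a float result against the double reference. */
double abs_error(float approx, double reference)
{
    return fabs((double)approx - reference);
}
```

With a large n such as 1000000, the backward sum is expected to land noticeably closer to the reference, because the small terms are combined before the running total grows large enough to swallow them.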

pow() function giving wrong answer [duplicate]

I'm trying to multiply two 3-digit numbers.
I used two nested for loops, multiplied each digit of num1 by each digit of num2,
and shifted each result to the appropriate place using pow().
The problem is that 3 * pow(10, 2) is coming out as 299 instead of 300.
I haven't tried much, but I used printf to see what is actually happening at runtime, and this is what I found.
the values of tempR after shift should be
5,40,300,100,800,6000,1500,12000,90000
but are coming as
5,40,299,100,799,6000,1500,12000,89999
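#include <stdio.h>
#include <stdlib.h> // abs
#include <math.h> // pow
int main(void)
{
int result = 0; // final result; must start at 0 because it is accumulated with += below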
int tempR; // temporary for each iteration
char a[] = "345"; // number 1
char b[] = "321"; // number 2
for(int i = 2;i>= 0 ; i --)
{
for(int j = 2;j >= 0 ; j --)
{
int shift = abs(i-2 + j -2);
printf("%d\n",shift); //used to see the values of shift.
//and it is coming as expected
tempR = (int)(b[i] - '0') * (int)(a[j] - '0');
printf("%d \n",tempR); // value to tempR is perfect
tempR = tempR*pow(10,shift);
printf("%d \n",tempR); // here the problem starts
result += tempR;
}
}
printf("%d",result);
}
Although IEEE754 (ubiquitous on desktop systems) is required to return the best possible floating point value for certain operators such as addition, multiplication, division, and subtraction, and certain functions such as sqrt, this does not apply to pow.
pow(x, y) can and often is implemented as exp(y * ln (x)). Hopefully you can see that this can cause result to "go off" spectacularly when pow is used with seemingly trivial integral arguments and the result truncated to int.
There are C implementations out there that have more accurate implementations of pow than the one you have, particularly for integral arguments. If such accuracy is required, then you could move your toolset to such an implementation. Borrowing an implementation of pow from a respected mathematics library is also an option, else roll your own. Using round is also a technique, if a little kludgy if you get my meaning.
Never use floating point functions for integer calculations. Your pow result will almost never be exact. In this case it is slightly below 300 and the cast to integer makes it 299.
The pow function operates on doubles. Doubles use finite precision. Conversion back to integer chops rather than rounding.
Finite precision is like representing 1/3 as 0.333333. If you do 9 * 1/3 and chop to an integer, you'll get 2 instead of 3 because 9 * 1/3 will give 2.999997 which chops to two.
This same kind of rounding and chopping is causing you to be off by one. You could also round by adding 0.5 before chopping to an integer, but I wouldn't suggest it.
Don't pass integers through doubles and back if you expect exact answers.
Others have mentioned that pow does not yield exact results, and if you convert the result to an integer there's a high risk of loss of precision, especially since assigning a floating point value to an integer type truncates the result rather than rounding it. Read more here: Is floating point math broken?
The most convenient solution is to write your own integer variant of pow. It can look like this:
int int_pow(int num, int e)
{
int ret = 1;
while(e-- > 0)
ret *= num;
return ret;
}
Note that it will not work if e is negative, and it returns 1 when both num and e are 0. It also has no protection against overflow. It just shows the idea.
In your particular case, you could write a very specialized variant based on 10:
unsigned int pow10(unsigned int e)
{
unsigned int ret = 1;
while(e-- > 0)
ret *= 10;
return ret;
}
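Since the integer versions above have no overflow protection, here is one possible way to add it. This is only a sketch: the name checked_int_pow and the choice to reject negative bases are assumptions made to keep the check short.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical variant of int_pow: computes base^e for base >= 0 and
   e >= 0. Stores the result in *out and returns true on success.
   Returns false (leaving *out untouched) if an argument is negative
   or the result would not fit in an int. */
bool checked_int_pow(int base, int e, int *out)
{
    if (base < 0 || e < 0)
        return false;
    int ret = 1;
    while (e-- > 0) {
        /* ret * base would exceed INT_MAX exactly when ret > INT_MAX / base */
        if (base != 0 && ret > INT_MAX / base)
            return false;
        ret *= base;
    }
    *out = ret;
    return true;
}
```

The check is done before each multiplication because signed integer overflow is undefined behavior in C, so it cannot be detected reliably after the fact.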

How to compare double variables in the if statement

I am trying to compare these doubles, but the comparison doesn't seem to work correctly.
Here is the code (this is exactly my problem):
#include <stdio.h>
#include <math.h>
int main () {
int i_wagen;
double dd[20];
dd[0]=0.;
dd[1]=0.;
double abstand= 15.;
double K_spiel=0.015;
double s_rel_0= K_spiel;
int i;
for(i=1; i<=9; i++)
{
i_wagen=2*(i-1)+2;
dd[i_wagen]=dd[i_wagen-1]-abstand;
i_wagen=2*(i-1)+3;
dd[i_wagen]=dd[i_wagen-1]-s_rel_0;
}
double s_rel=dd[3-1]-dd[3];
if((fabs(s_rel) - K_spiel) == 0.)
{
printf("yes\n");
}
return(0);
}
After executing the program, it won't print the yes.
How to compare double variables in the if statement?
Take into account the limited precision of the double representation of floating point numbers!
Your problem is simple and covered in Is floating point math broken?
Floating point operations are not precise. The representation of the given number may not be precise.
For 0.1 in the standard binary64 format, the representation can be written exactly as 0.1000000000000000055511151231257827021181583404541015625
Double precision (double) gives you only 52 explicitly stored bits of significand (53 bits of precision counting the implicit leading bit), 11 bits of exponent, and 1 sign bit. Floating point numbers in C use IEEE 754 encoding on most platforms.
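The field layout described above can be inspected directly. The sketch below assumes that double is IEEE 754 binary64 and is stored in the same byte order as uint64_t, which is the case on common desktop platforms but is not required by the C standard. The struct and function names are invented for this example.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the three fields of a binary64 value. */
struct double_fields {
    unsigned sign;      /* 1 bit */
    unsigned exponent;  /* 11 bits, biased by 1023 */
    uint64_t fraction;  /* 52 explicitly stored significand bits */
};

/* Splits a double into its raw fields (assumes IEEE 754 binary64). */
struct double_fields split_double(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);  /* well-defined way to reinterpret the bytes */

    struct double_fields f;
    f.sign = (unsigned)(bits >> 63);
    f.exponent = (unsigned)((bits >> 52) & 0x7FF);
    f.fraction = bits & ((((uint64_t)1) << 52) - 1);
    return f;
}
```

For 0.1 the fraction field comes out as a repeating 1001 bit pattern that has to be cut off and rounded at bit 52, which is why the stored value differs slightly from one tenth.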
See the output of your program and the possible fix where you settle down for the variable being close to 0.0:
#include <stdio.h>
#include <math.h>
#define PRECISION 1e-6
int main (void) {
int i_wagen;
double dd[20];
dd[0]=0.;
dd[1]=0.;
double abstand= 15.;
double K_spiel=0.015;
double s_rel_0= K_spiel;
int i;
for(i=1; i<=9; i++)
{
i_wagen = 2*(i-1)+2;
dd[i_wagen] = dd[i_wagen-1]-abstand;
i_wagen = 2*(i-1)+3;
dd[i_wagen] = dd[i_wagen-1] - s_rel_0;
}
double s_rel = dd[3-1]-dd[3];
printf(" s_rel %.16f K_spiel %.16f diff %.16f \n" , s_rel, K_spiel, ((fabs(s_rel) - K_spiel)) );
if((fabs(s_rel) - K_spiel) == 0.0) // THIS WILL NOT WORK!
{
printf("yes\n");
}
// Settle down for being close enough to 0.0
if( fabs( (fabs(s_rel) - K_spiel)) < PRECISION)
{
printf("yes!!!\n");
}
return(0);
}
Output:
s_rel 0.0150000000000006 K_spiel 0.0150000000000000 diff 0.0000000000000006
yes!!!
You're comparing x to two different matrix entries: the first if compares x to coeff[0][0], the second to coeff[0][1]. So if x is greater than coeff[0][0] and less than or equal to coeff[0][1] the program will execute the final else branch. You probably want to compare x to the same matrix entry in both if statements. And in that case, the last else branch would be useless, since one of the three cases (less than, equal to or greater than) MUST be true.
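First, a note on initialization: only dd[0] and dd[1] are set explicitly. In this particular loop every element is written before it is read, so the statement:
dd[i_wagen]=dd[i_wagen-1]-abstand;
never reads an uninitialized value, but reading an element that was never written would give unpredictable results.
To initialize the whole array up front, you can use: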
double dd[20]={0}; //sufficient
or possibly
double dd[20]={0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}; //explicit, but not necessary
Moving to your actual question, it all comes down to this statement:
if((fabs(s_rel) - K_spiel) == 0.)
You have initialized K_spiel to 0.015. And at this point in your execution flow s_rel appears to be close to 0.015. But it is actually closer to 0.0150000000000006. So the comparison fails.
One trick that is commonly used is to define an epsilon value, and use it to determine if the difference between two floating point values is small enough to satisfy your purpose:
From The Art of Computer Programming, the following snippet uses this approach, and will work for your very specific example: (caution: Read why this approach will not work for all floating point related comparisons.)
bool approximatelyEqual(double a, double b, double epsilon) // bool requires <stdbool.h>, fabs requires <math.h>
{
return fabs(a - b) <= ( (fabs(a) < fabs(b) ? fabs(b) : fabs(a)) * epsilon);
}
So replace the line:
if((fabs(s_rel) - K_spiel) == 0.)
with
if(approximatelyEqual(s_rel, K_spiel, 1e-8))

float vs double comparison [duplicate]

int main(void)
{
  float me = 1.1;  
double you = 1.1;   
if ( me == you ) {
printf("I love U");
} else {
printf("I hate U");
}
}
This prints "I hate U". Why?
Floats use binary fraction. If you convert 1.1 to float, this will result in a binary representation.
Each bit to the right of the binary point halves the weight of the digit, just as each decimal place divides it by ten. Bits to the left of the point double it (times ten for decimal).
in decimal: ... 0*2 + 1*1 + 0*0.5 + 0*0.25 + 0*0.125 + 1*0.0625 + ...
binary: 0 1 . 0 0 0 1 ...
2's exp: 1 0 -1 -2 -3 -4
(exponent to the power of 2)
Problem is that 1.1 cannot be converted exactly to binary representation. For double, there are, however, more significant digits than for float.
If you compare the values, the float is first converted to double. But as the computer does not know about the original decimal value, it simply fills the trailing digits of the new double with zeros, while the double value is more precise. So the two compare unequal.
This is a common pitfall when using floats. For this and other reasons (e.g. rounding errors), you should not use exact comparison for equal/unequal, but a ranged compare with a small tolerance such as the machine epsilon (the difference between 1.0 and the next representable value):
#include <float.h>
#include <math.h>
...
// check for "almost equal"
if ( fabs(fval - dval) <= FLT_EPSILON )
...
Note the use of FLT_EPSILON, which is the difference between 1.0f and the next larger single precision float value. Also note the <=, not <: for values between 1 and 2, adjacent floats differ by exactly FLT_EPSILON, so < would accept only an exact match.
If you compare two doubles, you might use DBL_EPSILON, but be careful with that.
Depending on intermediate calculations, the tolerance has to be increased (you cannot reduce it further than epsilon), as rounding errors, etc. will sum up. Floats in general are not forgiving with wrong assumptions about precision, conversion and rounding.
Edit:
As suggested by @chux, this might not work as expected for larger values, as you have to scale EPSILON according to the exponents. This is in line with what I stated: float comparison is not as simple as integer comparison. Think before comparing.
In short, you should NOT use == to compare floating-point values.
For example:
float i = 1.1; // or double
float j = 1.1; // or double
This expression
(i==j) == true // is not always valid
For a correct comparison you should use an epsilon (a very small number):
(fabs(i-j)<epsilon) == true // this expression is valid; fabs is declared in <math.h>
The question simplifies to why do me and you have different values?
Usually, C floating point is based on a binary representation. Many compilers & hardware follow IEEE 754 binary32 and binary64. Rare machines use a decimal, base-16 or other floating point representation.
OP's machine certainly does not represent 1.1 exactly as 1.1, but to the nearest representable floating point number.
Consider the below which prints out me and you to high precision. The previous representable floating point numbers are also shown. It is easy to see me != you.
#include <math.h>
#include <stdio.h>
int main(void) {
float me = 1.1;
double you = 1.1;
printf("%.50f\n", nextafterf(me,0)); // previous float value
printf("%.50f\n", me);
printf("%.50f\n", nextafter(you,0)); // previous double value
printf("%.50f\n", you);
return 0;
}
Output:
1.09999990463256835937500000000000000000000000000000
1.10000002384185791015625000000000000000000000000000
1.09999999999999986677323704498121514916420000000000
1.10000000000000008881784197001252323389053300000000
But it is more complicated: C allows code to use higher precision for intermediate calculations depending on FLT_EVAL_METHOD. So on another machine, where FLT_EVAL_METHOD==1 (evaluate all FP to double), the compare test may pass.
Comparing for exact equality is rarely used in floating point code, aside from comparison to 0.0. More often code uses an ordered compare a < b. Comparing for approximate equality involves another parameter to control how near. @R.. has a good answer on that.
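To make the widening explicit, the two possible conversions can be written out as casts. This is a small sketch with invented function names; it assumes float and double are IEEE 754 binary32 and binary64.

```c
#include <stdbool.h>

/* Compares after widening the float to double. This is what the
   usual arithmetic conversions do for me == you. */
bool equal_after_widening(float f, double d)
{
    return (double)f == d;
}

/* Compares after narrowing the double to float, so both sides have
   been rounded to the same precision. */
bool equal_at_float_precision(float f, double d)
{
    return f == (float)d;
}
```

For 1.1 the first comparison fails, because widening cannot restore the digits that were lost when 1.1 was rounded to float, while the second succeeds because both operands were rounded the same way. A value such as 0.5, which is exactly representable in both types, passes both.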
Because you are comparing two floating-point values!
Floating-point comparison is not exact because of rounding errors. Simple values like 0.1 or 1.1 cannot be precisely represented using binary floating point numbers, and the limited precision of floating point numbers means that slight changes in the order of operations can change the result. Different compilers and CPU architectures store temporary results at different precisions, so results will differ depending on the details of your environment. For example:
double a = 0.15 + 0.15;
double b = 0.1 + 0.2;
if (a == b) // can be false!
if (a >= b) // can also be false!
Even
if (fabs(a - b) < 0.0001) // wrong - don't do this
This is a bad way to do it because a fixed epsilon (0.0001), chosen because it “looks small”, could actually be way too large when the numbers being compared are very small as well.
I personally use the following method (written in C++); maybe it will help you:
#include <iostream> // std::cout
#include <cmath> // std::abs
#include <algorithm> // std::min
using namespace std;
#define MIN_NORMAL 1.17549435E-38f
#define MAX_VALUE 3.4028235E38f
bool nearlyEqual(float a, float b, float epsilon) {
float absA = std::abs(a);
float absB = std::abs(b);
float diff = std::abs(a - b);
if (a == b) {
return true;
} else if (a == 0 || b == 0 || diff < MIN_NORMAL) {
return diff < (epsilon * MIN_NORMAL);
} else {
return diff / std::min(absA + absB, MAX_VALUE) < epsilon;
}
}
This method passes tests for many important special cases, for different a, b and epsilon.
And don't forget to read What Every Computer Scientist Should Know About Floating-Point Arithmetic!

Strange output when using float instead of double

Strange output when I use float instead of double
#include <stdio.h>
int main(void)
{
double p,p1,cost,cost1=30;
for (p = 0.1; p < 10;p=p+0.1)
{
cost = 30-6*p+p*p;
if (cost<cost1)
{
cost1=cost;
p1=p;
}
else
{
break;
}
printf("%lf\t%lf\n",p,cost);
}
printf("%lf\t%lf\n",p1,cost1);
}
Gives output as expected at p = 3;
But when I use float the output is a little weird.
#include <stdio.h>
int main(void)
{
float p,p1,cost,cost1=40;
for (p = 0.1; p < 10;p=p+0.1)
{
cost = 30-6*p+p*p;
if (cost<cost1)
{
cost1=cost;
p1=p;
}
else
{
break;
}
printf("%f\t%f\n",p,cost);
}
printf("%f\t%f\n",p1,cost1);
}
Why is the increment of p in the second case going weird after 2.7?
This is happening because the float and double data types store numbers in base 2. Most base-10 numbers can’t be stored exactly. Rounding errors add up much more quickly when using floats. Outside of embedded applications with limited memory, it’s generally better, or at least easier, to use doubles for this reason.
To see this happening for double types, consider the output of this code:
#include <stdio.h>
int main(void)
{
double d = 0.0;
for (int i = 0; i < 100000000; i++)
d += 0.1;
printf("%f\n", d);
return 0;
}
On my computer, it outputs 9999999.981129. So after 100 million iterations, rounding error made a difference of 0.018871 in the result.
For more information about how floating-point data types work, read What Every Computer Scientist Should Know About Floating-Point Arithmetic. Or, as akira mentioned in a comment, see the Floating-Point Guide.
Your program can work fine with float. You don't need double to compute a table of 100 values to a few significant digits. You can use double, and if you do, it has a good chance of working even if you use binary floating-point at cross-purposes. The IEEE 754 double-precision format used for double by most C compilers is so precise that it makes many misuses of floating-point unnoticeable (but not all of them).
Values that are simple in decimal may not be simple in binary
A consequence is that a value that is simple in decimal may not be represented exactly in binary.
This is the case for 0.1: it is not simple in binary, and it is not represented exactly as either double or float, but the double representation has more digits and as a result, is closer to the intended value 1/10.
Floating-point operations are not exact in general
Binary floating-point operations in a format such as float or double have to produce a result in the intended format. This leads to some digits having to be dropped from the result each time an operation is computed. When using binary floating-point in an advanced manner, the programmer sometimes knows that the result will have few enough digits for all the digits to be represented in the format (in other words, sometimes a floating-point operation can be exact and advanced programmers can predict and take advantage of conditions in which this happens). But here, you are adding 0.1, which is not simple and (in binary) uses all the available digits, so most of the time, this addition is not exact.
How to print a small table of values using only float
In for (p = 0.1; p < 10;p=p+0.1), the value of p, being a float, will be rounded at each iteration. Each iteration will be computed from a previous iteration that was already rounded, so the rounding errors will accumulate and make the end result drift away from the intended, mathematical value.
Here is a list of improvements over what you wrote, ordered from least exact to most exact:
for (i = 1, p = 0.1f; i < 100; i++, p = i * 0.1f)
In the above version, 0.1f is not exactly 1/10, but the computation of p involves only one multiplication and one rounding, instead of up to 100. That version gives a more precise approximation of i/10.
for (i = 1, p = 0.1f; i < 100; i++, p = i * 0.1)
In the very slightly different version above, i is multiplied by the double value 0.1, which more closely approximates 1/10. The result is always the closest float to i/10, but this solution is cheating a bit, since it uses a double multiplication. I said a solution existed with only float!
for (i = 1, p = 0.1f; i < 100; i++, p = i / 10.0f)
In this last solution, p is computed as the division of i, represented exactly as a float because it is a small integer, by 10.0f, which is also exact for the same reason. The only approximation is that of a single operation, and the arguments are exactly what we wanted them to be, so this is the best solution. It produces the closest float to i/10 for all values of i between 1 and 99.
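The difference between accumulating 0.1f and dividing by 10.0f can be measured with a short sketch like the one below. The function names are made up for this example, and i / 10.0 computed in double is used as the reference value.

```c
#include <math.h>

/* Largest deviation from i/10 (for i = 1 .. n-1) when p is built by
   repeatedly adding 0.1f, as in the original loop. */
double max_error_accumulated(int n)
{
    double worst = 0.0;
    float p = 0.0f;
    for (int i = 1; i < n; i++) {
        p = p + 0.1f;  /* one more rounding on top of all the previous ones */
        double err = fabs((double)p - i / 10.0);
        if (err > worst)
            worst = err;
    }
    return worst;
}

/* Largest deviation from i/10 when p is recomputed from the integer
   counter with a single float division. */
double max_error_divided(int n)
{
    double worst = 0.0;
    for (int i = 1; i < n; i++) {
        float p = (float)i / 10.0f;  /* a single rounding per value */
        double err = fabs((double)p - i / 10.0);
        if (err > worst)
            worst = err;
    }
    return worst;
}
```

For n = 100 the accumulated version is expected to show the larger worst-case error, since its rounding errors build on each other while the division rounds only once per value.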

Resources