I am trying to write a program in C to accomplish the following task.
Input: Three double-precision numbers, a, b, and c.
Output: All the numbers from b to a, that can be reached by decrements of c.
Here is a simple program (filename: range.c).
#include <stdlib.h>
#include <stdio.h>
int main()
{
double high, low, step, var;
printf("Enter the <lower limit> <upperlimit> <step>\n>>");
scanf("%lf %lf %lf", &low, &high, &step);
printf("Number in the requested range\n");
for (var = high; var >= low; var -= step)
printf("%g\n", var);
return 0;
}
However, the for loop behaves rather bizarrely for some inputs. For instance, the following.
10-236-49-81:stackoverflow pavithran$ ./range.o
Enter the <lower limit> <upperlimit> <step>
>>0.1 0.9 0.2
Number in the requested range
0.9
0.7
0.5
0.3
10-236-49-81:stackoverflow pavithran$
I cannot figure out why the loop quits at var = 0.1. While for another input, it behaves as expected.
10-236-49-81:stackoverflow pavithran$ ./range.o
Enter the <lower limit> <upperlimit> <step>
>>0.1 0.5 0.1
Number in the requested range
0.5
0.4
0.3
0.2
0.1
10-236-49-81:stackoverflow pavithran$
Does the weird behaviour in the first case have something to do with numeric precision?
How can I ensure that the range will always contain floor((high - low)/step) + 1 numbers?
I have tried an alternate method of looping over floats, where I scale the loop variables to integers, and print the result of the loop variable divided by the scaling used. But there's perhaps a better way...
Using a double as a counter in a for loop requires very careful consideration. In many instances it's best avoided.
I'm sure you know that not all numbers that are exact in decimal are also exact in binary floating point. In fact, for IEEE754 floating point, only dyadic rationals are. So 0.5 is, but 0.4, 0.3, 0.2, and 0.1 are not.
The closest IEEE754 floating point double to 0.2 is actually the slightly larger 0.200000000000000011102230246251565404236316680908203125.
In your case, repeatedly subtracting this slightly-too-large value from 0.9 eventually produces a number just below the double closest to 0.1, rather than that double itself, so the test var >= low fails one iteration early: your bug then manifests itself.
The simple remedy is to work in integers, decrement by 1 each time, and scale your output using step.
I am trying to write a program that outputs the number of the digits in the decimal portion of a given number (0.128).
I made the following program:
#include <stdio.h>
#include <math.h>
int main(){
float result = 0;
int count = 0;
int exp = 0;
for(exp = 0; int(1+result) % 10 != 0; exp++)
{
result = 0.128 * pow(10, exp);
count++;
}
printf("%d \n", count);
printf("%f \n", result);
return 0;
}
What I had in mind was that exp keeps being incremented until int(1+result) % 10 outputs 0. So for example when result = 0.128 * pow(10,4) = 1280, result mod 10 (int(1+result) % 10) will output 0 and the loop will stop.
I know that on a bigger scale this method is still inefficient since if result was a given input like 1.1208 the program would basically stop at one digit short of the desired value; however, I am trying to first find out the reason why I'm facing the current issue.
My Issue: The loop won't just stop at 1280; it keeps looping until its value reaches 128000000.000000.
Here is the output when I run the program:
10
128000000.000000
Apologies if my description is vague, any given help is very much appreciated.
I am trying to write a program that outputs the number of the digits in the decimal portion of a given number (0.128).
This task is basically impossible, because on a conventional (binary) machine the goal is not meaningful.
If I write
float f = 0.128;
printf("%f\n", f);
I see
0.128000
and I might conclude that 0.128 has three digits. (Never mind about the three 0's.)
But if I then write
printf("%.15f\n", f);
I see
0.128000006079674
Wait a minute! What's going on? Now how many digits does it have?
It's customary to say that floating-point numbers are "not accurate" or that they suffer from "roundoff error". But in fact, floating-point numbers are, in their own way, perfectly accurate — it's just that they're accurate in base two, not the base 10 we're used to thinking about.
The surprising fact is that most decimal (base 10) fractions do not exist as finite binary fractions. This is similar to the way that the number 1/3 does not even exist as a finite decimal fraction. You can approximate 1/3 as 0.333 or 0.3333333333 or 0.33333333333333333333, but without an infinite number of 3's it's only an approximation. Similarly, you can approximate 1/10 in base 2 as 0b0.00011 or 0b0.000110011 or 0b0.000110011001100110011001100110011, but without an infinite number of 0011's it, too, is only an approximation. (That last rendition, with 33 bits past the binary point, works out to about 0.0999999999767.)
And it's the same with most decimal fractions you can think of, including 0.128. So when I wrote
float f = 0.128;
what I actually got in f was the binary number 0b0.00100000110001001001101111, which in decimal is exactly 0.12800000607967376708984375.
Once a number has been stored as a float (or a double, for that matter) it is what it is: there is no way to rediscover that it was initially initialized from a "nice, round" decimal fraction like 0.128. And if you try to "count the number of decimal digits", and if your code does a really precise job, you're liable to get an answer of 26 (that is, corresponding to the digits "12800000607967376708984375"), not 3.
P.S. If you were working with computer hardware that implemented decimal floating point, this problem's goal would be meaningful, possible, and tractable. And implementations of decimal floating point do exist. But the ordinary float and double values any of us is likely to use on any of today's common, mass-market computers are invariably going to be binary (specifically, conforming to IEEE-754).
P.P.S. Above I wrote, "what I actually got in f was the binary number 0b0.00100000110001001001101111". And if you count the number of significant bits there (100000110001001001101111) you get 24, which is no coincidence at all. You can read in any description of the single-precision floating-point format that the significand portion of a float has 24 bits (with 23 explicitly stored), and here, you're seeing that in action.
float vs. code
A binary float cannot encode 0.128 exactly as it is not a dyadic rational.
Instead, it takes on a nearby value: 0.12800000607967376708984375. 26 digits.
Rounding errors
OP's approach incurs rounding errors in result = 0.128 * pow(10, exp);.
Extended math needed
The goal is difficult. Example: FLT_TRUE_MIN takes about 149 digits.
We could use double or long double to get us somewhat there.
Simply multiply the fraction by 10.0 in each step.
d *= 10.0; still incurs rounding errors, but less so than OP's approach.
#include <stdio.h>
#include <math.h>
int main(){
int count = 0;
float f = 0.128f;
double d = f - trunc(f);
printf("%.30f\n", d);
while (d) {
d *= 10.0;
double ipart = trunc(d);
printf("%.0f", ipart);
d -= ipart;
count++;
}
printf("\n");
printf("%d \n", count);
return 0;
}
Output
0.128000006079673767089843750000
12800000607967376708984375
26
Usefulness
Typically, past FLT_DECIMAL_DIG (9) or so significant decimal places, OP’s goal is usually not that useful.
As others have said, the number of decimal digits is meaningless when using binary floating-point.
But you also have a flawed termination condition. The loop test is (int)(1+result) % 10 != 0 meaning that it will stop whenever we reach an integer whose last digit is 9.
That means that 0.9, 0.99 and 0.9999 all give a result of 2.
We also lose precision by storing the double value we start with in a float.
The most useful thing we could do is terminate when the remaining fractional part is less than the precision of the type used.
Suggested working code:
#include <math.h>
#include <float.h>
#include <stdio.h>
int main(void)
{
double val = 0.128;
double prec = DBL_EPSILON;
double result;
int count = 0;
while (fabs(modf(val, &result)) > prec) {
++count;
val *= 10;
prec *= 10;
}
printf("%d digit(s): %0*.0f\n", count, count, result);
}
Results:
3 digit(s): 128
I have a simple task that says: 'Write the value of y with the following formula for the range between xmin and xmax with the difference of dx.'
The only problem I have is that when using while with float, as in the code I am going to provide, I am getting one less output of y than I should have.
For the following code
#include <stdio.h>
int main() {
float x,xmin,xmax,dx,y;
printf("Input the values of xmin xmax i dx");
scanf("%f%f%f",&xmin,&xmax,&dx);
x=xmin;
while(x<=xmax) {
y=(x*x-2*x-2)/(x*x+1);
printf("%.3f %.3f\n",x,y);
x=x+dx;
}
}
for the input of (-2 2 0.2) I get output only up to 1.8 (that's 20 outputs) and not up to 2.
But when I use double instead of float everything works just fine (21 outputs).
Is there something connected to the while condition that I am not aware of?
That makes sense. A float or double is an approximation rather than an exact representation of a rational number a/b (a and b integers, b != 0). The closer you are to 1.000... the better the approximation, but it is still an approximation.
One subset of the rational numbers guaranteed to be exactly representable in floating point is the powers of two, 2^k, with integer k in the range -126 <= k <= 127 for float. E.g. const float dx = 0.25f; (that is, 1/(2^2)) would have worked fine.
0.2 is not represented as 0.2 rather as: 0.20000000298023223876953125
The next closest approximation to 0.2 is: 0.199999988079071044921875
https://www.h-schmidt.net/FloatConverter/IEEE754.html
An alternative way to loop floats might be:
#include <stdio.h>
int main() {
float x,xmin,xmax,dx,y;
printf("Input the values of xmin xmax i dx");
scanf("%f%f%f",&xmin,&xmax,&dx);
x=xmin;
//expected cumulative error
const float e = 0.7 * dx;
do
{
y=(x*x-2*x-2)/(x*x+1);
printf("%.3f %.3f\n",x,y);
x=x+dx;
}
while(!(x > (xmax + e)));
}
The solution above appears to work as expected, but it would only do so for a small number of iterations, because the error in x keeps accumulating.
#include <stdio.h>
#include <string.h>
#include <math.h>
#include <stdlib.h>
int main() {
/* Enter your code here. Read input from STDIN. Print output to STDOUT */
float sum=0.0;
for(int i=1;i<=1000000;i++)
sum+=(1.0)/i;
printf("Forward sum is %f ",sum);
sum=0.0;
for(int i=1000000;i>=1;i--)
sum+=(1.0)/i;
printf("Backward sum is %f ",sum);
return 0;
}
Output:-
Forward sum is :- 14.357358
Backward sum is :- 14.392652.
Why is there a difference in both sums ? I think that there is some precision error which is causing the difference in both the sums but I am not able to get a clear picture of why is this happening.
This is one of the surprising aspects of floating-point arithmetic: it actually matters what order you do things like addition in. (Formally, we say that floating-point addition is not associative.)
It's pretty easy to see why this is the case, with a simpler, slightly artificial example. Let's say you have this addition problem:
1000000. + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1
But let's say that you're using a single-precision floating-point format that has only 7 digits of precision. So even though you might think that 1000000.0 + 0.1 would be 1000000.1, actually it would be rounded off to 1000000.. So 1000000.0 + 0.1 + 0.1 would also be 1000000., and adding in all 10 copies of 0.1 would still result in just 1000000., also.
But if instead you tried this:
0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 1000000.
Now, when you add 0.1 + 0.1, there's no problem with precision, so you get 0.2. So if you add 0.1 ten times, you get 1.0. So if you do the whole problem in that order, you'll get 1000001..
You can see this yourself. Try this program:
#include <stdio.h>
int main()
{
float f1 = 100000.0, f2 = 0.0;
int i;
for(i = 0; i < 10; i++) {
f1 += 0.1;
f2 += 0.1;
}
f2 += 100000.0;
printf("%.1f %.1f\n", f1, f2);
}
On my computer, this prints 100001.0 100001.0, as expected. But if I change the two big numbers to 10000000.0, then it prints 10000000.0 10000001.0. The two numbers are clearly unequal.
The first loop starts by adding relatively large parts to the sum, and adds smaller and smaller parts as the sum gets larger. So while more bits are needed to represent the sum, fewer bits are available for the small parts.
In the second loop, small parts are added to the sum first, and increasingly larger parts are added as the sum gets larger. So fewer bits are required to store the newly added part relative to the current value of sum.
(Not a very scientific explanation, but i hope this verbal attempt makes the principle clear)
N.b.: it also means the second result is more accurate.
In an attempt to be more precise: in order to add two floating-point numbers, they first need to be scaled to the same exponent. When the sum gets larger, the item being added is shifted to match the sum's exponent so that the sum does not lose significance. As a result, the least significant bits of the item to be added are shifted out of the register before the addition. For example (hypothetically), adding 0.00000001 to 1,000,000,000 results in adding zero to this large number.
Look at the output of this link(scroll down to see the output) to find out what I'm trying to accomplish
The problem is with the for loop on line number 9-11
for(i=0; i<=0.9; i+=0.1){
printf("%6.1f ",i);
}
I expected this to print values from 0.0 until 0.9 but it stops after printing 0.8, any idea why ??
Using float here is the source of the problem. Instead, do it with an int:
int i;
for(i = 0; i <= 10; i++)
printf("%6.1f ", (float)(i / 10.0));
Output:
0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
Ideally, floating point should not be used for iteration. If you want to see why, change your code as follows and look at the values it produces.
for(float i=0; i<=0.9f; ){
i+=0.1f;
System.out.println(i);
}
Here is the result.
0.1
0.2
0.3
0.4
0.5
0.6
0.70000005
0.8000001
0.9000001
Your 9th value exceeds 0.9.
Floating point arithmetic is inexact in computing. This is because of the way that a computer represents floating point values. Here's an excerpt from an MSDN article on the subject:
Every decimal integer can be exactly represented by a binary integer; however, this is not true for fractional numbers. In fact, every number that is irrational in base 10 will also be irrational in any system with a base smaller than 10.
For binary, in particular, only fractional numbers that can be represented in the form p/q, where q is an integer power of 2, can be expressed exactly, with a finite number of bits.
Even common decimal fractions, such as decimal 0.0001, cannot be represented exactly in binary. (0.0001 is a repeating binary fraction with a period of 104 bits!)
Link to the full article: https://support.microsoft.com/kb/42980
Floating-point numbers cannot precisely represent most decimal fractions, so rounding errors accumulate:
#include <iostream>
#include <iomanip>
using namespace std;
int main() {
float literal = 0.9;
float sum = 0;
for(int i = 0; i < 9; i++)
sum += 0.1;
cout << setprecision(10) << literal << ", " << sum << endl;
return 0;
}
Output:
0.8999999762, 0.9000000954
Your loop is right, but the float comparison in loops is not safe.
The problem is that a binary floating point number cannot exactly represent 0.1.
This would work.
for(i=0.0; i<=0.9001; i+=0.1){
printf("%6.1f ",i);
}
I have a problem with the precision of the double format.
Sample code:
double K=0, L=0, M=0;
scanf("%lf %lf %lf", &K, &L, &M);
if((K+L) <= M) printf("Incorrect input");
else printf("Right, K=%f, L=%f, M=%f", K, L, M);
My test input:
K = 0.1, L = 0.2, M = 0.3 -> the condition should be true, but execution goes to the 'else' statement.
How can I correct this difference? Is there any other method of summation?
In the world of Double Precision IEEE 754 binary floating-point format (the ones used on Intel and other processors) 0.1 + 0.2 == 0.30000000000000004 :-) And 0.30000000000000004 != 0.3 (and note that in the marvelous world of doubles, 0.1, 0.2 and 0.3 don't exist as "exact" quantities. There are some double numbers that are very near them, but if you printed them with full precision, they wouldn't be 0.1, 0.2 and 0.3)
To laugh a little, try this: http://pages.cs.wisc.edu/~rkennedy/exact-float
Insert a decimal number and look at the second and third row, it shows how the number is really represented in memory. It's for Delphi, but Double and Single are the same for Delphi and for probably all the C compilers for Intel processors (they are called double and float in C)
And if you want to try for yourself, look at this http://ideone.com/WEL7h
#include <stdio.h>
int main()
{
double d1 = (0.1 + 0.2);
double d2 = 0.3;
printf("%.20e\n%.20e", d1, d2);
return 0;
}
output:
3.00000000000000044409e-01
2.99999999999999988898e-01
(be aware that the output is compiler-dependent. Depending on the options, 0.1 + 0.2 could be compiled and rounded to 0.3)
Unlike integer values, floating point values are not stored exactly the way you assign values to them. Let's consider the following code:
int i = 1; // this is and always will be 1
float j = 0.03; // this gets stored at least on my machine as something like 0.029999999
Why is this so? Well, how many real numbers exist in the interval between 0.1 and 0.2?
An infinite number! A float has only a finite number of bit patterns, so there are values which will get stored as you intended but a hell of a lot of values which will be stored with a small error.
This is the reason why comparing floating point values for equality is not a good idea. Try something like this instead:
float a = 0.3f;
float b = 0.301f;
float threshold = 1e-6;
if( fabsf(a-b) < threshold ) // fabsf() from <math.h>; abs() is the integer version
return true;
else
return false;
There are infinitely many real numbers between any two distinct real numbers. If we were to be able to represent every one of those, we would need infinite memory. Since we only have finite memory, floating point numbers need to be stored with only finite precision. Up to that finite precision, it might not be true that 0.1 + 0.2 <= 0.3.
Now, you really should go read what's at the other end of the excellent link provided by Paul R.