Why is not a==0 in the following code? - c

#include <stdio.h>
int main( )
{
float a=1.0;
long i;
for(i=0; i<100; i++)
{
a = a - 0.01;
}
printf("%e\n",a);
}
Result is: 6.59e-07

It's a binary floating point number, not a decimal one - therefore you need to expect rounding errors. See the Basic section in this article:
What Every Programmer Should Know About Floating-Point Arithmetic
For example, the value 0.01 does not have a precise represenation in binary floating point type. To get a "correct" result in your sample you would have to either round or use a a decimal floating point type (see Wikipedia):
Binary fixed-point types are most commonly used, because the rescaling operations can be implemented as fast bit shifts. Binary fixed-point numbers can represent fractional powers of two exactly, but, like binary floating-point numbers, cannot exactly represent fractional powers of ten. If exact fractional powers of ten are desired, then a decimal format should be used. For example, one-tenth (0.1) and one-hundredth (0.01) can be represented only approximately by binary fixed-point or binary floating-point representations, while they can be represented exactly in decimal fixed-point or decimal floating-point representations. These representations may be encoded in many ways, including BCD.

There are two questions here. If you're asking, why is my printf statement displaying the result as 6.59e-07 instead of 0.000000659, it's because you've used the format specifier for Scientific Notation: %e. You want %f for the floating point a.
printf("%f\n",a);
If you're asking why the result is not exactly zero rather than 0.000000659, the answer is (as others have pointed out) that with floating point arithmetic using binary numbers you need to expect rounding.

You have to specify %f for printing the float number then it will print 0 for variable a.

That's floating point numbers rounding errors on the scene. Each time you subtract a fraction you get approximately the result you'd normally expect from a number on paper and so the final result is very close to zero, but not necessarily precise zero.

The precision with floating numbers isn't accurate, that's why you find this result.
Cordially

Related

Matlab "single" precision vs C floating point?

My Matlab script reads a string value "0.001044397222448" from a file, and after parsing the file, this value printed in the console shows as double precision:
value_double =
0.001044397222448
After I convert this number to singe using value_float = single(value_double), the value shows as:
value_float =
0.0010444
What is the real value of this variable, that I later use in my Simulink simulation? Is it really truncated/rounded to 0.0010444?
My problem is that later on, after I compare this with analogous C code, I get differences. In the C code the value is read as float gf = 0.001044397222448f; and it prints out as 0.001044397242367267608642578125000. So the C code keeps good precision. But, does Matlab?
The number 0.001044397222448 (like the vast majority of decimal fractions) cannot be exactly represented in binary floating point.
As a single-precision float, it's most closely represented as (hex) 0x0.88e428 × 2-9, which in decimal is 0.001044397242367267608642578125.
In double precision, it's most closely represented as 0x0.88e427d4327300 × 2-9, which in decimal is 0.001044397222447999984407118745366460643708705902099609375.
Those are what the numbers are, internally, in both C and Matlab.
Everything else you see is an artifact of how the numbers are printed back out, possibly rounded and/or truncated.
When I said that the single-precision representation "in decimal is 0.001044397242367267608642578125", that's mildly misleading, because it makes it look like there are 28 or more digits' worth of precision. Most of those digits, however, are an artifact of the conversion from base 2 back to base 10. As other answers have noted, single-precision floating point actually gives you only about 7 decimal digits of precision, as you can see if you notice where the single- and double-precision equivalents start to diverge:
0.001044397242367267608642578125
0.001044397222447999984407118745366460643708705902099609375
^
difference
Similarly, double precision gives you roughly 16 decimal digits worth of precision, as you can see if you compare the results of converting a few previous and next mantissa values:
0x0.88e427d43272f8 0.00104439722244799976756668424826557384221814572811126708984375
0x0.88e427d4327300 0.001044397222447999984407118745366460643708705902099609375
0x0.88e427d4327308 0.00104439722244800020124755324246734744519926607608795166015625
0x0.88e427d4327310 0.0010443972224480004180879877395682342466898262500762939453125
^
changes
This also demonstrates why you can never exactly represent your original value 0.001044397222448 in binary. If you're using double, you can have 0.00104439722244799998, or you can have 0.0010443972224480002, but you can't have anything in between. (You'd get a little less close with float, and you could get considerably closer with long double, but you'll never get your exact value.)
In C, and whether you're using float or double, you can ask for as little or as much precision as you want when printing things with %f, and under a high-quality implementation you'll always get properly-rounded results. (Of course the results you get will always be the result of rounding the actual, internal value, not necessarily the decimal value you started with.) For example, if I run this code:
printf("%.5f\n", 0.001044397222448);
printf("%.10f\n", 0.001044397222448);
printf("%.15f\n", 0.001044397222448);
printf("%.20f\n", 0.001044397222448);
printf("%.30f\n", 0.001044397222448);
printf("%.40f\n", 0.001044397222448);
printf("%.50f\n", 0.001044397222448);
printf("%.60f\n", 0.001044397222448);
printf("%.70f\n", 0.001044397222448);
I see these results, which as you can see match the analysis above.
(Note that this particular example is using double, not float.)
0.00104
0.0010443972
0.001044397222448
0.00104439722244799998
0.001044397222447999984407118745
0.0010443972224479999844071187453664606437
0.00104439722244799998440711874536646064370870590210
0.001044397222447999984407118745366460643708705902099609375000
0.0010443972224479999844071187453664606437087059020996093750000000000000
I'm not sure how Matlab prints things.
In answer to your specific questions:
What is the real value of this variable, that I later use in my Simulink simulation? Is it really truncated/rounded to 0.0010444?
As a float, it is really "truncated" to a number which, converted back to decimal, is exactly 0.001044397242367267608642578125. But as we've seen, most of those digits are essentially meaningless, and the result can more properly thought of as being about 0.0010443972.
In the C code the value is read as float gf = 0.001044397222448f; and it prints out as 0.001044397242367267608642578125000
So C got the same answer I did -- but, again, most of those digits are not meaningful.
So the C code keeps good precision. But, does Matlab?
I'd be willing to bet that Matlab keeps the same internal precision for ordinary floats and doubles.
MATLAB uses IEEE-754 binary64 for its double-precision type and binary32 for single-precision. When 0.001044397222448 is rounded to the nearest value representable in binary64, the result is 4816432068447840•2−62 = 0.001044397222447999984407118745366460643708705902099609375.
When that is rounded to the nearest value representable in binary32, the result is 8971304•2−33 = 0.001044397242367267608642578125.
Various software (C, Matlab, others) displays floating-point numbers in diverse ways, with more or fewer digits. The above values are the exact numbers represented by the floating-point data, per the IEEE 754 specification, and they are the values the data has when used in arithmetic operations.
All single precisions should be the same
So here is the thing. According to documentation, both matlab and C comply with the IEEE 754 standard. Which means that there should not be any difference between what is actually stored in memory.
You could compute the binary representation by hand but according to this(thanks #Danijel) handy website, the representation of 0.001044397222448 should be 0x3a88e428.
The question is how precise is your representation? It is a bit tricky with floating point but the short answer is your number is accurate up to the 9th decimal and has decimal represented up to the 33rd decimal. If you want the long answer see the tow paragraphs at the end of this post.
A display issue
The fact that you are not seeing the same thing when you print does not mean that you don't have the same bits in memory (and you should have the exact same bytes in memory in C and MATLAB). The only reason you see a difference on your display is because the print functions truncate your number. If you print the 33 decimals in each language you should not have any difference.
To do so in matlab use: fprintf('%.33f', value_float);
To do so in c use printf('%.33f\n', gf);
About floating point precision
Now in more details, the question was: how precise is this representation? Well the tricky thing with floating point is that the precision of the representation depends on what number you are representing. The representation is over 32 bits and is divide with 1 bit for the sign, 8 for the exponent and 23 for the fraction.
The number can be computed as sign * 2^(exponent-127) * 1.fraction. This basically means that the maximal error/precision (depending on how you want to call it) is basically 2^(exponent-127-23), the 23 is here to represent the 23 bytes of the fraction. (There are a few edge cases, I won't elaborate on it). In our case the exponent is 117, which means your precision is 2^(117-127-23) = 1.16415321826934814453125e-10. That means that your single precision float should represent your number accurately up to the 9th decimal, after that it is up to luck.
Further details
I know this is a rather short explanation. For more details, this post explains the floating point imprecision more precisely and this website gives you some useful info and allows you to play visually with the representation.

What logic works behind the rounding off of floating point number in c language?

I have written a program in c language to roundoff floating point number but the output of the program does not follow any logic.
The code is-->
#include<stdio.h>
int main(){
float a=2.435000;
float b=2.535000;
float c=2.635000;
float d=2.735000;
float e=2.835000;
//Rounding off the floating point numbers to 2 decimal places
printf("%f %.2f\n",a,a);
printf("%f %.2f\n",b,b);
printf("%f %.2f\n",c,c);
printf("%f %.2f\n",d,d);
printf("%f %.2f\n",e,e);
return 0;
}
OUTPUT:
2.435000 2.43
2.535000 2.54
2.635000 2.63
2.735000 2.73
2.835000 2.84
All the floating numbers have same pattern of digits i.e. they are in the form of 2.x35000 where x is different in different numbers.
I am not getting why they are showing different behaviour while getting roundoff ,either they should give 2.x3 or 2.x4 but for different x it is giving different value.
What is the logic behind this?
It looks to you and me like 2.435 is a nice, round decimal fraction.
It looks to you and me like if we rounded it to two places, we'd get 2.44.
But most computers do not use decimal fractions internally, and the ones you and I use definitely do not. They use binary fractions, and binary fractions can be surprising.
Internally, it turns out that the number 2.435 cannot be represented exactly in binary. As a float, it's represented internally as a binary fraction that's equivalent to 2.434999942779541015625. That's pretty close to 2.435, but you can see that if you round it to two places, you get 2.43 instead.
The same sorts of argument explain your other results. 2.635 is really 2.6349999904632568359375, so it rounds to 2.63. But 2.535 is 2.5350000858306884765625, so it rounds to 2.54, as you expect.
Why can't we get closer to, say, 2.435? As I said, internally it's a binary fraction equivalent to 2.434999942779541015625. The actual IEEE-754 single-precision representation is 0x401bd70a, which works out to 0x0.9bd70a × 2², or 0.60874998569488525390625 × 22. But if we "add 1" to it, that is, if we make it the tiniest bit bigger that we can, the next float value is 0x401bd70b, which is 0x0.9bd70b or 0.608750045299530029296875 × 22, which is 2.4350001811981201171875, which ends up being a little bit farther away from 2.435. (It's actually over three times as far away. The first number is off by 0.000000057; the second one is off by 0.000000181.)
You can read more about the sometimes surprising properties of binary floating-point numbers at these other "canonical" SO questions:
Is floating point math broken?
Why are floating point numbers inaccurate?
Two other interesting questions that were asked about them -- just in the past couple of days -- were:
Why does the > operator not work correctly?
Explanation for floating point rounding effect

printf behaviour in C

I am trying to understand what is the difference between the following:
printf("%f",4.567f);
printf("%f",4.567);
How does using the f suffix change/influence the output?
How using the 'f' changes/influences the output?
The f at the end of a floating point constant determines the type and can affect the value.
4.567 is floating point constant of type and precision of double. A double can represent exactly typical about 264 different values. 4.567 is not one on them*1. The closest alternative typically is exactly
4.56700000000000017053025658242404460906982421875 // best
4.56699999999999928235183688229881227016448974609375 // next best double
4.567f is floating point constant of type and precision of float. A float can represent exactly typical about 232 different values. 4.567 is not one on them. The closest alternative typically is exactly
4.566999912261962890625 // best
4.56700038909912109375 // next best float
When passed to printf() as part of the ... augments, a float is converted to double with the same value.
So the question becomes what is the expected difference in printing?
printf("%f",4.56700000000000017053025658242404460906982421875);
printf("%f",4.566999912261962890625);
Since the default number of digits after the decimal point to print for "%f" is 6, the output for both rounds to:
4.567000
To see a difference, print with more precision or try 4.567e10, 4.567e10f.
45670000000.000000 // double
45669998592.000000 // float
Your output may slightly differ to to quality of implementation issues.
*1 C supports many floating point encodings. A common one is binary64. Thus typical floating-point values are encoded as an sign * binary fraction * 2exponent. Even simple decimal values like 0.1 can not be represented exactly as such.

No Output Coming In Simple C Program

I have been asked a very simple question in the book to write the output of the following program -
#include<stdio.h>
int main()
{
float i=1.1;
while(i==1.1)
{
printf("%f\n",i);
i=i-0.1;
}
return 0;
}
Now I already read that I can use floating point numbers as loop counters but are not advisable which I learned. Now when I run this program inside the gcc, I get no output even though the logic is completely correct and according to which the value of I should be printed once. I tried printing the value of i and it gave me a result of 1.100000 . So I do not understand why the value is not being printed?
In most C implementations, using IEEE-754 binary floating-point, what happens in your program is:
The source text 1.1 is converted to a double. Since binary floating-point does not represent this value exactly, the result is the nearest representable value, 1.100000000000000088817841970012523233890533447265625.
The definition float i=1.1; converts the value to float. Since float has less precision than double, the result is 1.10000002384185791015625.
In the comparison i==1.1, the float 1.10000002384185791015625 is converted to double (which does not change its value) and compared to 1.100000000000000088817841970012523233890533447265625. Since they are unequal, the result is false.
The quantity 11/10 cannot be represented exactly in binary floating-point, and it has different approximations as double and as float.
The constant 1.1 in the source code is the double approximation of 11/10. Since i is of type float, it ends up containing the float approximation of 1.1.
Write while (i==1.1f) or declare i as double and your program will work.
Comparing floating point numbers:1
Floating point math is not exact. Simple values like 0.2 cannot be precisely represented using binary floating point numbers, and the limited precision of floating point numbers means that slight changes in the order of operations can change the result. Different compilers and CPU architectures store temporary results at different precision, so results will differ depending on the details of your environment. If you do a calculation and then compare the results against some expected value it is highly unlikely that you will get exactly the result you intended.
In other words, if you do a calculation and then do this comparison:
if (result == expectedResult)
then it is unlikely that the comparison will be true. If the comparison is true then it is probably unstable – tiny changes in the input values, compiler, or CPU may change the result and make the comparison be false.
In short:
1.1 can't be represented exactly in binary floating pint number. This is like the decimal representation of 10/3 in decimal which is 3.333333333..........
I would suggest you to Read the article What Every Computer Scientist Should Know About Floating-Point Arithmetic.
1. For the experts who are encouraging beginner programmers to use == in floating point comparision
It is because i is not quite exactly 1.1.
If you are going to test a floating point, you should do something along the lines of while(i-1.1 < SOME_DELTA) where delta is the threshold where equality is good enough.
Read: https://softwareengineering.stackexchange.com/questions/101163/what-causes-floating-point-rounding-errors

My floating point number has extra digits when I print it

I define a floating point number as float transparency = 0.85f; And in the next line, I pass it to a function -- fcn_name(transparency) -- but it turns out that the variable transparency has value 0.850000002, and when I print it with the default setting, it is 0.850000002. For the value 0.65f, it is 0.649999998.
How can I avoid this issue? I know floating point is just an approximation, but if I define a float with just a few decimals, how can I make sure it is not changed?
Floating-point values represented in binary format do not have any specific decimal precision. Just because you read in some spec that the number can represent some fixed amount of decimal digits, it doesn't really mean much. It is just a rough conversion of the physical (and meaningful) binary precision to its much less meaningful decimal approximation.
One property of binary floating-point format is that it can only represent precisely (within the limits of its mantissa width) the numbers that can be expressed as finite sums of powers of 2 (including negative powers of 2). Numbers like 0.5, 0.25, 0.75 (decimal) will be represented precisely in binary floating-point format, since these numbers are either powers of 2 (2^-1, 2^-2) or sums thereof.
Meanwhile, such number as decimal 0.1 cannot be expressed by a finite sum of powers of 2. The representation of decimal 0.1 in floating-point binary has infinite length. This immediately means that 0.1 cannot be ever represented precisely in finite binary floating-point format. Note that 0.1 has only one decimal digit. However, this number is still not representable. This illustrates the fact that expressing floating-point precision in terms of decimal digits is not very useful.
Values like 0.85 and 0.65 from your example are also non-representable, which is why you see these values distorted after conversion to a finite binary floating-point format. Actually, you have to get used to the fact that most fractional decimal numbers you will encounter in everyday life will not be representable precisely in binary floating-point types, regardless of how large these floating-point types are.
The only way I can think of solving this problem is to pass characteristic and mantissa to the function separately and let IT work on setting the values appropriately.
Also if you want more precision,
http://www.drdobbs.com/cpp/fixed-point-arithmetic-types-for-c/184401992 is the article I know. Though this works for C++ only. (Searching for an equivalent C implementation).
I tried this on VS2010,
#include <stdio.h>
void printfloat(float f)
{
printf("%f",f);
}
int main(int argc, char *argv[])
{
float f = 0.24f;
printfloat(f);
return 0;
}
OUTPUT: 0.240000

Resources