C: convert double to float, preserving decimal point precision

I want to convert a double to a float in C while preserving the digits after the decimal point as exactly as possible.
For example, let's say I have
double d = 0.1108;
double dd = 639728.170000;
double ddd = 345.2345678;
Correct me if I am wrong, but I understand that float precision is about 5 digits after the decimal point. Can I get those five digits exactly as the double had them, so that the code below behaves as follows?
float f = x(d);
float ff = x(dd);
float fff = x(ddd);
printf("%f\n%f\n%f\n", f, ff, fff);
it should print
0.1108
639728.17000
345.23456
All digits after the precision limit (which I assume to be 5) would be truncated.

float and double don't store decimal places. They store binary places: float is (assuming IEEE 754) 24 significant bits (7.22 decimal digits) and double is 53 significant bits (15.95 significant digits).
Converting from double to float will give you the closest possible float, so rounding won't help you. Going the other way may give you "noise" digits in the decimal representation.
#include <stdio.h>
int main(void) {
double orig = 12345.67;
float f = (float) orig;
printf("%.17g\n", f); // prints 12345.669921875
return 0;
}
To get a double approximation to the nice decimal value you intended, you can write something like:
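#include <math.h>
/* Rounds to 7 places after the decimal point, which matches a float's ~7 significant digits only for magnitudes below 1. */
double round_to_decimal(float f) {
return round(f * pow(10, 7)) / pow(10, 7);
}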

A float generally has about 7 digits of precision, regardless of the position of the decimal point. So if you want 5 digits of precision after the decimal, you'll need to limit the range of the numbers to less than somewhere around +/-100.

Floating point numbers are stored much like scientific notation: a significand of only about seven significant decimal digits (for float), multiplied by a power of the base that sets the position of the point.
More information about it on Wikipedia:
http://en.wikipedia.org/wiki/Floating_point

Related

Differences between using variable and using direct number in C language

I have such a program
int a = 100;
float b = 1.05;
printf("%f\n", a * b);
printf("%f\n", 100 * 1.05);
Its output looks like this:
104.999992
105.000000
I understand that 104.999992 is due to floating-point precision issues. But why does the use of direct numbers not result in a loss of accuracy?
a * b uses float arithmetic. 100 * 1.05 uses double arithmetic.
Your C implementation probably uses IEEE-754 binary64 for double and binary32 for float.
1.05 is a double constant. The decimal numeral 1.05 in source text is converted to the nearest representable double value, which is 4728779608739021•2^−52 = 1.0500000000000000444089209850062616169452667236328125.
float b = 1.05; converts that value to float. The result of the conversion is the nearest representable float value, which is 4728779393990656•2^−52 = 4404019•2^−22 = 1.0499999523162841796875.
When b is multiplied by 100 in float arithmetic, the result is the representable float value that is nearest the real-arithmetic result. It is 13762559•2^−17 = 104.99999237060546875.
When that is printed with six digits after the decimal point, the result is “104.999992”.
In 100 * 1.05, only double arithmetic is used; there are no float values. When the double 1.0500000000000000444089209850062616169452667236328125 is multiplied by 100, the result is the double nearest the real-arithmetic result. It is 105•2^0 = 105. When that is printed with six digits after the decimal point, the result is of course “105.000000”.
Summary: There are generally rounding errors in floating-point arithmetic. In float arithmetic, they are easily visible when more than a few digits are printed. In this example with double arithmetic, the rounding errors happened to cancel—the double nearest 1.05 was slightly above 1.05, so the rounding in the conversion rounded up. Then, when that was multiplied by 100, the double nearest the real-arithmetic result was slightly below it, so the rounding in the multiplication rounded down. These effectively canceled, producing 105.
If you execute printf("%f\n", 100*1.05f);, 1.05f will be converted directly to float instead of to a double, and the multiplication will be done with float arithmetic [1], and you will see the same result as for a * b.
Footnote
[1] The C standard allows extra precision to be used, but generally you will see the same result here.

Comparing float and double in C

I wrote the following code to compare between a float variable and a double variable in C.
#include <stdio.h>
int main()
{
float f = 1.1;
double d = 1.1;
if(f==d)
printf("EQUAL");
if(f < d)
printf("LESS");
if(f > d)
printf("GREATER");
return 0;
}
I am using an online C compiler here to compile my code.
I know that EQUAL will never be printed for recurring decimals. However what I expect should be printed is LESS since double should have a higher precision and therefore should be closer to the actual value of 1.1 than float is. As far as I know, in C when you compare float and double, the mantissa of the float is zero-extended to double, and that zero-extended value should always be smaller.
Instead in all situations GREATER is being printed. Am I missing something here?
The exact value of the closest float to 1.1 is 1.10000002384185791015625. The binary equivalent is 1.00011001100110011001101.
The exact value of the closest double to 1.1 is 1.100000000000000088817841970012523233890533447265625. The binary equivalent is 1.0001100110011001100110011001100110011001100110011010.
Lining up the two binary numbers next to each other:
1.00011001100110011001101
1.0001100110011001100110011001100110011001100110011010
The first few truncated bits for rounding to float are 11001100, which is greater than half, so the conversion to float rounded up, making its least significant bits 11001101. That rounding resulted in the most significant difference being a 1 in the float in a bit position that is 0 in the double. The float is greater than the double, regardless of values of bits of lower significance being zero in the float extended to double, but non-zero in the double.
If you add the following 2 lines after declaring the 2 variables:
printf("%.9g\n", f);
printf("%.17g\n", d);
you will get the following output:
1.10000002
1.1000000000000001
So it is easy to see that, because of the difference in precision, the float is greater than the double, and printing GREATER is correct.

Multiplying two floats doesn't give exact result

I am trying to multiply two floats as follows:
float number1 = 321.12;
float number2 = 345.34;
float result = number1 * number2;
The result I want to see is 110895.582, but when I run the code it just gives me 110896. I run into this issue most of the time. Any calculator gives me the exact result with all decimals. How can I achieve that result?
edit : It's C code. I'm using XCode iOS simulator.
There's a lot of rounding going on.
float a = 321.12; // this number will be rounded
float b = 345.34; // this number will also be rounded
float r = a * b; // and this number will be rounded too
printf("%.15f\n", r);
I get 110895.578125000000000 after the three separate roundings.
If you want more than 6 decimal digits' worth of precision, you will have to use double and not float. (Note that I said "decimal digits' worth", because you don't get decimal digits, you get binary.) As it stands, 1/2 ULP of error (a worst-case bound for a perfectly rounded result) is about 0.004.
If you want exactly rounded decimal numbers, you will have to use a specialized decimal library for such a task. A double has more than enough precision for scientists, but if you work with money everything has to be 100% exact. No floating point numbers for money.
Unlike integers, floating point numbers take some real work before you can get accustomed to their pitfalls. See "What Every Computer Scientist Should Know About Floating-Point Arithmetic", which is the classic introduction to the topic.
Edit: Actually, I'm not sure that the code rounds three times. It might round five times, since the constants for a and b might be rounded first to double-precision and then to single-precision when they are stored. But I don't know the rules of this part of C very well.
You will never get the exact result that way.
First of all, number1 ≠ 321.12 because that value cannot be represented exactly in a base-2 system. You would need an infinite number of bits for it.
The same holds for number2 ≠ 345.34.
So you start with inexact values.
Then the product will get rounded because multiplication gives you double the number of significant digits but the product has to be stored in float again if you multiply floats.
You probably want to use a base-10 system for your numbers. Or, if your numbers only have two decimal digits in the fractional part, you can use integers (32-bit integers are sufficient in this case, but you may end up needing 64-bit):
32112 * 34534 = 1108955808.
That represents 321.12 * 345.34 = 110895.5808.
Since you are using C you could easily set the precision by using "%.xf" where x is the wanted precision.
For example:
float n1 = 321.12;
float n2 = 345.34;
float result = n1 * n2;
printf("%.20f", result);
Output:
110895.57812500000000000000
However, note that float only gives six digits of precision. For better precision use double.
Floating point variables are only an approximate representation, not a precise one. Not every number can "fit" into a float variable. For example, there is no way to put 1/10 (0.1) into a binary variable, just as it's not possible to put 1/3 into a decimal one (you can only approximate it with endless 0.33333).
When outputting such variables, several rounding options usually apply. Unless you set them all, you can never be sure which of them are applied. This is especially true for << operators, as the stream can be told how to round BEFORE <<.
Printf also does some rounding. Consider http://codepad.org/LLweoeHp:
float t = 0.1f;
printf("result: %f\n", t);
--
result: 0.100000
Well, it looks fine. Why? Because printf defaulted to six digits after the decimal point and rounded the output. Let's dial in 50 places after the decimal point: http://codepad.org/frUPOvcI
float t = 0.1f;
printf("result: %.50f\n", t);
--
result: 0.10000000149011611938476562500000000000000000000000
That's different, isn't it? The digits stop after 625 because the value actually stored in the float is exactly 0.100000001490116119384765625; the zeroes after it are just padding.
A double can hold more digits, but 0.1 in binary is not finite. Double has to give up, eventually: http://codepad.org/RAd7Yu2r
double t = 0.1;
printf("result: %.70f\n", t);
--
result: 0.1000000000000000055511151231257827021181583404541015625000000000000000
In your example, 321.12 alone is enough to cause trouble: http://codepad.org/cgw3vUKn
float t = 321.12f;
printf("and the result is: %.50f\n", t);
result: 321.11999511718750000000000000000000000000000000000000
This is why one has to round floating point values before presenting them to humans.
Calculator programs typically don't use floats or doubles. They implement a decimal number format, e.g.:
struct decimal
{
int mantissa; //meaningful digits
int exponent; //number of decimal zeroes
};
Of course, that requires reinventing all operations: addition, subtraction, multiplication and division. Or just look for a decimal library.

How to set precision of a float

Can someone explain to me how to choose the precision of a float with a C function?
Examples:
theFatFunction(0.666666666, 3) returns 0.667
theFatFunction(0.111111111, 3) returns 0.111
You can't do that, since precision is determined by the data type (i.e. float or double or long double). If you want to round it for printing purposes, you can use the proper format specifiers in printf(), i.e. printf("%0.3f\n", 0.666666666).
You can't. Precision depends entirely on the data type. You've got float and double and that's it.
Floats have a static, fixed precision. You can't change it. What you can sometimes do, is round the number.
See this page, and consider to scale yourself by powers of 10. Note that not all numbers are exactly representable as floats, either.
Most systems follow IEEE-754 floating point standard which defines several floating point types.
On these systems, usually float is the IEEE-754 binary32 single precision type: it has 24-bit of precision. double is the binary64 double precision type; it has 53-bit of precision. The precision in bit numbers is defined by the IEEE-754 standard and cannot be changed.
When you print values of floating point types using functions of the fprintf family (e.g., printf), the precision of the f conversion is the number of digits printed after the decimal point, 6 by default (for the g conversion it is instead the maximum number of significant digits). You can change the default precision with a . followed by a decimal number in the conversion specification. For example:
printf("%.10f\n", 4.0 * atan(1.0)); // prints 3.1415926536
whereas
printf("%f\n", 4.0 * atan(1.0)); // prints 3.141593
It might be roughly the following steps:
Add 0.0005 to 0.666666666 (we get 0.667166666)
Multiply by 1000 (we get 667.166666)
Convert the number to an int, which truncates (we get 667)
Convert it back to float (we get 667.0)
Divide by 1000 (we get 0.667)
Thank you.
Precision is determined by the data type (i.e. float or double or long double).
If you want to round it for printing purposes, you can use the proper format specifiers in printf(), i.e.
printf("%0.3f\n", 0.666666666); //will print 0.667 in c
Now if you want to round it for calculation purposes, first multiply the float by 10^(number of digits), cast to int, do the calculation, then cast back to float and divide by the same power of 10:
float f=0.66666;
f *= 1000; // 666.660
int i = (int)f; // 666
i = 2*i; // 1332
f = i; // 1332
f /= 1000; // 1.332
printf("%f",f); //1.332000

Comparison of float and double variables

I am using Visual C++ 6.0, and in a program I am comparing float and double variables.
For example for this program
#include<stdio.h>
int main()
{
float a = 0.7f;
double b = 0.7;
printf("%d %d %d",a<b,a>b,a==b);
return 0;
}
I am getting 1 0 0 as output
and for
#include<stdio.h>
int main()
{
float a = 1.7f;
double b = 1.7;
printf("%d %d %d",a<b,a>b,a==b);
return 0;
}
I am getting 0 1 0 as output.
Please tell me why I am getting this weird output and whether there is any way to predict these outputs on the same processor. Also, how is the comparison of two variables done in C?
It has to do with the way the internal representation of floats and doubles are in the computer. Computers store numbers in binary which is base 2. Base 10 numbers when stored in binary may have repeating digits and the "exact" value stored in the computer is not the same.
When you compare floats, it's common to use an epsilon to denote a small change in values. For example:
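#include <math.h> // for fabs; abs() works on integers only
float epsilon = 0.000001f; // must be larger than float's rounding error near 0.7 (about 1e-8)
float a = 0.7;
double b = 0.7;
if (fabs(a - b) < epsilon)
// they are close enough to be equal.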
1.7 (a double) and 1.7f (a float) are very likely to be different values: one is the closest you can get to the exact value 1.7 in a double representation, and one is the closest you can get to the exact value 1.7 in a float representation.
To put it into simpler-to-understand terms, imagine that you had two types, shortDecimal and longDecimal. shortDecimal is a decimal value with 3 significant digits. longDecimal is a decimal value with 5 significant digits. Now imagine you had some way of representing pi in a program, and assigning the value to shortDecimal and longDecimal variables. The short value would be 3.14, and the long value would be 3.1416. The two values aren't the same, even though they're both the closest representable value to pi in their respective types.
1.7 is decimal. In binary, it has no finite representation.
Therefore, 1.7 and 1.7f differ.
Heuristic proof: when you shift bits to the left (ie multiply by 2) it will in the end be an integer if ever the binary representation is “finite”.
But in decimal, multiply 1.7 by 2, and again: you will only obtain non-integers (decimal part will cycle between .4, .8, .6 and .2). Therefore 1.7 is not a sum of powers of 2.
You generally shouldn't compare floating point variables of different types for exact equality. Most decimal fractions can only be approximated in binary, and a float and a double approximate them with different precision.

Resources