Why does casting to int not floor all doubles in C - c

Given the C code:
#include <math.h>
#include <stdio.h>
int main(){
int i;
double f=log(2.0)/log(pow(2.0,1.0/2.0));
printf("double=%f\n",f);
printf("int=%d\n",(int) f);
}
I get the output:
double=2.000000
int=1
f is apparently at least 2.0. Why is the cast value not 2?

Because the value of the double was less than 2
double=1.99999999999999978
int=1
Try it but this time add some precision
#include <math.h>
#include <stdio.h>
int main(){
int i;
double f=log(2.0)/log(pow(2.0,1.0/2.0));
printf("double=%0.17f\n",f);
printf("int=%d\n",(int) f);
}
Confusing as hell the first time you experience it. Now, if you try this
#include <math.h>
#include <stdio.h>
int main(){
int i;
double f=log(2.0)/log(pow(2.0,1.0/2.0));
printf("double=%0.17f\n",f);
printf("int=%d\n",(int) f);
printf("int=%d\n", (int)round(f));
}
It will round the value correctly. If you look at the man page (on a Mac at least), you'll see the following comment:
round, lround, llround -- round to integral value, regardless of rounding direction
What do they mean by direction? It is all specified in IEEE 754, which defines several ways to round to an integer. A cast to int always truncates toward zero, which for a positive value is the same as floor (rounding toward negative infinity), and in this case that meant 1 :)

Floating point types are unable to represent some numbers exactly. Even though mathematically, the calculation should be exactly 2, with IEEE754 floating points, the result turns out to be slightly less than 2. For more information see this question.
If you increase the precision of your output by specifying %.20f instead of just %f, you will see that the number is not exactly 2, and that when printing with less accuracy, the result is simply rounded.
When converting to an int however, the full accuracy of the floating point type is used, and since it comes to just under 2, the result when converting to an integer is 1.

Conversion to an integer does not round; it truncates. So, if f is equal to 1.999999999999[..]9, the int value will be 1.
Also, printf rounds a double to the number of digits it is asked to show, and with the default six decimal places that gives 2.000000.

Related

Using round() function in c

I'm a bit confused about the round() function in C.
First of all, man says:
SYNOPSIS
#include <math.h>
double round(double x);
RETURN VALUE
These functions return the rounded integer value.
If x is integral, +0, -0, NaN, or infinite, x itself is returned.
The return value is a double / float or an int?
Second, I've created a function that first rounds, then casts to int. Later in my code I use it as a means to compare doubles
int tointn(double in,int n)
{
int i = 0;
i = (int)round(in*pow(10,n));
return i;
}
This function apparently isn't stable throughout my tests. Is there redundancy here? Well... I'm not looking only for an answer, but a better understanding on the subject.
The wording in the man-page is meant to be read literally, that is in its mathematical sense. The wording "x is integral" means that x is an element of Z, not that x has the data type int.
Casting a double to int can be dangerous: a double can hold every integral value exactly only up to 2^53 (assuming an IEEE 754 conforming binary64), while the maximum value an int can hold might be much smaller (int is usually 32 bits on 32-bit architectures and also 32 bits on many 64-bit architectures). Converting a value that is outside the range of int is undefined behavior.
If you need only powers of ten you can test it with this little program yourself:
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
int main(){
int i;
for(i = 0;i < 26;i++){
printf("%d:\t%.2f\t%d\n",i, pow(10,i), (int)pow(10,i));
}
exit(EXIT_SUCCESS);
}
Instead of casting you should use the functions that return a proper integral data type like e.g.: lround(3).
Here is an excerpt from the man page:
#include <math.h>
double round(double x);
float roundf(float x);
long double roundl(long double x);
Note: the returned value never has an integer type. It is a floating-point value whose fractional part is 0.
Note: the type of the returned value depends on which of these functions is called.
Here is an excerpt from the man page about which way the rounding will be done:
These functions round x to the nearest integer, but round halfway cases
away from zero (regardless of the current rounding direction, see
fenv(3)), instead of to the nearest even integer like rint(3).
For example, round(0.5) is 1.0, and round(-0.5) is -1.0.
If you want a long integer to be returned then please use lround:
long int tolongint(double in)
{
return lround(in);
}
For details please see lround, which is available as of C99 (and C++11).

Power function returns 1 less result

Whenever I input a number in this program the program return a value which is 1 less than the actual result ... What is the problem here??
#include<stdio.h>
#include<math.h>
int main(void)
{
int a,b,c,n;
scanf("%d",&n);
c=pow((5),(n));
printf("%d",c);
}
pow() returns a double, and the implicit conversion from double to int rounds towards zero (truncation).
So it depends on the behavior of the pow() function.
If it's perfect then no problem, the conversion is exact.
If not:
1) if the result is slightly larger than the exact value, the conversion truncates it down to the expected value.
2) if the result is slightly smaller than the exact value, the conversion truncates it to one less than the expected value, which is what you see.
solution:
Change the conversion to "round to nearest integer" by using rounding functions
c=lround(pow((5),(n)));
In this case, as long as pow() has an error of less than +-0.5 you will get the expected result.
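Another way to sidestep the problem, when both base and exponent are integers, is to avoid floating point altogether. This is a sketch of exponentiation by squaring; ipow is a hypothetical helper, not a standard library function, and it does not detect overflow of long long.

```c
/* Hypothetical helper: integer power by repeated squaring.
   No overflow detection: the caller must keep the result within the
   range of long long. */
long long ipow(long long base, unsigned exp)
{
    long long result = 1;
    while (exp > 0) {
        if (exp & 1u) {
            result *= base;   /* this bit of the exponent is set */
        }
        exp >>= 1;
        if (exp > 0) {
            base *= base;     /* square only if another round follows */
        }
    }
    return result;
}
```

In the program from the question this would be used as c = ipow(5, n); for non-negative n that are small enough for the result to fit.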
pow() takes double arguments and returns a double.
If you store the return value into an int and print that, you may not get the desired result.
If you need accurate results for big numbers, you should use an arbitrary precision math library like GMP. It's easy:
#include <stdio.h>
#include <gmp.h>
int main(void) {
unsigned long n;
mpz_t c;
if (scanf("%lu",&n) != 1) return 1;
mpz_init(c); /* an mpz_t must be initialized before use */
mpz_ui_pow_ui(c, 5, n);
gmp_printf("%Zd\n", c);
mpz_clear(c); /* release the memory held by c */
return 0;
}
You can pass the result of pow() directly to printf with the %f format specifier and you should see the rounded value. When the result is stored in an int and printed with %d, the fractional part is discarded first, e.g. if pow(5, 2) comes out as 24.9999999999, the value shown is 24.

declaring a variable with only 2 decimal points in C

I am trying to declare a variable eLon with only 1 decimal place so that if I have the code:
eLon=359.8;
printf("eLon = %f\n",eLon);
and the output will be
eLon = 359.8
However the output I get is:
eLon = 359.7999999.
I know that I could modify the printf so that it would be:
printf("eLon is %0.1f\n",eLon);
to get the desired result. This is NOT what I want to do. I just want the variable itself to only have one decimal place so that it equals 359.8 not 359.7999999, since this is critical for any computations I make. Do you know how I should modify my code to get the desired result. I tried doing what was suggested in other inquiries but it did not work for example the code:
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
int main()
{
int endLonn;
float endLon,eLon;
endLon=359.8;
endLonn=359.8*10;
printf("%d is endLonn\n",endLonn);
eLon= endLonn / 10;
printf("elon is %f\n",eLon);
}
gives me the output:
elon is 359.00000000
Again this is also not what I am looking for. I want elon is 359.8. If you could help me tweak my code to get the desired result that would be great. Thank you for your time.
Instead of trying to cheat double or float and printf, you can use fixed point arithmetic. In the particular case of your example you will express the variable in units of 0.1 of its original value:
int in_10s;
in_10s=3598;
printf("My fixed point variable "
"represents %d.%01d value\n",
in_10s/10, abs(in_10s % 10));
In the general case of n decimal places, the printf format should be %d.%0nd, where n = log_10(1/scaling_factor) and the scaling factor is the value represented by one unit of the stored integer (0.1 in the example above).
What you want is not a floating point number but a fixed point decimal, which is for example available in C# as decimal. C doesn't give you that. There are two "typical" approaches to roll your own:
Use a plain int and just have the decimal point at some position by convention. e.g. your value would be int elon=3598.
Use a struct with two ints, one for the whole number part and one for the amount of tenths (or hundredths, thousandths, ...)
In both cases, you will need to implement your own logic for output. The simple approach using a plain int at least lets you use basic arithmetics as usual.
You actually can't declare a variable to hold only one decimal figure. The only native types able to represent non-integer values are floating point types, and any arithmetic operation on them yields a value of the same nature. Floating point types cannot be limited to a fixed number of decimal figures.
In order to achieve what you are asking for, you have to either implement such behavior yourself or use an existing implementation of what is called fixed-point arithmetic.
A pretty complete introduction can be found in Wikipedia's Fixed-Point arithmetic entry. At the end you will find a couple of libraries implementing it.
I just want the variable itself to only have one decimal place so that it equals 359.8 not 359.7999999,
You should use double:
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
int main(void){
int endLonn;
double endLon,eLon;
endLon=359.8;
endLonn=359.8*10;
printf("%d is endLonn\n",endLonn);
eLon = endLonn / 10.0; /* divide by 10.0: endLonn / 10 would be integer division */
printf("elon is %.1lf\n%.1lf",eLon,endLon);
return 0;
}
Output:
3598 is endLonn
elon is 359.8
359.8

Division of two floats giving incorrect answer

Attempting to divide two floats in C, using the code below:
#include <stdio.h>
#include <math.h>
int main(){
float fpfd = 122.88e6;
float flo = 10e10;
float int_part, frac_part;
int_part = (int)(flo/fpfd);
frac_part = (flo/fpfd) - int_part;
printf("\nInt_Part = %f\n", int_part);
printf("Frac_Part = %f\n", frac_part);
return(0);
}
To this code, I use the commands:
>> gcc test_prog.c -o test_prog -lm
>> ./test_prog
I then get this output:
Int_Part = 813.000000
Frac_Part = 0.802063
Now, this Frac_part it seems is incorrect. I have tried the same equation on a calculator first and then in Wolfram Alpha and they both give me:
Frac_Part = 0.802083
Notice the number at the fifth decimal place is different.
This may seem insignificant to most, but for the calculations I am doing it is of paramount importance.
Can anyone explain to me why the C code is making this error?
When you have inadequate precision from floating point operations, the first most natural step is to just use floating point types of higher precision, e.g. use double instead of float. (As pointed out immediately in the other answers.)
Second, examine the different floating point operations and consider their precisions. The one that stands out to me as being a source of error is the method above of separating a float into integer part and fractional part, by simply casting to int and subtracting. This is not ideal, because, when you subtract the integer part from the original value, you are doing arithmetic where the three numbers involved (two inputs and result) have very different scales, and this will likely lead to precision loss.
I would suggest to use the C <math.h> function modf instead to split floating point numbers into integer and fractional part. http://www.techonthenet.com/c_language/standard_library_functions/math_h/modf.php
(In greater detail: When you do an operation like f - (int)f, the floating point addition procedure is going to see that two numbers of some given precision X are being added, and it's going to naturally assume that the result will also have precision X. Then it will perform the actual computation under that assumption, and finally reevaluate the precision of the result at the end. Because the initial prediction turned out not to be ideal, some low order bits are going to get lost.)
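A minimal sketch of the modf approach, combined with double precision, might look like this. split_quotient is a made-up name; modf itself is the standard <math.h> function.

```c
#include <math.h>

/* Hypothetical helper: divide in double precision and split the quotient
   with modf(). Returns the fractional part; the integer part is stored in
   *int_part as a double, so it is not limited to the range of int. */
double split_quotient(double num, double den, double *int_part)
{
    return modf(num / den, int_part);
}
```

For the values in the question, split_quotient(10e10, 122.88e6, &ip) stores 813.0 in ip and returns a fractional part of roughly 0.802083.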
float is a single-precision floating point type; you should try double instead. The following code gives me the right result:
#include <stdio.h>
#include <math.h>
int main(){
double fpfd = 122.88e6;
double flo = 10e10;
double int_part, frac_part;
int_part = (int)(flo/fpfd);
frac_part = (flo/fpfd) - int_part;
printf("\nInt_Part = %f\n", int_part);
printf("Frac_Part = %f\n", frac_part);
return(0);
}
Why?
As I said, float is a single-precision floating point type; it is smaller than double (on most architectures, sizeof(float) < sizeof(double)).
By using double instead of float you get more bits to store the mantissa and the exponent of the number (see Wikipedia).
float has only 6~9 significant digits, it's not precise enough for most uses in practice. Changing all float variables to double (which provides 15~17 significant digits) gives output:
Int_Part = 813.000000
Frac_Part = 0.802083

sin, cos, tan and rounding error

I'm doing some trigonometry calculations in C/C++ and am running into problems with rounding errors. For example, on my Linux system:
#include <stdio.h>
#include <math.h>
int main(int argc, char *argv[]) {
printf("%e\n", sin(M_PI));
return 0;
}
This program gives the following output:
1.224647e-16
when the correct answer is of course 0.
How much rounding error can I expect when using trig functions? How can I best handle that error? I'm familiar with the Units in Last Place technique for comparing floating point numbers, from Bruce Dawson's Comparing Floating Point Numbers, but that doesn't seem to work here, since 0 and 1.22e-16 are quite a few ULPs apart.
The answer is only 0 for sin(pi) - did you include all the digits of Pi ?
-Has anyone else noticed a distinct lack of, irony/sense of humour around here?
An IEEE double stores 52 bits of mantissa, with the "implicit leading
one" forming a 53 bit number. An error in the bottom bit of a result
therefore makes up about 1/2^53 of the scale of the numbers. Your output is
of the same order as 1.0, so that comes out to just about exactly one
part in 10^16 (because 53*log(2)/log(10) == 15.9).
So yes. This is about the limit of the precision you can expect. I'm
not sure what the ULP technique you're using is, but I suspect you're
applying it wrong.
Sine of π is 0.0.
Sine of M_PI is about 1.224647e-16.
M_PI is not π.
program gives ... 1.224647e-16 when the correct answer is of course 0.
Code gave a correct answer to 7 significant places.
The following does not print the sine of π. It prints the sine of a number close to π:
π // 3.141592653589793 2384626433832795...
printf("%.21f\n", M_PI); // 3.141592653589793 115998
printf("%.21f\n", sin(M_PI));// 0.000000000000000 122465
Note: For the math function sin(x), the slope of the curve is -1.0 at x = π, so sin(M_PI) is approximately π - M_PI. The difference between π and M_PI is about sin(M_PI), as expected.
am running into problems with rounding errors
The rounding problem occurs when using M_PI to represent π. M_PI is the double constant closest to π, yet since π is irrational and all finite doubles are rational, they must differ, even if only by a small amount. So this is not a direct rounding issue with sin(), cos(), tan(). sin(M_PI) simply exposed the issue that started with using M_PI, an inexact π.
This problem, with different non-zero results of sin(M_PI), also occurs if the code uses a different FP type like float, long double, or a double with something other than 53 binary bits of precision. This is not a precision issue so much as an irrational/rational one.
@Josh Kelley - ok, serious answer.
In general you should never compare the results of any operation involving floats or doubles with each other for exact equality.
The only exception is assignment.
float a=10.0;
float b=10.0;
then a==b
Otherwise you always have to write some function like bool IsClose(float a,float b, float error) to allow you to check if two numbers are within 'error' of each other.
Remember to also check signs/use fabs - you could have -1.224647e-16
There are two sources of error. The sin() function and the approximated value of M_PI. Even if the sin() function were 'perfect', it would not return zero unless the value of M_PI were also perfect - which it is not.
I rather think that will be system-dependent. I don't think the Standard has anything to say on how accurate the transcendental functions will be. Unfortunately, I don't remember seeing any discussion of function precision, so you'll probably have to figure it out yourself.
Unless your program requires significant digits out to the 16th decimal place or more, you probably can do the rounding manually. From my experience programming games we always rounded our decimals to a tolerable significant digit. For example:
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
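#define HALF 0.5
#define GREATER_EQUAL_HALF(X) ((X) >= HALF)
/* Named round_places so it does not clash with round() from <math.h>. */
double round_places(double val, unsigned places)
{
double scale = pow(10.0, (double)places);
double scaled = val * scale;
/* Use the distance from floor() so that negative values are handled too. */
if ( GREATER_EQUAL_HALF(scaled - floor(scaled)) ) {
return ceil(scaled) / scale;
} else {
return floor(scaled) / scale;
}
}
int main(void)
{
/* Computed locally instead of redefining the M_PI macro that many <math.h> headers provide. */
double const pi = 2 * acos(0.0);
printf("\nValue %lf", round_places(sin(pi), 10));
return 0;
}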
I get the exact same result on my system - I'd say it is close enough
I would solve the problem by changing the format string to "%f\n" :)
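However, this gives you a different result; on my system it prints -3.661369e-245. Note that passing a long double to %e is a format mismatch and therefore undefined behavior (%Le is the matching specifier), so this value should not be trusted.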
#include <stdio.h>
#include <math.h>
int main(int argc, char *argv[]) {
printf("%e\n", (long double)sin(M_PI));
return 0;
}
Maybe too low accuracy of implementation
M_PI = 3.14159265358979323846 (M_PI is not π)
http://fresh2refresh.com/c/c-function/c-math-h-library-functions/
It is an inaccuracy in the implementation; see Stephen C. Steel's comment under Andy Ross's answer above and chux's answer.

Resources