#include <stdio.h>
void volume()
{
float pi=3.14,r,v;
printf("\nEnter the radius: ");
scanf("%f",&r);
v=(1/3)*pi*r*r*r;
printf("%f",v);
}
int main()
{
volume();
}
This is a C program to find a volume using functions.
The compiler shows no errors or warnings, but the output is always zero.
You get zero here because 1/3 is integer division, which evaluates to 0.
Replace v=(1/3)*pi*r*r*r; with v=(1.f/3.f)*pi*r*r*r;
That's because anything multiplied by 0 is 0, and 1/3 is 0.
1/3 is 0 because both 1 and 3 are literals of type int, and dividing two integers results in an integer again.
Fix: 1.0/3.0 because 1.0 and 3.0 are floating point literals, so you get floating-point division.
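You can see the difference with a tiny standalone demo (my own illustration, not part of the original program):
#include <stdio.h>

int main(void)
{
    printf("%d\n", 1 / 3);     /* integer division: prints 0 */
    printf("%f\n", 1.0 / 3.0); /* floating-point division: prints 0.333333 */
    return 0;
}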
It's due to the fact that 1 / 3 is done as integer division because both operands are type int(a). So the final result is zero.
You can get the intended effect simply by ensuring one of the operands is non-integer, such as with:
v = (1.0f / 3) * pi * r * r * r;
That will work, because it results in a float (inside the parentheses) of 0.333.... However, there's absolutely no real reason why you need to parrot the equation shown in textbooks; you can achieve the same result with the simpler:
v = pi * r * r * r / 3;
Because all those variables are floating point, the result of the final something / 3 is also done as floating point.
And, just some quick advice: you may want to consider using double rather than float. The float type generally uses less space but, unless you have large arrays of them, space is not usually a problem, and the double type gives a much greater range and precision.
In addition, 3.14 is not really that precise a value for pi. Most implementations will define an M_PI constant in math.h for you to use but it's not mandated by the standard. So, you can use something like this to get a more accurate value:
#include <stdio.h>
#include <math.h>
// Define if implementation doesn't provide.
#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif
static void GetRadiusAndPrintVolume(void) {
printf("\nEnter the radius: ");
double radius;
scanf("%lf", &radius);
double volume = M_PI * radius * radius * radius / 3;
printf("Volume for radius %f is %f\n", radius, volume);
}
int main() {
GetRadiusAndPrintVolume();
}
And, finally, you may want to check that equation of yours. Though you don't say explicitly, it very much looks like it's supposed to be the volume of a sphere.
If that is the case, the formula should be 4/3 π r^3 rather than 1/3 π r^3. Hence the statement would be:
double volume = M_PI * radius * radius * radius * 4 / 3;
(a) If you're interested, this is all covered by the "Usual arithmetic conversions" section of the C standard (section 6.3.1.8 in C11):
Many operators that expect operands of arithmetic type cause conversions and yield result types in a similar way. The purpose is to determine a common real type for the operands and result.
It then goes on to list what happens for specific cases, which is generally that the "lesser" (in terms of its range and/or precision) operand is converted to the type of the "greater" operand. That's the basic idea; if you want the full picture, you should refer to the previously mentioned section.
In your specific case, since both operands of 1 / 3 are of type int, the calculation is done as an int and the result is an int, so the fractional part is truncated (giving zero).
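As a quick illustration of those conversion rules (a small demo of my own, not from the quoted standard text):
#include <stdio.h>

int main(void)
{
    int i = 7;
    float f = 2.0f;
    printf("%d\n", i / 2);   /* int / int    -> int:    prints 3 */
    printf("%f\n", i / f);   /* int / float  -> float:  the int is converted, prints 3.500000 */
    printf("%f\n", i / 2.0); /* int / double -> double: prints 3.500000 */
    return 0;
}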
Because 1/3 is not 0.3333... in C; it is 0.
Solution:
#include <stdio.h>
void volume()
{
float pi=3.14,r,v;
printf("\nEnter the radius: ");
scanf("%f",&r);
v=(((float)1)/3)*pi*r*r*r;
printf("%f",v);
}
int main()
{
volume();
}
Related
I'm trying to understand something about sin and sinf from math.h.
I understand that their types differ: the former takes and returns doubles, and the latter takes and returns floats.
However, GCC still compiles my code if I call sin with float arguments:
#include <stdio.h>
#include <math.h>
#define PI 3.14159265
int main ()
{
float x, result;
x = 135 / 180 * PI;
result = sin (x);
printf ("The sin of (x=%f) is %f\n", x, result);
return 0;
}
By default, all compiles just fine (even with -Wall, -std=c99 and -Wpedantic; I need to work with C99). GCC won't complain about me passing floats to sin. If I enable -Wconversion then GCC tells me:
warning: conversion to ‘float’ from ‘double’ may alter its value [-Wfloat-conversion]
result = sin (x);
^~~
So my question is: is there a float input for which using sin, like above, and (implicitly) casting the result back to float, will result in a value that is different from that obtained using sinf?
This program finds three examples on my machine:
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
int main()
{
int i;
float f;
for(i = 0; i < 10000; i++) {
f = (float)rand() / RAND_MAX;
float f1 = sinf(f);
float f2 = sin(f);
if(f1 != f2) printf("jackpot: %.8f %.8f %.8f\n", f, f1, f2);
}
}
I got:
jackpot: 0.98704159 0.83439910 0.83439904
jackpot: 0.78605396 0.70757037 0.70757031
jackpot: 0.78636044 0.70778692 0.70778686
This will find all the float input values in the range 0.0 to 2 * M_PI where (float)sin(input) != sinf(input):
#include <stdio.h>
#include <math.h>
#include <float.h>
#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif
int main(void)
{
for (float in = 0.0; in < 2 * M_PI; in = nextafterf(in, FLT_MAX)) {
float sin_result = (float)sin(in);
float sinf_result = sinf(in);
if (sin_result != sinf_result) {
printf("sin(%.*g) = %.*g, sinf(%.*g) = %.*g\n",
FLT_DECIMAL_DIG, in, FLT_DECIMAL_DIG, sin_result,
FLT_DECIMAL_DIG, in, FLT_DECIMAL_DIG, sinf_result);
}
}
return 0;
}
There are 1020963 such inputs on my amd64 Linux system with glibc 2.32.
float precision is approximately 6 significant decimal figures, while double is good for about 15. (It is approximate because they are binary floating point values, not decimal floating point.)
So, for example, a double value 1.23456789 will become 1.23456xxx as a float, where xxx are unlikely to be 789 in this case.
Clearly not all (in fact very few) double values are exactly representable by float, so will change value when down-converted.
So for:
double a = 1.23456789 ;
float b = a ;
printf( "double: %.10f\n", a ) ;
printf( "float: %.10f\n", b ) ;
The result in my test was:
double: 1.2345678900
float: 1.2345678806
As you can see the float in fact retained 9 significant figures in this case, but it is by no means guaranteed for all possible values.
In your test you have limited the number of instances of mismatch because of the limited and finite range of rand() and also because f itself is float. Consider:
#include <math.h>
#include <stdio.h>
int main()
{
unsigned mismatch_count = 0 ;
unsigned iterations = 0 ;
for( double f = 0; f < 6.28318530718; f += 0.000001)
{
float f1 = sinf(f);
float f2 = sin(f);
iterations++ ;
if(f1 != f2)
{
mismatch_count++ ;
}
}
printf("%f%%\n", (double)mismatch_count / iterations * 100.0);
}
In my test about 55% of comparisons mismatched. Changing f to float, the mismatches reduced to 1.3%.
So in your test, you see few mismatches because of the constraints of your method of generating f and its type. In the general case the issue is much more obvious.
In some cases you might see no mismatches - an implementation may simply implement sinf() using sin() with explicit casts. The compiler warning is for the general case of implicitly casting a double to a float without reference to any operations performed prior to the conversion.
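For illustration, such a wrapper might look like the sketch below (a hypothetical example, not any particular library's actual source; it is named my_sinf here so it does not clash with the library's own sinf()):
#include <math.h>

/* hypothetical sinf() built on top of sin(), using explicit casts */
static float my_sinf(float x)
{
    return (float)sin((double)x);
}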
However, GCC still compiles my code if I call sin with float arguments:
Yes, this is because the argument is implicitly converted to double (because sin() takes a double), and the result is converted back to float (because sin() returns a double) on entering and returning from the call to sin(). See below for why it is better to use sinf() in this case rather than relying on the one double-precision function.
You have included math.h, which has prototypes for both functions:
double sin(double);
float sinf(float);
And so, the compiler knows that calling sin() requires a conversion from float to double, so it compiles that conversion in before the call, and it also compiles in a conversion from double back to float for the result of sin().
If you had not included <math.h> and had ignored the compiler warning about calling sin() with no prototype, the compiler would still have converted the float to double first (with unspecified parameter types, that is how the default argument promotions work) and passed the double to the function, which would then be assumed to return int, provoking serious undefined behaviour.
If you use the sinf() function (with the proper prototype in scope) and pass a float, then no conversion needs to be compiled: the float is passed as such with no type conversion, and the returned float value is assigned to a float variable, also with no conversion. Everything proceeds with no conversions, which makes for the fastest code.
If you had used the sinf() function with no prototype and passed a float, that float would be promoted to double and passed as such to sinf(), resulting in undefined behaviour. If sinf() somehow returned properly anyway, the assumed int result (which may or may not have anything to do with the actual calculation, as per UB) would be converted to float (should that even be possible) and assigned to the result variable.
In the case above, if you are operating on floats, it is better to use sinf(): it takes less time to execute (fewer iterations are needed, as less precision is required of them), and the two conversions (from float to double and back from double to float) do not have to be compiled into the binary code output by the compiler.
There are some systems where computations on float are an order of magnitude faster than computations on double. The primary purpose of sinf is to allow trigonometric calculations to be performed efficiently on such systems in cases where the lower precision of float would be adequate to satisfy application needs. Converting a float value to double, calling sin, and converting the result back to float would always yield a value that either matched that of sinf or was more accurate(*), and on some implementations that would in fact be the most efficient way of implementing sinf. On some other systems, however, such an approach would be more than an order of magnitude slower than using a purpose-designed function to evaluate the sine of a float.
(*) Note that for arguments outside the range +/- π/2, the most mathematically accurate way of computing sin(x) for an exact specified value of x might not be the most accurate way of computing what the calling code wants to know. If an application computes sinf(angle * (2.0f * 3.14159265f)) when angle is 0.5, having the function return a value near (double)3.1415926535897932385 - (float)3.14159265f (the true sine of the rounded float argument) may be more "mathematically accurate" than having it return a value near zero, but the latter would more accurately represent the sine of the angle the code was actually interested in.
I am new to C, and my task is to create a function
f(x) = sqrt[(x^2)+1]-1
that can handle very large numbers and very small numbers. I am submitting my script on an online interface that checks my answers.
For very large numbers I simplify the expression to:
f(x) = x-1
By just using the highest power. This was the correct answer.
The same logic does not work for smaller numbers. For small numbers (on the order of 1e-7), they are very quickly truncated to zero, even before they are squared. I suspect that this has to do with floating point precision in C. In my textbook, it says that the float type has smallest possible value of 1.17549e-38, with 6 digit precision. So although 1e-7 is much larger than 1.17e-38, it has a higher precision, and is therefore rounded to zero. This is my guess, correct me if I'm wrong.
As a solution, I am thinking that I should convert x to a long double when x < 1e-6. However when I do this, I still get the same error. Any ideas? Let me know if I can clarify. Code below:
#include <math.h>
#include <stdio.h>
double feval(double x) {
/* Insert your code here */
if (x > 1e299)
{
return x-1;
}
if (x < 1e-6)
{
long double g;
g = x;
printf("x = %Lf\n", g);
long double a;
a = pow(x,2);
printf("x squared = %Lf\n", a);
return sqrt(g*g+1.)- 1.;
}
else
{
printf("x = %f\n", x);
printf("Used third \n");
return sqrt(pow(x,2)+1.)-1;
}
}
int main(void)
{
double x;
printf("Input: ");
scanf("%lf", &x);
double b;
b = feval(x);
printf("%f\n", b);
return 0;
}
For small inputs, you're getting truncation error when you do 1+x^2. If x=1e-7f, x*x will happily fit into a 32 bit floating point number (with a little bit of error due to the fact that 1e-7 does not have an exact floating point representation), but x*x will be so much smaller than 1 that floating point precision will not be sufficient to represent 1+x*x.
It would be more appropriate to do a Taylor expansion of sqrt(1+x^2), which to lowest order would be
sqrt(1+x^2) = 1 + 0.5*x^2 + O(x^4)
Then, you could write your result as
sqrt(1+x^2)-1 = 0.5*x^2 + O(x^4),
avoiding the scenario where you add a very small number to 1.
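A minimal sketch of that idea, keeping one extra term of the series for good measure (the feval_taylor name and the 1e-4 cutoff are arbitrary choices for this example):
#include <math.h>

double feval_taylor(double x)
{
    double x2 = x * x;
    if (fabs(x) < 1e-4)
        return 0.5 * x2 * (1.0 - 0.25 * x2); /* x^2/2 - x^4/8 + O(x^6) */
    return sqrt(x2 + 1.0) - 1.0;             /* direct formula is fine for larger x */
}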
As a side note, you should not use pow for integer powers. For x^2, you should just do x*x. Arbitrary integer powers are a little trickier to do efficiently; the GNU scientific library for example has a function for efficiently computing arbitrary integer powers.
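For illustration, a minimal exponentiation-by-squaring sketch (the ipow name is just for this example; it is not the GSL routine mentioned above):
/* compute base^exp for a non-negative integer exponent in O(log exp) multiplications */
double ipow(double base, unsigned int exp)
{
    double result = 1.0;
    while (exp > 0) {
        if (exp & 1u)
            result *= base;
        base *= base;
        exp >>= 1;
    }
    return result;
}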
There are two issues here when implementing this in the naive way: overflow or underflow in the intermediate computation of x * x, and subtractive cancellation during the final subtraction of 1. The second issue is an accuracy issue.
ISO C has a standard math function hypot (x, y) that performs the computation sqrt (x * x + y * y) accurately while avoiding underflow and overflow in intermediate computation. A common approach to fix issues with subtractive cancellation is to transform the computation algebraically such that it is transformed into multiplications and / or divisions.
Combining these two fixes leads to the following implementation for float argument. It has an error of less than 3 ulps across all possible inputs according to my testing.
#include <math.h> /* hypotf */

/* Compute sqrt(x*x+1)-1 accurately and without spurious overflow or underflow */
float func (float x)
{
return (x / (1.0f + hypotf (x, 1.0f))) * x;
}
A trick that is often useful in these cases is based on the identity
(a+1)*(a-1) = a*a-1
In this case
sqrt(x*x+1)-1 = (sqrt(x*x+1)-1)*(sqrt(x*x+1)+1)
/(sqrt(x*x+1)+1)
= (x*x+1-1) / (sqrt(x*x+1)+1)
= x*x/(sqrt(x*x+1)+1)
The last formula can be used as an implementation. For very small x, sqrt(x*x+1)+1 will be close to 2 (for small enough x it will be exactly 2), but we don't lose precision in evaluating it.
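A direct implementation of that last formula might look like this sketch (note that x*x can still overflow for very large x, which the hypot-based version above avoids):
#include <math.h>

/* sqrt(x*x+1)-1 rewritten to avoid subtracting nearly equal numbers */
double f_rewritten(double x)
{
    return (x * x) / (sqrt(x * x + 1.0) + 1.0);
}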
The problem isn't with running into the minimum value, but with the precision.
As you said yourself, float on your machine has about 7 digits of precision. So let's take x = 1e-7, so that x^2 = 1e-14. That's still well within the range of float, no problems there. But now add 1. The exact answer would be 1.00000000000001. But if we only have 7 digits of precision, this gets rounded to 1.0000000, i.e. exactly 1. So you end up computing sqrt(1.0)-1 which is exactly 0.
One approach would be to use the linear approximation of sqrt around x=1 that sqrt(x) ~ 1+0.5*(x-1). That would lead to the approximation f(x) ~ 0.5*x^2.
#include <stdio.h>
int main(void)
{
float me = 1.1;
double you = 1.1;
if ( me == you ) {
printf("I love U");
} else {
printf("I hate U");
}
}
This prints "I hate U". Why?
Floats use binary fractions. If you convert 1.1 to a float, the result is a binary representation.
Each bit to the right of the binary point halves the weight of the digit, just as each decimal digit to the right of the decimal point divides the weight by ten. Bits to the left of the point double the weight (times ten for decimal).
binary:      0    1    .   0      0       0        1      ...
2's exp:     1    0        -1     -2      -3       -4
in decimal:  0*2 + 1*1  +  0*0.5 + 0*0.25 + 0*0.125 + 1*0.0625 + ...  = 1.0625
(the weight of each digit is 2 raised to the exponent shown)
The problem is that 1.1 cannot be converted exactly to a binary representation. A double, however, has more significant digits than a float.
When you compare the values, the float is first converted to double. But as the computer does not know the original decimal value, it simply fills the extra trailing bits of the new double with zeros, while the double literal is more precise. So the two do not compare equal.
This is a common pitfall when using floats. For this and other reasons (e.g. rounding errors), you should not compare for exact equality/inequality, but do a ranged compare using a small tolerance such as the machine epsilon (the difference between 1.0 and the next representable value):
#include <float.h> // FLT_EPSILON
#include <math.h>  // fabs
...
// check for "almost equal"
if ( fabs(fval - dval) <= FLT_EPSILON )
...
Note the usage of FLT_EPSILON, which is the aforementioned machine epsilon for single-precision float values. Also note the <=, not <.
If you compare two doubles, you might use DBL_EPSILON, but be careful with that.
Depending on intermediate calculations, the tolerance has to be increased (you cannot reduce it further than epsilon), as rounding errors, etc. will sum up. Floats in general are not forgiving with wrong assumptions about precision, conversion and rounding.
Edit:
As suggested by @chux, this might not work as expected for larger values, because you have to scale the epsilon according to the exponents (magnitudes) involved. This confirms what I stated: float comparison is not as simple as integer comparison. Think about it before comparing.
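A sketch of such a scaled comparison (my own illustration; the almost_equal name is arbitrary):
#include <math.h>
#include <float.h>

/* relative comparison: the tolerance grows with the magnitude of the operands;
   FLT_EPSILON is used because one of the original operands only had float precision */
static int almost_equal(double a, double b)
{
    return fabs(a - b) <= FLT_EPSILON * fmax(fabs(a), fabs(b));
}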
In short, you should NOT use == to compare floating-point values.
For example:
float i = 1.1; // or double
float j = 1.1; // or double
This assertion:
(i == j) == true // is not always valid
For a correct comparison you should use an epsilon (a very small number):
(fabs(i - j) < epsilon) == true // this assertion is valid
The question simplifies to why do me and you have different values?
Usually, C floating point is based on a binary representation. Many compilers & hardware follow IEEE 754 binary32 and binary64. Rare machines use a decimal, base-16 or other floating point representation.
OP's machine certainly does not represent 1.1 exactly as 1.1, but to the nearest representable floating point number.
Consider the below which prints out me and you to high precision. The previous representable floating point numbers are also shown. It is easy to see me != you.
#include <math.h>
#include <stdio.h>
int main(void) {
float me = 1.1;
double you = 1.1;
printf("%.50f\n", nextafterf(me,0)); // previous float value
printf("%.50f\n", me);
printf("%.50f\n", nextafter(you,0)); // previous double value
printf("%.50f\n", you);
1.09999990463256835937500000000000000000000000000000
1.10000002384185791015625000000000000000000000000000
1.09999999999999986677323704498121514916420000000000
1.10000000000000008881784197001252323389053300000000
But it is more complicated: C allows code to use higher precision for intermediate calculations depending on FLT_EVAL_METHOD. So on another machine, where FLT_EVAL_METHOD==1 (evaluate all FP to double), the compare test may pass.
Comparing for exact equality is rarely used in floating point code, aside from comparison to 0.0. More often code uses an ordered compare a < b. Comparing for approximate equality involves another parameter to control how near. #R.. has a good answer on that.
Because you are comparing two floating-point values!
Floating-point comparison is not exact because of rounding errors. Simple values like 0.1 or 1.1 cannot be precisely represented using binary floating point numbers, and the limited precision of floating point numbers means that slight changes in the order of operations can change the result. Different compilers and CPU architectures store temporary results at different precisions, so results will differ depending on the details of your environment. For example:
float a = 0.1 + 0.2;
double b = 0.3;
if(a == b) // can be false!
if(a <= b) // can also be false!
Even
if(fabs(a-b) < 0.0001) // wrong - don't do this
This is a bad way to do it, because a fixed epsilon chosen because it "looks small" (here 0.0001) could actually be way too large when the numbers being compared are themselves very small.
I personally use the following method; maybe it will help you:
#include <math.h>    // fabsf, fminf
#include <float.h>   // FLT_MIN, FLT_MAX
#include <stdbool.h>

bool nearlyEqual(float a, float b, float epsilon) {
    float absA = fabsf(a);
    float absB = fabsf(b);
    float diff = fabsf(a - b);

    if (a == b) {
        // shortcut; also handles infinities
        return true;
    } else if (a == 0 || b == 0 || diff < FLT_MIN) {
        // a or b is zero, or both are extremely close to it;
        // relative error is less meaningful here
        return diff < (epsilon * FLT_MIN);
    } else {
        // use relative error
        return diff / fminf(absA + absB, FLT_MAX) < epsilon;
    }
}
This method passes tests for many important special cases, for different a, b and epsilon.
And don't forget to read What Every Computer Scientist Should Know About Floating-Point Arithmetic!
I'm new to C and when I run the code below, the value that is put out is 12098 instead of 12099.
I'm aware that working with decimals always involves a degree of inaccuracy, but is there a way to accurately move the decimal point to the right two places every time?
#include <stdio.h>
int main(void)
{
int i;
float f = 120.99;
i = f * 100;
printf("%d", i);
}
Use the round function
float f = 120.99;
int i = round( f * 100.0 );
Be aware however, that a float typically only has 6 or 7 digits of precision, so there's a maximum value where this will work. The smallest float value that won't convert properly is the number 131072.01. If you multiply by 100 and round, the result will be 13107202.
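If you want to verify that boundary case yourself, a quick standalone sketch:
#include <math.h>
#include <stdio.h>

int main(void)
{
    float f = 131072.01f;                  /* stored as 131072.015625, the nearest float */
    printf("%d\n", (int)round(f * 100.0)); /* prints 13107202 rather than 13107201 */
    return 0;
}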
You can extend the range of your numbers by using double values, but even a double has limited range. (A double has 16 or 17 digits of precision.) For example, the following code will print 10000000000000098
double d = 100000000000000.99;
uint64_t j = round( d * 100.0 );
printf( "%llu\n", j );
That's just an example; finding the smallest number that exceeds the precision of a double is left as an exercise for the reader.
Use fixed-point arithmetic on integers:
#include <stdio.h>
#define abs(x) ((x)<0 ? -(x) : (x))
int main(void)
{
int d = 12099;
int i = d * 100;
printf("%d.%02d\n", d/100, abs(d)%100);
printf("%d.%02d\n", i/100, abs(i)%100);
}
Your problem is that floats are represented internally using IEEE-754, that is, in base 2 and not in base 10. 0.25 has an exact representation, but 0.1 does not, nor does 120.99.
What really happens is that, due to floating point inaccuracy, the IEEE-754 value closest to the decimal value 120.99, multiplied by 100, is slightly below 12099, so it is truncated to 12098. Your compiler should have warned you that you had a truncation from float to int (mine did).
The only foolproof way to get what you expect is to add 0.5 to the float before the truncation to int:
i = (f * 100) + 0.5;
But beware: floating point is inherently inaccurate when processing decimal values.
Edit :
Of course for negative numbers, it should be i = (f * 100) - 0.5 ...
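Put together, a small helper along those lines might look like this sketch (the cents_of name is just for illustration):
/* scale by 100 and round half away from zero, handling the sign as described above */
int cents_of(float f)
{
    double scaled = f * 100.0;
    return (int)(scaled >= 0 ? scaled + 0.5 : scaled - 0.5);
}
With this, cents_of(120.99f) yields 12099.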
If you'd like to continue operating on the number as a floating point number, then the answer is more or less no. There's various things you can do for small numbers, but as your numbers get larger, you'll have issues.
If you'd like to only print the number, then my recommendation would be to convert the number to a string, and then move the decimal point there. This can be slightly complicated depending on how you represent the number in the string (exponential and what not).
If you'd like this to work and you don't mind not using floating point, then I'd recommend researching any number of fixed decimal libraries.
You can use
float f = 120.99f;
or
double f = 120.99;
By default, C treats a floating-point constant such as 120.99 as a double, so if you store it in a float variable an implicit conversion happens and precision is lost. I think this works.
I'm trying to represent the following mathematical expression in C:
P(n) = (n!)(6^n)
The program should compute the answer to the expression when n = 156. I have attempted to create the program in C and it fails to produce an answer. The answer is approximately 10^397. The program utilises 2 logarithmic identities. It also utilises Stirling's approximation to calculate the large factorial.
How can I make it produce the correct answer and do you have any suggestions as to how I could improve the code? (I'm fairly new to programming):
#include <math.h>
typedef unsigned int uint;
int main()
{
uint n=156; // Declare variables
double F,pi=3.14159265359,L,e=exp(1),P;
F = sqrt(2*pi*n) * pow((n/e),n); // Stirling's Approximation Formula
L = log(F) + n*log(6); // Transform P(n) using logarithms - log(xy) = log(x) + log(y) and log(y^n) = n*log(y)
P = pow(e,L); // Transform the resultant logarithm back to a normal number
}
Thank you! :)
Neither integer nor floating point variables in most C implementations can support numbers of that magnitude. Typical 64-bit doubles go up to something like 10^308, with substantial loss of precision at that magnitude.
You'll need what's called a 'bignum library' to compute this, which is not part of standard C.
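For example, a sketch using the GNU MP (GMP) bignum library, assuming it is installed (link with -lgmp):
#include <gmp.h>
#include <stdio.h>

int main(void)
{
    mpz_t fact, pow6, result;
    mpz_inits(fact, pow6, result, NULL);

    mpz_fac_ui(fact, 156);        /* 156! */
    mpz_ui_pow_ui(pow6, 6, 156);  /* 6^156 */
    mpz_mul(result, fact, pow6);  /* 156! * 6^156 */

    gmp_printf("%Zd\n", result);  /* prints the exact integer, about 398 digits */

    mpz_clears(fact, pow6, result, NULL);
    return 0;
}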
One idea is to use the long double type. Its precision isn't guaranteed, so it may or may not be big enough for your needs, depending on what compiler you're using.
Replace double with long double. Add an 'l' (lower case L) suffix to all math functions (expl, logl, powl, sqrtl). Compile with C99 enabled, since the long double math functions are provided in C99. It worked for me using GCC 4.8.1.
#include <math.h>
#include <stdio.h>
typedef unsigned int uint;
int main()
{
uint n=156; // Declare variables
long double F,pi=3.14159265359,L,e=expl(1),P;
F = sqrtl(2*pi*n) * powl((n/e),n); // Stirling's Approximation Formula
L = logl(F) + n*logl(6); // Transform P(n) using logarithms - log(xy) = log(x) + log(y) and log(y^n) = n*log(y)
P = powl(e,L); // Transform the resultant logarithm back to a normal number
printf("%Lg\n", P);
}
I get 1.83969e+397.
Loosely speaking, in C a double is represented as a base number raised to a power. As already mentioned, the maximum is roughly 1E308, but as you get to larger and larger numbers (or smaller and smaller), you lose precision because the base number has a finite number of digits and cannot always be accurately represented in this way.
See http://en.wikipedia.org/wiki/Double-precision_floating-point_format for more information
#include <math.h>
#include <stdio.h>
#include <float.h>
typedef unsigned int uint;
int main()
{
uint n=156; // Declare variables
long double F,pi=3.14159265359,L,e=expl(1),P;
F = sqrtl(2*pi*n) * powl((n/e),n); // Stirling's Approximation Formula
L = logl(F) + n*logl(6); // Transform P(n) using logarithms - log(xy) = log(x) + log(y) and log(y^n) = n*log(y)
P = powl(e,L); // Transform the resultant logarithm back to a normal number
printf("%d\n", LDBL_MAX_10_EXP); // largest power of 10 a long double can represent (typically 4932 with 80-bit long double)
}
}