The function calculates the value of sinh(x) using the following
Taylor series expansion:
sinh(x) = x + x^3/3! + x^5/5! + ... = sum over n >= 0 of x^(2n+1) / (2n+1)!
I want to calculate the value of sinh(3) = 10.01787, but the function outputs 9. I also get this warning:
1>main.c(24): warning C4244: 'function': conversion from 'double' to 'int', possible loss of data
This is my code:
int fattoriale(int n)
{
int risultato = 1;
if (n == 0)
{
return 1;
}
for (int i = 1; i < n + 1; i++)
{
risultato = risultato * i;
}
return risultato;
}
int esponenziale(int base, int esponente)
{
int risultato = 1;
for (int i = 0; i < esponente; i++)
{
risultato = risultato * base;
}
return risultato;
}
double seno_iperbolico(double x)
{
double risultato = 0, check = -1;
for (int n = 0; check != risultato; n++)
{
check = risultato;
risultato = risultato + (((esponenziale(x, ((2 * n) + 1))) / (fattoriale((2 * n) + 1))));
}
return risultato;
}
int main(void)
{
double numero = 1;
double risultato = seno_iperbolico(numero);
}
Please help me fix this program.
It is actually pretty great that the compiler is warning you about this kind of data loss.
You see, when you call this:
esponenziale(x, ((2 * n) + 1))
You essentially lose your accuracy, since you are converting your double, which is x, to an int. This happens because the signature of esponenziale is int esponenziale(int base, int esponente).
Change it to double esponenziale(double base, int esponente). The local risultato should be a double as well, since you return it from the function and perform mathematical operations on it.
Remember that dividing a double with an int gives you a double back.
Edit: According to ringø's comment, and seeing how it actually solved your issue, you should also set double fattoriale(int n) and inside that double risultato = 1;.
You are losing precision since many of the terms will be fractional quantities. Using an int will clobber the decimal portion. Replace your int types with double types as appropriate.
Your factorial function will overflow for surprisingly small values of n. For a 16 bit int the largest usable n is 7, for 32 bit it's 12 and for 64 bit it's 20. The behaviour on overflowing a signed integral type is undefined. You could use unsigned long long, or a 128 bit unsigned type (for example unsigned __int128 in GCC and Clang) if your compiler supports one. That will buy you a bit more time. But given you're converting to a double anyway, you may as well use a double from the get-go. Note that an IEEE 754 floating point double will hit infinity at 171!
Be assured that the radius of convergence of the Maclaurin expansion of sinh is infinite. So any value of x will work, although convergence might be slow for large x. See http://math.cmu.edu/~bkell/21122-2011f/sinh-maclaurin.pdf.
I made a program that is supposed to get the largest integer not exceeding a float value:
#include <stdio.h>
float get_value(float a);
int main() {
float num = 4.58;
float new_val = get_value(num);
printf("%f \n", new_val);
}
float get_value(float a) {
int c = a;
for (int i = 0; i < 99; i++) {
a -= 0.01;
if (a == c) {
break;
}
}
return a;
}
It didn't work the way I wanted, so I would like a shorthand for it instead of writing my own function.
So is there a function that I can use for this?
Use floor() if you want the largest integer not exceeding the floating point value (rounding toward minus infinity). Or use trunc() to round toward zero, which gives the integer of largest magnitude not exceeding the magnitude of the fp value.
Also, note that 0.01 (like 0.1) has a repeating representation in binary fp, so your function as written is always going to have problems, just as 1/3 becomes 0.3333... in decimal.
You can use modf:
double integral, fractional;
double num = 4.58;
int result;
fractional = modf(num, &integral);
result = (int)integral;
I don't understand why, when I cast my double to int, the digits after the decimal point come out rounded.
void print_float(double nb)
{
int negative;
int intpart;
double decpart = -10.754;
int v;
negative = (nb < 0.0f);
intpart = (int)nb;
decpart = nb - intpart;
v = (int)(decpart * 1000);
if (negative) {
v *= -1;
}
printf("%i.%i", intpart, v); // output: -10.753
}
After thinking about it, I guess the problem comes from the cast, but I do not understand why.
A double cannot exactly encode all numbers. It can exactly encode about 2^64 different values, and -10.754 is not one of them. Instead, a nearby value is used, here one slightly smaller in magnitude than expected.
printf("%.24f", -10.754);
// -10.753999999999999559463504
The decpart * 1000 multiplication introduces some further imprecision, yet the magnitude of the product is still below 754.0, and the (int) cast then truncates it to 753.
I have two functions here that together compute the nCr:
int factorial(int n) {
int c;
int result = 1;
for (c = 1; c <= n; c++)
{
result = result*c;
}
return result;
}
int nCr(int n, int r) {
int result;
result = factorial(n)/(factorial(r)*factorial(n-r));
return result;
}
I am having trouble with an error check I need to implement. As n gets larger, I won't be able to compute n!, and this error check has to exist in both nCr and factorial. Both must detect this overflow.
Currently, when I enter a number that is too large for computation, I get a floating type error returned from the command line.
I am having trouble accounting for this overflow check. Any help would be much appreciated, thanks.
A better way of calculating binomial coefficients
typedef unsigned long long ull;
ull nCr(int n, int r) {
ull res = 1;
if (r > n - r) r = n - r;
for (int i = 0; i < r; ++i) {
res *= (n - i);
res /= (i + 1);
}
return res;
}
In your code, the maximum value is always factorial(n),
so you only need to check that n! isn't bigger than 2.147.483.647 (max int value).
Please note that the stored max value can be different based on the size of the int type in memory (different machines can specify different sizes).
However, one bit of a signed int is reserved for the sign (+ or -), so the signed maximum is about half the unsigned one: 32.767 instead of 65.535 for 16 bits, and 2.147.483.647 instead of 4.294.967.295 for 32 bits.
SIZE_OF_INT (bits)   MAX VALUE (UNSIGNED)   MAX VALUE (SIGNED)
---------------------------------------------------------------
16                   65.535                 32.767
32                   4.294.967.295          2.147.483.647
The value of 13! can go beyond the max value of the int type (in 32 bit).
12! = 479.001.600 and
13! = 6.227.020.800
So, you need to check in nCr(int n, int r) that the max value of n is always less than 13 (i.e. n<=12) and r<=n.
And in factorial(int n): n<=12.
This is from Google's Code Jam, practice problem "All your base".
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
long long pow_longlong(int digit, int raiseto)
{
if (raiseto == 0) return 1;
else return digit * pow_longlong(digit, raiseto - 1);
}
long long base10_with_map(int base, char* instr, char* digits)
{
if (base < 2) base = 2;
long long result = 0;
int len = strlen(instr);
int i = 0;
while (len--)
result += digits[instr[len]] * pow_longlong(base, i++);
return result;
}
long long test(char* in)
{
char appear[256];
int i;
int len = strlen(in);
int hold = 0;
for (i = 0; i < 256; i++) appear[i] = '\xFF';
for (i = 0; i < len; i++)
if (appear[in[i]] == '\xFF')
{
if (hold == 0) { appear[in[i]] = 1; hold++; }
else if (hold == 1) { appear[in[i]] = 0; hold++; }
else appear[in[i]] = hold++;
}
return base10_with_map(hold, in, appear);
}
int main(int argc, char* argv[])
{
if (argc < 2)
{
printf("Usage: %s <input-file> \n", argv[0]); return 1;
}
char buf[100];
int a, i;
FILE* f = fopen(argv[1], "r");
fscanf(f, "%d", &a);
long long result;
for (i = 1; i <= a; i++)
{
fscanf(f, "%s", buf);
result = test(buf);
printf("Case #%d: %lld\n", i, result);
}
return 0;
}
This works as intended and produces the correct result for the problem. But if I replace my own pow_longlong() with pow() from math.h, some calculations differ.
What is the reason for this? Just curious.
Edits:
- No overflow, plain long is enough to store the values, long long is just overkill
- Of course I include math.h
- In example: test("wontyouplaywithme") with pow_longlong returns 674293938766347782 (right) and with math.h 674293938766347904 (wrong)
Sorry that I won't go through your example and your intermediary function; the issue you're having occurs due to double being insufficient, not the long long. It is just that the number grows too large, causing it to require more and more precision towards the end, more than double can safely represent.
Here, try this really simple programme out, or just trust in the output I append to it to see what I mean:
#include <stdio.h>
int main( ){
double a;
long long b;
a = 674293938766347782.0;
b = a;
printf( "%f\n", a );
printf( "%lld", b );
getchar( );
return 0;
}
/*
Output:
674293938766347780.000000
674293938766347776
*/
You see, the double may have 8 bytes, just as much as the long long has, but it is designed so that it would also be able to hold non-integral values, which makes it less precise than long long can get in some cases like this one.
I don't know the exact specifics, but here, in MSDN it is said that its representation range is from -1.7e308 to +1.7e308 with (probably just on average) 15 digit precision.
So, if you are going to work with positive integers only, stick with your function. If you want to have an optimized version, check this one out: https://stackoverflow.com/a/101613/2736228
It makes use of the fact that, for example, while calculating x to the power 8, you can get away with 3 operations:
...
result = x * x; // x^2
result = result * result; // (x^2)^2 = x^4
result = result * result; // (x^4)^2 = x^8
...
Instead of dealing with 7 operations, multiplying them one by one.
pow (see reference) is not defined for integers, but only for floating point numbers. If you call pow with int as an argument the result will be a double.
You can in general not assume that the result of pow will be exactly the same as if you would use pure integer math as in the function pow_longlong.
Citation from wikipedia about double precision floating point numbers:
Between 2^52=4,503,599,627,370,496 and 2^53=9,007,199,254,740,992 the
representable numbers are exactly the integers. For the next range,
from 2^53 to 2^54, everything is multiplied by 2, so the representable
numbers are the even ones, etc.
So you get inaccurate results with pow if the result would be bigger than 2^53.
My aim is to calculate the numerical integral of a probability distribution function (PDF) of the distance of an electron from the nucleus of the hydrogen atom in the C programming language. I have written some sample code, but it fails to find the numerical value correctly because I cannot increase the limit as much as I think is necessary. I have also included the library but I cannot use the values stated in the following post as integral boundaries: min and max value of data type in C . What is the remedy in this case? Should I maybe switch to another programming language? Any help and suggestions are appreciated, thanks in advance.
Edit: After some value I get the error segmentation fault. I have checked the actual result of the integral to be 0.0372193 with Wolframalpha. In addition to this if I increment k in smaller amounts I get zero as a result that is why I defined r[k]=k, I know it should be smaller for increased precision.
#include <stdio.h>
#include <math.h>
#include <limits.h>
#define a0 0.53
int N = 200000;
// This value of N is the highest possible number in long double
// data format. Change its value to adjust the precision of integration
// and computation time.
// The discrete integral may be defined as follows:
long double trapezoid(long double x[], long double f[]) {
int i;
long double dx = x[1]-x[0];
long double sum = 0.5*(f[0]+f[N]);
for (i = 1; i < N; i++)
sum+=f[i];
return sum*dx;
}
main() {
long double P[N], r[N], a;
// Declare and initialize the loop variable
int k = 0;
for (k = 0; k < N; k++)
{
r[k] = k ;
P[k] = r[k] * r[k] * exp( -2*r[k] / a0);
//printf("%.20Lf \n", r[k]);
//printf("%.20Lf \n", P[k]);
}
a = trapezoid(r, P);
printf("%.20Lf \n", a);
}
Last Code:
#include <stdio.h>
#include <math.h>
#include <limits.h>
#include <stdlib.h>
#define a0 0.53
#define N LLONG_MAX
// This value of N is the highest possible number in long double
// data format. Change its value to adjust the precision of integration
// and computation time.
// The discrete integral may be defined as follows:
long double trapezoid(long double x[],long double f[]) {
int i;
long double dx = x[1]-x[0];
long double sum = 0.5*(f[0]+f[N]);
for (i = 1; i < N; i++)
sum+=f[i];
return sum*dx;
}
main() {
printf("%Ld", LLONG_MAX);
long double * P = malloc(N * sizeof(long double));
long double * r = malloc(N * sizeof(long double));
// Declare and initialize the loop variable
int k = 0;
long double integral;
for (k = 1; k < N; k++)
{
P[k] = r[k] * r[k] * expl( -2*r[k] / a0);
}
integral = trapezoid(r, P);
printf("%Lf", integral);
}
Edit last code working:
#include <stdio.h>
#include <math.h>
#include <limits.h>
#include <stdlib.h>
#define a0 0.53
#define N (LONG_MAX/100)
// N is the number of sample points. Change its value to adjust the
// precision of integration and computation time.
// The discrete integral may be defined as follows:
long double trapezoid(long double x[],long double f[]) {
long i;
long double dx = x[1]-x[0];
// The last valid index is N-1, so the end points are f[0] and f[N-1].
long double sum = 0.5*(f[0]+f[N-1]);
for (i = 1; i < N-1; i++)
sum+=f[i];
return sum*dx;
}
int main(void) {
printf("%lld \n", LLONG_MAX);
long double * P = malloc(N * sizeof(long double));
long double * r = malloc(N * sizeof(long double));
// Declare and initialize the loop variable
int k = 0;
long double integral;
for (k = 0; k < N; k++) // start at 0: trapezoid() reads r[0] and P[0]
{
r[k] = k / 100000.0;
P[k] = r[k] * r[k] * expl( -2*r[k] / a0);
}
integral = trapezoid(r, P);
printf("%.15Lf \n", integral);
free((void *)P);
free((void *)r);
}
In particular, I changed the definition of r[k] to divide by a floating point number so that the result is a long double. As stated in my last comment, I cannot use values of N larger than LONG_MAX/100, and I think I should investigate the code and malloc further to find the cause. I found the exact value analytically by taking the limits, and confirmed the result with a TI-89 Titanium and Wolframalpha (both numerically and analytically) as well as by hand. The trapezoid rule worked out pretty well once the interval size was decreased. Many thanks to all the posters here for their ideas. By the way, a LONG_MAX of 2147483647 is not as large as I expected; should the limit not be around ten to the power of 308?
Numerical point of view
The usual trapezoid method doesn't work well with improper integrals. Gaussian quadrature rules are much better here, since an n-point rule is not only exact for polynomials up to degree 2n-1, but also handles improper integrals through the right choice of weight function.
If your integral is improper on both sides, you should try Gauss-Hermite quadrature; otherwise use Gauss-Laguerre quadrature.
The "overflow" error
long double P[N], r[N], a;
P has a size of roughly 3MB, and so does r. That's too much memory. Allocate the memory instead:
long double * P = malloc(N * sizeof(long double));
long double * r = malloc(N * sizeof(long double));
Don't forget to include <stdlib.h> and use free on both P and r if you don't need them any longer. Also, you may not access the N-th entry, so f[N] is wrong.
Using Gauss-Laguerre quadrature
Now Gauss-Laguerre uses exp(-x) as weight function. If you're not familiar with Gaussian quadrature: the result of E(f) is the integral of w * f, where w is the weight function.
Your f looks like this:
f x = x^2 * exp (-2 * x / a)
Wait a minute. f already contains exp(-term), so we can substitute t = 2 * x / a (that is, x = t * a/2 and dx = a/2 dt) and get
f' x = (t * a/2)^2 * exp(-t) * a/2
Since exp(-t) is already part of our weight function, your function fits now perfectly into the Gauss-Laguerre quadrature. The resulting code is
#include <stdio.h>
#include <math.h>
/* x[] and a[] taken from
* https://de.wikipedia.org/wiki/Gau%C3%9F-Quadratur#Gau.C3.9F-Laguerre-Integration
* Calculating them by hand is a little bit cumbersome
*/
const int gauss_rule_length = 3;
const double gauss_x[] = {0.415774556783, 2.29428036028, 6.28994508294};
const double gauss_a[] = {0.711093009929, 0.278517733569, 0.0103892565016};
double f(double x){
return x *.53/2 * x *.53/2 * .53/2;
}
int main(){
int i;
double sum = 0;
for(i = 0; i < gauss_rule_length; ++i){
sum += gauss_a[i] * f(gauss_x[i]);
}
printf("%.10lf\n",sum); /* 0.0372192500 */
return 0;
}
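As a cross-check of the program's output, the same substitution also gives the integral in closed form. This is a worked derivation, with a standing for the constant a0 = 0.53 from the question:

```latex
\begin{aligned}
\int_0^\infty x^2 e^{-2x/a}\,dx
  &= \int_0^\infty \left(\frac{a t}{2}\right)^2 e^{-t}\,\frac{a}{2}\,dt
     \qquad \left(t = \frac{2x}{a},\; dx = \frac{a}{2}\,dt\right) \\
  &= \left(\frac{a}{2}\right)^3 \int_0^\infty t^2 e^{-t}\,dt
   = \left(\frac{a}{2}\right)^3 \cdot 2
   = \frac{a^3}{4}
\end{aligned}
```

For a = 0.53 this is 0.148877 / 4 = 0.03721925, which agrees with the printed value. The three-point rule reproduces it exactly (up to rounding) because the remaining integrand t^2 is a polynomial of degree 2, well within the degree 5 that a three-point Gauss-Laguerre rule integrates exactly.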