How can I code exp(x) in C?

There is a series for the exp function which looks like this:
exp(x) = (x^0)/0! + (x^1)/1! + (x^2)/2! + (x^3)/3! + ... I'm trying to compute it for different values of x, checking my results with a calculator, and I found that for big values, 20 for example, my results stop increasing and get stuck at a value that is almost the real one. I get 485165184.00 and the real value is 485165195.4.
I must write this code with a for loop or a recursive function, since it is a homework assignment.
My code looks as follows:
#include <stdio.h>
#define N 13
#define xi 3
double fun7(int n, int m){
int i;
double res=1, aux=0;
for(i=1, aux=1; i<(n+1); i++){
res += aux;
aux *= m;
aux /= i;
}
return res-1;
}
int main() {
int a, b, pot, x[xi];
float R[N][xi];
x[0] = 5;
x[1] = 10;
x[2] = 20;
for(b=0; b<xi; b++){
for (a=0, pot=1; a<N; a++){
R[a][b] = fun7(pot, x[b]);
pot *= 2;
}
}
for(b=0; b<xi; b++){
for (a=0, pot=1; a<N; a++){
printf("%d\t%f\n", pot, R[a][b]);
pot *= 2;
}
printf("\n");
}
return 0;
}

The float data type can normally represent numbers with a tad more than 7 decimal digits of precision.
485165184 has 9 decimal digits. The last two digits are just meaningless noise as far as float goes. You really should be showing 4.851652e8, which is the correct value for exp(20) with the given level of precision.
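If you want to increase precision, use the double or long double data types. Note that fun7 already computes in double; the digits are lost when the result is stored in the float array R, so declaring R as double is enough here.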

Related

Graph of cos(x) through MacLaurin series only getting the first result right

I'm trying to create a program that compares the efficiency of calculating a function through MacLaurin series.
The idea is: Make a graph (using gnuplot) of cos(x) between -Pi and Pi (100 intervals) calculating cos(x) using the first 4 terms of its MacLaurin series, then, the first 6 terms, and comparing the graph between them.
Cos(x) through MacLaurin.
So, to use gnuplot, I wrote the code below, which writes 2 files with the data I need. However, when I run the code only the first result is correct. For the first 4 terms my file is:
-3.141593 -9.760222e-001
-3.078126 2.367934e+264
And the rest of what would be my Y axis is just 2.367934e+264 repeated over and over. The 6 terms file is also just that number. X axis is fine.
I'm fairly new to coding and just don't know what I'm doing wrong. Any help would be appreciated.
Here's the code:
#include <stdio.h>
#include <math.h>
#define X_INI -M_PI
#define X_FIM M_PI
#define NI 100
int fatorial(int);
double serie(int ,double );
int main()
{
double x, y[NI], dx;
int i;
FILE *fp[3];
fp[0]=fopen("4Termos.dat","w");
fp[1]=fopen("6Termos.dat","w");
x=X_INI;
dx = (X_FIM - X_INI)/ (NI - 1);
for(i=0; i<NI; i++){
y[i]=serie(4,x);
fprintf(fp[0],"%lf %e\n", x, y[i]);
y[i]=serie(6,x);
fprintf(fp[1],"%lf %e\n", x, y[i]);
x = x + dx;
}
return 0;
}
int fatorial(int n) {
int i,p;
p = 1;
if (n==0)
return 1;
else {
for (i=1;i<=n;i++)
p = p*i;
return p;
}
}
double serie(int m, double z){
double s;
int j;
for(j = 0; j < m+1; j++)
{
s = s + ( ( pow((-1) , j))*pow(z, (2*j)) ) / (fatorial(2*j));
}
return s;
}
fatorial computes the factorial and serie computes the MacLaurin partial sum.
You are using s uninitialized in the serie() function, so the sum starts from an indeterminate value (I've taken the liberty of formatting the code to my liking).
double serie(int m, double z) {
double s; // better: double s = 0;
int j;
for (j = 0; j < m + 1; j++) {
s += pow(-1, j) * pow(z, 2 * j) / fatorial(2 * j);
}
return s;
}

Calculate sin(x) and cos(x) using Taylor Series in C [duplicate]

I have been struggling with this code and just do not seem to grasp what I am doing wrong.
The code is supposed to calculate the sum of the cosine series with the pattern [(-1)^i * x^(2i)] / (2i)!
Here is my code thus far:
#include <stdio.h>
#include <math.h>
float factorial(int n){
if (n==0)
return 1;
else
return 2*n*factorial(n-1);
}
int main (){
float i, n;
float sum=0;
printf("Enter desired interger: ");
scanf("%f", &n);
for (i=0; i<=1; i++)
sum = sum + (pow(-1,i)*pow(n,2*i))/(factorial(n));
printf("The value is %f\n", sum);
return 0;
}
I'm still working on it; any info or help will be much appreciated!
Edit:
Just fixed it, guys; this is the new format I had to use for my professor:
#include <stdio.h>
#include <math.h>
double factorial(int n) /* double: a 32-bit int overflows beyond 12!, and mycos needs up to 20! */
{
if (n==0) return 1;
else
return n*factorial(n-1);
}
float mycos(float x)
{
float sum=0;
int i;
for (i=0;i<=10;i++) sum = sum + (pow(-1,i)*pow(x,2*i))/factorial(2*i);
return sum;
}
int main()
{
int i=1;
printf(" x mycos(x) cos(x)\n");
for (i=1;i<=10;i++)
printf(" %f %f %f\n", i*.1, mycos(i*.1), cos(i*.1));
return 0;
}
Thank you all for your explanations, they helped out immensely!
One thing I see is that the for loop in main only runs through 2 iterations: once for i == 0, and again for i == 1.
For the Taylor expansion to work reasonably well, it needs to sum more terms of the series (more loop iterations).
Another thing I see is that your denominator is factorial(n), where n is the input value, rather than (2 * i)!, where i is the term index.
For efficiency, I might also implement the factorial routine as follows:
unsigned int factorial(int n){
unsigned int product = 1;
for(int I = 1; I <= n; I++) product *= I;
return product;
}
The above factorial routine gives an exact integer result as long as the value fits in an unsigned int (up to 12! when unsigned int is 32 bits wide), which perhaps you don't need for this purpose. For your purposes, the floating point variant might be good enough.
float factorial(int n){
float product = 1;
for(int I = 1; I <= n; I++) product *= (float)I;
return product;
}
I should also note why I suggest computing the factorial this way. In general, a loop will be more efficient than its recursive counterpart. Your current implementation is recursive, so the implementation I provide should be somewhat more efficient in both running time and memory use.
Considering the computational expense, you need to stop calculating the series at some point. The further you go, the more precise the result will be, but the more time your program spends. How about this simple program (it sums the sine series; the cosine series works the same way with even powers):
#include <stdio.h>
#include <math.h>
#define ITERATIONS 10 //control how far you go
float factorial(int n){
if (n==0)
return 1;
else
return n*factorial(n-1);
}
int main (){
float n;
float sum=0;
printf("Enter desired float: ");
scanf("%f", &n);
int c, i;
for (i=0; i<=ITERATIONS; i++) {
c = (i%2)==0? 1 : -1;
sum = sum + (c*pow(n,2*i+1))/(factorial(2*i+1));
}
printf("The value is %f\n", sum);
return 0;
}
1.) Your factorial function multiplies only even numbers: return 2*n*factorial(n-1); gives 2n * (2n-2) * ... * 2. Instead, keep the plain factorial and pass 2*i to it: sum = sum + (pow(-1,i)*pow(n,2*i))/(factorial(2*i)); This gives the correct (2i)!.
2.) Check the number of iterations: for (i=0; i<=1; i++) will only run your loop twice. Try more iterations for a more accurate answer.
Why are you calculating the power etc. for each term in the series? You also need to keep the numbers in a suitable range for the data types,
i.e. for cos:
bool neg_sign = false;
float total = 1.0f;
float current = 1.0f;
for (int i = 0; i < length_of_series; ++i) {
neg_sign = !neg_sign;
current = current * (x / ((2 * i) + 1)) * (x / (( 2 * i) + 2));
total += neg_sign ? -current : current;
}
EDIT
Please see http://codepad.org/swDIh8P5
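#include <stdio.h>
#define PRECISION 10 /* the number of terms to be processed */
int main(void)
{
float x, term = 1, s = 1.0;
int i, a = 2;
if (scanf("%f", &x) != 1) return 1; /* no valid input */
x = x * x;
for (i = 1; i < PRECISION; i++)
{
/* each term is the previous one times -x^2 / (a*(a-1)) */
term = -term * x / (a * (a - 1));
s += term;
a += 2;
}
printf("result=%f\n", s);
return 0;
}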
Your factorial() function actually calculates 2^n * n!, which probably isn't what you had in mind. To calculate (2n)!, you need to remove the 2* from the function body and invoke factorial(2*n).

SSE Intrinsics arithmetic error

I've been experimenting with SSE intrinsics and I seem to have run into a weird bug that I can't figure out. I am computing the inner product of two float arrays, 4 elements at a time.
For testing I've set each element of both arrays to 1, so the product should be == size.
It runs correctly, but whenever I run the code with size > ~68000000 the code using the sse intrinsics starts computing the wrong inner product. It seems to get stuck at a certain sum and never exceeds this number. Here is an example run:
joe:~$./test_sse 70000000
sequential inner product: 70000000.000000
sse inner product: 67108864.000000
sequential time: 0.417932
sse time: 0.274255
Compilation:
gcc -fopenmp test_sse.c -o test_sse -std=c99
This error seems to be consistent amongst the handful of computers I've tested it on. Here is the code, perhaps someone might be able to help me figure out what is going on:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <omp.h>
#include <math.h>
#include <assert.h>
#include <xmmintrin.h>
double inner_product_sequential(float * a, float * b, unsigned int size) {
double sum = 0;
for(unsigned int i = 0; i < size; i++) {
sum += a[i] * b[i];
}
return sum;
}
double inner_product_sse(float * a, float * b, unsigned int size) {
assert(size % 4 == 0);
__m128 X, Y, Z;
Z = _mm_set1_ps(0.0f);
float arr[4] __attribute__((aligned(sizeof(float) * 4)));
for(unsigned int i = 0; i < size; i += 4) {
X = _mm_load_ps(a+i);
Y = _mm_load_ps(b+i);
X = _mm_mul_ps(X, Y);
Z = _mm_add_ps(X, Z);
}
_mm_store_ps(arr, Z);
return arr[0] + arr[1] + arr[2] + arr[3];
}
int main(int argc, char ** argv) {
if(argc < 2) {
fprintf(stderr, "usage: ./test_sse <size>\n");
exit(EXIT_FAILURE);
}
unsigned int size = atoi(argv[1]);
srand(time(0));
float *a = (float *) _mm_malloc(size * sizeof(float), sizeof(float) * 4);
float *b = (float *) _mm_malloc(size * sizeof(float), sizeof(float) * 4);
for(int i = 0; i < size; i++) {
a[i] = b[i] = 1;
}
double start, time_seq, time_sse;
start = omp_get_wtime();
double inner_seq = inner_product_sequential(a, b, size);
time_seq = omp_get_wtime() - start;
start = omp_get_wtime();
double inner_sse = inner_product_sse(a, b, size);
time_sse = omp_get_wtime() - start;
printf("sequential inner product: %f\n", inner_seq);
printf("sse inner product: %f\n", inner_sse);
printf("sequential time: %f\n", time_seq);
printf("sse time: %f\n", time_sse);
_mm_free(a);
_mm_free(b);
}
You are running into the precision limit of single precision floating point numbers. The number 16777216 (2^24), which is the value of each component of the vector Z when the inner product reaches its "limit", is represented in 32-bit floating point as hexadecimal 0x4b800000 or binary 0 10010111 00000000000000000000000, i.e. the 23-bit mantissa is all zeros (with an implicit leading 1 bit), and the 8-bit exponent field is 151, representing the exponent 151 - 127 = 24. Representing 2^24 + 1 would need one more mantissa bit than a float has, so the sum is rounded back: in single precision floating point arithmetic 2^24 + 1 = 2^24. With all four components of Z stuck at 2^24, the final sum is 4 * 16777216 = 67108864, which is exactly the value you get.
You do not see that in your sequential function because there you are using a 64-bit double precision value to store the result, and as we are working on an x86 platform, internally most probably an 80-bit excess precision register is used.
You can force single precision throughout your sequential code by rewriting it as
float sum;
float inner_product_sequential(float * a, float * b, unsigned int size) {
sum = 0;
for(unsigned int i = 0; i < size; i++) {
sum += a[i] * b[i];
}
return sum;
}
and you will see 16777216.000000 as maximum computed value.

Numerical Integral from 0 to infinity

My aim is to numerically integrate, in C, the probability distribution function (PDF) of the distance of an electron from the nucleus of the hydrogen atom. I have written some sample code, but it fails to find the correct value because, in my opinion, I cannot increase the upper limit as much as necessary. I have also included the library but I cannot use the values stated in the following post as integral boundaries: min and max value of data type in C. What is the remedy in this case? Should I switch to another programming language, maybe? Any help and suggestions are appreciated; thanks in advance.
Edit: Beyond a certain value of N I get a segmentation fault. I have checked with Wolframalpha that the actual result of the integral is 0.0372193. In addition, if I increment r in smaller steps I get zero as a result, which is why I defined r[k]=k; I know the step should be smaller for increased precision.
#include <stdio.h>
#include <math.h>
#include <limits.h>
#define a0 0.53
int N = 200000;
// This value of N is the highest possible number in long double
// data format. Change its value to adjust the precision of integration
// and computation time.
// The discrete integral may be defined as follows:
long double trapezoid(long double x[], long double f[]) {
int i;
long double dx = x[1]-x[0];
long double sum = 0.5*(f[0]+f[N]);
for (i = 1; i < N; i++)
sum+=f[i];
return sum*dx;
}
main() {
long double P[N], r[N], a;
// Declare and initialize the loop variable
int k = 0;
for (k = 0; k < N; k++)
{
r[k] = k ;
P[k] = r[k] * r[k] * exp( -2*r[k] / a0);
//printf("%.20Lf \n", r[k]);
//printf("%.20Lf \n", P[k]);
}
a = trapezoid(r, P);
printf("%.20Lf \n", a);
}
Last Code:
#include <stdio.h>
#include <math.h>
#include <limits.h>
#include <stdlib.h>
#define a0 0.53
#define N LLONG_MAX
// This value of N is the highest possible number in long double
// data format. Change its value to adjust the precision of integration
// and computation time.
// The discrete integral may be defined as follows:
long double trapezoid(long double x[],long double f[]) {
int i;
long double dx = x[1]-x[0];
long double sum = 0.5*(f[0]+f[N]);
for (i = 1; i < N; i++)
sum+=f[i];
return sum*dx;
}
main() {
printf("%Ld", LLONG_MAX);
long double * P = malloc(N * sizeof(long double));
long double * r = malloc(N * sizeof(long double));
// Declare and initialize the loop variable
int k = 0;
long double integral;
for (k = 1; k < N; k++)
{
P[k] = r[k] * r[k] * expl( -2*r[k] / a0);
}
integral = trapezoid(r, P);
printf("%Lf", integral);
}
Edit last code working:
#include <stdio.h>
#include <math.h>
#include <limits.h>
#include <stdlib.h>
#define a0 0.53
#define N LONG_MAX/100
// This value of N is the highest possible number in long double
// data format. Change its value to adjust the precision of integration
// and computation time.
// The discrete integral may be defined as follows:
long double trapezoid(long double x[],long double f[]) {
int i;
long double dx = x[1]-x[0];
long double sum = 0.5*(f[0]+f[N-1]); /* the last valid index is N-1 */
for (i = 1; i < N-1; i++)
sum+=f[i];
return sum*dx;
}
int main(void) {
printf("%lld \n", LLONG_MAX); /* %lld is the conversion for long long */
long double * P = malloc(N * sizeof(long double));
long double * r = malloc(N * sizeof(long double));
// Declare and initialize the loop variable
int k = 0;
long double integral;
for (k = 0; k < N; k++) /* start at 0: trapezoid() reads r[0] and P[0] */
{
r[k] = k / 100000.0;
P[k] = r[k] * r[k] * expl( -2*r[k] / a0);
}
integral = trapezoid(r, P);
printf("%.15Lf \n", integral);
free((void *)P);
free((void *)r);
}
In particular, I have changed the definition of r[k] by using a floating point number in the division, so that the result is a floating point value. Also, as I stated in my last comment, I cannot use values of N larger than LONG_MAX/100; I think I should investigate the code and malloc further to understand the issue. I have found the exact value analytically by taking the limits, and I have confirmed the result with a TI-89 Titanium and Wolframalpha (both numerically and analytically), apart from doing it myself. The trapezoid rule worked out pretty well once the interval size was decreased. Many thanks to all the posters here for their ideas. By the way, LONG_MAX, with a value of 2147483647, is not as large as I expected; should the limit not be around ten to the power 308?
Numerical point of view
The usual trapezoid method doesn't work with improper integrals. Gaussian quadrature rules are much better suited here, since they not only provide 2n-1 exactness (that is, with n nodes they return the correct result for polynomials of degree up to 2n-1), but also handle improper integrals by using the right weight function.
If your integral is improper on both sides, you should try the Gauss-Hermite quadrature, otherwise use the Gauss-Laguerre quadrature.
The "overflow" error
long double P[N], r[N], a;
P has a size of roughly 3MB, and so does r. That's too much memory for local arrays on the stack. Allocate the memory on the heap instead:
long double * P = malloc(N * sizeof(long double));
long double * r = malloc(N * sizeof(long double));
Don't forget to include <stdlib.h> and to call free on both P and r once you no longer need them. Also, the valid indices are 0 to N-1, so accessing f[N] is out of bounds.
Using Gauss-Laguerre quadrature
Now Gauss-Laguerre uses exp(-x) as weight function. If you're not familiar with Gaussian quadrature: the result E(f) approximates the integral of w * f, where w is the weight function.
Your f looks like this:
f x = x^2 * exp (-2 * x / a)
Wait a minute. f already contains exp(-term), so we can substitute x = t * a / 2 (that is, t = 2 * x / a, with dx = a/2 dt) and get
f' t = (t * a/2)^2 * exp(-t) * a/2
Since exp(-t) is already part of our weight function, your function now fits perfectly into the Gauss-Laguerre quadrature. The resulting code is
#include <stdio.h>
#include <math.h>
/* x[] and a[] taken from
* https://de.wikipedia.org/wiki/Gau%C3%9F-Quadratur#Gau.C3.9F-Laguerre-Integration
* Calculating them by hand is a little bit cumbersome
*/
const int gauss_rule_length = 3;
const double gauss_x[] = {0.415774556783, 2.29428036028, 6.28994508294};
const double gauss_a[] = {0.711093009929, 0.278517733569, 0.0103892565016};
double f(double x){
return x *.53/2 * x *.53/2 * .53/2;
}
int main(){
int i;
double sum = 0;
for(i = 0; i < gauss_rule_length; ++i){
sum += gauss_a[i] * f(gauss_x[i]);
}
printf("%.10lf\n",sum); /* 0.0372192500 */
return 0;
}
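As a cross-check on the printed value, the same substitution gives the integral in closed form. This is a short worked derivation using only the a = 0.53 from the question:

```latex
\int_0^\infty x^2 e^{-2x/a}\,dx
  = \left(\frac{a}{2}\right)^{3} \int_0^\infty t^2 e^{-t}\,dt
  = \left(\frac{a}{2}\right)^{3} \cdot 2!
  = \frac{a^3}{4},
\qquad
\frac{0.53^3}{4} = \frac{0.148877}{4} = 0.03721925 .
```

A three-point Gauss-Laguerre rule is exact for polynomials up to degree 5 against the weight exp(-t), and the remaining factor here is a polynomial of degree 2, so the code above reproduces this value up to rounding of the tabulated nodes and weights.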

Float size, matrix multiplication, OpenCL, sockets. Weird

I'm generating two matrices using the following function (note some code is omitted):
srand(2007);
randomInit(h_A_data, size_A);
void randomInit(float* data, int size)
{
int i;
for (i = 0; i < size; ++i){
data[i] = rand() / (float)RAND_MAX;
}
}
This is called for matrix A and B. This populates the matrices with 0.something values, e.g. 0.748667. I then perform a matrix multiplication using a CPU. I compare the result to a GPU implementation via OpenCL. The resulting matrix has values in the range 20.something, e.g. 23.472757. Both the CPU and the GPU give the same result. The CPU implementation is taken from the Cuda toolkit distrib by nvidia:
void computeGold(float* C, const float* A, const float* B, unsigned int hA, unsigned int wA, unsigned int wB)
{
unsigned int i;
unsigned int j;
unsigned int k;
for (i = 0; i < hA; ++i)
for (j = 0; j < wB; ++j) {
double sum = 0;
for (k = 0; k < wA; ++k) {
double a = A[i * wA + k];
double b = B[k * wB + j];
sum += a * b;
}
C[i * wB + j] = (float)sum;
}
}
The weird thing is, all three matrices in memory are of the same size, i.e. sizeof(float)*size_A, or *size_B for matrix B etc. When I dump them to the disk, the file for the result stored in matrix C (the multiplied matrix) is bigger than matrix A and B.
Even more critical, for my application I'm transferring these over a network via a socket. In terms of the raw number of bytes, all matrices are the same, and yet it takes longer to transfer matrix C over the network. The problem gets worse for large matrix sizes. Why is this?
UPDATE/EDIT:
fprintf(matrix_c_file,"\n\nMatrix C\n");
for(i = 0; i < size_C; i++)
{
fprintf(matrix_c_file,"%f ", h_C_data[i]);
}
fprintf(matrix_c_file,"\n");
When matrices A and B contain only zeros, all three (matrix A, B and C) are the same size on disk.
I think that lijie has the correct (albeit terse) answer in the comments. The %f format specifier can result in a string with variable width. Consider the following C code:
printf("%f\n", 0.0);
printf("%f\n", 3.1415926535897932384626433);
printf("%f\n", 20.53);
printf("%f\n", 20.5e38);
which produces:
0.000000
3.141593
20.530000
2050000000000000019963732141023730597888.000000
All of the output has the same number of digits after the decimal point (6 by default), but a variable number to the left of the decimal point. If you need the textual representation of your matrix to be a consistent size and you don't mind sacrificing some precision, you can use the %e format specifier instead to force an exponential representation like 2.345e12.
