Parallelizing function using OpenMP and C

I'm trying to convert the function "integrate_openMP", which implements the trapezoid rule, so that it can run in parallel. I'm unsure which parts of the function should be guarded by the "critical" pragma and how to handle the calculation itself with OpenMP.
The function is called from main inside #pragma omp parallel and #pragma omp single.
Thank you
I have updated the code with my initial attempt to parallelize the function:
double integrate_openMP(double a, double b, double (*f)(double), double e)
{
calls++;
double int_result;
double m = (a + b) / 2;
double one_trap_area = (b - a) * (f(a) + f(b)) / 2;
double two_trap_area = (b - a) * (f(a) + f(b) + 2 * f(m)) / 4;
if (fabs(one_trap_area - two_trap_area) <= e)
{
return two_trap_area;
}
else
{
double left_area, right_area;
#pragma omp task shared(left_area)
{
left_area = integrate_openMP(a, m, f, e / 2);
}
#pragma omp task shared(right_area)
{
right_area = integrate_openMP(m, b, f, e / 2);
}
#pragma omp taskwait
int_result = left_area + right_area;
return int_result;
}
}
double integrate_single(double a, double b, double (*f) (double), double e) {
calls ++;
double m = (a + b) / 2;
double one_trap_area = (b - a) * (f(a) + f(b)) / 2;
double two_trap_area = (b - a) * (f(a) + f(b) + 2 * f(m))/ 4;
if (fabs(one_trap_area - two_trap_area) <= e) {
return two_trap_area;
} else {
double left_area, right_area;
left_area = integrate_single(a, m, f, e/2);
right_area = integrate_single(m, b, f, e/2);
return left_area + right_area;
}
}

Ask yourself a few questions... "Is this loop parallelism?" in which case omp for is useful. "Is this recursive parallelism?" in which case go read up on OpenMP tasks...
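As a rough sketch of the task-based route, assuming the recursion is started by a single thread inside a parallel region as the question describes: the names integrate_tasks, integrate_parallel, square and the CUTOFF threshold are illustrative and not from the original post. The shared counter is updated with an atomic instead of a critical section, and the final clause stops deferring tasks once the intervals get small.

```c
#include <assert.h>
#include <math.h>

#define CUTOFF 1e-3   /* hypothetical interval width below which tasks are no longer deferred */

long calls = 0;       /* shared call counter, as in the question */

/* Example integrand, used only for illustration. */
double square(double x) { return x * x; }

double integrate_tasks(double a, double b, double (*f)(double), double e)
{
    /* Every thread updates the same counter, so protect the increment. */
    #pragma omp atomic
    calls++;

    double m = (a + b) / 2;
    double fa = f(a), fb = f(b), fm = f(m);
    double one_trap_area = (b - a) * (fa + fb) / 2;
    double two_trap_area = (b - a) * (fa + fb + 2 * fm) / 4;
    if (fabs(one_trap_area - two_trap_area) <= e)
        return two_trap_area;

    double left_area, right_area;
    /* final(...) makes small subproblems run immediately instead of being deferred. */
    #pragma omp task shared(left_area) final(b - a < CUTOFF)
    left_area = integrate_tasks(a, m, f, e / 2);
    #pragma omp task shared(right_area) final(b - a < CUTOFF)
    right_area = integrate_tasks(m, b, f, e / 2);
    /* Both halves must be finished before their results are combined. */
    #pragma omp taskwait
    return left_area + right_area;
}

double integrate_parallel(double a, double b, double (*f)(double), double e)
{
    assert(e > 0.0);  /* a non-positive tolerance may never be met */
    double result = 0.0;
    /* One thread creates the root call; the whole team executes the tasks. */
    #pragma omp parallel
    #pragma omp single
    result = integrate_tasks(a, b, f, e);
    return result;
}
```

Calling integrate_parallel(0.0, 1.0, square, 1e-6) should return a value close to 1/3. Without an OpenMP flag such as -fopenmp the pragmas are ignored and the same code runs serially.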

Related

Fixing parameters of a fitting function in Nonlinear Least-Square GSL

I'm writing some code that uses the GNU Scientific Library (GSL)'s nonlinear least-squares algorithm for curve fitting.
I have obtained working code that estimates the right parameters from the fitting analysis using a C++ wrapper from https://github.com/Eleobert/gsl-curve-fit/blob/master/example.cpp.
Now I would like to fix some of the parameters of the function being fitted, and to modify the function so that I can supply the value of each fixed parameter as input.
Any idea how to do this?
I'm showing here the full code.
This is the code for performing nonlinear least-squares fitting:
#include <gsl/gsl_vector.h>
#include <gsl/gsl_multifit_nlinear.h>
#include <iostream>
#include <random>
#include <vector>
#include <cassert>
#include <functional>
template <typename F, size_t... Is>
auto gen_tuple_impl(F func, std::index_sequence<Is...> )
{
return std::make_tuple(func(Is)...);
}
template <size_t N, typename F>
auto gen_tuple(F func)
{
return gen_tuple_impl(func, std::make_index_sequence<N>{} );
}
auto internal_solve_system(gsl_vector* initial_params, gsl_multifit_nlinear_fdf *fdf,
gsl_multifit_nlinear_parameters *params) -> std::vector<double>
{
// This specifies a trust region method
const gsl_multifit_nlinear_type *T = gsl_multifit_nlinear_trust;
const size_t max_iter = 200;
const double xtol = 1.0e-8;
const double gtol = 1.0e-8;
const double ftol = 1.0e-8;
auto *work = gsl_multifit_nlinear_alloc(T, params, fdf->n, fdf->p);
int info;
// initialize solver
gsl_multifit_nlinear_init(initial_params, fdf, work);
//iterate until convergence
gsl_multifit_nlinear_driver(max_iter, xtol, gtol, ftol, nullptr, nullptr, &info, work);
// result will be stored here
gsl_vector * y = gsl_multifit_nlinear_position(work);
auto result = std::vector<double>(initial_params->size);
for(int i = 0; i < result.size(); i++)
{
result[i] = gsl_vector_get(y, i);
}
auto niter = gsl_multifit_nlinear_niter(work);
auto nfev = fdf->nevalf;
auto njev = fdf->nevaldf;
auto naev = fdf->nevalfvv;
// nfev - number of function evaluations
// njev - number of Jacobian evaluations
// naev - number of f_vv evaluations
//logger::debug("curve fitted after ", niter, " iterations {nfev = ", nfev, "} {njev = ", njev, "} {naev = ", naev, "}");
gsl_multifit_nlinear_free(work);
gsl_vector_free(initial_params);
return result;
}
auto internal_make_gsl_vector_ptr(const std::vector<double>& vec) -> gsl_vector*
{
auto* result = gsl_vector_alloc(vec.size());
int i = 0;
for(const auto e: vec)
{
gsl_vector_set(result, i, e);
i++;
}
return result;
}
template<typename C1>
struct fit_data
{
const std::vector<double>& t;
const std::vector<double>& y;
// the actual function to be fitted
C1 f;
};
template<typename FitData, int n_params>
int internal_f(const gsl_vector* x, void* params, gsl_vector *f)
{
auto* d = static_cast<FitData*>(params);
// Convert the parameter values from gsl_vector (in x) into std::tuple
auto init_args = [x](int index)
{
return gsl_vector_get(x, index);
};
auto parameters = gen_tuple<n_params>(init_args);
// Calculate the error for each...
for (size_t i = 0; i < d->t.size(); ++i)
{
double ti = d->t[i];
double yi = d->y[i];
auto func = [ti, &d](auto ...xs)
{
// call the actual function to be fitted
return d->f(ti, xs...);
};
auto y = std::apply(func, parameters);
gsl_vector_set(f, i, yi - y);
}
return GSL_SUCCESS;
}
using func_f_type = int (*) (const gsl_vector*, void*, gsl_vector*);
using func_df_type = int (*) (const gsl_vector*, void*, gsl_matrix*);
using func_fvv_type = int (*) (const gsl_vector*, const gsl_vector *, void *, gsl_vector *);
auto internal_make_gsl_vector_ptr(const std::vector<double>& vec) -> gsl_vector*;
auto internal_solve_system(gsl_vector* initial_params, gsl_multifit_nlinear_fdf *fdf,
gsl_multifit_nlinear_parameters *params) -> std::vector<double>;
template<typename C1>
auto curve_fit_impl(func_f_type f, func_df_type df, func_fvv_type fvv, gsl_vector* initial_params, fit_data<C1>& fd) -> std::vector<double>
{
assert(fd.t.size() == fd.y.size());
auto fdf = gsl_multifit_nlinear_fdf();
auto fdf_params = gsl_multifit_nlinear_default_parameters();
fdf.f = f;
fdf.df = df;
fdf.fvv = fvv;
fdf.n = fd.t.size();
fdf.p = initial_params->size;
fdf.params = &fd;
// "This selects the Levenberg-Marquardt algorithm with geodesic acceleration."
fdf_params.trs = gsl_multifit_nlinear_trs_lmaccel;
return internal_solve_system(initial_params, &fdf, &fdf_params);
}
template<typename Callable>
auto curve_fit(Callable f, const std::vector<double>& initial_params, const std::vector<double>& x, const std::vector<double>& y) -> std::vector<double>
{
// We can't pass lambdas without convert to std::function.
constexpr auto n = 3;
assert(initial_params.size() == n);
auto params = internal_make_gsl_vector_ptr(initial_params);
auto fd = fit_data<Callable>{x, y, f};
return curve_fit_impl(internal_f<decltype(fd), n>, nullptr, nullptr, params, fd);
}
// linspace from https://github.com/Eleobert/meth/blob/master/interpolators.hpp
template <typename Container>
auto linspace(typename Container::value_type a, typename Container::value_type b, size_t n)
{
assert(b > a);
assert(n > 1);
Container res(n);
const auto step = (b - a) / (n - 1);
auto val = a;
for(auto& e: res)
{
e = val;
val += step;
}
return res;
}
This is the function I use for fitting:
double gaussian(double x, double a, double b, double c)
{
const double z = (x - b) / c;
return a * std::exp(-0.5 * z * z);
}
And these last lines create a fake dataset of observed data (with some noise which is normally distributed) and test the fitting curve function.
int main()
{
auto device = std::random_device();
auto gen = std::mt19937(device());
auto xs = linspace<std::vector<double>>(0.0, 1.0, 300);
auto ys = std::vector<double>(xs.size());
double a = 5.0, b = 0.4, c = 0.15;
for(size_t i = 0; i < xs.size(); i++)
{
auto y = gaussian(xs[i], a, b, c);
auto dist = std::normal_distribution(0.0, 0.1 * y);
ys[i] = y + dist(gen);
}
auto r = curve_fit(gaussian, {1.0, 0.0, 1.0}, xs, ys);
std::cout << "result: " << r[0] << ' ' << r[1] << ' ' << r[2] << '\n';
std::cout << "error : " << r[0] - a << ' ' << r[1] - b << ' ' << r[2] - c << '\n';
}
In this case, I would like to fix one of the a, b, c parameters and estimate the remaining two. For example, fix a and estimate b and c. But I would like to find a solution such that I can input any value to the fixed parameter a, without needing to modify the gaussian function every time.

OK. Here's the answer based on the code linked in http://github.com/Eleobert/gsl-curve-fit/blob/master/example.cpp. However, this is not the code posted in the question: you should update your question accordingly so that others may benefit from both the question and the answer.
So, basically, the main problem is that GSL is a library written in pure C, whereas you use a high-level wrapper written in C++, published at the aforementioned link. While the wrapper is written pretty well in modern C++, it has one basic problem: it is "stiff". It can be used only for the subclass of problems it was designed for, and this subclass is a rather narrow subset of the capabilities offered by the original C code.
Let's try to improve it a bit and start from how the wrapper is supposed to be used:
double gaussian(double x, double a, double b, double c)
{
const double z = (x - b) / c;
return a * std::exp(-0.5 * z * z);
}
int main()
{
auto device = std::random_device();
auto gen = std::mt19937(device());
auto xs = linspace<std::vector<double>>(0.0, 1.0, 300);
auto ys = std::vector<double>(xs.size());
double a = 5.0, b = 0.4, c = 0.15;
for (size_t i = 0; i < xs.size(); i++)
{
auto y = gaussian(xs[i], a, b, c);
auto dist = std::normal_distribution(0.0, 0.1 * y);
ys[i] = y + dist(gen);
}
auto result = curve_fit(gaussian, {1.0, 0.0, 1.0}, xs, ys);
// use result
}
This code is amazingly simple compared to its original, C-language counterpart. One initializes the x-y pairs of values, here stored as vectors xs and ys, and executes a single function that takes 4 easy-to-understand parameters: the function to be fitted to the data, the initial values of the fitting parameters the function depends on, the x values and the corresponding y values of the data to which the function must be fitted.
Your problem is how to keep this high-level interface, but use it for fitting functions where only some parameters are "free", that is, can be changed during the fitting procedure, while the values of others must be fixed. This could be easily achieved using, e.g., global variables that the function has access to, but we hate global variables and never use them without a real cause.
I suggest using a well-known C++ alternative: functors. Look:
struct gaussian_fixed_a
{
double a;
gaussian_fixed_a(double a) : a{a} {}
double operator()(double x, double b, double c) const { return gaussian(x, a, b, c); }
};
This struct/class introduces a new type of function object. In the constructor, parameter a is passed and stored in the object. Then there's a function call operator that takes only 3 parameters rather than 4, substituting a from its stored value. This object can pretend to be a Gaussian with a fixed constant a and only 3 remaining arguments, x, b, and c, that can vary.
We would like to use it like this:
gaussian_fixed_a g(a);
auto r2 = curve_fit(g, std::array{0.444, 0.11}, xs, ys);
This is almost the same code you'd use for the original wrapper save for 2 differences:
You now use an object name (here: g) rather than a function name
You have to pass the number of arguments to curve_fit as a compile-time constant, as it is then used internally to call a template parametrized by this number. I achieve it by using std::array, for which the compiler can deduce its size at compile time, just as needed. An alternative would be a nasty template syntax, curve_fit<2>(...
For this to work, you need to change the interface of curve_fit, from
template <typename Callable>
auto curve_fit(Callable f, const std::vector<double>& initial_params, const std::vector<double>& x,
const std::vector<double>& y) -> std::vector<double>
to
template <typename Callable, auto n>
auto curve_fit(Callable f, const std::array<double, n>& initial_params, const std::vector<double>& x,
const std::vector<double>& y) -> std::vector<double>
(btw: this -> syntax with a well-known type on its right-hand side is not the best one, IMHO, but let it be). The idea is to force the compiler to read the number of fitting parameters at compile time from the size of the array.
Then you need to make a similar adjustment in the argument list of curve_fit_impl, and this is almost it.
Here I spent quite a lot of time trying to figure out why this code does not work. It turned out it had worked all the time; the secret is that if you fit a function to some data, you'd better provide initial values reasonably close to the solution. That's why I used the initializer std::array{0.444, 0.11} rather than the original {0, 1}, as the latter does not converge to anything close to the correct answer.
Do we really need to use explicit function objects? Perhaps lambdas will do? Yes, they will - this compiles and runs as expected:
auto r3 = curve_fit([a](double x, double b, double c) { return gaussian(x, a, b, c); }, std::array{0.444, 0.11}, xs, ys);
Here's the full diff between the original and modified code (without lambda):
7a8
> #include <array>
72c73,74
< auto internal_make_gsl_vector_ptr(const std::vector<double>& vec) -> gsl_vector*
---
> template<auto n>
> auto internal_make_gsl_vector_ptr(const std::array<double, n>& vec) -> gsl_vector*
158,159c160,161
< template <typename Callable>
< auto curve_fit(Callable f, const std::vector<double>& initial_params, const std::vector<double>& x,
---
> template <typename Callable, auto n>
> auto curve_fit(Callable f, const std::array<double, n>& initial_params, const std::vector<double>& x,
163,164c165,166
< constexpr auto n = 3;
< assert(initial_params.size() == n);
---
> // constexpr auto n = 2;
> // assert(initial_params.size() == n);
194a197,204
>
> struct gaussian_fixed_a
> {
> double a;
> gaussian_fixed_a(double a) : a{a} {}
> double operator()(double x, double b, double c) const { return gaussian(x, a, b, c); }
> };
>
212c222,224
< auto r = curve_fit(gaussian, {1.0, 0.0, 1.0}, xs, ys);
---
> auto r = curve_fit(gaussian, std::array{1.0, 0.0, 1.0}, xs, ys);
> gaussian_fixed_a g(a);
> auto r2 = curve_fit(g, std::array{0.444, 0.11}, xs, ys);
215a228,230
> std::cout << "\n";
> std::cout << "result: " << r2[0] << ' ' << r2[1] << '\n';
> std::cout << "error : " << r2[0] - b << ' ' << r2[1] - c << '\n';

Feed same input but got different output in Householder reduction (tred2) for both C and Fortran program?

I am translating some C code to Fortran for my research. While translating, I got an error in the tred2 subroutine. To make debugging easier, I pass the same input to the tred2 routine in both the C and the Fortran program (nb = number of bands = 18, with matrix A(nb,nb) as inout), but I get different output. I have tried several standard tred2.f90 subroutines and also LAPACK for testing.
Here is my Fortran test code.
Could you please point out the mistake I made?
program test_househld
implicit none
INTEGER::nb,i,j
REAL*8:: d1
REAL*8, ALLOCATABLE ::C(:,:),p(:),q(:)
nb =18
ALLOCATE(C(nb,nb),p(nb),q(nb))
do i =1,nb
p(i) =0.0d0
q(i) =0.0d0
end do
open(221,file = 'C_matrix.inp',action ='read')
do i =1,nb
read(221,'(18F24.16)')(C(i,j),j =1,nb)
end do
close(221)
call tred2(C,nb,p,q)
end program test_househld
SUBROUTINE tred2(a,n,d,e)
IMPLICIT NONE
INTEGER :: n
REAL*8 :: a(n,n),d(n),e(n)
INTEGER :: i,j,k,l
REAL*8 :: f,g,h,hh,scale
DO i=n,2,-1
l=i-1
h=0.0D0
scale=0.0D0
IF (l > 1) THEN
scale=SUM(abs(a(i,1:l)))
IF (scale == 0.0D0) THEN
e(i)=a(i,l)
ELSE
a(i,1:l)=a(i,1:l)/scale
h=sum(a(i,1:l)**2)
f=a(i,l)
g=-sign(sqrt(h),f)
e(i)=scale*g
h=h-f*g
a(i,l)=f-g
f=0.0D0
DO j=1,l
! Omit following line if finding only eigenvalues
a(j,i)=a(i,j)/h
g=0.0D0
DO k=1,j
g=g+a(j,k)*a(i,k)
ENDDO
DO k=j+1,l
g=g+a(k,j)*a(i,k)
ENDDO
e(j)=g/h
f=f+e(j)*a(i,j)
ENDDO
hh=f/(h+h)
DO j=1,l
f=a(i,j)
g=e(j)-hh*f
e(j)=g
DO k=1,j
a(j,k)=a(j,k)-f*e(k)-g*a(i,k)
ENDDO
ENDDO
ENDIF
ELSE
e(i)=a(i,l)
ENDIF
d(i)=h
ENDDO
! Omit following line if finding only eigenvalues.
d(1)=0.0D0
e(1)=0.0D0
DO i=1,n
! Delete lines from here ...
l=i-1
IF (d(i) /= 0.0D0) THEN
DO j=1,l
g=0.0D0
DO k=1,l
g=g+a(i,k)*a(k,j)
ENDDO
DO k=1,l
a(k,j)=a(k,j)-g*a(k,i)
ENDDO
ENDDO
endif
! ... to here when finding only eigenvalues.
d(i)=a(i,i)
! Also delete lines from here ...
a(i,i)=1.
DO j=1,l
a(i,j)=0.0D0
a(j,i)=0.0D0
ENDDO
! ... to here when finding only eigenvalues.
ENDDO
END SUBROUTINE tred2
The C routine used is (tred2.c):
#include <math.h>
void tred2(a, n, d, e)
double** a, d[], e[];
int n;
{
int l, k, j, i;
double scale, hh, h, g, f;
for (i = n;i >= 2;i--) {
l = i - 1;
h = scale = 0.0;
if (l > 1) {
for (k = 1;k <= l;k++)
scale += fabs(a[i][k]);
if (scale == 0.0)
e[i] = a[i][l];
else {
for (k = 1;k <= l;k++) {
a[i][k] /= scale;
h += a[i][k] * a[i][k];
}
f = a[i][l];
g = (f >= 0.0 ? -sqrt(h) : sqrt(h));
e[i] = scale * g;
h -= f * g;
a[i][l] = f - g;
f = 0.0;
for (j = 1;j <= l;j++) {
a[j][i] = a[i][j] / h;
g = 0.0;
for (k = 1;k <= j;k++)
g += a[j][k] * a[i][k];
for (k = j + 1;k <= l;k++)
g += a[k][j] * a[i][k];
e[j] = g / h;
f += e[j] * a[i][j];
}
hh = f / (h + h);
for (j = 1;j <= l;j++) {
f = a[i][j];
e[j] = g = e[j] - hh * f;
for (k = 1;k <= j;k++)
a[j][k] -= (f * e[k] + g * a[i][k]);
}
}
}
else
e[i] = a[i][l];
d[i] = h;
}
d[1] = 0.0;
e[1] = 0.0;
/* Contents of this loop can be omitted if eigenvectors not
wanted except for statement d[i]=a[i][i]; */
for (i = 1;i <= n;i++) {
l = i - 1;
if (d[i]) {
for (j = 1;j <= l;j++) {
g = 0.0;
for (k = 1;k <= l;k++)
g += a[i][k] * a[k][j];
for (k = 1;k <= l;k++)
a[k][j] -= g * a[k][i];
}
}
d[i] = a[i][i];
a[i][i] = 1.0;
for (j = 1;j <= l;j++) a[j][i] = a[i][j] = 0.0;
}
}
input C_matrix.inp
-4.9643540965483464 0.0000000002923787 -0.0000000000015816 0.0000000000015866 -0.0000010270946462 -0.0000000000119348 -0.0000014910811312 -0.0000000000083231 0.0000000000051726 0.0000000006747626 0.0000000000042343 0.0000000000044931 0.0000008205316809 0.0000000000013976 -0.0000000000025613 0.0000000000031999 0.0000000000011099 -0.0000002065896463
0.0000000002923787 -3.0970798965053272 0.0000000000012606 -0.0000000000010828 -0.0000000020105181 0.0000019712291908 -0.0000000019589617 -0.0000000000008715 -0.0000000000035277 -0.0000011866352439 -0.0000000000025535 -0.0000000000030861 0.0000000105614882 0.0000000000007812 0.0000000000003148 -0.0000000000004940 -0.0000000000011315 -0.0000000268806379
-0.0000000000015816 0.0000000000012606 -3.0726587184630398 -0.0000000000020674 -0.0000000001967980 -0.0000000000031430 -0.0000000053059269 0.0000009717973030 -0.0000000000057488 -0.0000000000018134 -0.0000000000055421 -0.0000000000059327 0.0000000128630816 0.0000000000004427 0.0000000000006127 0.0000005919482648 0.0000000000004205 0.0000000017373891
0.0000000000015866 -0.0000000000010828 -0.0000000000020674 -3.0726581559742034 0.0000000001970779 0.0000000000026573 0.0000000053540850 -0.0000000000001896 0.0000010540828250 0.0000000000007506 0.0000000000046204 0.0000000000049601 -0.0000000129903302 0.0000000000001229 0.0000000000009025 0.0000000000025886 -0.0000006192561004 -0.0000000019022855
-0.0000010270946462 -0.0000000020105181 -0.0000000001967980 0.0000000001970779 -1.4799627603758512 -0.0000000000207997 0.0000040022051207 0.0000000000453210 0.0000000000093959 -0.0000000013251917 0.0000000000076454 0.0000000000082252 0.0000012017814372 0.0000000000047051 -0.0000000000118067 -0.0000000000293670 -0.0000000000003220 -0.0000008984357514
-0.0000000000119348 0.0000019712291908 -0.0000000000031430 0.0000000000026573 -0.0000000000207997 -1.3876784532515820 0.0000000023541494 0.0000000000019625 0.0000000000090133 -0.0000045240623974 0.0000000000124138 0.0000000000078450 -0.0000000018546957 0.0000000000008909 -0.0000000000001635 0.0000000000005688 0.0000000000027158 -0.0000000672281206
-0.0000014910811312 -0.0000000019589617 -0.0000000053059269 0.0000000053540850 0.0000040022051207 0.0000000023541494 -0.8583134955154931 0.0000000000128576 -0.0000000000192999 -0.0000000014957286 -0.0000000000142530 -0.0000000000152029 0.0000018516481554 0.0000000000039194 -0.0000000000091955 -0.0000000000227099 -0.0000000000003105 0.0000016740054685
-0.0000000000083231 -0.0000000000008715 0.0000009717973030 -0.0000000000001896 0.0000000000453210 0.0000000000019625 0.0000000000128576 -0.6260557686606371 0.0000000000034069 -0.0000000000008690 0.0000000000023244 0.0000000000025902 -0.0000000015017165 -0.0000000000038208 0.0000000000039194 0.0000046551165327 0.0000000000227413 -0.0000000010756788
0.0000000000051726 -0.0000000000035277 -0.0000000000057488 0.0000010540828250 0.0000000000093959 0.0000000000090133 -0.0000000000192999 0.0000000000034069 -0.6260536476274911 0.0000000000115542 0.0000000000137463 0.0000000000147947 -0.0000000024681981 0.0000000000006085 -0.0000000000004013 0.0000000000248150 -0.0000047190361606 -0.0000000002257375
0.0000000006747626 -0.0000011866352439 -0.0000000000018134 0.0000000000007506 -0.0000000013251917 -0.0000045240623974 -0.0000000014957286 -0.0000000000008690 0.0000000000115542 -0.5803541891855615 0.0000000000049024 0.0000000000096723 -0.0000000023349175 0.0000000000037884 0.0000000000025921 -0.0000000000039961 0.0000000000017223 0.0000004168417969
0.0000000000042343 -0.0000000000025535 -0.0000000000055421 0.0000000000046204 0.0000000000076454 0.0000000000124138 -0.0000000000142530 0.0000000000023244 0.0000000000137463 0.0000000000049024 -0.5574486988431754 0.0000000000119368 -0.0000000012816454 0.0000063881906259 0.0000000008948734 0.0000000000000565 0.0000000000038825 -0.0000000002143914
0.0000000000044931 -0.0000000000030861 -0.0000000000059327 0.0000000000049601 0.0000000000082252 0.0000000000078450 -0.0000000000152029 0.0000000000025902 0.0000000000147947 0.0000000000096723 0.0000000000119368 -0.5574485377065610 -0.0000000013589895 -0.0000000008438915 0.0000063940897453 0.0000000000005915 0.0000000000039688 -0.0000000002034879
0.0000008205316809 0.0000000105614882 0.0000000128630816 -0.0000000129903302 0.0000012017814372 -0.0000000018546957 0.0000018516481554 -0.0000000015017165 -0.0000000024681981 -0.0000000023349175 -0.0000000012816454 -0.0000000013589895 -0.4561210997669952 0.0000000000031373 -0.0000000000086804 -0.0000000000561001 -0.0000000000078430 -0.0000006303634354
0.0000000000013976 0.0000000000007812 0.0000000000004427 0.0000000000001229 0.0000000000047051 0.0000000000008909 0.0000000000039194 -0.0000000000038208 0.0000000000006085 0.0000000000037884 0.0000063881906259 -0.0000000008438915 0.0000000000031373 -0.2436725624222180 -0.0000000000051047 -0.0000000000123756 0.0000000000002341 -0.0000000011331551
-0.0000000000025613 0.0000000000003148 0.0000000000006127 0.0000000000009025 -0.0000000000118067 -0.0000000000001635 -0.0000000000091955 0.0000000000039194 -0.0000000000004013 0.0000000000025921 0.0000000008948734 0.0000063940897453 -0.0000000000086804 -0.0000000000051047 -0.2436723950073047 0.0000000000031071 0.0000000000049463 0.0000000018149829
0.0000000000031999 -0.0000000000004940 0.0000005919482648 0.0000000000025886 -0.0000000000293670 0.0000000000005688 -0.0000000000227099 0.0000046551165327 0.0000000000248150 -0.0000000000039961 0.0000000000000565 0.0000000000005915 -0.0000000000561001 -0.0000000000123756 0.0000000000031071 -0.2191906816051025 -0.0000000000013001 -0.0000000003356968
0.0000000000011099 -0.0000000000011315 0.0000000000004205 -0.0000006192561004 -0.0000000000003220 0.0000000000027158 -0.0000000000003105 0.0000000000227413 -0.0000047190361606 0.0000000000017223 0.0000000000038825 0.0000000000039688 -0.0000000000078430 0.0000000000002341 0.0000000000049463 -0.0000000000013001 -0.2191900460492452 0.0000000002003037
-0.0000002065896463 -0.0000000268806379 0.0000000017373891 -0.0000000019022855 -0.0000008984357514 -0.0000000672281206 0.0000016740054685 -0.0000000010756788 -0.0000000002257375 0.0000004168417969 -0.0000000002143914 -0.0000000002034879 -0.0000006303634354 -0.0000000011331551 0.0000000018149829 -0.0000000003356968 0.0000000002003037 -0.0602903370568763

There is no reason to expect the results to be the same. The error is in your expectation. As just one reason the output could be different (and there are many), compare:
for (k = 1;k <= j;k++)
a[j][k] -= (f * e[k] + g * a[i][k]);
And
DO k=1,j
a(j,k)=a(j,k)-f*e(k)-g*a(i,k)
ENDDO
In floating point math, a-(b+c) is not the same as (a-b)-c. This is particularly the case if b+c or a-b is different in magnitude from a or c. Where precision is lost affects the results.
Here's some code to demonstrate one way it can fail:
#include <stdio.h>
void compare(double a, double b, double c)
{
double d = a-(b+c);
double e = (a-b)-c;
printf("d=%lf e=%lf d-e=%lf d-e=0->%s\n",
d, e, d-e, ((d-e) == 0) ? "yes" : "no");
}
int main()
{
compare (1.0, 2.0, 3.0); // happens to be the same
compare (1e18L, 1e18L-0.0001L, 0.00001); // happens to not be the same
}
See the output here.

How to stop recursion after some time in C

I need to stop recursion after, let's say, 30 seconds in C. One of my attempts was using a goto, despite recommendations against it, but I can't use it between different functions. The code is below:
void func_t(int n, int a, int b, int c, int d, int e, int f, int g, int height, Data *data, time_t start_time){
int cont;
data->level_recursions[height] = data->level_recursions[height] + 1;
if ( n<= 1) return;
for (cont = 1; cont <= a; cont++){
data->height = height + 1;
func_t( (n/b) - c, a, b, c, d, e, f, g, 1 + height, data, start_time );
}
for (cont = 1; cont <= d; cont++) {
data->height = height + 1;
func_t( (n/e) - f, a, b, c, d, e, f, g, 1 + height, data, start_time );
}
clock_t begin = clock();
for (cont = 1; cont <= fn(n, g); cont++);
clock_t end = clock();
data->level_work[height] = data->level_work[height] + ((double)(end - begin) / CLOCKS_PER_SEC);
}

Create a global variable holding the start time and initialize it once, before the first call.
At the top of the function, check whether the start time plus 30 seconds is less than the current time; if so, return (or do whatever else you want).
The same test could also serve as the condition for entering the recursion.
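A minimal sketch of that idea, assuming a hypothetical global deadline and a toy recursive function count_nodes standing in for func_t. Only time() from <time.h> is a real library call here; every other name is made up for illustration.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical global: the moment after which the recursion should stop. */
time_t deadline;

/* Call once before starting the recursion. */
void start_timer(int seconds)
{
    deadline = time(NULL) + seconds;
}

/* Nonzero once the deadline has been reached. */
int timed_out(void)
{
    return time(NULL) >= deadline;
}

/* Toy recursion: counts the visited nodes of a binary call tree,
   but bails out of every call as soon as the deadline has passed. */
long count_nodes(int n)
{
    assert(n >= 0);
    if (timed_out() || n <= 1)
        return 1;
    return 1 + count_nodes(n / 2) + count_nodes(n / 2);
}
```

Because every active call performs the same check on entry, the whole call stack unwinds quickly once the deadline passes, with no goto or longjmp. Note that time() only has one-second resolution, so the cutoff is approximate.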

Poisson calculation (erlang C)

I posted this before and a user told me to post it on Code Review. I did, and they closed it, so here it is one more time (I deleted the old question).
I have these formulas:
and I need the Poisson formulas for the erlangC formula:
I tried to rebuild the formulas in C:
double getPoisson(double m, double u, bool cumu)
{
double ret = 0;
if(!cumu)
{
ret = (exp(-u)*pow(u,m)) / (factorial(m));
}
else
{
double facto = 1;
double ehu = exp(-u);
for(int i = 0; i < m; i++)
{
ret = ret + (ehu * pow(u,i)) / facto;
facto *= (i+1);
}
}
return ret;
}
The Erlang C Formula:
double getErlangC(double m, double u, double p)
{
double numerator = getPoisson(m, u, false);
double denominator = getPoisson(m, u, false) + (1-p) * getPoisson(m, u, true);
return numerator/denominator;
}
The main problem is that the m parameter in getPoisson can be a large value (>170), so the code tries to calculate a factorial beyond 170!, which it cannot handle. I think the primitive data types are too small for this; what do you think?
BTW: This is the factorial function I use for the first Poisson:
double factorial(double n)
{
if(n >= 1)
return n*factorial(n-1);
else
return 1;
}
Some samples:
Input:
double l = getErlangC(50, 48, 0.96);
printf("%g", l);
Output:
0.694456 (correct)
Input:
double l = getErlangC(100, 96, 0.96);
printf("%g", l);
Output:
0.5872811 (correct)
If I use a value higher than 170 for the first parameter (m) of getErlangC, like:
Input:
double l = getErlangC(500, 487, 0.974);
printf("%g", l);
Output:
naN (incorrect)
Expected:
0.45269
How's my approach? Would there be a better way to calculate Poisson and Erlang C?
Some info: Excel has the POISSON function, and in Excel it works perfectly. Is there a way to see the algorithm (code) Excel uses for POISSON?

(pow(u, m)/factorial(m)) can be computed recursively as a product of the factors u/n, where n runs over the factors 1, 2, ..., m of m!.
double ratio(double u, int n)
{
if(n > 0)
{
// Avoid the ratio overflow by calculating each ratio element
double val;
val = u/n;
return val*ratio(u, n-1);
}
else
{
// Avoid division by 0 as power and factorial of 0 are 1
return 1;
}
}
Note that if you want to avoid recursion, you can do it as a loop as well
double ratio(double u, int n)
{
int i;
// Avoid the ratio overflow by calculating each ratio element
// default the ratio to 1 for n == 0
double val = 1;
// calculate the n ratios and multiply them into the total
for (i = 1; i<=n; i++)
{
// Put in the next element of the ratio
val *= u/i;
}
// return the final value of the ratio
return val;
}

To cope with values exceeding the double range, re-code to use the log of values. Downside: some precision loss.
Precision can be regained with improved code, but here is something that at least copes with the range issues.
A slight variant of OP's code follows, used for comparison.
long double factorial(unsigned m) {
long double f = 1.0;
while (m > 0) {
f *= m;
m--;
}
return f;
}
double getPoisson(unsigned m, double u, bool cumu) {
double ret = 0;
if (!cumu) {
ret = (double) ((exp(-u) * pow(u, m)) / (factorial(m)));
} else {
double facto = 1;
double ehu = exp(-u);
for (unsigned i = 0; i < m; i++) {
ret = ret + (ehu * pow(u, i)) / facto;
facto *= (i + 1);
}
}
return ret;
}
double getErlang(unsigned m, double u, double p) {
double numerator = getPoisson(m, u, false);
double denominator = numerator + (1.0 - p) * getPoisson(m, u, true);
return numerator / denominator;
}
Suggested changes
#ifdef M_PI
#define MY_PI M_PI
#else
#define MY_PI 3.1415926535897932384626433832795
#endif
// log of n!
//
// Gosper Approximation of Stirling's Approximation
// http://mathworld.wolfram.com/StirlingsApproximation.html
// n! about= sqrt(pi*(2*n + 1/3.)) * pow(n,n) * exp(-n)
static double ln_factorial(unsigned n) {
if (n <= 1) return 0.0;
double x = n;
return log(sqrt(MY_PI * (2 * x + 1 / 3.0))) + log(x) * x - x;
}
double getPoisson_2(unsigned m, double u, bool cumu) {
double ret = 0.0;
if (cumu) {
// Simplify term calculation. `mul` does not get too large nor small.
double mul = exp(-u);
for (unsigned i = 0; i < m; i++) {
ret += mul;
mul *= u/(i + 1);
// printf("ret:% 10e mul:% 10e\n", ret, mul);
}
} else {
// ret = (exp(-u) * pow(u, m)) / (factorial(m));
double ln_ret = -u + log(u) * m - ln_factorial(m);
return exp(ln_ret);
}
return ret;
}
double getErlang_2(unsigned m, double u, double p) {
double numerator = getPoisson_2(m, u, false);
double denominator = numerator + (1 - p) * getPoisson_2(m, u, true);
return numerator / denominator;
}
Test code
void ErTest(unsigned m, double u, double p, double expect) {
printf("m:%4u u:% 14e p:% 14e", m, u, p);
printf(" E0:% 14e", expect);
double y1 = getErlang(m, u, p);
printf(" E1:% 14e", y1);
double y2 = getErlang_2(m, u, p);
printf(" E2:% 14e", y2);
puts("");
}
int main(void) {
ErTest(50, 48, 0.96, 0.694456);
ErTest(100, 96, 0.96, 0.5872811);
ErTest(500, 487, 0.974, 0.45269);
}
m: 50 u: 4.800000e+01 p: 9.600000e-01 E0: 6.944560e-01 E1: 6.944556e-01 E2: 6.944562e-01
m: 100 u: 9.600000e+01 p: 9.600000e-01 E0: 5.872811e-01 E1: 5.872811e-01 E2: 5.872813e-01
m: 500 u: 4.870000e+02 p: 9.740000e-01 E0: 4.526900e-01 E1: nan E2: 4.464746e-01

Your large recursive factorial is a problem as it might produce a stack overflow as well as a value overflow. pow might also get large.
Here's a way to combine things incrementally:
double
getPoisson(double m, double u, bool cumu)
{
double sum = 0;
double facto = 1;
double u_i = 1;
double ehu = exp(-u);
double cur = ehu;
// u_i -- pow(u,i)
// cur -- current/last term in series
// sum -- sum of terms
for (int i = 0; i < m; i++) {
cur = (ehu * u_i) / facto;
sum += cur;
u_i *= u;
facto *= (i + 1);
}
// after the loop u_i == pow(u,m) and facto == m!, giving the term for i == m
return cumu ? sum : (ehu * u_i) / facto;
}
The above is "okay", but still might overflow some values because of the u_i and facto terms.
Here is an alternate that combines the terms as a ratio. It is less likely to overflow:
double
getPoisson(double m, double u, bool cumu)
{
double sum = 0;
double ehu = exp(-u);
double cur = ehu;
double ratio = 1;
// cur -- current/last term in series
// sum -- sum of terms
// ratio -- u^i / factorial(i)
for (int i = 0; i < m; i++) {
cur = ehu * ratio;
sum += cur;
ratio *= u;
ratio /= (i + 1);
}
// after the loop ratio == u^m / factorial(m), giving the term for i == m
return cumu ? sum : ehu * ratio;
}
The above might still produce some large values. If so, you might have to use long double, quadmath, or multiprecision arithmetic. Or, come up with an "analog" of the equation/algorithm.

Realtime Band-Limited Impulse Train Synthesis using SDL mixer

I'm trying to implement an audio synthesizer using this technique:
https://ccrma.stanford.edu/~stilti/papers/blit.pdf
I'm doing it in standard C, using SDL2_Mixer library.
This is my BLIT function implementation:
double blit(double angle, double M, double P) {
double x = M * angle / P;
double denom = (M * sin(M_PI * angle / P));
if (denom < 1)
return (M / P) * cos(M_PI * x) / cos(M_PI * x / M);
else {
double numerator = sin(M_PI * x);
return (M / P) * numerator / denom;
}
}
The idea is to combine it to generate a square wave, following the paper's instructions. I set up SDL2_mixer with this configuration:
SDL_AudioSpec *desired, *obtained;
SDL_AudioSpec *hardware_spec;
desired = (SDL_AudioSpec*)malloc(sizeof(SDL_AudioSpec));
obtained = (SDL_AudioSpec*)malloc(sizeof(SDL_AudioSpec));
desired->freq=44100;
desired->format=AUDIO_U8;
desired->channels=1;
desired->samples=2048;
desired->callback=create_rect;
desired->userdata=NULL;
And here's my create_rect function. It creates a bipolar impulse train, then integrates its value to generate a band-limited rect function.
void create_rect(void *userdata, Uint8 *stream, int len) {
static double angle = 0;
static double integral = 0;
int i = 0;
// This is the freq of my tone
double f1 = tone_table[current_wave.note];
// Sample rate
double fs = 44100;
// Pulse
double P = fs / f1;
int M = 2 * floor(P / 2) + 1;
double oldbipolar = 0;
double bipolar = 0;
for(i = 0; i < len; i++) {
if (++angle > P)
angle -= P;
double angle2 = angle + floor(P/2);
if (angle2 > P)
angle2 -= P;
bipolar = blit(angle2, M, P) - blit(angle, M, P);
integral += (bipolar + oldbipolar) * 0.5;
oldbipolar = bipolar;
*stream++ = (integral + 0.5) * 127;
}
}
My problem is: the resulting wave is quite OK, but after a few seconds it starts to make noises. I tried to plot the result, and here it is:
Any idea?
EDIT: Here's a plot of the bipolar BLIT before integrating it:
