How do you use GSL's Cholesky decomposition function with C?

I've been using GSL to support some matrix manipulation using C. I'm having a challenge with its Cholesky Decomposition function though and the documentation in the GSL reference manual is sparse to say the least. How do I get the Lower Triangular matrix output of the function?
Below is my code so far ...
#include <gsl/gsl_matrix.h>
#include <gsl/gsl_linalg.h>

#define rows 6
#define cols 6

double cov[rows*cols] = { 107.3461,  12.0710, -48.3746,  174.7796,  21.0202,  -80.6075,
                           12.0710,   8.0304,  -5.9610,   20.2434,   2.2427,   -9.3129,
                          -48.3746,  -5.9610,  25.2222,  -78.6277,  -9.4400,   36.1789,
                          174.7796,  20.2434, -78.6277,  291.3491,  35.0176, -134.3626,
                           21.0202,   2.2427,  -9.4400,   35.0176,   4.2144,  -16.1499,
                          -80.6075,  -9.3129,  36.1789, -134.3626, -16.1499,   61.9666 };

gsl_matrix_view m = gsl_matrix_view_array(cov, rows, cols);

int gsl_linalg_cholesky_decomp1(gsl_matrix *m)
... don't know what to do after this step
I know the formulas for calculating this manually, but I'd prefer to take advantage of this library instead.
Any help in this regard would be much appreciated.

Got things to work right with David's suggestion and a bit more digging ...
#include <stdio.h>
#include <gsl/gsl_linalg.h>

int main (void)
{
    double cov[9] = {2, -1, 0, -1, 2, -1, 0, -1, 2};
    gsl_matrix_view m = gsl_matrix_view_array(cov, 3, 3);
    gsl_matrix *x = gsl_matrix_alloc(3, 3);

    /* Factorize in place; the lower triangle (and diagonal) of m now holds L */
    gsl_linalg_cholesky_decomp1(&m.matrix);

    /* Copy the decomposed matrix into x and print it */
    gsl_matrix_memcpy(x, &m.matrix);
    printf("x = \n");
    gsl_matrix_fprintf(stdout, x, "%g");

    gsl_matrix_free(x);
    return 0;
}
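For reference, after gsl_linalg_cholesky_decomp1() the lower triangle (including the diagonal) of the matrix holds the factor L, so getting just the lower triangular matrix is a matter of zeroing everything above the diagonal. A minimal sketch (the helper name extract_lower is mine, not part of GSL):

#include <gsl/gsl_matrix.h>
#include <gsl/gsl_linalg.h>

/* Copy the decomposed matrix and zero everything above the diagonal,
   leaving only the lower triangular factor L. */
gsl_matrix *extract_lower(const gsl_matrix *chol)
{
    gsl_matrix *L = gsl_matrix_alloc(chol->size1, chol->size2);
    gsl_matrix_memcpy(L, chol);
    for (size_t i = 0; i < L->size1; i++)
        for (size_t j = i + 1; j < L->size2; j++)
            gsl_matrix_set(L, i, j, 0.0);
    return L;   /* caller frees with gsl_matrix_free() */
}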

Related

OpenMP parallelize a for loop

I want to use OpenMP to speed up the calculation of the arrays K11, K22, K33. In this code I calculate the interaction force between 100 particles. The input arrays K11, K22, K33 have length 100*99 and are filled with zeros. The array sarray contains the separations between the particles. The function trapzd integrates the function K_i, which is defined in a header file.
#include <stdio.h>
#include <math.h>
#include "FuncInteraction.h"
void ForceVel(double *sarray, double *K11, double *K22, double *K33, int NTrheads)
{
    double rr, ll;
    rr = 0;
    ll = 0;
    int pp;

    #pragma omp parallel for private(pp, rr, ll) num_threads(NTrheads)
    for (pp = 0; pp < 100*99; pp++)
    {
        rr = 50/sarray[pp];
        ll = 5/sarray[pp];
        K11[pp] = 2*trapzd(K1, rr, ll, 0, 12000, 20);
        K22[pp] = 2*trapzd(K2, rr, ll, 0, 10000, 20);
        K33[pp] = 2*trapzd(K3, rr, ll, 0, 10000, 20);
    }
}
When I execute this code, I observe that it runs on a single core, independent of the value of NTrheads. I would expect to be able to run that loop on more than one core, considering that the calculation of K11, K22 and K33 takes more than 1 s.
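One common reason a #pragma omp parallel for loop stays on a single core is that the file was compiled without OpenMP support (for GCC, the -fopenmp flag), in which case the pragma is silently ignored. A minimal, separate check (my own sketch, not part of the original code) that can be built with the same flags:

#include <stdio.h>
#include <omp.h>

int main(void)
{
    /* Build with e.g. gcc -fopenmp check_omp.c -o check_omp.
       With OpenMP enabled this normally prints one line per thread;
       a single line points at the build flags or OMP_NUM_THREADS. */
    #pragma omp parallel
    printf("hello from thread %d of %d\n",
           omp_get_thread_num(), omp_get_num_threads());
    return 0;
}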

Example of using MPI_Type_create_subarray to do 2d cyclic distribution

I would like to have an example showing how to use MPI_Type_create_subarray to build a 2D cyclic distribution for a large matrix.
I know that MPI_Type_create_darray will give me a 2D cyclic distribution, but it is not compatible with the ScaLAPACK process grid.
I would like to do a 2D block-cyclic distribution using MPI_Type_create_subarray and pass the matrices to ScaLAPACK routines.
Could I have an example showing this?
There are at least two parts to your question. The following sections address each piece, but leave the integration of the two to you. The example code in both sections, along with the explanation at the ScaLAPACK link below, should provide some guidance...
From DeinoMPI:
The following sample code illustrates MPI_Type_create_subarray.
#include "mpi.h"
#include <stdio.h>
int main(int argc, char *argv[])
{
int myrank;
MPI_Status status;
MPI_Datatype subarray;
int array[9] = { -1, 1, 2, 3, -2, -3, -4, -5, -6 };
int array_size[] = {9};
int array_subsize[] = {3};
int array_start[] = {1};
int i;
MPI_Init(&argc, &argv);
/* Create a subarray datatype */
MPI_Type_create_subarray(1, array_size, array_subsize, array_start, MPI_ORDER_C, MPI_INT, &subarray);
MPI_Type_commit(&subarray);
MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
if (myrank == 0)
{
MPI_Send(array, 1, subarray, 1, 123, MPI_COMM_WORLD);
}
else if (myrank == 1)
{
for (i=0; i<9; i++)
array[i] = 0;
MPI_Recv(array, 1, subarray, 0, 123, MPI_COMM_WORLD, &status);
for (i=0; i<9; i++)
printf("array[%d] = %d\n", i, array[i]);
fflush(stdout);
}
MPI_Finalize();
return 0;
}
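The DeinoMPI sample is one-dimensional; for the 2D case the question asks about, the same call takes two-element size/subsize/start arrays. A minimal sketch (my own, not from DeinoMPI) describing a single 2x2 block of a 4x4 row-major matrix; tiling many such blocks into a block-cyclic layout is the part left to integrate with the ScaLAPACK section below:

#include <mpi.h>

/* Build an MPI datatype describing the 2x2 block whose upper-left corner
   is at row 1, column 1 of a 4x4 row-major matrix of doubles.
   Call after MPI_Init(); free with MPI_Type_free() when done. */
void make_block_type(MPI_Datatype *block)
{
    int sizes[2]    = {4, 4};   /* extent of the full array      */
    int subsizes[2] = {2, 2};   /* extent of the selected block  */
    int starts[2]   = {1, 1};   /* offsets of the block's corner */

    MPI_Type_create_subarray(2, sizes, subsizes, starts,
                             MPI_ORDER_C, MPI_DOUBLE, block);
    MPI_Type_commit(block);
}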
And from ScaLapack in C essentials:
Unfortunately, there is no C interface for ScaLAPACK or PBLAS. All parameters should be passed into routines and functions by reference; you can also define constants (i_one for 1, i_negone for -1, d_two for 2.0E+0, etc.) to pass into routines. Matrices should be stored as 1D arrays (A[i + lda*j], not A[i][j]).
To invoke ScaLAPACK routines in your program, you should first initialize the process grid via BLACS routines (BLACS alone is enough). Second, you should distribute your matrix over the process grid (block-cyclic 2D distribution). You can do this with the pdgeadd_ PBLAS routine. This routine computes the sum of two matrices A and B: B := alpha*A + beta*B. The matrices can have different distributions; in particular, matrix A can be owned by a single process, so by setting alpha=1, beta=0 you can simply copy your non-distributed matrix A into the distributed matrix B.
Third, call pdgeqrf_ for matrix B. At the end of the ScaLAPACK part of the code, you can collect the results on one process (just copy the distributed matrix into a local one via pdgeadd_). Finally, close the grid via blacs_gridexit_ and blacs_exit_.
All in all, a ScaLAPACK-using program should contain the following:
int main() {
    // Useful constants
    const int i_one = 1, i_negone = -1, i_zero = 0;
    const double zero = 0.0E+0, one = 1.0E+0;
    ... (see the rest of the code at the linked location above)
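For orientation, here is a sketch of just the grid setup and teardown steps described above. It assumes the Cblacs_* C wrappers shipped with most BLACS/ScaLAPACK builds (there is no standard C header, so the prototypes are declared by hand); the distribution and pdgeqrf_ calls are omitted:

#include <stdio.h>

/* Hand-declared prototypes for the common Cblacs_* wrappers (assumed available). */
extern void Cblacs_pinfo(int *mypnum, int *nprocs);
extern void Cblacs_get(int context, int request, int *value);
extern void Cblacs_gridinit(int *context, char *order, int np_row, int np_col);
extern void Cblacs_gridinfo(int context, int *np_row, int *np_col, int *my_row, int *my_col);
extern void Cblacs_gridexit(int context);
extern void Cblacs_exit(int error_code);

int main(void)
{
    int mypnum, nprocs, ctxt;
    int nprow = 2, npcol = 2, myrow, mycol;

    Cblacs_pinfo(&mypnum, &nprocs);              /* who am I, how many processes */
    Cblacs_get(-1, 0, &ctxt);                    /* default system context       */
    Cblacs_gridinit(&ctxt, "Row", nprow, npcol); /* 2 x 2 process grid           */
    Cblacs_gridinfo(ctxt, &nprow, &npcol, &myrow, &mycol);

    printf("process %d of %d is at grid position (%d,%d)\n",
           mypnum, nprocs, myrow, mycol);

    /* ... distribute matrices (pdgeadd_), call pdgeqrf_, collect results ... */

    Cblacs_gridexit(ctxt);
    Cblacs_exit(0);
    return 0;
}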

passing in R array to C function with "NA"

I work in R and use C libraries from it. I need to pass to a C function an array with numbers between 1 and 10 that could also be "NA". Then, in C, I need to set the output depending on the value.
Here's a simplified version of the code:
dyn.load("ranking.so")
fun <- function(ranking) {
    nrak <- length(ranking)
    out <- .C("ranking", as.integer(nrak), as.character(ranking),
              rr = as.integer(vector("integer", nrak)))
    out$rr
}
ranking <- sample(c(NA,seq(1,10)),10,replace=TRUE)
rr <- fun(ranking)
The C function could simply be something like:
#include <R.h>
void ranking(int *nrak, char *ranking, int *rr) {
    int i;
    for (i = 0; i < *nrak; i++) {
        if (ranking[i] == 'NA')
            rr[i] = 1;
        else
            rr[i] = (int) strtol(&ranking[i], (char **)NULL, 10);
    }
}
Due to the "NA" value I set ranking as character but maybe there's another way to do that, using integer and without replacing "NA" to 0 before calling the function?
(The code like this, gives me always an array of zeros...)
Test for whether the value is an NA using R_NaInt, like
#include <R.h>
void ranking_c(int *nrak, int *ranking, int *rr) {
    for (int i = 0; i < *nrak; i++)
        rr[i] = R_NaInt == ranking[i] ? -1 : ranking[i];
}
Invoke from R by explicitly allowing NAs
> x = c(1:2, NA_integer_)
> .C("ranking_c", length(x), as.integer(x), integer(length(x)), NAOK=TRUE)[[3]]
[1] 1 2 -1
Alternatively, use R's .Call() interface. Each R object is represented as an S-expression. There are C-level functions to manipulate S-expressions, e.g., Rf_length() for length, INTEGER() for data access, and Rf_allocVector() for allocating different types of S-expressions, such as INTSXP for integer vectors.
R memory management uses a garbage collector that can run on any call that allocates memory. It is therefore best practice to PROTECT() any R allocation while in scope.
Your function will accept 0 or more S-expressions as input, and return a single S-expression; it might be implemented as
#include <Rinternals.h>
#include <R_ext/Arith.h>
SEXP ranking_call(SEXP ranking)
{
/* allocate space for result, PROTECTing from garbage collection */
SEXP result = PROTECT(Rf_allocVector(INTSXP, Rf_length(ranking)));
/* assign result */
for (int i = 0; i < Rf_length(ranking); ++i)
INTEGER(result)[i] =
R_NaInt == INTEGER(ranking)[i] ? -1 : INTEGER(ranking)[i];
UNPROTECT(1); /* no more need to protect */
return result;
}
And invoked from R with .Call("ranking_call", as.integer(ranking)).
Using .Call is more efficient than .C in terms of speed and memory allocation (.C may copy atomic vectors on the way in), but the primary reason to use it is for the flexibility it offers in terms of working directly with R's data structures. This is especially important when the return values are more complicated than atomic vectors.
You are attempting to address a couple of delicate and non-trivial points, not least how to compile code for use with R and how to test for non-finite values.
You asked for help with C. I would like to suggest C++, which you do not need to use in a complicated way. Consider this short file, which contains a function that processes a vector along the lines you suggest (for simplicity I just test for NA and assign 42 as a marker) or else squares the value:
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericVector foo(NumericVector x) {
    unsigned int n = x.size();
    for (unsigned int i=0; i<n; i++)
        if (NumericVector::is_na(x[i]))
            x[i] = 42.0;
        else
            x[i] = pow(x[i], 2);
    return x;
}
/*** R
foo( c(1, 3, NA, NaN, 6) )
*/
If I save this on my box as /tmp/foo.cpp, then in order to compile, link, load and even run the embedded R usage example, I only need one call to sourceCpp():
R> Rcpp::sourceCpp("/tmp/foo.cpp")
R> foo( c(1, 3, NA, NaN, 6))
[1] 1 9 42 42 36
R>
We can do the same with integers:
// [[Rcpp::export]]
IntegerVector bar(IntegerVector x) {
    unsigned int n = x.size();
    for (unsigned int i=0; i<n; i++)
        if (IntegerVector::is_na(x[i]))
            x[i] = 42;
        else
            x[i] = pow(x[i], 2);
    return x;
}

Armadillo: eigs_gen for smallest eigenvalue

I'm using armadillo's eigs_gen to find the smallest algebraic eigenvalue of a sparse matrix.
If I ask the function for just the smallest eigenvalue the result is incorrect, but if I ask it for the 2 smallest eigenvalues the result is correct. The code is:
#include <iostream>
#include <armadillo>
using namespace std;
using namespace arma;
int main(int argc, char** argv)
{
    cout << "Armadillo version: " << arma_version::as_string() << endl;

    sp_mat A(5,5);
    A(1,2) = -1;
    A(2,1) = -1;
    A(3,4) = -1;
    A(4,3) = -1;

    cx_vec eigval;
    cx_mat eigvec;

    eigs_gen(eigval, eigvec, A, 1, "sr");  // find smallest eigenvalue ---> INCORRECT RESULTS
    eigval.print("Smallest real eigval:");

    eigs_gen(eigval, eigvec, A, 2, "sr");  // find 2 smallest eigenvalues ---> ALMOST CORRECT RESULTS
    eigval.print("Two smallest real eigvals:");

    return 0;
}
My compile command is:
g++ file.cpp -o file.exe -O2 -I/path-to-armadillo/armadillo-4.600.3/include -DARMA_DONT_USE_WRAPPER -lblas -llapack -larpack
The output is:
Armadillo version: 4.600.3 (Off The Reservation)
Smallest real eigval:
(+1.000e+00,+0.000e+00)
Two smallest real eigvals:
(-1.000e+00,+0.000e+00)
(-1.164e-17,+0.000e+00)
Any idea on why this is happening and how to overcome it would be appreciated.
Note: the second result is only almost correct because we expect -1, -1 as the two lowest eigenvalues, but perhaps repeated eigenvalues are ignored.
Update: here is a test matrix construction which, after ryan's change adding the "sa" option to the library, doesn't seem to converge:
#define ARMA_64BIT_WORD
#include <armadillo>
#include <iostream>
#include <vector>
#include <stdio.h>
using namespace arma;
using namespace std;
int main(){
    size_t l(3), ls(l*l*l);

    sp_mat A = sprandn<sp_mat>(ls, ls, 0.01);
    sp_mat B = A.t()*A;

    vec eigval;
    mat eigvec;
    eigs_sym(eigval, eigvec, B, 1, "sa");

    return 0;
}
The matrix sizes of interest are much larger, e.g. ls = 8000 to 27000, and the real matrix is not quite the one constructed here, but I presume the problem would be the same.
I believe that the issue here is that you are running eigs_gen() (which calls DNAUPD) on a symmetric matrix. ARPACK notes that DNAUPD is not meant for symmetric matrices, but does not specify what will happen if you use symmetric matrices anyway:
NOTE: If the linear operator "OP" is real and symmetric with respect to the real positive semi-definite symmetric matrix B, i.e. B*OP = (OP')*B, then subroutine ssaupd should be used instead.
(from http://www.mathkeisan.com/usersguide/man/dnaupd.html )
I modified the internal Armadillo code to pass "sa" (smallest algebraic) to the ARPACK calls in eigs_sym() (sp_auxlib_meat.hpp), and I was able to obtain the correct eigenvalues. I've submitted a patch upstream to make "sa" and "la" support available for eigs_sym(), which I think should solve your problem once a new version is released (or at some point in the future).
The problem is with repeated eigenvalues; if I change the first two matrix elements to
A(1,2) = -1.00000001;
A(2,1) = -1.00000001;
the expected results are obtained.

cblas_dgemm works ONLY if beta is a power of two

I am totally stumped. I have a fairly large recursive program written in C that calls cblas_dgemm(). The result is verified independently by a program that works correctly.
C = alpha*A*B + beta*C
On repeated tests with random matrices and all possible combinations of parameters, the program gives the correct answer ONLY if abs(beta) is a power of two (1, 2, 4, 8, ...). Any value works for alpha. Any other positive/negative, odd/even value of beta gives the correct answer only 10-30% of the time.
I am using Ubuntu 10.04, GCC 4.4.x, I have tried system installed blas/cblas/atlas as well as manually compiled atlas.
Any hints or suggestions would be greatly appreciated. I am amazed at the wonderfully generous (and smart) folks lurking at this site.
Thanking you all in advance,
Russ
Two completely unrelated errors conspired to produce an elusive picture and made me look for problems in the wrong place.
(1) There was a simple error in the logic of the function calling dgemm. It would have been easy to fix had I not been chasing the wrong problem.
(2) My double-compare function, a double version of AlmostEqual2sComplement() (http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.htm), used an incorrectly sized integer, resulting in an incorrect TRUE under certain rare circumstances. This was the first time that error bit me!
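For what it's worth, a simpler tolerance-based comparison sidesteps the integer-width pitfall entirely; a minimal sketch (the tolerances are illustrative, not taken from the original program):

#include <math.h>
#include <stdbool.h>

/* Returns true when a and b agree to within an absolute or relative tolerance.
   Simpler than the ULP-based AlmostEqual2sComplement() approach. */
static bool almost_equal(double a, double b, double rel_tol, double abs_tol)
{
    double diff = fabs(a - b);
    if (diff <= abs_tol)                            /* handles values near zero */
        return true;
    return diff <= rel_tol * fmax(fabs(a), fabs(b));
}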
Thanks again for the useful suggestion of using the scientific method when trying to debug a program.
Russ
Yes, a full example would be handy. Here is an old example I had lying around that uses GSL's sgemm variant; it should be easy to adapt to double. Please try it and see if it gives the result shown in the GSL manual:
/* from the gsl info documentation in node 'gsl cblas examples' */
/* compile via 'gcc -o $file $file.c -lgslcblas' */
/* edd 15 Nov 2003 */
#include <stdio.h>
#include <gsl/gsl_cblas.h>
int
main (void)
{
    int lda = 3;
    float A[] = { 0.11, 0.12, 0.13,
                  0.21, 0.22, 0.23 };

    int ldb = 2;
    float B[] = { 1011, 1012,
                  1021, 1022,
                  1031, 1032 };

    int ldc = 2;
    float C[] = { 0.00, 0.00,
                  0.00, 0.00 };

    /* Compute C = A B */
    cblas_sgemm (CblasRowMajor,
                 CblasNoTrans, CblasNoTrans, 2, 2, 3,
                 1.0, A, lda, B, ldb, 0.0, C, ldc);

    printf ("[ %g, %g\n", C[0], C[1]);
    printf (" %g, %g ]\n", C[2], C[3]);

    return 0;
}
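Since the question concerns the double-precision routine, the same example adapted to cblas_dgemm would look like this (a direct translation of the sgemm code above; compile the same way, linking -lgslcblas):

#include <stdio.h>
#include <gsl/gsl_cblas.h>

int main (void)
{
    int lda = 3;
    double A[] = { 0.11, 0.12, 0.13,
                   0.21, 0.22, 0.23 };

    int ldb = 2;
    double B[] = { 1011, 1012,
                   1021, 1022,
                   1031, 1032 };

    int ldc = 2;
    double C[] = { 0.00, 0.00,
                   0.00, 0.00 };

    /* Compute C = 1.0*A*B + 0.0*C */
    cblas_dgemm (CblasRowMajor, CblasNoTrans, CblasNoTrans,
                 2, 2, 3, 1.0, A, lda, B, ldb, 0.0, C, ldc);

    printf ("[ %g, %g\n", C[0], C[1]);
    printf (" %g, %g ]\n", C[2], C[3]);

    return 0;
}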
