Armadillo: eigs_gen for smallest eigenvalue - sparse-matrix

I'm using Armadillo's eigs_gen to find the smallest algebraic eigenvalue of a sparse matrix.
If I request only the smallest eigenvalue, the result is incorrect, but if I request the two smallest eigenvalues, the result is correct. The code is:
#include <iostream>
#include <armadillo>
using namespace std;
using namespace arma;
int
main(int argc, char** argv)
{
cout << "Armadillo version: " << arma_version::as_string() << endl;
sp_mat A(5,5);
A(1,2) = -1;
A(2,1) = -1;
A(3,4) = -1;
A(4,3) = -1;
cx_vec eigval;
cx_mat eigvec;
eigs_gen(eigval, eigvec, A, 1, "sr"); // find smallest eigenvalue ---> INCORRECT RESULTS
eigval.print("Smallest real eigval:");
eigs_gen(eigval, eigvec, A, 2, "sr"); // find 2 smallest eigenvalues ---> ALMOST CORRECT RESULTS
eigval.print("Two smallest real eigvals:");
return 0;
}
My compile command is:
g++ file.cpp -o file.exe -O2 -I/path-to-armadillo/armadillo-4.600.3/include -DARMA_DONT_USE_WRAPPER -lblas -llapack -larpack
The output is:
Armadillo version: 4.600.3 (Off The Reservation)
Smallest real eigval:
(+1.000e+00,+0.000e+00)
Two smallest real eigvals:
(-1.000e+00,+0.000e+00)
(-1.164e-17,+0.000e+00)
Any idea on why this is happening and how to overcome this is appreciated.
Note: second result is only almost correct because we expect -1, -1 as the two lowest eigenvalues but perhaps repeated eigenvalues are ignored.
Update: including a test matrix construction which, after ryan's changes to include the "sa" option to the library, doesn't seem to converge:
#define ARMA_64BIT_WORD
#include <armadillo>
#include <iostream>
#include <vector>
#include <stdio.h>
using namespace arma;
using namespace std;
int main(){
size_t l(3), ls(l*l*l);
sp_mat A = sprandn<sp_mat>(ls, ls, 0.01);
sp_mat B = A.t()*A;
vec eigval;
mat eigvec;
eigs_sym(eigval, eigvec, B, 1, "sa");
return 0;
}
The matrix sizes of interest are much larger, e.g. ls = 8000 - 27000, and the real matrix is not quite the one constructed here, but I presume the problem should be the same.

I believe that the issue here is that you are running eigs_gen() (which calls DNAUPD) on a symmetric matrix. ARPACK notes that DNAUPD is not meant for symmetric matrices, but does not specify what will happen if you use symmetric matrices anyway:
NOTE: If the linear operator "OP" is real and symmetric with respect to the real positive semi-definite symmetric matrix B, i.e. B*OP = (OP')*B, then subroutine ssaupd should be used instead.
(from http://www.mathkeisan.com/usersguide/man/dnaupd.html )
I modified the internal Armadillo code to pass "sa" (smallest algebraic) to the ARPACK calls in eigs_sym() (sp_auxlib_meat.hpp), and I was able to obtain the correct eigenvalues. I've submitted a patch upstream to make "sa" and "la" support available for eigs_sym(), which I think should solve your problem once a new version is released (or at some point in the future).

The problem is with repeated eigenvalues; if I change the first two matrix elements to
A(1,2) = -1.00000001;
A(2,1) = -1.00000001;
the expected results are obtained.

Related

How do you use GSL's Cholesky Decomposition function with C

I've been using GSL to support some matrix manipulation using C. I'm having a challenge with its Cholesky Decomposition function though and the documentation in the GSL reference manual is sparse to say the least. How do I get the Lower Triangular matrix output of the function?
Below is my code so far ...
# include <gsl/gsl_matrix.h>
# include <gsl/gsl_linalg.h>
#define rows 6
#define cols 6
double cov[rows*cols] = {107.3461, 12.0710, -48.3746, 174.7796, 21.0202, -80.6075,
12.0710, 8.0304, -5.9610, 20.2434, 2.2427, -9.312,
-48.3746, -5.9610, 25.2222, -78.6277, -9.4400, 36.1789,
174.7796, 20.2434, -78.6277, 291.3491, 35.0176, -134.3626,
21.0202, 2.2427, -9.4400, 35.0176, 4.2144, -16.1499,
-80.6075, -9.3129, 36.1789, -134.3626, -16.1499, 61.9666};
gsl_matrix_view m = gsl_matrix_view_array(cov, rows, cols);
int gsl_linalg_cholesky_decomp1(gsl_matrix *m)
... don't know what to do after this step
I know the formulas for calculating this manually, but I'd prefer to take advantage of this library instead.
Any help in this regard would be much appreciated.
Got things to work right with David's suggestion and a bit more digging ...
#include <stdio.h>
#include <gsl/gsl_linalg.h>
int main ()
{
double cov[9] = {2, -1, 0, -1, 2, -1, 0, -1, 2};
gsl_matrix_view m = gsl_matrix_view_array(cov, 3, 3);
/* Factor in place: L ends up in the lower triangle and diagonal of m. */
gsl_linalg_cholesky_decomp1(&m.matrix);
printf ("L (lower triangle and diagonal of m) = \n");
gsl_matrix_fprintf (stdout, &m.matrix, "%g");
return 0;
}
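To answer the original question about getting the lower triangular matrix out, here is a minimal plain-C sketch of the extraction step. It does not call GSL; it works on a row-major array like the one handed to gsl_matrix_view_array, and it assumes the GSL 2.x storage convention for gsl_linalg_cholesky_decomp1 (L in the lower triangle and diagonal, original entries left in the strict upper triangle). The helper names extract_lower and multiply_l_lt are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Copy the lower triangle and diagonal of the factored n x n row-major
 * array `a` into `l`, zeroing the strict upper triangle.
 * Assumption: `a` holds L in its lower triangle, as after an in-place
 * Cholesky factorization. */
void extract_lower(const double *a, double *l, size_t n)
{
    assert(a != NULL && l != NULL);
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            l[i * n + j] = (j <= i) ? a[i * n + j] : 0.0;
}

/* Sanity check helper: r = L * L^T, which should reproduce the
 * original symmetric matrix. */
void multiply_l_lt(const double *l, double *r, size_t n)
{
    assert(l != NULL && r != NULL);
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            double acc = 0.0;
            for (size_t k = 0; k < n; k++)
                acc += l[i * n + k] * l[j * n + k];
            r[i * n + j] = acc;
        }
    }
}
```

For the 2 x 2 matrix [[4, 2], [2, 10]], an in-place factorization under that convention would leave {2, 2, 1, 3} in the array; extract_lower turns it into L = {2, 0, 1, 3}, and multiply_l_lt gives back {4, 2, 2, 10}.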

Enabling HVX SIMD in Hexagon DSP by using instruction intrinsics

I was using Hexagon-SDK 3.0 to compile my sample application for the HVX DSP architecture. Many Hexagon-LLVM tools are available in the folder:
~/Qualcomm/HEXAGON_Tools/7.2.12/Tools/bin
I wrote a small example that calculates the product of two arrays to make sure I can use the HVX hardware acceleration. However, when I generate assembly, either with -S or with -S -emit-llvm, I don't find any HVX instructions such as vmem, vX, etc. My C application runs on hexagon-sim for now, until I find a way to run it on the board as well.
As far as I understand, I need to write the HVX part of my code with C intrinsics, but I was not able to adapt the existing examples to my own needs. It would be great if somebody could demonstrate how this can be done. Also, many of the intrinsics are not defined in the Hexagon V62 Programmer's Reference Manual.
Here is my small app in pure C:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
#if defined(__hexagon__)
#include "hexagon_standalone.h"
#include "subsys.h"
#endif
#include "io.h"
#include "hvx.cfg.h"
#define KERNEL_SIZE 9
#define Q 8
#define PRECISION (1<<Q)
double vectors_dot_prod2(const double *x, const double *y, int n)
{
double res = 0.0;
int i = 0;
for (; i <= n-4; i+=4)
{
res += (x[i] * y[i] +
x[i+1] * y[i+1] +
x[i+2] * y[i+2] +
x[i+3] * y[i+3]);
}
for (; i < n; i++)
{
res += x[i] * y[i];
}
return res;
}
int main (int argc, char* argv[])
{
int n;
long long start_time, total_cycles;
/* -----------------------------------------------------*/
/* Allocate memory for input/output */
/* -----------------------------------------------------*/
//double *res = memalign(VLEN, 4 *sizeof(double));
const double *x = memalign(VLEN, n *sizeof(double));
const double *y = memalign(VLEN, n *sizeof(double));
if ( x == NULL || y == NULL ){
printf("Error: Could not allocate Memory for image\n");
return 1;
}
#if defined(__hexagon__)
subsys_enable();
SIM_ACQUIRE_HVX;
#if LOG2VLEN == 7
SIM_SET_HVX_DOUBLE_MODE;
#endif
#endif
/* -----------------------------------------------------*/
/* Call function */
/* -----------------------------------------------------*/
RESET_PMU();
start_time = READ_PCYCLES();
vectors_dot_prod2(x,y,n);
total_cycles = READ_PCYCLES() - start_time;
DUMP_PMU();
printf("Array product of x[i] * y[i] = %f\n",vectors_dot_prod2(x,y,4));
#if defined(__hexagon__)
printf("AppReported (HVX%db-mode): Array product of x[i] * y[i] =%f\n", VLEN, vectors_dot_prod2(x,y,4));
#endif
return 0;
}
I compile it using hexagon-clang:
hexagon-clang -v -O2 -mv60 -mhvx-double -DLOG2VLEN=7 -I../../common/include -I../include -DQDSP6SS_PUB_BASE=0xFE200000 -o arrayProd.o -c arrayProd.c
Then link it with subsys.o (found in the SDK, already compiled) and -lhexagon to generate my executable:
hexagon-clang -O2 -mv60 -o arrayProd.exe arrayProd.o subsys.o -lhexagon
Finally, run it using the sim:
hexagon-sim -mv60 arrayProd.exe
A bit late, but might still be useful.
Hexagon Vector eXtensions instructions are not emitted automatically, and the current instruction set (as of the 8.0 SDK) only supports integer manipulation, so the compiler will not emit any HVX instructions for C code that uses the "double" type (it is similar to SSE programming: you have to pack the xmm registers manually and use SSE intrinsics to do what you need).
You need to define what your application really requires.
E.g., if you are writing something 3D-related and really need to calculate double (or float) dot products, you might convert your floats to 16.16 fixed point and then use instructions (i.e., C intrinsics) like
Q6_Vw_vmpyio_VwVh and Q6_Vw_vmpye_VwVuh to emulate fixed-point multiplication.
To "enable" HVX you should use HVX-related types defined in
#include <hexagon_types.h>
#include <hexagon_protos.h>
The instructions like 'vmem' and 'vmemu' are emitted automatically for statements like
// I assume 64-byte mode, no `-mhvx-double`. For 128-byte mode use 32 int array
int values[16] = { 1, 2, 3, ..... };
/* The following line compiles to
{
r4 = __address_of_values
v1 = vmem(r4 + #0)
}
You can get the exact code by using '-S' switch, as you already do
*/
HVX_Vector v = *(HVX_Vector*)values;
Your (fixed-point) version of dot_product may read out 16 integers at a time, multiply all 16 integers in a couple of instructions (see HVX62 programming manual, there is a tip to implement 32-bit integer multiplication from 16-bit one),
then shuffle/deal/ror data around and sum up rearranged vectors to get dot product (this way you may calculate 4 dot products almost at once and if you preload 4 HVX registers - that is 16 4D vectors - you may calculate 16 dot products in parallel).
If what you are doing is really just byte/int image processing, you might use specific 16-bit and 8-bit hardware dot products in Hexagon instruction set, instead of emulating doubles and floats.
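As a rough illustration of the 16.16 fixed-point idea mentioned above, here is a scalar plain-C sketch. It does not use any HVX intrinsics; an HVX version would apply the same arithmetic across vector lanes. The type q16_16 and the function names are hypothetical, and the sketch assumes inputs small enough that the 64-bit accumulator and the final 32-bit result do not overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t q16_16;          /* 16 integer bits, 16 fractional bits */
#define Q16_ONE 65536            /* 1.0 in 16.16 fixed point */

/* Convert a double to 16.16 fixed point, rounding to nearest. */
q16_16 q16_from_double(double x)
{
    assert(x > -32768.0 && x < 32767.0);   /* must fit in 16 integer bits */
    double scaled = x * Q16_ONE;
    return (q16_16)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
}

/* Convert 16.16 fixed point back to double. */
double q16_to_double(q16_16 x)
{
    return (double)x / Q16_ONE;
}

/* Integer-only dot product: each product of two 16.16 values is a
 * 32.32 value, accumulated in 64 bits and rescaled once at the end. */
q16_16 q16_dot(const q16_16 *x, const q16_16 *y, size_t n)
{
    int64_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += (int64_t)x[i] * (int64_t)y[i];
    return (q16_16)(acc / Q16_ONE);
}
```

Rescaling once after the accumulation, rather than after every multiplication, keeps the rounding error to a single truncation. For x = {1.5, -2.0, 0.25, 4.0} and y = {2.0, 0.5, 8.0, 1.0}, all exactly representable in 16.16, the result converts back to 8.0.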

Using LAPACKE_zgetrs with LAPACK_ROW_MAJOR causes illegal memory access

I am trying to solve a linear system using the following code:
#include <stdio.h>
#include <lapacke.h>
int main () {
lapack_complex_double mat[4];
lapack_complex_double vec[2];
lapack_int p[2];
mat[0] = lapack_make_complex_double(1,0);
mat[1] = lapack_make_complex_double(1,0);
mat[2] = lapack_make_complex_double(1,0);
mat[3] = lapack_make_complex_double(-1,0);
vec[0] = lapack_make_complex_double(1,0);
vec[1] = lapack_make_complex_double(1,0);
LAPACKE_zgetrf(LAPACK_ROW_MAJOR, 2, 2, mat, 2, p);
LAPACKE_zgetrs(LAPACK_ROW_MAJOR, 'N', 2, 1, mat, 2, p, vec, 2);
printf("%g %g\n", lapack_complex_double_real(vec[0]),
lapack_complex_double_imag(vec[0]));
return 0;
}
For some reason, this causes an illegal memory access in LAPACKE_zgetrs (detected by valgrind, and by my larger program crashing in zgetrs with "glibc detected corruption or double free"). I left the return-value checks out of my SSCCE for brevity, but every LAPACKE routine that returns, returns 0.
The same code with LAPACK_COL_MAJOR runs and valgrinds flawlessly.
My lapacke, lapack, etc. are self-built for Ubuntu 12.04. I used the following settings in the lapack CMake file:
BUILD_COMPLEX ON
BUILD_COMPLEX16 ON
BUILD_DOUBLE ON
BUILD_SHARED_LIBS ON
BUILD_SINGLE ON
BUILD_STATIC_LIBS ON
BUILD_TESTING ON
CMAKE_BUILD_TYPE Release
LAPACKE ON
LAPACKE_WITH_TMG ON
and the rest (the optimized blas/lapack and xblas) off. There were no errors during the build and all tests succeeded.
Where did I mess up?
Edit: I just tried this with Fedora21 and the packaged lapacke. It did not reproduce the error.
Edit 2: While it does not reproduce the memory fails, it produces a wrong solution, namely (1 + 0I, 1 + 0I) for the above input (should be (1,0))
After some more research and overthinking things, I found the culprit:
Using LAPACK_ROW_MAJOR switches the meaning of the ld* leading dimension parameters. While the leading dimension of a normal Fortran array is the number of rows, switching to ROW_MAJOR switches its meaning to the number of columns. So the correct call (giving correct results) would be:
LAPACKE_zgetrs(LAPACK_ROW_MAJOR, 'N', 2, 1, mat, 2, p, vec, 1);
where the second 2 is the number of columns (not rows!) of mat, and the last parameter must equal the number of right hand sides nrhs (not the number of variables!). I isolated this very call because all the other calls in my project dealt with square matrices, so the "wrong" calls do not have any negative effect due to symmetry.
As usual, if you are skipping columns at the end, the leading dimensions get bigger accordingly, as they would with skipping rows in the normal setting.
Obviously, this is not mentioned in the Fortran documentation. Unfortunately, I did not see any such remark in the LAPACKE documentation either, which would have saved me a couple of hours of my life. :)
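The indexing arithmetic behind this can be sketched in plain C. This is not LAPACKE code, only the conventional offset formulas for the two layouts, with hypothetical helper names. It shows why a 2 x 1 right-hand side stored in a 2-element array is overrun when a leading dimension of 2 is claimed in row-major order.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of element (i, j) in a flat array with leading dimension ld. */
size_t offset_row_major(size_t i, size_t j, size_t ld) { return i * ld + j; }
size_t offset_col_major(size_t i, size_t j, size_t ld) { return i + j * ld; }

/* Smallest valid leading dimension for a rows x cols matrix:
 * the row length in row-major order, the column length in column-major. */
size_t min_leading_dim(int row_major, size_t rows, size_t cols)
{
    return row_major ? cols : rows;
}

/* Number of array elements touched when a rows x cols matrix is
 * addressed with leading dimension ld. */
size_t required_elements(int row_major, size_t rows, size_t cols, size_t ld)
{
    assert(rows > 0 && cols > 0);
    assert(ld >= min_leading_dim(row_major, rows, cols));
    return (row_major ? offset_row_major(rows - 1, cols - 1, ld)
                      : offset_col_major(rows - 1, cols - 1, ld)) + 1;
}
```

With row-major order, rows = 2, cols = 1 (one right-hand side) and ld = 2, the last element sits at offset 2, so 3 elements are needed and a 2-element vec is read past its end. With ld = 1 only 2 elements are needed, which matches the corrected call.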

IDL CALL_EXTERNAL pass array

I am currently trying to interface some Fortran routines with IDL; yes, it is as painful as it sounds. To start with, I tried to get the example given in the IDL documentation to work, but here I run into a very strange problem. When I try to pass an array, as demonstrated here http://www.exelisvis.com/docs/FORTRANExamples.html, the code only passes the first element.
Here is the code I use. Currently I use the C wrapper given in the above link.
The Fortran code (clf.F):
SUBROUTINE SUM_ARRAY1(array, n, sum)
implicit none
INTEGER n, i, st
INTEGER array(n), sum
sum = 0
DO i=1,n
st = sum + array(i)
sum = st
ENDDO
!sum = n
!sum = array(1)
RETURN
END
And here is the C code (caller.c):
#include <stdio.h>
void sum_array(int argc, void *argv[])
{
extern void sum_array1_();/* Fortran routine */
int *n;
int *s, *f;
f = (int *) argv[0];/* Array pntr */
n = (int *) argv[1];/* Get # of elements */
s = (int *) argv[2];/* Pass back result a parameter */
sum_array1_(f, n, s);/* Compute sum */
}
I compile and link with
gfortran -c clf.F -fPIC &&
gcc -c caller.c -fPIC &&
gcc -shared -fpic clf.o caller.o -o mylb.so
And call in IDL with
a = [5,6,7]
sm = 0
S = CALL_EXTERNAL('mylb.so','sum_array', a, N_ELEMENTS(a), sm)
print, sm,a
Now this should return the sum of my numbers, in other words sm = 18. However, when I run the code as given by exelisvis, I get some random number. I have played around with it: as you can see, I have tried setting sum = n and sum = array(1). There I get the correct output, 3 and 5. However, if I try sum = array(2), I get strange numbers again.
From what I can gather by debugging, the whole array is not passed to the Fortran array. I have also tried a Fortran interface and different compilers. When I used the Fortran wrapper, I defined an array there and passed it to the subroutine, and that worked like a charm.
So it seems to me that the problem is really in the passing of information from IDL to Fortran/C. It surprises me that I cannot even get the examples on the webpage to work. I am currently on a 64-bit system, and tomorrow I will try to compile for 32-bit and see if that changes anything; the manual mentions this. However, I need to get it working on a 64-bit system.
Since this is very new territory to me, I hope that there is some silly mistake here somewhere and that someone can spot it. All forms of help are appreciated. Thanks.
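IDL integers are 16-bit by default, while the C wrapper and the Fortran routine expect 32-bit integers. Change your array initialization statement in IDL to:
a = long([5,6,7])
Also cast sm and N_ELEMENTS(a) to long() in your IDL code.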
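The size mismatch can be reproduced in plain C without IDL. The sketch below uses a hypothetical helper, read_as_int32, that reads a raw buffer the way the wrapper's (int *) cast does, assuming a 32-bit int. The arrays stand in for what IDL would pass for a default 16-bit integer array and for a long() array.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read the index-th 32-bit integer from a raw buffer, mimicking what
 * the C wrapper does when it casts argv[0] to (int *). memcpy is used
 * so the read is well defined regardless of alignment. */
int32_t read_as_int32(const void *buf, size_t index)
{
    int32_t value;
    assert(buf != NULL);
    memcpy(&value, (const unsigned char *)buf + index * sizeof value,
           sizeof value);
    return value;
}
```

Given int16_t data {5, 6, 7, 0}, read_as_int32(data, 0) does not return 5: on a little-endian machine it returns 5 + 6 * 65536 = 393221, because two 16-bit elements are fused into one 32-bit value. Given int32_t data {5, 6, 7}, the same call returns 5 and index 1 returns 6. The low 16 bits of 393221 are 5, which may be why sum = array(1) appeared to work when the result was read back into a 16-bit IDL variable.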

cblas_dgemm - works ONLY if (beta) is power-of-two

I am totally stumped. I have a fairly large recursive program written in C that calls cblas_dgemm(). The result is verified independently by a program that works correctly.
C = alpha*A*B + beta*C
In repeated tests using random matrices and all possible combinations of parameters, the program gives the correct answer ONLY if abs(beta) = 2^n (1, 2, 4, 8, ...). Any value works for alpha. Any other value for beta, positive or negative, odd or even, gives the correct answer between 10% and 30% of the time.
I am using Ubuntu 10.04, GCC 4.4.x, I have tried system installed blas/cblas/atlas as well as manually compiled atlas.
Any hints or suggestions would be greatly appreciated. I am amazed at the wonderfully generous (and smart) folks lurking at this site.
Thanking you all in advance,
Russ
Two completely unrelated errors conspired to produce a misleading picture. It made me look for problems in the wrong place.
(1) There was a simple error in the logic of the function calling dgemm. It would have been easily fixed if I had not been chasing the wrong problem.
(2) My double-compare function, a double version of AlmostEqual2sComplement() (http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.htm), used an incorrectly sized integer, resulting in an incorrect TRUE under certain rare circumstances. This was the first time the error bit me!
Thanks again for the useful suggestion of using the scientific method when trying to debug a program.
Russ
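For reference, here is a minimal sketch of a ULP-based comparison for doubles that uses a 64-bit integer, in the spirit of the AlmostEqual2sComplement() approach mentioned in point (2). It is not the poster's code; the names ordered_bits and almost_equal_ulps are hypothetical, and it assumes IEEE 754 doubles that are 64 bits wide.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Map a double's bit pattern to a signed integer that increases
 * monotonically with the double's value. A 64-bit integer is required:
 * a 32-bit one would only see half of the bits. */
static int64_t ordered_bits(double x)
{
    int64_t bits;
    assert(sizeof bits == sizeof x);
    memcpy(&bits, &x, sizeof bits);
    /* Negative doubles are stored sign-magnitude; flip them so that
     * -0.0 maps to 0 and more negative values map to smaller integers. */
    return bits < 0 ? INT64_MIN - bits : bits;
}

/* Return 1 if a and b are within max_ulps representable doubles of
 * each other, 0 otherwise. NaN never compares equal. */
int almost_equal_ulps(double a, double b, uint64_t max_ulps)
{
    if (a != a || b != b)
        return 0;
    int64_t ia = ordered_bits(a);
    int64_t ib = ordered_bits(b);
    /* Unsigned subtraction avoids signed overflow when the two values
     * have opposite signs and large magnitudes. */
    uint64_t diff = ia >= ib ? (uint64_t)ia - (uint64_t)ib
                             : (uint64_t)ib - (uint64_t)ia;
    return diff <= max_ulps;
}
```

With this version, 1.0 and the next representable double above it are equal at a tolerance of 1 ULP but not at 0, while 1.0 and 1.0 + 1e-9 differ by millions of ULPs and are rejected at any small tolerance.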
Yes, a full example would be handy. Here is an old example I had hanging around using GSL's sgemm variant; should be easy to fix to double. Please try and see if this gives the result shown in the GSL manual:
/* from the gsl info documentation in node 'gsl cblas examples' */
/* compile via 'gcc -o $file $file.c -lgslcblas' */
/* edd 15 Nov 2003 */
#include <stdio.h>
#include <gsl/gsl_cblas.h>
int
main (void)
{
int lda = 3;
float A[] = { 0.11, 0.12, 0.13,
0.21, 0.22, 0.23 };
int ldb = 2;
float B[] = { 1011, 1012,
1021, 1022,
1031, 1032 };
int ldc = 2;
float C[] = { 0.00, 0.00,
0.00, 0.00 };
/* Compute C = A B */
cblas_sgemm (CblasRowMajor,
CblasNoTrans, CblasNoTrans, 2, 2, 3,
1.0, A, lda, B, ldb, 0.0, C, ldc);
printf ("[ %g, %g\n", C[0], C[1]);
printf (" %g, %g ]\n", C[2], C[3]);
return 0;
}
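For independent verification of a dgemm result, a naive reference implementation of C = alpha*A*B + beta*C is often enough. The sketch below is a plain triple loop for row-major matrices without transposition; ref_dgemm is a hypothetical name, not part of any BLAS, and it makes no attempt at performance.

```c
#include <assert.h>
#include <stddef.h>

/* Reference C = alpha*A*B + beta*C for row-major storage.
 * A is m x k (leading dimension lda), B is k x n (ldb), C is m x n (ldc).
 * When beta is zero, C is overwritten without being read. */
void ref_dgemm(size_t m, size_t n, size_t k,
               double alpha, const double *a, size_t lda,
               const double *b, size_t ldb,
               double beta, double *c, size_t ldc)
{
    assert(a != NULL && b != NULL && c != NULL);
    assert(lda >= k && ldb >= n && ldc >= n);
    for (size_t i = 0; i < m; i++) {
        for (size_t j = 0; j < n; j++) {
            double acc = 0.0;
            for (size_t p = 0; p < k; p++)
                acc += a[i * lda + p] * b[p * ldb + j];
            c[i * ldc + j] = (beta == 0.0)
                ? alpha * acc
                : alpha * acc + beta * c[i * ldc + j];
        }
    }
}
```

Called with the A and B from the example above (m = 2, n = 2, k = 3, alpha = 1, beta = 0), it yields approximately 367.76, 368.12, 674.06, 674.72. Running the same inputs with a beta that is not a power of two gives a baseline to compare the library result against, using a tolerance rather than exact equality.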
