Outer product using CBLAS - c

I am having trouble utilizing CBLAS to perform an Outer Product. My code is as follows:
//===SET UP===//
double x1[] = {1,2,3,4};
double x2[] = {1,2,3};
int dx1 = 4;
int dx2 = 3;
double X[dx1 * dx2];
for (int i = 0; i < (dx1*dx2); i++) {X[i] = 0.0;}
//===DO THE OUTER PRODUCT===//
cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasTrans, dx1, dx2, 1, 1.0, x1, dx1, x2, 1, 0.0, X, dx1);
//===PRINT THE RESULTS===//
printf("\nMatrix X (%d x %d) = x1 (*) x2 is:\n", dx1, dx2);
for (i=0; i<4; i++) {
for (j=0; j<3; j++) {
printf ("%lf ", X[j+i*3]);
}
printf ("\n");
}
I get:
Matrix X (4 x 3) = x1 (*) x2 is:
1.000000 2.000000 3.000000
0.000000 -1.000000 -2.000000
-3.000000 0.000000 7.000000
14.000000 21.000000 0.000000
But the correct answer is found here:
https://www.sharcnet.ca/help/index.php/BLAS_and_CBLAS_Usage_and_Examples
I have seen: Efficient computation of kronecker products in C
But, it doesn't help me because they don't actually say how to utilize dgemm to actually do this...
Any help? What am I doing wrong here?

You can do it with dgemm, but it would be more stylistically correct to use dger, which is a dedicated outer-product implementation. As such it's somewhat easier to use correctly:
cblas_dger(CblasRowMajor, /* you’re using row-major storage */
dx1, /* the matrix X has dx1 rows ... */
dx2, /* ... and dx2 columns. */
1.0, /* scale factor to apply to x1x2' */
x1,
1, /* stride between elements of x1. */
x2,
1, /* stride between elements of x2. */
X,
dx2); /* leading dimension of matrix X. */
dgemm does have the nice feature that passing \beta = 0 initializes the result matrix for you, which saves you from needing to explicitly zero it out yourself before the call. dger, by contrast, adds the outer product to whatever X already contains, so X must be zeroed first. @Artem Shinkarov's answer provides a nice description of how to use dgemm.
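To make the role of the leading dimension concrete, here is a plain-C sketch of the row-major rank-1 update A := alpha * x * y' + A. It is not the CBLAS implementation; the name rank1_update_rowmajor is hypothetical, and unit stride is assumed for both vectors.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference routine (not part of CBLAS): row-major rank-1
 * update A := alpha * x * y' + A. A has m rows and n columns, and the
 * starts of consecutive rows are lda elements apart. */
void rank1_update_rowmajor(int m, int n, double alpha,
                           const double *x, const double *y,
                           double *A, int lda)
{
    assert(m >= 0 && n >= 0 && lda >= n);
    for (int i = 0; i < m; i++)
        for (int j = 0; j < n; j++)
            A[(size_t)i * (size_t)lda + (size_t)j] += alpha * x[i] * y[j];
}
```

With x1 = {1,2,3,4}, x2 = {1,2,3}, a zeroed X and lda = dx2 = 3, element X[i*3 + j] ends up holding x1[i] * x2[j], which matches the indexing used by the print loop in the question.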

The BLAS interfaces are not very convenient, but let's try to figure it out. First of all, let's say that all our matrices are stored in row-major order. Now we have the following set-up:
      rows   cols
x1:   dx1    1      (A)
x2:   1      dx2    (B)
X:    dx1    dx2    (C)
Now, we just need to fill the call according to the documentation, which is specified in terms of
C = \alpha A*B + \beta C
So we get:
cblas_dgemm (CblasRowMajor, CblasNoTrans, CblasNoTrans,
(int)dx1, /* rows in A */
(int)dx2, /* columns in B */
(int)1, /* columns in A */
1.0, x1, /* \alpha, A itself */
(int)1, /* lda: columns in A */
x2, /* B itself */
(int)dx2, /* ldb: columns in B */
0.0, X, /* \beta, C itself */
(int)dx2 /* ldc: columns in C */);
So that should do the job I would hope.
Here is a description of the parameters of dgemm: Link

Related

Discrete Fourier Transform With C, Implementation problems?

I'm trying to understand some basics of DFT, some math equations, and try to implement it with C.
Well, this is the function i used from a book (Algorithms for Image Processing And Computer Vision)
void slowft (float *x, COMPLEX *y, int n)
{
COMPLEX tmp, z1, z2, z3, z4;
int m, k;
/* Constant factor -2 pi */
cmplx (0.0, (float)(atan (1.0)/n * -8.0), &tmp);
printf (" constant factor -2 pi %f ", (float)(atan (1.0)/n * -8.0));
for (m = 0; m<=n; m++)
{
NEXT();
cmplx (x[0], 0.0, &(y[m]));
for (k=1; k<=n-1; k++)
{
/* Exp (tmp*k*m) */
cmplx ((float)k, 0.0, &z2);
cmult (tmp, z2, &z3);
cmplx ((float)m, 0.0, &z2);
cmult (z2, z3, &z4);
cexp (z4, &z2);
/* *x[k] */
cmplx (x[k], 0.0, &z3);
cmult (z2, z3, &z4);
/* + y[m] */
csum (y[m], z4, &z2);
y[m].real = z2.real; y[m].imag = z2.imag;
}
}
}
So actually, I'm stuck on the constant factor part. I don't understand:
1-) where it comes from (especially arctan(1)), and
2-) what its purpose is.
This is the equation of DFT:
And these are other functions that i used:
void cexp (COMPLEX z1, COMPLEX *res)
{
COMPLEX x, y;
x.real = exp((double)z1.real);
x.imag = 0.0;
y.real = (float)cos((double)z1.imag);
y.imag = (float)sin((double)z1.imag);
cmult (x, y, res);
}
void cmult (COMPLEX z1, COMPLEX z2, COMPLEX *res)
{
res->real = z1.real*z2.real - z1.imag*z2.imag;
res->imag = z1.real*z2.imag + z1.imag*z2.real;
}
void csum (COMPLEX z1, COMPLEX z2, COMPLEX *res)
{
res->real = z1.real + z2.real;
res->imag = z1.imag + z2.imag;
}
void cmplx (float rp, float ip, COMPLEX *z)
{
z->real = rp;
z->imag = ip;
}
float cnorm (COMPLEX z)
{
return z.real*z.real + z.imag*z.imag;
}
1-) where it comes from (especially arctan(1)), and
The code comment immediately above clues you in:
/* Constant factor -2 pi */
... although actually what is being computed is -2 pi / n (in the broader context of producing a complex number with that as the coefficient of its imaginary component). Observe that the tangent has value 1 for angles whose sine and cosine are equal. The angle that has that property and is in the range [0, pi) is pi / 4, so atan(1.0) * -8.0 is (a good approximation to) -2 pi.
2-) what its purpose is.
It (or actually its additive inverse) appears in the DFT equation you presented, so it is natural that it appears in a function intended to implement that formula.
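The arithmetic behind the constant can be checked in isolation. The sketch below assumes nothing beyond <math.h>; the function name dft_angle_step is hypothetical and simply reproduces the expression passed to cmplx in slowft.

```c
#include <assert.h>
#include <math.h>

/* Returns -2*pi/n, the value slowft stores as the imaginary part of tmp.
 * atan(1.0) is pi/4, so atan(1.0) * -8.0 is -2*pi. */
double dft_angle_step(int n)
{
    assert(n > 0);
    return atan(1.0) / n * -8.0;
}
```

For n = 4 this gives approximately -1.5708, that is, -pi/2: each step in k or m rotates the complex exponential by a quarter turn.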
Here is the code with comments explaining it.
void slowft (float *x, COMPLEX *y, int n)
{
COMPLEX tmp, z1, z2, z3, z4;
int m, k;
/* Constant factor -2 pi */
cmplx (0.0, (float)(atan (1.0)/n * -8.0), &tmp);
/* atan(1) is π/4, so this sets tmp to -2πi/n. Note that the i
factor, the imaginary unit, comes from putting the expression in
the second argument, which gives the imaginary portion of the
complex number being assigned. (It is written as "j" in the
equation displayed in the question. That is because engineers use
"j" for i, having historically already used "i" for other purposes.)
*/
printf (" constant factor -2 pi %f ", (float)(atan (1.0)/n * -8.0));
for (m = 0; m<=n; m++)
{
NEXT();
// Well, that is a frightening thing to see in code. It is cryptic.
cmplx (x[0], 0.0, &(y[m]));
/* This starts to calculate a sum that will be accumulated in y[m].
The sum will be over k from 0 to n-1. For the first term, k is 0,
so -2πiωk/n will be 0. The coefficient is e to the power of that,
and e**0 is 1, so the first term is x[0] * 1, so we just put x[0]
directly in y[m] with no multiplication.
*/
for (k=1; k<=n-1; k++)
// This adds the rest of the terms.
{
/* Exp (tmp*k*m) */
cmplx ((float)k, 0.0, &z2);
// This sets z2 to k.
cmult (tmp, z2, &z3);
/* This multiplies the -2πi/n from above by k, putting -2πik/n
in z3.
*/
cmplx ((float)m, 0.0, &z2);
// This sets z2 to m. m corresponds to the ω in the equation.
cmult (z2, z3, &z4);
// This multiplies m by -2πik/n, putting -2πiωk/n in z4.
cexp (z4, &z2);
/* This raises e to the power of -2πiωk/n, finishing the
coefficient of the term in the sum.
*/
/* *x[k] */
cmplx (x[k], 0.0, &z3);
// This sets z3 to x[k].
cmult (z2, z3, &z4);
// This multiplies x[k] by the coefficient, e**(-2πiωk/n).
/* + y[m] */
csum (y[m], z4, &z2);
/* This adds the term (z4) to the sum being accumulated (y[m])
and puts the updated sum in z2.
*/
y[m].real = z2.real; y[m].imag = z2.imag;
/* This moves the updated sum to y[m]. This is not necessary
because csum is passed its operands as values, so they are
copied when calling the function, and it is safe to update its
output. csum(y[m], z4, &y[m]) above would have worked. But
this works too.
*/
}
}
}
Standard C has support for complex arithmetic, so it would be easier and clearer to include <complex.h> and write code this way:
void slowft(float *x, complex float *y, int n)
{
static const float TwoPi = 0x3.243f6a8885a308d313198a2e03707344ap1f;
float t0 = -TwoPi/n;
for (int m = 0; m <=n; m++)
{
float t1 = t0*m;
y[m] = x[0];
for (int k = 1; k < n; k++)
y[m] += x[k] * cexpf(t1 * k * I);
}
}

How to calculate the volume of glass juice?

The radius of the upper part, r1, and the radius of the lower part, r2, are given. If the height of the glass is h and the height of the juice is p, what is the volume of the juice in the glass? For some inputs my output is wrong:
#include <stdio.h>
#include <math.h>
int main() {
int r1, r2, h, p;
scanf("%d %d %d %d", &r1, &r2, &h, &p);
//for juice height p we will find out the radius r3
float r3 = ((r1 * p) / h); //calculating r3 by percentage formula
float v = ((M_PI / 3) * 3 * ((r3 * r3) + (r2 * r2) + (r3 + r2)));
printf("Volume : %f\n", v);
return 0;
}
EDIT: After I changed all the int variables to double, the output is still wrong for some inputs. For example, for input 5 2 3 2 the answer should be 58.64.
Integer division yields an integer result - 1/2 == 0, 5/2 == 2, etc., so ((r1*p)/h) is not giving you the right value.
You should declare your inputs as double instead of int and use %lf instead of %d to read them:
double r1,r2,h,p;
scanf("%lf %lf %lf %lf",&r1,&r2,&h,&p);
You'll also want to declare r3 and v as double instead of float.
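A minimal sketch of the difference, using the same expression as the question's ((r1 * p) / h). The two function names are hypothetical and exist only for the comparison.

```c
#include <assert.h>

/* Integer operands: the quotient is truncated toward zero. */
int scaled_radius_int(int r1, int p, int h)
{
    assert(h != 0);
    return (r1 * p) / h;
}

/* Floating-point operands: the fractional part is kept. */
double scaled_radius_double(double r1, double p, double h)
{
    assert(h != 0.0);
    return (r1 * p) / h;
}
```

For r1 = 5, p = 2, h = 3 the integer version yields 3, while the double version yields roughly 3.333, so the truncation alone already changes the radius used in the volume formula.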
Besides the errors shown in John Bode's answer, the posted code is using the wrong formulas to calculate the volume.
#include <stdio.h>
#include <math.h>
// M_PI is not a standard macro
#if !defined(M_PI)
#define M_PI 3.14159265358979323846
#endif
int main(void)
{
// Given the following calculations, I'd use a floating-point type
// to store the values.
double r1, r2, h, p;
// The format specifiers need to be changed accordingly.
scanf("%lf%lf%lf%lf", &r1, &r2, &h, &p);
// To estimate the radius of the top surface of the juice, we can't forget
// the radius at the bottom of the glass (it's not zero!).
double r3 = r2 + ((r1 - r2) * p) / h;
// The volume of the conical frustum can be calculated with a formula different
// from the posted one and we need to consider the height of the juice.
double v = M_PI * p * (r3 * r3 + r3 * r2 + r2 * r2) / 3.0;
printf("Volume : %f\n", v);
return 0;
}
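As a quick check against the value expected in the question (58.64 for input 5 2 3 2), the same two formulas can be wrapped in a function. This is only a sketch: the name juice_volume and the PI_VALUE macro are hypothetical, and the glass is assumed to be a conical frustum with the narrow end r2 at the bottom.

```c
#include <assert.h>

#define PI_VALUE 3.14159265358979323846

/* Volume of juice of height p in a frustum-shaped glass with top radius
 * r1, bottom radius r2 and total height h. */
double juice_volume(double r1, double r2, double h, double p)
{
    assert(h > 0.0 && p >= 0.0 && p <= h);
    /* Radius of the juice surface, interpolated linearly from bottom to top. */
    double r3 = r2 + (r1 - r2) * p / h;
    /* Frustum volume with radii r3 and r2 and height p. */
    return PI_VALUE * p * (r3 * r3 + r3 * r2 + r2 * r2) / 3.0;
}
```

For r1 = 5, r2 = 2, h = 3, p = 2 the surface radius is 4, and the volume is pi * 2 * (16 + 8 + 4) / 3, which is about 58.64.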

How to print some information in the terminal while running a graphics.h based C program?

This is my first question on Stack Overflow, so please forgive any mistakes in how I ask it. I am trying to learn to use the graphics.h library in C as part of my course curriculum, and I am having trouble printing some information to the terminal in Linux while using libgraph. The printf() function prints the given information in the libgraph window instead of the terminal, whereas I want it to print to the Linux terminal. Here is my code; the attached screenshot of its output (a DDA algorithm run) shows the printf problem:
#include<stdio.h>
#include<graphics.h>
//Function for finding absolute value
int abs (int n)
{
return ( (n>0) ? n : ( n * (-1)));
}
//DDA Function for line generation
void DDA(int X0, int Y0, int X1, int Y1)
{
// calculate dx & dy
int dx = X1 - X0;
int dy = Y1 - Y0;
// calculate steps required for generating pixels
int steps = abs(dx) > abs(dy) ? abs(dx) : abs(dy);
// calculate increment in x & y for each steps
float Xinc = dx / (float) steps;
float Yinc = dy / (float) steps;
// Put pixel for each step
float X = X0;
float Y = Y0;
for (int i = 0; i <= steps; i++)
{
printf("(%f,%f)",X,Y);
putpixel (X,Y,RED); // put pixel at (X,Y)
X += Xinc; // increment in x at each step
Y += Yinc; // increment in y at each step
delay(100); // for visualization of line-
// generation step by step
}
}
// Driver program
int main()
{
int gd = DETECT, gm;
// Initialize graphics function
initgraph (&gd, &gm, "");
int X0 = 2, Y0 = 2, X1 = 14, Y1 = 16;
DDA(2, 2, 100, 100);
getch();
return 0;
}
What I want is that printf to print in the Linux terminal instead of the libgraph window.
Some, if not all implementations of libgraph have this line in one of the header files:
#define printf grprintf
So they redefine printf with a macro, and you can't use it to print to the Linux terminal. But since they don't redefine other output functions, you can use, e.g.,
fprintf(stdout, "(%f,%f)", X, Y), fflush(stdout); // or stderr instead of stdout
or puts for constant strings.
Or, even simpler, you can #undef printf after the #include<graphics.h> to get back the normal behavior.
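The effect of such a macro, and of removing it with #undef, can be reproduced without libgraph. The sketch below is a stand-in under stated assumptions: grprintf_stub, redirected_calls, print_with_macro and print_after_undef are hypothetical names, and the stub only counts calls instead of drawing text in a window.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the library's replacement function; it only
 * counts calls so that the redirection is observable. */
int redirected_calls = 0;

static int grprintf_stub(const char *fmt, ...)
{
    (void)fmt;
    redirected_calls++;
    return 0;
}

/* Simulates the macro the header is described as defining. */
#define printf grprintf_stub

int print_with_macro(void)
{
    return printf("goes to the stub\n");  /* expands to grprintf_stub(...) */
}

#undef printf  /* from here on, printf is the <stdio.h> function again */

int print_after_undef(void)
{
    int n = printf("goes to stdout\n");   /* the real printf */
    fflush(stdout);
    return n;
}
```

Calling print_with_macro increments the counter and prints nothing, while print_after_undef writes to standard output and leaves the counter alone. The placement matters: the #undef has to come after the #include that defines the macro and before the code that calls printf.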

Eigenvalue calculation using TQLI algorithm fails with segmentation fault

I am trying to calculate eigenvalues using the TQLI algorithm that I got from the website of the CACS of the University of Southern California. My test script looks like this:
#include <stdio.h>
int main()
{
int i;
i = rand();
printf("My random number: %d\n", i);
float d[4] = {
{1, 2, 3, 4}
};
float e[4] = {
{0, 0, 0, 0}
};
float z[4][4] = {
{1.0, 0.0, 0.0, 0.0} ,
{0.0, 1.0, 0.0, 0.0} ,
{0.0, 0.0, 1.0, 0.0},
{0.0, 0.0, 0.0, 1.0}
};
double *zptr;
zptr = &z[0][0];
printf("Element [2][1] of identity matrix: %f\n", z[2][1]);
printf("Element [2][2] of identity matrix: %f\n", z[2][2]);
tqli(d, e, 4, zptr);
printf("First eigenvalue: %f\n", d[0]);
return 0;
}
When I try to run this script I get a segmentation fault, as you can see here. At what location does my code produce this segmentation fault? Since I believe the code from USC is bug-free, I am pretty sure the mistake must be in my call of the function. However, I can't see where I made a mistake in setting up the arrays; as far as I can tell, I followed the instructions.
The segmentation fault comes from accessing memory outside the bounds of the supplied arrays. tqli requires specific data preparation.
1) The eigen code from CACS is Fortran-based and counts indices from 1.
2) tqli expects a double ** (a pointer to row pointers) for its matrix and arrays of double for its vectors:
void tqli(double d[], double e[], int n, double **z)
d and e should be declared as double.
3) The program needs to be modified with respect to the data preparation for the above function.
Helper 1-index based vectors have to be created to supply properly formatted data for tqli:
double ze[NP][NP] = { {2, 0, 0}, {0, 4, 0}, {0, 0, 2} } ;
double **a;
double *d,*e,*f;
d=dvector(1,NP); // 1-index based vector
e=dvector(1,NP);
f=dvector(1,NP);
a=dmatrix(1,NP,1,NP); // 1-index based matrix
for (i=1;i<=NP;i++) // loading data from zero-based `ze` to `a`
for (j=1;j<=NP;j++) a[i][j]=ze[i-1][j-1];
Complete test program is supplied below. It uses the eigen code from CACS:
/*******************************************************************************
Eigenvalue solvers, tred2 and tqli, from "Numerical Recipes in C" (Cambridge
Univ. Press) by W.H. Press, S.A. Teukolsky, W.T. Vetterling, and B.P. Flannery
*******************************************************************************/
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#define NR_END 1
#define SIGN(a,b) ((b) >= 0.0 ? fabs(a) : -fabs(a))
double **dmatrix(int nrl, int nrh, int ncl, int nch)
/* allocate a double matrix with subscript range m[nrl..nrh][ncl..nch] */
{
int i,nrow=nrh-nrl+1,ncol=nch-ncl+1;
double **m;
/* allocate pointers to rows */
m=(double **) malloc((size_t)((nrow+NR_END)*sizeof(double*)));
m += NR_END;
m -= nrl;
/* allocate rows and set pointers to them */
m[nrl]=(double *) malloc((size_t)((nrow*ncol+NR_END)*sizeof(double)));
m[nrl] += NR_END;
m[nrl] -= ncl;
for(i=nrl+1;i<=nrh;i++) m[i]=m[i-1]+ncol;
/* return pointer to array of pointers to rows */
return m;
}
double *dvector(int nl, int nh)
/* allocate a double vector with subscript range v[nl..nh] */
{
double *v;
v=(double *)malloc((size_t) ((nh-nl+1+NR_END)*sizeof(double)));
return v-nl+NR_END;
}
/******************************************************************************/
void tred2(double **a, int n, double d[], double e[])
/*******************************************************************************
Householder reduction of a real, symmetric matrix a[1..n][1..n].
On output, a is replaced by the orthogonal matrix Q effecting the
transformation. d[1..n] returns the diagonal elements of the tridiagonal matrix,
and e[1..n] the off-diagonal elements, with e[1]=0. Several statements, as noted
in comments, can be omitted if only eigenvalues are to be found, in which case a
contains no useful information on output. Otherwise they are to be included.
*******************************************************************************/
{
int l,k,j,i;
double scale,hh,h,g,f;
for (i=n;i>=2;i--) {
l=i-1;
h=scale=0.0;
if (l > 1) {
for (k=1;k<=l;k++)
scale += fabs(a[i][k]);
if (scale == 0.0) /* Skip transformation. */
e[i]=a[i][l];
else {
for (k=1;k<=l;k++) {
a[i][k] /= scale; /* Use scaled a's for transformation. */
h += a[i][k]*a[i][k]; /* Form sigma in h. */
}
f=a[i][l];
g=(f >= 0.0 ? -sqrt(h) : sqrt(h));
e[i]=scale*g;
h -= f*g; /* Now h is equation (11.2.4). */
a[i][l]=f-g; /* Store u in the ith row of a. */
f=0.0;
for (j=1;j<=l;j++) {
/* Next statement can be omitted if eigenvectors not wanted */
a[j][i]=a[i][j]/h; /* Store u/H in ith column of a. */
g=0.0; /* Form an element of A.u in g. */
for (k=1;k<=j;k++)
g += a[j][k]*a[i][k];
for (k=j+1;k<=l;k++)
g += a[k][j]*a[i][k];
e[j]=g/h; /* Form element of p in temporarily unused element of e. */
f += e[j]*a[i][j];
}
hh=f/(h+h); /* Form K, equation (11.2.11). */
for (j=1;j<=l;j++) { /* Form q and store in e overwriting p. */
f=a[i][j];
e[j]=g=e[j]-hh*f;
for (k=1;k<=j;k++) /* Reduce a, equation (11.2.13). */
a[j][k] -= (f*e[k]+g*a[i][k]);
}
}
} else
e[i]=a[i][l];
d[i]=h;
}
/* Next statement can be omitted if eigenvectors not wanted */
d[1]=0.0;
e[1]=0.0;
/* Contents of this loop can be omitted if eigenvectors not
wanted except for statement d[i]=a[i][i]; */
for (i=1;i<=n;i++) { /* Begin accumulation of transformation matrices. */
l=i-1;
if (d[i]) { /* This block skipped when i=1. */
for (j=1;j<=l;j++) {
g=0.0;
for (k=1;k<=l;k++) /* Use u and u/H stored in a to form P.Q. */
g += a[i][k]*a[k][j];
for (k=1;k<=l;k++)
a[k][j] -= g*a[k][i];
}
}
d[i]=a[i][i]; /* This statement remains. */
a[i][i]=1.0; /* Reset row and column of a to identity matrix for next iteration. */
for (j=1;j<=l;j++) a[j][i]=a[i][j]=0.0;
}
}
/******************************************************************************/
void tqli(double d[], double e[], int n, double **z)
/*******************************************************************************
QL algorithm with implicit shifts, to determine the eigenvalues and eigenvectors
of a real, symmetric, tridiagonal matrix, or of a real, symmetric matrix
previously reduced by tred2 sec. 11.2. On input, d[1..n] contains the diagonal
elements of the tridiagonal matrix. On output, it returns the eigenvalues. The
vector e[1..n] inputs the subdiagonal elements of the tridiagonal matrix, with
e[1] arbitrary. On output e is destroyed. When finding only the eigenvalues,
several lines may be omitted, as noted in the comments. If the eigenvectors of
a tridiagonal matrix are desired, the matrix z[1..n][1..n] is input as the
identity matrix. If the eigenvectors of a matrix that has been reduced by tred2
are required, then z is input as the matrix output by tred2. In either case,
the kth column of z returns the normalized eigenvector corresponding to d[k].
*******************************************************************************/
{
double pythag(double a, double b);
int m,l,iter,i,k;
double s,r,p,g,f,dd,c,b;
for (i=2;i<=n;i++) e[i-1]=e[i]; /* Convenient to renumber the elements of e. */
e[n]=0.0;
for (l=1;l<=n;l++) {
iter=0;
do {
for (m=l;m<=n-1;m++) { /* Look for a single small subdiagonal element to split the matrix. */
dd=fabs(d[m])+fabs(d[m+1]);
if ((double)(fabs(e[m])+dd) == dd) break;
}
if (m != l) {
if (iter++ == 30) printf("Too many iterations in tqli");
g=(d[l+1]-d[l])/(2.0*e[l]); /* Form shift. */
r=pythag(g,1.0);
g=d[m]-d[l]+e[l]/(g+SIGN(r,g)); /* This is dm - ks. */
s=c=1.0;
p=0.0;
for (i=m-1;i>=l;i--) { /* A plane rotation as in the original QL, followed by Givens */
f=s*e[i]; /* rotations to restore tridiagonal form. */
b=c*e[i];
e[i+1]=(r=pythag(f,g));
if (r == 0.0) { /* Recover from underflow. */
d[i+1] -= p;
e[m]=0.0;
break;
}
s=f/r;
c=g/r;
g=d[i+1]-p;
r=(d[i]-g)*s+2.0*c*b;
d[i+1]=g+(p=s*r);
g=c*r-b;
/* Next loop can be omitted if eigenvectors not wanted */
for (k=1;k<=n;k++) { /* Form eigenvectors. */
f=z[k][i+1];
z[k][i+1]=s*z[k][i]+c*f;
z[k][i]=c*z[k][i]-s*f;
}
}
if (r == 0.0 && i >= l) continue;
d[l] -= p;
e[l]=g;
e[m]=0.0;
}
} while (m != l);
}
}
/******************************************************************************/
double pythag(double a, double b)
/*******************************************************************************
Computes (a2 + b2)1/2 without destructive underflow or overflow.
*******************************************************************************/
{
double absa,absb;
absa=fabs(a);
absb=fabs(b);
if (absa > absb) return absa*sqrt(1.0+(absb/absa)*(absb/absa));
else return (absb == 0.0 ? 0.0 : absb*sqrt(1.0+(absa/absb)*(absa/absb)));
}
#define NP 3
#define TINY 1.0e-6
/* sqrt() is provided by <math.h>; redefining it here is undefined behavior
and the bit-twiddling approximation is not valid for double. */
int main()
{
int i,j,k;
double ze[NP][NP] = { {2, 0, 0}, {0, 4, 0}, {0, 0, 2} } ;
double **a;
double *d,*e,*f;
d=dvector(1,NP);
e=dvector(1,NP);
f=dvector(1,NP);
a=dmatrix(1,NP,1,NP);
for (i=1;i<=NP;i++)
for (j=1;j<=NP;j++) a[i][j]=ze[i-1][j-1];
tred2(a,NP,d,e);
tqli(d,e,NP,a);
printf("\nEigenvectors for a real symmetric matrix:\n");
for (i=1;i<=NP;i++) {
for (j=1;j<=NP;j++) {
f[j]=0.0;
for (k=1;k<=NP;k++)
f[j] += (ze[j-1][k-1]*a[k][i]);
}
printf("%s %3d %s %10.6f\n","\neigenvalue",i," =",d[i]);
printf("%11s %14s %9s\n","vector","mtrx*vect.","ratio");
for (j=1;j<=NP;j++) {
if (fabs(a[j][i]) < TINY)
printf("%12.6f %12.6f %12s\n",
a[j][i],f[j],"div. by 0");
else
printf("%12.6f %12.6f %12.6f\n",
a[j][i],f[j],f[j]/a[j][i]);
}
}
//free_dmatrix(a,1,NP,1,NP);
//free_dvector(f,1,NP);
//free_dvector(e,1,NP);
//free_dvector(d,1,NP);
return 0;
}
Output:
Eigenvectors for a real symmetric matrix:
eigenvalue 1 = 2.000000
vector mtrx*vect. ratio
1.000000 2.000000 2.000000
0.000000 0.000000 div. by 0
0.000000 0.000000 div. by 0
eigenvalue 2 = 4.000000
vector mtrx*vect. ratio
0.000000 0.000000 div. by 0
1.000000 4.000000 4.000000
0.000000 0.000000 div. by 0
eigenvalue 3 = 2.000000
vector mtrx*vect. ratio
0.000000 0.000000 div. by 0
0.000000 0.000000 div. by 0
1.000000 2.000000 2.000000
I hope this finally helps to clarify the confusion regarding data preparation for tqli.

Finding the intersection of a closed irregular triangulated 3D surface with a Cartesian rectangular 3D grid

I am searching online for an efficient method that can intersect a Cartesian rectangular 3D grid with a closed, irregular, triangulated 3D surface.
This surface is represented as a set of vertices, V, and a set of faces, F. The Cartesian rectangular grid is stored as:
x_0, x_1, ..., x_(ni-1)
y_0, y_1, ..., y_(nj-1)
z_0, z_1, ..., z_(nk-1)
In the figure below, a single cell of the Cartesian grid is shown, together with a schematic of two triangles of the surface. The intersection is shown by dotted red lines, with solid red circles marking the intersection points with this particular cell. My goal is to find the points of intersection of the surface with the edges of the cells, which can be non-planar.
I will implement in either MATLAB, C, or C++.
Assuming we have a regular axis-aligned rectangular grid, with each grid cell matching the unit cube (and thus grid point (i,j,k) is at (i,j,k), with i,j,k integers), I would suggest trying a 3D variant of 2D triangle rasterization.
The basic idea is to draw the triangle perimeter, then every intersection between the triangle and each plane perpendicular to an axis and intersecting that axis at integer coordinates.
You end up with line segments on grid cell faces, wherever the triangle passes through a grid cell. The lines on the faces of each grid cell form a closed planar polygon. (However, you'll need to connect the line segments and orient the polygon yourself.)
If you only need to find out which grid cells the triangle passes through, a simplified approach with a bitmap (one bit per grid cell) can be used. This case is essentially just a 3D version of triangle rasterization.
The key observation is that if you have a line (X0,Y0,Z0)-(X1,Y1,Z1), you can split it into segments at integer coordinates xi along the x axis using
ti = (xi - X0) / (X1 - X0)
yi = (1 - ti) Y0 + ti Y1
zi = (1 - ti) Z0 + ti Z1
Similarly along the other axes, of course.
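The three formulas above can be sketched as a small helper for one plane. The names point3 and cross_x_plane are hypothetical, and the helper only handles planes perpendicular to the x axis; the other two axes follow by swapping coordinates.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { double x, y, z; } point3;

/* Finds the point where segment p0-p1 crosses the plane x = xi.
 * Returns 1 and stores the point in *out when the crossing lies within
 * the segment; returns 0 otherwise, including when the segment is
 * parallel to the plane. */
int cross_x_plane(point3 p0, point3 p1, double xi, point3 *out)
{
    assert(out != NULL);
    const double dx = p1.x - p0.x;
    if (dx == 0.0)
        return 0;
    const double t = (xi - p0.x) / dx;     /* ti in the text */
    if (t < 0.0 || t > 1.0)
        return 0;
    out->x = xi;
    out->y = (1.0 - t) * p0.y + t * p1.y;  /* yi */
    out->z = (1.0 - t) * p0.z + t * p1.z;  /* zi */
    return 1;
}
```

For the segment from (0.5, 0.25, 3.5) to (3.5, 4.0, 0.5) and the plane x = 1, t is 1/6 and the crossing point is approximately (1, 0.875, 3).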
You'll need to do three passes, one along each axis. If you sort the vertices so that the coordinates are nondecreasing along that axis, i.e. p0 ≤ p1 ≤ p2, one endpoint is at integer coordinates intersecting line p0p2, and the other endpoint intersects line p0p1 at small coordinates, and line p1p2 at large coordinates.
The intersection line between those endpoints is perpendicular to one axis, but it still needs to be split into segments that do not cross integer coordinates along the other two dimensions. This is fortunately simple; just maintain tj and tk along those two dimensions just like ti above, and step to the next integer coordinate that has the smaller t value; start at 0, and end at 1.
The edges of the original triangle also need to be drawn, just split along all three dimensions. Again, this is straightforward, by maintaining the t for each axis, and stepping along the axis with the smallest value. I have example code in C99 for this, the most complicated case, below.
There are quite a few implementation details to consider.
Because each cell shares each face with another cell, and each edge with three other cells, let's define the following properties for each cell (i,j,k), where i,j,k are integers identifying the cell:
X face: The cell face at x=i, perpendicular to the x axis
Y face: The cell face at y=j, perpendicular to the y axis
Z face: The cell face at z=k, perpendicular to the z axis
X edge: The edge from (i,j,k) to (i+1,j,k)
Y edge: The edge from (i,j,k) to (i,j+1,k)
Z edge: The edge from (i,j,k) to (i,j,k+1)
The other three faces for cell (i,j,k) are
the X face at (i+1,j,k)
the Y face at (i,j+1,k)
the Z face at (i,j,k+1)
Similarly, each edge is an edge for three other cells. Since cell (i,j,k) spans from i to i+1 along x (and likewise along y and z), its X, Y and Z edges lie on its low-coordinate corner. The X edge of cell (i,j,k) is therefore also an edge for grid cells (i,j-1,k), (i,j,k-1), and (i,j-1,k-1). The Y edge of cell (i,j,k) is also an edge for grid cells (i-1,j,k), (i,j,k-1), and (i-1,j,k-1). The Z edge of cell (i,j,k) is also an edge for grid cells (i-1,j,k), (i,j-1,k), and (i-1,j-1,k).
Here is an image that might help.
(Ignore the fact that it's left-handed; I just thought it'd be easier to label this way.)
This means that if you have a line segment on a specific grid cell face, the line segment is shared between the two grid cells sharing that face. Similarly, if a line segment endpoint is on a grid cell edge, there are four different grid cell faces the other line segment endpoint could be on.
To clarify this, my example code below prints not only the coordinates, but the grid cell and the face/edge/vertex the line segment endpoint is on.
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <errno.h>
#include <math.h>
typedef struct {
double x;
double y;
double z;
} vector;
typedef struct {
long x;
long y;
long z;
} gridpos;
typedef enum {
INSIDE = 0, /* Point is inside the grid cell */
X_FACE = 1, /* Point is at integer X coordinate (on the YZ face) */
Y_FACE = 2, /* Point is at integer Y coordinate (on the XZ face) */
Z_EDGE = 3, /* Point is at integer X and Y coordinates (on the Z edge) */
Z_FACE = 4, /* Point is at integer Z coordinate (on the XY face) */
Y_EDGE = 5, /* Point is at integer X and Z coordinates (on the Y edge) */
X_EDGE = 6, /* Point is at integer Y and Z coordinates (on the X edge) */
VERTEX = 7, /* Point is at integer coordinates (at the grid point) */
} cellpos;
static inline cellpos cellpos_of(const vector v)
{
return (v.x == floor(v.x))
+ (v.y == floor(v.y)) * 2
+ (v.z == floor(v.z)) * 4;
}
static const char *const face_name[8] = {
"inside",
"x-face",
"y-face",
"z-edge",
"z-face",
"y-edge",
"x-edge",
"vertex",
};
static int line_segments(const vector p0, const vector p1,
int (*segment)(void *custom,
const gridpos src_cell, const cellpos src_face, const vector src_vec,
const gridpos dst_cell, const cellpos dst_face, const vector dst_vec),
void *const custom)
{
const vector range = { p1.x - p0.x, p1.y - p0.y, p1.z - p0.z };
const gridpos step = { (range.x < 0.0) ? -1L : (range.x > 0.0) ? +1L : 0L,
(range.y < 0.0) ? -1L : (range.y > 0.0) ? +1L : 0L,
(range.z < 0.0) ? -1L : (range.z > 0.0) ? +1L : 0L };
const gridpos end = { floor(p1.x), floor(p1.y), floor(p1.z) };
gridpos prev_cell, curr_cell = { floor(p0.x), floor(p0.y), floor(p0.z) };
vector prev_vec, curr_vec = p0;
vector curr_at = { 0.0, 0.0, 0.0 };
vector next_at = { (range.x != 0.0 && curr_cell.x != end.x) ? ((double)(curr_cell.x + step.x) - p0.x) / range.x : 1.0,
(range.y != 0.0 && curr_cell.y != end.y) ? ((double)(curr_cell.y + step.y) - p0.y) / range.y : 1.0,
(range.z != 0.0 && curr_cell.z != end.z) ? ((double)(curr_cell.z + step.z) - p0.z) / range.z : 1.0};
cellpos prev_face, curr_face;
double at;
int retval;
curr_face = cellpos_of(p0);
while (curr_at.x < 1.0 || curr_at.y < 1.0 || curr_at.z < 1.0) {
prev_cell = curr_cell;
prev_face = curr_face;
prev_vec = curr_vec;
if (next_at.x < 1.0 && next_at.x <= next_at.y && next_at.x <= next_at.z) {
/* YZ plane */
at = next_at.x;
curr_vec.x = round( (1.0 - at) * p0.x + at * p1.x );
curr_vec.y = (1.0 - at) * p0.y + at * p1.y;
curr_vec.z = (1.0 - at) * p0.z + at * p1.z;
} else
if (next_at.y < 1.0 && next_at.y < next_at.x && next_at.y <= next_at.z) {
/* XZ plane */
at = next_at.y;
curr_vec.x = (1.0 - at) * p0.x + at * p1.x;
curr_vec.y = round( (1.0 - at) * p0.y + at * p1.y );
curr_vec.z = (1.0 - at) * p0.z + at * p1.z;
} else
if (next_at.z < 1.0 && next_at.z < next_at.x && next_at.z < next_at.y) {
/* XY plane */
at = next_at.z;
curr_vec.x = (1.0 - at) * p0.x + at * p1.x;
curr_vec.y = (1.0 - at) * p0.y + at * p1.y;
curr_vec.z = round( (1.0 - at) * p0.z + at * p1.z );
} else {
at = 1.0;
curr_vec = p1;
}
curr_face = cellpos_of(curr_vec);
curr_cell.x = floor(curr_vec.x);
curr_cell.y = floor(curr_vec.y);
curr_cell.z = floor(curr_vec.z);
retval = segment(custom,
prev_cell, prev_face, prev_vec,
curr_cell, curr_face, curr_vec);
if (retval)
return retval;
if (at < 1.0) {
curr_at = next_at;
if (at >= next_at.x) {
/* recalc next_at.x */
if (curr_cell.x != end.x) {
next_at.x = ((double)(curr_cell.x + step.x) - p0.x) / range.x;
if (next_at.x > 1.0)
next_at.x = 1.0;
} else
next_at.x = 1.0;
}
if (at >= next_at.y) {
/* recalc next_at.y */
if (curr_cell.y != end.y) {
next_at.y = ((double)(curr_cell.y + step.y) - p0.y) / range.y;
if (next_at.y > 1.0)
next_at.y = 1.0;
} else
next_at.y = 1.0;
}
if (at >= next_at.z) {
/* recalc next_at.z */
if (curr_cell.z != end.z) {
next_at.z = ((double)(curr_cell.z + step.z) - p0.z) / range.z;
if (next_at.z > 1.0)
next_at.z = 1.0;
} else
next_at.z = 1.0;
}
} else {
curr_at.x = curr_at.y = curr_at.z = 1.0;
next_at.x = next_at.y = next_at.z = 1.0;
}
}
return 0;
}
int print_segment(void *outstream,
const gridpos src_cell, const cellpos src_face, const vector src_vec,
const gridpos dst_cell, const cellpos dst_face, const vector dst_vec)
{
FILE *const out = outstream ? outstream : stdout;
fprintf(out, "%.6f %.6f %.6f %.6f %.6f %.6f %s %ld %ld %ld %s %ld %ld %ld\n",
src_vec.x, src_vec.y, src_vec.z,
dst_vec.x, dst_vec.y, dst_vec.z,
face_name[src_face], src_cell.x, src_cell.y, src_cell.z,
face_name[dst_face], dst_cell.x, dst_cell.y, dst_cell.z);
fflush(out);
return 0;
}
static int parse_vector(const char *s, vector *const v)
{
double x, y, z;
char c;
if (!s)
return EINVAL;
if (sscanf(s, " %lf %*[,/:;] %lf %*[,/:;] %lf %c", &x, &y, &z, &c) == 3) {
if (v) {
v->x = x;
v->y = y;
v->z = z;
}
return 0;
}
return ENOENT;
}
int main(int argc, char *argv[])
{
vector start, end;
if (argc != 3 || !strcmp(argv[1], "-h") || !strcmp(argv[1], "--help")) {
fprintf(stderr, "\n");
fprintf(stderr, "Usage: %s [ -h | --help ]\n", argv[0]);
fprintf(stderr, " %s x0:y0:z0 x1:y1:z1\n", argv[0]);
fprintf(stderr, "\n");
return EXIT_FAILURE;
}
if (parse_vector(argv[1], &start)) {
fprintf(stderr, "%s: Invalid start point.\n", argv[1]);
return EXIT_FAILURE;
}
if (parse_vector(argv[2], &end)) {
fprintf(stderr, "%s: Invalid end point.\n", argv[2]);
return EXIT_FAILURE;
}
if (line_segments(start, end, print_segment, stdout))
return EXIT_FAILURE;
return EXIT_SUCCESS;
}
The program takes two command-line parameters, the 3D endpoints for the line to be segmented. If you compile the above to say example, then running
./example 0.5/0.25/3.50 3.5/4.0/0.50
outputs
0.500000 0.250000 3.500000 1.000000 0.875000 3.000000 inside 0 0 3 x-face 1 0 3
1.000000 0.875000 3.000000 1.100000 1.000000 2.900000 x-face 1 0 3 y-face 1 1 2
1.100000 1.000000 2.900000 1.900000 2.000000 2.100000 y-face 1 1 2 y-face 1 2 2
1.900000 2.000000 2.100000 2.000000 2.125000 2.000000 y-face 1 2 2 y-edge 2 2 2
2.000000 2.125000 2.000000 2.700000 3.000000 1.300000 y-edge 2 2 2 y-face 2 3 1
2.700000 3.000000 1.300000 3.000000 3.375000 1.000000 y-face 2 3 1 y-edge 3 3 1
3.000000 3.375000 1.000000 3.500000 4.000000 0.500000 y-edge 3 3 1 y-face 3 4 0
which shows that line (0.5, 0.25, 3.50) - (3.5, 4.0, 0.50) gets split into seven segments; this particular line passing through exactly seven grid cells.
For the rasterization case (when you are only interested in which grid cells the surface triangles pass through), you do not need to store the line segment points; you only need to compute them all. When a point is at a vertex or inside a grid cell, mark the bit corresponding to that grid cell. When a point is at a face, set the bit for the two grid cells that share that face. When a line segment endpoint is at an edge, set the bit for the four grid cells that share that edge.
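A minimal sketch of that marking rule follows. All names (cell_bitmap, mark_point, cell_is_marked and the helpers) are hypothetical, the caller is assumed to supply a zeroed buffer of at least (nx*ny*nz + 7) / 8 bytes, and neighbours are taken on the lower side of each integer coordinate, which matches cells being identified by the floor of the point coordinates as in the example program.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

typedef struct {
    long nx, ny, nz;       /* grid size in cells */
    unsigned char *bits;   /* caller-provided, zero-initialized */
} cell_bitmap;

/* floor() for values that fit in a long, without needing <math.h>. */
static long floor_to_long(double v)
{
    long l = (long)v;                    /* truncates toward zero */
    return ((double)l > v) ? l - 1 : l;
}

static size_t bit_index(const cell_bitmap *bm, long i, long j, long k)
{
    return ((size_t)k * (size_t)bm->ny + (size_t)j) * (size_t)bm->nx + (size_t)i;
}

static void mark_cell(cell_bitmap *bm, long i, long j, long k)
{
    if (i < 0 || j < 0 || k < 0 || i >= bm->nx || j >= bm->ny || k >= bm->nz)
        return;                          /* outside the grid: nothing to mark */
    const size_t n = bit_index(bm, i, j, k);
    bm->bits[n / CHAR_BIT] |= (unsigned char)(1u << (n % CHAR_BIT));
}

int cell_is_marked(const cell_bitmap *bm, long i, long j, long k)
{
    assert(i >= 0 && j >= 0 && k >= 0 && i < bm->nx && j < bm->ny && k < bm->nz);
    const size_t n = bit_index(bm, i, j, k);
    return (bm->bits[n / CHAR_BIT] >> (n % CHAR_BIT)) & 1;
}

/* Marks the cells touched by a line segment endpoint (x, y, z). */
void mark_point(cell_bitmap *bm, double x, double y, double z)
{
    const long i = floor_to_long(x), j = floor_to_long(y), k = floor_to_long(z);
    const int on_x = ((double)i == x), on_y = ((double)j == y), on_z = ((double)k == z);
    const int count = on_x + on_y + on_z;

    if (count == 0 || count == 3) {      /* inside a cell, or at a grid vertex */
        mark_cell(bm, i, j, k);
        return;
    }
    /* On a face (2 cells) or an edge (4 cells): step back by one cell
     * along every axis whose coordinate is an integer. */
    for (long di = 0; di <= on_x; di++)
        for (long dj = 0; dj <= on_y; dj++)
            for (long dk = 0; dk <= on_z; dk++)
                mark_cell(bm, i - di, j - dj, k - dk);
}
```

Calling mark_point for both endpoints of every segment reported by line_segments would fill the bitmap. Exact comparison of floating-point coordinates is only reliable here because line_segments rounds the coordinate on the crossed plane; other inputs may need a tolerance.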
Questions?
Break the problem down into smaller steps.
Finding the intersection of a line segment with a triangle is easy.
Once you have that implemented, just perform a nested loop that checks for intersection between every combination of lines from the grid with triangles of the surface.
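The answer does not say which segment-triangle test to use. One common choice is the Möller–Trumbore algorithm, sketched below for a segment rather than a ray. The type and function names are hypothetical, and eps is a caller-chosen tolerance for treating the segment as parallel to the triangle.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { double x, y, z; } vec3;

static vec3 v_sub(vec3 a, vec3 b)
{
    vec3 r = { a.x - b.x, a.y - b.y, a.z - b.z };
    return r;
}

static vec3 v_cross(vec3 a, vec3 b)
{
    vec3 r = { a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x };
    return r;
}

static double v_dot(vec3 a, vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Tests segment p0-p1 against triangle (a, b, c). Returns 1 when they
 * intersect and, if t_out is not NULL, stores the parameter t in [0, 1]
 * such that the intersection point is p0 + t * (p1 - p0). */
int segment_hits_triangle(vec3 p0, vec3 p1, vec3 a, vec3 b, vec3 c,
                          double eps, double *t_out)
{
    assert(eps >= 0.0);
    const vec3 dir = v_sub(p1, p0);
    const vec3 e1 = v_sub(b, a), e2 = v_sub(c, a);
    const vec3 pv = v_cross(dir, e2);
    const double det = v_dot(e1, pv);
    if (det >= -eps && det <= eps)
        return 0;                        /* parallel, or degenerate input */
    const double inv = 1.0 / det;
    const vec3 tv = v_sub(p0, a);
    const double u = v_dot(tv, pv) * inv;   /* first barycentric coordinate */
    if (u < 0.0 || u > 1.0)
        return 0;
    const vec3 qv = v_cross(tv, e1);
    const double v = v_dot(dir, qv) * inv;  /* second barycentric coordinate */
    if (v < 0.0 || u + v > 1.0)
        return 0;
    const double t = v_dot(e2, qv) * inv;   /* position along the segment */
    if (t < 0.0 || t > 1.0)
        return 0;
    if (t_out != NULL)
        *t_out = t;
    return 1;
}
```

In the nested loop, p0 and p1 would be the two endpoints of a grid cell edge. Testing every edge against every triangle is quadratic in the worst case, so restricting each triangle to the grid edges inside its bounding box is a natural refinement.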
Use Line-Plane intersection as explained here on each surface triangle edge. You'll use six planes, one for each grid cell face.
