Fast Fourier Transform Optimization - How To Pre Calculate Exponential?

I think I understand the basics of the discrete Fourier transform, but I have some problems with the fast Fourier transform algorithm.
I don't want to share all the functions, so as not to complicate the problem, but if some parts are unclear I can edit the question.
Slow Fourier transform:
void slowft (float *x, COMPLEX *y, int n)
{
COMPLEX tmp, z1, z2, z3, z4;
int m, k;
/* Constant factor -2 pi */
cmplx (0.0, (float)(atan (1.0)/n * -8.0), &tmp);
printf (" constant factor -2 pi %f ", (float)(atan (1.0)/n * -8.0));
for (m = 0; m<=n; m++)
{
cmplx (x[0], 0.0, &(y[m]));
for (k=1; k<=n-1; k++)
{
/* Exp (tmp*k*m) */
cmplx ((float)k, 0.0, &z2);
cmult (tmp, z2, &z3);
cmplx ((float)m, 0.0, &z2);
cmult (z2, z3, &z4);
cexp (z4, &z2);
/* *x[k] */
cmplx (x[k], 0.0, &z3);
cmult (z2, z3, &z4);
/* + y[m] */
csum (y[m], z4, &z2);
y[m].real = z2.real; y[m].imag = z2.imag;
}
}
}
To make it clear:
cmplx creates a complex number, cmult is complex multiplication, and cexp computes the complex exponential. That's all.
And with some optimizations:
void newslowft (double *x, COMPLEX *y, int n)
{
COMPLEX tmp, z1, z2, z3, z4, *pre;
long m, k, i, p;
pre = (COMPLEX *)malloc(sizeof(*pre) * n); /* one table entry per power 0..n-1 */
/* Constant factor -2 pi */
cmplx (0.0, atan (1.0)/n * -8.0, &z1);
cexp (z1, &tmp);
/* Pre-compute most of the exponential */
cmplx (1.0, 0.0, &z1); /* Z1 = 1.0; */
//n=1024
for (i=0; i<n; i++)
{
cmplx (z1.real, z1.imag, &(pre[i]));
cmult (z1, tmp, &z3);
cmplx (z3.real, z3.imag, &z1);
}
/* Double loop to compute all Y entries */
for (m = 0; m<n; m++)
{
cmplx (x[0], 0.0, &(y[m]));
for (k=1; k<=n-1; k++)
{
/* Exp (tmp*k*m) */
p = (k*m % n);
/* *x[k] */
cmplx (x[k], 0.0, &z3);
cmult (z3, pre[p], &z4);
/* + y[m] */
csum (y[m], z4, &z2);
y[m].real = z2.real;
y[m].imag = z2.imag;
}
}
free(pre); /* release the table of pre-computed exponentials */
}
The problem is the first step of the optimization:
"precalculating some exponentials inside the loop".
So this is what I am actually asking: how does this algorithm calculate all of the exponentials?
I think we are calculating the following exponentials: e^0, e^1, e^2, ..., e^1023. So where are the other exponentials?
I mean, in the first algorithm, inside the for loops we use m (m=0; m<=1024; m++) and k (k=1; k<=1023; k++), but where is e^(1000*900)?
As far as I understand, the second algorithm takes the product k*m modulo n. I think this is the key point, right? But I don't understand how it works.
Thanks in advance.

Related

Discrete Fourier Transform With C, Implementation problems?

I'm trying to understand some basics of the DFT and its math, and to implement it in C.
This is the function I used, from a book (Algorithms for Image Processing and Computer Vision):
void slowft (float *x, COMPLEX *y, int n)
{
COMPLEX tmp, z1, z2, z3, z4;
int m, k;
/* Constant factor -2 pi */
cmplx (0.0, (float)(atan (1.0)/n * -8.0), &tmp);
printf (" constant factor -2 pi %f ", (float)(atan (1.0)/n * -8.0));
for (m = 0; m<=n; m++)
{
NEXT();
cmplx (x[0], 0.0, &(y[m]));
for (k=1; k<=n-1; k++)
{
/* Exp (tmp*k*m) */
cmplx ((float)k, 0.0, &z2);
cmult (tmp, z2, &z3);
cmplx ((float)m, 0.0, &z2);
cmult (z2, z3, &z4);
cexp (z4, &z2);
/* *x[k] */
cmplx (x[k], 0.0, &z3);
cmult (z2, z3, &z4);
/* + y[m] */
csum (y[m], z4, &z2);
y[m].real = z2.real; y[m].imag = z2.imag;
}
}
}
I'm stuck on the constant factor part. I don't understand:
1) where it comes from (especially atan(1)), and
2) what its purpose is.
This is the equation of the DFT:
$$X(\omega) = \sum_{k=0}^{n-1} x(k)\, e^{-j 2\pi \omega k / n}$$
And these are the other functions that I used:
void cexp (COMPLEX z1, COMPLEX *res)
{
COMPLEX x, y;
x.real = exp((double)z1.real);
x.imag = 0.0;
y.real = (float)cos((double)z1.imag);
y.imag = (float)sin((double)z1.imag);
cmult (x, y, res);
}
void cmult (COMPLEX z1, COMPLEX z2, COMPLEX *res)
{
res->real = z1.real*z2.real - z1.imag*z2.imag;
res->imag = z1.real*z2.imag + z1.imag*z2.real;
}
void csum (COMPLEX z1, COMPLEX z2, COMPLEX *res)
{
res->real = z1.real + z2.real;
res->imag = z1.imag + z2.imag;
}
void cmplx (float rp, float ip, COMPLEX *z)
{
z->real = rp;
z->imag = ip;
}
float cnorm (COMPLEX z)
{
return z.real*z.real + z.imag*z.imag;
}

1) where it comes from (especially atan(1))
The code comment immediately above clues you in:
/* Constant factor -2 pi */
... although actually what is being computed is -2 pi / n (in the broader context of producing a complex number with that as the coefficient of its imaginary component). Observe that the tangent has value 1 for angles whose sine and cosine are equal. The angle that has that property and is in the range [0, pi) is pi / 4, so atan(1.0) * -8.0 is (a good approximation to) -2 pi.
2) what its purpose is
It (or actually its additive inverse) appears in the DFT equation you presented, so it is natural that it appears in a function intended to implement that formula.

Here is the code with comments explaining it.
void slowft (float *x, COMPLEX *y, int n)
{
COMPLEX tmp, z1, z2, z3, z4;
int m, k;
/* Constant factor -2 pi */
cmplx (0.0, (float)(atan (1.0)/n * -8.0), &tmp);
/* atan(1) is π/4, so this sets tmp to -2πi/n. Note that the i
factor, the imaginary unit, comes from putting the expression in
the second argument, which gives the imaginary portion of the
complex number being assigned. (It is written as "j" in the
equation displayed in the question. That is because engineers use
"j" for i, having historically already used "i" for other purposes.)
*/
printf (" constant factor -2 pi %f ", (float)(atan (1.0)/n * -8.0));
for (m = 0; m<=n; m++)
{
NEXT();
// Well, that is a frightening thing to see in code. It is cryptic.
cmplx (x[0], 0.0, &(y[m]));
/* This starts to calculate a sum that will be accumulated in y[m].
The sum will be over k from 0 to n-1. For the first term, k is 0,
so -2πiωk/n will be 0. The coefficient is e to the power of that,
and e**0 is 1, so the first term is x[0] * 1, so we just put x[0]
directly in y[m] with no multiplication.
*/
for (k=1; k<=n-1; k++)
// This adds the rest of the terms.
{
/* Exp (tmp*k*m) */
cmplx ((float)k, 0.0, &z2);
// This sets z2 to k.
cmult (tmp, z2, &z3);
/* This multiplies the -2πi/n from above by k, putting
-2πik/n in z3.
*/
cmplx ((float)m, 0.0, &z2);
// This sets z2 to m. m corresponds to the ω in the equation.
cmult (z2, z3, &z4);
// This multiplies m by -2πik/n, putting -2πiωk/n in z4.
cexp (z4, &z2);
/* This raises e to the power of -2πiωk/n, finishing the
coefficient of the term in the sum.
*/
/* *x[k] */
cmplx (x[k], 0.0, &z3);
// This sets z3 to x[k].
cmult (z2, z3, &z4);
// This multiplies x[k] by the coefficient, e**(-2πiωk/n).
/* + y[m] */
csum (y[m], z4, &z2);
/* This adds the term (z4) to the sum being accumulated (y[m])
and puts the updated sum in z2.
*/
y[m].real = z2.real; y[m].imag = z2.imag;
/* This moves the updated sum to y[m]. This is not necessary
because csum is passed its operands as values, so they are
copied when calling the function, and it is safe to update its
output. csum(y[m], z4, &y[m]) above would have worked. But
this works too.
*/
}
}
}
Standard C has support for complex arithmetic, so it would be easier and clearer to include <complex.h> and write code this way:
void slowft(float *x, complex float *y, int n)
{
static const float TwoPi = 0x3.243f6a8885a308d313198a2e03707344ap1f;
float t0 = -TwoPi/n;
for (int m = 0; m <=n; m++)
{
float t1 = t0*m;
y[m] = x[0];
for (int k = 1; k < n; k++)
y[m] += x[k] * cexpf(t1 * k * I);
}
}

Mysterious problem with minimal ray tracer in C

I'm working on a minimal ray tracer in C. I wrote a ray tracer a little while ago, so I understand the theory behind them; I just wanted to do a rewrite for cleanup purposes.
I have the necessary elements for a ray tracer, and nothing more. I've written triangle intersection, transforming pixel space coordinates to NDC (with aspect ratio and FOV accounted for), and writing out the frame buffer.
However, it does not work as expected. The image is entirely black when it should be rendering a single triangle. I've tested writing a single test pixel, and it works fine so I know it isn't an issue with the image writing code.
I've double- and triple-checked the code behind the math, and it looks fine to me. The intersection code is basically a duplicate of the source code in the original Möller-Trumbore paper:
/* ray triangle intersection */
bool ray_triangle_intersect(double orig[3], double dir[3], double vert0[3],
double vert1[3], double vert2[3], double* t, double* u, double* v) {
double edge1[3], edge2[3];
double tvec[3], pvec[3], qvec[3];
double det, inv_det;
/* edges */
SUB(edge1, vert1, vert0);
SUB(edge2, vert2, vert0);
/* determinant */
CROSS(pvec, dir, edge2);
/* ray in plane of triangle if near zero */
det = DOT(edge1, pvec);
if(det < EPSILON)
return 0;
SUB(tvec, orig, vert0);
inv_det = 1.0 / det;
/* calculate, check bounds */
*u = DOT(tvec, pvec) * inv_det;
if(*u < 0.0 || *u > 1.0)
return 0;
CROSS(qvec, tvec, edge1);
/* calculate, check bounds */
*v = DOT(dir, qvec) * inv_det;
if(*v < 0.0 || *u + *v > 1.0)
return 0;
*t = DOT(edge2, qvec) * inv_det;
return 1;
}
CROSS, DOT, and SUB are just macros:
#define CROSS(v,v0,v1) \
v[0] = v0[1] * v1[2] - v0[2] * v1[1]; \
v[1] = v0[2] * v1[0] - v0[0] * v1[2]; \
v[2] = v0[0] * v1[1] - v0[1] * v1[0];
#define DOT(v0,v1) (v0[0] * v1[0] + v0[1] * v1[1] + v0[2] * v1[2])
/* v = v0 - v1 */
#define SUB(v,v0,v1) \
v[0] = v0[0] - v1[0]; \
v[1] = v0[1] - v1[1]; \
v[2] = v0[2] - v1[2];
Transformation code is as follows:
double ndc[2];
screen_to_ndc(x, y, &ndc[0], &ndc[1]);
double dir[3];
dir[0] = ndc[0] * ar * tfov;
dir[1] = ndc[1] * tfov;
dir[2] = -1;
norm(dir);
And screen_to_ndc:
void screen_to_ndc(unsigned int x, unsigned int y, double* ndcx, double* ndcy) {
*ndcx = 2 * (((double) x + (1.0 / 2.0)) / (double) WIDTH) - 1;
*ndcy = 1 - 2 * (((double) y + (1.0 / 2.0)) / (double) HEIGHT);
}
Any help would be appreciated.

Try reversing the orientation of your triangle. Your ray-triangle intersection code culls backfaces because it returns early when det is negative.

Ray tracing a Hemisphere

I am currently working on a basic ray tracing program in C. I have managed to do some simple shapes (sphere, box, plane, cone, ...), and I have also shaded them using Phong illumination.
My question is that I can't get the hang of how to ray trace a hemisphere. Is there a set equation that defines a hemisphere? If so, please enlighten me, because I couldn't find one. Or is there a standard method that I couldn't figure out?
I have also tried to cut the sphere with a plane and show only the top half, but it didn't work (I am still new to all this, so my understanding may be wrong).
Edit: I am sorry, I am really new to all this, but here is what I have tried.
#include "raytacing.h"
t_env *init_sphere(t_env *e)
{
//sphere position and radius
e->sph.posi.x = 0;
e->sph.posi.y = 0;
e->sph.posi.z = -1;
e->sph.rad = 0;
e->sph.color = (t_color){255, 255, 128};
return (e);
}
t_env *init_plane(t_env *e)
{
//plane position
e->plane.posi.x = 0;
e->plane.posi.y = -0.5;
e->plane.posi.z = 0;
//plane normal
e->plane.normal.x = 0;
e->plane.normal.y = 1;
e->plane.normal.z = 0;
e->plane.color = (t_color){0, 255, 0};
return (e);
}
double inter_plane(t_env *e, double *t) //calculating plane intersection
{
t_vect dist;
double norm;
norm = dot(e->plane.normal, e->r.direction);
if (fabs(norm) > 1e-6)
{
dist = vect_sub(e->plane.posi, e->r.start);
e->t0 = dot(dist, e->plane.normal) / norm;
if (e->t0 < *t && e->t0 > 1e-6)
{
*t = e->t0;
return (1);
}
else
return (0);
}
return (0);
}
double inter_sph(t_env *e, double *t) //calculating sphere intersection
{
double delta;
double sqrtd;
t_vect dist;
e->a = dot(e->r.direction, e->r.direction);
dist = vect_sub(e->r.start, e->sph.posi);
e->b = 2 * dot(dist, e->r.direction);
e->c = dot(dist, dist) - e->sph.rad * e->sph.rad;
delta = e->b * e->b - 4 * e->a * e->c;
if (delta < 0)
return (0);
sqrtd = sqrt(delta);
e->t0 = (-e->b + sqrtd) / (2 * e->a);
e->t1 = (-e->b - sqrtd) / (2 * e->a);
if (e->t0 > e->t1)
e->t0 = e->t1;
if ((e->t0 > 1e-6) && (e->t0 < *t))
{
*t = e->t0;
return (1);
}
else
return (0);
}
double inter_hemisphere(t_env *e) //calculating hemisphere intersection
{
t_vect hit_normal;
if (inter_sph(e, &e->t) == 1)
{
hit_normal = vect_add(e->r.start, vect_scalaire(e->t, e->r.direction));
hit_normal = vect_normalize(hit_normal);
if (inter_plane(e, &(e->t)) == 1)
{
if (dot(e->plane.normal, hit_normal) < 0)
return (1);
return (0);
}
}
return (0);
}
e->t is supposed to be the closest distance to the camera, so that near and far objects are displayed correctly.
Here I tried to apply what Spektre said and got something displayed that looks like this:
And when I try to rotate it I get this:
Edit 2: After using Spektre's method I got a working hemisphere intersection. The code looks like this:
double inter_hemisphere(t_env *e, double *t)
{
double delta;
double sqrtd;
t_vect dist;
e->a = dot(e->r.direction, e->r.direction);
dist = vect_sub(e->r.start, e->sph.posi);
e->b = 2 * dot(dist, e->r.direction);
e->c = dot(dist, dist) - e->sph.rad * e->sph.rad;
delta = e->b * e->b - 4 * e->a * e->c;
if (delta < 0)
return (0);
sqrtd = sqrt(delta);
e->t0 = (-e->b + sqrtd) / (2 * e->a);
e->t1 = (-e->b - sqrtd) / (2 * e->a);
t_vect v2;
v2 = vect_add(e->r.start, vect_sub(vect_scalaire(e->t0, e->r.direction), e->sph.posi));
if (dot(e->plane.normal, v2) > 0.0)
e->t0 =-1.0;
v2 = vect_add(e->r.start, vect_sub(vect_scalaire(e->t1, e->r.direction), e->sph.posi));
if (dot(e->plane.normal, v2) > 0.0)
e->t1 =-1.0;
if (e->t0 < 0.0)
e->t0 = e->t1;
if (e->t1 < 0.0)
e->t1 = e->t0;
double tt;
tt = fmin(e->t0, e->t1);
if (tt <= 0.0)
tt = fmax(e->t0, e->t1);
if (tt > 1e-6 && tt < e->t)
{
*t = tt;
return (1);
}
return (0);
}
And here is the Result:

The simplest way is to cut your sphere with a plane.
If you have the plane normal, then any hit whose direction (point on sphere - sphere center) points to the same side as the normal is cut off. Simply test this condition:
dot(point on sphere - sphere center , plane normal ) > 0.0
But do not forget to test both intersections of the ray and the sphere, as the closest one can be on the cut-off side of the plane ...
I tried to implement this in my GLSL ray tracer:
Reflection and refraction impossible without recursive ray tracing?
and came up with these updated shaders:
Vertex (no change):
//------------------------------------------------------------------
#version 420 core
//------------------------------------------------------------------
uniform float aspect;
uniform float focal_length;
uniform mat4x4 tm_eye;
layout(location=0) in vec2 pos;
out smooth vec2 txt_pos; // frag position on screen <-1,+1> for debug prints
out smooth vec3 ray_pos; // ray start position
out smooth vec3 ray_dir; // ray start direction
//------------------------------------------------------------------
void main(void)
{
vec4 p;
txt_pos=pos;
// perspective projection
p=tm_eye*vec4(pos.x/aspect,pos.y,0.0,1.0);
ray_pos=p.xyz;
p-=tm_eye*vec4(0.0,0.0,-focal_length,1.0);
ray_dir=normalize(p.xyz);
gl_Position=vec4(pos,0.0,1.0);
}
//------------------------------------------------------------------
Fragment (added hemispheres):
//------------------------------------------------------------------
#version 420 core
//------------------------------------------------------------------
// Ray tracer ver: 1.000
//------------------------------------------------------------------
in smooth vec3 ray_pos; // ray start position
in smooth vec3 ray_dir; // ray start direction
uniform float n0; // refractive index of camera origin
uniform int fac_siz; // square texture x,y resolution size
uniform int fac_num; // number of valid floats in texture
uniform sampler2D fac_txr; // scene mesh data texture
out layout(location=0) vec4 frag_col;
//---------------------------------------------------------------------------
#define _reflect
#define _refract
//---------------------------------------------------------------------------
void main(void)
{
const vec3 light_dir=normalize(vec3(0.1,0.1,1.0));
const float light_iamb=0.1; // dot offset
const float light_idir=0.5; // directional light amplitude
const vec3 back_col=vec3(0.2,0.2,0.2); // background color
const float _zero=1e-6; // to avoid intersection with start point of ray
const int _fac_triangles =0; // r,g,b,a, n, triangle count, { x0,y0,z0,x1,y1,z1,x2,y2,z2 }
const int _fac_spheres =1; // r,g,b,a, n, sphere count, { x,y,z,r }
const int _fac_hemispheres=2; // r,g,b,a, n, hemisphere count,{ x,y,z,r,nx,ny,nz }
// ray scene intersection
struct _ray
{
dvec3 pos,dir,nor;
vec3 col;
float refl,refr;// reflection,refraction intensity coefficients
float n0,n1; // refraction index (start,end)
double l; // ray length
int lvl,i0,i1; // recursion level, reflect, refract
};
const int _lvls=4;
const int _rays=(1<<_lvls)-1;
_ray ray[_rays]; int rays;
dvec3 v0,v1,v2,pos;
vec3 c;
float refr,refl,n1;
double tt,t,a;
int i0,ii,num,id;
// fac texture access
vec2 st; int i,j; float ds=1.0/float(fac_siz-1);
#define fac_get texture(fac_txr,st).r; st.s+=ds; i++; j++; if (j==fac_siz) { j=0; st.s=0.0; st.t+=ds; }
// enqueue start ray
ray[0].pos=ray_pos;
ray[0].dir=normalize(ray_dir);
ray[0].nor=vec3(0.0,0.0,0.0);
ray[0].refl=0.0;
ray[0].refr=0.0;
ray[0].n0=n0;
ray[0].n1=1.0;
ray[0].l =0.0;
ray[0].lvl=0;
ray[0].i0=-1;
ray[0].i1=-1;
rays=1;
// loop all enqueued rays
for (i0=0;i0<rays;i0++)
{
// loop through all objects
// find closest forward intersection between them and ray[i0]
// store it to ray[i0].(nor,col)
// store it to pos,n1
t=tt=-1.0; ii=1; ray[i0].l=0.0;
ray[i0].col=back_col;
pos=ray[i0].pos; n1=n0;
for (st=vec2(0.0,0.0),i=j=0;i<fac_num;)
{
c.r=fac_get; // RGBA
c.g=fac_get;
c.b=fac_get;
refl=fac_get;
refr=fac_get;
n1=fac_get; // refraction index
a=fac_get; id=int(a); // object type
a=fac_get; num=int(a); // face count
if (id==_fac_triangles)
for (;num>0;num--)
{
v0.x=fac_get; v0.y=fac_get; v0.z=fac_get;
v1.x=fac_get; v1.y=fac_get; v1.z=fac_get;
v2.x=fac_get; v2.y=fac_get; v2.z=fac_get;
dvec3 e1,e2,n,p,q,r;
double t,u,v,det,idet;
//compute ray triangle intersection
e1=v1-v0;
e2=v2-v0;
// Calculate planes normal vector
p=cross(ray[i0].dir,e2);
det=dot(e1,p);
// Ray is parallel to plane
if (abs(det)<1e-8) continue;
idet=1.0/det;
r=ray[i0].pos-v0;
u=dot(r,p)*idet;
if ((u<0.0)||(u>1.0)) continue;
q=cross(r,e1);
v=dot(ray[i0].dir,q)*idet;
if ((v<0.0)||(u+v>1.0)) continue;
t=dot(e2,q)*idet;
if ((t>_zero)&&((t<=tt)||(ii!=0)))
{
ii=0; tt=t;
// store color,n ...
ray[i0].col=c;
ray[i0].refl=refl;
ray[i0].refr=refr;
// barycentric interpolate position
t=1.0-u-v;
pos=(v0*t)+(v1*u)+(v2*v);
// compute normal (store as dir for now)
e1=v1-v0;
e2=v2-v1;
ray[i0].nor=cross(e1,e2);
}
}
if (id==_fac_spheres)
for (;num>0;num--)
{
float r;
v0.x=fac_get; v0.y=fac_get; v0.z=fac_get; r=fac_get;
// compute l0 length of ray(p0,dp) to intersection with sphere(v0,r)
// where rr= r^-2
double aa,bb,cc,dd,l0,l1,rr;
dvec3 p0,dp;
p0=ray[i0].pos-v0; // set sphere center to (0,0,0)
dp=ray[i0].dir;
rr = 1.0/(r*r);
aa=2.0*rr*dot(dp,dp);
bb=2.0*rr*dot(p0,dp);
cc= rr*dot(p0,p0)-1.0;
dd=((bb*bb)-(2.0*aa*cc));
if (dd<0.0) continue;
dd=sqrt(dd);
l0=(-bb+dd)/aa;
l1=(-bb-dd)/aa;
if (l0<0.0) l0=l1;
if (l1<0.0) l1=l0;
t=min(l0,l1); if (t<=_zero) t=max(l0,l1);
if ((t>_zero)&&((t<=tt)||(ii!=0)))
{
ii=0; tt=t;
// store color,n ...
ray[i0].col=c;
ray[i0].refl=refl;
ray[i0].refr=refr;
// position,normal
pos=ray[i0].pos+(ray[i0].dir*t);
ray[i0].nor=pos-v0;
}
}
if (id==_fac_hemispheres)
for (;num>0;num--)
{
float r;
v0.x=fac_get; v0.y=fac_get; v0.z=fac_get; r=fac_get;
v1.x=fac_get; v1.y=fac_get; v1.z=fac_get;
// compute l0 length of ray(p0,dp) to intersection with sphere(v0,r)
// where rr= r^-2
double aa,bb,cc,dd,l0,l1,rr;
dvec3 p0,dp;
p0=ray[i0].pos-v0; // set sphere center to (0,0,0)
dp=ray[i0].dir;
rr = 1.0/(r*r);
aa=2.0*rr*dot(dp,dp);
bb=2.0*rr*dot(p0,dp);
cc= rr*dot(p0,p0)-1.0;
dd=((bb*bb)-(2.0*aa*cc));
if (dd<0.0) continue;
dd=sqrt(dd);
l0=(-bb+dd)/aa;
l1=(-bb-dd)/aa;
// test both hits-v0 against normal v1
v2=ray[i0].pos+(ray[i0].dir*l0)-v0; if (dot(v1,v2)>0.0) l0=-1.0;
v2=ray[i0].pos+(ray[i0].dir*l1)-v0; if (dot(v1,v2)>0.0) l1=-1.0;
if (l0<0.0) l0=l1;
if (l1<0.0) l1=l0;
t=min(l0,l1); if (t<=_zero) t=max(l0,l1);
if ((t>_zero)&&((t<=tt)||(ii!=0)))
{
ii=0; tt=t;
// store color,n ...
ray[i0].col=c;
ray[i0].refl=refl;
ray[i0].refr=refr;
// position,normal
pos=ray[i0].pos+(ray[i0].dir*t);
ray[i0].nor=pos-v0;
}
}
}
ray[i0].l=tt;
ray[i0].nor=normalize(ray[i0].nor);
// split ray from pos and ray[i0].nor
if ((ii==0)&&(ray[i0].lvl<_lvls-1))
{
t=dot(ray[i0].dir,ray[i0].nor);
// reflect
#ifdef _reflect
if ((ray[i0].refl>_zero)&&(t<_zero)) // do not reflect inside objects
{
ray[i0].i0=rays;
ray[rays]=ray[i0];
ray[rays].lvl++;
ray[rays].i0=-1;
ray[rays].i1=-1;
ray[rays].pos=pos;
ray[rays].dir=ray[rays].dir-(2.0*t*ray[rays].nor);
ray[rays].n0=ray[i0].n0;
ray[rays].n1=ray[i0].n0;
rays++;
}
#endif
// refract
#ifdef _refract
if (ray[i0].refr>_zero)
{
ray[i0].i1=rays;
ray[rays]=ray[i0];
ray[rays].lvl++;
ray[rays].i0=-1;
ray[rays].i1=-1;
ray[rays].pos=pos;
t=dot(ray[i0].dir,ray[i0].nor);
if (t>0.0) // exit object
{
ray[rays].n0=ray[i0].n0;
ray[rays].n1=n0;
if (i0==0) ray[i0].n1=n1;
v0=-ray[i0].nor; t=-t;
}
else{ // enter object
ray[rays].n0=n1;
ray[rays].n1=ray[i0].n0;
ray[i0 ].n1=n1;
v0=ray[i0].nor;
}
n1=ray[i0].n0/ray[i0].n1;
tt=1.0-(n1*n1*(1.0-t*t));
if (tt>=0.0)
{
ray[rays].dir=(ray[i0].dir*n1)-(v0*((n1*t)+sqrt(tt)));
rays++;
}
}
#endif
}
else if (i0>0) // ignore last ray if nothing hit
{
ray[i0]=ray[rays-1];
rays--; i0--;
}
}
// back track ray intersections and compute output color col
// lvl is sorted ascending so backtrack from end
for (i0=rays-1;i0>=0;i0--)
{
// directional + ambient light
t=abs(dot(ray[i0].nor,light_dir)*light_idir)+light_iamb;
t*=1.0-ray[i0].refl-ray[i0].refr;
ray[i0].col.rgb*=float(t);
// reflect
ii=ray[i0].i0;
if (ii>=0) ray[i0].col.rgb+=ray[ii].col.rgb*ray[i0].refl;
// refract
ii=ray[i0].i1;
if (ii>=0) ray[i0].col.rgb+=ray[ii].col.rgb*ray[i0].refr;
}
frag_col=vec4(ray[0].col,1.0);
}
//---------------------------------------------------------------------------
The Vertex shader just creates the Ray position and direction which is interpolated by GPU and then Fragment shader handles each ray (per pixel).
I use this scene:
// init mesh raytracer
ray.gl_init();
ray.beg();
// r g b rfl rfr n
ray.add_material(1.0,0.7,0.1,0.3,0.0,_n_glass); ray.add_hemisphere( 0.0, 0.0, 2.0,0.5, 0.0, 0.0, 1.0);
ray.add_material(1.0,1.0,1.0,0.3,0.0,_n_glass); ray.add_box ( 0.0, 0.0, 6.0,9.0,9.0,0.1);
ray.add_material(1.0,1.0,1.0,0.1,0.8,_n_glass); ray.add_sphere ( 0.0, 0.0, 0.5,0.5);
ray.add_material(1.0,0.1,0.1,0.3,0.0,_n_glass); ray.add_sphere (+2.0, 0.0, 2.0,0.5);
ray.add_material(0.1,1.0,0.1,0.3,0.0,_n_glass); ray.add_box (-2.0, 0.0, 2.0,0.5,0.5,0.5);
ray.add_material(0.1,0.1,1.0,0.3,0.0,_n_glass);
ray.add_tetrahedron
(
0.0, 0.0, 3.0,
-1.0,-1.0, 4.0,
+1.0,-1.0, 4.0,
0.0,+1.0, 4.0
);
ray.end();
It contains a single yellow hemisphere at (0.0, 0.0, 2.0) with radius r=0.5 and plane normal (0.0, 0.0, 1.0). The object can be rotated simply by rotating the plane normal.
And this is the preview:
As you can see, the hemisphere works just by cutting with a plane. The only important code from above for you is this (see the *** comments):
if (id==_fac_hemispheres) // *** ignore
for (;num>0;num--) // *** ignore
{
float r;
// *** here v0 is center, v1 is plane normal and r is radius
v0.x=fac_get; v0.y=fac_get; v0.z=fac_get; r=fac_get;
v1.x=fac_get; v1.y=fac_get; v1.z=fac_get;
// *** this is ray/ellipsoid intersection returning l0,l1 ray distances for both hits
// compute l0 length of ray(p0,dp) to intersection with sphere(v0,r)
// where rr= r^-2
double aa,bb,cc,dd,l0,l1,rr;
dvec3 p0,dp;
p0=ray[i0].pos-v0; // set sphere center to (0,0,0)
dp=ray[i0].dir;
rr = 1.0/(r*r);
aa=2.0*rr*dot(dp,dp);
bb=2.0*rr*dot(p0,dp);
cc= rr*dot(p0,p0)-1.0;
dd=((bb*bb)-(2.0*aa*cc));
if (dd<0.0) continue;
dd=sqrt(dd);
l0=(-bb+dd)/aa;
l1=(-bb-dd)/aa;
// *** this throws away hits on the wrong side of the plane
// test both hits-v0 against normal v1
v2=ray[i0].pos+(ray[i0].dir*l0)-v0; if (dot(v1,v2)>0.0) l0=-1.0;
v2=ray[i0].pos+(ray[i0].dir*l1)-v0; if (dot(v1,v2)>0.0) l1=-1.0;
// *** this just uses the closer valid hit
if (l0<0.0) l0=l1;
if (l1<0.0) l1=l0;
t=min(l0,l1); if (t<=_zero) t=max(l0,l1);
if ((t>_zero)&&((t<=tt)||(ii!=0)))
{
ii=0; tt=t;
// store color,n ...
ray[i0].col=c;
ray[i0].refl=refl;
ray[i0].refr=refr;
// position,normal
pos=ray[i0].pos+(ray[i0].dir*t);
ray[i0].nor=pos-v0;
}
}
I used my "ray and ellipsoid intersection accuracy improvement" code, as it returns both hits, not just the first one.
If you cross check the spheres and hemispheres you will see I just added these two lines:
v2=ray[i0].pos+(ray[i0].dir*l0)-v0; if (dot(v1,v2)>0.0) l0=-1.0;
v2=ray[i0].pos+(ray[i0].dir*l1)-v0; if (dot(v1,v2)>0.0) l1=-1.0;
which just convert the ray distances to hit positions and evaluate the condition mentioned above.

Eigenvalue calculation using TQLI algorithm fails with segmentation fault

I am trying to calculate eigenvalues using the TQLI algorithm that I got from the website of the CACS of the University of Southern California. My test script looks like this:
#include <stdio.h>
int main()
{
int i;
i = rand();
printf("My random number: %d\n", i);
float d[4] = {
{1, 2, 3, 4}
};
float e[4] = {
{0, 0, 0, 0}
};
float z[4][4] = {
{1.0, 0.0, 0.0, 0.0} ,
{0.0, 1.0, 0.0, 0.0} ,
{0.0, 0.0, 1.0, 0.0},
{0.0, 0.0, 0.0, 1.0}
};
double *zptr;
zptr = &z[0][0];
printf("Element [2][1] of identity matrix: %f\n", z[2][1]);
printf("Element [2][2] of identity matrix: %f\n", z[2][2]);
tqli(d, e, 4, zptr);
printf("First eigenvalue: %f\n", d[0]);
return 0;
}
When I try to run this script I get a segmentation fault. At what location does my code produce this segmentation fault? Since I believe the code from USC is bug-free, I am fairly sure the mistake must be in my call of the function. However, I can't see where I made a mistake in setting up the arrays, as I believe I followed the instructions.

The segmentation fault comes from accessing memory outside the bounds of the supplied arrays. tqli requires specific data preparation.
1) The eigen code from CACS is Fortran-based and indexes arrays from 1.
2) tqli expects a pointer to pointer to double (double **) for its matrix, and double vectors:
void tqli(double d[], double e[], int n, double **z)
d and e should be declared as double.
3) The program needs to be modified so that it prepares the data for the above function.
Helper 1-based vectors and a 1-based matrix have to be created to supply properly formatted data to tqli:
double z[NP][NP] = { {2, 0, 0}, {0, 4, 0}, {0, 0, 2} } ;
double **a;
double *d,*e,*f;
d=dvector(1,NP); // 1-index based vector
e=dvector(1,NP);
f=dvector(1,NP);
a=dmatrix(1,NP,1,NP); // 1-index based matrix
for (i=1;i<=NP;i++) // loading data from zero-based `z` to `a`
for (j=1;j<=NP;j++) a[i][j]=z[i-1][j-1];
Complete test program is supplied below. It uses the eigen code from CACS:
/*******************************************************************************
Eigenvalue solvers, tred2 and tqli, from "Numerical Recipes in C" (Cambridge
Univ. Press) by W.H. Press, S.A. Teukolsky, W.T. Vetterling, and B.P. Flannery
*******************************************************************************/
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#define NR_END 1
#define SIGN(a,b) ((b) >= 0.0 ? fabs(a) : -fabs(a))
double **dmatrix(int nrl, int nrh, int ncl, int nch)
/* allocate a double matrix with subscript range m[nrl..nrh][ncl..nch] */
{
int i,nrow=nrh-nrl+1,ncol=nch-ncl+1;
double **m;
/* allocate pointers to rows */
m=(double **) malloc((size_t)((nrow+NR_END)*sizeof(double*)));
m += NR_END;
m -= nrl;
/* allocate rows and set pointers to them */
m[nrl]=(double *) malloc((size_t)((nrow*ncol+NR_END)*sizeof(double)));
m[nrl] += NR_END;
m[nrl] -= ncl;
for(i=nrl+1;i<=nrh;i++) m[i]=m[i-1]+ncol;
/* return pointer to array of pointers to rows */
return m;
}
double *dvector(int nl, int nh)
/* allocate a double vector with subscript range v[nl..nh] */
{
double *v;
v=(double *)malloc((size_t) ((nh-nl+1+NR_END)*sizeof(double)));
return v-nl+NR_END;
}
/******************************************************************************/
void tred2(double **a, int n, double d[], double e[])
/*******************************************************************************
Householder reduction of a real, symmetric matrix a[1..n][1..n].
On output, a is replaced by the orthogonal matrix Q effecting the
transformation. d[1..n] returns the diagonal elements of the tridiagonal matrix,
and e[1..n] the off-diagonal elements, with e[1]=0. Several statements, as noted
in comments, can be omitted if only eigenvalues are to be found, in which case a
contains no useful information on output. Otherwise they are to be included.
*******************************************************************************/
{
int l,k,j,i;
double scale,hh,h,g,f;
for (i=n;i>=2;i--) {
l=i-1;
h=scale=0.0;
if (l > 1) {
for (k=1;k<=l;k++)
scale += fabs(a[i][k]);
if (scale == 0.0) /* Skip transformation. */
e[i]=a[i][l];
else {
for (k=1;k<=l;k++) {
a[i][k] /= scale; /* Use scaled a's for transformation. */
h += a[i][k]*a[i][k]; /* Form sigma in h. */
}
f=a[i][l];
g=(f >= 0.0 ? -sqrt(h) : sqrt(h));
e[i]=scale*g;
h -= f*g; /* Now h is equation (11.2.4). */
a[i][l]=f-g; /* Store u in the ith row of a. */
f=0.0;
for (j=1;j<=l;j++) {
/* Next statement can be omitted if eigenvectors not wanted */
a[j][i]=a[i][j]/h; /* Store u/H in ith column of a. */
g=0.0; /* Form an element of A.u in g. */
for (k=1;k<=j;k++)
g += a[j][k]*a[i][k];
for (k=j+1;k<=l;k++)
g += a[k][j]*a[i][k];
e[j]=g/h; /* Form element of p in temporarily unused element of e. */
f += e[j]*a[i][j];
}
hh=f/(h+h); /* Form K, equation (11.2.11). */
for (j=1;j<=l;j++) { /* Form q and store in e overwriting p. */
f=a[i][j];
e[j]=g=e[j]-hh*f;
for (k=1;k<=j;k++) /* Reduce a, equation (11.2.13). */
a[j][k] -= (f*e[k]+g*a[i][k]);
}
}
} else
e[i]=a[i][l];
d[i]=h;
}
/* Next statement can be omitted if eigenvectors not wanted */
d[1]=0.0;
e[1]=0.0;
/* Contents of this loop can be omitted if eigenvectors not
wanted except for statement d[i]=a[i][i]; */
for (i=1;i<=n;i++) { /* Begin accumulation of transformation matrices. */
l=i-1;
if (d[i]) { /* This block skipped when i=1. */
for (j=1;j<=l;j++) {
g=0.0;
for (k=1;k<=l;k++) /* Use u and u/H stored in a to form P.Q. */
g += a[i][k]*a[k][j];
for (k=1;k<=l;k++)
a[k][j] -= g*a[k][i];
}
}
d[i]=a[i][i]; /* This statement remains. */
a[i][i]=1.0; /* Reset row and column of a to identity matrix for next iteration. */
for (j=1;j<=l;j++) a[j][i]=a[i][j]=0.0;
}
}
/******************************************************************************/
void tqli(double d[], double e[], int n, double **z)
/*******************************************************************************
QL algorithm with implicit shifts, to determine the eigenvalues and eigenvectors
of a real, symmetric, tridiagonal matrix, or of a real, symmetric matrix
previously reduced by tred2 sec. 11.2. On input, d[1..n] contains the diagonal
elements of the tridiagonal matrix. On output, it returns the eigenvalues. The
vector e[1..n] inputs the subdiagonal elements of the tridiagonal matrix, with
e[1] arbitrary. On output e is destroyed. When finding only the eigenvalues,
several lines may be omitted, as noted in the comments. If the eigenvectors of
a tridiagonal matrix are desired, the matrix z[1..n][1..n] is input as the
identity matrix. If the eigenvectors of a matrix that has been reduced by tred2
are required, then z is input as the matrix output by tred2. In either case,
the kth column of z returns the normalized eigenvector corresponding to d[k].
*******************************************************************************/
{
double pythag(double a, double b);
int m,l,iter,i,k;
double s,r,p,g,f,dd,c,b;
for (i=2;i<=n;i++) e[i-1]=e[i]; /* Convenient to renumber the elements of e. */
e[n]=0.0;
for (l=1;l<=n;l++) {
iter=0;
do {
for (m=l;m<=n-1;m++) { /* Look for a single small subdiagonal element to split the matrix. */
dd=fabs(d[m])+fabs(d[m+1]);
if ((double)(fabs(e[m])+dd) == dd) break;
}
if (m != l) {
if (iter++ == 30) printf("Too many iterations in tqli");
g=(d[l+1]-d[l])/(2.0*e[l]); /* Form shift. */
r=pythag(g,1.0);
g=d[m]-d[l]+e[l]/(g+SIGN(r,g)); /* This is dm - ks. */
s=c=1.0;
p=0.0;
for (i=m-1;i>=l;i--) { /* A plane rotation as in the original QL, followed by Givens */
f=s*e[i]; /* rotations to restore tridiagonal form. */
b=c*e[i];
e[i+1]=(r=pythag(f,g));
if (r == 0.0) { /* Recover from underflow. */
d[i+1] -= p;
e[m]=0.0;
break;
}
s=f/r;
c=g/r;
g=d[i+1]-p;
r=(d[i]-g)*s+2.0*c*b;
d[i+1]=g+(p=s*r);
g=c*r-b;
/* Next loop can be omitted if eigenvectors not wanted */
for (k=1;k<=n;k++) { /* Form eigenvectors. */
f=z[k][i+1];
z[k][i+1]=s*z[k][i]+c*f;
z[k][i]=c*z[k][i]-s*f;
}
}
if (r == 0.0 && i >= l) continue;
d[l] -= p;
e[l]=g;
e[m]=0.0;
}
} while (m != l);
}
}
/******************************************************************************/
double pythag(double a, double b)
/*******************************************************************************
Computes (a^2 + b^2)^(1/2) without destructive underflow or overflow.
*******************************************************************************/
{
double absa,absb;
absa=fabs(a);
absb=fabs(b);
if (absa > absb) return absa*sqrt(1.0+(absb/absa)*(absb/absa));
else return (absb == 0.0 ? 0.0 : absb*sqrt(1.0+(absa/absb)*(absa/absb)));
}
#define NP 3
#define TINY 1.0e-6
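/* NOTE: the bit trick below is the well-known approximation for a 32-bit float.
   Applied to a double through an int member it does not yield a correct square
   root; sqrt() from <math.h> is the safer choice. */
double sqrt(double x)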
{
union
{
int i;
double x;
} u;
u.x = x;
u.i = (1<<29) + (u.i >> 1) - (1<<22);
return u.x;
}
int main()
{
int i,j,k;
double ze[NP][NP] = { {2, 0, 0}, {0, 4, 0}, {0, 0, 2} } ;
double **a;
double *d,*e,*f;
d=dvector(1,NP);
e=dvector(1,NP);
f=dvector(1,NP);
a=dmatrix(1,NP,1,NP);
for (i=1;i<=NP;i++)
for (j=1;j<=NP;j++) a[i][j]=ze[i-1][j-1];
tred2(a,NP,d,e);
tqli(d,e,NP,a);
printf("\nEigenvectors for a real symmetric matrix:\n");
for (i=1;i<=NP;i++) {
for (j=1;j<=NP;j++) {
f[j]=0.0;
for (k=1;k<=NP;k++)
f[j] += (ze[j-1][k-1]*a[k][i]);
}
printf("%s %3d %s %10.6f\n","\neigenvalue",i," =",d[i]);
printf("%11s %14s %9s\n","vector","mtrx*vect.","ratio");
for (j=1;j<=NP;j++) {
if (fabs(a[j][i]) < TINY)
printf("%12.6f %12.6f %12s\n",
a[j][i],f[j],"div. by 0");
else
printf("%12.6f %12.6f %12.6f\n",
a[j][i],f[j],f[j]/a[j][i]);
}
}
//free_dmatrix(a,1,NP,1,NP);
//free_dvector(f,1,NP);
//free_dvector(e,1,NP);
//free_dvector(d,1,NP);
return 0;
}
Output:
Eigenvectors for a real symmetric matrix:
eigenvalue 1 = 2.000000
vector mtrx*vect. ratio
1.000000 2.000000 2.000000
0.000000 0.000000 div. by 0
0.000000 0.000000 div. by 0
eigenvalue 2 = 4.000000
vector mtrx*vect. ratio
0.000000 0.000000 div. by 0
1.000000 4.000000 4.000000
0.000000 0.000000 div. by 0
eigenvalue 3 = 2.000000
vector mtrx*vect. ratio
0.000000 0.000000 div. by 0
0.000000 0.000000 div. by 0
1.000000 2.000000 2.000000
I hope this finally helps to clarify the confusion regarding data preparation for tqli.

3D identity matrix to correctly set vertices

I'm playing around with matrices, with a view to doing 3D transformations in GDI (for the fun of it). At the moment I'm checking that I get the right values back when I apply the identity matrix to four vertices arranged in a square. I've been scratching my head as to why it's not giving the expected output. I have done my research but can't see what I am doing wrong here.
Here's my definition of the matrix:
typedef struct m{
float _m01, _m05, _m09, _m13;
float _m02, _m06, _m10, _m14;
float _m03, _m07, _m11, _m15;
float _m04, _m08, _m12, _m16;
}mat;
void matIdentity(struct m *m1){
m1->_m01 = 1.0; m1->_m05 = 0.0; m1->_m09 = 0.0; m1->_m13 = 0.0;
m1->_m02 = 0.0; m1->_m06 = 1.0; m1->_m10 = 0.0; m1->_m14 = 0.0;
m1->_m03 = 0.0; m1->_m07 = 0.0; m1->_m11 = 1.0; m1->_m15 = 0.0;
m1->_m04 = 0.0; m1->_m08 = 0.0; m1->_m12 = 0.0; m1->_m16 = 1.0;
}
Here's the code that uses the matrix:
struct m matrix;
matIdentity(&matrix);
//represent 4 vertices(x,y,z,w);
float square[4][4] = {
{0.0, 0.0, 0.0, 1.0},
{0.0, 20.0, 0.0, 1.0},
{20.0, 20.0, 0.0, 1.0},
{20.0, 0.0, 0.0, 1.0}
};
float result[4][4];
int i = 0;
for(i = 0; i < 4; i++){
result[i][1] = (matrix._m01 * square[i][0]) + (matrix._m05 * square[i][1]) + (matrix._m09 * square[i][2]) + (matrix._m13 * square[i][3]);
result[i][2] = (matrix._m02 * square[i][0]) + (matrix._m06 * square[i][1]) + (matrix._m10 * square[i][2]) + (matrix._m14 * square[i][3]);
result[i][3] = (matrix._m03 * square[i][0]) + (matrix._m07 * square[i][1]) + (matrix._m11 * square[i][2]) + (matrix._m15 * square[i][3]);
result[i][4] = (matrix._m04 * square[i][0]) + (matrix._m08 * square[i][1]) + (matrix._m12 * square[i][2]) + (matrix._m16 * square[i][3]);
}
char strOutput[500];
sprintf(strOutput,"%f %f %f %f\n %f %f %f %f\n %f %f %f %f\n %f %f %f %f\n ",
result[0][0], result[0][1], result[0][2], result[0][3],
result[1][0], result[1][1], result[1][2], result[1][3],
result[2][0], result[2][1], result[2][2], result[2][3],
result[3][0], result[3][1], result[3][2], result[3][3]
);
I have a feeling the problem has something to do with multiplying a row-based representation of vertices by a column-major matrix. Can anyone please suggest how I should be doing this?

I don't understand why you avoid arrays at first, then switch to arrays and iteration, and in the end give up on iteration. A program written like this can only cause confusion.
The correct formula is C(i, j) = sum over k = 1..n of A(i, k) * B(k, j), where C = AB and n is 4 in your case.
(For example, this line should look like: result[i][0] = (matrix._m01 * square[0][i]) + (matrix._m02 * square[1][i]) + (matrix._m03 * square[2][i]) + (matrix._m04 * square[3][i]);) Write a simple nested for loop to calculate this.
This handles n vectors at once, not just one.
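The nested iteration suggested above could look like the following sketch. It assumes plain float[4][4] arrays with zero-based indices (so k runs over 0..3 rather than 1..4); MAT_DIM and mat4_mul are illustrative names, not part of the question's code.

```c
#define MAT_DIM 4

/* C = A * B for MAT_DIM x MAT_DIM matrices:
   C[i][j] = sum over k of A[i][k] * B[k][j].
   C must not alias A or B, because C is written while A and B are read. */
void mat4_mul(float A[MAT_DIM][MAT_DIM],
              float B[MAT_DIM][MAT_DIM],
              float C[MAT_DIM][MAT_DIM])
{
    for (int i = 0; i < MAT_DIM; ++i) {
        for (int j = 0; j < MAT_DIM; ++j) {
            float sum = 0.0f;
            for (int k = 0; k < MAT_DIM; ++k)
                sum += A[i][k] * B[k][j];
            C[i][j] = sum;
        }
    }
}
```

If the four vertices are stored as the rows of A and the transformation as B, each row of C is one transformed vertex, so multiplying by the identity should return the vertices unchanged.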

This is not matrix multiplication. Multiplying a vector by a matrix goes like this:
float mat[4][4];
float vec_in[4];
float vec_out[4];
// todo: initialize values
for (int j = 0; j < 4; ++j)
{
    vec_out[j] = 0.0f;
    for (int i = 0; i < 4; ++i)
    {
        vec_out[j] += vec_in[i] * mat[i][j];
    }
}
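Applied to the four vertices from the question, the loop above can be wrapped up roughly as follows. This is a sketch that assumes the vertices are stored one per row as (x, y, z, w) and the matrix is a plain float[4][4] instead of the question's struct; vec4_transform and transform_vertices are illustrative names. Note that the output indices run from 0 to 3, whereas the question's loop writes result[i][1] through result[i][4], which leaves result[i][0] unset and writes past the end of each row.

```c
/* Row vector times 4x4 matrix: out[j] = sum over i of in[i] * mat[i][j]. */
void vec4_transform(const float in[4], float mat[4][4], float out[4])
{
    for (int j = 0; j < 4; ++j) {
        out[j] = 0.0f;
        for (int i = 0; i < 4; ++i)
            out[j] += in[i] * mat[i][j];
    }
}

/* Transform `count` vertices, one (x, y, z, w) vertex per row. */
void transform_vertices(float verts[][4], int count,
                        float mat[4][4], float result[][4])
{
    for (int v = 0; v < count; ++v)
        vec4_transform(verts[v], mat, result[v]);
}
```

With the identity matrix, every row of result should equal the corresponding row of square. With this row-vector convention a translation sits in the last row of the matrix, so setting mat[3][0] = 5 on an otherwise identity matrix moves each vertex with w = 1 by 5 along x.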