Is GLKMatrix4Translate correct? - glkit

I am not sure if I am misunderstanding things, or I am simply tired.
Consider the two lines of code below.
GLKMatrix4 projection = GLKMatrix4Translate(_perspective, 0, 0, -5);
GLKMatrix4 projection = GLKMatrix4Multiply(_perspective, GLKMatrix4MakeTranslation(0, 0, -5));
I expected them to be equivalent. But they are not. The source for GLKMatrix4Translate is:
static __inline__ GLKMatrix4 GLKMatrix4Translate(GLKMatrix4 matrix, float tx, float ty, float tz)
{
    GLKMatrix4 m = { matrix.m[0], matrix.m[1], matrix.m[2], matrix.m[3],
                     matrix.m[4], matrix.m[5], matrix.m[6], matrix.m[7],
                     matrix.m[8], matrix.m[9], matrix.m[10], matrix.m[11],
                     matrix.m[0] * tx + matrix.m[4] * ty + matrix.m[8] * tz + matrix.m[12],
                     matrix.m[1] * tx + matrix.m[5] * ty + matrix.m[9] * tz + matrix.m[13],
                     matrix.m[2] * tx + matrix.m[6] * ty + matrix.m[10] * tz + matrix.m[14],
                     matrix.m[15] };
    return m;
}
However, I expected the last element to be:
matrix.m[3] * tx + matrix.m[7] * ty + matrix.m[11] * tz + matrix.m[15]
Am I correct, or in a tired fog?

I think you're right, there must be a bug in Translate. I use MakeTranslation plus Multiply instead of the Translate function, because Translate always leaves m[15] unchanged.
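To make the difference concrete, here is a minimal sketch in plain C that does not use GLKit at all. The `mat4` type and the three function names are hypothetical stand-ins for a column-major 4x4 matrix; `mat4_translate_shortcut` copies the arithmetic of the source quoted in the question. The two approaches agree whenever m[3], m[7] and m[11] are zero (an affine matrix) and differ in m[15] otherwise, which is the case for a typical perspective matrix where m[11] is -1.

```c
/* Hypothetical stand-in for a column-major 4x4 matrix (same layout as m[16]). */
typedef struct { float m[16]; } mat4;

/* Full matrix product a * b, column-major storage. */
mat4 mat4_multiply(mat4 a, mat4 b)
{
    mat4 r;
    for (int col = 0; col < 4; ++col) {
        for (int row = 0; row < 4; ++row) {
            float s = 0.0f;
            for (int k = 0; k < 4; ++k)
                s += a.m[k * 4 + row] * b.m[col * 4 + k];
            r.m[col * 4 + row] = s;
        }
    }
    return r;
}

/* Identity matrix with the translation stored in m[12], m[13], m[14]. */
mat4 mat4_make_translation(float tx, float ty, float tz)
{
    mat4 r = {{ 1, 0, 0, 0,
                0, 1, 0, 0,
                0, 0, 1, 0,
                tx, ty, tz, 1 }};
    return r;
}

/* Same arithmetic as the Translate source quoted above: m[15] is copied as-is. */
mat4 mat4_translate_shortcut(mat4 matrix, float tx, float ty, float tz)
{
    mat4 r = matrix;
    r.m[12] = matrix.m[0] * tx + matrix.m[4] * ty + matrix.m[8]  * tz + matrix.m[12];
    r.m[13] = matrix.m[1] * tx + matrix.m[5] * ty + matrix.m[9]  * tz + matrix.m[13];
    r.m[14] = matrix.m[2] * tx + matrix.m[6] * ty + matrix.m[10] * tz + matrix.m[14];
    return r;
}
```

With a perspective-like matrix (m[11] = -1, m[15] = 0) and a translation of (0, 0, -5), the full product yields m[15] = 5 while the shortcut keeps m[15] = 0; elements 12 to 14 are identical in both results.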


Erroneous result using inverse Vincenty's formula in C

I have written a C script to implement the inverse Vincenty's formula to calculate the distance between two sets of GPS coordinates based on the equations shown at
However, my results differ from those given by this online calculator and by Google Maps. My results are consistently around 1.18 times the result of the online calculator.
My function is below. Any tips on where I could be going wrong would be very much appreciated!
double get_distance(double lat1, double lon1, double lat2, double lon2)
double rad_eq = 6378137.0; //Radius at equator
double flattening = 1 / 298.257223563; //flattening of earth
double rad_pol = (1 - flattening) * rad_eq; //Radius at poles
double U1,U2,L,lambda,old_lambda,sigma,sin_sig,cos_sig,alpha,cos2sigmam,A,B,C,u_sq,delta_s,dis;
//Convert to radians
//Calculate U1 and U2
double tolerance=pow(10.,-12.);//iteration tolerance should give 0.6mm
double diff=1.;
while (abs(diff)>tolerance)
//Returns distance in metres
return dis;
This formula is not symmetric:
cos_sig = sin(U1)*cos(U2)
+ cos(U1)*cos(U2) * cos(lambda);
And it turns out to be wrong: the first term should be sin(U1)*sin(U2), so a sin is missing.
Another style of formatting (one including some whitespace) could also help.
Besides replacing abs with fabs (abs() works on int, so it truncates a double such as 0.5 to 0) and that cos with a sin, I also changed the loop: there were two abs() calls, and diff had to be preset for the while loop, so I switched to a do-while loop.
I inserted a printf to see how the value progresses.
Some parentheses can be left out. These formulas are really difficult to read. Some more helper variables could be useful in this jungle of nested math operations.
do {
    sin_sig = sqrt(pow(cos(U2) * sin(lambda), 2)
                 + pow(cos(U1)*sin(U2)
                       - (sin(U1)*cos(U2) * cos(lambda))
                      , 2)
                  );
    cos_sig = sin(U1) * sin(U2)
            + cos(U1) * cos(U2) * cos(lambda);
    sigma = atan2(sin_sig, cos_sig);
    alpha = asin(cos(U1) * cos(U2) * sin(lambda)
                 / sin_sig
                );
    double cos2alpha = cos(alpha)*cos(alpha); // helper var.
    cos2sigmam = cos(sigma) - 2*sin(U1)*sin(U2) / cos2alpha;
    C = (flat/16) * cos2alpha * (4 + flat * (4 - 3*cos2alpha));
    old_lambda = lambda;
    lambda = L + (1-C) * flat * sin(alpha)
                 *(sigma + C*sin_sig
                   *(cos2sigmam + C*cos_sig
                     *(2 * pow(cos2sigmam, 2) - 1)
                    )
                  );
    diff = fabs(old_lambda - lambda);
    printf("%.12f\n", diff);
} while (diff > tolerance);
For 80,80, 0,0 the output is (in km):
which corresponds to the millimeter with WGS-84.

OpenMP threads and SIMD for instructions without explicit for-loop

I have a function in my library which computes N (N = 500 to 2000) explicit, rather simple operations, but it is called hundreds of thousands of times by the main software. Each small computation is independent of the others and each one is slightly different (polynomial coefficients and sometimes other additional features vary), so no loop is used; the cases are hard-coded into the function.
Unfortunately the calls (loop) in the main software cannot be threaded, because the code there is not thread safe before the actual call to this particular function is made. (bigger software package to deal with here...)
I already tried creating a team of OpenMP threads at the beginning of this function and executing the computations in, e.g., 4 blocks via the sections functionality in OpenMP, but it seems that the overhead of thread creation with #pragma omp parallel was too high. (Can that be?)
Any nice ideas on how to speed up this kind of situation? Perhaps by applying SIMD features, but how would that work when I don't have an explicit for loop to deal with?
#include "needed.h"
void eval_func (const double x, const double y, const double * __restrict__ z, double * __restrict__ out1, double * __restrict__ out2) {
    double logx = log(x);
    double tmp1;
    double tmp2;
    //calculation 1
    tmp1 = exp(3.6 + 2.7 * logx - (3.1e+03 / x));
    out1[0] = z[6] * z[5] * tmp1;
    if (x <= 1.0) {
        tmp2 = (-4.1 + 9.2e-01 * logx + x * (-3.3e-03 + x * (2.95e-06 + x * (-1.4e-09 + 3.2e-13 * x))) - 8.8e+02 / x);
    } else {
        tmp2 = (2.71e+00 + -3.3e-01 * logx + x * (3.4e-04 + x * (-6.8e-08 + x * (8.7e-12 + -4.2e-16 * x))) - 1.0e+03 / x);
    }
    tmp2 = 1.3 * exp(tmp2);
    out2[0] = z[3] * z[7] * tmp1 / tmp2;
    //calculation 2
    out1[1] = ...
    out2[1] = ...
    //calculation N
    out1[N-1] = ...
    out2[N-1] = ...
}

How can I get the RGB values of each pixel in C?

I need to do a color space conversion from RGB to YCbCr in C for my homework. First, I need to get the RGB values of each pixel of a BMP file; then I use the code shown below. But I cannot get the RGB values of the pixels. How can I do that?
struct YCbCr ycbcr;
ycbcr.Y = (float)(0.2989 * fr + 0.5866 * fg + 0.1145 * fb);
ycbcr.Cb = (float)(-0.1687 * fr - 0.3313 * fg + 0.5000 * fb);
ycbcr.Cr = (float)(0.5000 * fr - 0.4184 * fg - 0.0816 * fb);
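As a starting point for the pixel access, here is a minimal sketch that reads one pixel from a BMP file that has already been loaded into memory (for example with fread). It assumes the common case of an uncompressed 24-bit BMP with a BITMAPINFOHEADER; the function name `bmp_get_rgb` and its error convention are hypothetical. Three details of the format matter: channels are stored in B, G, R order, each row is padded to a multiple of 4 bytes, and rows are stored bottom-up when the height field is positive.

```c
#include <stddef.h>
#include <stdint.h>

/* Reads a little-endian 32-bit value from a byte buffer. */
static uint32_t rd_u32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Fetches pixel (x, y), with y = 0 at the top of the image, from an
   uncompressed 24-bit BMP held in buf. Writes R, G, B to rgb[0..2].
   Returns 0 on success, -1 if the file is unsupported or out of range. */
int bmp_get_rgb(const unsigned char *buf, size_t len,
                int x, int y, unsigned char rgb[3])
{
    if (len < 54 || buf[0] != 'B' || buf[1] != 'M')
        return -1;

    uint32_t data_off    = rd_u32(buf + 10);          /* start of pixel data */
    int32_t  width       = (int32_t)rd_u32(buf + 18);
    int32_t  height      = (int32_t)rd_u32(buf + 22);
    unsigned bpp         = (unsigned)buf[28] | ((unsigned)buf[29] << 8);
    uint32_t compression = rd_u32(buf + 30);

    if (bpp != 24 || compression != 0 || width <= 0 ||
        height == 0 || height == INT32_MIN)
        return -1;

    int     top_down = height < 0;       /* negative height: rows top-down */
    int32_t rows     = top_down ? -height : height;
    if (x < 0 || x >= width || y < 0 || y >= rows)
        return -1;

    size_t stride = ((size_t)width * 3 + 3) & ~(size_t)3;  /* padded row */
    size_t row    = top_down ? (size_t)y : (size_t)(rows - 1 - y);
    size_t pos    = (size_t)data_off + row * stride + (size_t)x * 3;
    if (pos + 3 > len)
        return -1;

    rgb[0] = buf[pos + 2];   /* R */
    rgb[1] = buf[pos + 1];   /* G */
    rgb[2] = buf[pos];       /* B */
    return 0;
}
```

The three values returned in rgb can then be converted to float and used as fr, fg and fb in the conversion above.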

Matlab: Help understanding sinusoidal curve fit

I have an unknown sine wave with some noise that I am trying to reconstruct. The ultimate goal is to come up with a C algorithm to find the amplitude, dc offset, phase, and frequency of a sine wave but I am prototyping in Matlab (Octave actually) first. The sine wave is of the form
y = a + b*sin(c + 2*pi*d*t)
a = dc offset
b = amplitude
c = phase shift (rad)
d = frequency
I have found this example and in the comments John D'Errico presents a method for using Least Squares to fit a sine wave to data. It is a neat little algorithm and works remarkably well but I am having difficulties understanding one aspect. The algorithm is as follows:
Suppose you have a sine wave of the form:
(1) y = a + b*sin(c+d*x)
Using the identity
(2) sin(u+v) = sin(u)*cos(v) + cos(u)*sin(v)
We can rewrite (1) as
(3) y = a + b*sin(c)*cos(d*x) + b*cos(c)*sin(d*x)
Since b*sin(c) and b*cos(c) are constants, these can be wrapped into constants b1 and b2.
(4) y = a + b1*cos(d*x) + b2*sin(d*x)
This is the equation that is used to fit the sine wave. A function is created to generate regression coefficients and a sum-of-squares residual error.
(5) cfun = @(d) [ones(size(x)), sin(d*x), cos(d*x)] \ y;
(6) sumerr2 = @(d) sum((y - [ones(size(x)), sin(d*x), cos(d*x)] * cfun(d)) .^ 2);
Next, sumerr2 is minimized for the frequency d using fminbnd with lower limit l1 and upper limit l2.
(7) dopt = fminbnd(sumerr2, l1, l2);
Now a, b, and c can be computed. The coefficients to compute a, b, and c are given from (4) at dopt
(8) abb = cfun(dopt);
The dc offset is simply the first value
(9) a = abb(1);
A trig identity is used to find b
(10) sin(u)^2 + cos(u)^2 = 1
(11) b = sqrt(b1^2 + b2^2)
(12) b = norm(abb([2 3]));
Finally the phase offset is found
(13) b1 = b*cos(c)
(14) c = acos(b1 / b);
(15) c = acos(abb(2) / b);
What is going on in (5) and (6)? Can someone break down what is happening in pseudo-code or perhaps perform the same function in a more explicit way?
(5) cfun = @(d) [ones(size(x)), sin(d*x), cos(d*x)] \ y;
(6) sumerr2 = @(d) sum((y - [ones(size(x)), sin(d*x), cos(d*x)] * cfun(d)) .^ 2);
Also, given (4) shouldn't it be:
[ones(size(x)), cos(d*x), sin(d*x)]
Here is the Matlab code in full. Blue line is the actual signal. Green line is the reconstructed signal.
close all
clear all
y = [111,140,172,207,243,283,319,350,383,414,443,463,483,497,505,508,503,495,479,463,439,412,381,347,311,275,241,206,168,136,108,83,63,54,45,43,41,45,51,63,87,109,137,168,204,239,279,317,348,382,412,439,463,479,496,505,508,505,495,483,463,441,414,383,350,314,278,245,209,175,140,140,110,85,63,51,45,41,41,44,49,63,82,105,135,166,200,236,277,313,345,379,409,438,463,479,495,503,508,503,498,485,467,444,415,383,351,318,281,247,211,174,141,111,87,67,52,45,42,41,45,50,62,79,104,131,163,199,233,273,310,345,377,407,435,460,479,494,503,508,505,499,486,467,445,419,387,355,319,284,249,215,177,143,113,87,67,55,46,43,41,44,48,63,79,102,127,159,191,232,271,307,343,373,404,437,457,478,492,503,508,505,499,488,470,447,420,391,360,323,287,254,215,182,147,116,92,70,55,46,43,42,43,49,60,76,99,127,159,191,227,268,303,339,371,401,431,456,476,492,502,507,507,500,488,471,447,424,392,361,326,287,287,255,220,185,149,119,92,72,55,47,42,41,43,47,57,76,95,124,156,189,223,258,302,337,367,399,428,456,476,492,502,508,508,501,489,471,451,425,396,364,328,294,259,223,188,151,119,95,72,57,46,43,44,43,47,57,73,95,124,153,187,222,255,297,335,366,398,426,451,471,494,502,507,508,502,489,474,453,428,398,367,332,296,262,227,191,154,124,95,75,60,47,43,41,41,46,55,72,94,119,150,183,215,255,295,331,361,396,424,447,471,489,500,508,508,502,492,475,454,430,401,369,335,299,265,228,191,157,126,99,76,59,49,44,41,41,46,55,72,92,118,147,179,215,252,291,328,360,392,422,447,471,488,499,507,508,503,493,477,456,431,403]';
fs = 100e3;
N = length(y);
t = (0:1/fs:N/fs-1/fs)';
cfun = @(d) [ones(size(t)), sin(2*pi*d*t), cos(2*pi*d*t)]\y;
sumerr2 = @(d) sum((y - [ones(size(t)), sin(2*pi*d*t), cos(2*pi*d*t)] * cfun(d)) .^ 2);
dopt = fminbnd(sumerr2, 2300, 2500);
abb = cfun(dopt);
a = abb(1);
b = norm(abb([2 3]));
c = acos(abb(2) / b);
d = dopt;
y_reconstructed = a + b*sin(2*pi*d*t - c);
hold on
title('Signal Reconstruction')
grid on
plot(t*1000, y, 'b')
plot(t*1000, y_reconstructed, 'g')
ylim = get(gca, 'ylim');
xlim = get(gca, 'xlim');
text(xlim(1), ylim(2) - 15, [num2str(b) ' cos(2\pi * ' num2str(d) 't - ' ...
num2str(c * 180/pi) ') + ' num2str(a)]);
hold off
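(5) and (6) define anonymous functions that are used within the optimisation code. For a given value of the parameter d (the optimisation parameter that will be varied), cfun builds a matrix with the columns 1, sin(d*x) and cos(d*x) and uses the backslash operator to solve the linear least-squares problem against y; it returns the three fitted coefficients as an array. Similarly, sumerr2 is another anonymous function of d: it evaluates the fitted model using the coefficients from cfun(d) and returns a scalar, the sum of the squared residuals. That scalar is the error that is minimised by fminbnd.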

Raytracing a cone along an arbitrary axis

I'm working on a ray tracer and I can't figure out what I'm doing wrong when I try to calculate an intersection with a cone. I have my ray vector and the position of the cone with its axis. I know that computing the intersection with a cone along a simple axis is easy, but I want to do it with an arbitrary axis.
I'm using this link for the cone equation (page 7-8) and here is my code :
alpha = cone->angle * (PI / 180);
axe.x = 0;
axe.y = 1;
axe.z = 0;
delt_p = vectorize(cone->position, ray.origin);
tmp1.x = ray.vector.x - (dot_product(ray.vector, axe) * axe.x);
tmp1.y = ray.vector.y - (dot_product(ray.vector, axe) * axe.y);
tmp1.z = ray.vector.z - (dot_product(ray.vector, axe) * axe.z);
tmp2.x = (delt_p.x) - (dot_product(delt_p, axe) * axe.x);
tmp2.y = (delt_p.y) - (dot_product(delt_p, axe) * axe.y);
tmp2.z = (delt_p.z) - (dot_product(delt_p, axe) * axe.z);
a = (pow(cos(alpha), 2) * dot_product(tmp1, tmp1)) - (pow(sin(alpha), 2) * dot_product(ray.vector, axe));
b = 2 * ((pow(cos(alpha), 2) * dot_product(tmp1, tmp2)) - (pow(sin(alpha), 2) * dot_product(ray.vector, axe) * dot_product(delt_p, axe)));
c = (pow(cos(alpha), 2) * dot_product(tmp2, tmp2)) - (pow(sin(alpha), 2) * dot_product(delt_p, axe));
delta = pow(b, 2) - (4 * a * c);
if (delta >= 0)
{
    t1 = (((-1) * b) + sqrt(delta)) / (2 * a);
    t2 = (((-1) * b) - sqrt(delta)) / (2 * a);
    t = (t1 < t2 ? t1 : t2);
    return (t);
}
I initialised my axis with the y axis so I can rotate it.
Here is what I get :
Instead of a cone, I have that paraboloid red shape on the right, and I know that it's almost the same equation as a cone.
You probably need to implement arbitrary transformations on primitives using homogeneous matrices, rather than support arbitrary orientation for each primitive.
For example, it's not uncommon for ray tracers to only support cones that have their base on the origin, and that point along the vertical axis. You would then use affine transformations to move the cone to the right place and orientation.
My own ray tracer (which thus far only supports planes, boxes and spheres) has the same problem, and implementing transformation matrices is my next task.