Related
I am working on embedded programming with written code by other people.
this algorithm be used in calculate average for mic and accelerometer
sound_value_Avg = 0;
sound_value = 0;
memset((char *)soundRaw, 0x00, SOUND_COUNT*2);
for(int i2=0; i2 < SOUND_COUNT; i2++)
{
soundRaw[i2] = analogRead(PIN_ANALOG_IN);
if (i2 == 0)
{
sound_value_Avg = soundRaw[i2];
}
else
{
sound_value_Avg = (sound_value_Avg + soundRaw[i2]) / 2;
}
}
sound_value = sound_value_Avg;
acceleromter is similar to this
n1=p1
(n2+p1)/2 = p2
(n3+p2)/2 = p3
(n4+p3)/2 = p4
...
avg(n1~nx)=px
it not seems to be correct.
can someone explain why he used this algorithm?
is it specific way for sin graph? like noise, vibration?
It appears to be a flawed attempt at maintaining a cumulative mean. The error is in believing that:
An+1 = (An + sn) / 2
when in fact it should be:
An+1 = ((An * n) + s) / (n + 1)
However it is computationally simpler to maintain a running sum and generate an average in the usual manner:
S = S + s
An = S / n
It is possible that the intent was to avoid overflow when the sum grows large, but the attempt is mathematically flawed.
To see how wrong this statement is consider:
True
n s Running Avg. (An + sn) / 2
--------------------------------------
1 20 20 20
2 21 20.5 20.25
3 22 21 20.625
In this case however, nothing is done with the intermediate mean value, so you don'e in fact need to maintain a running mean at all. You simply need to accumulate a running sum and calculate the average at the end. For example:
sum = 0 ;
sound_value = 0 ;
for( int i2 = 0; i2 < SOUND_COUNT; i2++ )
{
soundRaw[i2] = analogRead( PIN_ANALOG_IN ) ;
sum += soundRaw[i2] ;
}
sound_value = sum / SOUND_COUNT ;
In this you do need to make sure that the data type forsum can accommodate a value of the maximum analogRead() return multiplied by SOUND_COUNT.
However you say that this is used for some sort of signal conditioning or processing of both a microphone and an accelerator. These devices have rather dissimilar bandwidth and dynamics, and it seems rather unlikely that the same filter would suit both. Applying robust DSP techniques such as IIR or FIR filters with suitably calculated coefficients would make a great deal more sense. You'd also need a suitable fixed sample rate that I am willing to bet is not achieved by simply reading the ADC in a loop with no specific timing
I am parsing through data on an sd card one large chunk at a time on a microcontroller. It is accelerometer data so it is oscillating constantly. At certain points huge oscillations occur (shown in graph). I need an algorithm to be able to detect these large oscillations, more so, determine the range of data that contains this spike.
I have some example data:
This is the overall graph, there is only one spike of interest, the first one.
Here it is zoomed in a bit
As you can see it is a large oscillation that produces the spike.
So any algorithm that can scan through a data set and determine the portion of data that contains a spike relative to some threshold would be great. This data set is about 50,000 samples, each sample is 32 bits long. I have ample RAM to be able to hold this much data.
Thanks!
For the following signal:
If you take the absolute value of the differential between two consecutive samples, you get:
That is not quite good enough to unambiguously distinguish from the minor "unsustained" disturbances. But if you then take a simple moving sum (a leaky integrator) of the abs-differentials. Here a window width of 4 diff-samples was used:
The moving average introduces a lag or phase shift, which in cases where the data is stored and processing is not real-time can easily be compensated for by subtracting half the window width from the timing:
For real-time processing if the lag is critical a more sophisticated IIR filter might be appropriate. Anyhow a clear threshold can be selected from this data.
In code for the above data set:
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>
#include <stdlib.h>
static int32_t dataset[] = { 0,0,0,0,0,3,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,3,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,
0,-10,-15,-5,20,25,50,-10,-20,-30,0,30,5,-5,
0,0,5,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,1,0,0,0,0,6,0,0,0,0,0,0,0} ;
#define DATA_LEN (sizeof(dataset)/sizeof(*dataset))
#define WINDOW_WIDTH 4
#define THRESHOLD 15
int main()
{
uint32_t window[WINDOW_WIDTH] = {0} ;
int window_index = 0 ;
int window_sum = 0 ;
bool spike = false ;
for( int s = 1; s < DATA_LEN ; s++ )
{
uint32_t diff = abs( dataset[s] - dataset[s-1] ) ;
window_sum -= window[window_index] ;
window[window_index] = diff ;
window_index++ ;
window_index %= WINDOW_WIDTH ;
window_sum += diff ;
if( !spike && window_sum >= THRESHOLD )
{
spike = true ;
printf( "Spike START # %d\n", s - WINDOW_WIDTH / 2 ) ;
}
else if( spike && window_sum < THRESHOLD )
{
spike = false ;
printf( "Spike END # %d\n", s - WINDOW_WIDTH / 2 ) ;
}
}
return 0;
}
The output is:
Spike START # 66
Spike END # 82
https://onlinegdb.com/ryEw69jJH
Comparing the original data with the detection threshold gives:
For your real data, you will need to select a suitable window width and threshold to get the desired result, both of which will depend on the bandwidth and amplitude of the disturbances you wish to detect.
Also you may need to guard against arithmetic overflow if your samples are of sufficient magnitude. They need to be less than 232 / window-width to guarantee no overflow in the integrator. Alternatively you could use floating-point or uint64_t for the window type, or add code to deal with saturation.
You could look at statistical analysis. Calculating the standard deviation over the data set and then checking when your data goes out of bound.
You can choose to do this in two way's; either you use a running average over a fixed (relatively small) number of samples or you take the average over the whole data set. As I see multiple spikes in your set I would suggest the first. This way you can possibly stop processing (and later continue) every time you find a spike.
For your purpose you do not really need to calculate the standard deviation sigma. You could actually leave it at the squared of sigma. This will give you a slight performance optimization not having to calculate the square root.
Some pseudo code:
// The data set.
int x[N];
// The number of samples in your mean and std calculation.
int M <= N;
// Simga at index i over the previous M samples.
int sigma_i = sqrt( sum( pow(x[i] - mean(x,M), 2) ) / M );
// Or the squared of sigma
int sigma_squared_i = sum( pow(x[i] - mean(x,M), 2) ) / M;
The disadvantage of this method is that you need to set a threshold for the value of sigma at which you trigger. However it is very safe to say that when setting the threshold at 4 or 5 times your average sigma you will have an usable system.
Managed to get a working algorithm. Basically, determine the average difference between data points. If my data starts to exceed some multiple of that value consecutively then most likely there is a spike occurring.
I have recently implemented a library of CORDIC functions to reduce the required computational power (my project is based on a PowerPC and is extremely strict in its execution time specifications). The language is ANSI-C.
The other functions (sin/cos/atan) work within accuracy limits both in 32 and in 64 bit implementations.
Unfortunately, the asin() function fails systematically for certain inputs.
For testing purposes I have implemented an .h file to be used in a simulink S-Function. (This is only for my convenience, you can compile the following as a standalone .exe with minimal changes)
Note: I have forced 32 iterations because I am working in 32 bit precision and the maximum possible accuracy is required.
Cordic.h:
#include <stdio.h>
#include <stdlib.h>
#define FLOAT32 float
#define INT32 signed long int
#define BIT_XOR ^
#define CORDIC_1K_32 0x26DD3B6A
#define MUL_32 1073741824.0F /*needed to scale float -> int*/
#define INV_MUL_32 9.313225746E-10F /*needed to scale int -> float*/
INT32 CORDIC_CTAB_32 [] = {0x3243f6a8, 0x1dac6705, 0x0fadbafc, 0x07f56ea6, 0x03feab76, 0x01ffd55b, 0x00fffaaa, 0x007fff55,
0x003fffea, 0x001ffffd, 0x000fffff, 0x0007ffff, 0x0003ffff, 0x0001ffff, 0x0000ffff, 0x00007fff,
0x00003fff, 0x00001fff, 0x00000fff, 0x000007ff, 0x000003ff, 0x000001ff, 0x000000ff, 0x0000007f,
0x0000003f, 0x0000001f, 0x0000000f, 0x00000008, 0x00000004, 0x00000002, 0x00000001, 0x00000000};
/* CORDIC Arcsine Core: vectoring mode */
INT32 CORDIC_asin(INT32 arc_in)
{
INT32 k;
INT32 d;
INT32 tx;
INT32 ty;
INT32 x;
INT32 y;
INT32 z;
x=CORDIC_1K_32;
y=0;
z=0;
for (k=0; k<32; ++k)
{
d = (arc_in - y)>>(31);
tx = x - (((y>>k) BIT_XOR d) - d);
ty = y + (((x>>k) BIT_XOR d) - d);
z += ((CORDIC_CTAB_32[k] BIT_XOR d) - d);
x = tx;
y = ty;
}
return z;
}
/* Wrapper function for scaling in-out of cordic core*/
FLOAT32 asin_wrap(FLOAT32 arc)
{
return ((FLOAT32)(CORDIC_asin((INT32)(arc*MUL_32))*INV_MUL_32));
}
This can be called in a manner similar to:
#include "Cordic.h"
#include "math.h"
void main()
{
y1 = asin_wrap(value_32); /*my implementation*/
y2 = asinf(value_32); /*standard math.h for comparison*/
}
The results are as shown:
Top left shows the [-1;1] input over 2000 steps (0.001 increments), bottom left the output of my function, bottom right the standard output and top right the difference of the two outputs.
It is immediate to see that the error is not within 32 bit accuracy.
I have analysed the steps performed (and the intermediate results) by my code and it seems to me that at a certain point the value of y is "close enough" to the initial value of arc_in and what could be related to a bit-shift causes the solution to diverge.
My questions:
I am at a loss, is this error inherent in the CORDIC implementation or have I made a mistake in the implementation? I was expecting the decrease of accuracy near the extremes, but those spikes in the middle are quite unexpected. (the most notable ones are just beyond +/- 0.6, but even removed these there are more at smaller values, albeit not as pronounced)
If it is something part of the CORDIC implementation, are there known workarounds?
EDIT:
Since some comment mention it, yes, I tested the definition of INT32, even writing
#define INT32 int32_T
does not change the results by the slightest amount.
The computation time on the target hardware has been measured by hundreds of repetitions of block of 10.000 iterations of the function with random input in the validity range. The observed mean results (for one call of the function) are as follows:
math.h asinf() 100.00 microseconds
CORDIC asin() 5.15 microseconds
(apparently the previous test had been faulty, a new cross-test has obtained no better than an average of 100 microseconds across the validity range)
I apparently found a better implementation. It can be downloaded in matlab version here and in C here. I will analyse more its inner workings and report later.
To review a few things mentioned in the comments:
The given code outputs values identical to another CORDIC implementation. This includes the stated inaccuracies.
The largest error is as you approach arcsin(1).
The second largest error is that the values of arcsin(0.60726) to arcsin(0.68514) all return 0.754805.
There are some vague references to inaccuracies in the CORDIC method for some functions including arcsin. The given solution is to perform "double-iterations" although I have been unable to get this to work (all values give a large amount of error).
The alternate CORDIC implemention has a comment /* |a| < 0.98 */ in the arcsin() implementation which would seem to reinforce that there is known inaccuracies close to 1.
As a rough comparison of a few different methods consider the following results (all tests performed on a desktop, Windows7 computer using MSVC++ 2010, benchmarks timed using 10M iterations over the arcsin() range 0-1):
Question CORDIC Code: 1050 ms, 0.008 avg error, 0.173 max error
Alternate CORDIC Code (ref): 2600 ms, 0.008 avg error, 0.173 max error
atan() CORDIC Code: 2900 ms, 0.21 avg error, 0.28 max error
CORDIC Using Double-Iterations: 4700 ms, 0.26 avg error, 0.917 max error (???)
Math Built-in asin(): 200 ms, 0 avg error, 0 max error
Rational Approximation (ref): 250 ms, 0.21 avg error, 0.26 max error
Linear Table Lookup (see below) 100 ms, 0.000001 avg error, 0.00003 max error
Taylor Series (7th power, ref): 300 ms, 0.01 avg error, 0.16 max error
These results are on a desktop so how relevant they would be for an embedded system is a good question. If in doubt, profiling/benchmarking on the relevant system would be advised. Most solutions tested don't have very good accuracy over the range (0-1) and all but one are actually slower than the built-in asin() function.
The linear table lookup code is posted below and is my usual method for any expensive mathematical function when speed is desired over accuracy. It simply uses a 1024 element table with linear interpolation. It seems to be both the fastest and most accurate of all methods tested, although the built-in asin() is not much slower really (test it!). It can easily be adjusted for more or less accuracy by changing the size of the table.
// Please test this code before using in anything important!
const size_t ASIN_TABLE_SIZE = 1024;
double asin_table[ASIN_TABLE_SIZE];
int init_asin_table (void)
{
for (size_t i = 0; i < ASIN_TABLE_SIZE; ++i)
{
float f = (float) i / ASIN_TABLE_SIZE;
asin_table[i] = asin(f);
}
return 0;
}
double asin_table (double a)
{
static int s_Init = init_asin_table(); // Call automatically the first time or call it manually
double sign = 1.0;
if (a < 0)
{
a = -a;
sign = -1.0;
}
if (a > 1) return 0;
double fi = a * ASIN_TABLE_SIZE;
double decimal = fi - (int)fi;
size_t i = fi;
if (i >= ASIN_TABLE_SIZE-1) return Sign * 3.14159265359/2;
return Sign * ((1.0 - decimal)*asin_table[i] + decimal*asin_table[i+1]);
}
The "single rotate" arcsine goes badly wrong when the argument is just greater than the initial value of 'x', where that is the magical scaling factor -- 1/An ~= 0.607252935 ~= 0x26DD3B6A.
This is because, for all arguments > 0, the first step always has y = 0 < arg, so d = +1, which sets y = 1/An, and leaves x = 1/An. Looking at the second step:
if arg <= 1/An, then d = -1, and the steps which follow converge to a good answer
if arg > 1/An, then d = +1, and this step moves further away from the right answer, and for a range of values a little bigger than 1/An, the subsequent steps all have d = -1, but are unable to correct the result :-(
I found:
arg = 0.607 (ie 0x26D91687), relative error 7.139E-09 -- OK
arg = 0.608 (ie 0x26E978D5), relative error 1.550E-01 -- APALLING !!
arg = 0.685 (ie 0x2BD70A3D), relative error 2.667E-04 -- BAD !!
arg = 0.686 (ie 0x2BE76C8B), relative error 1.232E-09 -- OK, again
The descriptions of the method warn about abs(arg) >= 0.98 (or so), and I found that somewhere after 0.986 the process fails to converge and the relative error jumps to ~5E-02 and hits 1E-01 (!!) at arg=1 :-(
As you did, I also found that for 0.303 < arg < 0.313 the relative error jumps to ~3E-02, and reduces slowly until things return to normal. (In this case step 2 overshoots so far that the remaining steps cannot correct it.)
So... the single rotate CORDIC for arcsine looks rubbish to me :-(
Added later... when I looked even closer at the single rotate CORDIC, I found many more small regions where the relative error is BAD...
...so I would not touch this as a method at all... it's not just rubbish, it's useless.
BTW: I thoroughly recommend "Software Manual for the Elementary Functions", William Cody and William Waite, Prentice-Hall, 1980. The methods for calculating the functions are not so interesting any more (but there is a thorough, practical discussion of the relevant range-reductions required). However, for each function they give a good test procedure.
The additional source I linked at the end of the question apparently contains the solution.
The proposed code can be reduced to the following:
#define M_PI_2_32 1.57079632F
#define SQRT2_2 7.071067811865476e-001F /* sin(45°) = cos(45°) = sqrt(2)/2 */
FLOAT32 angles[] = {
7.8539816339744830962E-01F, 4.6364760900080611621E-01F, 2.4497866312686415417E-01F, 1.2435499454676143503E-01F,
6.2418809995957348474E-02F, 3.1239833430268276254E-02F, 1.5623728620476830803E-02F, 7.8123410601011112965E-03F,
3.9062301319669718276E-03F, 1.9531225164788186851E-03F, 9.7656218955931943040E-04F, 4.8828121119489827547E-04F,
2.4414062014936176402E-04F, 1.2207031189367020424E-04F, 6.1035156174208775022E-05F, 3.0517578115526096862E-05F,
1.5258789061315762107E-05F, 7.6293945311019702634E-06F, 3.8146972656064962829E-06F, 1.9073486328101870354E-06F,
9.5367431640596087942E-07F, 4.7683715820308885993E-07F, 2.3841857910155798249E-07F, 1.1920928955078068531E-07F,
5.9604644775390554414E-08F, 2.9802322387695303677E-08F, 1.4901161193847655147E-08F, 7.4505805969238279871E-09F,
3.7252902984619140453E-09F, 1.8626451492309570291E-09F, 9.3132257461547851536E-10F, 4.6566128730773925778E-10F};
FLOAT32 arcsin_cordic(FLOAT32 t)
{
INT32 i;
INT32 j;
INT32 flip;
FLOAT32 poweroftwo;
FLOAT32 sigma;
FLOAT32 sign_or;
FLOAT32 theta;
FLOAT32 x1;
FLOAT32 x2;
FLOAT32 y1;
FLOAT32 y2;
flip = 0;
theta = 0.0F;
x1 = 1.0F;
y1 = 0.0F;
poweroftwo = 1.0F;
/* If the angle is small, use the small angle approximation */
if ((t >= -0.002F) && (t <= 0.002F))
{
return t;
}
if (t >= 0.0F)
{
sign_or = 1.0F;
}
else
{
sign_or = -1.0F;
}
/* The inv_sqrt() is the famous Fast Inverse Square Root from the Quake 3 engine
here used with 3 (!!) Newton iterations */
if ((t >= SQRT2_2) || (t <= -SQRT2_2))
{
t = 1.0F/inv_sqrt(1-t*t);
flip = 1;
}
if (t>=0.0F)
{
sign_or = 1.0F;
}
else
{
sign_or = -1.0F;
}
for ( j = 0; j < 32; j++ )
{
if (y1 > t)
{
sigma = -1.0F;
}
else
{
sigma = 1.0F;
}
/* Here a double iteration is done */
x2 = x1 - (sigma * poweroftwo * y1);
y2 = (sigma * poweroftwo * x1) + y1;
x1 = x2 - (sigma * poweroftwo * y2);
y1 = (sigma * poweroftwo * x2) + y2;
theta += 2.0F * sigma * angles[j];
t *= (1.0F + poweroftwo * poweroftwo);
poweroftwo *= 0.5F;
}
/* Remove bias */
theta -= sign_or*4.85E-8F;
if (flip)
{
theta = sign_or*(M_PI_2_32-theta);
}
return theta;
}
The following is to be noted:
It is a "Double-Iteration" CORDIC implementation.
The angles table thus differs in construction from the old table.
And the computation is done in floating point notation, this will cause a major increase in computation time on the target hardware.
A small bias is present in the output, removed via the theta -= sign_or*4.85E-8F; passage.
The following picture shows the absolute (left) and relative errors (right) of the old implementation (top) vs the implementation contained in this answer (bottom).
The relative error is obtained only by dividing the CORDIC output with the output of the built-in math.h implementation. It is plotted around 1 and not 0 for this reason.
The peak relative error (when not dividing by zero) is 1.0728836e-006.
The average relative error is 2.0253509e-007 (almost in accordance to 32 bit accuracy).
For convergence of iterative process it is necessary that any "wrong" i-th
iteration could be "corrected" in the subsequent (i+1)-th, (i+2)-th, (i+3)-th,
etc. etc. iterations. Or, in other words, at least a half of the "wrong"
i-th iteration could be corrected in the next (i+1)-th iteration.
For atan(1/2^i) this condition is satisfied, i.e.:
atan(1/2^(i+1)) > 1/2*atan(1/2^i)
Read more at
http://cordic-bibliography.blogspot.com/p/double-iterations-in-cordic.html
and:
http://baykov.de/CORDIC1972.htm
(note I'm the author of those pages)
I've got some code, and I've been trying to make some minor tweaks to it. It used to use fgets to load in a single character from a line, and use it to colour points in a 3D plot. So it would read
a
p
p
n
c
and then use other data files to assign what x, y, z points to give these. The result is a really pretty 3D plot.
I've edited the input file so it reads
0
1
1
0
2
2
0
and I want it to colour numbers the same colour.
This is where I've gotten so far with the code:
function PlotCluster(mcStep)
clear all
filename = input('Please enter filename: ', 's');
disp('Loading hopping site coordinates ...')
load x.dat
load y.dat
load z.dat
temp = z;
z = x;
x = temp;
n_sites = length(x);
disp('Loading hopping site types ...')
fp = fopen([filename]);
data = load(filename); %# Load the data
% Plot the devices
% ----------------
disp('Plotting the sample surface ...')
figure
disp('Hello world!')
ia = data == 0;
in = data == 1;
ip = data == 2;
disp('Hello Again')
plot3(x(ia),y(ia),z(ia),'b.') %,'MarkerSize',4)
hold on
plot3(x(ic),y(ic),z(ic),'b.') %,'MarkerSize',4)
plot3(x(in),y(in),z(in),'g.') %,'MarkerSize',4)
plot3(x(ip),y(ip),z(ip),'r.') %,'MarkerSize',4)
daspect([1 1 1])
set(gca,'Projection','Perspective')
set(gca,'FontSize',16)
axis tight
xlabel('z (nm)','FontSize',18)
ylabel('y (nm)','FontSize',18)
zlabel('x (nm)','FontSize',18)
%title(['Metropolis Monte Carlo step ' num2str(mcStep)])
view([126.5 23])
My issue is I'm getting this error
Index exceeds matrix dimensions.
Error in PlotCluster (line 34)
plot3(x(ia),y(ia),z(ia),'b.') %,'MarkerSize',4)
And I don't see why ia would go out of bounds of the x array. Is it to do with changing the fgets to a load statement? It was the only way to get it read the correct numbers in (not 49s and 50s which was very odd.)
The main bits that are sticking me are these lines (where the number used to correspond to 'a','n','p' etc)
ia = data == 0;
in = data == 1;
ip = data == 2;
They look like implied if statements with assignment from data to ia etc. where ia becomes an array. But I'm not sure.
Any help understanding this would be greatly appreciated.
I've fixed the issue, I hadn't updated my input correctly. To clear this up for anyone who comes to this question: ia = data ==0 means 'Make an array the same size as data, and fill it with 1 or 0 depending on if the logic (data == 0) is true or false'
This code intermittently works. It's running on a small microcontroller. It will work fine even after restarting the processor, but if I change some part of the code, it breaks. This makes me think that it's some kind of pointer bug or memory corruption. What's happening is the coordinate, p_res.pos.x is sometimes read as 0 (the incorrect value) and 96 (the correct value) when it is passed to write_circle_outlined. y seems to be correct most of the time. If anyone can spot anything obviously wrong please point it out!
int demo_game()
{
long int d;
int x, y;
struct WorldCamera p_viewer;
struct Point3D_LLA p_subj;
struct Point2D_CalcRes p_res;
p_viewer.hfov = 27;
p_viewer.vfov = 32;
p_viewer.width = 192;
p_viewer.height = 128;
p_viewer.p.lat = 51.26f;
p_viewer.p.lon = -1.0862f;
p_viewer.p.alt = 100.0f;
p_subj.lat = 51.20f;
p_subj.lon = -1.0862f;
p_subj.alt = 100.0f;
while(1)
{
fill_buffer(draw_buffer_mask, 0x0000);
fill_buffer(draw_buffer_level, 0xffff);
compute_3d_transform(&p_viewer, &p_subj, &p_res, 10000.0f);
x = p_res.pos.x;
y = p_res.pos.y;
write_circle_outlined(x, y, 1.0f / p_res.est_dist, 0, 0, 0, 1);
p_viewer.p.lat -= 0.0001f;
//p_viewer.p.alt -= 0.00001f;
d = 20000;
while(d--);
}
return 1;
}
The code for compute_3d_transform is:
void compute_3d_transform(struct WorldCamera *p_viewer, struct Point3D_LLA *p_subj, struct Point2D_CalcRes *res, float cliph)
{
// Estimate the distance to the waypoint. This isn't intended to replace
// proper lat/lon distance algorithms, but provides a general indication
// of how far away our subject is from the camera. It works accurately for
// short distances of less than 1km, but doesn't give distances in any
// meaningful unit (lat/lon distance?)
res->est_dist = hypot2(p_viewer->p.lat - p_subj->lat, p_viewer->p.lon - p_subj->lon);
// Save precious cycles if outside of visible world.
if(res->est_dist > cliph)
goto quick_exit;
// Compute the horizontal angle to the point.
// atan2(y,x) so atan2(lon,lat) and not atan2(lat,lon)!
res->h_angle = RAD2DEG(angle_dist(atan2(p_viewer->p.lon - p_subj->lon, p_viewer->p.lat - p_subj->lat), p_viewer->yaw));
res->small_dist = res->est_dist * 0.0025f; // by trial and error this works well.
// Using the estimated distance and altitude delta we can calculate
// the vertical angle.
res->v_angle = RAD2DEG(atan2(p_viewer->p.alt - p_subj->alt, res->est_dist));
// Normalize the results to fit in the field of view of the camera if
// the point is visible. If they are outside of (0,hfov] or (0,vfov]
// then the point is not visible.
res->h_angle += p_viewer->hfov / 2;
res->v_angle += p_viewer->vfov / 2;
// Set flags.
if(res->h_angle < 0 || res->h_angle > p_viewer->hfov)
res->flags |= X_OVER;
if(res->v_angle < 0 || res->v_angle > p_viewer->vfov)
res->flags |= Y_OVER;
res->pos.x = (res->h_angle / p_viewer->hfov) * p_viewer->width;
res->pos.y = (res->v_angle / p_viewer->vfov) * p_viewer->height;
return;
quick_exit:
res->flags |= X_OVER | Y_OVER;
return;
}
Structure for the results:
typedef struct Point2D_Pixel { unsigned int x, y; };
// Structure for storing calculated results (from camera transforms.)
typedef struct Point2D_CalcRes
{
struct Point2D_Pixel pos;
float h_angle, v_angle, est_dist, small_dist;
int flags;
};
The code is part of an open source project of mine so it's okay to post a lot of code here.
I see some of your calculation depends on p_viewer->yaw, but I do not see any intialization for p_viewer->yaw. Is this your problem?
A couple of things that seem sketchy:
You can return from compute_3d_transform without setting many of the fields in p_res/res but the caller never checks for this situation.
You consistently read from res->flags without initializing it first.
Whenever the output differs, it possibly means some value is not initialized and the outcome depends on the garbage value present in a variable. Keeping that in mind, I looked for uninitialized variables. the structure p_res is not initialized.
if(res->est_dist > cliph)
goto quick_exit;
that means if condition may turn out to be true or false depending on what garbage value is stored in res->est_dist. When if condition turns out to true, it goes straight to quick_exit label and doesn't update p_res.pos.x. If condition turned out to be false then its updated.
When I used to program C, I would use a divide and conquer debugging technique for this kind of problem to try to isolate the offending operation (paying attention to whether the symptoms change as debugging code is added, which is indicative of dangling pointer type bugs).
Essentially, start with the first line where the value is known to be good (and prove that it is consistently good at that line). Then identify where is it known to be bad. Then approx. halfway between the two points insert a test to see if it's bad. If not, then insert a test halfway between the mid-point and the known bad location, if it is bad then insert a test halfway between the mid-point and the known good location, and so on.
If the line identified is itself a function call, this process can be repeated in that called function, and so on.
When using this kind of approach, it's important to minimize the amount of added code and the artificial "noise", which can create timing changes.
Use this if you don't have (or can't use) an interactive debugger, or if the problem does not manifest when using one.