How to reference a previous iteration in a for loop? - c

I'm writing some code to draw two 8-pixel-long lines on an LCD, end to end. I would like to do this using a for loop, but I am stuck working out how to connect the start of the second line to the end of the first. The following code produces the pattern I am after, but it gets very repetitive when drawing many lines:
void draw_road(){
double angle = PI/2;
double length = 8;
int starting_x = 24;
int starting_y = 48;
double x1a = starting_x;
double y1a = starting_y;
double x2a = x1a + (cos(angle) * length);
double y2a = y1a - (sin(angle) * length);
draw_line(x1a, y1a, x2a, y2a, FG_COLOUR);
double x1b = x2a ;
double y1b = y2a;
double x2b = x1b + (cos(angle-(angle/4.5)) * length);
double y2b = y1b - (sin(angle-(angle/4.5)) * length);
draw_line(x1b, y1b, x2b, y2b, FG_COLOUR);
}
I have tried the code below, however I don't think it knows where to look for [i-1].
void draw_road(){
double angle = PI/2;
double length = 8;
int starting_x = 24;
int starting_y = 48;
double x1[2];
double y1[2];
double x2[2];
double y2[2];
for (int i = 0; i < 2; i++){
x1[i] = starting_x + x2[i-1];
y1[i] = starting_y + y2[i-1];
x2[i] = x1[i] + (cos(angle) * length);
y2[i] = y1[i] - (sin(angle) * length);
draw_line(x1[i], y1[i], x2[i], y2[i], FG_COLOUR);
angle /= 2;
}
}
How can I correct this so the for loop knows the values of the last loop (especially if it is the very first loop)?

In the first iteration there is no "previous" position, so there is no line to draw; you only need to set the starting point.
An if around the call to draw_line, plus conditional operators to distinguish "setting the starting point" from "calculating the next point", could do the job:
for (int i = 0; i < 2; i++){
x1[i] = (i>0) ? starting_x + x2[i-1] : starting_x;
...
if (i>0) {
draw_line(...);
}
}

Related

How to create a simple IIR low-pass filter without rounding errors? (16-bit PCM data)

I have an array of length n filled with 16-bit (int16) raw PCM data. The data is stereo at a 44100 Hz sample rate, so the array holds 2 bytes for the left channel, then 2 bytes for the right channel, and so on. I tried to implement a simple low-pass filter by converting the samples to floating point in the range -1 to 1. The low pass works, but rounding errors cause little pops in the sound.
Right now I simply do this:
INT32 left_id = 0;
INT32 right_id = 1;
DOUBLE filtered_l_db = 0.0;
DOUBLE filtered_r_db = 0.0;
DOUBLE last_filtered_left = 0;
DOUBLE last_filtered_right = 0;
DOUBLE l_db = 0.0;
DOUBLE r_db = 0.0;
DOUBLE low_filter = filter_freq(core->audio->low_pass_cut);
for(UINT32 a = 0; a < (buffer_size/2);++a)
{
l_db = ((DOUBLE)input_buffer[left_id]) / (DOUBLE)32768;
r_db = ((DOUBLE)input_buffer[right_id]) / (DOUBLE)32768;
///////////////LOW PASS
filtered_l_db = last_filtered_left +
(low_filter * (l_db -last_filtered_left ));
filtered_r_db = last_filtered_right +
(low_filter * (r_db - last_filtered_right));
last_filtered_left = filtered_l_db;
last_filtered_right = filtered_r_db;
INT16 l = (INT16)(filtered_l_db * (DOUBLE)32768);
INT16 r = (INT16)(filtered_r_db * (DOUBLE)32768);
output_buffer[left_id] = (output_buffer[left_id] + l);
output_buffer[right_id] = (output_buffer[right_id] + r);
left_id +=2;
right_id +=2;
}
PS: the input buffer is an int16 array with the PCM data from -32767 to 32767.
I found this function here:
Low Pass filter in C
and it was the only one that I could understand.
DOUBLE filter_freq(DOUBLE cut_freq)
{
DOUBLE a = 1.0/(cut_freq * 2 * PI);
DOUBLE b = 1.0/SAMPLE_RATE;
return b/(a+b);
}
My aim is instead to have absolute precision on the wave and to low-pass directly using only integers, at the cost of losing resolution in the filter (and I'm OK with that). I have seen a lot of examples but really didn't understand any of them. Would someone be kind enough to explain how this is done, as simply as possible (in code or pseudocode)? Thank you.
Assuming the result of function filter_freq can be written as a fraction m/n your filter calculation basically is
y_new = y_old + (m/n) * (x - y_old);
which can be transformed to
y_new = ((n * y_old) + m * (x - y_old)) / n;
The integer division / n truncates the result towards 0. If you want rounding instead of truncation you can implement it as
y_tmp = ((n * y_old) + m * (x - y_old));
if(y_tmp < 0) y_tmp -= (n / 2);
else y_tmp += (n / 2);
y_new = y_tmp / n;
In order to avoid losing precision from dividing the result by n in one step and multiplying it by n in the next step you can save the value y_tmp before the division and use it in the next cycle.
y_tmp = (y_tmp + m * (x - y_old));
if(y_tmp < 0) y_new = y_tmp - (n / 2);
else y_new = y_tmp + (n / 2);
y_new /= n;
If your input data is int16_t I suggest to implement the calculation using int32_t to avoid overflows.
I tried to convert the filter in your code without checking other parts for possible problems.
INT32 left_id = 0;
INT32 right_id = 1;
int32_t filtered_l_out = 0; // output value after division
int32_t filtered_r_out = 0;
int32_t filtered_l_tmp = 0; // used to keep the output value before division
int32_t filtered_r_tmp = 0;
int32_t l_in = 0; // input value
int32_t r_in = 0;
DOUBLE low_filter = filter_freq(core->audio->low_pass_cut);
// define denominator and calculate numerator
// use power of 2 to allow bit-shift instead of division
const uint32_t filter_shift = 16U;
const int32_t filter_n = 1U << filter_shift;
int32_t filter_m = (int32_t)(low_filter * filter_n);
for(UINT32 a = 0; a < (buffer_size/2);++a)
{
l_in = input_buffer[left_id];
r_in = input_buffer[right_id];
///////////////LOW PASS
filtered_l_tmp = filtered_l_tmp + filter_m * (l_in - filtered_l_out);
// round to nearest: the offset depends on the sign of the value
if(filtered_l_tmp < 0) {
filtered_l_out = filtered_l_tmp - filter_n/2;
} else {
filtered_l_out = filtered_l_tmp + filter_n/2;
}
// divide (truncates towards 0) so the sign-dependent offset rounds correctly
filtered_l_out /= filter_n;
//filtered_l_out >>= filter_shift; // a shift rounds negative values downwards
/* same calculation for right */
INT16 l = (INT16)(filtered_l_out);
INT16 r = (INT16)(filtered_r_out);
output_buffer[left_id] = (output_buffer[left_id] + l);
output_buffer[right_id] = (output_buffer[right_id] + r);
left_id +=2;
right_id +=2;
}
As your filter is initialized with 0 it may need several samples to follow a possible step to the first input value. Depending on your data it might be better to initialize the filter based on the first input value.
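As a rough, self-contained sketch of the scheme described above (keep the undivided accumulator between samples, round only when producing the output), one channel could look like the following. The names lowpass, lp_init, lp_step and LP_SHIFT are hypothetical, and the 64-bit accumulator is an assumption made here so that m * (x - y_old) cannot overflow when the coefficient is close to 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LP_SHIFT 16                        /* denominator n = 2^16 */
#define LP_N ((int64_t)1 << LP_SHIFT)

typedef struct {
    int64_t acc;   /* filter state before division, i.e. y * n */
    int32_t out;   /* last rounded output y_old                */
    int32_t m;     /* numerator of the coefficient m / n       */
} lowpass;

/* Divide by n, rounding to nearest (halves away from zero). */
static int32_t lp_round(int64_t v)
{
    return (int32_t)((v < 0 ? v - LP_N / 2 : v + LP_N / 2) / LP_N);
}

/* coeff is the value returned by filter_freq(); first seeds the state. */
static void lp_init(lowpass *f, double coeff, int16_t first)
{
    assert(f != NULL && coeff >= 0.0 && coeff <= 1.0);
    f->m = (int32_t)(coeff * (double)LP_N + 0.5);
    f->out = first;
    f->acc = (int64_t)first * LP_N;
}

/* One sample: y_tmp += m * (x - y_old); y_new = round(y_tmp / n). */
static int16_t lp_step(lowpass *f, int16_t x)
{
    f->acc += (int64_t)f->m * ((int32_t)x - f->out);
    f->out = lp_round(f->acc);
    return (int16_t)f->out;
}
```

For stereo data, one lowpass instance per channel would be stepped with every second sample of the interleaved buffer. Seeding the state with the first input sample follows the initialization remark above.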

Writing a wave generator with SDL

I've coded a simple sequencer in C with SDL 1.2 and SDL_mixer (to play .wav files). It works well, and I want to add some audio synthesis to this program. I looked this up and found this sine wave code using SDL2 (https://github.com/lundstroem/synth-samples-sdl2/blob/master/src/synth_samples_sdl2_2.c)
Here's how the sinewave is coded in the program:
static void build_sine_table(int16_t *data, int wave_length)
{
/*
Build sine table to use as oscillator:
Generate a 16bit signed integer sinewave table with 1024 samples.
This table will be used to produce the notes.
Different notes will be created by stepping through
the table at different intervals (phase).
*/
double phase_increment = (2.0f * pi) / (double)wave_length;
double current_phase = 0;
for(int i = 0; i < wave_length; i++) {
int sample = (int)(sin(current_phase) * INT16_MAX);
data[i] = (int16_t)sample;
current_phase += phase_increment;
}
}
static double get_pitch(double note) {
/*
Calculate pitch from note value.
offset note by 57 halfnotes to get correct pitch from the range we have chosen for the notes.
*/
double p = pow(chromatic_ratio, note - 57);
p *= 440;
return p;
}
static void audio_callback(void *unused, Uint8 *byte_stream, int byte_stream_length) {
/*
This function is called whenever the audio buffer needs to be filled to allow
for a continuous stream of audio.
Write samples to byteStream according to byteStreamLength.
The audio buffer is interleaved, meaning that both left and right channels exist in the same
buffer.
*/
// zero the buffer
memset(byte_stream, 0, byte_stream_length);
if(quit) {
return;
}
// cast buffer as 16bit signed int.
Sint16 *s_byte_stream = (Sint16*)byte_stream;
// buffer is interleaved, so get the length of 1 channel.
int remain = byte_stream_length / 2;
// split the rendering up in chunks to make it buffersize agnostic.
long chunk_size = 64;
int iterations = remain/chunk_size;
for(long i = 0; i < iterations; i++) {
long begin = i*chunk_size;
long end = (i*chunk_size) + chunk_size;
write_samples(s_byte_stream, begin, end, chunk_size);
}
}
static void write_samples(int16_t *s_byteStream, long begin, long end, long length) {
if(note > 0) {
double d_sample_rate = sample_rate;
double d_table_length = table_length;
double d_note = note;
/*
get correct phase increment for note depending on sample rate and table length.
*/
double phase_increment = (get_pitch(d_note) / d_sample_rate) * d_table_length;
/*
loop through the buffer and write samples.
*/
for (int i = 0; i < length; i+=2) {
phase_double += phase_increment;
phase_int = (int)phase_double;
if(phase_double >= table_length) {
double diff = phase_double - table_length;
phase_double = diff;
phase_int = (int)diff;
}
if(phase_int < table_length && phase_int > -1) {
if(s_byteStream != NULL) {
int16_t sample = sine_wave_table[phase_int];
sample *= 0.6; // scale volume.
s_byteStream[i+begin] = sample; // left channel
s_byteStream[i+begin+1] = sample; // right channel
}
}
}
}
}
I don't understand how I could change the sine wave formula to generate other waveforms such as square, triangle, saw, etc.
EDIT:
Because I forgot to explain it, here's what I tried.
I followed the example I've seen on this video series(https://www.youtube.com/watch?v=tgamhuQnOkM). The source code of the method provided by the video is on github, and the wave generation code is looking like this:
double w(double dHertz)
{
return dHertz * 2.0 * PI;
}
// General purpose oscillator
double osc(double dHertz, double dTime, int nType = OSC_SINE)
{
switch (nType)
{
case OSC_SINE: // Sine wave bewteen -1 and +1
return sin(w(dHertz) * dTime);
case OSC_SQUARE: // Square wave between -1 and +1
return sin(w(dHertz) * dTime) > 0 ? 1.0 : -1.0;
case OSC_TRIANGLE: // Triangle wave between -1 and +1
return asin(sin(w(dHertz) * dTime)) * (2.0 / PI);
}
Because the C++ code here uses the Windows sound API, I could not copy/paste this method into the SDL2 code I found.
So I tried this in order to obtain a square wave:
static void build_sine_table(int16_t *data, int wave_length)
{
double phase_increment = ((2.0f * pi) / (double)wave_length) > 0 ? 1.0 : -1.0;
double current_phase = 0;
for(int i = 0; i < wave_length; i++) {
int sample = (int)(sin(current_phase) * INT16_MAX);
data[i] = (int16_t)sample;
current_phase += phase_increment;
}
}
This didn't give me a square wave, but something more like a saw wave.
Here's what I tried to get a triangle wave:
static void build_sine_table(int16_t *data, int wave_length)
{
double phase_increment = (2.0f * pi) / (double)wave_length;
double current_phase = 0;
for(int i = 0; i < wave_length; i++) {
int sample = (int)(asin(sin(current_phase) * INT16_MAX)) * (2 / pi);
data[i] = (int16_t)sample;
current_phase += phase_increment;
}
}
This also gave me another type of waveform, not triangle.
You’d replace the sin function call with a call to one of the following:
// this is a helper function only
double normalize(double phase)
{
double cycles = phase/(2.0*M_PI);
phase -= trunc(cycles) * 2.0 * M_PI;
if (phase < 0) phase += 2.0*M_PI;
return phase;
}
double square(double phase)
{ return (normalize(phase) < M_PI) ? 1.0 : -1.0; }
double sawtooth(double phase)
{ return -1.0 + normalize(phase) / M_PI; }
double triangle(double phase)
{
phase = normalize(phase);
if (phase >= M_PI)
phase = 2*M_PI - phase;
return -1.0 + 2.0 * phase / M_PI;
}
You’d be building tables just like you did for the sine, except they’d be the square, sawtooth and triangle tables, respectively.
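A minimal sketch of such a table builder, assuming the same 16-bit table format as build_sine_table in the question. The name build_wave_table and the function-pointer parameter are hypothetical. Unlike the question's loop, the phase is computed as 2*PI*i/wave_length instead of being accumulated, so it always stays in [0, 2*PI) and the waveform functions need no normalization step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static const double PI = 3.14159265358979323846;

/* Waveform functions for a phase that is already in [0, 2*PI). */
static double square_wave(double phase)
{
    return (phase < PI) ? 1.0 : -1.0;
}

static double triangle_wave(double phase)
{
    if (phase >= PI)
        phase = 2.0 * PI - phase;      /* mirror the second half */
    return -1.0 + 2.0 * phase / PI;
}

/* Same idea as build_sine_table, with the waveform passed in. */
static void build_wave_table(int16_t *data, int wave_length,
                             double (*wave)(double))
{
    assert(data != NULL && wave != NULL && wave_length > 0);
    for (int i = 0; i < wave_length; i++) {
        double phase = 2.0 * PI * i / wave_length;
        data[i] = (int16_t)(wave(phase) * INT16_MAX);
    }
}
```

The rest of the synth (write_samples and the phase stepping) would stay unchanged; only the table it reads from differs.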

C: Accessing lookup tables faster?

I have a piece of code that traces 4 sines at a time.
My original code was making roughly 12000 sin() function calls per frame and was running at 30 fps.
I tried optimizing it by generating lookup tables. I ended up with 16 different lookup tables. I declared and load them in a separate header file at the top of my program. Each table is declared like so:
static const float d4_lookup[800] = {...};
Now, with this new method I actually lost fps?! I'm running at 20 fps now instead of 30. Each frame now only has to do 8 sin / cos calls and 19200 lookup calls vs 12000 sin() calls.
I compile using gcc with -O3 flag on. At the moment, the lookup tables are included at the top and are part of the global scope of the program.
I assume I'm not loading them in the right memory or something to that effect. How can I speed up the lookup time?
** EDIT 1 **
As requested, here's the function that uses the lookup calls, it is called once per frame:
void
update_sines(void)
{
static float c1_sin, c1_cos;
static float c2_sin, c2_cos;
static float c3_sin, c3_cos;
static float c4_sin, c4_cos;
clock_gettime(CLOCK_MONOTONIC, &spec);
s = spec.tv_sec;
ms = spec.tv_nsec * 0.0000001;
etime = concatenate((long)s, ms);
c1_sin = sinf(etime * 0.00525);
c1_cos = cosf(etime * 0.00525);
c2_sin = sinf(etime * 0.007326);
c2_cos = cosf(etime * 0.007326);
c3_sin = sinf(etime * 0.0046);
c3_cos = cosf(etime * 0.0046);
c4_sin = sinf(etime * 0.007992);
c4_cos = cosf(etime * 0.007992);
int k;
for (k = 0; k < 800; ++k)
{
sine1[k] = a1_lookup[k] * ((bx1_sin_lookup[k] * c1_cos) + (c1_sin * bx1_cos_lookup[k])) + d1_lookup[k];
sine2[k] = a2_lookup[k] * ((bx2_sin_lookup[k] * c2_cos) + (c2_sin * bx2_cos_lookup[k])) + d2_lookup[k] + 50;
sine3[k] = a3_lookup[k] * ((bx3_sin_lookup[k] * c3_cos) + (c3_sin * bx3_cos_lookup[k])) + d3_lookup[k];
sine4[k] = a4_lookup[k] * ((bx4_sin_lookup[k] * c4_cos) + (c4_sin * bx4_cos_lookup[k])) + d4_lookup[k] + 50;
}
}
** UPDATE **
For anyone reading this thread, I gave up on this problem. I tried using OpenCL kernels, structs, SIMD instructions as well as all the solutions shown here. In the end the original code that computed the sinf() 12800 per frame worked faster than the lookup tables since the lookup tables didn't fit into the cache. Yet it was still only doing 30 fps. It just had too much going on to keep up with my 60fps expectations. I've decided to take a different direction. Thanks to everyone who contributed to this thread. Most of these solutions would probably work to get some half decent speed improvements but nothing like the 200% speed up I needed here to have the lookup tables work the way I wanted.
Sometimes it's hard to know what's slowing you down, but you are potentially ruining your cache hit rate. You could try a lookup table of structs:
typedef struct
{
float bx1_sin;
float bx2_sin;
float bx3_sin;
float bx4_sin;
float bx1_cos;
/* etc. etc., including sine1, 2, 3, 4 as well */
} lookup_table;
then
lookup_table lookup[800];
Now everything at the kth lookup will be in the same small chunk of memory.
Also, if you use a macro (or an inline function) that takes k as a parameter and performs the contents of the loop, say SINE_CALC(k),
you can do
for (k = 0; k < 800; ) /* k is advanced only inside the body; 800 is a multiple of 5 */
{
SINE_CALC(k); k++;
SINE_CALC(k); k++;
SINE_CALC(k); k++;
SINE_CALC(k); k++;
SINE_CALC(k); k++;
}
If you use a macro, make sure the k++ is outside the macro call, as shown.
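The struct-per-index layout suggested above might be sketched as follows for a single curve. The sine_entry type, its field names and eval_sine are hypothetical, since the answer leaves the full field list open. Whether this is actually faster than separate arrays depends on the access pattern and would need to be measured.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record: all per-index constants of one curve sit together. */
typedef struct {
    float a;        /* amplitude          */
    float bx_sin;   /* precomputed sin(bx) */
    float bx_cos;   /* precomputed cos(bx) */
    float d;        /* offset             */
} sine_entry;

/* out[k] = a * sin(bx + c) + d, expanded with the angle-sum identity. */
static void eval_sine(const sine_entry *table, size_t n,
                      float c_sin, float c_cos, float *out)
{
    assert(table != NULL && out != NULL);
    for (size_t k = 0; k < n; ++k) {
        const sine_entry *e = &table[k];
        out[k] = e->a * (e->bx_sin * c_cos + c_sin * e->bx_cos) + e->d;
    }
}
```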
Try splitting your loop into several passes like this:
for (k = 0; k < 800; ++k)
{
sine1[k] = a1_lookup[k];
sine2[k] = a2_lookup[k];
sine3[k] = a3_lookup[k];
sine4[k] = a4_lookup[k];
}
for (k = 0; k < 800; ++k)
{
sine1[k] *= ((bx1_sin_lookup[k] * c1_cos) + (c1_sin * bx1_cos_lookup[k]));
sine2[k] *= ((bx2_sin_lookup[k] * c2_cos) + (c2_sin * bx2_cos_lookup[k]));
sine3[k] *= ((bx3_sin_lookup[k] * c3_cos) + (c3_sin * bx3_cos_lookup[k]));
sine4[k] *= ((bx4_sin_lookup[k] * c4_cos) + (c4_sin * bx4_cos_lookup[k]));
}
for (k = 0; k < 800; ++k)
{
sine1[k] += d1_lookup[k];
sine2[k] += d2_lookup[k] + 50;
sine3[k] += d3_lookup[k];
sine4[k] += d4_lookup[k] + 50;
}
By accessing fewer lookup tables in each loop, you should be able to stay in the cache. The middle loop could be split up as well, but you'll need to create an intermediate table for one of the sub-expressions.
Intel processors can predict serial access (and perform prefetch) for up to 4 arrays both for forward and backward traverse. At least this was true in Core 2 Duo days. Split your for in:
for (k = 0; k < 800; ++k)
sine1[k] = a1_lookup[k] * ((bx1_sin_lookup[k] * c1_cos) + (c1_sin * bx1_cos_lookup[k])) + d1_lookup[k];
for (k = 0; k < 800; ++k)
sine2[k] = a2_lookup[k] * ((bx2_sin_lookup[k] * c2_cos) + (c2_sin * bx2_cos_lookup[k])) + d2_lookup[k] + 50;
for (k = 0; k < 800; ++k)
sine3[k] = a3_lookup[k] * ((bx3_sin_lookup[k] * c3_cos) + (c3_sin * bx3_cos_lookup[k])) + d3_lookup[k];
for (k = 0; k < 800; ++k)
sine4[k] = a4_lookup[k] * ((bx4_sin_lookup[k] * c4_cos) + (c4_sin * bx4_cos_lookup[k])) + d4_lookup[k] + 50;
I guess you have more cache load than the benchmarks in other answers, so this does matter. I recommend that you do not unroll loops by hand; compilers do it well.
Using a simple sin lookup table yields a >20% speed increase on my Linux machine (VM, gcc, 64-bit). Interestingly, the size of the lookup table (within reasonable values, below the L1 cache size) does not influence the speed of execution.
Using a fastsin simple implementation from here I got >45% improvement.
Code:
#include <math.h>
#include <stdio.h>
#include <stdint.h>
#include <sys/time.h>
#include <time.h>
#define LOOKUP_SIZE 628
uint64_t currentTimestampUs( void )
{
struct timeval tv;
time_t localTimeRet;
uint64_t timestamp = 0;
//time_t tzDiff = 0;
struct tm when;
int64_t localeOffset = 0;
{
localTimeRet = time(NULL);
localtime_r ( &localTimeRet, &when );
localeOffset = when.tm_gmtoff * 1000000ll;
}
gettimeofday ( &tv, NULL );
timestamp = ((uint64_t)((tv.tv_sec) * 1000000ll) ) + ( (uint64_t)(tv.tv_usec) );
timestamp+=localeOffset;
return timestamp;
}
const double PI = 3.141592653589793238462;
const double PI2 = 3.141592653589793238462 * 2;
static float sinarr[LOOKUP_SIZE + 1]; // +1: sinlookup() can produce index == LOOKUP_SIZE
void initSinArr() {
int a =0;
for (a=0; a<=LOOKUP_SIZE; a++) {
double arg = (1.0*a/LOOKUP_SIZE)*((double)PI * 0.5);
float sinval_f = sin(arg); // double computation earlier to avoid losing precision on value
sinarr[a] = sinval_f;
}
}
float sinlookup(float val) {
float normval = val;
while (normval < 0) {
normval += PI2;
}
while (normval > PI2) {
normval -= PI2;
}
int index = LOOKUP_SIZE*(2*normval/PI);
if (index > 3*LOOKUP_SIZE) {
index = -index + 4*LOOKUP_SIZE;//LOOKUP_SIZE - (index-3*LOOKUP_SIZE);
return -sinarr[index];
} else if (index > 2*LOOKUP_SIZE) {
index = index - 2*LOOKUP_SIZE;
return -sinarr[index];
} else if (index > LOOKUP_SIZE) {
index = 2*LOOKUP_SIZE - index;
return sinarr[index];
} else {
return sinarr[index];
}
}
float sin_fast(float x) {
while (x < -PI)
x += PI2;
while (x > PI)
x -= PI2;
//compute sine
if (x < 0)
return 1.27323954 * x + .405284735 * x * x;
else
return 1.27323954 * x - 0.405284735 * x * x;
}
int main(void) {
initSinArr();
int a = 0;
float val = 0;
const int num_tries = 100000;
uint64_t startLookup = currentTimestampUs();
for (a=0; a<num_tries; a++) {
for (val=0; val<PI2; val+=0.01) {
float compval = sinlookup(val);
(void)compval;
}
}
uint64_t startSin = currentTimestampUs();
for (a=0; a<num_tries; a++) {
for (val=0; val<PI2; val+=0.01) {
float compval = sin(val);
(void)compval;
}
}
uint64_t startFastSin = currentTimestampUs();
for (a=0; a<num_tries; a++) {
for (val=0; val<PI2; val+=0.01) {
float compval = sin_fast(val);
(void)compval;
}
}
uint64_t end = currentTimestampUs();
int64_t lookupMs = (startSin - startLookup)/1000;
int64_t sinMs = (startFastSin - startSin)/1000;
int64_t fastSinMs = (end - startFastSin)/1000;
printf(" lookup: %lld ms\n", (long long)lookupMs );
printf(" sin: %lld ms\n", (long long)sinMs );
printf(" diff: %lld ms\n", (long long)(sinMs-lookupMs));
printf(" diff%%: %lld %%\n", (long long)(100*(sinMs-lookupMs)/sinMs));
printf("fastsin: %lld ms\n", (long long)fastSinMs );
printf(" sin: %lld ms\n", (long long)sinMs );
printf(" diff: %lld ms\n", (long long)(sinMs-fastSinMs));
printf(" diff%%: %lld %%\n", (long long)(100*(sinMs-fastSinMs)/sinMs));
}
Sample result:
lookup: 2276 ms
sin: 3004 ms
diff: 728 ms
diff%: 24 %
fastsin: 1500 ms
sin: 3004 ms
diff: 1504 ms
diff%: 50 %

How to generate a set of points that are equidistant from each other and lie on a circle

I am trying to generate an array of n points that are equidistant from each other and lie on a circle in C. Basically, I need to be able to pass a function the number of points that I would like to generate and get back an array of points.
It's been a really long time since I've done C/C++, so I've had a stab at this more to see how I got on with it, but here's some code that will calculate the points for you. (It's a VS2010 console application)
// CirclePoints.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
#include "stdio.h"
#include "math.h"
int _tmain()
{
int points = 8;
double radius = 100;
double step = ((3.14159265 * 2) / points);
double x, y, current = 0;
for (int i = 0; i < points; i++)
{
x = sin(current) * radius;
y = cos(current) * radius;
printf("point: %d x:%lf y:%lf\n", i, x, y);
current += step;
}
return 0;
}
Try something like this:
void make_circle(float *output, size_t num, float radius)
{
size_t i;
for(i = 0; i < num; i++)
{
const float angle = 2 * M_PI * i / num;
*output++ = radius * cos(angle);
*output++ = radius * sin(angle);
}
}
This is untested, there might be an off-by-one hiding in the angle step calculation but it should be close.
This assumes I understood the question correctly, of course.
UPDATE: Redid the angle computation to not be incrementing, to reduce float precision loss due to repeated addition.
Here's a solution, somewhat optimized, untested. Error can accumulate, but using double rather than float probably more than makes up for it except with extremely large values of n.
void make_circle(double *dest, size_t n, double r)
{
double x0 = cos(2*M_PI/n), y0 = sin(2*M_PI/n), x=x0, y=y0, tmp;
for (;;) {
*dest++ = r*x;
*dest++ = r*y;
if (!--n) break;
tmp = x*x0 - y*y0;
y = x*y0 + y*x0;
x = tmp;
}
}
You have to solve this in c language:
In an x-y Cartesian coordinate system, the circle with centre coordinates (a, b) and radius r is the set of all points (x, y) such that
(x - a)^2 + (y - b)^2 = r^2
Here's a javascript implementation that also takes an optional center point.
function circlePoints (radius, numPoints, centerX, centerY) {
centerX = centerX || 0;
centerY = centerY || 0;
var
step = (Math.PI * 2) / numPoints,
current = 0,
i = 0,
results = [],
x, y;
for (; i < numPoints; i += 1) {
x = centerX + Math.sin(current) * radius;
y = centerY + Math.cos(current) * radius;
results.push([x,y]);
console.log('point %d # x:%d, y:%d', i, x, y);
current += step;
}
return results;
}

fast & efficient least squares fit algorithm in C?

I am trying to implement a linear least squares fit onto 2 arrays of data: time vs amplitude. The only technique I know so far is to test all of the possible m and b points in (y = m*x+b) and then find out which combination fits my data best so that it has the least error. However, I think iterating so many combinations is sometimes useless because it tests out everything. Are there any techniques to speed up the process that I don't know about? Thanks.
Try this code. It fits y = mx + b to your (x,y) data.
The arguments to linreg are
linreg(int n, const REAL x[], const REAL y[], REAL* m, REAL* b, REAL* r)
n = number of data points
x,y = arrays of data
*m = output slope
*b = output intercept
*r = output correlation coefficient (can be NULL if you don't want it)
The return value is 0 on success, !=0 on failure.
Here's the code
#include "linreg.h"
#include <stdlib.h>
#include <math.h> /* math functions */
//#define REAL float
#define REAL double
inline static REAL sqr(REAL x) {
return x*x;
}
int linreg(int n, const REAL x[], const REAL y[], REAL* m, REAL* b, REAL* r){
REAL sumx = 0.0; /* sum of x */
REAL sumx2 = 0.0; /* sum of x**2 */
REAL sumxy = 0.0; /* sum of x * y */
REAL sumy = 0.0; /* sum of y */
REAL sumy2 = 0.0; /* sum of y**2 */
for (int i=0;i<n;i++){
sumx += x[i];
sumx2 += sqr(x[i]);
sumxy += x[i] * y[i];
sumy += y[i];
sumy2 += sqr(y[i]);
}
REAL denom = (n * sumx2 - sqr(sumx));
if (denom == 0) {
// singular matrix. can't solve the problem.
*m = 0;
*b = 0;
if (r) *r = 0;
return 1;
}
*m = (n * sumxy - sumx * sumy) / denom;
*b = (sumy * sumx2 - sumx * sumxy) / denom;
if (r!=NULL) {
*r = (sumxy - sumx * sumy / n) / /* compute correlation coeff */
sqrt((sumx2 - sqr(sumx)/n) *
(sumy2 - sqr(sumy)/n));
}
return 0;
}
Example
You can run this example online.
int main()
{
int n = 6;
REAL x[6]= {1, 2, 4, 5, 10, 20};
REAL y[6]= {4, 6, 12, 15, 34, 68};
REAL m,b,r;
linreg(n,x,y,&m,&b,&r);
printf("m=%g b=%g r=%g\n",m,b,r);
return 0;
}
Here is the output
m=3.43651 b=-0.888889 r=0.999192
Here is the Excel plot and linear fit (for verification).
All values agree exactly with the C code above (note C code returns r while Excel returns R**2).
There are efficient algorithms for least-squares fitting; see Wikipedia for details. There are also libraries that implement the algorithms for you, likely more efficiently than a naive implementation would do; the GNU Scientific Library is one example, but there are others under more lenient licenses as well.
From Numerical Recipes: The Art of Scientific Computing in (15.2) Fitting Data to a Straight Line:
Linear Regression:
Consider the problem of fitting a set of N data points (x_i, y_i) to a straight-line model:
y(x) = a + b*x
Assume that the uncertainty sigma_i associated with each y_i is known, and that the x_i's (values of the dependent variable) are known exactly. To measure how well the model agrees with the data, we use the chi-square function, which in this case is:
chi^2(a, b) = sum over i of ((y_i - a - b*x_i) / sigma_i)^2
The above equation is minimized to determine a and b. This is done by taking the derivatives of the above equation with respect to a and b, setting them to zero and solving for a and b. Then we estimate the probable uncertainties in the estimates of a and b, since obviously the measurement errors in the data must introduce some uncertainty in the determination of those parameters. Additionally, we must estimate the goodness-of-fit of the data to the model. Absent this estimate, we have not the slightest indication that the parameters a and b in the model have any meaning at all.
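The minimization step can be written out as a short derivation. This is a sketch in notation chosen here (the sums S, S_x, S_y, S_xx, S_xy and the determinant Delta are shorthand introduced for this note), using the straight-line model y = a + b x and the chi-square function described in the text.

```latex
\[
\chi^2(a,b) = \sum_{i=0}^{N-1} \left( \frac{y_i - a - b x_i}{\sigma_i} \right)^2
\]
Setting both partial derivatives to zero:
\[
\frac{\partial \chi^2}{\partial a} = -2 \sum_{i} \frac{y_i - a - b x_i}{\sigma_i^2} = 0,
\qquad
\frac{\partial \chi^2}{\partial b} = -2 \sum_{i} \frac{x_i \,(y_i - a - b x_i)}{\sigma_i^2} = 0 .
\]
With the shorthand
\[
S = \sum_i \frac{1}{\sigma_i^2},\quad
S_x = \sum_i \frac{x_i}{\sigma_i^2},\quad
S_y = \sum_i \frac{y_i}{\sigma_i^2},\quad
S_{xx} = \sum_i \frac{x_i^2}{\sigma_i^2},\quad
S_{xy} = \sum_i \frac{x_i y_i}{\sigma_i^2},
\]
the two conditions become the linear system
\[
a S + b S_x = S_y, \qquad a S_x + b S_{xx} = S_{xy},
\]
whose solution is
\[
\Delta = S S_{xx} - S_x^2, \qquad
a = \frac{S_{xx} S_y - S_x S_{xy}}{\Delta}, \qquad
b = \frac{S S_{xy} - S_x S_y}{\Delta}.
\]
```

With all sigma_i equal to 1, S reduces to N and these expressions match the slope and intercept formulas in the linreg function earlier in this thread. The struct that follows appears to compute the same a and b through the centered quantity t_i = (x_i - S_x/S)/sigma_i, which should be algebraically equivalent.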
The below struct performs the mentioned calculations:
struct Fitab {
// Object for fitting a straight line y = a + b*x to a set of
// points (xi, yi), with or without available
// errors sigma i . Call one of the two constructors to calculate the fit.
// The answers are then available as the variables:
// a, b, siga, sigb, chi2, and either q or sigdat.
int ndata;
double a, b, siga, sigb, chi2, q, sigdat; // Answers.
vector<double> &x, &y, &sig;
// Constructor.
Fitab(vector<double> &xx, vector<double> &yy, vector<double> &ssig)
: ndata(xx.size()), x(xx), y(yy), sig(ssig), chi2(0.), q(1.), sigdat(0.)
{
// Given a set of data points x[0..ndata-1], y[0..ndata-1]
// with individual standard deviations sig[0..ndata-1],
// sets a,b and their respective probable uncertainties
// siga and sigb, the chi-square: chi2, and the goodness-of-fit
// probability: q
Gamma gam;
int i;
double ss=0., sx=0., sy=0., st2=0., t, wt, sxoss; b=0.0;
for (i=0;i < ndata; i++) { // Accumulate sums ...
wt = 1.0 / SQR(sig[i]); //...with weights
ss += wt;
sx += x[i]*wt;
sy += y[i]*wt;
}
sxoss = sx/ss;
for (i=0; i < ndata; i++) {
t = (x[i]-sxoss) / sig[i];
st2 += t*t;
b += t*y[i]/sig[i];
}
b /= st2; // Solve for a, b, sigma-a, and sigma-b.
a = (sy-sx*b) / ss;
siga = sqrt((1.0+sx*sx/(ss*st2))/ss);
sigb = sqrt(1.0/st2); // Calculate chi2.
for (i=0;i<ndata;i++) chi2 += SQR((y[i]-a-b*x[i])/sig[i]);
if (ndata>2) q=gam.gammq(0.5*(ndata-2),0.5*chi2); // goodness of fit
}
// Constructor.
Fitab(vector<double> &xx, vector<double> &yy)
: ndata(xx.size()), x(xx), y(yy), sig(xx), chi2(0.), q(1.), sigdat(0.)
{
// As above, but without known errors (sig is not used).
// The uncertainties siga and sigb are estimated by assuming
// equal errors for all points, and that a straight line is
// a good fit. q is returned as 1.0, the normalization of chi2
// is to unit standard deviation on all points, and sigdat
// is set to the estimated error of each point.
int i;
double ss,sx=0.,sy=0.,st2=0.,t,sxoss;
b=0.0; // Accumulate sums ...
for (i=0; i < ndata; i++) {
sx += x[i]; // ...without weights.
sy += y[i];
}
ss = ndata;
sxoss = sx/ss;
for (i=0;i < ndata; i++) {
t = x[i]-sxoss;
st2 += t*t;
b += t*y[i];
}
b /= st2; // Solve for a, b, sigma-a, and sigma-b.
a = (sy-sx*b)/ss;
siga=sqrt((1.0+sx*sx/(ss*st2))/ss);
sigb=sqrt(1.0/st2); // Calculate chi2.
for (i=0;i<ndata;i++) chi2 += SQR(y[i]-a-b*x[i]);
if (ndata > 2) sigdat=sqrt(chi2/(ndata-2));
// For unweighted data evaluate typical
// sig using chi2, and adjust
// the standard deviations.
siga *= sigdat;
sigb *= sigdat;
}
};
where struct Gamma:
struct Gamma : Gauleg18 {
// Object for incomplete gamma function.
// Gauleg18 provides coefficients for Gauss-Legendre quadrature.
static const Int ASWITCH=100; // When to switch to quadrature method.
static const double EPS; // See end of struct for initializations.
static const double FPMIN;
double gln;
double gammp(const double a, const double x) {
// Returns the incomplete gamma function P(a,x)
if (x < 0.0 || a <= 0.0) throw("bad args in gammp");
if (x == 0.0) return 0.0;
else if ((Int)a >= ASWITCH) return gammpapprox(a,x,1); // Quadrature.
else if (x < a+1.0) return gser(a,x); // Use the series representation.
else return 1.0-gcf(a,x); // Use the continued fraction representation.
}
double gammq(const double a, const double x) {
// Returns the incomplete gamma function Q(a,x) = 1 - P(a,x)
if (x < 0.0 || a <= 0.0) throw("bad args in gammq");
if (x == 0.0) return 1.0;
else if ((Int)a >= ASWITCH) return gammpapprox(a,x,0); // Quadrature.
else if (x < a+1.0) return 1.0-gser(a,x); // Use the series representation.
else return gcf(a,x); // Use the continued fraction representation.
}
double gser(const Doub a, const Doub x) {
// Returns the incomplete gamma function P(a,x) evaluated by its series representation.
// Also sets ln (gamma) as gln. User should not call directly.
double sum,del,ap;
gln=gammln(a);
ap=a;
del=sum=1.0/a;
for (;;) {
++ap;
del *= x/ap;
sum += del;
if (fabs(del) < fabs(sum)*EPS) {
return sum*exp(-x+a*log(x)-gln);
}
}
}
double gcf(const Doub a, const Doub x) {
// Returns the incomplete gamma function Q(a, x) evaluated
// by its continued fraction representation.
// Also sets ln (gamma) as gln. User should not call directly.
int i;
double an,b,c,d,del,h;
gln=gammln(a);
b=x+1.0-a; // Set up for evaluating continued fraction
// by modified Lentz’s method with b0 = 0.
c=1.0/FPMIN;
d=1.0/b;
h=d;
for (i=1;;i++) {
// Iterate to convergence.
an = -i*(i-a);
b += 2.0;
d=an*d+b;
if (fabs(d) < FPMIN) d=FPMIN;
c=b+an/c;
if (fabs(c) < FPMIN) c=FPMIN;
d=1.0/d;
del=d*c;
h *= del;
if (fabs(del-1.0) <= EPS) break;
}
return exp(-x+a*log(x)-gln)*h; // Put factors in front.
}
double gammpapprox(double a, double x, int psig) {
// Incomplete gamma by quadrature. Returns P(a,x) or Q(a, x),
// when psig is 1 or 0, respectively. User should not call directly.
int j;
double xu,t,sum,ans;
double a1 = a-1.0, lna1 = log(a1), sqrta1 = sqrt(a1);
gln = gammln(a);
// Set how far to integrate into the tail:
if (x > a1) xu = MAX(a1 + 11.5*sqrta1, x + 6.0*sqrta1);
else xu = MAX(0.,MIN(a1 - 7.5*sqrta1, x - 5.0*sqrta1));
sum = 0;
for (j=0;j<ngau;j++) { // Gauss-Legendre.
t = x + (xu-x)*y[j];
sum += w[j]*exp(-(t-a1)+a1*(log(t)-lna1));
}
ans = sum*(xu-x)*exp(a1*(lna1-1.)-gln);
return (psig?(ans>0.0? 1.0-ans:-ans):(ans>=0.0? ans:1.0+ans));
}
double invgammp(Doub p, Doub a);
// Inverse function on x of P(a,x) .
};
const Doub Gamma::EPS = numeric_limits<Doub>::epsilon();
const Doub Gamma::FPMIN = numeric_limits<Doub>::min()/EPS;
and struct Gauleg18:
struct Gauleg18 {
// Abscissas and weights for Gauss-Legendre quadrature.
static const Int ngau = 18;
static const Doub y[18];
static const Doub w[18];
};
const Doub Gauleg18::y[18] = {0.0021695375159141994,
0.011413521097787704,0.027972308950302116,0.051727015600492421,
0.082502225484340941, 0.12007019910960293,0.16415283300752470,
0.21442376986779355, 0.27051082840644336, 0.33199876341447887,
0.39843234186401943, 0.46931971407375483, 0.54413605556657973,
0.62232745288031077, 0.70331500465597174, 0.78649910768313447,
0.87126389619061517, 0.95698180152629142};
const Doub Gauleg18::w[18] = {0.0055657196642445571,
0.012915947284065419,0.020181515297735382,0.027298621498568734,
0.034213810770299537,0.040875750923643261,0.047235083490265582,
0.053244713977759692,0.058860144245324798,0.064039797355015485,
0.068745323835736408,0.072941885005653087,0.076598410645870640,
0.079687828912071670,0.082187266704339706,0.084078218979661945,
0.085346685739338721,0.085983275670394821};
and, finally, function Gamma::invgammp():
double Gamma::invgammp(double p, double a) {
// Returns x such that P(a,x) = p for an argument p between 0 and 1.
int j;
double x,err,t,u,pp,lna1,afac,a1=a-1;
const double EPS=1.e-8; // Accuracy is the square of EPS.
gln=gammln(a);
if (a <= 0.) throw("a must be pos in invgammap");
if (p >= 1.) return MAX(100.,a + 100.*sqrt(a));
if (p <= 0.) return 0.0;
if (a > 1.) {
lna1=log(a1);
afac = exp(a1*(lna1-1.)-gln);
pp = (p < 0.5)? p : 1. - p;
t = sqrt(-2.*log(pp));
x = (2.30753+t*0.27061)/(1.+t*(0.99229+t*0.04481)) - t;
if (p < 0.5) x = -x;
x = MAX(1.e-3,a*pow(1.-1./(9.*a)-x/(3.*sqrt(a)),3));
} else {
t = 1.0 - a*(0.253+a*0.12); // and (6.2.9).
if (p < t) x = pow(p/t,1./a);
else x = 1.-log(1.-(p-t)/(1.-t));
}
for (j=0;j<12;j++) {
if (x <= 0.0) return 0.0; // x too small to compute accurately.
err = gammp(a,x) - p;
if (a > 1.) t = afac*exp(-(x-a1)+a1*(log(x)-lna1));
else t = exp(-x+a1*log(x)-gln);
u = err/t;
// Halley’s method.
x -= (t = u/(1.-0.5*MIN(1.,u*((a-1.)/x - 1))));
// Halve old value if x tries to go negative.
if (x <= 0.) x = 0.5*(x + t);
if (fabs(t) < EPS*x ) break;
}
return x;
}
Here is my version of a C/C++ function that does simple linear regression. The calculations follow the Wikipedia article on simple linear regression. This is published as a single-header public-domain (MIT) library on GitHub: simple_linear_regression. The library (.h file) is tested to work on Linux and Windows, and from C and C++ using -Wall -Werror and all -std versions supported by clang/gcc.
#define SIMPLE_LINEAR_REGRESSION_ERROR_INPUT_VALUE -2
#define SIMPLE_LINEAR_REGRESSION_ERROR_NUMERIC -3
int simple_linear_regression(const double * x, const double * y, const int n, double * slope_out, double * intercept_out, double * r2_out) {
double sum_x = 0.0;
double sum_xx = 0.0;
double sum_xy = 0.0;
double sum_y = 0.0;
double sum_yy = 0.0;
double n_real = (double)(n);
int i = 0;
double slope = 0.0;
double denominator = 0.0;
if (x == NULL || y == NULL || n < 2) {
return SIMPLE_LINEAR_REGRESSION_ERROR_INPUT_VALUE;
}
for (i = 0; i < n; ++i) {
sum_x += x[i];
sum_xx += x[i] * x[i];
sum_xy += x[i] * y[i];
sum_y += y[i];
sum_yy += y[i] * y[i];
}
denominator = n_real * sum_xx - sum_x * sum_x;
if (denominator == 0.0) {
return SIMPLE_LINEAR_REGRESSION_ERROR_NUMERIC;
}
slope = (n_real * sum_xy - sum_x * sum_y) / denominator;
if (slope_out != NULL) {
*slope_out = slope;
}
if (intercept_out != NULL) {
*intercept_out = (sum_y - slope * sum_x) / n_real;
}
if (r2_out != NULL) {
denominator = ((n_real * sum_xx) - (sum_x * sum_x)) * ((n_real * sum_yy) - (sum_y * sum_y));
if (denominator == 0.0) {
return SIMPLE_LINEAR_REGRESSION_ERROR_NUMERIC;
}
*r2_out = ((n_real * sum_xy) - (sum_x * sum_y)) * ((n_real * sum_xy) - (sum_x * sum_y)) / denominator;
}
return 0;
}
Usage example:
#define SIMPLE_LINEAR_REGRESSION_IMPLEMENTATION
#include "simple_linear_regression.h"
#include <stdio.h>
/* Some data that we want to find the slope, intercept and r2 for */
static const double x[] = { 1.47, 1.50, 1.52, 1.55, 1.57, 1.60, 1.63, 1.65, 1.68, 1.70, 1.73, 1.75, 1.78, 1.80, 1.83 };
static const double y[] = { 52.21, 53.12, 54.48, 55.84, 57.20, 58.57, 59.93, 61.29, 63.11, 64.47, 66.28, 68.10, 69.92, 72.19, 74.46 };
int main() {
double slope = 0.0;
double intercept = 0.0;
double r2 = 0.0;
int res = 0;
res = simple_linear_regression(x, y, sizeof(x) / sizeof(x[0]), &slope, &intercept, &r2);
if (res < 0) {
printf("Error: %s\n", simple_linear_regression_error_string(res));
return res;
}
printf("slope: %f\n", slope);
printf("intercept: %f\n", intercept);
printf("r2: %f\n", r2);
return 0;
}
The original example above worked well for me for the slope and offset, but I had a hard time with the correlation coefficient. Maybe my parentheses don't group the same way as the assumed precedence? Anyway, with some help from other web pages I finally got values that match the linear trend line in Excel. Thought I would share my code using Mark Lakata's variable names. Hope this helps.
double slope = ((n * sumxy) - (sumx * sumy )) / denom;
double intercept = ((sumy * sumx2) - (sumx * sumxy)) / denom;
double term1 = ((n * sumxy) - (sumx * sumy));
double term2 = ((n * sumx2) - (sumx * sumx));
double term3 = ((n * sumy2) - (sumy * sumy));
double term23 = (term2 * term3);
double r2 = 1.0;
if (fabs(term23) > MIN_DOUBLE) // Define MIN_DOUBLE somewhere as 1e-9 or similar
r2 = (term1 * term1) / term23;
As an assignment I had to code a simple linear regression in C using the RMSE loss function. The program is interactive: you enter your own values and choose your own loss function, which for now is limited to Root Mean Square Error. But first, here are the algorithms I used:
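The formulas are not reproduced in this text, so the block below reads them back from the code that follows. It is a reconstruction under that assumption, with θ0 standing for `teta_0_intercept`, θ1 for `teta_1_grad`, ŷ for `vector_predict` and n for `size`.

```latex
\begin{aligned}
\bar{x} &= \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i \\
\theta_1 &= \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \\
\theta_0 &= \bar{y} - \theta_1 \bar{x} \\
\hat{y}_i &= \theta_0 + \theta_1 x_i \\
\mathrm{RMSE} &= \sqrt{\frac{1}{n}\sum_{i=1}^{n} (\hat{y}_i - y_i)^2} \\
R^2 &= 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
\end{aligned}
```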
Now the code. You need gnuplot to display the chart: sudo apt install gnuplot
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
#include <sys/types.h>
#define BUFFSIZE 64
#define MAXSIZE 100
static double vector_x[MAXSIZE] = {0};
static double vector_y[MAXSIZE] = {0};
static double vector_predict[MAXSIZE] = {0};
static double max_x;
static double max_y;
static double mean_x;
static double mean_y;
static double teta_0_intercept;
static double teta_1_grad;
static double RMSE;
static double r_square;
static double prediction;
static char intercept[BUFFSIZE];
static char grad[BUFFSIZE];
static char xrange[BUFFSIZE];
static char yrange[BUFFSIZE];
static char lossname_RMSE[BUFFSIZE] = "Simple Linear Regression using RMSE'";
static char cmd_gnu_0[BUFFSIZE] = "set title '";
static char cmd_gnu_1[BUFFSIZE] = "intercept = ";
static char cmd_gnu_2[BUFFSIZE] = "grad = ";
static char cmd_gnu_3[BUFFSIZE] = "set xrange [0:";
static char cmd_gnu_4[BUFFSIZE] = "set yrange [0:";
static char cmd_gnu_5[BUFFSIZE] = "f(x) = (grad * x) + intercept";
static char cmd_gnu_6[BUFFSIZE] = "plot f(x), 'data.temp' with points pointtype 7";
static char const *commands_gnuplot[] = {
cmd_gnu_0,
cmd_gnu_1,
cmd_gnu_2,
cmd_gnu_3,
cmd_gnu_4,
cmd_gnu_5,
cmd_gnu_6,
};
static size_t size;
static void user_input()
{
printf("Enter x,y vector size, MAX = 100\n");
scanf("%zu", &size);
if (size > MAXSIZE) {
printf("Wrong input size is too big\n");
user_input();
return; /* the recursive call already read the vectors */
}
printf("vector's size is %zu\n", size);
size_t i;
for (i = 0; i < size; i++) {
printf("Enter vector_x[%zu] values\n", i);
scanf("%lf", &vector_x[i]);
}
for (i = 0; i < size; i++) {
printf("Enter vector_y[%zu] values\n", i);
scanf("%lf", &vector_y[i]);
}
}
static void display_vector()
{
size_t i;
for (i = 0; i < size; i++){
printf("vector_x[%zu] = %lf\t", i, vector_x[i]);
printf("vector_y[%zu] = %lf\n", i, vector_y[i]);
}
}
static void concatenate(char p[], char q[]) {
int c;
int d;
c = 0;
while (p[c] != '\0') {
c++;
}
d = 0;
while (q[d] != '\0') {
p[c] = q[d];
d++;
c++;
}
p[c] = '\0';
}
static void compute_mean_x_y()
{
size_t i;
double tmp_x = 0.0;
double tmp_y = 0.0;
for (i = 0; i < size; i++) {
tmp_x += vector_x[i];
tmp_y += vector_y[i];
}
mean_x = tmp_x / size;
mean_y = tmp_y / size;
printf("mean_x = %lf\n", mean_x);
printf("mean_y = %lf\n", mean_y);
}
static void compute_teta_1_grad()
{
double numerator = 0.0;
double denominator = 0.0;
double tmp1 = 0.0;
double tmp2 = 0.0;
size_t i;
for (i = 0; i < size; i++) {
numerator += (vector_x[i] - mean_x) * (vector_y[i] - mean_y);
}
for (i = 0; i < size; i++) {
tmp1 = vector_x[i] - mean_x;
tmp2 = tmp1 * tmp1;
denominator += tmp2;
}
teta_1_grad = numerator / denominator;
printf("teta_1_grad = %lf\n", teta_1_grad);
}
static void compute_teta_0_intercept()
{
teta_0_intercept = mean_y - (teta_1_grad * mean_x);
printf("teta_0_intercept = %lf\n", teta_0_intercept);
}
static void compute_prediction()
{
size_t i;
for (i = 0; i < size; i++) {
vector_predict[i] = teta_0_intercept + (teta_1_grad * vector_x[i]);
printf("y^[%zu] = %lf\n", i, vector_predict[i]);
}
printf("\n");
}
static void compute_RMSE()
{
compute_prediction();
double error = 0;
size_t i;
for (i = 0; i < size; i++) {
error = (vector_predict[i] - vector_y[i]) * (vector_predict[i] - vector_y[i]);
printf("error y^[%zu] = %lf\n", i, error);
RMSE += error;
}
/* mean */
RMSE = RMSE / size;
/* square root mean */
RMSE = sqrt(RMSE);
printf("\nRMSE = %lf\n", RMSE);
}
static void compute_loss_function()
{
int input = 0;
printf("Which loss function do you want to use?\n");
printf(" 1 - RMSE\n");
scanf("%d", &input);
switch(input) {
case 1:
concatenate(cmd_gnu_0, lossname_RMSE);
compute_RMSE();
printf("\n");
break;
default:
printf("Wrong input try again\n");
compute_loss_function();
}
}
static void compute_r_square(size_t size)
{
double num_err = 0.0;
double den_err = 0.0;
size_t i;
for (i = 0; i < size; i++) {
num_err += (vector_y[i] - vector_predict[i]) * (vector_y[i] - vector_predict[i]);
den_err += (vector_y[i] - mean_y) * (vector_y[i] - mean_y);
}
r_square = 1 - (num_err/den_err);
printf("R_square = %lf\n", r_square);
}
static void compute_predict_for_x()
{
double x = 0.0;
printf("Please enter x value\n");
scanf("%lf", &x);
prediction = teta_0_intercept + (teta_1_grad * x);
printf("y^ if x = %lf -> %lf\n",x, prediction);
}
static void compute_max_x_y()
{
size_t i;
double tmp1= 0.0;
double tmp2= 0.0;
for (i = 0; i < size; i++) {
if (vector_x[i] > tmp1) {
tmp1 = vector_x[i];
max_x = vector_x[i];
}
if (vector_y[i] > tmp2) {
tmp2 = vector_y[i];
max_y = vector_y[i];
}
}
printf("vector_x max value %lf\n", max_x);
printf("vector_y max value %lf\n", max_y);
}
static void display_model_line()
{
sprintf(intercept, "%0.7lf", teta_0_intercept);
sprintf(grad, "%0.7lf", teta_1_grad);
sprintf(xrange, "%0.7lf", max_x + 1);
sprintf(yrange, "%0.7lf", max_y + 1);
concatenate(cmd_gnu_1, intercept);
concatenate(cmd_gnu_2, grad);
concatenate(cmd_gnu_3, xrange);
concatenate(cmd_gnu_3, "]");
concatenate(cmd_gnu_4, yrange);
concatenate(cmd_gnu_4, "]");
printf("grad = %s\n", grad);
printf("intercept = %s\n", intercept);
printf("xrange = %s\n", xrange);
printf("yrange = %s\n", yrange);
printf("cmd_gnu_0: %s\n", cmd_gnu_0);
printf("cmd_gnu_1: %s\n", cmd_gnu_1);
printf("cmd_gnu_2: %s\n", cmd_gnu_2);
printf("cmd_gnu_3: %s\n", cmd_gnu_3);
printf("cmd_gnu_4: %s\n", cmd_gnu_4);
printf("cmd_gnu_5: %s\n", cmd_gnu_5);
printf("cmd_gnu_6: %s\n", cmd_gnu_6);
/* print plot */
FILE *gnuplot_pipe = (FILE*)popen("gnuplot -persistent", "w");
FILE *temp = (FILE*)fopen("data.temp", "w");
/* create data.temp */
size_t i;
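for (i = 0; i < size; i++)
{
fprintf(temp, "%f %f \n", vector_x[i], vector_y[i]);
}
fclose(temp); /* flush data.temp before gnuplot reads it */
/* display gnuplot */
for (i = 0; i < 7; i++)
{
fprintf(gnuplot_pipe, "%s \n", commands_gnuplot[i]);
}
pclose(gnuplot_pipe); /* send the commands and wait for gnuplot */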
}
int main(void)
{
printf("===========================================\n");
printf("INPUT DATA\n");
printf("===========================================\n");
user_input();
display_vector();
printf("\n");
printf("===========================================\n");
printf("COMPUTE MEAN X:Y, TETA_1 TETA_0\n");
printf("===========================================\n");
compute_mean_x_y();
compute_max_x_y();
compute_teta_1_grad();
compute_teta_0_intercept();
printf("\n");
printf("===========================================\n");
printf("COMPUTE LOSS FUNCTION\n");
printf("===========================================\n");
compute_loss_function();
printf("===========================================\n");
printf("COMPUTE R_square\n");
printf("===========================================\n");
compute_r_square(size);
printf("\n");
printf("===========================================\n");
printf("COMPUTE y^ according to x\n");
printf("===========================================\n");
compute_predict_for_x();
printf("\n");
printf("===========================================\n");
printf("DISPLAY LINEAR REGRESSION\n");
printf("===========================================\n");
display_model_line();
printf("\n");
return 0;
}
Look at Section 1 of this paper. This section expresses a 2D linear regression as a matrix multiplication exercise. As long as your data is well-behaved, this technique should permit you to develop a quick least squares fit.
Depending on the size of your data, it might be worthwhile to algebraically reduce the matrix multiplication to a simple set of equations, thereby avoiding the need to write a matmult() function. (Be forewarned, this is completely impractical for more than 4 or 5 data points!)
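As a rough illustration of the matrix view, a straight-line fit y = b0 + b1*x can be written as the normal equations (AᵀA) b = Aᵀy, where each row of A is [1, x_i]. The sketch below is an assumption about what such a reduction could look like, not code from the referenced paper: `fit_line_normal_equations` is a hypothetical name, and the 2x2 system is solved with Cramer's rule so no general matrix routine is needed.

```c
#include <stddef.h>

/* Fit y = b0 + b1*x by forming the 2x2 normal equations (A^T A) b = A^T y,
   with rows of A equal to [1, x_i], and solving them by Cramer's rule.
   Hypothetical helper for illustration.
   Returns 0 on success, -1 on invalid input or a singular system. */
int fit_line_normal_equations(const double *x, const double *y, size_t n,
                              double *b0, double *b1)
{
    double ata[2][2] = {{0.0, 0.0}, {0.0, 0.0}}; /* A^T A */
    double aty[2] = {0.0, 0.0};                  /* A^T y */

    if (x == NULL || y == NULL || b0 == NULL || b1 == NULL || n < 2) {
        return -1;
    }

    for (size_t i = 0; i < n; i++) {
        ata[0][0] += 1.0;
        ata[0][1] += x[i];
        ata[1][1] += x[i] * x[i];
        aty[0] += y[i];
        aty[1] += x[i] * y[i];
    }
    ata[1][0] = ata[0][1]; /* A^T A is symmetric */

    double det = ata[0][0] * ata[1][1] - ata[0][1] * ata[1][0];
    if (det == 0.0) {
        return -1; /* all x values identical */
    }

    *b0 = (aty[0] * ata[1][1] - ata[0][1] * aty[1]) / det; /* intercept */
    *b1 = (ata[0][0] * aty[1] - ata[1][0] * aty[0]) / det; /* slope */
    return 0;
}
```

The loop only accumulates five sums, so the cost grows linearly with the number of points; the matrices never get larger than 2x2 for a line fit.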
The fastest, most efficient way to solve least squares, as far as I am aware, is to subtract (the gradient)/(the 2nd order gradient) from your parameter vector. (Here the 2nd order gradient means the diagonal of the Hessian.)
Here is the intuition:
Let's say you want to optimize least squares over a single parameter. This is equivalent to finding the vertex of a parabola. Then, for any initial parameter x0, the vertex of the loss function f is located at x0 - f'(x0) / f''(x0), where f' and f'' are the first and second derivatives of f. That's because, for a parabola, adding -f'(x0) / f''(x0) to x0 always zeroes out the derivative f'.
Side note: Implementing this in Tensorflow, the solution appeared at w0 - f'(w0) / f''(w0) / (number of weights), but I'm not sure if that's due to Tensorflow or if it's due to something else.
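To make the single-parameter intuition concrete, here is a minimal sketch for the model y ≈ w*x with loss L(w) = Σ (w*x_i - y_i)². The function name `newton_step_least_squares` and the choice of model are assumptions made for this example; it shows only the one-dimensional case and says nothing about the Tensorflow behaviour mentioned in the side note.

```c
#include <stddef.h>

/* One Newton step for the one-parameter model y ~ w * x with loss
   L(w) = sum_i (w * x_i - y_i)^2. Because L is a parabola in w, a single
   step from any starting value w0 lands on the minimizer.
   Hypothetical helper; returns w0 unchanged if L'' is zero (all x_i == 0). */
double newton_step_least_squares(const double *x, const double *y,
                                 size_t n, double w0)
{
    double grad = 0.0; /* L'(w0)  = 2 * sum x_i * (w0 * x_i - y_i) */
    double hess = 0.0; /* L''(w0) = 2 * sum x_i^2 */

    for (size_t i = 0; i < n; i++) {
        grad += 2.0 * x[i] * (w0 * x[i] - y[i]);
        hess += 2.0 * x[i] * x[i];
    }
    if (hess == 0.0) {
        return w0;
    }
    return w0 - grad / hess;
}
```

For x = {1, 2, 3} and y = {2, 4, 6}, one step from w0 = -7.5 or from w0 = 100 gives w = 2 in both cases, which is the exact least-squares slope through the origin.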

Resources