Segfault with large int - not enough memory? - c

I am fairly new to C and how arrays and memory allocation works. I'm solving a very simple function right now, vector_average(), which computes the mean value between two successive array entries, i.e., the average between (i) and (i + 1). This average function is the following:
void
vector_average(double *cc, double *nc, int n)
{
//#pragma omp parallel for
double tbeg ;
double tend ;
tbeg = Wtime() ;
for (int i = 0; i < n; i++) {
cc[i] = .5 * (nc[i] + nc[i+1]);
}
tend = Wtime() ;
printf("vector_average() took %g seconds\n", tend - tbeg);
}
My goal is to set int n extremely high, to the point where it actually takes some time to complete this loop (hence, why I am tracking wall time in this code). I'm passing this function a random test function of x, f(x) = sin(x) + 1/3 * sin(3 x), denoted in this code as x_nc, in main() in the following form:
int
main(int argc, char **argv)
{
int N = 1.E6;
double x_nc[N+1];
double dx = 2. * M_PI / N;
for (int i = 0; i <= N; i++) {
double x = i * dx;
x_nc[i] = sin(x) + 1./3. * sin(3.*x);
}
double x_cc[N];
vector_average(x_cc, x_nc, N);
}
But my problem here is that if I set int N any higher than 1.E5, it segfaults. Please provide any suggestions for how I might set N much higher. Perhaps I have to do something with malloc, but, again, I am new to all of this stuff and I'm not quite sure how I would implement this.
-CJW

A function only has 1M stack memory on Windows or other system. Obviously, the size of temporary variable 'x_nc' is bigger than 1M. So, you should use heap to save data of x_nc:
int
main(int argc, char **argv)
{
int N = 1.E6;
double* x_nc = (double*)malloc(sizeof(dounble)*(N+1));
double dx = 2. * M_PI / N;
for (int i = 0; i <= N; i++) {
double x = i * dx;
x_nc[i] = sin(x) + 1./3. * sin(3.*x);
}
double* x_cc = (double*)malloc(sizeof(double)*N);
vector_average(x_cc, x_nc, N);
free(x_nc);
free(x_cc);
return 0;
}

Related

Graph of cos(x) through MacLaurin series only getting the first result right

I'm trying to create a program that compares the efficiency of calculating a function through MacLaurin series.
The idea is: Make a graph (using gnuplot) of cos(x) between -Pi and Pi (100 intervals) calculating cos(x) using the first 4 terms of its MacLaurin series, then, the first 6 terms, and comparing the graph between them.
Cos(x) through MacLaurin.
So, to use gnuplot, I made the code below that gets 2 files with the data I need, however, when i run the code only the first result is correct. For the first 4 terms my file is:
-3.141593 -9.760222e-001
-3.078126 2.367934e+264
And the rest of what would be my Y axis is just 2.367934e+264 repeated over and over. The 6 terms file is also just that number. X axis is fine.
I'm fairly new to coding and just don't know what i'm doing wrong. Any help would be appreciated.
Here's the code:
#include <stdio.h>
#include <math.h>
#define X_INI -M_PI
#define X_FIM M_PI
#define NI 100
int fatorial(int);
double serie(int ,double );
int main()
{
double x, y[NI], dx;
int i;
FILE *fp[3];
fp[0]=fopen("4Termos.dat","w");
fp[1]=fopen("6Termos.dat","w");
x=X_INI;
dx = (X_FIM - X_INI)/ (NI - 1);
for(i=0; i<NI; i++){
y[i]=serie(4,x);
fprintf(fp[0],"%lf %e\n", x, y[i]);
y[i]=serie(6,x);
fprintf(fp[1],"%lf %e\n", x, y[i]);
x = x + dx;
}
return 0;
}
int fatorial(int n) {
int i,p;
p = 1;
if (n==0)
return 1;
else {
for (i=1;i<=n;i++)
p = p*i;
return p;
}
}
double serie(int m, double z){
double s;
int j;
for(j = 0; j < m+1; j++)
{
s = s + ( ( pow((-1) , j))*pow(z, (2*j)) ) / (fatorial(2*j));
}
return s;
}
Fatorial is used to calculate factorial, serie used to calculate MacLaurin...
Use of uninitialized s in serie() function (I've taken the liberty to format the code to my liking).
double serie(int m, double z) {
double s; // better: double s = 0;
int j;
for (j = 0; j < m + 1; j++) {
s += pow(-1, j) * pow(z, 2 * j) / fatorial(2 * j);
}
return s;
}

How do i use more than 100.000 dot?

I have made this Project it can calculate the Pi with the Monte-Carlo method, but if I use more than 100.000 dots it crashes. Does anyone know how to use like a 1.000.000 dots without crashing? I'm using the GNU compiler. I've tried with another compiler but I had the same problem.
#include <time.h>
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
int main() {
srand(time(NULL));
int dot;
int dotC = 0;
int dotS = 0;
printf("How many dot do you want to use?: ");
scanf("%d", &dot);
float pi[dot];
float x[dot];
float y[dot];
for (int i = 1; i < dot; i++) {
x[i] = (float)rand() / (float)RAND_MAX;
}
for (int i = 0; i < dot; i++) {
y[i] = (float)rand() / (float)RAND_MAX;
}
float distance[dot];
for (int i = 0; i < dot; ++i) {
distance[i] = sqrt(pow(x[i], 2) + pow(y[i], 2));
}
for (int i = 0; i < dot; ++i) {
if (distance[i] < 1) {
dotC++;
}
dotS++;
}
for (int i = 0; i < dot; ++i) {
pi[i] = (float)dotC / (float)dotS * 4;
}
printf("approximation of PY is: ");
printf("%f\n", pi[0]);
}
You get a stack overflow because you allocate large arrays with automatic storage (aka on the stack) exceeding the stack space available to your program.
You can fix the problem by allocating these from the heap with malloc() or calloc(), but you can simplify the algorithm by not using arrays at all:
for each random dot, compute the distance and update the dotC and dotS counters. No need to store the values.
the loop to initialize the x array should start at 0. You have undefined behavior as you do not initialize x[0].
dotS is actually redundant as its final value is the same as dot.
you should use double instead of float for increased precision.
you should use a simple multiplication instead of the more costly pow() function.
there is no need for sqrt() either: comparing the square of the distance to 1.0 gives the same result.
Here is a simplified version:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int main() {
srand(time(NULL));
int dots;
int dotC = 0;
printf("How many dots do you want to use?: ");
if (scanf("%d", &dots) != 1)
return 1;
for (int i = 0; i < dots; i++) {
double x = (double)rand() / (double)RAND_MAX;
double y = (double)rand() / (double)RAND_MAX;
if (x * x + y * y <= 1.0)
dotC++;
}
printf("approximation of PI is: %.9f\n", 4 * (double)dotC / (double)dots);
return 0;
}
On systems with slow floating point, you could change the for loop to use 64-bit integers and produce the same result:
for (int i = 0; i < dots; i++) {
long long x = rand();
long long y = rand();
if (x * x + y * y <= (long long)RAND_MAX * RAND_MAX)
dotC++;
}
This algorithm is really a benchmark of the pseudo random number generator, running it for 1 billion dots only produces 4 or 5 decimal places in 13 seconds on my old Macbook with the Apple libC. Integer or floating point versions run at the same speed on this CPU.
If you want dynamically allocated array, you should use malloc. Replace :
float *pi, *x, *y;
x = malloc(dot * sizeof(float));
if (x==NULL) {
printf("no memory for x\n");
exit(1);
}
y = malloc(dot * sizeof(float));
if (y==NULL) {
printf("no memory for y\n");
exit(1);
}
pi = malloc(dot * sizeof(float));
if (pi==NULL) {
printf("no memory for pi\n");
exit(1);
}
That would work, however, for big numbers, it would get out of memory. Perhaps you can calculate incremental? Why store all the calculation? Perhaps the algorithm is not clear enough to me, but I guess that should be possible...
I looked at your code again, and this seems to do the same:
int main(){
srand(time(NULL));
int dot;
int dotC = 0;
int dotS = 0;
printf("How mmany dot do you want to use?: ");
scanf("%d",&dot);
float pi, x, y;
for (int i = 0; i < dot ; i++){
x = (float)rand()/(float)RAND_MAX;
y = (float)rand()/(float)RAND_MAX;
if (sqrt(pow(x, 2) + pow(y, 2)) < 1)
{
dotC++;
}
dotS++;
}
printf("approximation of PY is: %f\n", dotC / (float)dotS * 4);
}
You are probably overflowing the stack (hence this website name) with the dynamic array allocations (like float distance[dot]) when dot is too large. To solve this, you can either increase the stack size (but you will encounter the same issue with larger numbers) or allocate your arrays on the heap instead, eg.
float * x = calloc(dot, sizeof(float));
float * y = calloc(dot, sizeof(float));
/* ... */
free(x);
free(y);
Checking calloc return values for NULL is usually recommended as well, see man calloc.
Well while the other answers are correct and more reasonable, you can always temporarily increase the stack size limit using the ulimit commands if you are on UNIX. This will allow for a few more decimals.
Get the currect stack size with ulimit -s or ulimit -a to get more info.
Then you can find the maximum size the stack can be by typing ulimit -H -s or ulimit -H -a for more info.
Finally you can set the stack size to the maximum by typing ulimit -s maxsize, maxsize is equal to the number you get when you type ulimit -H -s .
Stack size will reset to its default value when your terminate your shell.

Segmentation Fault 11 in C caused by larger operation numbers

I have known that when encountered with segmentation fault 11, it means the program has attempted to access an area of memory that it is not allowed to access.
Here I am trying to calculate a Fourier transform, using the following code.
It works well when nPoints = 2^15 (or of course with less points) , however it corrupts when I further increase the points to 2^16. I am wondering, is that caused by occupying too much memory? But I did not notice too much memory occupation during the operation. And although it use recursion, it transforms in-place. I thought it would occupy not so much memory. Then, where's the problem?
Thanks in advance
PS: one thing I forgot to say is, the result above was on Max OS (8G memory).
When I running the code on Windows (16G memory), it corrupts when nPoints = 2^14. So it makes me confused whether it's caused by the memory allocation, as the Windows PC has a larger memory (but it's really hard to say, because the two operation systems utilize different memory strategy).
#include <stdio.h>
#include <tgmath.h>
#include <string.h>
// in place FFT with O(n) memory usage
long double PI;
typedef long double complex cplx;
void _fft(cplx buf[], cplx out[], int n, int step)
{
if (step < n) {
_fft(out, buf, n, step * 2);
_fft(out + step, buf + step, n, step * 2);
for (int i = 0; i < n; i += 2 * step) {
cplx t = exp(-I * PI * i / n) * out[i + step];
buf[i / 2] = out[i] + t;
buf[(i + n)/2] = out[i] - t;
}
}
}
void fft(cplx buf[], int n)
{
cplx out[n];
for (int i = 0; i < n; i++) out[i] = buf[i];
_fft(buf, out, n, 1);
}
int main()
{
const int nPoints = pow(2, 15);
PI = atan2(1.0l, 1) * 4;
double tau = 0.1;
double tSpan = 12.5;
long double dt = tSpan / (nPoints-1);
long double T[nPoints];
cplx At[nPoints];
for (int i = 0; i < nPoints; ++i)
{
T[i] = dt * (i - nPoints / 2);
At[i] = exp( - T[i]*T[i] / (2*tau*tau));
}
fft(At, nPoints);
return 0;
}
You cannot allocate very large arrays in the stack. The default stack size on macOS is 8 MiB. The size of your cplx type is 32 bytes, so an array of 216 cplx elements is 2 MiB, and you have two of them (one in main and one in fft), so that is 4 MiB. That fits on the stack, but, at that size, the program runs to completion when I try it. At 217, it fails, which makes sense because then the program has two arrays taking 8 MiB on stack. The proper way to allocate such large arrays is to include <stdlib.h> and use cmplx *At = malloc(nPoints * sizeof *At); followed by if (!At) { /* Print some error message about being unable to allocate memory and terminate the program. */ }. You should do that for At, T, and out. Also, when you are done with each array, you should free it, as with free(At);.
To calculate an integer power of two, use the integer operation 1 << power, not the floating-point operation pow(2, 16). We have designed pow well on macOS, but, on other systems, it may return approximations even when exact results are possible. An approximate result may be slightly less than the exact integer value, so converting it to an integer truncates to the wrong result. If it may be a power of two larger than suitable for an int, then use (type) 1 << power, where type is a suitably large integer type.
the following, instrumented, code clearly shows that the OPs code repeatedly updates the same locations in the out[] array and actually does not update most of the locations in that array.
#include <stdio.h>
#include <tgmath.h>
#include <assert.h>
// in place FFT with O(n) memory usage
#define N_POINTS (1<<15)
double T[N_POINTS];
double At[N_POINTS];
double PI;
// prototypes
void _fft(double buf[], double out[], int step);
void fft( void );
int main( void )
{
PI = 3.14159;
double tau = 0.1;
double tSpan = 12.5;
double dt = tSpan / (N_POINTS-1);
for (int i = 0; i < N_POINTS; ++i)
{
T[i] = dt * (i - (N_POINTS / 2));
At[i] = exp( - T[i]*T[i] / (2*tau*tau));
}
fft();
return 0;
}
void fft()
{
double out[ N_POINTS ];
for (int i = 0; i < N_POINTS; i++)
out[i] = At[i];
_fft(At, out, 1);
}
void _fft(double buf[], double out[], int step)
{
printf( "step: %d\n", step );
if (step < N_POINTS)
{
_fft(out, buf, step * 2);
_fft(out + step, buf + step, step * 2);
for (int i = 0; i < N_POINTS; i += 2 * step)
{
double t = exp(-I * PI * i / N_POINTS) * out[i + step];
buf[i / 2] = out[i] + t;
buf[(i + N_POINTS)/2] = out[i] - t;
printf( "index: %d buf update: %d, %d\n", i, i/2, (i+N_POINTS)/2 );
}
}
}
Suggest running via (where untitled1 is the name of the executable and on linux)
./untitled1 > out.txt
less out.txt
the out.txt file is 8630880 bytes
An examination of that file shows the lack of coverage and shows that any one entry is NOT the sum of the prior two entries, so I suspect this is not a valid Fourier transform,

Matrix and vector multiplication optimization algorithm

Assume that the dimensions are very large (up to 1 billion elements in a matrix). How would I implement a cache oblivious algorithm for matrix-vector product? Based on wikipedia I will need to recursively divide and conquer however I feel like there would be a lot of overhead.. Would it be efficient to do so?
Follow up question and answer: OpenMP with matrices and vectors
So the answer to the question, "how do I make this basic linear algebra operation fast", is always and everywhere to find and link to a tuned BLAS library for your platform. Eg, GotoBLAS (whose work is being continued in OpenBLAS), or the slower autotuned ATLAS, or commercial packages like Intel's MKL. Linear algebra is so fundamental to so many other operations that enormous amounts of effort goes into optimizing these packages for various platforms, and there's just no chance you're going to come up with something in a few afternoon's work that will compete. The particular subroutine calls you're looking for for general dense matrix-vector multiplicaiton is SGEMV/DGEMV/CGEMV/ZGEMV.
Cache-oblivious algorithms, or autotuning, are for when you can't be bothered tuning for the specific cache architecture of your system - which might be fine, normally, but since people are willing to do that for BLAS routines, and then make the tuned results available, means that you're best off just using those routines.
The memory access pattern for GEMV is straightforward enough that you don't really need divide and conquer (same for the standard case of matrix transpose) - you just find the cache blocking size and use it. In GEMV (y = Ax), you still have to scan through the entire matrix once, so there's nothing to be done for reuse (and thus effective cache use) there, but you can try reuse x as much as possible so you load it once instead of (number of rows) times - and you still want access to A to be cache friendly. So the obvious cache blocking thing to do is to break along blocks:
A x -> [ A11 | A12 ] | x1 | = | A11 x1 + A12 x2 |
[ A21 | A22 ] | x2 | | A21 x1 + A22 x2 |
And you can certainly do that recursively. But doing a naive implementation, it's slower than the simple double-loop, and way slower than a proper SGEMV library call:
$ ./gemv
Testing for N=4096
Double Loop: time = 0.024995, error = 0.000000
Divide and conquer: time = 0.299945, error = 0.000000
SGEMV: time = 0.013998, error = 0.000000
The code follows:
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include "mkl.h"
float **alloc2d(int n, int m) {
float *data = malloc(n*m*sizeof(float));
float **array = malloc(n*sizeof(float *));
for (int i=0; i<n; i++)
array[i] = &(data[i*m]);
return array;
}
void tick(struct timeval *t) {
gettimeofday(t, NULL);
}
/* returns time in seconds from now to time described by t */
double tock(struct timeval *t) {
struct timeval now;
gettimeofday(&now, NULL);
return (double)(now.tv_sec - t->tv_sec) + ((double)(now.tv_usec - t->tv_usec)/1000000.);
}
float checkans(float *y, int n) {
float err = 0.;
for (int i=0; i<n; i++)
err += (y[i] - 1.*i)*(y[i] - 1.*i);
return err;
}
/* assume square matrix */
void divConquerGEMV(float **a, float *x, float *y, int n,
int startr, int endr, int startc, int endc) {
int nr = endr - startr + 1;
int nc = endc - startc + 1;
if (nr == 1 && nc == 1) {
y[startc] += a[startr][startc] * x[startr];
} else {
int midr = (endr + startr+1)/2;
int midc = (endc + startc+1)/2;
divConquerGEMV(a, x, y, n, startr, midr-1, startc, midc-1);
divConquerGEMV(a, x, y, n, midr, endr, startc, midc-1);
divConquerGEMV(a, x, y, n, startr, midr-1, midc, endc);
divConquerGEMV(a, x, y, n, midr, endr, midc, endc);
}
}
int main(int argc, char **argv) {
const int n=4096;
float **a = alloc2d(n,n);
float *x = malloc(n*sizeof(float));
float *y = malloc(n*sizeof(float));
struct timeval clock;
double eltime;
printf("Testing for N=%d\n", n);
for (int i=0; i<n; i++) {
x[i] = 1.*i;
for (int j=0; j<n; j++)
a[i][j] = 0.;
a[i][i] = 1.;
}
/* naive double loop */
tick(&clock);
for (int i=0; i<n; i++) {
y[i] = 0.;
for (int j=0; j<n; j++) {
y[i] += a[i][j]*x[j];
}
}
eltime = tock(&clock);
printf("Double Loop: time = %lf, error = %f\n", eltime, checkans(y,n));
for (int i=0; i<n; i++) y[i] = 0.;
/* naive divide and conquer */
tick(&clock);
divConquerGEMV(a, x, y, n, 0, n-1, 0, n-1);
eltime = tock(&clock);
printf("Divide and conquer: time = %lf, error = %f\n", eltime, checkans(y,n));
/* decent GEMV implementation */
tick(&clock);
float alpha = 1.;
float beta = 0.;
int incrx=1;
int incry=1;
char trans='N';
sgemv(&trans,&n,&n,&alpha,&(a[0][0]),&n,x,&incrx,&beta,y,&incry);
eltime = tock(&clock);
printf("SGEMV: time = %lf, error = %f\n", eltime, checkans(y,n));
return 0;
}

Selecting and analysing window of points in an array

Could someone please advise me on how to resolve this problem.
I have a function which performs a simple regression analysis on a sets of point contained in an array.
I have one array (pval) which contains all the data I want to perform regression analysis on.
This is how I want to implement this.
I get an average value for the first 7 elements of the array. This is what I call a 'ref_avg' in the programme.
I want to perform a regression analysis for every five elements of the array taking the first element of this array as the 'ref_avg'. That is in every step of the regression analysis I will have 6 points in the array.
e.g
For the 1st step the ref_avg as calculated below is 70.78. So the 1st step in the simple regression will contain these points
1st = {70.78,76.26,69.17,68.68,71.49,73.08},
The second step will contain the ref_avg as the 1st element and other elements starting from the second element in the original array
2nd = {70.78,69.17,68.68,71.49,73.08,72.99},
3rd = {70.78,68.68,71.49,73.08,72.99,70.36},
4th = {70.78,71.49,73.08,72.99,70.36,57.82} and so on until the end.
The regression function is also shown below.
I don't understand why the first 3 elements of the 'calcul' array have value 0.00 on the first step of the regression, 2 elements on the 2nd step,1 elements on the 3rd.
Also the last step of the regression function is printed 3 times.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
float pval[]={76.26,69.17,68.68,71.49,73.08,72.99,70.36,57.82,58.98,69.71,70.43,77.53,80.77,70.30,70.5,70.79,75.58,76.88,80.20,77.69,80.80,70.5,85.27,75.25};
int count,Nhour;
const int MAX_HOUR = 24;
float *calcul=NULL;
float *tab_time =NULL;
float ref_avg;
int size_hour=7;
float sum=0;
int length = Nhour+1;
float m;
float b;
calcul=(float*)calloc(MAX_HOUR,sizeof(calcul));
if (calcul==NULL)
{
printf(" error in buffer\n");
exit(EXIT_FAILURE);
}
tab_time= calloc(MAX_HOUR,sizeof(float));
/* Get the average of the first seven elements */
int i;
for (i=0;i<size_hour;i++)
{
sum += pval[i];
}
ref_avg = sum / size_hour;
count=0;
/* perform the regression analysis on 5 hours increment */
while(count<=MAX_HOUR)
{
++count;
Nhour=5;
int pass = -(Nhour-1);
int i=0;
for(i=0;i<Nhour+1;i++)
{
if(count<MAX_HOUR)
{
calcul[0]=ref_avg;
calcul[i] =pval[count+pass];
pass++;
}
printf("calc=%.2f\n",calcul[i]); // For debug only
tab_time[i]=i+1;
if(i==Nhour)
{
linear_regression(tab_time, calcul, length, &m, &b);
printf("Slope= %.2f\n", m);
}
}
}
free(calcul);
calcul=NULL;
free(tab_time);
tab_time=NULL;
return 0;
}
/* end of the main function */
/* This function is used to calculate the linear
regression as it was called above in the main function.
It compiles and runs very well, was just included for the
compilation and execution of the main function above where I have a problem. */
int linear_regression(const float *x, const float *y, const int n, float *beta1, float *beta0)
{
float sumx = 0,
sumy = 0,
sumx2 = 0,
sumxy = 0;
int i;
if (n <= 1) {
*beta1 = 0;
*beta0= 0;
printf("Not enough data for regression \n");
}
else
{
float variance;
for (i = 0; i < n; i++)
{
sumx += x[i];
sumy += y[i];
sumx2 += (x[i] * x[i]);
sumxy += (x[i] * y[i]);
}
variance = (sumx2 - ((sumx * sumx) / n));
if ( variance != 0) {
*beta1 = (sumxy - ((sumx * sumy) / n)) / variance;
*beta0 = (sumy - ((*beta1) * sumx)) / n;
}
else
{
*beta1 = 0;
*beta0 = 0;
}
}
return 0;
}
I think this code produces sane answers. The reference average quoted in the question seems to be wrong. The memory allocation is not needed. The value of MAX_HOUR was 24 but there were only 23 data values in the array. The indexing in building up the array to be regressed was bogus, referencing negative indexes in the pval array (and hence leading to erroneous results). The variable Nhour was referenced before it was initialized; the variable length was not correctly set. There wasn't good diagnostic printing.
The body of main() here is substantially rewritten; the editing on linear_regression() is much more nearly minimal. The code is more consistently laid out and white space has been used to make it easier to read. This version terminates the regression when there is no longer enough data left to fill the array with 5 values - it is not clear what the intended termination condition was.
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void linear_regression(const float *x, const float *y, const int n,
float *beta1, float *beta0);
int main(void)
{
float pval[]={
76.26, 68.68, 71.49, 73.08, 72.99, 70.36, 57.82, 58.98,
69.71, 70.43, 77.53, 80.77, 70.30, 70.50, 70.79, 75.58,
76.88, 80.20, 77.69, 80.80, 70.50, 85.27, 75.25,
};
const int Nhour = 5;
const int MAX_HOUR = sizeof(pval)/sizeof(pval[0]);
const int size_hour = 7;
float ref_avg;
float sum = 0.0;
float m;
float b;
float calc_y[6];
float calc_x[6];
/* Get the average of the first seven elements */
for (int i = 0; i < size_hour; i++)
sum += pval[i];
ref_avg = sum / size_hour;
printf("ref avg = %5.2f\n", ref_avg); // JL
/* perform the regression analysis on 5 hours increment */
for (int pass = 0; pass <= MAX_HOUR - Nhour; pass++) // JL
{
calc_y[0] = ref_avg;
calc_x[0] = pass + 1;
printf("pass %d\ncalc_y[0] = %5.2f, calc_x[0] = %5.2f\n",
pass, calc_y[0], calc_x[0]);
for (int i = 1; i <= Nhour; i++)
{
int n = pass + i - 1;
calc_y[i] = pval[n];
calc_x[i] = pass + i + 1;
printf("calc_y[%d] = %5.2f, calc_x[%d] = %5.2f, n = %2d\n",
i, calc_y[i], i, calc_x[i], n);
}
linear_regression(calc_x, calc_y, Nhour+1, &m, &b);
printf("Slope= %5.2f, intercept = %5.2f\n", m, b);
}
return 0;
}
void linear_regression(const float *x, const float *y, const int n, float *beta1, float *beta0)
{
float sumx1 = 0.0;
float sumy1 = 0.0;
float sumx2 = 0.0;
float sumxy = 0.0;
assert(n > 1);
for (int i = 0; i < n; i++)
{
sumx1 += x[i];
sumy1 += y[i];
sumx2 += (x[i] * x[i]);
sumxy += (x[i] * y[i]);
}
float variance = (sumx2 - ((sumx1 * sumx1) / n));
if (variance != 0.0)
{
*beta1 = (sumxy - ((sumx1 * sumy1) / n)) / variance;
*beta0 = (sumy1 - ((*beta1) * sumx1)) / n;
}
else
{
*beta1 = 0.0;
*beta0 = 0.0;
}
}

Resources