Include early stopping in C

I want to stop the training loop when the current accuracy on the validation set is lower than the accuracy calculated in the previous iteration of the loop.
for (int j = 0; j < epochs; j++) {
DPRINT("Epoch %d: 000%%", j + 1);
DTIME_START_COUNTER;
for (int i = 0; i < train_data->m; i++) {
X->data[0] = train_data->data[i];
y->data[0] = train_labels->data[i];
backprop(network, X, y);
if (i % (int)mini_batch_size == 0 || i == train_data->m - 1) {
DPRINT("\b\b\b\b%03.0f%%", i * 100 / (double)(train_data->m - 1));
apply_derivation(network, mini_batch_size, training_rate);
}
}
DPRINT(" Eval accuracy: %.2f%%\n",
network_accuracy(network, eval_data, eval_labels, NULL));
}

Yes, break, but I want to store the accuracy (from network_accuracy) in an array and compare the values of two consecutive iterations.
If you need to do something twice, or just want finer control, do an extract-method refactoring: put the code in a function, then call that function.
// Put allocation of structs into their own function.
// All the necessary allocation goes in here.
Network *Network_new() {
// If you simply did `Network network` it would be automatically
// freed when the function exits. To return a complex value,
// allocate it on the heap.
return malloc(sizeof(Network));
}
// Pass in everything it needs. No globals.
// I'm just guessing at the types.
// A proliferation of arguments might suggest the need for a struct
// to gather them together.
Network *do_the_thing(
Data *train_data,
Data *train_labels,
Data *x, Data *y,
size_t mini_batch_size,
int training_rate
) {
// Allocate a new Network struct.
Network *network = Network_new();
for (int i = 0; i < train_data->m; i++) {
x->data[0] = train_data->data[i];
y->data[0] = train_labels->data[i];
backprop(network, x, y);
if (i % (int)mini_batch_size == 0 || i == train_data->m - 1) {
DPRINT("\b\b\b\b%03.0f%%", i * 100 / (double)(train_data->m - 1));
apply_derivation(network, mini_batch_size, training_rate);
}
}
// Presumably backprop and apply_derivation have changed network.
return network;
}
// Allocate space for two Network pointers.
Network *networks[2];
for(int i = 0; i < 2; i++) {
// Save the results to the array.
networks[i] = do_the_thing(
train_data,
train_labels,
x, y,
mini_batch_size,
training_rate
);
}
// Compare networks[0] and networks[1] as you like.
// Then free them.
for(int i = 0; i < 2; i++) {
free(networks[i]);
}
Alternatively, pass in a pre-allocated return value. This is more flexible, but requires a bit more work in the code calling the function.
Network *do_the_thing(
Data *train_data,
Data *train_labels,
Data *x, Data *y,
size_t mini_batch_size,
int training_rate,
Network *network
) {
// same as before, but don't allocate the network
}
// Allocate space for two Networks.
// Again, there may be additional allocation and initialization required.
Network networks[2];
for(int i = 0; i < 2; i++) {
do_the_thing(
train_data,
train_labels,
x, y,
mini_batch_size,
training_rate,
&networks[i] // pass in the pre-allocated memory
);
}
// No need to free, it will be done automatically.
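For the original goal (keep each epoch's accuracy and stop as soon as it drops), a minimal sketch could look like the following. The callback type epoch_fn, the function train_with_early_stopping and the stand-in replay_epoch are hypothetical names invented for this illustration; in the question's code the callback would wrap the inner training loop plus the call to network_accuracy.

```c
#include <stddef.h>

/* Hypothetical callback type: trains one epoch and returns the eval accuracy.
 * In the question this would wrap the inner loop and network_accuracy(). */
typedef double (*epoch_fn)(int epoch, void *ctx);

/* Runs up to max_epochs epochs, storing each accuracy in acc[].
 * Stops as soon as an epoch scores lower than the previous one.
 * Returns the number of epochs actually run. */
int train_with_early_stopping(epoch_fn run_epoch, void *ctx,
                              double *acc, int max_epochs) {
    int j;
    for (j = 0; j < max_epochs; j++) {
        acc[j] = run_epoch(j, ctx);
        if (j > 0 && acc[j] < acc[j - 1]) {
            j++;   /* count the epoch that triggered the stop */
            break;
        }
    }
    return j;
}

/* Stand-in epoch for demonstration only: replays accuracies from a table. */
static double replay_epoch(int epoch, void *ctx) {
    const double *table = ctx;
    return table[epoch];
}
```

Comparing only two consecutive values is sensitive to noise in the accuracy; stopping after several non-improving epochs in a row is a common variation, at the cost of one extra counter.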

Related

decreasing time it takes to run my program in c

I was writing a program that reads from a file and stores the data in two arrays that are held in an array of structs. I am growing the arrays with realloc, and the program takes about 0.7 s to run.
Can I somehow decrease this time?
typedef struct {
int *node;
int l;
int *waga;
} przejscie_t;
void czytaj(przejscie_t **graf, int vp, int vk, int waga) {
(*graf)[vp].node[(*graf)[vp].l - 1] = vk;
(*graf)[vp].waga[(*graf)[vp].l - 1] = waga;
(*graf)[vp].l++;
}
void wypisz(przejscie_t *graf, int i) {
printf("i=%d l=%d ", i, graf[i].l);
for (int j = 0; j < (graf[i].l - 1); j++) {
printf("vk=%d waga=%d ", graf[i].node[j], graf[i].waga[j]);
}
printf("\n");
}
void init(przejscie_t **graf, int vp, int n) {
*graf = realloc(*graf, (vp + 1) * sizeof(przejscie_t));
if (n == vp || n == -1){
(*graf)[vp].l = 1;
(*graf)[vp].node = malloc((*graf)[vp].l * sizeof(int));
(*graf)[vp].waga = malloc((*graf)[vp].l * sizeof(int));
}
else {
for (int i = n; i <= vp; i++) {
(*graf)[i].l = 1;
(*graf)[i].node = malloc((*graf)[i].l * sizeof(int));
(*graf)[i].waga = malloc((*graf)[i].l * sizeof(int));
}
}
}
Here are some suggestions:
I think you should pre-calculate the required size of your *graf memory instead of reallocating it again and again, for example with a prealloc_graf function.
This should give a noticeable speed-up, since reallocating is time-consuming, especially when the block actually has to be moved in memory.
This matters most when you are working with big files.
And since you are reading from a file, pre-calculating the size should be easy.
If your files range from small to large, you have two choices:
Accept your fate and let the code be a little less optimized for small files.
Create two init functions: one optimized for small files, the other for bigger files. You will have to run some benchmarks to determine which algorithm is best for each case before you can implement it. You could even automate that if you have the time and the will to do so.
It is important to check that a memory allocation succeeded before using the memory, because allocation functions can fail.
Finally, some changes to the init function:
void init(przejscie_t **graf, int vp, int n) {
*graf = realloc(*graf, (vp + 1) * sizeof(przejscie_t));
// The `if` statement was redundant.
// Added a ternary operator for ``n == -1``.
// Alternatively, you could use ``n = (n == -1 ? vp : n)`` right before the loop.
for (int i = (n == -1 ? vp : n); i <= vp; i++) {
(*graf)[i].l = 1;
// (*graf)[X].l is always 1.
// There is no reason to use (*graf)[X].l * sizeof(int) for malloc.
(*graf)[i].node = malloc(sizeof(int));
(*graf)[i].waga = malloc(sizeof(int));
}
}
I've commented everything that I changed, but here is a summary:
The if statement was redundant.
The for loop covers all cases, with a ternary operator handling
n equal to -1.
The code should be easier to understand this way.
The node and waga arrays were not being allocated "properly".
Since l is always equal to 1, there was no need for the
extra multiplication.
This doesn't really change the execution time, though, since it is a constant.
I would also suggest that your functions that call allocation functions return a boolean saying whether they succeeded. If the allocation fails, return false to signal the failure.
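When the final size cannot be computed up front, a common alternative to pre-calculation is to grow the buffer geometrically, so that realloc runs only a handful of times. The sketch below shows that technique (it is not the prealloc_graf idea itself) together with the suggested boolean return value. The type int_vec and the function int_vec_push are hypothetical names for this illustration and are not part of the asker's przejscie_t code.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical growable int array, for illustration only. */
typedef struct {
    int *data;
    size_t len;   /* elements in use */
    size_t cap;   /* elements allocated */
} int_vec;

/* Appends value, doubling the capacity when the array is full.
 * Returns false and leaves the vector unchanged if allocation fails. */
bool int_vec_push(int_vec *v, int value) {
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 8;
        int *p = realloc(v->data, new_cap * sizeof *p);
        if (!p) return false;   /* the old block is still valid */
        v->data = p;
        v->cap = new_cap;
    }
    v->data[v->len++] = value;
    return true;
}
```

Starting from a zero-initialized int_vec, 100 pushes trigger only five reallocations (capacities 8, 16, 32, 64, 128) instead of one per element. The caller frees v.data when done.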

Segfault after refactoring nested loops

I have some MATLAB code from a digital audio course that I've ported to C. Given an array of numeric data (for example, PCM audio encoded as double-precision floating-point), produce an array of data segments of a specified width and which overlap each other by a specified amount. Here's the relevant code.
typedef struct AudioFramesDouble {
const size_t n, // number of elements in each frame
num_frames;
double* frames[];
} AudioFramesDouble;
/*
* Produce a doubly-indexed array of overlapping substrings (a.k.a windows, frames,
* segments ...) from a given array of data.
*
* x: array of (i.e., pointer to) data
* sz: number of data elements to consider
* n: number of elements in each frame
* overlap: each frame overlaps the next by a factor of 1 - 1/overlap.
*/
AudioFramesDouble* audio_frames_double(register const double x[], const size_t sz, const unsigned n, const unsigned overlap) {
// Graceful exit on nullptr
if (!x) return (void*) x;
const double hop_d = ((double) n) / ((double) overlap); // Lets us "hop" to the start of the next frame.
const unsigned hop = (unsigned) ceil(hop_d);
const unsigned remainder = (unsigned) sz % hop;
const double num_frames_d = ((double) sz) / hop_d;
const size_t num_frames = (size_t) (remainder == 0
? floor(num_frames_d) // paranoia about floating point errors
: ceil(num_frames_d)); // room for zero-padding
const size_t total_samples = (size_t) n * num_frames;
AudioFramesDouble af = {.n = n, .num_frames = num_frames};
// We want afp->frames to appear as (double*)[num_frames].
AudioFramesDouble* afp = malloc((sizeof *afp) + (sizeof (double*) * num_frames));
if (!afp) return afp;
memcpy(afp, &af, sizeof af);
for (size_t i = 0; i < num_frames; ++i) {
/* Allocate zero-initialized space at the start of each frame. If this
fails, free up the memory and vomit a null pointer. */
afp->frames[i] = calloc(n, sizeof(double));
if (!afp->frames[i]) {
double* p = afp->frames[i];
for (long ii = ((long)i) - 1; 0 <= ii; ii--) {
free(afp->frames[--i]);
}
free(afp);
return (void*) p;
}
for (size_t j = 0, k; j < n; ++j) {
if (sz <= (k = (i*hop) + j)) break;
afp->frames[i][j] = x[k];
}
}
return afp;
}
This performs as expected. I wanted to optimize the nested FOR to the following
for (size_t i = 0, j = 0, k; i < num_frames; (j == n - 1) ? (j = 0,i++) : ++j) {
// If we've reached the end of the frame, reset j to zero.
// Then allocate the next frame and check for null.
if (j == 0 && !!(afp->frames[i] = calloc(n, sizeof(double)))) {
double* p = afp->frames[i];
for (long ii = ((long)i) - 1; 0 <= ii; ii--) {
free(afp->frames[--i]);
}
free(afp);
return (void*) p;
}
if (sz <= (k = (i*hop) + j)) break;
afp->frames[i][j] = x[k];
}
This actually compiles and runs just fine; but in my testing, when I try to access the last frame as in
xFrames->frames[xFrames->num_frames-1],
I get a segmentation fault. What's going on here? Am I neglecting an edge case in my loop? I've been looking over the code for awhile, but I might need a second set of eyes. Sorry if the answer is glaringly obvious; I'm a bit of a C novice.
P.S. I'm a fan of branchless programming, so if anyone has tips for cutting out those IFs, I'm all ears. I was using ternary operators before, but reverted to IFs for readability in debugging.
Remember that the logical operators && and || do short-circuit evaluation.
That means if j != 0 then you won't actually call calloc, and you'll have an invalid pointer in afp->frames[i].
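The short-circuit behaviour can be seen in isolation with a small sketch. The names calls, right_side and count_rhs_calls are made up for this demonstration; right_side stands in for the calloc expression on the right of the &&.

```c
#include <stdbool.h>

static int calls = 0;   /* counts how often the right-hand side runs */

/* Stand-in for the expression on the right of &&. */
static bool right_side(void) {
    calls++;
    return true;
}

/* Returns how many times right_side() ran over n iterations of
 * `j == 0 && right_side()`, with j cycling through 0..period-1. */
int count_rhs_calls(int n, int period) {
    calls = 0;
    for (int it = 0; it < n; it++) {
        int j = it % period;
        if (j == 0 && right_side()) {
            /* body intentionally empty */
        }
    }
    return calls;
}
```

With 12 iterations and a period of 4, the right-hand side runs only on the three iterations where j is 0. Separately, and only as an observation not made in the answer above: in the flattened loop the break on sz <= k leaves the whole loop rather than just the current frame, so any frames after that point would never be allocated. That may be worth checking against the segfault on the last frame.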

C Keep Getting Double Free, despite trying to free in same form as allocation

Hey I'm trying to do a simple machine learning application for school but I keep getting double free for some reason I cannot even fathom.
float * evaluate(Network net,float * in)
{
int i,j;
float * out;
Neuron cur_neu;
for(i=0,j=0;i<net.n_lay;i++) j = net.lay_sizes[i]>j?net.lay_sizes[i]:j; //Calculating the maximum lay size for output storage
out = (float *) malloc(j*sizeof(float));
for(i=0;i<net.n_lay;i++) //Cycling through layers
{
for(j=0;j<net.lay_sizes[i];j++) //Cycling through Neurons
{
cur_neu=net.matrix[i][j];
out[j] = cur_neu.af(cur_neu.w,in,net.lay_sizes[i-1]); //Storing each answer in out
}
for(j=0;j<net.lay_sizes[i];j++) in[j] = out[j]; //Transfering answers to in
}
return out;
}
float loss(Network net, float **ins_orig, int t_steps)
{
float **profecies;
float st = .5f;
int d_steps = 4;
int t, i, j;
int out_size = net.lay_sizes[net.n_lay - 1];
int in_size = net.lay_sizes[0];
float out = 0.0f;
float **ins;
/*
d_steps = Divination Steps: Number of time steps forward the network has to predict.
The size of the output layer must be d_steps*#ins (deconsidering any conceptual i/os)
t_steps = Total of Steps: Total number of time steps to simulate.
*/
//Copying ins
ins = (float **)malloc(t_steps * sizeof(float *));
for (i = 0; i < t_steps; i++) //I allocate memory for and copy ins_orig to ins here
{
ins[i] = (float *)malloc(in_size * sizeof(float));
for (j = 0; j < in_size; j++)
ins[i][j] = ins_orig[i][j];
}
//
profecies = (float **)malloc(t_steps * sizeof(float *));
for (t = 0; t < t_steps; t++)
{
profecies[t] = evaluate(net, ins[t]);
/*
Profecy 0:
[[a1,b1,c1,d1]
[e1,f1,g1,h1]
[i1,j1,k1,l1]]
Profecy 1:
[[e2,f2,g2,h2]
[i2,j2,k2,l2]
[m2,n2,o2,q2]]
Verification for:
t=0:
loss+= abs(a1-ins[t][0]+b2-ins[t][1]...)
t=1:
t=0:
loss+= abs(e1-ins[t][0]+f2-ins[t][1]...)
*/
for (i = 0; i < d_steps; i++) //i is distance of prediction
{
if (i <= t) // stops negative profecy indexing
{
for (j = 0; j < in_size; j++)
{
out += (ins[t][j] - profecies[t-i][j+in_size*i]) * (ins[t][j] - profecies[t-i][j+in_size*i]) * (1 + st*i); //(1+st*i) The further the prediction, the bigger reward
}
}
}
}
//Free ins
for (i = 0; i < t_steps; i++) //I try to free it here, but to no avail
{
free(ins[i]);
}
free(ins);
return out;
}
I realize it's probably something very obvious but, I can't figure it out for the life of me and would appreciate the help.
Extra details that probably aren't necessary:
evaluate just passes the input to the network (stored in ins) and returns the output
both inputs and outputs are stored in float "matrixes"
Edit: Added evaluate
In your loss() you allocate the same number of floats for each ins:
ins[i] = (float *)malloc(in_size * sizeof(float));
In your evaluate() you calculate the longest lay_size, indicating that it may NOT be net.lay_sizes[0]:
for(i=0,j=0;i<net.n_lay;i++) j = net.lay_sizes[i]>j?net.lay_sizes[i]:j; //Calculating the maximum lay size for output storage
Then you are writing out-of-bounds here:
for(j=0;j<net.lay_sizes[i];j++) in[j] = out[j]; //Transfering answers to in
From that point on, your heap is corrupted, which can make a later free() fail with an error such as a double free.
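One possible way to avoid the overflow, sketched under the assumption that the input size equals lay_sizes[0], is to let evaluate() work on a private buffer sized for the widest layer instead of writing back into the caller's in. The helper names max_layer_size and make_work_buffer are hypothetical and do not appear in the question.

```c
#include <stdlib.h>
#include <string.h>

/* Returns the largest of n_lay layer sizes (0 if n_lay is 0). */
size_t max_layer_size(const int *lay_sizes, int n_lay) {
    size_t m = 0;
    for (int i = 0; i < n_lay; i++)
        if ((size_t)lay_sizes[i] > m) m = (size_t)lay_sizes[i];
    return m;
}

/* Copies the caller's input into a freshly allocated, zero-filled buffer
 * with room for max_size floats. Requires in_size <= max_size.
 * The caller frees the result. Returns NULL on allocation failure. */
float *make_work_buffer(const float *in, size_t in_size, size_t max_size) {
    float *buf = calloc(max_size, sizeof *buf);
    if (!buf) return NULL;
    memcpy(buf, in, in_size * sizeof *buf);
    return buf;
}
```

Inside evaluate(), the "transfer answers to in" step would then write to this work buffer, which is freed before returning. A simpler alternative is to allocate each ins[i] in loss() with the maximum layer size rather than in_size.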

Can a function remember variables between numerous calls?

I am trying to create a function that will take in the positions of numerous bodies moving in circular motion, and output their orbital periods. Each body is stored in a struct which contains its X, Y and Z co-ordinate (as well as some other information I don't need for this specific task).
My current function for doing this is:
double calc_period(Body *bodies, double t, int Nbodies, int i)
{
double orbit_angle[Nbodies];
double initial_angle[Nbodies];
double last_angle[Nbodies];
double last_t[Nbodies];
double half_step[Nbodies];
double running_total[Nbodies];
double orbitN[Nbodies];
double average_period[Nbodies];
orbit_angle[i] = atan(bodies[i].r[Y] / bodies[i].r[X]);
if (t==0) {
//Initialise all the variables to 0 the first time through
last_angle[i] = 0;
initial_angle[i] = orbit_angle[i];
orbitN[i] = 0;
half_step[i] = 1;
}
if (last_angle[i] < initial_angle[i] && orbit_angle[i] > initial_angle[i]) {
if (half_step[i] == 0) {
if (orbitN[i]==0) {
last_t[i] = t;
running_total[i] = t;
} else {
running_total[i] += t - last_t[i];
last_t[i] = t;
}
orbitN[i]++;
average_period[i] = running_total[i] / (DAYS_TO_SECS * orbitN[i]);
half_step[i] = 1;
} else if (half_step[i] == 1) {
half_step[i] = 0;
}
}
last_angle[i] = orbit_angle[i];
return average_period[i];
}
and this function is called in main like so:
for (double j = 0; j < max_time; j += timestep) {
update_positions(bodies, Nbodies, j);
for (int i = 0; i < Nbodies; i++) {
average_period[i] = calc_period(bodies, j, Nbodies, i);
if (j > max_time - timestep) {
printf("%s average period: %lg\n", bodies[i].name, average_period[i]);
}
}
}
and the problem that I'm having is that of course when the calc_period function finishes, the variables within are destroyed, so it cannot remember what initial_angle, last_angle or last_t were, so doesn't work. However I'm struggling to come up with a solution for this. If anyone can give any guidance it would be much appreciated.
Local (automatic) variables live on the stack and do not persist between function calls. The stack space a call uses is reused by later calls, so a local variable should only be relied on until the function returns.
There are several ways to solve your problem.
You can make the struct a global variable, which will be stored in BSS, or you can create a global pointer to a struct in dynamic memory, allocated with a function like malloc().
You can also declare a static variable inside the function, so that it keeps its value for future calls to that function.
The best solution would be either a static pointer to malloc()-allocated memory where your struct resides, or the same thing with a global pointer.
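The "static pointer to malloc()-allocated memory" idea can be sketched as follows. The struct OrbitState and the function orbit_state are hypothetical names for this illustration; the fields are borrowed loosely from the question, and the sketch assumes the number of bodies never changes between calls.

```c
#include <stdlib.h>

/* Hypothetical per-body state; field names borrowed from the question. */
typedef struct {
    double last_angle;
    double last_t;
    int orbit_count;
} OrbitState;

/* Returns the persistent state for body i, allocating zeroed storage for
 * n_bodies on the first call. Returns NULL if allocation fails or i is
 * out of range. Assumes n_bodies is the same on every call. */
OrbitState *orbit_state(int n_bodies, int i) {
    static OrbitState *states = NULL;   /* keeps its value between calls */
    if (!states) {
        if (n_bodies <= 0) return NULL;
        states = calloc((size_t)n_bodies, sizeof *states);
        if (!states) return NULL;
    }
    if (i < 0 || i >= n_bodies) return NULL;
    return &states[i];
}
```

calc_period could fetch its state with orbit_state(Nbodies, i) at the top of each call and update the fields in place. The storage in this sketch is never freed; passing a caller-owned array of state structs into the function instead avoids both that and the hidden global state.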

Function Warnings in C

Hello guys, I have three functions for which I get 4 warnings.
The first one is this:
void evaluatearxikos(void)
{
int mem;
int i;
double x[NVARS+1];
FILE *controlpointsarxika;
controlpointsarxika = fopen("controlpointsarxika.txt","r");
remove("save.txt");
for(mem = 0; mem < POPSIZE; mem++)
{
for(i = 0; i < NVARS; i++)
{
x[i+1] = population[mem].gene[i];
}
rbsplinearxiki();
XfoilCall();
population[mem].fitness = FileRead();
remove("save.txt");
}
fclose(controlpointsarxika);
}
For this one the compiler warns me that the variable x is set but not used. But I am actually using the variable x!
The second function is this one...
void elitist(void)
{
int i;
double best,worst;
int best_mem,worst_mem;
best = population[0].fitness;
worst = population[0].fitness;
for(i = 0; i < POPSIZE - 1; i++)
{
if(population[i].fitness > population[i+1].fitness)
{
if(population[i].fitness >= best)
{
best = population[i].fitness;
best_mem = i;
}
if(population[i+1].fitness <= worst)
{
worst = population[i+1].fitness;
worst_mem = i+1;
}
}
else
{
if(population[i].fitness <= worst)
{
worst = population[i].fitness;
worst_mem = i;
}
if(population[i+1].fitness >= best)
{
best = population[i+1].fitness;
best_mem = i+1;
}
}
}
if(best >= population[POPSIZE].fitness)
{
for(i = 0; i < NVARS; i++)
{
population[POPSIZE].gene[i] = population[best_mem].gene[i];
}
population[POPSIZE].fitness = population[best_mem].fitness;
}
else
{
for(i = 0; i < NVARS; i++)
{
population[worst_mem].gene[i] = population[POPSIZE].gene[i];
}
population[worst_mem].fitness = population[POPSIZE].fitness;
}
}
For this one I get two warnings that the variables worst_mem and best_mem may be used uninitialized in this function. But I assign values to both of them!
And the third function is this...
void crossover(void)
{
int mem,one;
int first = 0;
double x;
for(mem =0; mem < POPSIZE; mem++)
{
x = (rand() % 1000) / 1000.0; // divide by 1000.0; integer division would always give 0
if(x < PXOVER)
{
first++;
if(first%2 == 0)
{
random_Xover(one,mem);
}
else
{
one = mem;
}
}
}
}
For this one I get a warning that the variable one may be used uninitialized. But it is initialized!
Can you please tell me what is wrong with these functions?
Thank you in advance.
In your first function, you set (assign) x, but you never read it, hence you are not using it; you're only wasting CPU cycles by writing to it. (Note also that because you index it as i+1, x[0] is never written; the highest index used is NVARS, which is still inside the NVARS+1 elements you declared.)
In the second function, your assignments to those variables are in conditional blocks. You may be able to see (I didn't verify) that they are assigned on every path, but your compiler isn't that smart.
In your third function, it does appear that one could be referred to without having first been initialized.
First: You set x but do not use it. It's a local variable that gets set, but it is dropped as soon as the function returns.
Second: There might be values for which best_mem/worst_mem never get set in your if/else, yet you use them later on. If they haven't been set, they contain garbage.
Third: Your code never actually uses one before it has been set, but it still looks odd, and the compiler doesn't see that the first pass always takes the else branch.
When you get a compiler warning, treat it as a sign that you are doing something wrong, or at least not recommended, and that it could be done in a better way.
The x variable is only used on the left-hand side (i.e. assigned a value). You are not using that value on the right-hand side or passing it to a function.
It may be possible to get to the end of the loop for(i = 0; i < POPSIZE - 1; i++) without those variables having been given a value. Why not set them in the declaration?
The call to random_Xover(one,mem); could be made when one is not set. Change the line int mem,one; to int mem,one = <some value>;
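Setting the indices in the declaration, as suggested above, can be sketched like this. The function best_and_worst is a hypothetical helper, not the asker's elitist(); it only shows a best/worst search in which both indices are defined on every path, so the compiler has nothing to warn about.

```c
#include <stddef.h>

/* Finds the indices of the largest and smallest values in fitness[0..n-1].
 * Both indices start at 0, so they are defined on every path.
 * Requires n >= 1. On ties, the first occurrence wins. */
void best_and_worst(const double *fitness, size_t n,
                    size_t *best_mem, size_t *worst_mem) {
    size_t best = 0, worst = 0;   /* initialized at declaration */
    for (size_t i = 1; i < n; i++) {
        if (fitness[i] > fitness[best]) best = i;
        if (fitness[i] < fitness[worst]) worst = i;
    }
    *best_mem = best;
    *worst_mem = worst;
}
```

The tie-breaking here differs from the question's code, which uses >= and <= and so prefers later elements; swapping the comparisons restores that behaviour.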
