I have a problem with a series of functions. I have an array of 'return values' (computed through matrices) from a single function sys, which depends on an integer variable, say j, and I want to return them according to this j. That is, if I want equation number j, I just write sys(j).
For this I used a for loop, but I don't know if it's well defined, because when I run my code I don't get the right values.
Is there a better way to have an array of functions and call them in an easy way? That would make it easier to work with a function in a Runge-Kutta method to solve a differential equation.
Here is that part of the code (c is just the integer j I described above):
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
int N=3;
double s=10.;
//float r=28.;
double b=8.0/3.0;
/* Define functions */
double sys(int c,double r,double y[])
{
int l,m,n,p=0;
double tmp;
double t[3][3]={0};
double j[3][3]={{-s,s,0},{r-y[2],-1,-y[0]},{y[1],y[0],-b}}; //Jacobiano
double id[3][3] = { {y[3],y[6],y[9]} , {y[4],y[7],y[10]} , {y[5],y[8],y[11]} };
double flat[N*(N+1)];
// Multiplication of matrices J * Y
for(l=0;l<N;l++)
{
for(m=0;m<N;m++)
{
for(n=0;n<N;n++)
{
t[l][m] += j[l][n] * id[n][m];
}
}
}
// Transpose the matrix (J * Y) -> () t
for(l=0;l<N;l++)
{
for(m=l+1;m<N;m++)
{
tmp = t[l][m];
t[l][m] = t[m][l];
t[m][l] = tmp;
}
}
// We flatten the array to be left in one array
for(l=0;l<N;l++)
{
for(m=0;m<N;m++)
{
flat[p+N] = t[l][m];
}
}
flat[0] = s*(y[1]-y[0]);
flat[1] = y[0]*(r-y[2])-y[1];
flat[2] = y[0]*y[1]-b*y[2];
for(l=0;l<(N*(N+1));l++)
{
if(c==l)
{
return flat[c];
}
}
}
EDIT ----------------------------------------------------------------
Ok, this is the part of the code where I use the function:
int main(){
output = fopen("lyapcoef.dat","w");
int j,k;
int N2 = N*N;
int NN = N*(N+1);
double r;
double rmax = 29;
double t = 0;
double dt = 0.05;
double tf = 50;
double z[NN]; // Temporary matrix for RK4
double k1[N2],k2[N2],k3[N2],k4[N2];
double y[NN]; // Matrix for all variables
/* Initial conditions */
double u[N];
double phi[N][N];
double phiu[N];
double norm;
double lyap;
//Here we integrate the system using Runge-Kutta of fourth order
for(r=28;r<rmax;r++){
y[0]=19;
y[1]=20;
y[2]=50;
for(j=N;j<NN;j++) y[j]=0;
for(j=N;j<NN;j=j+3) y[j]=1; // Identity matrix for y from 3 to 11
while(t<tf){
/* RK4 step 1 */
for(j=0;j<NN;j++){
k1[j] = sys(j,r,y)*dt;
z[j] = y[j] + k1[j]*0.5;
}
/* RK4 step 2 */
for(j=0;j<NN;j++){
k2[j] = sys(j,r,z)*dt;
z[j] = y[j] + k2[j]*0.5;
}
/* RK4 step 3 */
for(j=0;j<NN;j++){
k3[j] = sys(j,r,z)*dt;
z[j] = y[j] + k3[j];
}
/* RK4 step 4 */
for(j=0;j<NN;j++){
k4[j] = sys(j,r,z)*dt;
}
/* Updating y matrix with new values */
for(j=0;j<NN;j++){
y[j] += (k1[j]/6.0 + k2[j]/3.0 + k3[j]/3.0 + k4[j]/6.0);
}
printf("%lf %lf %lf \n",y[0],y[1],y[2]);
t += dt;
}
Since you're actually computing all these values at the same time, what you really want is for the function to return them all together. The easiest way to do this is to pass in a pointer to an array, into which the function will write the values. Or perhaps two arrays; it looks to me as if the output of your function is (conceptually) a 3x3 matrix together with a length-3 vector.
So the declaration of sys would look something like this:
void sys(double v[3], double JYt[3][3], double r, const double y[12]);
where v would end up containing the first three elements of your flat and JYt would contain the rest. (More informative names are probably possible.)
Incidentally, the for loop at the end of your code is exactly equivalent to just saying return flat[c]; except that if c happens not to be >=0 and <N*(N+1) then control will just fall off the end of your function, which in practice means that it will return some random number that almost certainly isn't what you want.
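A minimal sketch of what such a sys could look like, assuming the layout used in the question (y[0..2] is the Lorenz state, y[3..11] holds the 3x3 matrix column by column) and the same values for s and b. The loop variable names are mine. It also sidesteps two things in the posted code that look unintended: the flattening loop never advances p, so every product lands in flat[3], and k1 to k4 are declared with N*N elements while the RK4 loops index them up to N*(N+1).

```c
/* Lorenz parameters, same values as the globals in the question. */
static const double s = 10.0;
static const double b = 8.0 / 3.0;

/*
 * Fill v with the three Lorenz derivatives and JYt with the transpose of J*Y.
 * Assumed layout (taken from the question): y[0..2] is the state and
 * y[3..11] stores the 3x3 matrix Y column by column.
 */
void sys(double v[3], double JYt[3][3], double r, const double y[12])
{
    const double J[3][3] = {
        { -s,        s,     0.0  },
        { r - y[2], -1.0,  -y[0] },
        { y[1],      y[0], -b    }
    };

    v[0] = s * (y[1] - y[0]);
    v[1] = y[0] * (r - y[2]) - y[1];
    v[2] = y[0] * y[1] - b * y[2];

    for (int row = 0; row < 3; row++) {
        for (int col = 0; col < 3; col++) {
            double sum = 0.0;
            for (int k = 0; k < 3; k++)
                sum += J[row][k] * y[3 + 3 * col + k];  /* Y[k][col] */
            JYt[col][row] = sum;                         /* store transposed */
        }
    }
}
```

With this shape, each RK4 stage calls sys once and then loops over the 12 results, instead of redoing the matrix product for every component.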
Your function sys() does an O(N^3) calculation to multiply two matrices, then does a couple of O(N^2) operations, and finally selects a single number to return. Then it is called the next time and goes through most of the same processing. It feels a tad wasteful unless (even if?) the matrices are really small.
The final loop in the function is a little odd, too:
for(l=0;l<(N*(N+1));l++)
{
if(c==l)
{
return flat[c];
}
}
Isn't that more simply written as:
return flat[c];
Or, perhaps:
if (c >= 0 && c < N * (N+1))
return flat[c];
else
...do something on disastrous error other than fall off the end of the
...function without returning a value as the code currently does...
I don't see where you are selecting an algorithm by the value of j. If that's what you're trying to describe, in C you can have an array of pointers to functions; you could use a numerical index to choose a function from the array, but you can also pass a pointer-to-a-function to another function that will call it.
That said: Judging from your code, you should keep it simple. If you want to use a number to control which code gets executed, just use an if or switch statement.
switch (c) {
case 0:
/* Algorithm 0 */
break;
case 1:
/* Algorithm 1 */
etc.
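For completeness, the array-of-function-pointers approach mentioned above could look roughly like this. It is only a sketch: the names rhs_fn, eq0 to eq2, equations and eval_equation are made up for illustration, and the three bodies are the first three Lorenz equations from the question with s = 10 and b = 8/3 written out.

```c
#include <stddef.h>

/* Every equation shares one signature so they can live in the same array. */
typedef double (*rhs_fn)(double r, const double y[]);

static double eq0(double r, const double y[]) { (void)r; return 10.0 * (y[1] - y[0]); }
static double eq1(double r, const double y[]) { return y[0] * (r - y[2]) - y[1]; }
static double eq2(double r, const double y[]) { (void)r; return y[0] * y[1] - (8.0 / 3.0) * y[2]; }

/* The index j selects the function. */
static const rhs_fn equations[] = { eq0, eq1, eq2 };

/* Evaluate equation number j; the caller must keep j below 3 in this sketch. */
double eval_equation(size_t j, double r, const double y[])
{
    return equations[j](r, y);
}
```

This only pays off when the equations are genuinely independent. In the posted code they all share the same matrix product, which is why computing everything in one call is the better fit there.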
Background
Imagine that we have N particles inside a box of length L, which interact with each other (through a Lennard Jones potential).
I want to compute the total potential energy of the system. I implemented the function POT which calculates all the contributions from all the particles and gives the correct results (this is tested and can be assumed true).
I also wrote a function POT_ONE which only calculates the potential energy of one particle with respect to all the others. This means that if I want to calculate the total potential energy I will have to call this function N times (making sure that the particle does not interact with itself) and then divide by 2 since I double count the interactions.
Goal
My goal is to make the second function yield the same results as the first one.
Problem
There is something really strange going on: if I use 4 particles, the two functions give the same results. If I add a fifth one, there is a deviation. Then for 6, 7 and 8 particles it again gives correct results, and for N=9 I get a different result. For N=1000 the result I get from POT_ONE is something like 113383820348202024.
My results for N=5 are:
-0.003911 with POT and
12.864234 with POT_ONE
In case someone tries to run the code and wants to check the N=4 case: change the number of particles (np), which is defined as a global variable, and comment out the line pos[12]=1;pos[13]=1;pos[14]=1;.
Code
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
/*GENERAL PARAMETERS*/
int dim=3; //number of dimensions
int np=5; //number of particles
double L=36.413; //box length (A)
double invL=1/36.413; //inverse of box length
/*ARGON CHARACTERISTICS*/
double sig=3.4; // Angstroms (A)
double e=0.001; // eV
double distSQ(double array[]){
/*calculates the squared distance given the array x=[dx,dy,dz]*/
int i;
double r2=0;
for(i=0;i<dim;i++) r2+=array[i]*array[i];
return r2;
}//distSQ
void MIC(double dr[],double L, int dim){
/* MINIMUM IMAGE CONVENTION: dr[] is the array dr = [dx,dy,dz] describing relative
positions of two particles, L is the box length, dim the number
of dimensions */
int i;
for(i=0;i<dim;i++) dr[i]-=round(dr[i]*invL)*L;
}//MIC
void POT(double x1[],double* potential){
/*given the positions of each particle in the form x=[x0,y0,z0;x1,y1,z1;...;xn-1,yn-1,zn-1],
the number of dimensions dim and particles np, it calculates the potential energy of the configuration*/
//variables for potential calculation
int i,j,k;
double *x2;
double r2inv; // 1/r^2
double foo,bar;
double dr[dim];
*potential=0; // set potential energy to zero
//main part of POT
for(i=0;i<np-1;i++){
x2=x1+dim;
for(j=i+1;j<np;j++){
//calculate relative distances between particles i & j
//apply periodic BCs and then calculate squared distance
//and the potential energy between them.
for(k=0;k<dim;k++) dr[k] = x2[k]-x1[k];
MIC(dr,L,dim); //periodic boundary conditions
r2inv=1/distSQ(dr);
//calculate potential energy
foo = sig*sig*r2inv;
bar = foo*foo*foo;
*potential+=bar*(bar-1);
}//for j
x1+=dim;
}//for i
*potential*=4*e; //scale and give energy units
}//POT
void POT_ONE(int particle,double pos[],double* potential){
*potential=0;
int i,k;
double dr[dim];
double r2inv,foo,bar;
double par_pos[dim];
int index=particle*dim;
par_pos[0]=pos[index];
par_pos[1]=pos[index+1];
par_pos[2]=pos[index+2];
for(i=0;i<np;i++){
if(i!=particle){
for(k=0;k<dim;k++) dr[k]=pos[k]-par_pos[k];
MIC(dr,L,dim);
r2inv=1/distSQ(dr);
foo=sig*sig*r2inv;
bar=foo*foo*foo;
*potential+=bar*(bar-1);
}
pos+=dim;
}
*potential*=4*e; //scale and give energy units
}//POT_ONE
int main(){
int D=np*dim;
double* pos=malloc(D*sizeof(double));
double potential=0; //calculated with POT
double U=0; ////calculated with POT_ONE
double tempU=0;
pos[0]=0;pos[1]=0;pos[2]=0;
pos[3]=4;pos[4]=0;pos[5]=0;
pos[6]=0;pos[7]=4;pos[8]=0;
pos[9]=0;pos[10]=0;pos[11]=4;
pos[12]=1;pos[13]=1;pos[14]=1;
POT(pos,&potential);
printf("POT: %f\n", potential);
int i,j;
for(i=0;i<np;i++){
POT_ONE(i,pos,&tempU);
U+=tempU;
}
U=U/2;
printf("POT_ONE: %f\n\n", U);
return 0;
}
Your error is in POT, where you forgot to update x2 at the end of the inner loop.
for (i = 0; i < np - 1; i++) {
double *x2 = x1 + dim;
for (j = i + 1; j < np; j++) {
// ... calculate stuff ..
x2 += dim;
}
x1 += dim;
}
An easier and arguably more readable variant is to forgo pointer arithmetic altogether and use boring old indices (here x is the unmodified pointer to the start of the position array):
for (k = 0; k < dim; k++) {
dr[k] = x[j * dim + k] - x[i * dim + k];
}
Further observations:
Please make your variables local to the scope where they are used. A large list of uninitialized variables at the top of the function makes it very hard to track variables, even in a short function like yours.
Please consider returning single values from functions instead of passing in pointers. In my opinion, that makes functions like the square of the distance more readable.
The structure of your code is hard to see because everything is run together very tightly, even the comments.
I run a loop a million times. Within the loop I call a C function to do some math (generating random variables from various distributions, to be exact). As part of that function, I declare a couple of double variables to hold parts of the transformation. An example:
void getRandNorm(double *randnorm, double mean, double var, int n)
{
// Declare variables
double u1;
double u2;
int arrptr = 0;
double sigma = sqrt(var); // the standard deviation
while (arrptr < n) {
// Generate two uniform random variables
u1 = rand() / (double)RAND_MAX;
u2 = rand() / (double)RAND_MAX;
// Box-Muller transform
randnorm[arrptr] = sqrt(-2*log(u1))*cos(2*pi*u2)*sigma+mean;
arrptr++;
if (arrptr < n) { // for an odd n, we cannot add off the end
randnorm[arrptr] = sqrt(-2*log(u2))*cos(2*pi*u1)*sigma+mean;
arrptr++;
}
}
}
And the calling loop:
iter = 1000000; // or something
for (i = 0; i < iter; i++) {
// lots of if statements
getRandNorm(sample1, truemean1, truevar1, n);
// some more analysis
}
I am working on speeding up the runtime. It occurs to me that I don't know what is happening with all these double variables that I am declaring. I assume a new 8 byte chunk of memory is allocated for the double for each of the one million loops. What happens to all those memory locations? They are declared within a C function; do they survive that function? Are they still locked up until the script exits?
The context for this question is wrapping this C program into a python function. If I'm going to execute this function multiple times in parallel from python, I want to be sure that I'm being as thrifty with memory usage as possible.
If you're talking about something like this:
for(int i=0;i<100000;i++){
double d = 5;
// some other stuff here
}
d is only allocated once by the compiler. It's mostly equivalent to declaring it above the for loop, except that the scope doesn't extend as far.
However, if you are doing something like this:
for(int i=0;i<1000000;i++){
double *d = malloc(sizeof(double));
free(d);
}
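Then yes, you will allocate a double 1 million times, but the allocator will likely reuse the same memory for subsequent allocations. Finally, if you don't free the memory in my second example, you'll leak roughly 16-32MB of memory (a million small allocations, each of which typically occupies 16 to 32 bytes once allocator bookkeeping is included).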
The short answer is: NO, it should not matter if you declare these double variables inside the loop in C. By double variable, I assume you mean variables of type double.
The long answer is: Please post your code so people can tell you if you do something wrong and how to fix it to improve correctness and/or performance (a vast subject).
The final answer is: with the code provided, it makes no difference whether you declare u1 and u2 inside the body of the loop or outside. A good compiler will likely generate the same code.
You can improve the code a tiny bit by testing the odd case just once:
void getRandNorm(double *randnorm, double mean, double var, int n, double pi) {
// Declare variables
double u1, u2;
double sigma = sqrt(var); // the standard deviation
int arrptr, odd;
odd = n & 1; // check if n is odd
n -= odd; // make n even
for (arrptr = 0; arrptr < n; arrptr += 2) {
// Generate two uniform random variables
u1 = rand() / (double)RAND_MAX;
u2 = rand() / (double)RAND_MAX;
// Box-Muller transform
randnorm[arrptr + 0] = sqrt(-2*log(u1)) * cos(2*pi*u2) * sigma + mean;
randnorm[arrptr + 1] = sqrt(-2*log(u2)) * cos(2*pi*u1) * sigma + mean;
}
if (odd) {
u1 = rand() / (double)RAND_MAX;
u2 = rand() / (double)RAND_MAX;
randnorm[arrptr++] = sqrt(-2*log(u1)) * cos(2*pi*u2) * sigma + mean;
}
}
Note: arrptr + 0 is here for symmetry, the compiler will not generate any code for this addition.
Regarding your question, "If I run a loop a million times, do I have to worry about declaring doubles in each iteration?":
The variables are being declared on the stack. So they 'disappear' when the function exits. The next execution of the function 're-creates' the variables, so (in reality) there is only a single instance of the variables and even then, only while the function is being executed.
So it does not matter how many times you call the function.
I am trying to convert f_rec (a recursive function) into f_iter (an iterative function), but I can't.
(My idea was to create a loop to calculate the results of f_rec(n-1).)
int f_rec(int n)
{
if(n>=3)
return f_rec(n-1)+2*f_rec(n-2)+f_rec(n-3);
else
return 1;
}
int f_iter(int n)
{
}
I also think the time complexity of f_rec is 3^n; please correct me if I'm wrong.
Thank you
There are two options:
1) Use discrete math and derive a closed-form formula. The complexity (as @Sasha mentioned) will be O(1) for both memory and time, provided you count raising a number to the n-th power as a single operation. No loops, no recursion, just the formula.
First you need to find the characteristic polynomial and calculate its roots. The recurrence has order 3, so the characteristic equation is r^3 = r^2 + 2r + 1 and it has three roots r1, r2, r3. Then the n-th element is F(n) = A * r1^n + B * r2^n + C * r3^n, where A, B, C are unknown coefficients. You can find these coefficients from your initial conditions (F(n) = 1 for n < 3).
I can explain it in Russian if you need.
2) Use additional variables to store intermediate values, just as @6052 has answered (he answered really fast :) ).
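Written out for the recurrence in the question, option 1 amounts to the system below. The recurrence has order three, so there are three roots and three coefficients; the sketch assumes the roots are distinct, and the initial values F(0) = F(1) = F(2) = 1 come from the else branch of f_rec.

```latex
\begin{aligned}
F(n) &= F(n-1) + 2F(n-2) + F(n-3), \qquad F(0) = F(1) = F(2) = 1,\\
r^3 &= r^2 + 2r + 1 \quad \text{(characteristic equation, roots } r_1, r_2, r_3\text{)},\\
F(n) &= A\,r_1^{\,n} + B\,r_2^{\,n} + C\,r_3^{\,n},\\
1 &= A + B + C,\\
1 &= A\,r_1 + B\,r_2 + C\,r_3,\\
1 &= A\,r_1^{2} + B\,r_2^{2} + C\,r_3^{2}.
\end{aligned}
```

The last three lines are a linear system for A, B and C once the roots are known.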
You can always calculate the newest value from the last three. Just start calculating from the beginning and always save the last three:
int f_iter (int n) {
int last3[3] = {1,1,1}; // The three initial values. Use std::array if C++
for (int i = 3; i <= n; ++i) {
int new_value = last3[0] + 2 * last3[1] + last3[2];
last3[0] = last3[1];
last3[1] = last3[2];
last3[2] = new_value;
}
return last3[2];
}
This solution needs O(1) memory and O(n) runtime. There might exist a formula that calculates this in O(1) (there most likely is), but I guess for the sake of demonstrating the iteration technique, this is the way to go.
Your solution has exponential runtime: every call spawns up to three further calls, so you end up with O(3^n) operations, while the recursion depth (and therefore the stack memory) grows only as O(n).
The following is the idea
int first=1,second=1,third=1; /* f(0), f(1), f(2); for n<3 the answer is 1 */
int i;
for(i=3;i<=n;i++)
{
int next=first+2*second+third;
first=second;
second=third;
third=next;
}
printf("The answer is %d\n", third); /* third holds f(n) after the loop */
Memory is O(1) and time is O(n).
EDIT
Your recursive function is indeed exponential in time. To keep it linear you can use an array F[] together with memoization. First initialize every element of F[] to -1.
int f_rec(int n)
{
if(n>=3)
{
if(F[n]!=-1)return F[n];
F[n]=f_rec(n-1)+2*f_rec(n-2)+f_rec(n-3);
return F[n];
}
else
return 1;
}
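A self-contained variant of the memoized version, including the table that the snippet above leaves out, could look like this. The names MAX_N, memo and f_memo are mine, and the table size is an arbitrary small bound chosen so that the results still fit in an int. Since every f(n) is at least 1, a zero entry can serve as the "not computed yet" marker, which lets the static array's default initialization do the setup.

```c
#define MAX_N 25   /* arbitrary bound for this sketch; f(25) still fits in an int */

/* 0 means "not computed yet"; static storage starts out zeroed. */
static int memo[MAX_N + 1];

int f_memo(int n)
{
    if (n < 3)
        return 1;
    if (n > MAX_N)
        return -1;   /* outside the table in this sketch */
    if (memo[n] == 0)
        memo[n] = f_memo(n - 1) + 2 * f_memo(n - 2) + f_memo(n - 3);
    return memo[n];
}
```

Each f_memo(n) is computed at most once, so the total work is linear in n.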
Just keep three variables and roll them over
start with a, b and c all equal to 1
at each step new_a is a + 2*b + c
roll over: new_c is b, new_b is a
repeat the required number of steps
A bit of overkill, but this can be further optimized by letting what each variable represents change from step to step in an unrolled loop, combined with Duff's device to enter the loop:
int f_iter(int n){
int a=1, b=1, c=1;
if(n < 3)
return(1);
switch(n%3){
for( ; n > 2; n -= 3){
case 2:
b = c + 2*a + b;
case 1:
a = b + 2*c + a;
case 0:
c = a + 2*b + c;
}
}
return c;
}
I'm trying to optimize some of my code in C, which is a lot bigger than the snippet below. Coming from Python, I wonder whether you can simply multiply an entire array by a number like I do below.
Evidently, it does not work the way I do it below. Is there any other way that achieves the same thing, or do I have to step through the entire array as in the for loop?
void main()
{
int i;
float data[] = {1.,2.,3.,4.,5.};
//this fails
data *= 5.0;
//this works
for(i = 0; i < 5; i++) data[i] *= 5.0;
}
There is no shortcut; you have to step through each element of the array.
Note however that in your example, you may achieve a speedup by using int rather than float for both your data and multiplier.
If you want to, you can do what you want through BLAS, Basic Linear Algebra Subprograms, which is optimised. This is not in the C standard, it is a package which you have to install yourself.
Sample code to achieve what you want:
#include <stdio.h>
#include <stdlib.h>
#include <cblas.h>
int main () {
int limit =10;
float *a = calloc( limit, sizeof(float));
for ( int i = 0; i < limit ; i++){
a[i] = i;
}
cblas_sscal( limit , 0.5f, a, 1);
for ( int i = 0; i < limit ; i++){
printf("%3f, " , a[i]);
}
printf("\n");
}
The names of the functions are not obvious, but after reading the guidelines you can start to guess what a BLAS function does. sscal() splits into s for single precision and scal for scale, which means that this function works on floats. The same function for double precision is called dscal().
If you need to scale a vector by a constant and add it to another, BLAS has a function for that too:
saxpy()
s a x p y
float a*x + y
y[i] += a*x[i]
As you might guess there is a daxpy() too which works on doubles.
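To make the semantics concrete without installing anything, here is a plain-C reference loop that does what saxpy does for unit stride. It is not the BLAS routine itself, and the name saxpy_ref is made up.

```c
#include <stddef.h>

/* Reference semantics of saxpy with stride 1: y = a*x + y, element by element. */
void saxpy_ref(size_t n, float a, const float *x, float *y)
{
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}
```

An optimised BLAS computes the same result but is free to vectorise and block the loop.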
I'm afraid that, in C, you will have to use for(i = 0; i < 5; i++) data[i] *= 5.0;.
Python allows for so many more "shortcuts"; however, in C, you have to access each element and then manipulate those values.
Using the for-loop would be the shortest way to accomplish what you're trying to do to the array.
EDIT: If you have a large amount of data, there are more efficient (in terms of running time) ways to multiply 5 to each value. Check out loop tiling, for example.
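If the loop shows up in many places, it can be written once as a small helper so that each call site reads almost like the Python one-liner. This is a sketch; the name scale_array is made up.

```c
#include <stddef.h>

/* Multiply each of the n elements of data by factor, in place. */
void scale_array(float *data, size_t n, float factor)
{
    for (size_t i = 0; i < n; i++)
        data[i] *= factor;
}
```

Because an array argument arrives as a pointer, the length has to be passed explicitly, for example scale_array(data, 5, 5.0f) for the array in the question.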
data *= 5.0;
Here data is an array, which behaves like a constant address and cannot be assigned to.
If you want to multiply the first value in that array, use the * operator as below.
*data *= 5.0;
I am trying to evaluate a 10th-order polynomial with 11 coefficients. I am also trying to evaluate its derivative. I have written the three functions shown below.
This code evaluates the polynomial; a0 up to a10 are the coefficients.
float polynm( float a0,float a1,float a2,float a3,float a4,float a5,float a6,float a7,float a8,float a9,float a10,float x)
{
float poly = a0 + a1*x + a2*pow(x,2)+a3*pow(x,3)+a4*pow(x,4)+a5*pow(x,5)+a6*pow(x,6)+a7*pow(x,7)+a8*pow(x,8)+a9*pow(x,9)+a10*pow(x,10);
return poly;
}
This code evaluates the derivative of the polynomial; it calls a function deri:
float polynm_der(float a0,float a1,float a2,float a3,float a4,float a5,float a6,float a7,float a8,float a9,float a10,float x)
{ float der = a1 + a2*deri(x,2)+a3*deri(x,3)+a4*deri(x,4)+a5*deri(x,5)+a6*deri(x,6)+a7*deri(x,7)+a8*deri(x,8)+a9*deri(x,9)+a10*deri(x,10);
return der;
}
deri is below:
float deri(float x,int n)
{
float term_der = n*pow(x,n-1);
return term_der;
}
The code for the polynomial is inefficient. If I wanted to evaluate a 100th-order polynomial it would become impossible. Is there a way I can evaluate the polynomial and its derivative, maybe recursively, to avoid the unwieldy code?
One solution is to accept an array of coefficients and its length:
float poly(float x, float coefficients[], int order)
{
/* order is the number of coefficients; coefficients[idx] multiplies x^idx */
int idx;
float total = 0;
for (idx = 0; idx < order; idx++)
total += coefficients[idx] * pow(x, idx);
return total;
}
The recursive solution would be beautiful, but this isn't Lisp. Anyway, a similar approach can be used for derivatives. Just keep in mind the fact that in C, array parameters to functions turn into pointers, so you can't use cool things like sizeof to get their lengths.
Edit: In response to the comment, you can enforce your requirements when the coefficients array is constructed. Alternatively, if you're not in charge of that code, you can stick it in the function (hackishly) like so:
if (coefficients[0] == 0 || coefficients[1] == 0 || coefficients[order-1] == 0)
assert(0);
You could rewrite the functions to take the x value, an array of coefficients, and then the length.
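One way to do that, swapping in Horner's scheme so that no pow() calls are needed, is sketched below. The function name poly_eval and the optional deriv output parameter are my own choices; coeffs[i] is taken to be the coefficient of x^i, and n is the number of coefficients (11 for a 10th-order polynomial).

```c
#include <stddef.h>

/*
 * Evaluate p(x) = coeffs[0] + coeffs[1]*x + ... + coeffs[n-1]*x^(n-1)
 * with Horner's scheme. If deriv is not NULL, *deriv receives p'(x).
 */
double poly_eval(const double *coeffs, size_t n, double x, double *deriv)
{
    double p = 0.0;    /* running value of the polynomial */
    double dp = 0.0;   /* running value of its derivative */

    for (size_t i = n; i-- > 0; ) {
        dp = dp * x + p;          /* derivative update must use the old p */
        p = p * x + coeffs[i];
    }
    if (deriv)
        *deriv = dp;
    return p;
}
```

For coefficients {1, 2, 3}, that is 1 + 2x + 3x^2, the call at x = 2 returns 17 and stores 14 in *deriv. The same function handles a 100th-order polynomial without any change to the code.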