I am currently making a program that approximates the Schroedinger equation, and for my initial conditions, my professor said to begin with a gaussian. The formula I'm using for that is this (apologies, I don't know how to do equations in markdown):
p(x) = ( 1/sqrt(2 * PI) ) * e^( -1/2 * (x-u)^2 / o )
I am starting with u=0 and o=1 for simplicity's sake, and so the way I use it in my program is like this:
double gaussian(double x) {
    return (1/sqrt(2*M_PI)) * exp((-.5) * pow(x, 2));
}
void initial_conditions(int m, complex *values[], double dx) {
    for (size_t i = 0; i < m; i++)
    {
        values[i]->real = gaussian(i * dx);
    }
}
Compiled by: gcc project1.c -lm -o project1
But that produces a segfault every time I have run it. As far as I can tell, it should work, but I am somewhat of a novice to C. I have determined it is specifically that equation that is producing the error by using printf statements to narrow the place of error down, and it always gets to that specific whole formula and return statement and then dies.
Any advice or help would be appreciated.
complex *values[] is weird and unnatural. I can't see the invocation, but I have managed to convince myself that this really should be complex values[].
A complex is far too simple a thing to want to allocate each one individually on the heap; almost always an array of complex would be allocated in a single call to malloc() (or possibly even a stack allocated array by caller).
Going by the names, I can predict with decent confidence that the calling code didn't allocate each individual complex in values but just allocated values itself, and thus the crash is the -> dereference at values[i]->real, because values[i] is uninitialized. It looks like you want (carrying forward the single array) values[i].real = ... ; values[i].imag = 0;
At school several years ago I had to write a swap function that swaps two integers. I wanted to do this using bitwise operations without using a third variable, so I came up with this:
void swap( int * a, int * b ) {
    *a = *a ^ *b;
    *b = *a ^ *b;
    *a = *a ^ *b;
}
I thought it was good but when my function was tested by the school's correction program it found an error (of course when I asked they didn't want to tell me), and still today I don't know what didn't work, so I wonder in which case this method wouldn't work.
I wanted to do this using bitwise operations without using a third variable
Do you mind if I ask why? Was there a practical reason for this limitation, or was it just an intellectual puzzle?
when my function was tested by the school's correction program it found an error
I can't be sure what the correction program was complaining about, but one class of inputs this sort of solution is known to fail on is exemplified by
int x = 5;
swap(&x, &x);
printf("%d\n", x);
This prints 0, not 5.
You might say, "Why would anyone swap something with itself?"
They probably wouldn't, as I've shown it, but perhaps you can imagine that, in a mediocrely-written sort algorithm, it might end up doing the equivalent of
if(a[i] < a[j]) {
    /* they are in order */
} else {
    swap(&a[i], &a[j]);
}
Now, if it ever happens that i and j are the same, the swap function will wrongly zero out a[i].
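One possible way to guard against that aliasing case, sketched next to the ordinary temporary-variable version for comparison. The names swap_xor and swap_tmp are made up for this example:

```c
/* XOR swap that tolerates a == b (both pointers to the same object). */
void swap_xor(int *a, int *b) {
    if (a == b)
        return;      /* without this, x ^ x == 0 would wipe the value */
    *a ^= *b;
    *b ^= *a;
    *a ^= *b;
}

/* The conventional version: a temporary, no special cases. */
void swap_tmp(int *a, int *b) {
    int t = *a;
    *a = *b;
    *b = t;
}
```

With the guard in place, swap_xor(&x, &x) leaves x unchanged.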
See also What is the difference between two different swapping function?
I'm trying to learn how to optimize code (I'm also learning C), and in one of my books there's a problem on optimizing Horner's method for evaluating polynomials. I'm a little lost on how to approach the problem. I'm not great at recognizing what needs optimizing.
Any advice on how to make this function run faster would be appreciated.
Thanks
double polyh(double a[], double x, int degree) {
    long int i;
    double result = a[degree];
    for (i = degree-1; i >= 0; i--)
        result = a[i] + x*result;
    return result;
}
You really need to profile your code to test whether proposed optimizations really help. For example, it may be the case that declaring i as long int rather than int slows the function on your machine, but on the other hand it may make no difference on your machine but might make a difference on others, etc. Anyway, there's no reason to declare i a long int when degree is an int, so changing it probably won't hurt. (But still profile!)
Horner's rule is supposedly optimal in terms of the number of multiplies and adds required to evaluate a polynomial, so I don't see much you can do with it. One thing that might help (profile!) is changing the test i>=0 to i!=0. Of course, then the loop doesn't run enough times, so you'll have to add a line below the loop to take care of the final case.
Alternatively you could use a do { ... } while (--i) construct. (Or is it do { ... } while (i--)? You figure it out.)
You might not even need i, but using degree instead will likely not save an observable amount of time and will make the code harder to debug, so it's not worth it.
Another thing that might help (I doubt it, but profile!) is breaking up the arithmetic expression inside the loop and playing around with order, like
for (...) {
    result *= x;
    result += a[i];
}
which may reduce the need for temporary variables/registers. Try it out.
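For reference, one way the index-related suggestions above could be combined: an int index and a loop test against zero only. This is a sketch, not a measured improvement, and polyh_int is a made-up name; whether it is any faster than the original has to be profiled.

```c
/* Horner's rule; a[] holds degree + 1 coefficients and a[i] multiplies x^i.
   Assumes degree >= 0. */
double polyh_int(const double a[], double x, int degree) {
    double result = a[degree];
    int i = degree;
    while (i != 0) {          /* compare against zero only */
        i--;
        result = a[i] + x * result;
    }
    return result;
}
```

Decrementing inside the body means the loop still visits a[degree-1] down to a[0], so no extra line is needed after the loop for the final term.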
Some suggestions:
You may use int instead of long int for looping index.
Almost certainly the problem is inviting you to conjecture on the values of a. If that vector is mostly zeros, then you'll go faster (by doing fewer double multiplications, which will be the clear bottleneck on most machines) by computing only the values of a[i] * x^i for a[i] != 0. In turn the x^i values can be computed by careful repeated squaring, preserving intermediate terms so that you never compute the same partial power more than once. See the Wikipedia article if you've never implemented repeated squaring.
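A rough sketch of the sparse idea, under the assumption that degree is non-negative and most coefficients are zero. The squares x, x^2, x^4, ... are each computed once and kept; each needed power x^i is then assembled from them. Both function names are invented for the example, and whether this beats plain Horner depends entirely on how sparse a really is, so profile before adopting it.

```c
#include <stddef.h>

/* x^e assembled from the saved squares, where sq[k] == x^(2^k). */
double pow_from_squares(const double sq[], unsigned e) {
    double p = 1.0;
    for (size_t k = 0; e != 0; k++, e >>= 1)
        if (e & 1u)
            p *= sq[k];
    return p;
}

/* Sums a[i] * x^i over the nonzero coefficients only (degree >= 0 assumed). */
double poly_sparse(const double a[], double x, int degree) {
    double sq[sizeof(unsigned) * 8];   /* one slot per possible exponent bit */
    size_t nsq = 1;
    double result = 0.0;

    /* Repeated squaring: each x^(2^k) is computed exactly once. */
    sq[0] = x;
    while (((unsigned)degree >> nsq) != 0) {
        sq[nsq] = sq[nsq - 1] * sq[nsq - 1];
        nsq++;
    }
    for (int i = 0; i <= degree; i++)
        if (a[i] != 0.0)
            result += a[i] * pow_from_squares(sq, (unsigned)i);
    return result;
}
```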
As the title says (and suggests), I'm new to C and I'm trying to return an arbitrary sized array of structs from a function. I chose to use malloc, as someone on the internet, who's cleverer than me, pointed out that unless I allocate to the heap, the array will be destroyed when points_on_circle finishes executing, and a useless pointer will be returned.
The code I'm presenting used to work, but now I'm calling the function more and more in my code, I'm getting a runtime error ./main: free(): invalid next size (normal): 0x0a00e380. I'm guessing this is down to my hacked-together implementation of arrays/pointers.
I'm not calling free as of yet, as many of the arrays I'm building will need to persist throughout the life of the program (I will be adding free() calls to the remainder!).
xy* points_on_circle(int amount, float radius)
{
    xy* array = malloc(sizeof(xy) * amount);
    float space = (PI * 2) / amount;
    while (amount-- >= 0) {
        float theta = space * amount;
        array[amount].x = sin(theta) * radius;
        array[amount].y = cos(theta) * radius;
    }
    return array;
}
My ground-breaking xy struct is defined as follows:
typedef struct { float x; float y; } xy;
And an example of how I'm calling the function is as follows:
xy * outer_points = points_on_circle(360, 5.0);
for(;i<360;i++) {
    //outer_points[i].x
    //outer_points[i].y
}
A pointer in the right direction would be appreciated.
Allocating memory in one function and freeing it in another is fraught with peril.
I would allocate the memory and pass it (the memory buffer) to the function with a parameter indicating how many structures are allowed to be written to the buffer.
I've seen APIs where there are two functions, one to get the memory required and then another to actually get the data after the memory has been allocated.
[Edit] Found an example:
http://msdn.microsoft.com/en-us/library/ms647005%28VS.85%29.aspx
I would say that this program design is fundamentally flawed. First of all, logically a function which is doing calculations has nothing to do with memory allocation, those are two different things. Second, unless the function that allocates memory and the one that frees it belong to the same program module, the design is bad and you will likely get memory leaks. Instead, leave allocation to the caller.
The code also contains various dangerous practices. Avoid using the -- and ++ operators as part of complex expressions; it is a very common cause of bugs. Your code looks as if it has a fatal bug and is writing out of bounds on the array, just because you are mixing -- with other operators. There is never any reason to do so in the C language, so don't do it.
Another dangerous practice is the reliance on C's implicit type conversions from ints to float (balancing, aka "the usual arithmetic conversions"). What is "PI" in this code? Is it an int, float or double? The outcome of the code will vary depending on this.
Here is what I propose instead (not tested):
void get_points_on_circle (xy* buffer, size_t items, float radius)
{
    float space = (PI * 2.0f) / items;
    float theta;
    signed int i;
    for(i=items-1; i>=0; i--)
    {
        theta = space * i;
        buffer[i].x = sin(theta) * radius;
        buffer[i].y = cos(theta) * radius;
    }
}
EDIT: You are returning the array correctly, but ...
Consider you're making an array with 1 element
xy *outer_points = points_on_circle(1, 5.0);
What happens inside the function?
Let's check ...
xy* array = malloc(sizeof(xy) * amount);
allocate space for 1 element. OK!
while (amount-- >= 0) {
1 is greater or equal to 0 so the loop executes (and amount gets decreased)
after setting array[0] you return to the top of the loop
while (amount-- >= 0) {
0 is greater or equal to 0 so the loop executes (and amount gets decreased)
You're now trying to set array[-1], which is invalid because the index refers to an area outside of the array.
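The trace above can be reproduced without touching any out-of-bounds memory by recording the indices each loop condition would visit. Both helper names are made up for the demonstration, and out must have room for amount + 1 entries:

```c
/* Records every index the original loop body would touch; returns the count. */
int visit_buggy(int amount, int out[]) {
    int n = 0;
    while (amount-- >= 0)     /* condition from the question */
        out[n++] = amount;
    return n;
}

/* Same loop with the test changed to '> 0'. */
int visit_fixed(int amount, int out[]) {
    int n = 0;
    while (amount-- > 0)
        out[n++] = amount;
    return n;
}
```

For amount = 1, visit_buggy records the indices 0 and -1, while visit_fixed records only 0.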
I think this:
while (amount-- >= 0 ) {
should be:
while ( --amount >= 0 ) {
Consider the case where amount is zero initially.
You're doing the right thing as far as I'm concerned, provided of course your callers are free'ing the results.
Some people prefer to have the memory allocation and freeing responsibility in the same place (for symmetry), i.e. outside your function. In this case you would pass a pre-allocated xy* as a parameter and return the size of the buffer if the pointer was null:
int requiredSpace = points_on_circle(10, 10, NULL);
xy* myArray = (xy*)malloc(requiredSpace);
points_on_circle(10, 10, myArray);
free(myArray);
You are iterating over one element too many.
You should use the goes down to zero operator instead:
while ( amount --> 0) {
...
}
Allocating memory in one function and freeing it in the other is like giving a gun to a four year old. You shouldn't do that.
While your decision to count down amount and use while instead of using a temp value saved a few memory cycles, it is more conceptually confusing. Especially in a case where the maths are taking all the time here, you only save fractions.
But that is not the reason why you should waste minor amounts of time. This question is the reason: you've wasted hours! This applies to even the most experienced and smartest programmers. The mistakes are just more complicated and beyond the scope of a stackoverflow answer! In other words, the Peter Principle applies to coding too.
As you gain experience, don't make the mistake of thinking you can get away with taking these kinds of risks to save a cycle or two. That is why McConnell in Code Complete lists Humility as a positive programmer attribute.
Here's the solution you probably thought of to start with:
xy* points_on_circle(int amount, float radius)
{
    xy* array = malloc(sizeof(xy) * amount);
    float space = (PI * 2) / amount;
    int index;
    for (index=0;index<amount;index++) {
        float theta = space * index;
        array[index].x = sin(theta) * radius;
        array[index].y = cos(theta) * radius;
    }
    return array;
}
If you need speed, a tiny thing you can do is put theta outside the loop set to 0 and add 'space' each time since + is bound to be cheaper than * in floating point.
speed it up 10x or more?
If you need serious speed, this tip from answers.com will give you an improvement of 10x if you do it right:
By Pythagoras' theorem, x^2 + y^2 is radius^2. It is then simple to solve for x or y, given the other along with radius. You also do not need to compute for the whole circle - you can compute for one quadrant, and generate the other three quadrants by symmetry. Your generation loop would, for example, simply iterate from origin to radius as x by delta x, generating y, and reflecting that in the other three quadrants. You can also compute for one half of a quadrant, and use both symmetry and reflection to generate the other seven half quadrants.
When I was 12 I thought I was hot stuff drawing a circle using sin and cos in graphics 8 on my Atari 800. My cousin Marty (who as of late worked for Microsoft Robotics) erased my program and implemented the above solution, using only addition in the loop if I remember right, and drew the same circle in 4 seconds instead of a minute! Had I not been baptized I would have bowed down in worship. Too bad I don't have the code handy, but I'd bet a little googling would bring it up. Anybody?
gprof is not working properly on my system (MinGW) so I'd like to know which one of the following snippets is more efficient, on average.
I'm aware that internally C compilers convert everything into pointer arithmetic, but nevertheless I'd like to know if any of the following snippets has any significant advantage over the others.
The array has been allocated dynamically in contiguous memory as a 1D array and may be re-allocated at run time (it's for a simple board game, in which the player is allowed to re-define the board's size as often as he wants to).
Please note that i & j must get calculated and passed into the function set_cell() in every loop iteration (gridType is a simple struct with a few ints and a pointer to another cell struct).
Thanks in advance!
Allocate memory
grid = calloc( (nrows * ncols), sizeof(gridType) );
Snippet #1 (parse sequentially as 1D)
gridType *gp = grid;
register int i=0 ,j=0; // we need to pass those in set_cell()
if ( !grid )
    return;
for (gp=grid; gp < grid+(nrows*ncols); gp++)
{
    set_cell( gp, i, j, !G_OPENED, !G_FOUND, value, NULL );
    if (j == ncols-1) { // last col of current row has been reached
        j=0;
        i++;
    }
    else // last col of current row has NOT been reached
        j++;
}
Snippet #2 (parse as 2D array, using pointers only)
gridType *gp1, *gp2;
if ( !grid )
    return;
for (gp1=grid; gp1 < grid+nrows; gp1+=ncols)
    for (gp2=gp1; gp2 < gp1+ncols; gp2++)
        set_cell( gp2, (gp1-grid), (gp2-gp1), !G_OPENED, !G_FOUND, value, NULL );
Snippet #3 (parse as 2D, using counters only)
register int i,j; // we need to pass those in set_cell()
for (i=0; i<nrows; i++)
    for (j=0; j<ncols; j++)
        set_cell( &grid[i * ncols + j], i, j, !G_OPENED, !G_FOUND, value, NULL);
Free memory
free( grid );
EDIT:
I fixed #2 from gp1++) to gp1+=ncols), in the 1st loop, after Paul's correction (thx!)
For anything like this, the answer is going to depend on the compiler and the machine you're running it on. You could try each of your code snippets and time how long each one takes.
However, this is a prime example of premature optimization. The best thing to do is to pick the snippet which looks the clearest and most maintainable. You'll get much more benefit from doing that in the long run than from any savings you'd make from choosing the one that's fastest on your machine (which might not be fastest on someone else's anyway!)
Well, snippet 2 doesn't exactly work. You need different incrementing behavior; the outer loop should read for (gp1 = grid; gp1 < grid + (nrows * ncols); gp1 += ncols).
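Put together, the corrected pointer-only loop might look like the sketch below. The gridType and set_cell here are cut-down, hypothetical stand-ins (the real set_cell in the question takes more arguments), and fill_grid is a made-up wrapper. Note also that gp1 - grid is an element offset, not a row number, so this version divides by ncols to recover the row.

```c
#include <stddef.h>

/* Hypothetical, cut-down stand-ins for the asker's gridType and set_cell(). */
typedef struct { int row; int col; int value; } gridType;

void set_cell(gridType *cell, int i, int j, int value) {
    cell->row = i;
    cell->col = j;
    cell->value = value;
}

/* Assumes nrows >= 0 and ncols >= 0. */
void fill_grid(gridType *grid, int nrows, int ncols, int value) {
    gridType *end, *gp1, *gp2;
    if (!grid)
        return;
    end = grid + (size_t)nrows * (size_t)ncols;   /* one past the last cell */
    for (gp1 = grid; gp1 < end; gp1 += ncols)     /* gp1 walks row starts */
        for (gp2 = gp1; gp2 < gp1 + ncols; gp2++) /* gp2 walks one row */
            set_cell(gp2, (int)((gp1 - grid) / ncols), (int)(gp2 - gp1), value);
}
```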
Of the other two, any compiler that's paying attention will almost certainly convert snippet 3 into something equivalent to snippet 1. But really, there's no way to know without profiling them.
Also, remember the words of Knuth: "Premature optimization is the ROOT OF ALL EVIL. I have seen more damage done in the name of 'optimization' than for all other causes combined, including sheer, wrongheaded stupidity." People who write compilers are smarter than you (unless you're secretly Knuth or Hofstadter), so let the compiler do its job and you can get on with yours. Trying to write "clever" optimized code will usually just confuse the compiler, preventing it from writing even better, more optimized code.
This is the way I'd write it. IMHO it's shorter, clearer and simpler than any of your ways.
int i, j;
gridType *gp = grid;
for (i = 0; i < nrows; i++)
    for (j = 0; j < ncols; j++)
        set_cell( gp++, i, j, !G_OPENED, !G_FOUND, value, NULL );
gprof not working isn't a real excuse. You can still set up a benchmark and measure execution time.
You might not be able to measure any difference on modern CPUs until nrows*ncols is getting very large or the reallocation happens very often, so you might optimize the wrong part of your code.
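A bare-bones way to do that without gprof, using only clock() from the standard library. The time_it helper, the work struct and the sum_work workload are all invented for the example; in practice each snippet would be wrapped in its own function and passed in instead.

```c
#include <time.h>

/* Runs fn(arg) `reps` times and returns the elapsed CPU time in seconds. */
double time_it(void (*fn)(void *), void *arg, int reps) {
    clock_t start = clock();
    for (int r = 0; r < reps; r++)
        fn(arg);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Hypothetical workload: sums an int array so the loop has something to do.
   The volatile sink keeps the compiler from discarding the result. */
typedef struct { const int *data; int n; volatile long sink; } work;

void sum_work(void *arg) {
    work *w = arg;
    long s = 0;
    for (int i = 0; i < w->n; i++)
        s += w->data[i];
    w->sink = s;
}
```

clock() has coarse resolution on some systems, so reps usually has to be large enough for the total to reach at least a few tenths of a second before the numbers mean anything.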
This certainly is micro-optimization, as most of the runtime will most probably be spent in set_cell, and everything else could be optimized to the same or very similar code by the compiler.
You don't know until you measure it.
Any decent compiler may produce the same code; even if it doesn't, the effects of caching, pipelining, branch prediction and other clever stuff mean that simply guessing the number of instructions isn't enough.
I might've gone crazy here, but I keep recompiling the exact same code, and get different answers. I'm not using any random values at all. I am strictly staying to floats and 1D arrays (I want to port this to CUDA eventually).
Is it possible on the compiler side that my same code is being redone in a way that makes it not work at all?
I run the .exe by just clicking on it and it runs fine, but when I click "compile and run" (Dev C++ 4.9.9.2) none of my images come out right. ...although sometimes they do.
...any insight on how I fix this? If I can provide any more help please tell me.
Much Appreciated.
Edit:
Here's the block of code that, if I comment it out, everything runs sort of right. (It's completely deterministic if I comment this block out.)
-this is an electromagnetic simulator, if that helps at all:
//***********************************************************************
// Update HZ in PML regions (hzx,hzy)
//***********************************************************************
boundaryIndex = 0;
for (regionIndex = 1; regionIndex < NUMBEROFREGIONS; regionIndex++) {
    xStart = regionData[regionIndex].xStart;
    xStop = regionData[regionIndex].xStop ;
    yStart = regionData[regionIndex].yStart;
    yStop = regionData[regionIndex].yStop ;
    for (i = xStart; i < xStop; i++) {
        for (j = yStart; j < yStop; j++) {
            hzx = hz[i*xSize+j] - hzy[boundaryIndex]; // extract hzx
            hzx = dahz[i*xSize+j] * hzx + dbhz[i*xSize+j] * ( ey[i*(xSize+1)+j] - ey[(i+1)*(xSize+1)+j] ); // dahz,dbhz holds dahzx,dbhzx
            hzy[boundaryIndex] = dahzy[boundaryIndex] * hzy[boundaryIndex] + dbhzy[boundaryIndex] * ( ex[i*ySize+j+1] - ex[i*ySize+j] );
            hz[i*xSize+j] = hzx + hzy[boundaryIndex]; // update hz
            boundaryIndex++;
        } // jForLoop
    } // iForLoop
} //
where NUMBEROFREGIONS is constant (8) and xSize is defined at compile time (128 here).
Well some code examples would help! But this is a classic symptom of un-initialized variables.
You are not setting some important variables (indexes to 0, switches to True etc.) so your program picks up whichever values are hanging around in memory each time you run.
As these are effectively random values you get different results each time.
Is there an indexing error with your simulated two-dimensional array? Is ey supposed to be xSize or xSize+1 wide?
dahz[i*xSize+j] * hzx +
dbhz[i*xSize+j] * ( ey[i*(xSize+1)+j] -
ey[(i+1)*(xSize+1)+j] );
Your index treats 2D array ey as being xSize+1 wide. The code for array ex treats it as being ySize wide.
dbhzy[boundaryIndex] * ( ex[i*ySize+j+1] - ex[i*ySize+j] );
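One way to make this kind of mismatch harder to write is to keep each flat array together with its own dimensions and route every access through a checked index function. The grid2d struct and grid_index below are hypothetical, not part of the posted code:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wrapper: a flat row-major array plus its true dimensions. */
typedef struct { float *data; size_t rows; size_t cols; } grid2d;

/* Row-major offset of element (i, j). The assert catches out-of-range
   indices in debug builds; it compiles away when NDEBUG is defined. */
size_t grid_index(const grid2d *g, size_t i, size_t j) {
    assert(i < g->rows && j < g->cols);
    return i * g->cols + j;
}
```

An access then reads ey.data[grid_index(&ey, i, j)], and the row width used is always the one the array was actually created with.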
You are potentially invoking undefined behaviour. There are a number of things that are undefined by the C language standard. Some of these cases can be caught by the compiler and you may be issued a diagnostic, others are harder for the compiler to catch. Here are just a few things that have undefined behaviour:
Trying to use the value of an uninitialised variable:
int i;
printf("%d\n", i); // could be anything!
An object is modified more than once between sequence points:
int i = 4;
i = (i += ++i); // Woah, Nelly!
Reading or writing past the end of an allocated memory block:
int *ints = malloc(100 * sizeof (int));
ints[200] = 0; // Oops...
Using printf et al. but providing the wrong format specifiers:
int i = 4;
printf("%llu\n", i);
Converting a floating-point value to a signed integer type that cannot represent it (converting an out-of-range integer value to a signed type is implementation-defined instead, which may be why some call the specification ambiguous):
signed short i;
i = 100.0 * 100.0 * 100.0 * 100.0; // probably won't fit
Edit:
Answered before OP provided code.
Are you compiling it in debug mode or release mode? Each of these initializes the heap and memory differently.
As everybody said, without some code showing what is wrong we can't help you much.
My best guess from what you just explained is that you're creating pointers to unallocated memory.
something like this
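YourData *aFunction(void) {
    YourData yd = something; // local variable creation
    return &yd;              // BUG: returns the address of a local variable
}
int main(void) {
    YourData *p = aFunction(); // p dangles once aFunction has returned
}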
Here p is a pointer to something that was a local variable in aFunction and got destroyed as soon as the function returned. Sometimes, by PURE LUCK, it will still point to the right data that hasn't been overwritten yet, but this memory will eventually be reused and your pointer will read something completely different and effectively random.
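For contrast, a sketch of one common way to return data that outlives the function: allocate it on the heap and let the caller free it. YourData's fields and the make_data name are made up for the example.

```c
#include <stdlib.h>

/* Hypothetical payload type. */
typedef struct { int id; double value; } YourData;

/* Returns heap storage that stays valid after the function returns.
   The caller owns the result and must free() it; NULL means malloc failed. */
YourData *make_data(int id, double value) {
    YourData *yd = malloc(sizeof *yd);
    if (yd != NULL) {
        yd->id = id;
        yd->value = value;
    }
    return yd;
}
```

The alternative is to have the caller declare the YourData itself and pass its address in, which avoids the allocation altogether.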