gsl openmp failed integration - c

This is my first post on here, so go easy on me!
I have a very strange problem. I've written a C code that converts particle data to grid data (the data comes from a cosmological simulation). To do this conversion, I am using the GSL Monte Carlo VEGAS integrator. When I run it in serial, it runs just fine and gives me the correct answer (albeit slowly). To speed it up, I tried OpenMP. The problem is that when I run it in parallel, the integration times out (I set a MAX_ITER variable in the integration loop to avoid an infinite loop due to lack of convergence). The random number generator is set and initialized before the parallel block of code. I checked and double-checked, and all of the data about the particle that it fails at in parallel (x, y, and z position and the integration bounds being passed to the integrator) are the same in both serial and parallel. I also tried increasing MAX_ITER from 100 to 1000, but that did nothing; it just took longer to fail.
My question, then: does anyone have any idea why the code would run in serial but time out in parallel when using the exact same particles?
Also, in case you want it, the numbers for the offending particle are: x = 0.630278, y = 24.952896, z = 3.256376, h = 3 (this is the smoothing length of the particle, which serves to "smear" out the particle's mass, as the goal of the simulation is to use particles to sample a fluid; this is the SPH method), x integration bounds (lower, upper) = {0, 3.630278}, y bounds = {21.952896, 27.952896}, and z bounds = {0.256376, 6.256375}.
The idea behind the conversion is that the particle's mass is contained within a "smoothing sphere" of radius h centered at the particle itself. This mass is not distributed uniformly; it follows the SPH kernel (this is the function I'm integrating). Thus, depending on how the sphere is placed within its "home cell", only a part of the sphere may actually be inside this cell. The goal then is to get the appropriate bounds of integration and pass them to the integrator. The integrator (the code is below) has a check, whereby if the point given to it by the Monte Carlo integrator lies outside the sphere, it returns 0 (this is because getting the exact limits of integration for every possible case is a huge pain).
The code for my loop is here:
// Loop over every particle
#pragma omp parallel shared(M_P, m_header, NumPartTot, num_grid_elements, cxbounds,
cybounds, czbounds, master_cell) private(index, x, y, z, i, j, k, h, tid, cell,
corners)
{
tid = omp_get_thread_num();
// Set up cell struct. Allocate memory!
cell = set_up_cell();
#pragma omp for
for(index = 1; index <= NumPartTot; index++)
{
printf("\n\n\n************************************\n");
printf("Running particle: %d on thread: %d\n", index, tid);
printf("x = %f y = %f z = %f\n", M_P[index].Pos[0], M_P[index].Pos[1], M_P[index].Pos[2]);
printf("**************************************\n\n\n");
fflush(stdout);
// Set up convenience variables
x = M_P[index].Pos[0];
y = M_P[index].Pos[1];
z = M_P[index].Pos[2];
// Figure out which cell the particle is in
i = (int)((x / m_header.BoxSize) * num_grid_elements);
j = (int)((y / m_header.BoxSize) * num_grid_elements);
k = (int)((z / m_header.BoxSize) * num_grid_elements);
corners = get_corners(i, j, k);
// Check to see what type of particle we're dealing with
if(M_P[index].Type == 0)
{
h = M_P[index].hsml;
convert_gas(i, j, k, x, y, z, h, index, cell, corners);
}
else
{
update_cell_non_gas_properties(index, i, j, k, cell);
}
}
// Copy each thread's version of cell to cell_master
#ifdef _OPENMP
copy_to_master_cell(cell);
free_cell(cell);
#endif
} /*-- End of parallel region --*/
The problem occurs in the function convert_gas. The problematic section is here (in the home cell block):
// Case 6: Left face
if(((x + h) < cxbounds[i][j][k].hi) && ((x - h) < cxbounds[i][j][k].lo) &&
((y + h) < cybounds[i][j][k].hi) && ((y - h) >= cybounds[i][j][k].lo) &&
((z + h) < czbounds[i][j][k].hi) && ((z - h) >= czbounds[i][j][k].lo))
{
printf("Using case 6\n");
fflush(stdout);
// Home cell
ixbounds.lo = cxbounds[i][j][k].lo;
ixbounds.hi = x + h;
iybounds.lo = y - h;
iybounds.hi = y + h;
izbounds.lo = z - h;
izbounds.hi = z + h;
kernel = integrate(ixbounds, iybounds, izbounds, h, index, i, j, k);
update_cell_gas_properties(kernel, i, j, k, index, cell);
// Left cell
ixbounds.lo = x - h;
ixbounds.hi = cxbounds[i][j][k].lo;
iybounds.lo = y - h; // Not actual bounds. See note above.
iybounds.hi = y + h;
izbounds.lo = z - h;
izbounds.hi = z + h;
kernel = integrate(ixbounds, iybounds, izbounds, h, index, i - 1, j, k);
update_cell_gas_properties(kernel, i - 1, j, k, index, cell);
return;
}
The data that I'm currently using is test data, so I know exactly where the particles should be and what integration bounds they should have. When using gdb, I find that all of these numbers are correct. The integration loop in the function integrate is here (TOLERANCE is 0.2, WARM_CALLS is 10000, and N_CALLS is 100000):
gsl_monte_vegas_init(monte_state);
// Warm up
gsl_monte_vegas_integrate(&monte_function, lower_bounds, upper_bounds, 3,
WARM_CALLS, random_generator, monte_state, &result, &error);
// Actual integration
do
{
gsl_monte_vegas_integrate(&monte_function, lower_bounds, upper_bounds, 3,
N_CALLS, random_generator, monte_state, &result, &error);
iter++;
} while(fabs(gsl_monte_vegas_chisq(monte_state) - 1.0) > TOLERANCE && iter < MAX_ITER);
if(iter >= MAX_ITER)
{
fprintf(stdout, "ERROR!!! Max iterations %d exceeded!!!\n"
"See M_P[%d].id : %d (%f %f %f)\n"
"lower bnds : (%f %f %f) upper bnds : (%f %f %f)\n"
"trying to integrate in cell %d %d %d\n\n", MAX_ITER, pind, M_P[pind].id,
M_P[pind].Pos[0], M_P[pind].Pos[1], M_P[pind].Pos[2],
ixbounds.lo, iybounds.lo, izbounds.lo, ixbounds.hi, iybounds.hi, izbounds.hi, i, j, k);
fflush(stdout);
exit(1);
}
Again, this exact code with the exact same numbers runs in serial but not in parallel (OpenMP is switched on by a compile-time option in the makefile). I'm sure it's something stupid that I've done and simply cannot see at the moment (at least, I hope!). Anyway, thanks in advance for the help!
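One thing the post does not show is where random_generator and monte_state are allocated. The GSL manual asks for workspace objects to be allocated per thread, and a generator or VEGAS state shared by several threads would be advanced by all of them at once, which could plausibly keep the chi-squared test from settling. That is a guess, not a confirmed diagnosis. The sketch below shows the per-thread-state pattern without GSL: rng_state, rng_uniform, sphere_volume_mc and integrate_all are hypothetical stand-ins, and in the real code the analogous step would be calling gsl_rng_alloc and gsl_monte_vegas_alloc inside the parallel region.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a generator: all state lives in this struct,
   so every thread (or every loop iteration) can own a separate one. */
typedef struct { uint64_t s; } rng_state;

/* xorshift64 step; returns a double in [0, 1). The state must be nonzero. */
static double rng_uniform(rng_state *g)
{
    g->s ^= g->s << 13;
    g->s ^= g->s >> 7;
    g->s ^= g->s << 17;
    return (double)(g->s >> 11) / 9007199254740992.0; /* divide by 2^53 */
}

/* Plain Monte Carlo estimate of the unit sphere's volume: sample the cube
   [-1,1]^3 and count only the points that fall inside the sphere. */
static double sphere_volume_mc(rng_state *g, int n_samples)
{
    assert(n_samples > 0);
    int hits = 0;
    for (int s = 0; s < n_samples; s++) {
        double x = 2.0 * rng_uniform(g) - 1.0;
        double y = 2.0 * rng_uniform(g) - 1.0;
        double z = 2.0 * rng_uniform(g) - 1.0;
        if (x * x + y * y + z * z <= 1.0)
            hits++;
    }
    return 8.0 * hits / n_samples; /* cube volume times hit fraction */
}

/* Each iteration builds its own generator, so nothing mutable is shared.
   Without -fopenmp the pragma is ignored and the loop runs serially. */
void integrate_all(double *out, int n_jobs, int n_samples)
{
    #pragma omp parallel for
    for (int j = 0; j < n_jobs; j++) {
        rng_state g = { 0x9E3779B97F4A7C15ULL + (uint64_t)j };
        out[j] = sphere_volume_mc(&g, n_samples);
    }
}
```

Because each job seeds its own state, running integrate_all twice gives identical numbers regardless of how the iterations are scheduled, which is also a convenient way to compare serial and parallel runs.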

Related

Complex numbers (complex.h) and apparent lack of precision

I decided to play around a bit with complex.h, and ran into what I consider a very curious problem.
int mandelbrot(long double complex c, int lim)
{
long double complex z = c;
for(int i = 0; i < lim; ++i, z = cpowl(z,2)+c)
{
if(creall(z)*creall(z)+cimagl(z)*cimagl(z) > 4.0)
return 0;
}
return 1;
}
int mandelbrot2(long double cr, long double ci, int lim)
{
long double zr = cr;
long double zi = ci;
for(int i = 0; i < lim; ++i, zr = zr*zr-zi*zi+cr, zi = 2*zr*zi+ci)
{
if(zr*zr+zi*zi > 4.0)
return 0;
}
return 1;
}
These functions do not behave the same. If we input -2.0+0.0i and a limit higher than 17, the latter will return 1, which is correct for any limit, while the former will return 0, at least on my system. GCC 9.1.0, Ryzen 2700x.
I cannot for the life of me figure out how this can happen. I mean while I may not entirely understand how complex.h works behind the scenes, for this particular example it makes no sense that the results should deviate like this.
While writing I noticed the cpowl(z,2)+c and tried changing it to z*z+c, which helped. However, after a quick test, I found that the behavior still differs, e.g. for -1.3+0.1*I, lim=18.
I'm curious to know if this is specific to my system and what the cause might be, though I'm perfectly aware that the most likely scenario is that I have made a mistake; alas, I can't find it.
--- edit---
Finally, the complete code, including alterations and fixes. The two functions now seem to yield the same result.
#include <stdio.h>
#include <complex.h>
int mandelbrot(long double complex c, int lim)
{
long double complex z = c;
for(int i = 0; i < lim; ++i, z = z*z+c)
{
if(creall(z)*creall(z)+cimagl(z)*cimagl(z) > 4.0)
return 0;
}
return 1;
}
int mandelbrot2(long double cr, long double ci, int lim)
{
long double zr = cr;
long double zi = ci;
long double tmp;
for(int i = 0; i < lim; ++i)
{
if(zr*zr+zi*zi > 4.0) return 0;
tmp = zi;
zi = 2*zr*zi+ci;
zr = zr*zr-tmp*tmp+cr;
}
return 1;
}
int main()
{
long double complex c = -2.0+0.0*I;
printf("%i\n",mandelbrot(c,100));
printf("%i\n",mandelbrot2(-2.0,0.0,100));
return 0;
}
cpowl() still messes things up, but I suppose if I wanted to, I could just create my own implementation.
The second function is the one that's incorrect, not the first.
In the expression in the third clause of the for:
zr = zr*zr-zi*zi+cr, zi = 2*zr*zi+ci
The calculation of zi is using the new value of zr, not the current one. You'll need to save the results of these two calculations in temp variables, then assign these back to zr and zi:
int mandelbrot2(long double cr, long double ci, int lim)
{
    long double zr = cr;
    long double zi = ci;
    for(int i = 0; i < lim; ++i)
    {
        printf("i=%d, z=%Lf%+Lfi\n", i, zr, zi);
        if(zr*zr+zi*zi > 4.0)
            return 0;
        long double new_zr = zr*zr-zi*zi+cr;
        long double new_zi = 2*zr*zi+ci;
        zr = new_zr;
        zi = new_zi;
    }
    return 1;
}
Also, using cpowl for simple squaring will result in inaccuracies that can be avoided by simply using z*z in this case.
Difference for Input −2 + 0 i
cpowl is inaccurate. Exponentiation is a complicated function to implement, and a variety of errors likely arise in its computation. On macOS 10.14.6, z in the mandelbrot routine takes on these values in successive iterations:
z = -2 + 0 i.
z = 2 + 4.33681e-19 i.
z = 2 + 1.73472e-18 i.
z = 2 + 6.93889e-18 i.
z = 2 + 2.77556e-17 i.
z = 2 + 1.11022e-16 i.
z = 2 + 4.44089e-16 i.
z = 2 + 1.77636e-15 i.
z = 2 + 7.10543e-15 i.
z = 2 + 2.84217e-14 i.
z = 2 + 1.13687e-13 i.
z = 2 + 4.54747e-13 i.
z = 2 + 1.81899e-12 i.
z = 2 + 7.27596e-12 i.
z = 2 + 2.91038e-11 i.
z = 2 + 1.16415e-10 i.
z = 2 + 4.65661e-10 i.
Thus, once the initial error is made, producing 2 + 4.33681e-19 i, z continues to grow (correctly, as a result of mathematics, not just floating-point errors) until it is large enough to pass the test comparing the square of its absolute value to 4. (The test does not immediately capture the excess because the square of the imaginary part is so small it is lost in rounding when added to the square of the real part.)
In contrast, if we replace z = cpowl(z,2)+c with z = z*z + c, z remains 2 (that is, 2 + 0i). In general, the operations in z*z experience some rounding errors too, but not as badly as with cpowl.
Difference for Input −1.3 + 0.1 i
For this input, the difference is caused by the incorrect calculation in the update step of the for loop:
++i, zr = zr*zr-zi*zi+cr, zi = 2*zr*zi+ci
That uses the new value of zr when calculating zi. It can be fixed by inserting long double t; and changing the update step to
++i, t = zr*zr - zi*zi + cr, zi = 2*zr*zi + ci, zr = t
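To make the ordering problem concrete, here is a minimal sketch of a single update step written both ways. The names step_buggy and step_fixed are made up for illustration and are not part of the original code.

```c
#include <assert.h>

/* One step of z -> z*z + c written like the comma-style update from the
   question: zi is computed from the already-updated zr. */
void step_buggy(long double *zr, long double *zi, long double cr, long double ci)
{
    assert(zr && zi);
    *zr = *zr * *zr - *zi * *zi + cr;
    *zi = 2 * *zr * *zi + ci;      /* uses the NEW *zr: wrong */
}

/* Same step with the real part parked in a temporary first. */
void step_fixed(long double *zr, long double *zi, long double cr, long double ci)
{
    assert(zr && zi);
    long double t = *zr * *zr - *zi * *zi + cr;
    *zi = 2 * *zr * *zi + ci;      /* still sees the OLD *zr */
    *zr = t;
}
```

Starting from z = 1 + i with c = 0, step_fixed produces 0 + 2i, which is (1 + i)^2, while step_buggy produces 0 + 0i. For c = -2 + 0i the imaginary part stays zero, so both versions agree there, which is why that input did not expose the bug.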

Initial value problem for a system of ODEs solver C program

So I wanted to implement the path of the Moon around the Earth with a C program.
My problem is that you know the Moon's velocity and position at Apogee and Perigee.
So I started to solve it from Apogee, but I cannot figure out how I could add the second velocity and position as "initial value" for it. I tried it with an if but I don't see any difference between the results. Any help is appreciated!
Here is my code:
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <string.h>
typedef void (*ode)(double* p, double t, double* k, double* dk);
void euler(ode f, double *p, double t, double* k, double h, int n, int N)
{
double kn[N];
double dk[N];
double Rp = - 3.633 * pow(10,8); // x position at Perigee
for(int i = 0; i < n; i++)
{
f(p, 0, k, dk);
for (int j = 0; j < N; j++)
{
if (k[0] == Rp) // this is the "if" I mentioned in my comment
// x coordinate at Perigee
{
k[1] = 0; // y coordinate at Perigee
k[2] = 0; // x velocity component at Perigee
k[3] = 1076; // y velocity component at Perigee
}
kn[j] = k[j] + h * dk[j];
printf("%f ", kn[j]);
k[j] = kn[j];
}
printf("\n");
}
}
void gravity_equation(double* p, double t, double* k, double* dk)
{
// Earth is at the (0, 0)
double G = p[0]; // Gravitational constant
double m = p[1]; // Earth mass
double x = k[0]; // x coordinate at Apogee
double y = k[1]; // y coordinate at Apogee
double Vx = k[2]; // x velocity component at Apogee
double Vy = k[3]; // y velocity component at Apogee
dk[0] = Vx;
dk[1] = Vy;
dk[2] = (- G * m * x) / pow(sqrt((x * x)+(y * y)),3);
dk[3] = (- G * m * y) / pow(sqrt((x * x)+(y * y)),3);
}
void run_gravity_equation()
{
int N = 4; // how many equations there are
double initial_values[N];
initial_values[0] = 4.055*pow(10,8); // x position at Apogee
initial_values[1] = 0; // y position at Apogee
initial_values[2] = 0; // x velocity component at Apogee
initial_values[3] = (-1) * 964; // y velocity component at Apogee
int p = 2; // how many parameters there are
double parameters[p];
parameters[0] = 6.67384 * pow(10, -11); // Gravitational constant
parameters[1] = 5.9736 * pow(10, 24); // Earth mass
double h = 3600; // step size
int n = 3000; // the number of steps
euler(&gravity_equation, parameters, 0, initial_values, h, n, N);
}
int main()
{
run_gravity_equation();
return 0;
}
Your interface is
euler(odefun, params, t0, y0, h, n, N)
where
N = dimension of state space
n = number of steps to perform
h = step size
t0, y0 = initial time and value
The intended function of this procedure seems to be that the updated values are returned inside the array y0. There is no reason to insert some hack to force the state to have some initial conditions: the initial condition is passed as an argument, as you are already doing in run_gravity_equation(). The integration routine should remain agnostic of the details of the physical model.
It is extremely improbable that you will hit the same value in k[0] == Rp a second time. What you can do is to check for sign changes in Vx, that is, k[2], to find points or segments of extremal x coordinate.
Interpreting your description more closely, what you want is to solve a boundary value problem where x(0)=4.055e8, x'(0)=0, y'(0)=-964 and x(T)=-3.633e8, x'(T)=0. That is a more advanced task: it needs single or multiple shooting, and in addition the upper boundary T is itself unknown.
You might want to use the Kepler laws to gain further insight into the parameters of this problem, so that you can solve it with just a forward integration. The Kepler ellipse of the first Kepler law has the formula (scaled for Apogee at phi=0, Perigee at phi=pi)
r = R/(1-E*cos(phi))
so that
R/(1-E)=4.055e8 and R/(1+E)=3.633e8,
which gives
R=3.633*(1+E)=4.055*(1-E)
==> E = (4.055-3.633)/(4.055+3.633) = 0.054891,
R = 3.633e8*(1+0.05489) = 3.8324e8
Further, the angular velocity is given by the second Kepler law
phi'*r^2 = const. = sqrt(R*G*m)
which gives tangential velocities at Apogee (r=R/(1-E))
y'(0)=phi'*r = sqrt(R*G*m)*(1-E)/R = 963.9438
and Perigee (r=R/(1+E))
-y'(T)=phi'*r = sqrt(R*G*m)*(1+E)/R = 1075.9130
which indeed reproduces the constants you used in your code.
The area of the Kepler ellipse is pi/4 times the product of smallest and largest diameter. The smallest diameter can be found at cos(phi)=E, the largest is the sum of apogee and perigee radius, so that the area is
pi*R/sqrt(1-E^2)*(R/(1+E)+R/(1-E))/2= pi*R^2/(1-E^2)^1.5
At the same time it is the integral of 0.5*phi'*r^2 over the full period 2*T, thus equal to
sqrt(R*G*m)*T
which is the third Kepler law. This allows computing the half-period as
T = pi/sqrt(G*m)*(R/(1-E^2))^1.5 = 1185821
With h = 3600 the half point should be reached between n=329 and n=330 (n=329.395). Integration with scipy.integrate.odeint vs. Euler steps gives the following table for h=3600:
n      odeint/lsode [ x[n], y[n] ]           Euler [ x[n], y[n] ]
328    [ -4.05469444e+08,  4.83941626e+06]   [ -4.28090166e+08,  3.81898023e+07]
329    [ -4.05497554e+08,  1.36933874e+06]   [ -4.28507841e+08,  3.48454695e+07]
330    [ -4.05494242e+08, -2.10084488e+06]   [ -4.28897657e+08,  3.14986514e+07]
The same for h=36, n=32939..32940
n        odeint/lsode [ x[n], y[n] ]          Euler [ x[n], y[n] ]
32938    [ -4.05499997e+08  5.06668940e+04]   [ -4.05754415e+08  3.93845978e+05]
32939    [ -4.05500000e+08  1.59649309e+04]   [ -4.05754462e+08  3.59155385e+05]
32940    [ -4.05500000e+08 -1.87370323e+04]   [ -4.05754505e+08  3.24464789e+05]
32941    [ -4.05499996e+08 -5.34389954e+04]   [ -4.05754545e+08  2.89774191e+05]
which is a little closer for the Euler method, but not much better.

3D Sobel Operator Algorithm in C

I'm currently struggling to make a 3D Sobel edge detector in C (which I am quite new to). It's not exactly working as expected (highlighting non-edges within a solid 3D object) and I was hoping someone might see where I've gone wrong. (and sorry for the poor spacing in this post)
First of all, im is the input image which has been copied into tm with a 1 pixel border on each side.
I loop through the image:
for (z = im.zlo; z <= im.zhi; z++) {
for (y = im.ylo; y <= im.yhi; y++) {
for (x = im.xlo; x <= im.xhi; x++) {
I make an array which will house the change in the x, y, and z directions, and loop through a 3x3x3 cube:
int dxdydz[3] = {0, 0, 0};
for (a = -1; a < 2; a++) {
for (b = -1; b < 2; b++) {
for (c = -1; c < 2; c++) {
Now here's the meat, where it gets a bit tricky. I'm weighting my Sobel operator such that if you imagine one 2D surface of the kernel, it would be {{1,2,1},{2,4,2},{1,2,1}}. In other words, the weight of a kernel pixel is related to its 4-connected nearness to the center pixel.
To accomplish this, I define e as 3 - (|a| + |b| + |c|), so that it is either 0, 1, or 2. The kernel will be weighted by 3^e at each pixel.
The sign of the kernel pixel will just be determined by the sign of a, b, or c.
int e = 3 - (abs(a) + abs(b) + abs(c));
Now I loop through a, b, and c by packaging them into an array and looping from 0-1-2. When a for example is 0, we don't want to add any values to x, so we exclude that with an if statement (8 levels deep!).
int abc[3] = {a, b, c};
for (i = 0; i < 3; i++) {
if (abc[i] != 0) {
The value to add should just be the image value at that pixel multiplied by the kernel value at that pixel. abc[i] is just -1 or 1, and (int)pow(3, e) is the nearness-to-center weight.
dxdydz[i] += abc[i]*(int)pow(3, e)*tm.u[z+a][y+b][x+c];
}
}
}
}
}
Lastly take the sqrt of the sum of the squared changes in x, y, and z.
int mag2 = 0;
for (i = 0; i < 3; i++) {
mag2 += (int)pow(dxdydz[i], 2);
}
im.u[z][y][x] = (int)sqrt(mag2);
}
}
}
Of course I could just loop through the image and multiply 3x3x3 cubes by the 3D kernels:
int kx[3][3][3] = {{{-1,-2,-1},{0,0,0},{1,2,1}},
{{-2,-4,-2},{0,0,0},{2,4,2}},
{{-1,-2,-1},{0,0,0},{1,2,1}}};
int ky[3][3][3] = {{{-1,-2,-1},{-2,-4,-2},{-1,-2,-1}},
{{0,0,0},{0,0,0},{0,0,0}},
{{1,2,1},{2,4,2},{1,2,1}}};
int kz[3][3][3] = {{{-1,0,1},{-2,0,2},{-1,0,1}},
{{-2,0,2},{-4,0,4},{-2,0,2}},
{{-1,0,1},{-2,0,2},{-1,0,1}}};
But I think the loop approach is a lot sexier.
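One thing worth checking in the loop version is the weight itself. The face described above, {{1,2,1},{2,4,2},{1,2,1}}, is made of powers of two, whereas pow(3, e) yields 1, 3 and 9 for e = 0, 1, 2. A small sketch of the coefficient computed with 2^e instead follows; sobel_coeff and sobel_abs_sum are hypothetical helper names, not part of the posted code.

```c
#include <assert.h>
#include <stdlib.h>

/* Kernel coefficient for one axis at offset (a, b, c), each in {-1, 0, 1}.
   axis 0, 1, 2 selects a, b or c as the signed direction. */
int sobel_coeff(int axis, int a, int b, int c)
{
    assert(axis >= 0 && axis < 3);
    int abc[3] = {a, b, c};
    if (abc[axis] == 0)
        return 0;                             /* the middle slab contributes nothing */
    int e = 3 - (abs(a) + abs(b) + abs(c));   /* 0, 1 or 2 here */
    return abc[axis] * (1 << e);              /* weights 1, 2, 4 */
}

/* Sum of |coefficients| over the 3x3x3 cube, as a quick sanity check. */
int sobel_abs_sum(int axis)
{
    int sum = 0;
    for (int a = -1; a <= 1; a++)
        for (int b = -1; b <= 1; b++)
            for (int c = -1; c <= 1; c++)
                sum += abs(sobel_coeff(axis, a, b, c));
    return sum;
}
```

Each axis then has two signed faces of total weight 16, so sobel_abs_sum returns 32 for every axis, and the coefficients match the explicit kx, ky, kz tables entry for entry up to the ordering of the axes.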

Improvement to my Mandelbrot set code

I have the following Mandelbrot set code in C. I am doing the calculation and creating a .ppm file for the final fractal image. The problem is that my fractal image comes out with the wrong orientation; it appears rotated by 90 degrees. You can check it by executing my code:
./mandel > test.ppm
On the other hand, I also want to change the colours. I want to achieve this fractal image:
My final issue is that my code doesn't check the running time of my code. I have the code for this part too, but when code execution finishes it doesn't print the running time. If someone can make the appropriate changes to my code and help me achieve this fractal image, and make elapsed time displayed I would be glad.
#include <math.h>
#include <stdlib.h>
#include <time.h>
#include <stdio.h>
void color(int red, int green, int blue)
{
fputc((char)red, stdout);
fputc((char)green, stdout);
fputc((char)blue, stdout);
}
int main(int argc, char *argv[])
{
int w = 600, h = 400, x, y;
//each iteration, it calculates: newz = oldz*oldz + p, where p is the current pixel, and oldz starts at the origin
double pr, pi; //real and imaginary part of the pixel p
double newRe, newIm, oldRe, oldIm; //real and imaginary parts of new and old z
double zoom = 1, moveX = -0.5, moveY = 0; //you can change these to zoom and change position
int maxIterations = 1000; //after how many iterations the function should stop
clock_t begin, end;
double time_spent;
printf("P6\n# CREATOR: E.T / mandel program\n");
printf("%d %d\n255\n",w,h);
begin = clock();
//loop through every pixel
for(x = 0; x < w; x++)
for(y = 0; y < h; y++)
{
//calculate the initial real and imaginary part of z, based on the pixel location and zoom and position values
pr = 1.5 * (x - w / 2) / (0.5 * zoom * w) + moveX;
pi = (y - h / 2) / (0.5 * zoom * h) + moveY;
newRe = newIm = oldRe = oldIm = 0; //these should start at 0,0
//"i" will represent the number of iterations
int i;
//start the iteration process
for(i = 0; i < maxIterations; i++)
{
//remember value of previous iteration
oldRe = newRe;
oldIm = newIm;
//the actual iteration, the real and imaginary part are calculated
newRe = oldRe * oldRe - oldIm * oldIm + pr;
newIm = 2 * oldRe * oldIm + pi;
//if the point is outside the circle with radius 2: stop
if((newRe * newRe + newIm * newIm) > 4) break;
}
color(i % 256, 255, 255 * (i < maxIterations));
}
end = clock();
time_spent = (double)(end - begin) / CLOCKS_PER_SEC;
printf("Elapsed time: %.2lf seconds.\n", time_spent);
return 0;
}
Part 1:
You need to swap the order of your loops to:
for(y = 0; y < h; y++)
for(x = 0; x < w; x++)
That will give you the correctly oriented fractal.
Part 2:
To get the time to print out, you should print it to stderr since you are printing the ppm output to stdout:
fprintf(stderr, "Elapsed time: %.2lf seconds.\n", time_spent);
Part 3:
To get a continuous smooth coloring, you need to use the Normalized Iteration Count method or something similar. Here is a replacement for your coloring section that gives you something similar to what you desire:
if(i == maxIterations)
color(0, 0, 0); // black
else
{
double z = sqrt(newRe * newRe + newIm * newIm);
int brightness = 256. * log2(1.75 + i - log2(log2(z))) / log2((double)maxIterations);
color(brightness, brightness, 255);
}
It isn't quite there because I kind of did a simple approximate implementation of the Normalized Iteration Count method.
It isn't a fully continuous coloring, but it is kind of close.

Progressive loop through pairs of increasing integers

Suppose one wanted to search for pairs of integers x and y that satisfy some equation, such as (off the top of my head) 7 x^2 + x y - 3 y^2 = 5
(I know there are quite efficient methods for finding integer solutions to quadratics like that; but this is irrelevant for the purpose of the present question.)
The obvious approach is to use a simple double loop "for x = -max to max; for y = -max to max { blah}" But to allow the search to be stopped and resumed, a more convenient approach, picturing the possible integers of x and y as a square lattice of points in the plane, is to work round a "square spiral" outward from the origin, starting and stopping at (say) the top right corner.
So basically, I am asking for a simple and sound "pseudo-code" for the loops to start and stop this process at points (m, m) and (n, n) respectively.
For extra kudos, if the reader is inclined, I suggest also providing the loops if one of x and y can be assumed non-negative, or if both can be assumed non-negative. This is probably somewhat easier, especially the second.
I could whump this up myself without much difficulty, but am interested in seeing neat ideas of others.
This would make quite a good "constructive" interview challenge for those dreaded interviewers who like to torture candidates with white boards ;-)
def enumerateIntegerPairs(fromRadius, toRadius):
    for radius in range(fromRadius, toRadius + 1):
        if radius == 0: yield (0, 0)
        for x in range(-radius, radius): yield (x, radius)
        for y in range(-radius, radius): yield (radius, -y)
        for x in range(-radius, radius): yield (-x, -radius)
        for y in range(-radius, radius): yield (-radius, y)
Here is a straightforward implementation (also on ideone):
#include <stdio.h>

/* Rotate the direction (dr, dc) by a quarter turn. */
void turn(int *dr, int *dc) {
    int tmp = *dc;
    *dc = -*dr;
    *dr = tmp;
}
int main(void) {
    int N = 3;
    int r = 0, c = 0;
    int sz = 0;
    int dr = 1, dc = 0, cnt = 0;
    while (r != N+1 && c != N+1) {
        printf("%d %d\n", r, c);
        if (cnt == sz) {
            turn(&dr, &dc);
            cnt = 0;
            if (dr == 0 && dc == -1) {
                /* Finished a ring: step out to the next corner. */
                r++;
                c++;
                sz += 2;
            }
        }
        cnt++;
        r += dr;
        c += dc;
    }
    return 0;
}
The key in the implementation is the turn function, that performs the right turn given a pair of {delta-Row, delta-Col}. The rest is straightforward arithmetic.
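For comparison, the ring-by-ring idea from the Python generator can be sketched in C and applied to the example equation 7 x^2 + x y - 3 y^2 = 5. The names pair, ring and search are invented for this sketch, and the limit of 64 rings is an arbitrary assumption that keeps the buffer on the stack.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int x, y; } pair;

/* Writes the square ring of radius r (max(|x|,|y|) == r) into out and
   returns how many points were written: 1 for r == 0, otherwise 8*r. */
size_t ring(int r, pair *out)
{
    assert(r >= 0 && out);
    size_t n = 0;
    if (r == 0) {
        out[n++] = (pair){0, 0};
        return n;
    }
    for (int x = -r; x < r; x++) out[n++] = (pair){ x,  r};  /* top side */
    for (int y = -r; y < r; y++) out[n++] = (pair){ r, -y};  /* right side */
    for (int x = -r; x < r; x++) out[n++] = (pair){-x, -r};  /* bottom side */
    for (int y = -r; y < r; y++) out[n++] = (pair){-r,  y};  /* left side */
    return n;
}

/* Scans rings from_r..to_r for solutions of 7x^2 + xy - 3y^2 = 5, storing at
   most cap of them in found. Returns the number of solutions seen. */
size_t search(int from_r, int to_r, pair *found, size_t cap)
{
    assert(0 <= from_r && from_r <= to_r && to_r <= 64);
    pair buf[8 * 64];
    size_t hits = 0;
    for (int r = from_r; r <= to_r; r++) {
        size_t n = ring(r, buf);
        for (size_t i = 0; i < n; i++) {
            long x = buf[i].x, y = buf[i].y;
            if (7 * x * x + x * y - 3 * y * y == 5) {
                if (hits < cap) found[hits] = buf[i];
                hits++;
            }
        }
    }
    return hits;
}
```

Stopping after ring n and resuming later is then just a second call with from_r = n + 1. Within radius 1 the only solutions are (1, 1) and (-1, -1), found in that order.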

Resources