I am programming a game for fun and to get more familiar with C and GBA mode 3. Though, I have run into an issue.
I have these two blocks on the screen: one is the good guy, the other is the bad guy. When the good guy collides with the bad guy, it's supposed to remove a life. That is where the problem comes in.
I have this within a while loop that runs the game:
if (plyr_row < enemy_row + enemy_size &&
plyr_row + plyr_size > enemy_row &&
plyr_col < enemy_col + enemy_size &&
plyr_size + plyr_col > enemy_col)
{
lives--;
}
The lives do go down, but many lives are taken away while the player stays in contact with the enemy. In other words, during contact the lives drop really fast, and I just want to remove one for each time they collide. How can I accomplish that?
You have to use a flag to remember whether a collision is currently happening or not. Something like:
int in_collision = 0; // global flag, initialized to 0 once at start
...
if (plyr_row < enemy_row + enemy_size &&
plyr_row + plyr_size > enemy_row &&
plyr_col < enemy_col + enemy_size &&
plyr_size + plyr_col > enemy_col) {
if (!in_collision) {
in_collision = 1;
lives--;
}
} else {
in_collision = 0;
}
Now, the running collision must stop before another life will be removed on the following collision.
The simplest solution is to maintain a flag IN_COLLISION. You want to remove a life when there is a collision and IN_COLLISION is false.
Then it's a matter of toggling it to true at the first collision detection and then to false when you are not colliding anymore.
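As a minimal sketch of that flag logic, assume the overlap test has already been reduced to one boolean per frame. The names `CollisionState` and `update_collision` are made up for illustration and are not part of the original game code.

```c
#include <stdbool.h>

/* Hypothetical helper state: remembers whether the previous frame overlapped. */
typedef struct {
    bool in_collision;
    int  lives;
} CollisionState;

/* Call once per frame with the result of the overlap test.
   Returns true only on the frame where a new collision starts. */
bool update_collision(CollisionState *s, bool overlapping)
{
    bool started = overlapping && !s->in_collision;
    if (started) {
        s->lives--;                /* charge exactly one life per contact */
    }
    s->in_collision = overlapping; /* cleared again once the blocks separate */
    return started;
}
```

Inside the game loop this would be called with the four-way rectangle comparison as the `overlapping` argument, so a long contact costs one life and a second contact only counts after the blocks have separated.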
The code below shows the rowtotal[0], which is the return value I'm getting from an infinite loop for every iteration. I'm trying to break the loop when all three returned values from the costcheck array are the same. This is my code:
do
{
.
.
.
/*do loop body*/
.
.
costcheck[counter3]=rowtotal[0];
if(costcheck[counter3-2]==costcheck[counter3] &&
costcheck[counter3-1]==costcheck[counter3] )
{
response=1;
}
counter3++;
printf("\t\t\t Number of iterations: %d \r", stop++);
}
while(response!=1);
Just get rid of all strange, superfluous variables. You only need to save the result of the previous iteration, together with a counter which you increase each time you find a match, rather than every time in the loop.
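int counter=0;              // length of the current run of identical values
const int COUNT_N = 3;      // number of identical values in a row that ends the loop
data_t prev=FORBIDDEN;      // a value that rowtotal[0] can never have
while(counter != COUNT_N)
{
...
if(prev == rowtotal[0])
{
counter++;                  // one more value in the current run
}
else
{
counter=1;                  // a new run starts with this value
}
prev = rowtotal[0];
}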
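The same idea can be sketched as a small function that can be tested on its own. This is an illustration under assumptions: the values are taken from an `int` array instead of `rowtotal[0]`, and the name `values_until_run` is hypothetical. It tracks the length of the current run directly, which avoids the need for a sentinel such as `FORBIDDEN`.

```c
#include <stddef.h>

/* Returns how many values were consumed when `run_len` identical values
   in a row have been seen, or 0 if the input holds no such run. */
size_t values_until_run(const int *values, size_t n, size_t run_len)
{
    size_t run = 0;  /* length of the current run of equal values */
    int prev = 0;    /* only meaningful while run > 0 */

    for (size_t i = 0; i < n; i++) {
        if (run > 0 && values[i] == prev) {
            run++;       /* same value as last time: the run grows */
        } else {
            run = 1;     /* different value: a new run starts here */
        }
        prev = values[i];
        if (run == run_len) {
            return i + 1;
        }
    }
    return 0;
}
```

For the question's case `run_len` would be 3, and the loop body would compute the next value instead of reading it from an array.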
Just to elaborate on Lundin's answer, which is the way to go in my opinion (I would have posted this as a comment, but I lack the reputation):
The only thing missing is the actual loop advancement counter (counter3 in your example):
int quitCounter=0; // Counter for quitting the loop on 3 consecutive identical values
int loopCounter=0; // Your normal costcheck index (counter3 in your example)
const int QUIT_COUNT_N = 3;
#define FORBIDDEN 0xffffff // or some other value that rowtotal[0] can never have
data_t prev=FORBIDDEN; // the value seen in the previous iteration
do
{
...
/* do loop body, provide new value for rowtotal[0] on each iteration */
/* if you need to store the consecutive values returned in rowtotal[0] in costcheck array,
make sure that it will be big enough - if you only need to break on 3 identical values,
you can skip the entire costcheck array as Lundin proposes. */
...
costcheck[loopCounter]=rowtotal[0];
if(prev == costcheck[loopCounter])
{
quitCounter++; // one more value in the current run
}
else
{
quitCounter=1; // a new run starts with this value
}
prev = costcheck[loopCounter];
loopCounter++;
} while(quitCounter != QUIT_COUNT_N);
If you really want an infinite loop, a test such as if(costcheck[counter-1] == costcheck[counter-2] && costcheck[counter-2] == costcheck[counter-3]) will make the program fail if the costcheck array holds fewer than 3 elements. You have to be sure the array has at least 3 elements before you run it.
But counter never needs to exceed 3, because, as far as I understand it, you want to check the 3 most recently read elements. For the comparison you only need to remember the last 3 values that were read.
The example below stores up to 3 rowtotal[0] values and checks whether they are equal. If they are, the program exits; if not, the program appends the new rowtotal[0] to the "end" of the costcheck array, and the oldest value, here costcheck[0], is lost.
I can post the code of the example I made, to show how the logic should work.
Note: I strongly think Lundin's and Morphine's solutions are far better than mine.
do
{
.............
if(counter < 3)
{
costcheck[counter] = rowtotal[0];
counter++;
continue;
}
else
{
if(costcheck[counter-1] == costcheck[counter-2] && costcheck[counter-2] == costcheck[counter-3])
{
response=1;
}
else
{
costcheck[counter-3] = costcheck[counter-2];
costcheck[counter-2] = costcheck[counter-1];
costcheck[counter-1] = rowtotal[0];
}
}
}
while(response!=1);
}
This is a tough one, at least for my minimal C skills.
Basically, the user enters a list of prices into an array, and then the desired number of items he wants to purchase, and finally a maximum cost not to exceed.
I need to check how many combinations of the desired number of items are less than or equal to the cost given.
If the problem was a fixed number of items in the combination, say 3, it would be much easier with just three loops selecting each price and adding them to test.
Where I get stumped is the requirement that the user enter any number of items, up to the number of items in the array.
This is what I decided on at first, before realizing that the user could specify combinations of any number, not just three. It was created with help from a similar topic on here, but again it only works if the user specifies he wants 3 items per combination. Otherwise it doesn't work.
// test if any combinations of items can be made
for (one = 0; one < (count-2); one++) // count -2 to account for the two other variables
{
for (two = one + 1; two < (count-1); two++) // count -1 to account for the last variable
{
for (three = two + 1; three < count; three++)
{
total = itemCosts[one] + itemCosts[two] + itemCosts[three];
if (total <= funds)
{
// DEBUG printf("\nMatch found! %d + %d + %d, total: %d.", itemCosts[one], itemCosts[two], itemCosts[three], total);
combos++;
}
}
}
}
As far as I can tell there's no easy way to adapt this to be flexible based on the user's desired number of items per combination.
I would really appreciate any help given.
One trick to flattening nested iterations is to use recursion.
Make a function that takes an array of items that you have selected so far, and the number of items you've picked up to this point. The algorithm should go like this:
If you have picked the number of items equal to your target of N, compute the sum and check it against the limit
If you have not picked enough items, add one more item to your list, and make a recursive call.
To ensure that you do not pick the same item twice, pass the smallest index from which the function may pick. The declaration of the function may look like this:
int count_combinations(
int itemCosts[]
, size_t costCount
, int pickedItems[]
, size_t pickedCount
, size_t pickedTargetCount
, size_t minIndex
, int funds
) {
if (pickedCount == pickedTargetCount) {
// This is the base case. It has the code similar to
// the "if" statement from your code, but the number of items
// is not fixed.
int sum = 0;
for (size_t i = 0 ; i != pickedCount ; i++) {
sum += pickedItems[i];
}
// The following line will return 0 or 1,
// depending on the result of the comparison.
return sum <= funds;
} else {
// This is the recursive case. It is similar to one of your "for"
// loops, but instead of setting "one", "two", or "three"
// it sets pickedItems[0], pickedItems[1], etc.
int res = 0;
for (size_t i = minIndex ; i != costCount ; i++) {
pickedItems[pickedCount] = itemCosts[i];
res += count_combinations(
itemCosts
, costCount
, pickedItems
, pickedCount+1
, pickedTargetCount
, i+1
, funds
);
}
return res;
}
}
You call this function like this:
int itemCosts[C] = {...}; // The costs
int pickedItems[N]; // No need to initialize this array
int res = count_combinations(itemCosts, C, pickedItems, 0, N, 0, funds);
This can be done by using a backtracking algorithm. This is equivalent to implementing a list of nested for loops. This can be better understood by trying to see the execution pattern of a sequence of nested for loops.
For example, let's say you have, as you presented, a sequence of 3 fors and the execution has reached the third (innermost) level. After this level goes through all its iterations, you return to the second-level for, move to its next iteration, and from there enter the third-level for again. Similarly, when the second level finishes all its iterations, you jump back to the first-level for, which continues with its next iteration, enters the second level, and from there the third.
So, at a given level you try to go to the deeper one (if there is one), and if there are no more iterations you go back a level (backtrack).
Using the backtracking you represent the nested for by an array where each element is an index variable: array[0] is the index for for level 0, and so on.
Here is a sample implementation for your problem:
#define NUMBER_OF_OBJECTS 10
#define FORLOOP_DEPTH 4 // This is equivalent to the number
// of nested fors and, in the problem, is
// the number of requested objects
#define FORLOOP_ARRAY_INIT -1 // This is a init value for each "forloop" variable
#define true 1
#define false 0
typedef int bool;
int main(void)
{
int object_prices[NUMBER_OF_OBJECTS];
int forLoopsArray[FORLOOP_DEPTH];
bool isLoopVariableValueUsed[NUMBER_OF_OBJECTS];
int forLoopLevel = 0;
for (int i = 0; i < FORLOOP_DEPTH; i++)
{
forLoopsArray[i] = FORLOOP_ARRAY_INIT;
}
for (int i = 0; i < NUMBER_OF_OBJECTS; i++)
{
isLoopVariableValueUsed[i] = false;
}
forLoopLevel = 0; // Start from level zero
while (forLoopLevel >= 0)
{
bool isOkVal = false;
if (forLoopsArray[forLoopLevel] != FORLOOP_ARRAY_INIT)
{
// We'll mark the loop variable value from the last iteration unused
// since we'll use a new one (in this iteration)
isLoopVariableValueUsed[forLoopsArray[forLoopLevel]] = false;
}
/* All iterations (in all levels) start basically from zero
* Because of that here I check that the loop variable for this level
* is different than the previous ones or try the next value otherwise
*/
while ( isOkVal == false
&& forLoopsArray[forLoopLevel] < (NUMBER_OF_OBJECTS - 1))
{
forLoopsArray[forLoopLevel]++; // Try a new value
if (isLoopVariableValueUsed[forLoopsArray[forLoopLevel]] == false)
{
isLoopVariableValueUsed[forLoopsArray[forLoopLevel]] = true;
isOkVal = true;
}
}
if (isOkVal == true) // Have we found in this level an different item?
{
if (forLoopLevel == FORLOOP_DEPTH - 1) // Is it the innermost?
{
/* Here is the innermost level where you can test
* if the sum of all selected items is smaller than
* the target
*/
}
else // Nope, go a level deeper
{
forLoopLevel++;
}
}
else // We've run out of values in this level, go back
{
forLoopsArray[forLoopLevel] = FORLOOP_ARRAY_INIT;
forLoopLevel--;
}
}
}
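For comparison, here is a more compact sketch of the same "array of loop indices" idea. It is a variation, not the code above: each level starts one past the index of the level before it, so every combination is visited exactly once and no used-flag array is needed. The names `count_affordable` and `MAX_PICK`, and the choice to return 0 when no items are requested, are assumptions made for this example.

```c
#include <stddef.h>

#define MAX_PICK 32  /* assumed upper bound on items per combination */

/* Counts the k-item combinations of prices[0..n-1] whose sum is <= funds. */
long count_affordable(const int *prices, size_t n, size_t k, int funds)
{
    size_t idx[MAX_PICK];  /* idx[level] plays the role of one nested loop variable */
    long count = 0;

    if (k == 0 || k > n || k > MAX_PICK) {
        return 0;
    }
    for (size_t i = 0; i < k; i++) {
        idx[i] = i;        /* first combination: 0, 1, ..., k-1 */
    }
    for (;;) {
        long total = 0;
        for (size_t i = 0; i < k; i++) {
            total += prices[idx[i]];
        }
        if (total <= funds) {
            count++;
        }
        /* Backtrack: find the deepest level that can still advance. */
        size_t level = k;
        while (level > 0 && idx[level - 1] == n - k + (level - 1)) {
            level--;
        }
        if (level == 0) {
            break;         /* every level is exhausted */
        }
        idx[level - 1]++;
        /* Restart all deeper levels just after their parent. */
        for (size_t i = level; i < k; i++) {
            idx[i] = idx[i - 1] + 1;
        }
    }
    return count;
}
```

With prices 1, 2, 3, 4, two items per combination and funds of 5, the affordable pairs are (1,2), (1,3), (1,4) and (2,3).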
I saw the below algorithm works to check if a point is in a given polygon from this link:
int pnpoly(int nvert, float *vertx, float *verty, float testx, float testy)
{
int i, j, c = 0;
for (i = 0, j = nvert-1; i < nvert; j = i++) {
if ( ((verty[i]>testy) != (verty[j]>testy)) &&
(testx < (vertx[j]-vertx[i]) * (testy-verty[i]) / (verty[j]-verty[i]) + vertx[i]) )
c = !c;
}
return c;
}
I tried this algorithm and it actually works just perfect. But sadly I cannot understand it well after spending some time trying to get the idea of it.
So if someone is able to understand this algorithm, please explain it to me a little.
Thank you.
The algorithm is ray-casting to the right. Each iteration of the loop, the test point is checked against one of the polygon's edges. The first line of the if-test succeeds if the point's y-coord is within the edge's scope. The second line checks whether the test point is to the left of the line (I think - I haven't got any scrap paper to hand to check). If that is true the line drawn rightwards from the test point crosses that edge.
By repeatedly inverting the value of c, the algorithm counts how many times the rightward line crosses the polygon. If it crosses an odd number of times, then the point is inside; if an even number, the point is outside.
I would have concerns with a) the accuracy of floating-point arithmetic, and b) the effects of having a horizontal edge, or a test point with the same y-coord as a vertex, though.
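To make the parity argument visible, the sketch below counts the crossings instead of toggling a flag. It reuses the two tests from pnpoly unchanged; the function name `count_crossings` and the intermediate variable `xcross` are introduced here only for illustration.

```c
/* Counts how many polygon edges a rightward horizontal ray from
   (testx, testy) crosses. An odd count means "inside". */
int count_crossings(int nvert, const float *vertx, const float *verty,
                    float testx, float testy)
{
    int crossings = 0;
    for (int i = 0, j = nvert - 1; i < nvert; j = i++) {
        /* Does the edge straddle the horizontal line y = testy? */
        if ((verty[i] > testy) != (verty[j] > testy)) {
            /* x coordinate where the edge meets that horizontal line */
            float xcross = (vertx[j] - vertx[i]) * (testy - verty[i])
                           / (verty[j] - verty[i]) + vertx[i];
            if (testx < xcross) {
                crossings++;  /* the crossing lies to the right of the point */
            }
        }
    }
    return crossings;
}
```

For a square with corners (0,0), (4,0), (4,4), (0,4), a point in the middle sees one crossing, a point to the left of the square sees two, and a point to the right sees none.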
Edit 1/30/2022: I wrote this answer 9 years ago when I was in college. People in the chat conversation are indicating it's not accurate. You should probably look elsewhere. 🤷♂️
Chowlett is correct in every way, shape, and form.
The algorithm assumes that if your point is on the line of the polygon, then that is outside - for some cases, this is false. Changing the two '>' operators to '>=' and changing '<' to '<=' will fix that.
bool PointInPolygon(Point point, Polygon polygon) {
vector<Point> points = polygon.getPoints();
int i, j, nvert = points.size();
bool c = false;
for(i = 0, j = nvert - 1; i < nvert; j = i++) {
if( ( (points[i].y >= point.y ) != (points[j].y >= point.y) ) &&
(point.x <= (points[j].x - points[i].x) * (point.y - points[i].y) / (points[j].y - points[i].y) + points[i].x)
)
c = !c;
}
return c;
}
I changed the original code to make it a little more readable (also this uses Eigen). The algorithm is identical.
// This uses the ray-casting algorithm to decide whether the point is inside
// the given polygon. See https://en.wikipedia.org/wiki/Point_in_polygon#Ray_casting_algorithm
bool pnpoly(const Eigen::MatrixX2d &poly, float x, float y)
{
// If we never cross any lines we're inside.
bool inside = false;
// Loop through all the edges.
for (int i = 0; i < poly.rows(); ++i)
{
// i is the index of the first vertex, j is the next one.
// The original code uses a too-clever trick for this.
int j = (i + 1) % poly.rows();
// The vertices of the edge we are checking.
double xp0 = poly(i, 0);
double yp0 = poly(i, 1);
double xp1 = poly(j, 0);
double yp1 = poly(j, 1);
// Check whether the edge intersects a line from (-inf,y) to (x,y).
// First check if the line crosses the horizontal line at y in either direction.
if ((yp0 <= y) && (yp1 > y) || (yp1 <= y) && (yp0 > y))
{
// If so, get the point where it crosses that line. This is a simple solution
// to a linear equation. Note that we can't get a division by zero here -
// if yp1 == yp0 then the above if will be false.
double cross = (xp1 - xp0) * (y - yp0) / (yp1 - yp0) + xp0;
// Finally check if it crosses to the left of our test point. You could equally
// do right and it should give the same result.
if (cross < x)
inside = !inside;
}
}
return inside;
}
To expand on the "too-clever trick". We want to iterate over all adjacent vertices, like this (imagine there are 4 vertices):
i  j
0  1
1  2
2  3
3  0
My code above does it the simple obvious way - j = (i + 1) % num_vertices. However this uses integer division which is much much slower than all other operations. So if this is performance critical (e.g. in an AAA game) you want to avoid it.
The original code changes the order of iteration a bit:
i  j
0  3
1  0
2  1
3  2
This is still totally valid since we're still iterating over every vertex pair and it doesn't really matter whether you go clockwise or anticlockwise, or where you start. However now it lets us avoid the integer division. In easy-to-understand form:
int i = 0;
int j = num_vertices - 1; // 3
while (i < num_vertices) { // 4
{body}
j = i;
++i;
}
Or in very terse C style:
for (int i = 0, j = num_vertices - 1; i < num_vertices; j = i++) {
{body}
}
This might be as detailed as it gets for explaining the ray-casting algorithm in actual code. It might not be optimized, but optimization must always come after a complete grasp of the system.
//method to check if a Coordinate is located in a polygon
public boolean checkIsInPolygon(ArrayList<Coordinate> poly){
//this method uses the ray tracing algorithm to determine if the point is in the polygon
int nPoints=poly.size();
int j=-999;
int i=-999;
boolean locatedInPolygon=false;
for(i=0;i<(nPoints);i++){
//repeat loop for all sets of points
if(i==(nPoints-1)){
//if i is the last vertex, let j be the first vertex
j= 0;
}else{
//for all-else, let j=(i+1)th vertex
j=i+1;
}
float vertY_i= (float)poly.get(i).getY();
float vertX_i= (float)poly.get(i).getX();
float vertY_j= (float)poly.get(j).getY();
float vertX_j= (float)poly.get(j).getX();
float testX = (float)this.getX();
float testY = (float)this.getY();
// following statement checks if testPoint.Y is below Y-coord of i-th vertex
boolean belowLowY=vertY_i>testY;
// following statement checks if testPoint.Y is below Y-coord of i+1-th vertex
boolean belowHighY=vertY_j>testY;
/* following statement is true if testPoint.Y satisfies either (only one is possible)
-->(i).Y < testPoint.Y < (i+1).Y OR
-->(i).Y > testPoint.Y > (i+1).Y
(Note)
Both of the conditions indicate that a point is located within the edges of the Y-th coordinate
of the (i)-th and the (i+1)- th vertices of the polygon. If neither of the above
conditions is satisfied, then it is assured that a semi-infinite horizontal line draw
to the right from the testpoint will NOT cross the line that connects vertices i and i+1
of the polygon
*/
boolean withinYsEdges= belowLowY != belowHighY;
if( withinYsEdges){
// this is the slope of the line that connects vertices i and i+1 of the polygon
float slopeOfLine = ( vertX_j-vertX_i )/ (vertY_j-vertY_i) ;
// this looks up the x-coord of a point lying on the above line, given its y-coord
float pointOnLine = ( slopeOfLine* (testY - vertY_i) )+vertX_i;
//checks to see if x-coord of testPoint is smaller than the point on the line with the same y-coord
boolean isLeftToLine= testX < pointOnLine;
if(isLeftToLine){
//this statement changes true to false (and vice-versa)
locatedInPolygon= !locatedInPolygon;
}//end if (isLeftToLine)
}//end if (withinYsEdges
}
return locatedInPolygon;
}
Just one word about optimization: It isn't true that the shortest (and/or the tersest) code is the fastest implemented. It is a much faster process to read and store an element from an array and use it (possibly) many times within the execution of the block of code than to access the array each time it is required. This is especially significant if the array is extremely large. In my opinion, by storing each term of an array in a well-named variable, it is also easier to assess its purpose and thus form a much more readable code. Just my two cents...
The algorithm is stripped down to the most necessary elements. After it was developed and tested, all unnecessary stuff was removed. As a result you can't understand it easily, but it does the job, and with very good performance.
I took the liberty to translate it to ActionScript-3:
// not optimized yet (nvert could be left out)
public static function pnpoly(nvert: int, vertx: Array, verty: Array, x: Number, y: Number): Boolean
{
var i: int, j: int;
var c: Boolean = false;
for (i = 0, j = nvert - 1; i < nvert; j = i++)
{
if (((verty[i] > y) != (verty[j] > y)) && (x < (vertx[j] - vertx[i]) * (y - verty[i]) / (verty[j] - verty[i]) + vertx[i]))
c = !c;
}
return c;
}
This algorithm works in any closed polygon as long as the polygon's sides don't cross. Triangle, pentagon, square, even a very curvy piecewise-linear rubber band that doesn't cross itself.
1) Define your polygon as a directed group of vectors. By this it is meant that every side of the polygon is described by a vector that goes from vertex a_n to vertex a_(n+1). The vectors are directed so that the head of one touches the tail of the next, until the head of the last vector touches the tail of the first.
2) Select the point to test inside or outside of the polygon.
3) For each vector V_n along the perimeter of the polygon, find the vector D_n that starts on the test point and ends at the tail of V_n. Calculate the vector C_n defined as (D_n × V_n) / (D_n · V_n) (× indicates cross product; · indicates dot product). Call the magnitude of C_n by the name M_n.
4) Add all M_n and call this quantity K.
5) If K is zero, the point is outside the polygon.
6) If K is not zero, the point is inside the polygon.
Theoretically, a point lying ON the edge of the polygon will produce an undefined result.
The geometrical meaning of K is the total angle that the flea sitting on our test point "saw" the ant walking at the edge of the polygon walk to the left minus the angle walked to the right. In a closed circuit, the ant ends where it started.
Outside of the polygon, regardless of location, the answer is zero.
Inside of the polygon, regardless of location, the answer is "one time around the point".
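A minimal sketch of the "how many times around the point" quantity follows. Note that it does not sum angles as in the steps above: it uses the signed-crossing formulation of the winding number, which yields the number of whole turns without any trigonometry. The function name `winding_number` is an assumption for this example, and points exactly on an edge are not handled specially.

```c
/* Winding number of the closed polygon (vx[i], vy[i]) around (px, py):
   0 means outside; +1 or -1 means inside a simple polygon, with the sign
   giving the direction in which the vertices are listed. */
int winding_number(int n, const double *vx, const double *vy,
                   double px, double py)
{
    int wn = 0;
    for (int i = 0; i < n; i++) {
        int j = (i + 1) % n;  /* edge runs from vertex i to vertex j */
        /* > 0 if the point is left of the directed edge, < 0 if right */
        double side = (vx[j] - vx[i]) * (py - vy[i])
                    - (px - vx[i]) * (vy[j] - vy[i]);
        if (vy[i] <= py) {
            if (vy[j] > py && side > 0) {
                wn++;         /* upward crossing with the point on the left */
            }
        } else {
            if (vy[j] <= py && side < 0) {
                wn--;         /* downward crossing with the point on the right */
            }
        }
    }
    return wn;
}
```

A nonzero result corresponds to the "K is not zero" case in step 6, and a zero result to step 5.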
This method checks how many times a horizontal ray starting at the point (testx, testy) cuts the sides of the polygon.
There's a well-known result here: if a ray from a point cuts the sides of a polygon an odd number of times, the point is inside the polygon; otherwise it is outside.
To expand on @Chowlett's answer, where the second line checks if the point is to the left of the line:
No derivation is given but this is what I worked out:
First it helps to imagine 2 basic cases:
the point is left of the line . / or
the point is right of the line / .
If our point were to shoot a ray out horizontally where would it strike the line segment. Is our point to the left or right of it? Inside or out? We know its y coordinate because it's by definition the same as the point. What would the x coordinate be?
Take your traditional line formula y = mx + b. m is the rise over the run. Here, instead we are trying to find the x coordinate of the point on that line segment that has the same height (y) as our point.
So we solve for x: x = (y - b)/m. m is rise over run, (yj - yi)/(xj - xi), so dividing by m means multiplying by run over rise, (xj - xi)/(yj - yi). b is the offset from the origin. If we take vertex i as the base of our coordinate system, b becomes yi. Our input is testy, and subtracting yi turns the whole formula into an offset from yi.
We now have 1/m, that is (xj - xi)/(yj - yi), times the y offset (testy - yi): (xj - xi)(testy - yi)/(yj - yi). This is an x offset from xi, so we add xi back in order to compare the result with testx (or, equivalently, subtract xi from testx as well).
I think the basic idea is to calculate vectors from the point, one per edge of the polygon. If the vector crosses one edge, then the point is within the polygon. For concave polygons, if it crosses an odd number of edges it is inside as well (disclaimer: I am not sure whether it works for all concave polygons).
This is the algorithm I use, but I added a bit of preprocessing trickery to speed it up. My polygons have ~1000 edges and they don't change, but I need to look up whether the cursor is inside one on every mouse move.
I basically split the height of the bounding rectangle to equal length intervals and for each of these intervals I compile the list of edges that lie within/intersect with it.
When I need to look up a point, I can calculate - in O(1) time - which interval it is in and then I only need to test those edges that are in the interval's list.
I used 256 intervals and this reduced the number of edges I need to test to 2-10 instead of ~1000.
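The preprocessing described above might be sketched roughly as follows. This is not the author's code: the type `BandIndex`, the functions `band_index_build` and `band_index_contains`, and the fixed limits `NUM_BANDS` and `MAX_EDGES` are all assumptions made for this example, and the per-edge test is the pnpoly test used elsewhere on this page.

```c
#define NUM_BANDS 4   /* the answer above used 256; small here for readability */
#define MAX_EDGES 64  /* assumed upper bound on the polygon's edge count */

typedef struct {
    int nvert;
    const float *vx, *vy;
    float ymin, ymax;
    int count[NUM_BANDS];             /* number of edges stored per band */
    int edges[NUM_BANDS][MAX_EDGES];  /* edge i joins vertex i and its predecessor */
} BandIndex;

/* Map a y coordinate inside [ymin, ymax] to its band, in O(1). */
static int band_of(const BandIndex *b, float y)
{
    float height = b->ymax - b->ymin;
    int k = height > 0.0f ? (int)((y - b->ymin) / height * NUM_BANDS) : 0;
    if (k < 0) k = 0;
    if (k >= NUM_BANDS) k = NUM_BANDS - 1;
    return k;
}

/* Preprocessing step. Returns 0 on success, -1 if the polygon does not fit. */
int band_index_build(BandIndex *b, int nvert, const float *vx, const float *vy)
{
    if (nvert < 3 || nvert > MAX_EDGES) return -1;
    b->nvert = nvert;
    b->vx = vx;
    b->vy = vy;
    b->ymin = b->ymax = vy[0];
    for (int i = 1; i < nvert; i++) {
        if (vy[i] < b->ymin) b->ymin = vy[i];
        if (vy[i] > b->ymax) b->ymax = vy[i];
    }
    for (int k = 0; k < NUM_BANDS; k++) b->count[k] = 0;
    for (int i = 0, j = nvert - 1; i < nvert; j = i++) {
        float lo = vy[i] < vy[j] ? vy[i] : vy[j];
        float hi = vy[i] < vy[j] ? vy[j] : vy[i];
        /* Register the edge in every band its y range touches. */
        for (int k = band_of(b, lo); k <= band_of(b, hi); k++) {
            b->edges[k][b->count[k]++] = i;
        }
    }
    return 0;
}

/* Lookup: only the edges registered in the point's band are tested. */
int band_index_contains(const BandIndex *b, float x, float y)
{
    if (y < b->ymin || y > b->ymax) return 0;
    int k = band_of(b, y);
    int c = 0;
    for (int e = 0; e < b->count[k]; e++) {
        int i = b->edges[k][e];
        int j = (i + b->nvert - 1) % b->nvert;
        if (((b->vy[i] > y) != (b->vy[j] > y)) &&
            (x < (b->vx[j] - b->vx[i]) * (y - b->vy[i])
                 / (b->vy[j] - b->vy[i]) + b->vx[i])) {
            c = !c;
        }
    }
    return c;
}
```

The index keeps pointers to the caller's vertex arrays, so those arrays must outlive it. Memory grows with the number of bands times `MAX_EDGES`, which is the price paid for the fixed-size layout chosen here.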
Here's a php implementation of this:
<?php
class Point2D {
public $x;
public $y;
function __construct($x, $y) {
$this->x = $x;
$this->y = $y;
}
function x() {
return $this->x;
}
function y() {
return $this->y;
}
}
class Point {
protected $vertices;
function __construct($vertices) {
$this->vertices = $vertices;
}
//Determines if the specified point is within the polygon.
function pointInPolygon($point) {
/* @var $point Point2D */
$poly_vertices = $this->vertices;
$num_of_vertices = count($poly_vertices);
$edge_error = 1.192092896e-07;
$r = false;
for ($i = 0, $j = $num_of_vertices - 1; $i < $num_of_vertices; $j = $i++) {
/* @var $current_vertex_i Point2D */
/* @var $current_vertex_j Point2D */
$current_vertex_i = $poly_vertices[$i];
$current_vertex_j = $poly_vertices[$j];
if (abs($current_vertex_i->y - $current_vertex_j->y) <= $edge_error && abs($current_vertex_j->y - $point->y) <= $edge_error && ($current_vertex_i->x >= $point->x) != ($current_vertex_j->x >= $point->x)) {
return true;
}
if ($current_vertex_i->y > $point->y != $current_vertex_j->y > $point->y) {
$c = ($current_vertex_j->x - $current_vertex_i->x) * ($point->y - $current_vertex_i->y) / ($current_vertex_j->y - $current_vertex_i->y) + $current_vertex_i->x;
if (abs($point->x - $c) <= $edge_error) {
return true;
}
if ($point->x < $c) {
$r = !$r;
}
}
}
return $r;
}
}
Test Run:
<?php
$vertices = array();
array_push($vertices, new Point2D(120, 40));
array_push($vertices, new Point2D(260, 40));
array_push($vertices, new Point2D(45, 170));
array_push($vertices, new Point2D(335, 170));
array_push($vertices, new Point2D(120, 300));
array_push($vertices, new Point2D(260, 300));
$Point = new Point($vertices);
$point_to_find = new Point2D(190, 170);
$isPointInPolygon = $Point->pointInPolygon($point_to_find);
echo $isPointInPolygon;
var_dump($isPointInPolygon);
I modified the code to check whether the point is in a polygon, including the case where the point is on an edge.
bool point_in_polygon_check_edge(const vec<double, 2>& v, vec<double, 2> polygon[], int point_count, double edge_error = 1.192092896e-07f)
{
const static int x = 0;
const static int y = 1;
int i, j;
bool r = false;
for (i = 0, j = point_count - 1; i < point_count; j = i++)
{
const vec<double, 2>& pi = polygon[i];
const vec<double, 2>& pj = polygon[j];
if (fabs(pi[y] - pj[y]) <= edge_error && fabs(pj[y] - v[y]) <= edge_error && (pi[x] >= v[x]) != (pj[x] >= v[x]))
{
return true;
}
if ((pi[y] > v[y]) != (pj[y] > v[y]))
{
double c = (pj[x] - pi[x]) * (v[y] - pi[y]) / (pj[y] - pi[y]) + pi[x];
if (fabs(v[x] - c) <= edge_error)
{
return true;
}
if (v[x] < c)
{
r = !r;
}
}
}
return r;
}
The following snippet is code from water-nsq benchmark from SPLASH 2...
if (comp_last > NMOL1)
{
for (mol = StartMol[ProcID]; mol < NMOL; mol++)
{
pthread_mutex_lock(&gl->MolLock[mol % MAXLCKS]);
for ( dir = XDIR; dir <= ZDIR; dir++) {
temp_p = VAR[mol].F[DEST][dir];
temp_p[H1] += PFORCES[ProcID][mol][dir][H1];
temp_p[O] += PFORCES[ProcID][mol][dir][O];
temp_p[H2] += PFORCES[ProcID][mol][dir][H2];
}
pthread_mutex_unlock(&gl->MolLock[mol % MAXLCKS]);
}
comp = comp_last % NMOL;
for (mol = 0; ((mol <= comp) && (mol < StartMol[ProcID])); mol++)
{
pthread_mutex_lock(&gl->MolLock[mol % MAXLCKS]);
for ( dir = XDIR; dir <= ZDIR; dir++)
{
temp_p = VAR[mol].F[DEST][dir];
temp_p[H1] += PFORCES[ProcID][mol][dir][H1];
temp_p[O] += PFORCES[ProcID][mol][dir][O];
temp_p[H2] += PFORCES[ProcID][mol][dir][H2];
}
pthread_mutex_unlock(&gl->MolLock[mol % MAXLCKS]);
}
}
else
{
for (mol = StartMol[ProcID]; mol <= comp_last; mol++)
{
pthread_mutex_lock(&gl->MolLock[mol % MAXLCKS]);
for ( dir = XDIR; dir <= ZDIR; dir++)
{
temp_p = VAR[mol].F[DEST][dir];
temp_p[H1] += PFORCES[ProcID][mol][dir][H1];
temp_p[O] += PFORCES[ProcID][mol][dir][O];
temp_p[H2] += PFORCES[ProcID][mol][dir][H2];
}
pthread_mutex_unlock(&gl->MolLock[mol % MAXLCKS]);
}
}
pthread_barrier_wait(&(gl->start));
The problem is that it is not deterministic at the barrier in the end, that is, if you execute this code two times with same inputs, it gives different answers. In other words, if the lock order of mutexes is changed, the results are different.
And yes I have verified this by noting the memory pages. Also I can assure you that the change occurs in the VAR's (pointed by temp_p) memory.
I want to know why? Because apparently, all threads are putting their own values (PFORCES[ProcID]...) to the sum of temp_p and at the end, that is at the barrier, the results should be same, no matter the order in which threads acquired the locks.
[EDITED]
Also, please note that variables comp, dir and mol are all local variables of the thread and therefore not shared.
Second try.
I can't check it, but I assume that in temp_p[H1] += PFORCES[ProcID][mol][dir][H1]; you are adding doubles or floats.
For floating point types, the order of addition matters! Floating point addition is not associative!
A different thread order means a different addition order. So changes in the outcome are to be expected.
See http://en.wikipedia.org/wiki/Floating_point#Accuracy_problems for some explanation.
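The effect is easy to reproduce in isolation. The sketch below is not taken from water-nsq: `sum_in_order` and the sample values are assumptions chosen so that the rounding is visible. It adds the same three contributions in two different orders, the way threads reaching the lock in different orders would.

```c
#include <stddef.h>

/* Adds the same contributions in the order given by `order`,
   mimicking threads that acquire the lock in that order. */
double sum_in_order(const double *values, const size_t *order, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        total += values[order[i]];
    }
    return total;
}
```

With the contributions 1e20, -1e20 and 1.0, the order (1e20, -1e20, 1.0) cancels the two large terms first and yields 1.0, while the order (1.0, 1e20, -1e20) loses the 1.0 when it is added to 1e20 and yields 0.0.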
I notice that you do not show the declaration of the loop variables, like mol and dir.
Could it be that they are accidentally shared between threads?
If so, all kinds of race conditions between, e.g., one thread's mol++ and another thread's [mol % MAXLCKS] will cause problems.
UPDATE: According to the comments below, this does not seem to be the case.