OpenCL device side enqueue kernel & local memory

I'm trying to use local memory inside a device-side enqueued kernel.
My assumption is that any locally declared array is visible to all work items in the work group.
This holds when I use local memory in kernels that are enqueued from the host side, but I'm running into problems when I use a similar setup in device-side enqueued kernels.
Is there something wrong with my assumption?
Edit:
My kernel is below:
My goal is to sort the FIFO pipe into 3 buffers. The problem is that my work items have a limited view scope, and I'm trying to write the buffers into another pipe.
int pivot;
int in_pipe[BIN_SIZE];
int lt_bin[BIN_SIZE];
int gt_bin[BIN_SIZE];
int e_bin[BIN_SIZE];
reserve_id_t down_id = work_group_reserve_read_pipe(down_pipe, local_size);
//while ( is_valid_reserve_id(down_id) == false){
// down_id = work_group_reserve_read_pipe(down_pipe, local_size);
//}
//in_bin[tid] = -5;
if( is_valid_reserve_id(down_id) == true){
int status = read_pipe(down_pipe, down_id, lid, &pipe_out);
work_group_commit_read_pipe(down_pipe, down_id);
pivot = pipe_out;
pivot = work_group_broadcast(pivot, 0);
work_group_barrier(CLK_GLOBAL_MEM_FENCE);
work_group_barrier(CLK_LOCAL_MEM_FENCE);
in_pipe[tid] = pipe_out;
//in_bin[lid] = in_pipe[tid];
int e_count = 0;
int gt_count = 0;
int lt_count = 0;
if(in_pipe[tid] == pivot){
e_count = 1;
}
else if(in_pipe[tid] < pivot){
lt_count = 1;
}
else if(in_pipe[tid] > pivot){
gt_count = 1;
}
int e_tot = work_group_reduce_add(e_count);
e_tot = work_group_broadcast(e_tot, 0);
int e_val = work_group_scan_exclusive_add(e_count);
int gt_tot = work_group_reduce_add(gt_count);
gt_tot = work_group_broadcast(gt_tot, 0);
int gt_val = work_group_scan_exclusive_add(gt_count);
int lt_tot = work_group_reduce_add(lt_count);
lt_tot = work_group_broadcast(lt_tot, 0);
int lt_val = work_group_scan_exclusive_add(lt_count);
//in_bin[tid] = lt_val;
work_group_barrier(CLK_GLOBAL_MEM_FENCE);
work_group_barrier(CLK_LOCAL_MEM_FENCE);
if(in_pipe[tid] == pivot){
e_temp[e_val] = in_pipe[tid];
//in_bin[e_val] = e_bin[e_val];
//e_bin[e_Val] = work_group_broadcast(e_bin[e_val], lid);
}
if(in_pipe[tid] < pivot){
lte_temp[lt_val] = in_pipe[tid];
//in_bin[lt_val] = lt_bin[lt_val];
}
if(in_pipe[tid] > pivot){
gt_bin[gt_val] = in_pipe[tid];
//in_bin[gt_val] = gt_bin[gt_val];
}

No, not wrong. Local variables are declared and used across whole work-groups device-side too. They won't be shared with the parent kernels, though.
What exactly are you doing?

The working solution to my question is:
Pipes cannot be created on the device side. What I tried to accomplish was to make a dynamic tree structure, involving branches. OpenCL pipes simply cannot do that, as pipes are still memory objects, created on the host side. The specification currently provides no way to create memory objects from the device side.
Pipes can, however, be used in a dynamically recursive method, although the recursion cannot branch and must proceed linearly. Please consult the sample code found in the AMD APP SDK sample code packs for more details. Specifically, please look at the Device Enqueue BFS example.

Related

Only one of multiple threads is writing

I have a task to make a StarCraft like program with multiple pthreads as workers.
So I have multiple pthreads that run the following function:
void* scv(int num){
int minerals_carried = 0;
while(map_minerals_remaining>0){
minerals_carried = 0;
for(int i = 0; i<number_of_fields; i++){
if(fields[i].minerals != 0 && minerals_carried == 0){
if(pthread_mutex_trylock(&fields[i].mutex)==0){
sleep(1);
// mine returns int
minerals_carried = mine(&fields[i]);
printf("SCV%d is carrying %d minerals from field %d\n",num,minerals_carried,i);
if(pthread_mutex_unlock(&fields[i].mutex)!=0){
perror("pthread_mutex_unlock");
return NULL;
}
}else{
perror("pthread_mutex_trylock");
return NULL;
}
}
}
}
return NULL;
}
I create 5 pthreads and they all get created properly, but only the first one prints output like it's supposed to; the others don't seem to do anything. Any idea why that might be?
EDIT :
I was asked to show how I initialized number_of_fields and fields and this is it:
I first declare them as global
typedef struct Mineral_Field_t{
pthread_mutex_t mutex;
int minerals;
}Mineral_Field;
Mineral_Field* fields;
int number_of_fields = 2;
And then I have the following piece of code at the start of the main function:
if(argv[1] != NULL){
number_of_fields = atoi(argv[1]);
}
fields = malloc(number_of_fields*sizeof(Mineral_Field));

Dynamically allocate and initialize new object with 30% probability

I'm writing a program that will simulate a randomized race between runners who are climbing up a mountain where dwarf orcs (dorcs) are coming down the mountain to attack the runners. It begins with two runners named harold and timmy at the bottom of the mountain. The runners make their way up the mountain in randomized moves where they may make progress forward up the mountain, or they may slide back down the mountain. Dorcs are randomly generated, and they inflict damage on a runner if they collide. The simulation ends when one of the runners reaches the top of the mountain, or when both runners are dead.
I'm struggling with a part where I have to implement the actual race loop. Once the race is initialized, the race loop will iterate until the race is over. This happens when either a winner has been declared, or when all runners are dead.
Every iteration of the race loop will do the following:
with 30% probability, dynamically allocate a new dorc as an EntityType structure, and initialize it as follows:
(a) a dorc’s avatar is always “d”
(b) each dorc begins the race at the top of the mountain, which is at row 2
(c) with equal probability, the dorc may be placed either in the same column as timmy, or in the same column as the harold, or in the column exactly half-way between the two
(d) add the new dorc to the race’s array of dorcs
(e) using the pthread_create() function, create a thread for the new dorc, and save the thread pointer in the dorc’s entity structure; the function that each dorc thread will execute is the void* goDorc(void*) function that you will implement in a later step; the parameter to the goDorc() function will be the EntityType pointer that corresponds to that dorc
I guess I'm confused with the logic of how to approach this. I decided to make a function called isOver() to indicate if the race is over, and then a separate function called addDorc() to initialize the Dorc elements and do all the requirements above.
In isOver(), I attempt to add a dorc object to the dorcs array by doing addDorc(race); with every iteration of the race loop/if the race hasn't ended or no one died. But I keep getting the error:
control.c:82:3: error: too few arguments to function ‘addDorc’
addDorc(race);
The problem is I don't think I can manually declare all the parameters in addDorc() because some elements like the "path" argument are based on probability. As mentioned above, with equal probability, the dorc may be placed either in the same column as timmy, or in the same column as the harold, or in the column exactly half-way between the two. The issue is I don't know how to factor this random value when calling addDorc() and would appreciate some help. I also don't know if I'm doing the "with 30% probability, dynamically allocate a new dorc as an EntityType structure" correctly and would be grateful for some input on that as well.
defs.h
typedef struct {
pthread_t thr;
char avatar[MAX_STR];
int currPos;
int path;
} EntityType;
typedef struct {
EntityType ent;
char name[MAX_STR];
int health;
int dead;
} RunnerType;
typedef struct {
int numRunners;
RunnerType *runners[MAX_RUNNERS];
int numDorcs;
EntityType *dorcs[MAX_DORCS];
char winner[MAX_STR];
int statusRow;
sem_t mutex;
} RaceInfoType;
void launch();
int addDorc(RaceInfoType*, char*, int, int);
int isOver(RaceInfoType*);
void initRunners(RaceInfoType*);
int addRunner(RaceInfoType*, char*, char*, int, int, int, int);
int randm(int);
void *goRunner(void*);
void *goDorc(void*);
RaceInfoType *race;
control.c
void launch(){
race = malloc(sizeof(RaceInfoType));
race->numRunners = 0;
initRunners(race);
if (sem_init(&race->mutex, 0, 1) < 0) {
printf("semaphore initialization error\n");
exit(1);
}
strcpy(race->winner, " ");
srand((unsigned)time(NULL));
int i;
for(i = 0; i < race->numRunners; ++i){
pthread_create(&(race->runners[i]->ent.thr), NULL, goRunner, " ");
}
race->numDorcs = 0;
}
int addDorc(RaceInfoType* race, char *avatar, int path, int currPos){
if(race->numDorcs == MAX_DORCS){
printf("Error: Maximum dorcs already reached. \n");
return 0;
}
race->dorcs[race->numDorcs] = malloc(sizeof(EntityType));
int timmysColumn = race->dorcs[race->numDorcs]->currPos;
int haroldsColumn = race->dorcs[race->numDorcs]->currPos;
int halfwayColumn = (timmysColumn+haroldsColumn)/2;
int r = rand()%100;
pthread_t dorc;
if(r <= 30){
strcpy(race->dorcs[race->numDorcs]->avatar, "d");
race->dorcs[race->numDorcs]->currPos = 2;
if(r <= 33){
race->dorcs[race->numDorcs]->path = timmysColumn;
}else if(r <= 66){
race->dorcs[race->numDorcs]->path = haroldsColumn;
}else{
race->dorcs[race->numDorcs]->path = halfwayColumn;
}
pthread_create(&dorc, NULL, goDorc, " ");
}
race->numRunners++;
}
int isOver(RaceInfoType* race){
int i;
for(i = 0; i < race->numRunners; ++i){
if((race->winner != " ") || (race->runners[race->numRunners]->dead = 1)){
return 1;
}
addDorc(race);
return 0;
}
}
void initRunners(RaceInfoType* r){
addRunner(r, "Timmy", "T", 10, 35, 50, 0);
addRunner(r, "Harold", "H", 14, 35, 50, 0);
}
int addRunner(RaceInfoType* race, char *name, char *avatar, int path, int currPos, int health, int dead){
if(race->numRunners == MAX_RUNNERS){
printf("Error: Maximum runners already reached. \n");
return 0;
}
race->runners[race->numRunners] = malloc(sizeof(RunnerType));
strcpy(race->runners[race->numRunners]->name, name);
strcpy(race->runners[race->numRunners]->ent.avatar, avatar);
race->runners[race->numRunners]->ent.path = path;
race->runners[race->numRunners]->ent.currPos = currPos;
race->runners[race->numRunners]->health = health;
race->runners[race->numRunners]->dead = dead;
race->numRunners++;
return 1;
}
Caveat: Because there's so much missing [unwritten] code, this isn't a complete solution.
But, I notice at least two bugs: the isOver bugs in my top comments. And, incrementing race->numRunners in addDorc.
isOver also has the return 0; misplaced [inside the loop]. That should go as the last statement in the function. If you had compiled with -Wall [which you should always do], that should have been flagged by the compiler (e.g. control reaches end of non-void function)
From that, only one "dorc" would get created (for the first eligible runner). That may be what you want, but [AFAICT] you want to try to create more dorcs (one more for each valid runner).
Also, the bug the compiler flagged is because you're calling addDorc(race); but addDorc takes more arguments.
It's very difficult to follow the code when you're doing (e.g.) race->dorcs[race->numDorcs]->whatever everywhere.
Better to do (e.g.):
EntityType *ent = race->dorcs[race->numDorcs];
ent->whatever = ...;
Further, it's likely that your thread functions would like a pointer to their [respective] control structs (vs. just passing " ").
Anyway, I've refactored your code to incorporate these changes. I've only tried to fix the obvious/glaring bugs from simple code inspection, but I've not tried to recompile or address the correctness of your logic.
So, there's still more work to do, but the simplifications may help a bit.
void
launch(void)
{
race = malloc(sizeof(RaceInfoType));
race->numRunners = 0;
initRunners(race);
if (sem_init(&race->mutex,0,1) < 0) {
printf("semaphore initialization error\n");
exit(1);
}
strcpy(race->winner," ");
srand((unsigned)time(NULL));
int i;
for (i = 0; i < race->numRunners; ++i) {
RunnerType *run = race->runners[i];
EntityType *ent = &run->ent;
pthread_create(&ent->thr,NULL,goRunner,ent);
}
race->numDorcs = 0;
}
int
addDorc(RaceInfoType* race,char *avatar,int path,int currPos)
{
if (race->numDorcs == MAX_DORCS) {
printf("Error: Maximum dorcs already reached. \n");
return 0;
}
EntityType *ent = malloc(sizeof(*ent));
race->dorcs[race->numDorcs] = ent;
int timmysColumn = ent->currPos;
int haroldsColumn = ent->currPos;
int halfwayColumn = (timmysColumn + haroldsColumn) / 2;
int r = rand()%100;
#if 0
pthread_t dorc;
#endif
if (r <= 30) {
strcpy(ent->avatar,"d");
ent->currPos = 2;
if (r <= 33) {
ent->path = timmysColumn;
} else if (r <= 66) {
ent->path = haroldsColumn;
} else {
ent->path = halfwayColumn;
}
pthread_create(&ent->thr,NULL,goDorc,ent);
}
#if 0
race->numRunners++;
#else
race->numDorcs += 1;
#endif
return 1;
}
int
isOver(RaceInfoType* race)
{
int i;
for (i = 0; i < race->numRunners; ++i) {
#if 0
if ((race->winner != " ") ||
(race->runners[race->numRunners]->dead = 1))
return 1;
#else
RunnerType *run = race->runners[i];
if ((strcmp(race->winner," ") != 0) || (run->dead == 1))
return 1;
#endif
addDorc(race);
#if 0
return 0;
#endif
}
#if 1
return 0;
#endif
}
void
initRunners(RaceInfoType* r)
{
addRunner(r,"Timmy","T",10,35,50,0);
addRunner(r,"Harold","H",14,35,50,0);
}
int
addRunner(RaceInfoType* race,char *name,char *avatar,int path,int currPos,
int health,int dead)
{
if (race->numRunners == MAX_RUNNERS) {
printf("Error: Maximum runners already reached. \n");
return 0;
}
RunnerType *run = malloc(sizeof(*run));
race->runners[race->numRunners] = run;
strcpy(run->name,name);
EntityType *ent = &run->ent;
strcpy(ent->avatar,avatar);
ent->path = path;
ent->currPos = currPos;
run->health = health;
run->dead = dead;
race->numRunners++;
return 1;
}
UPDATE:
I noticed in addDorc(), you put pthread_t dorc; in an if statement. I don't quite understand what my if statement is actually supposed to be checking though.
I forgot to mention/explain. I wrapped your/old code and my/new code with preprocessor conditionals (e.g.):
#if 0
// old code
#else
// new code
#endif
After the cpp stage, the compiler will only see the // new code stuff. Doing this was an instructional tool to show [where possible] what code you had vs what I replaced it with. This was done to show the changes vs. just rewriting completely.
If we never defined NEVERWAS with a #define NEVERWAS, then the above block would be equivalent to:
#ifdef NEVERWAS
// old code ...
#else
// new code
#endif
Would it still be under the if(r <= 30) part like I did in my original code?
Yes, hopefully now, it is more clear. #if is a cpp directive to include/exclude code (as if you had edited that way). But, a "real" if is an actual executable statement that is evaluated at runtime [as it was before], so no change needed.
My other concern is it doesn't look like dorc is used anywhere in the function because you write pthread_create(&ent->thr,NULL,goDorc,ent); which seems to use ent instead?
That is correct. It is not used/defined and the value goes to ent->thr. As you had it, the pthread_t value set by pthread_create would be lost [when dorc goes out of scope]. So, unless it's saved somewhere semi-permanent (e.g. in ent->thr), there would be no way to do a pthread_join call later.
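One part of the logic that the refactoring above leaves untouched is the random draw. In the code as posted, the same r is reused for both decisions, so inside if (r <= 30) the nested test r <= 33 is always true and every dorc ends up in Timmy's column. Also, r <= 30 covers 31 of the 100 possible values. A minimal sketch of one way to separate the two draws follows. The helper names (rollPercent, chooseColumn, spawnDorc, goDorcStub), TOP_ROW and the value of MAX_STR are made up for illustration, and the runner columns are simply passed in as arguments (in the posted addRunner calls they appear to be stored in the path field, which is an assumption here).

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define MAX_STR 32   /* assumed value; the posted defs.h does not show it */
#define TOP_ROW 2    /* dorcs begin at the top of the mountain (row 2) */

typedef struct {
    pthread_t thr;
    char avatar[MAX_STR];
    int currPos;
    int path;
} EntityType;

/* Stand-in for the real goDorc(); it just hands back its own entity. */
void *goDorcStub(void *arg)
{
    EntityType *ent = arg;
    return ent;
}

/* Nonzero with roughly pct percent probability (pct in 0..100). */
int rollPercent(int pct)
{
    return (rand() % 100) < pct;
}

/* pick is 0, 1 or 2: Timmy's column, Harold's column, or halfway between. */
int chooseColumn(int timmyCol, int haroldCol, int pick)
{
    if (pick == 0)
        return timmyCol;
    if (pick == 1)
        return haroldCol;
    return (timmyCol + haroldCol) / 2;
}

/* Allocate, initialize and start one dorc. Returns NULL on failure. */
EntityType *spawnDorc(int timmyCol, int haroldCol)
{
    EntityType *ent = malloc(sizeof(*ent));
    if (ent == NULL)
        return NULL;
    strcpy(ent->avatar, "d");
    ent->currPos = TOP_ROW;
    /* A second, independent draw selects one of the three columns. */
    ent->path = chooseColumn(timmyCol, haroldCol, rand() % 3);
    /* The handle is kept in ent->thr so pthread_join() can be called later. */
    if (pthread_create(&ent->thr, NULL, goDorcStub, ent) != 0) {
        free(ent);
        return NULL;
    }
    return ent;
}
```

In the race loop this would be used roughly as: check that race->numDorcs is below MAX_DORCS, call rollPercent(30), and only when it succeeds call spawnDorc() and store the result in race->dorcs[race->numDorcs++]. That way nothing is allocated, and the counter is not advanced, on the iterations where no dorc appears.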

Priority Queue synchronization with pthreads

I'm working on a college assignment where we are to implement parallelized A* search for a 15 puzzle. For this part, we are to use only one priority queue (I suppose to see that the contention by multiple threads would limit speedup). A problem I am facing is properly synchronizing popping the next "candidate" from the priority queue.
I tried the following:
while(1) {
// The board I'm trying to pop.
Board current_board;
pthread_mutex_lock(&priority_queue_lock);
// If the heap is empty, wait till another thread adds new candidates.
if (pq->heap_size == 0)
{
printf("Waiting...\n");
pthread_mutex_unlock(&priority_queue_lock);
continue;
}
current_board = top(pq);
pthread_mutex_unlock(&priority_queue_lock);
// Generate the new boards from the current one and add to the heap...
}
I've tried different variants of the same idea, but for some reason there are occasions where the threads get stuck on "Waiting". The code works fine serially (or with two threads), so that leads me to believe this is the offending part of the code. I can post the entire thing if necessary. I feel like it's an issue with my understanding of the mutex lock though. Thanks in advance for help.
Edit:
I've added the full code for the parallel thread below:
// h and p are global pointers initialized in main()
void* parallelThread(void* arg)
{
int thread_id = (int)(long long)(arg);
while(1)
{
Board current_board;
pthread_mutex_lock(&priority_queue_lock);
current_board = top(p);
pthread_mutex_unlock(&priority_queue_lock);
// Move blank up.
if (current_board.blank_x > 0)
{
int newpos = current_board.blank_x - 1;
Board new_board = current_board;
new_board.board[current_board.blank_x][current_board.blank_y] = new_board.board[newpos][current_board.blank_y];
new_board.board[newpos][current_board.blank_y] = BLANK;
new_board.blank_x = newpos;
new_board.goodness = get_goodness(new_board.board);
new_board.turncount++;
if (check_solved(new_board))
{
printf("Solved in %d turns",new_board.turncount);
exit(0);
}
if (!exists(h,new_board))
{
insert(h,new_board);
push(p,new_board);
}
}
// Move blank down.
if (current_board.blank_x < 3)
{
int newpos = current_board.blank_x + 1;
Board new_board = current_board;
new_board.board[current_board.blank_x][current_board.blank_y] = new_board.board[newpos][current_board.blank_y];
new_board.board[newpos][current_board.blank_y] = BLANK;
new_board.blank_x = newpos;
new_board.goodness = get_goodness(new_board.board);
new_board.turncount++;
if (check_solved(new_board))
{
printf("Solved in %d turns",new_board.turncount);
exit(0);
}
if (!exists(h,new_board))
{
insert(h,new_board);
push(p,new_board);
}
}
// Move blank right.
if (current_board.blank_y < 3)
{
int newpos = current_board.blank_y + 1;
Board new_board = current_board;
new_board.board[current_board.blank_x][current_board.blank_y] = new_board.board[current_board.blank_x][newpos];
new_board.board[current_board.blank_x][newpos] = BLANK;
new_board.blank_y = newpos;
new_board.goodness = get_goodness(new_board.board);
new_board.turncount++;
if (check_solved(new_board))
{
printf("Solved in %d turns",new_board.turncount);
exit(0);
}
if (!exists(h,new_board))
{
insert(h,new_board);
push(p,new_board);
}
}
// Move blank left.
if (current_board.blank_y > 0)
{
int newpos = current_board.blank_y - 1;
Board new_board = current_board;
new_board.board[current_board.blank_x][current_board.blank_y] = new_board.board[current_board.blank_x][newpos];
new_board.board[current_board.blank_x][newpos] = BLANK;
new_board.blank_y = newpos;
new_board.goodness = get_goodness(new_board.board);
new_board.turncount++;
if (check_solved(new_board))
{
printf("Solved in %d turns",new_board.turncount);
exit(0);
}
if (!exists(h,new_board))
{
insert(h,new_board);
push(p,new_board);
}
}
}
return NULL;
}
Regarding the first snippet (the one under "I tried the following"): I don't see anything wrong with that code, assuming that top also removes the board from the queue. It's wasteful (if the queue is empty, it will spin, locking and unlocking the mutex), but not wrong.
Regarding the full code you added: this is useless without the code for exists, insert and push.
One general observation:
pthread_mutex_lock(&priority_queue_lock);
current_board = top(p);
pthread_mutex_unlock(&priority_queue_lock);
In the code above, your locking is "outside" of the top function. But here:
if (!exists(h,new_board))
{
insert(h,new_board);
push(p,new_board);
}
you either do no locking at all (in which case that's a bug), or you do locking "inside" exists, insert and push.
You should not mix "inside" and "outside" locking. Pick one or the other and stick with it.
If you in fact do not lock the queue inside exists, insert, etc. then you have a data race and are thinking of mutexes incorrectly: they protect invariants, and you can't check whether the queue is empty in parallel with another thread executing "remove top element" -- these operations require serialization, and thus must both be done under a lock.
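To illustrate the "pick one locking style" point, here is a minimal sketch of a queue that does all of its locking inside its own functions and uses a condition variable so that a thread finding the queue empty sleeps instead of spinning. Everything in it is hypothetical: the SafeQueue type, the sq_ function names and the fixed capacity are invented for the example, plain ints stand in for Board values, and a linear scan stands in for a real heap. The closed flag is one possible way to release waiting threads once the search is finished.

```c
#include <pthread.h>

#define PQ_CAP 64  /* hypothetical fixed capacity for the sketch */

typedef struct {
    int items[PQ_CAP];        /* stand-in for Board entries */
    int size;
    int closed;               /* set when no more items will arrive */
    pthread_mutex_t lock;     /* guards every field above */
    pthread_cond_t not_empty; /* signalled whenever an item is pushed */
} SafeQueue;

void sq_init(SafeQueue *q)
{
    q->size = 0;
    q->closed = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
}

/* Insert under the lock, then wake one waiter. Returns 0 if the queue is full. */
int sq_push(SafeQueue *q, int value)
{
    int ok = 0;
    pthread_mutex_lock(&q->lock);
    if (q->size < PQ_CAP) {
        q->items[q->size++] = value;
        ok = 1;
        pthread_cond_signal(&q->not_empty);
    }
    pthread_mutex_unlock(&q->lock);
    return ok;
}

/* Remove the smallest item. The emptiness check and the removal happen
   under the same lock. Returns 0 only if the queue is closed and empty. */
int sq_pop_min(SafeQueue *q, int *out)
{
    pthread_mutex_lock(&q->lock);
    while (q->size == 0 && !q->closed)
        pthread_cond_wait(&q->not_empty, &q->lock);
    if (q->size == 0) {
        pthread_mutex_unlock(&q->lock);
        return 0;
    }
    int best = 0;
    for (int i = 1; i < q->size; i++)
        if (q->items[i] < q->items[best])
            best = i;
    *out = q->items[best];
    q->items[best] = q->items[--q->size];  /* fill the gap with the last item */
    pthread_mutex_unlock(&q->lock);
    return 1;
}

/* Wake every waiter and make further pops on an empty queue return 0. */
void sq_close(SafeQueue *q)
{
    pthread_mutex_lock(&q->lock);
    q->closed = 1;
    pthread_cond_broadcast(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}
```

The same reasoning applies to the exists/insert pair on the hash table: the check and the insert need to happen under one lock acquisition, otherwise two threads can both see "not present" and both insert the same board.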

problems with old c code with new ncurses version (ldat struct)

I have a problem with some code using curses after upgrading to a new server and thus also new software like libs, headers and such.
The problem is the use of the ldat struct fields "firstchar", "lastchar" and "text", which in newer versions of curses.h are hidden in curses.priv.h and therefore no longer resolve.
I could really use some pointers as to how I might be able to resolve these issues.
The code below shows how the struct fields are used, but it is just a part of the complete code, as the whole program is several thousand lines long...
If there is need for additional code I can add this.
I might also add that I have not made this program myself, I'm just responsible for making it work with our new server...
int
update_window(changed, dw, sw, win_shared)
bool *changed;
WINDOW *dw; /* Destination window */
window_t *sw; /* Source window */
bool win_shared;
{
int y, x;
int yind, nx, first, last;
chtype *pd, *ps; /* pd = pointer destination, ps = pointer source */
int nscrolls; /* Number of scrolls to make */
if(! sw->changed) {
*changed = FALSE;
return(0);
}
/****************************************
* Determine number of times window is
* scrolled since last update
****************************************/
nscrolls = sw->scrollcount;
if(nscrolls >= sw->ny)
nscrolls = 0;
sw->scrollcount = 0L;
dw->_flags = _HASMOVED;
dw->_cury = sw->cury;
dw->_curx = sw->curx;
if(nscrolls > 0) {
/* Don't copy lines that are scrolled away */
for(y = nscrolls; y < sw->ny; y++) {
yind = GETYIND(y - nscrolls, sw->toprow, sw->ny);
if(sw->lastch[yind] != _NOCHANGE) {
first = dw->_line[y].firstchar = sw->firstch[yind];
last = dw->_line[y].lastchar = sw->lastch[yind];
ps = &sw->screen[yind][first];
pd = (chtype *)&dw->_line[y].text[first];
nx = last - first + 1;
LOOPDN(x, nx)
*pd++ = *ps++;
if(! win_shared) {
sw->firstch[yind] = sw->nx;
sw->lastch[yind] = _NOCHANGE;
}
}
}
} else {
LOOPUP(y, sw->ny) {
yind = GETYIND(y, sw->toprow, sw->ny);
if(sw->lastch[yind] != _NOCHANGE) {
first = dw->_line[y].firstchar = sw->firstch[yind];
last = dw->_line[y].lastchar = sw->lastch[yind];
ps = &sw->screen[yind][first];
pd = (chtype *)&dw->_line[y].text[first];
nx = last - first + 1;
LOOPDN(x, nx)
*pd++ = *ps++;
if(! win_shared) {
sw->firstch[yind] = sw->nx;
sw->lastch[yind] = _NOCHANGE;
}
}
}
if(! win_shared)
sw->changed = FALSE;
}
*changed = TRUE;
return(nscrolls);
}
I appreciate all the help I can get!
The members of struct ldat were made private in June 2001. Reading the function and its mention of scrolls hints that it is writing a portion of some window used to imitate scrolling (by writing a set of lines to the real window), and attempting to bypass the ncurses logic which checks for changed lines.
For a function like that, the only solution is to determine what the developer was trying to do, and write a new function which does this — using the library functions provided.

Worker threads and controller thread synchronization

I'm having trouble getting my worker threads and facilitator thread to synchronize properly. The problem I'm trying to solve is to find the largest prime number in 10 files using up to 10 threads. With 1 thread the program runs single-threaded; anything greater than that is multi-threaded.
The problem lies where the worker signals the facilitator that it has found a new prime. The facilitator can ignore it if the number is insignificant, or signal all threads to update their my_latest_lgprime if it is important. I keep getting stuck, both in my head and in the code.
The task must be completed using a facilitator and synchronization.
Here is what I have so far:
Worker:
void* worker(void* args){
w_pack* package = (w_pack*) args;
int i, num;
char text_num[30];
*(package->fac_prime) = 0;
for(i = 0; i<package->file_count; i++){
int count = 1000000; //integers per file
FILE* f = package->assigned_files[i];
while(count != 0){
fscanf(f, "%s", text_num);
num = atoi(text_num);
pthread_mutex_lock(&lock2);
while(update_ready != 0){
pthread_cond_wait(&waiter, &lock2);
package->my_latest_lgprime = largest_prime;//largest_prime is global
update_ready = 0;
}
pthread_mutex_unlock(&lock2);
if(num > (package->my_latest_lgprime+100)){
if(isPrime(num)==1){
*(package->fac_prime) = num;
package->my_latest_lgprime = num;
pthread_mutex_lock(&lock);
update_check = 1;
pthread_mutex_unlock(&lock);
pthread_cond_signal(&updater);
}
}
count--;
}
}
done++;
return (void*)package;
}
Facilitator:
void* facilitator(void* args){
int i, temp_large;
f_pack* package = (f_pack*) args;
while(done != package->threads){
pthread_mutex_lock(&lock);
while(update_check == 0)
pthread_cond_wait(&updater, &lock);
temp_large = isLargest(package->threads_largest, package->threads);
if(temp_large > largest_prime){
pthread_mutex_lock(&lock2);
update_ready = 1;
largest_prime = temp_large;
pthread_mutex_unlock(&lock2);
pthread_cond_broadcast(&waiter);
printf("New large prime: %d\n", largest_prime);
}
update_check = 0;
pthread_mutex_unlock(&lock);
}
}
Here is the worker package
typedef struct worker_package{
int my_latest_lgprime;
int file_count;
int* fac_prime;
FILE* assigned_files[5];
} w_pack;
Is there an easier way to do this using semaphores?
I can't really spot a problem with certainty, but just by briefly reading the code it seems the done variable is shared across threads yet it is accessed and modified without synchronization.
In any case, I can suggest a couple of ideas to improve on your solution.
You assign the list of files to each thread at start up. This isn't the most efficient way, since processing each file may take more or less time. It seems to me a better approach would be to have a single list of files, and then each thread picks up the next file in the list.
Do you really need a facilitator task for this? It seems to me each thread can keep track of its own largest prime, and every time it finds a new maximum it can go check a global maximum and update it if necessary. You could also keep a single maximum (w/o a per-thread maximum) but that will require you to lock every time you need to compare.
Here is pseudo-code of how I would write the worker threads:
while (true) {
lock(file_list_mutex)
if file list is empty {
unlock(file_list_mutex) // release the lock before leaving the loop
break // we are done!
}
file = get_next_file_in_list
unlock(file_list_mutex)
max = 0
foreach number in file {
if number is prime and number > max {
lock(max_number_mutex)
if (number > max_global_number) {
max_global_number = number
}
max = max_global_number
unlock(max_number_mutex)
}
}
}
Before you start the worker threads you need to initialize max_global_number = 0.
The above solution has the benefit that it doesn't abuse locks like in your case, so thread contention is minimized.
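For reference, the pseudo-code above could be turned into C along the following lines. This is only a sketch under stated assumptions: in-memory arrays (the hypothetical Chunk type) stand in for the input files so the example is self-contained, the names PrimeSearch, prime_worker and largest_prime_in are invented, is_prime is a simple trial-division placeholder rather than your isPrime, and the thread count is capped at 16 purely to keep the example short.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical work item: an array stands in for one input file. */
typedef struct {
    const int *numbers;
    size_t count;
} Chunk;

typedef struct {
    const Chunk *chunks;
    size_t chunk_count;
    size_t next_chunk;          /* protected by list_mutex */
    int max_global;             /* protected by max_mutex */
    pthread_mutex_t list_mutex;
    pthread_mutex_t max_mutex;
} PrimeSearch;

/* Trial division; adequate for a sketch, slow for large inputs. */
int is_prime(int n)
{
    if (n < 2)
        return 0;
    for (int d = 2; (long long)d * d <= n; d++)
        if (n % d == 0)
            return 0;
    return 1;
}

void *prime_worker(void *arg)
{
    PrimeSearch *s = arg;
    for (;;) {
        /* Claim the next unprocessed chunk, or stop when none remain. */
        pthread_mutex_lock(&s->list_mutex);
        if (s->next_chunk == s->chunk_count) {
            pthread_mutex_unlock(&s->list_mutex);
            break;
        }
        const Chunk *c = &s->chunks[s->next_chunk++];
        pthread_mutex_unlock(&s->list_mutex);

        int max = 0;  /* this thread's copy of the best prime seen so far */
        for (size_t i = 0; i < c->count; i++) {
            int n = c->numbers[i];
            /* Cheap comparison first: the primality test and the lock
               are only reached for candidates above the local maximum. */
            if (n > max && is_prime(n)) {
                pthread_mutex_lock(&s->max_mutex);
                if (n > s->max_global)
                    s->max_global = n;
                max = s->max_global;
                pthread_mutex_unlock(&s->max_mutex);
            }
        }
    }
    return NULL;
}

/* Runs up to 16 workers and returns the largest prime found, or 0. */
int largest_prime_in(const Chunk *chunks, size_t chunk_count, int nthreads)
{
    PrimeSearch s;
    pthread_t tids[16];
    int started = 0;

    s.chunks = chunks;
    s.chunk_count = chunk_count;
    s.next_chunk = 0;
    s.max_global = 0;
    pthread_mutex_init(&s.list_mutex, NULL);
    pthread_mutex_init(&s.max_mutex, NULL);

    if (nthreads > 16)
        nthreads = 16;
    for (int i = 0; i < nthreads; i++)
        if (pthread_create(&tids[started], NULL, prime_worker, &s) == 0)
            started++;
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    pthread_mutex_destroy(&s.list_mutex);
    pthread_mutex_destroy(&s.max_mutex);
    return s.max_global;
}
```

Because the result is read only after every worker has been joined, no done counter or facilitator thread is needed in this variant.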
