I'm currently programming in C with the MPI library, and the following snippet of code behaves very strangely.
This is not a minimal reproducible example, but I think there is an obvious problem with the code snippet, possibly even unrelated to MPI, which can be solved easily without reproduction. Do let me know if additional code is needed and I will happily provide it!
void monitor_proposals(int people_per_gender) {
int satisfied_women = 0;
/* declarations independent of the one above (omitted) */
while (satisfied_women < people_per_gender) {
MPI_Recv(&buf, sizeof(buf), MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
sender = status.MPI_SOURCE;
index = sender/2;
printf("here\n");
printf("ppg=%d, sw=%d\n", people_per_gender, satisfied_women);
fflush(stdout);
if (women_atleast_one_proposal[index] == 0) {
women_atleast_one_proposal[index] = sender+1; /* logical id */
satisfied_women += 1;
printf("Monitor: First proposal to woman (%d)\n", sender+1);
printf("ppg=%d, sw=%d\n", people_per_gender, satisfied_women);
}
if (satisfied_women == people_per_gender) {
MPI_Send(&DONE, sizeof(DONE), MPI_INT, sender, sender, MPI_COMM_WORLD);
printf("this\n");
} else {
MPI_Send(&NOT_DONE, sizeof(NOT_DONE), MPI_INT, sender, sender, MPI_COMM_WORLD);
printf("that\n");
}
}
printf("outside\n");
}
Output in terminal:
here
ppg=1, sw=16
Monitor: First proposal to woman (1)
ppg=1, sw=17
that
My expectation is of course that satisfied_women is initialized to 0, then incremented to 1 and therefore will break the loop once it iterates. I also flush the output stream to stdout which should show me if there is uncontrolled looping but there seems not to be.
Expected output:
here
ppg=1, sw=0
Monitor: First proposal to woman (1)
ppg=1, sw=1
this
outside
I'm using mpich: stable 4.0.1 via homebrew.
EDIT
I solved the increment problem (I changed the count argument to 1 in several places, so that part works now), and this is the code right now.
There are n men processes and n women processes. Men and women rate each other. Men propose to women by sending, and women wait for proposals. If a woman receives a proposal from a man rated better than the one she has currently accepted, the previous man has to propose to another woman.
There is a monitoring process that is contacted once every iteration of the women's while-loop, and it tells the sending woman whether she should exit or not. When a woman exits the while-loop, she notifies the man she accepted most recently.
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
typedef enum gender {
MAN,
WOMAN
} gender_t;
/* men use array of women and fill in each womens rating_t, and vice versa */
typedef struct rating {
gender_t gender;
int id;
int rating;
} rating_t;
/*******************************************************************************************
* HELPER FUNCTIONS ************************************************************************
*******************************************************************************************/
/* custom compare for qsort */
int compare(const void *r1, const void *r2) {
return ((rating_t*)r1)->rating > ((rating_t*)r2)->rating ? -1 : 1;
}
/* random shuffling of ratings */
void shuffle_ratings(rating_t *profiles, int size) {
int random_index, temp;
for (int max_index = size-1; max_index > 0; max_index--) {
random_index = rand() % (max_index+1);
/* swap values at indexes */
temp = profiles[max_index].rating;
profiles[max_index].rating = profiles[random_index].rating;
profiles[random_index].rating = temp;
}
}
/*******************************************************************************************
* PROCESSES *******************************************************************************
*******************************************************************************************/
/* keeps track of women who are with a man, eventually notifies them it's done */
void monitor_proposals(int people_per_gender) {
MPI_Status status;
const int DONE = 1;
const int NOT_DONE = 0;
int *women_atleast_one_proposal = (int*)calloc(people_per_gender, sizeof(int));
int satisfied_women = 0;
int sender, index;
int buf; /* not useful */
while (satisfied_women < people_per_gender) {
MPI_Recv(&buf, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
sender = status.MPI_SOURCE;
index = sender/2;
fflush(stdout);
if (women_atleast_one_proposal[index] == 0) {
women_atleast_one_proposal[index] = sender+1; /* logical id */
satisfied_women++;
printf("Monitor: First proposal to woman (%d)\n", sender+1);
printf("ppg=%d, sw=%d\n", people_per_gender, satisfied_women);
}
if (satisfied_women == people_per_gender) {
MPI_Send(&DONE, 1, MPI_INT, sender, sender, MPI_COMM_WORLD);
printf("this\n");
} else {
MPI_Send(&NOT_DONE, 1, MPI_INT, sender, sender, MPI_COMM_WORLD);
printf("that\n");
}
}
}
/* function for men, highest rating is proposed to first */
void propose(int id, rating_t *my_ratings) {
MPI_Status rec_status;
int proposals = 0;
int accepted = 0;
int propose_dest, propose_rating;
while (!accepted) {
propose_dest = my_ratings[proposals].id - 1;
propose_rating = my_ratings[proposals].rating;
printf("Man (%d): Proposed to woman (%d) who's rated %d\n", id, propose_dest+1, propose_rating);
fflush(stdout);
MPI_Send(&propose_rating, 1, MPI_INT, propose_dest, propose_dest, MPI_COMM_WORLD);
proposals++;
MPI_Recv(&accepted, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &rec_status);
}
printf("man %d accepted\n", id);
}
/* function for women, accepts first proposal but can replace */
void receive_proposals(int id, rating_t *my_ratings, int monitor_rank) {
MPI_Status status;
const int ACCEPT = 1;
const int REJECT = 0;
int DONT_CARE = 0;
int monitor_response;
int from_man;
int received_man_rank = -1;
int received_man_rating = -1;
int best_man_rank = -1;
int best_man_rating = -1;
while (1) {
MPI_Recv(&from_man, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
received_man_rank = status.MPI_SOURCE;
received_man_rating = my_ratings[received_man_rank/2].rating;
if (best_man_rank == -1) { /* first proposal received */
best_man_rank = received_man_rank;
best_man_rating = received_man_rating;
printf("Woman (%d): Accepted man (%d) #%d#\n", id, best_man_rank+1, best_man_rating);
} else if (received_man_rating > best_man_rating) { /* proposal is better rated than current accepted, notify replaced */
MPI_Send(&REJECT, 1, MPI_INT, best_man_rank, best_man_rank, MPI_COMM_WORLD);
printf("Woman (%d): Replaced man (%d) #%d# for man (%d) #%d#\n", id, best_man_rank+1, \
best_man_rating, received_man_rank+1, received_man_rating);
best_man_rank = received_man_rank;
best_man_rating = received_man_rating;
} else { /* notify denied man */
MPI_Send(&REJECT, 1, MPI_INT, received_man_rank, received_man_rank, MPI_COMM_WORLD);
printf("Woman (%d): Rejected proposing man (%d) #%d# due to best man (%d) #%d#\n", id, received_man_rank+1, \
received_man_rating, best_man_rank+1, best_man_rating);
}
MPI_Send(&DONT_CARE, 1, MPI_INT, monitor_rank, monitor_rank, MPI_COMM_WORLD);
MPI_Recv(&monitor_response, 1, MPI_INT, monitor_rank, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
if (monitor_response) {
printf("woman here\n");
break;
}
}
/* send ok to accepted man */
MPI_Send(&ACCEPT, 1, MPI_INT, best_man_rank, best_man_rank, MPI_COMM_WORLD);
printf("Woman (%d) + Man (%d) MARRIED\n", id, best_man_rank+1);
}
int main(int argc, char *argv[]) {
int pool_size, people_per_gender;
int rank, id, monitor_rank;
rating_t *my_ratings;
gender_t gender;
MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &pool_size);
if (pool_size % 2 != 1) {
if (rank == 0)
printf("Requirement: men == women and 1 extra process!\n");
MPI_Finalize();
exit(1);
}
people_per_gender = pool_size / 2; /* number of men/women */
id = rank + 1; /* logical id */
monitor_rank = pool_size - 1; /* collecting of proposals */
if (rank != monitor_rank) {
gender = (id % 2 == 0 ? MAN : WOMAN); /* odd id - woman, even id - man */
my_ratings = (rating_t*)malloc(people_per_gender * sizeof(rating_t)); /* rate half of pool, i.e. other gender */
/* create "profiles" of other gender */
for (int i = 0; i < people_per_gender; i++) {
my_ratings[i].gender = (gender == MAN ? WOMAN : MAN);
my_ratings[i].id = ( gender == MAN ? (2*i+1) : (2*i+2) );
my_ratings[i].rating = i+1;
}
/* randomize ratings of other gender */
srand(time(NULL) + id);
shuffle_ratings(my_ratings, people_per_gender);
qsort(my_ratings, people_per_gender, sizeof(rating_t), compare);
if (gender == WOMAN) printf("W(%d) ratings: ", id);
else if (gender == MAN) printf("M(%d) ratings: ", id);
for (int i = 0; i < people_per_gender; i++)
printf("| {id:%d, %d} | ", my_ratings[i].id, my_ratings[i].rating);
printf("\n");
}
MPI_Barrier(MPI_COMM_WORLD);
if (rank == monitor_rank) printf("\n");
fflush(stdout);
MPI_Barrier(MPI_COMM_WORLD);
/* call function based on process type */
if (rank == monitor_rank) {
monitor_proposals(people_per_gender);
} else {
if (gender == WOMAN)
receive_proposals(id, my_ratings, monitor_rank);
else if (gender == MAN)
propose(id, my_ratings);
}
MPI_Barrier(MPI_COMM_WORLD);
printf("ID (%d): Done\n", id);
MPI_Finalize();
return 0;
}
You're not giving us enough code. How is the buffer defined in:
MPI_Recv(&buf, sizeof(buf), MPI_INT,
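If it is int buf, then sizeof gives 4 on typical platforms, so you tell MPI the buffer can hold 4 ints when it actually holds 1; the count should be 1.
If it is int buf[20], then sizeof gives the size in bytes, not the number of ints.
If it is buf = (int*)malloc(whatever), then sizeof gives the size of the pointer (8 bytes on a typical 64-bit system), regardless of how much was allocated.
In other words, it's definitely wrong, but precisely how, we cannot tell.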
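As a minimal sketch of the point above: the count argument of MPI_Recv and MPI_Send is a number of elements of the given datatype, not a number of bytes. The macro ARRAY_LEN and the helper count_from_bytes below are hypothetical names introduced only for illustration; they are not part of MPI.

```c
#include <assert.h>
#include <stddef.h>

/* Number of elements in a true array. This does NOT work on a pointer,
 * where sizeof yields the pointer size instead of the allocation size. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical helper: convert a size in bytes into an element count,
 * which is what an MPI count argument expects. */
int count_from_bytes(size_t nbytes, size_t elem_size)
{
    assert(elem_size > 0 && nbytes % elem_size == 0);
    return (int)(nbytes / elem_size);
}
```

With this in mind, a single int would be received with a count of 1, as in MPI_Recv(&buf, 1, MPI_INT, ...), and an int array with ARRAY_LEN(buf), as in MPI_Recv(buf, ARRAY_LEN(buf), MPI_INT, ...). For a malloc'ed buffer, the element count has to be tracked separately.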
Related
So, somehow MPI_Probe receives the same message even though it is only sent once.
I execute the program with only 2 processes, of which process 1 sends two messages: one to retrieve a task and another to send the result. I send the messages with different tags to differentiate the two.
So what's supposed to happen is the following:
Process 0 waits for a task request
Process 1 sends a task request indicated by tag=0
Process 0 sends the task
Process 1 does the task and sends the results back to process 0 indicated by tag=1.
-- This is the part where the first problem occurs: Process 0 still receives tag=0, as shown by the printf
Process 0 receives the results and enters the else-if block where tag==1 -- It does not.
Process 1 breaks the while loop - DEBUGGING PURPOSE
Process 0 is supposed to be blocked by MPI_Probe but is not; instead it continues to run and always shows that the received tag is still 0
The code is still messy and inefficient. I just want a minimal working program to build upon and to optimize. But still any tip is appreciated!
The code:
if(rank == 0) {
struct stack* idle_stack = init_stack(env_size-1);
struct sudoku_stack* sudoku_stack_ptr = init_sudoku_stack(256);
push_sudoku(sudoku_stack_ptr, sudoku);
int it=0;
while(1) {
printf("ITERATION %d\n", it);
int idle_stack_size = stack_size(idle_stack);
int _sudoku_stack_empty = sudoku_stack_empty(sudoku_stack_ptr);
MPI_Probe(MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
printf("TAG: %d\n", status.MPI_TAG);
// So this part is supposed to be entered once
if(status.MPI_TAG == 0) {
if(!sudoku_stack_empty(sudoku_stack_ptr)) {
printf("SENDING TASK\n");
int *next_sudoku = pop_sudoku(sudoku_stack_ptr);
MPI_Send(next_sudoku, v_size, MPI_INT, status.MPI_SOURCE, 0, MPI_COMM_WORLD);
} else {
// But since the Tag stays 0, it is called multiple times until
// a stack overflow occurs
printf("PUSHING TO IDLE STACK\n");
push(idle_stack, status.MPI_SOURCE);
}
} else if(status.MPI_TAG == 1) {
// This part should actually be entered by the second received message
printf("RECEIVING SOLUTION\n");
int count;
MPI_Get_count(&status, MPI_INT, &count);
int* recv_sudokus = (int*)malloc(count * sizeof(int));
MPI_Recv(recv_sudokus, count, MPI_INT, status.MPI_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
for(int i = 0; i < count; i+=v_size) {
printf("%d ", recv_sudokus[i]);
if((i+1) % m_size == 0){
printf("\n");
}
}
// DEBUG - EXIT PROGRAM
teardown(sudoku_stack_ptr, idle_stack);
break;
push_sudoku(sudoku_stack_ptr, recv_sudokus);
} else if(status.MPI_TAG == 2) {
//int* solved_sudoku = (int*)malloc(v_size * sizeof(int));
//MPI_Recv(solved_sudoku, v_size, MPI_INT, status.MPI_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
//TODO
}
it++;
}
} else {
int* sudoku = (int*)malloc(sizeof(int)*v_size);
int* possible_sudokus = (int*)malloc(sizeof(int)*m_size*v_size);
while(1) {
// Send task request
printf("REQUESTING TASK\n");
int i = 0;
MPI_Send(&i, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
// Wait for and receive task
printf("RECEIVING TASK\n");
MPI_Recv(sudoku, v_size, MPI_INT, 0, MPI_ANY_TAG, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
printf("CALCULATING\n");
int index = 0;
for(int i = 1; i <= m_size; i++) {
int is_safe_res = is_safe_first_empty_cell(sudoku, i);
if(is_safe_res) {
int* sudoku_cp = (int*)malloc(sizeof(int)*v_size);
memcpy(sudoku_cp, sudoku, sizeof(int)*v_size);
insert_to_first_empty_cell(sudoku_cp, i);
memcpy(&possible_sudokus[index], sudoku_cp, sizeof(int)*v_size);
index+=v_size;
free(sudoku_cp);
}
}
printf("SENDING\n");
MPI_Send(possible_sudokus, index*v_size, MPI_INT, 0, 1, MPI_COMM_WORLD);
break;
}
}
I am using an example code from an MPI book [will give the name shortly].
What it does is the following:
a) It creates two communicators: world = MPI_COMM_WORLD, containing all the processes, and workers, which excludes the random number generator server (the last-rank process).
b) So, the server generates random numbers and serves them to the workers on request.
c) The workers separately count the number of samples falling inside and outside a unit circle inside a unit square.
d) Once a sufficient level of accuracy is reached, the inside and outside counts are Allreduced to compute the value of PI from their ratio.
The code compiles fine. However, when running it with the following command (actually with any value of n):
>mpiexec -n 2 apple.exe 0.0001
I get the following errors:
Fatal error in MPI_Allreduce: Invalid communicator, error stack:
MPI_Allreduce(855): MPI_Allreduce(sbuf=000000000022EDCC, rbuf=000000000022EDDC,
count=1, MPI_INT, MPI_SUM, MPI_COMM_NULL) failed
MPI_Allreduce(780): Null communicator
pi = 0.00000000000000000000
job aborted:
rank: node: exit code[: error message]
0: PC: 1: process 0 exited without calling finalize
1: PC: 123
Edit: (Removed: But when I remove either one of the two MPI_Allreduce() calls, it runs without any runtime errors, albeit with the wrong answer.)
Code:
#include <mpi.h>
#include <mpe.h>
#include <stdlib.h>
#define CHUNKSIZE 1000
/* message tags */
#define REQUEST 1
#define REPLY 2
int main(int argc, char *argv[])
{
int iter;
int in, out, i, iters, max, ix, iy, ranks [1], done, temp;
double x, y, Pi, error, epsilon;
int numprocs, myid, server, totalin, totalout, workerid;
int rands[CHUNKSIZE], request;
MPI_Comm world, workers;
MPI_Group world_group, worker_group;
MPI_Status status;
MPI_Init(&argc,&argv);
world = MPI_COMM_WORLD;
MPI_Comm_size(world,&numprocs);
MPI_Comm_rank(world,&myid);
server = numprocs-1; /* last proc is server */
if(myid==0) sscanf(argv[1], "%lf", &epsilon);
MPI_Bcast(&epsilon, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);
MPI_Comm_group(world, &world_group);
ranks[0] = server;
MPI_Group_excl(world_group, 1, ranks, &worker_group);
MPI_Comm_create(world, worker_group, &workers);
MPI_Group_free(&worker_group);
if(myid==server) /* I am the rand server */
{
srand(time(NULL));
do
{
MPI_Recv(&request, 1, MPI_INT, MPI_ANY_SOURCE, REQUEST, world, &status);
if(request)
{
for(i=0; i<CHUNKSIZE;)
{
rands[i] = rand();
if(rands[i]<=INT_MAX) ++i;
}
MPI_Send(rands, CHUNKSIZE, MPI_INT,status.MPI_SOURCE, REPLY, world);
}
}
while(request>0);
}
else /* I am a worker process */
{
request = 1;
done = in = out = 0;
max = INT_MAX; /* max int, for normalization */
MPI_Send(&request, 1, MPI_INT, server, REQUEST, world);
MPI_Comm_rank(workers, &workerid);
iter = 0;
while(!done)
{
++iter;
request = 1;
MPI_Recv(rands, CHUNKSIZE, MPI_INT, server, REPLY, world, &status);
for(i=0; i<CHUNKSIZE;)
{
x = (((double) rands[i++])/max)*2-1;
y = (((double) rands[i++])/max)*2-1;
if(x*x+y*y<1.0) ++in;
else ++out;
}
/* ** see error here ** */
MPI_Allreduce(&in, &totalin, 1, MPI_INT, MPI_SUM, workers);
MPI_Allreduce(&out, &totalout, 1, MPI_INT, MPI_SUM, workers);
/* only one of the above two MPI_Allreduce() functions working */
Pi = (4.0*totalin)/(totalin+totalout);
error = fabs( Pi-3.141592653589793238462643);
done = (error<epsilon||(totalin+totalout)>1000000);
request = (done)?0:1;
if(myid==0)
{
printf("\rpi = %23.20f", Pi);
MPI_Send(&request, 1, MPI_INT, server, REQUEST, world);
}
else
{
if(request)
MPI_Send(&request, 1, MPI_INT, server, REQUEST, world);
}
MPI_Comm_free(&workers);
}
}
if(myid==0)
{
printf("\npoints: %d\nin: %d, out: %d, <ret> to exit\n", totalin+totalout, totalin, totalout);
getchar();
}
MPI_Finalize();
}
What is the error here? Am I missing something? Any help or pointer will be highly appreciated.
You are freeing the workers communicator before you are done using it. Move the MPI_Comm_free(&workers) call after the while(!done) { ... } loop.
First, I should mention that I am French and my English is not very good.
I am working on an MPI application and have run into some problems; I hope somebody can help me.
As stated in the title of my post, I am trying to use a thread to listen for when I have to kill my application and then call the MPI_Finalize function.
However, my application does not finish correctly.
More precisely, I obtain the following message:
[XPS-2720:27441] * Process received signal *
[XPS-2720:27441] Signal: Segmentation fault (11)
[XPS-2720:27441] Signal code: Address not mapped (1)
[XPS-2720:27441] Failing at address: 0x7f14077a3b6d
[XPS-2720:27440] * Process received signal *
[XPS-2720:27440] Signal: Segmentation fault (11)
[XPS-2720:27440] Signal code: Address not mapped (1)
[XPS-2720:27440] Failing at address: 0x7fb11d07bb6d
mpirun noticed that process rank 1 with PID 27440 on node lagniez-XPS-2720 exited on signal 11 (Segmentation fault).
My slave code is:
#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <unistd.h>
#include <sys/types.h>
#include <pthread.h>
#include <cassert>
#define send_data_tag 1664
#define send_kill_tag 666
void *finilizeMPICom(void *intercomm)
{
printf("the finilizeMPICom was called\n");
MPI_Comm parentcomm = * ((MPI_Comm *) intercomm);
MPI_Status status;
int res;
// sleep(10);
MPI_Recv(&res, 1, MPI_INT, 0, send_kill_tag, parentcomm, &status);
int rank;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
printf("we receive something %d -- %d\n", rank, res);
MPI_Finalize();
exit(0);
}// finilizeMPICom
int main( int argc, char *argv[])
{
int numtasks, rank, len, rc;
char hostname[MPI_MAX_PROCESSOR_NAME];
int provided, claimed;
rc = MPI_Init_thread(0, 0, MPI_THREAD_MULTIPLE, &provided);
MPI_Query_thread( &claimed );
if (rc != MPI_SUCCESS || provided != 3)
{
printf ("Error starting MPI program. Terminating.\n");
MPI_Abort(MPI_COMM_WORLD, rc);
}
MPI_Comm_rank(MPI_COMM_WORLD,&rank);
MPI_Comm parentcomm;
MPI_Comm_get_parent(&parentcomm);
/* create a second thread to listen when we have to kill the program */
pthread_t properlyKill;
if(pthread_create(&properlyKill, NULL, finilizeMPICom, (void *) &parentcomm))
{
fprintf(stderr, "Error creating thread\n");
return 0;
}
assert(parentcomm != MPI_COMM_NULL);
MPI_Status status;
int root_process, ierr, num_rows_to_receive;
int mode;
MPI_Recv( &mode, 1, MPI_INT, 0, send_data_tag, parentcomm, &status);
printf("c The solver works in the mode %d\n", mode);
printf("I sent a message %d\n", rank);
// if(rank != 1) sleep(100);
int res = 1;
MPI_Send(&res, 1, MPI_INT, 0, send_data_tag, parentcomm);
printf("we want to listen for somethiing %d\n", rank);
int rescc = 1;
MPI_Recv(&rescc, 1, MPI_INT, 0, send_data_tag, parentcomm, &status);
printf("I received the message %d %d\n", rescc, rank);
if(rescc == 1000)
{
printf("~~~~~~~~>>> I print the solution %d\n", rank);
int res3 = 1001;
MPI_Send(&res3, 1, MPI_INT, 0, send_data_tag, parentcomm);
}
else printf("I do not understand %d\n", rank);
printf("I wait the thread to kill the programm %d\n", rank);
pthread_join(properlyKill, (void**)&(res));
return 0;
}
For the master I have:
int main(int argc, char **argv)
{
Parser *p = new Parser("slave.xml");
MPI_Init(&argc, &argv);
if(p->method == "concurrent")
{
ConcurrentManager cc(p->instance, p->solvers);
cc.run();
}
else
{
cerr << "c The only available methods are: concurrent, eps (Embarrassingly Parallel Search) or tree" << endl;
exit(1);
}
delete(p);
MPI_Finalize();
exit(0);
}// main
/**
Create a concurrent manager (means init the data structures to run
the solvers).
#param[in] _instance, the benchmark path
#param[in] _solvers, the set of solvers that will be ran
*/
ConcurrentManager::ConcurrentManager(string _instance, vector<Solver> &_solvers) :
instance(_instance), solvers(_solvers)
{
cout << "c\nc Concurrent manager called" << endl;
nbSolvers = _solvers.size();
np = new int[nbSolvers];
cmds = new char*[nbSolvers];
arrayOfArgs = new char **[nbSolvers];
infos = new MPI_Info[nbSolvers];
for(int i = 0 ; i<nbSolvers ; i++)
{
np[i] = solvers[i].npernode;
cmds[i] = new char[(solvers[i].executablePath).size() + 1];
strcpy(cmds[i], (solvers[i].executablePath).c_str());
arrayOfArgs[i] = new char *[(solvers[i].options).size() + 1];
for(unsigned int j = 0 ; j<(solvers[i].options).size() ; j++)
{
arrayOfArgs[i][j] = new char[(solvers[i].options[j]).size() + 1];
strcpy(arrayOfArgs[i][j], (solvers[i].options[j]).c_str());
}
arrayOfArgs[i][(solvers[i].options).size()] = NULL;
MPI_Info_create(&infos[i]);
char hostname[solvers[i].hostname.size()];
strcpy(hostname, solvers[i].hostname.c_str());
MPI_Info_set(infos[i], "host", hostname);
}
sizeComm = 0;
}// constructor
/**
Wait that at least one process finish and return the code
SOLUTION_FOUND.
#param[in] intercomm, the communicator
*/
void ConcurrentManager::waitForSolution(MPI_Comm &intercomm)
{
MPI_Status arrayStatus[sizeComm], status;
MPI_Request request[sizeComm];
int val[sizeComm], flag;
for(int i = 0 ; i<sizeComm ; i++) MPI_Irecv(&val[i], 1, MPI_INT, i, TAG_MSG, intercomm, &request[i]);
bool solutionFound = false;
while(!solutionFound)
{
for(int i = 0 ; i<sizeComm ; i++)
{
MPI_Test(&request[i], &flag, &arrayStatus[i]);
if(flag)
{
printf("---------------------> %d reveived %d\n", i , val[i]);
if(val[i] == SOLUTION_FOUND)
{
int msg = PRINT_SOLUTION;
MPI_Send(&msg, 1, MPI_INT, i, TAG_MSG, intercomm); // ask to print the solution
int msgJobFinished;
MPI_Recv(&msgJobFinished, 1, MPI_INT, i, TAG_MSG, intercomm, &status); // wait the answer
assert(msgJobFinished == JOB_FINISHED);
cout << "I am going to kill everybody" << endl;
int msgKill[sizeComm];
for(int j = 0 ; j<sizeComm ; j++)
{
msgKill[i] = STOP_AT_ONCE;
MPI_Send(&msgKill[i], 1, MPI_INT, j, TAG_KILL, intercomm);
}
solutionFound = true;
break;
} else
{
printf("restart the communication for %d\n", i);
MPI_Irecv(&val[i], 1, MPI_INT, i, TAG_MSG, intercomm, &request[i]);
}
}
}
}
}// waitForSolution
/**
Run the solver.
*/
void ConcurrentManager::run()
{
MPI_Comm intercomm;
int errcodes[solvers.size()];
MPI_Comm_spawn_multiple(nbSolvers, cmds, arrayOfArgs, np, infos, 0, MPI_COMM_WORLD, &intercomm, errcodes);
MPI_Comm_remote_size(intercomm, &sizeComm);
cout << "c Solvers are now running: " << sizeComm << endl;
int msg = CONCU_MODE;
for(int i = 0 ; i<sizeComm ; i++) MPI_Send(&msg, 1, MPI_INT, i, TAG_MSG, intercomm); // init the working mode
waitForSolution(intercomm);
}// run
I know that I have posted a lot of code :(
But I do not know where the problem is.
Please, help me :)
Best regards.
The MPI documentation for how MPI interacts with threads demands that the call to MPI_Finalize() be performed by the main thread -- that is, the same one that initialized MPI. In your case, that happens also to be your process's initial thread.
In order to satisfy MPI's requirements, you could reorganize your application so that the initial thread is the one that waits for a kill signal and then shuts down MPI. The other work it currently does would then need to be moved to a different thread.
I am working on a project that converts point-to-point communication to collective communication.
Essentially, what I would like to do is use MPI_Scatterv instead of MPI_Send and MPI_Recv. What I am having trouble determining is the correct arguments for Scatterv.
Here is the function that I am working in:
void read_block_vector (
char *s, /* IN - File name */
void **v, /* OUT - Subvector */
MPI_Datatype dtype, /* IN - Element type */
int *n, /* OUT - Vector length */
MPI_Comm comm) /* IN - Communicator */
{
int datum_size; /* Bytes per element */
int i;
FILE *infileptr; /* Input file pointer */
int local_els; /* Elements on this proc */
MPI_Status status; /* Result of receive */
int id; /* Process rank */
int p; /* Number of processes */
int x; /* Result of read */
datum_size = get_size (dtype);
MPI_Comm_size(comm, &p);
MPI_Comm_rank(comm, &id);
/* Process p-1 opens file, determines number of vector
elements, and broadcasts this value to the other
processes. */
if (id == (p-1)) {
infileptr = fopen (s, "r");
if (infileptr == NULL) *n = 0;
else fread (n, sizeof(int), 1, infileptr);
}
MPI_Bcast (n, 1, MPI_INT, p-1, comm);
if (! *n) {
if (!id) {
printf ("Input file '%s' cannot be opened\n", s);
fflush (stdout);
}
}
/* Block mapping of vector elements to processes */
local_els = BLOCK_SIZE(id,p,*n);
/* Dynamically allocate vector. */
*v = my_malloc (id, local_els * datum_size);
if (id == (p-1)) {
for (i = 0; i < p-1; i++) {
x = fread (*v, datum_size, BLOCK_SIZE(i,p,*n),
infileptr);
MPI_Send (*v, BLOCK_SIZE(i,p,*n), dtype, i, DATA_MSG,
comm);
}
x = fread (*v, datum_size, BLOCK_SIZE(id,p,*n),
infileptr);
fclose (infileptr);
} else {
MPI_Recv (*v, BLOCK_SIZE(id,p,*n), dtype, p-1, DATA_MSG,
comm, &status);
}
// My Attempt at making this collective communication:
if(id == (p-1))
{
x = fread(*v,datum_size,*n,infileptr);
for(i = 0; i < p; i++)
{
size[i] = BLOCK_SIZE(i,p,*n);
}
//x = fread(*v,datum_size,BLOCK_SIZE(id, p, *n),infileptr);
fclose(infileptr);
}
MPI_Scatterv(v,send_count,send_disp, dtype, storage, size[id], dtype, p-1, comm);
}
Any help would be appreciated.
Thank you
It's easier for people to answer your question if you post a small, self-contained, reproducible example.
For the Scatterv, you need to provide the list of counts to send to each process, which appears to be your size[] array, and the displacements within the data to send out. The mechanics of Scatter vs Scatterv are described in some detail in this answer. Trying to infer what all your variables and un-supplied functions/macros do, the example below scatters a file out to the processes.
But also note that if you're doing this, it's not much harder to actually use MPI-IO to coordinate the file access directly, avoiding the need to have one process read all of the data in the first place. Code for that is also supplied.
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
int main(int argc, char **argv) {
int id, p;
int *block_size;
int datasize = 0;
MPI_Init(&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD, &p);
MPI_Comm_rank(MPI_COMM_WORLD, &id);
block_size = malloc(p * sizeof(int));
for (int i=0; i<p; i++) {
block_size[i] = i + 1;
datasize += block_size[i];
}
/* create file for reading */
if (id == p-1) {
char *data = malloc(datasize * sizeof(char));
for (int i=0; i<datasize; i++)
data[i] = 'a' + i;
FILE *f = fopen("data.dat","wb");
fwrite(data, sizeof(char), datasize, f);
fclose(f);
printf("Initial data: ");
for (int i=0; i<datasize; i++)
printf("%c", data[i]);
printf("\n");
free(data);
}
if (id == 0) printf("---Using MPI-Scatterv---\n");
/* using scatterv */
int local_els = block_size[id];
char *v = malloc ((local_els + 1) * sizeof(char));
char *all = NULL; /* send buffer; only significant on the root (rank p-1) */
int *counts, *disps;
counts = malloc(p * sizeof(int));
disps = malloc(p * sizeof(int));
/* counts.. */
for(int i = 0; i < p; i++)
counts[i] = block_size[i];
/* and displacements (where the data starts within the send buffer) */
disps[0] = 0;
for(int i = 1; i < p; i++)
disps[i] = disps[i-1] + counts[i-1];
if(id == (p-1))
{
all = malloc(datasize*sizeof(char));
FILE *f = fopen("data.dat","rb");
int x = fread(all,sizeof(char),datasize,f);
fclose(f);
}
MPI_Scatterv(all, counts, disps, MPI_CHAR, v, local_els, MPI_CHAR, p-1, MPI_COMM_WORLD);
if (id == (p-1)) {
free(all);
}
v[local_els] = '\0';
printf("[%d]: %s\n", id, v);
/* using MPI I/O */
fflush(stdout);
MPI_Barrier(MPI_COMM_WORLD); /* only for syncing output to screen */
if (id == 0) printf("---Using MPI-IO---\n");
for (int i=0; i<local_els; i++)
v[i] = 'X';
/* create the file layout - the subarrays within the 1d array of data */
MPI_Datatype myview;
MPI_Type_create_subarray(1, &datasize, &local_els, &(disps[id]),
MPI_ORDER_C, MPI_CHAR, &myview);
MPI_Type_commit(&myview);
MPI_File mpif;
MPI_Status status;
MPI_File_open(MPI_COMM_WORLD, "data.dat", MPI_MODE_RDONLY, MPI_INFO_NULL, &mpif);
MPI_File_set_view(mpif, (MPI_Offset)0, MPI_CHAR, myview, "native", MPI_INFO_NULL);
MPI_File_read_all(mpif, v, local_els, MPI_CHAR, &status);
MPI_File_close(&mpif);
MPI_Type_free(&myview);
v[local_els] = '\0';
printf("[%d]: %s\n", id, v);
free(v);
free(counts);
free(disps);
free(block_size);
MPI_Finalize();
return 0;
}
Running this gives (output re-ordered for clarity)
$ mpirun -np 6 ./foo
Initial data: abcdefghijklmnopqrstu
---Using MPI-Scatterv---
[0]: a
[1]: bc
[2]: def
[3]: ghij
[4]: klmno
[5]: pqrstu
---Using MPI-IO---
[0]: a
[1]: bc
[2]: def
[3]: ghij
[4]: klmno
[5]: pqrstu
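To tie this back to the block mapping in the question, the counts and displacements for MPI_Scatterv could be filled in from a BLOCK_SIZE-style macro. The question does not show its macro definitions, so the three macros below are an assumption (a common floor-based block decomposition), and fill_counts_and_disps is a hypothetical helper name. This is a sketch, not the asker's actual code.

```c
#include <assert.h>

/* Assumed block-mapping macros; the question's own definitions are not shown. */
#define BLOCK_LOW(id, p, n)  ((id) * (n) / (p))
#define BLOCK_HIGH(id, p, n) (BLOCK_LOW((id) + 1, p, n) - 1)
#define BLOCK_SIZE(id, p, n) (BLOCK_HIGH(id, p, n) - BLOCK_LOW(id, p, n) + 1)

/* Fill counts[i] (elements sent to rank i) and disps[i] (offset of rank i's
 * block within the send buffer) for p processes sharing n elements.
 * Returns the total number of elements covered, which should equal n. */
int fill_counts_and_disps(int p, int n, int *counts, int *disps)
{
    assert(p > 0 && n >= 0);
    int total = 0;
    for (int i = 0; i < p; i++) {
        counts[i] = BLOCK_SIZE(i, p, n);
        disps[i] = total;
        total += counts[i];
    }
    return total;
}
```

Every rank would call this with the same p and n; the root then passes the full data buffer together with counts and disps as the send arguments of MPI_Scatterv, and each rank passes its own buffer and counts[id] as the receive arguments. Note that (id) * (n) can overflow int for very large vectors.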
I've been having a bug in my code for some time and could not figure out yet how to solve it.
What I'm trying to achieve is easy enough: every worker node (i.e. node with rank!=0) gets a row (represented by a 1-dimensional array) in a square structure that involves some computation. Once the computation is done, this row gets sent back to the master.
For testing purposes, there is no computation involved. All that's happening is:
master sends row number to worker, worker uses the row number to calculate the according values
worker sends the array with the result values back
Now, my issue is this:
all works as expected up to a certain size for the number of elements in a row (size = 1006) and number of workers > 1
if the number of elements in a row exceeds 1006, workers fail to shut down and the program does not terminate
this only occurs if I try to send the array back to the master. If I simply send back an INT, then everything is OK (see commented out line in doMasterTasks() and doWorkerTasks())
Based on the last bullet point, I assume that there must be some race-condition which only surfaces when the array to be sent back to the master reaches a certain size.
Do you have any idea what the issue could be?
Compile the following code with: mpicc -O2 -std=c99 -o simple
Run the executable like so: mpirun -np 3 simple <size> (e.g. 1006 or 1007)
Here's the code:
#include "mpi.h"
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#define MASTER_RANK 0
#define TAG_RESULT 1
#define TAG_ROW 2
#define TAG_FINISHOFF 3
int mpi_call_result, my_rank, dimension, np;
// forward declarations
void doInitWork(int argc, char **argv);
void doMasterTasks(int argc, char **argv);
void doWorkerTasks(void);
void finalize();
void quit(const char *msg, int mpi_call_result);
void shutdownWorkers() {
printf("All work has been done, shutting down clients now.\n");
for (int i = 0; i < np; i++) {
MPI_Send(0, 0, MPI_INT, i, TAG_FINISHOFF, MPI_COMM_WORLD);
}
}
void doMasterTasks(int argc, char **argv) {
printf("Starting to distribute work...\n");
int size = dimension;
int * dataBuffer = (int *) malloc(sizeof(int) * size);
int currentRow = 0;
int receivedRow = -1;
int rowsLeft = dimension;
MPI_Status status;
for (int i = 1; i < np; i++) {
MPI_Send(&currentRow, 1, MPI_INT, i, TAG_ROW, MPI_COMM_WORLD);
rowsLeft--;
currentRow++;
}
for (;;) {
// MPI_Recv(dataBuffer, size, MPI_INT, MPI_ANY_SOURCE, TAG_RESULT, MPI_COMM_WORLD, &status);
MPI_Recv(&receivedRow, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
if (rowsLeft == 0)
break;
if (currentRow > 1004)
printf("Sending row %d to worker %d\n", currentRow, status.MPI_SOURCE);
MPI_Send(¤tRow, 1, MPI_INT, status.MPI_SOURCE, TAG_ROW, MPI_COMM_WORLD);
rowsLeft--;
currentRow++;
}
shutdownWorkers();
free(dataBuffer);
}
void doWorkerTasks() {
    printf("Worker %d started\n", my_rank);
    // send the processed row back as the first element in the colours array.
    int size = dimension;
    int * data = (int *) malloc(sizeof(int) * size);
    memset(data, 0, sizeof(int) * size);
    int processingRow = -1;
    MPI_Status status;
    for (;;) {
        MPI_Recv(&processingRow, 1, MPI_INT, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
        if (status.MPI_TAG == TAG_FINISHOFF) {
            printf("Finish-OFF tag received!\n");
            break;
        } else {
            // MPI_Send(data, size, MPI_INT, 0, TAG_RESULT, MPI_COMM_WORLD);
            MPI_Send(&processingRow, 1, MPI_INT, 0, TAG_RESULT, MPI_COMM_WORLD);
        }
    }
    printf("Slave %d finished work\n", my_rank);
    free(data);
}
int main(int argc, char **argv) {
    if (argc == 2) {
        sscanf(argv[1], "%d", &dimension);
    } else {
        dimension = 1000;
    }
    doInitWork(argc, argv);
    if (my_rank == MASTER_RANK) {
        doMasterTasks(argc, argv);
    } else {
        doWorkerTasks();
    }
    finalize();
}
void quit(const char *msg, int mpi_call_result) {
    printf("\n%s\n", msg);
    MPI_Abort(MPI_COMM_WORLD, mpi_call_result);
    exit(mpi_call_result);
}
void finalize() {
    mpi_call_result = MPI_Finalize();
    if (mpi_call_result != 0) {
        quit("Finalizing the MPI system failed, aborting now...", mpi_call_result);
    }
}
void doInitWork(int argc, char **argv) {
    mpi_call_result = MPI_Init(&argc, &argv);
    if (mpi_call_result != 0) {
        quit("Error while initializing the system. Aborting now...\n", mpi_call_result);
    }
    MPI_Comm_size(MPI_COMM_WORLD, &np);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
}
Any help is greatly appreciated!
Best,
Chris
If you take a look at your doWorkerTasks, you see that the workers send exactly as many data messages as they receive (and they receive one more to shut them down).
But your master code:
for (int i = 1; i < np; i++) {
    MPI_Send(&currentRow, 1, MPI_INT, i, TAG_ROW, MPI_COMM_WORLD);
    rowsLeft--;
    currentRow++;
}
for (;;) {
    MPI_Recv(dataBuffer, size, MPI_INT, MPI_ANY_SOURCE, TAG_RESULT, MPI_COMM_WORLD, &status);
    if (rowsLeft == 0)
        break;
    MPI_Send(&currentRow, 1, MPI_INT, status.MPI_SOURCE, TAG_ROW, MPI_COMM_WORLD);
    rowsLeft--;
    currentRow++;
}
sends np-2 more data messages than it receives. In particular, it only keeps receiving data until it has no more to send, even though there should be np-2 more data messages outstanding. Changing the code to the following:
int rowsLeftToSend = dimension;
int rowsLeftToReceive = dimension;
for (int i = 1; i < np; i++) {
    MPI_Send(&currentRow, 1, MPI_INT, i, TAG_ROW, MPI_COMM_WORLD);
    rowsLeftToSend--;
    currentRow++;
}
while (rowsLeftToReceive > 0) {
    MPI_Recv(dataBuffer, size, MPI_INT, MPI_ANY_SOURCE, TAG_RESULT, MPI_COMM_WORLD, &status);
    rowsLeftToReceive--;
    if (rowsLeftToSend > 0) {
        if (currentRow > 1004)
            printf("Sending row %d to worker %d\n", currentRow, status.MPI_SOURCE);
        MPI_Send(&currentRow, 1, MPI_INT, status.MPI_SOURCE, TAG_ROW, MPI_COMM_WORLD);
        rowsLeftToSend--;
        currentRow++;
    }
}
Now works.
Why the code doesn't deadlock for smaller message sizes is a subtle detail of how most MPI implementations work. (Note that this is a deadlock, not a race condition; deadlocks are the more common kind of parallel error in distributed computing.) Generally, MPI implementations just "shove" small messages down the pipe whether or not the receiver is ready for them, but larger messages (since they take more storage resources on the receiving end) need some handshaking between the sender and the receiver. (If you want to find out more, search for eager vs. rendezvous protocols.)
So for the small message case (up to 1006 ints in this case, and 1 int definitely works, too) the worker nodes completed their sends whether or not the master was receiving them. If the master had called MPI_Recv(), the messages would have been there already and it would have returned immediately. But it didn't, so there were pending messages on the master side; that didn't matter, though. The master sent out its kill messages, and everyone exited.
But for larger messages, the remaining sends need the receiver to participate before they can complete, and since the receiver never does, the remaining workers hang.
Note that even for the small message case where there was no deadlock, the code didn't work properly - there was missing computed data.
Update: There was a similar problem in your shutdownWorkers:
void shutdownWorkers() {
    printf("All work has been done, shutting down clients now.\n");
    for (int i = 0; i < np; i++) {
        MPI_Send(0, 0, MPI_INT, i, TAG_FINISHOFF, MPI_COMM_WORLD);
    }
}
Here you are sending to all processes, including rank 0, the one doing the sending. In principle, that MPI_Send should deadlock, as it is a blocking send and there isn't a matching receive already posted. You could post a non-blocking receive before to avoid this, but that's unnecessary -- rank 0 doesn't need to let itself know to end. So just change the loop to
for (int i = 1; i < np; i++)
tl;dr - your code deadlocked because the master wasn't receiving enough messages from the workers; it happened to work for small message sizes because of an implementation detail common to most MPI libraries.