In my matrix addition code, I am transmitting the lower bound to the other processes with MPI_Isend and tag 1, but when I run the code all the other slave processes claim to have received the same lower bound. I don't understand why.
The Output:
I am process 1 and I received 1120 as lower bound
I am process 1 and my lower bound is 1120 and my upper bound is 1682
I am process 2 and I received 1120 as lower bound
I am process 2 and my lower bound is 1120 and my upper bound is 1682
Process 0 here: I am sending lower bound 0 to process 1
Process 0 here: I am sending lower bound 560 to process 2
Process 0 here: I am sending lower bound 1120 to process 3
Timings : 13.300698 Sec
I am process 3 and I received 1120 as lower bound
I am process 3 and my lower bound is 1120 and my upper bound is 1682
The code:
#define N_ROWS 1682
#define N_COLS 823
#define MASTER_TO_SLAVE_TAG 1 //tag for messages sent from master to slaves
#define SLAVE_TO_MASTER_TAG 4 //tag for messages sent from slaves to master
void readMatrix();
int rank, nproc, proc;
double matrix_A[N_ROWS][N_COLS];
double matrix_B[N_ROWS][N_COLS];
double matrix_C[N_ROWS][N_COLS];
int low_bound; //low bound of the number of rows of [A] allocated to a slave
int upper_bound; //upper bound of the number of rows of [A] allocated to a slave
int portion; //portion of the number of rows of [A] allocated to a slave
MPI_Status status; // store status of a MPI_Recv
MPI_Request request; //capture request of a MPI_Isend
int main (int argc, char *argv[]) {
MPI_Init(&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD, &nproc);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
double StartTime = MPI_Wtime();
// -------------------> Process 0 initializes matrices and sends work portions to other processes
if (rank==0) {
readMatrix();
for (proc = 1; proc < nproc; proc++) {//for each slave other than the master
portion = (N_ROWS / (nproc - 1)); // calculate portion without master
low_bound = (proc - 1) * portion;
if (((proc + 1) == nproc) && ((N_ROWS % (nproc - 1)) != 0)) {//if rows of [A] cannot be equally divided among slaves
upper_bound = N_ROWS; //last slave gets all the remaining rows
} else {
upper_bound = low_bound + portion; //rows of [A] are equally divisible among slaves
}
//send the low bound first without blocking, to the intended slave
printf("Process 0 here: I am sending lower bound %i to process %i \n",low_bound,proc);
MPI_Isend(&low_bound, 1, MPI_INT, proc, MASTER_TO_SLAVE_TAG, MPI_COMM_WORLD, &request);
//next send the upper bound without blocking, to the intended slave
MPI_Isend(&upper_bound, 1, MPI_INT, proc, MASTER_TO_SLAVE_TAG + 1, MPI_COMM_WORLD, &request);
//finally send the allocated row portion of [A] without blocking, to the intended slave
MPI_Isend(&matrix_A[low_bound][0], (upper_bound - low_bound) * N_COLS, MPI_DOUBLE, proc, MASTER_TO_SLAVE_TAG + 2, MPI_COMM_WORLD, &request);
}
}
//broadcast [B] to all the slaves
MPI_Bcast(&matrix_B, N_ROWS*N_COLS, MPI_DOUBLE, 0, MPI_COMM_WORLD);
// -------------------> Other processes do their work
if (rank != 0) {
//receive low bound from the master
MPI_Recv(&low_bound, 1, MPI_INT, 0, MASTER_TO_SLAVE_TAG, MPI_COMM_WORLD, &status);
printf("I am process %i and I received %i as lower bound \n",rank,low_bound);
//next receive upper bound from the master
MPI_Recv(&upper_bound, 1, MPI_INT, 0, MASTER_TO_SLAVE_TAG + 1, MPI_COMM_WORLD, &status);
//finally receive row portion of [A] to be processed from the master
MPI_Recv(&matrix_A[low_bound][0], (upper_bound - low_bound) * N_COLS, MPI_DOUBLE, 0, MASTER_TO_SLAVE_TAG + 2, MPI_COMM_WORLD, &status);
printf("I am process %i and my lower bound is %i and my upper bound is %i \n",rank,low_bound,upper_bound);
//do your work
for (int i = low_bound; i < upper_bound; i++) {
for (int j = 0; j < N_COLS; j++) {
matrix_C[i][j] = (matrix_A[i][j] + matrix_B[i][j]);
}
}
//send back the low bound first without blocking, to the master
MPI_Isend(&low_bound, 1, MPI_INT, 0, SLAVE_TO_MASTER_TAG, MPI_COMM_WORLD, &request);
//send the upper bound next without blocking, to the master
MPI_Isend(&upper_bound, 1, MPI_INT, 0, SLAVE_TO_MASTER_TAG + 1, MPI_COMM_WORLD, &request);
//finally send the processed portion of data without blocking, to the master
MPI_Isend(&matrix_C[low_bound][0], (upper_bound - low_bound) * N_COLS, MPI_DOUBLE, 0, SLAVE_TO_MASTER_TAG + 2, MPI_COMM_WORLD, &request);
}
// -------------------> Process 0 gathers the work
...
MPI_Isend() begins a non-blocking send. Hence, modifying the buffer being sent without checking that the message was actually transmitted results in wrong values being sent.
This is what happens in the piece of code you provided, in the loop over processes for (proc = 1; proc < nproc; proc++):
proc=1 : low_bound is computed.
proc=1 : low_bound is sent (non-blocking) to process 1.
proc=2 : low_bound is modified. The message to process 1 is corrupted.
Different solutions exist:
Use the blocking send MPI_Send().
Check that the messages have completed: create an array of 3 requests MPI_Request requests[3]; MPI_Status statuses[3];, use non-blocking sends, and call MPI_Waitall() to check the completion of the requests.
MPI_Isend(&low_bound, 1, MPI_INT, proc, MASTER_TO_SLAVE_TAG, MPI_COMM_WORLD, &requests[0]);
MPI_Isend(..., &requests[1]);
MPI_Isend(..., &requests[2]);
MPI_Waitall(3, requests, statuses);
Take a look at MPI_Scatter() and MPI_Scatterv() !
The "usual" way to do this is to MPI_Bcast() the size of the matrix. Then each process computes the size of its part of the matrix. Process 0 computes the sendcounts and displs needed by MPI_Scatterv().
Related
I'm rather new to MPI, so I'm not sure why this code isn't functioning as expected. The idea is to pass an integer to a random node and decrement it until it reaches 0. When I try to run it, it passes the integer twice and stalls. Can someone please point me in the right direction? Thank you!
if (rank == 0)
{
potato = rand() % 100 + size; // generate a random number between the number of processors and 100
sendTo = rand() % (size - 1) + 1; // generate a number (not 0) to represent the process to send the potato to
MPI_Send(&potato, 1, MPI_INT, sendTo, 0, MPI_COMM_WORLD); // send the potato
}
else // any process other than 0
{
MPI_Recv(&potato, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, MPI_STATUS_IGNORE); //receive potato
if (potato == -1) // check for termination int
return;
--potato; // decrement potato
if (potato != 0)
{
do
{
sendTo = rand() % (size - 1) + 1; // send to a process 1 through size - 1
} while (sendTo == rank || sendTo == 0); // make sure it won't send the potato to itself or 0
printf("Node %d has the potato, passing to node %d.\n", rank, sendTo);
MPI_Send(&potato, 1, MPI_INT, sendTo, 0, MPI_COMM_WORLD);
}
else // potato == 0
{
printf("Node %d is it, game over.\n", rank);
potato = -1;
for (int i = 1; i < size; ++rank) // send termination message
MPI_Send(&potato, 1, MPI_INT, i, 0, MPI_COMM_WORLD);
}
}
Output:
Potato: 44
Node 3 has the potato, Passing to node 2.
Node 2 has the potato, Passing to node 3.
Your code is missing a loop. In your example, for node 3 to receive the potato a second time, MPI_Recv must be called once more.
if (rank == 0)
{
potato = rand() % 100 + size; // generate a random number between the number of processors and 100
sendTo = rand() % (size - 1) + 1; // generate a number (not 0) to represent the process to send the potato to
MPI_Send(&potato, 1, MPI_INT, sendTo, 0, MPI_COMM_WORLD); // send the potato
}
else // any process other than 0
{
/* Here is the loop beginning: while potato is not -1, keep receiving */
while (1)
{
MPI_Recv(&potato, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, MPI_STATUS_IGNORE); //receive potato
if (potato == -1) // check for termination int
return;
--potato; // decrement potato
if (potato != 0)
{
do
{
sendTo = rand() % (size - 1) + 1; // send to a process 1 through size - 1
} while (sendTo == rank || sendTo == 0); // make sure it won't send the potato to itself or 0
printf("Node %d has the potato, passing to node %d.\n", rank, sendTo);
MPI_Send(&potato, 1, MPI_INT, sendTo, 0, MPI_COMM_WORLD);
}
else // potato == 0
{
printf("Node %d is it, game over.\n", rank);
potato = -1;
for (int i = 1; i < size; ++i) // send termination message
MPI_Send(&potato, 1, MPI_INT, i, 0, MPI_COMM_WORLD);
}
}
}
#include "mpi.h"
#include <stdio.h>
int main(int argc,char *argv[]){
int numtasks, rank, rc, count, tag=1, i =0;
char inmsg, outmsg = 'x'; // message buffers (the 'x' sent by even ranks)
MPI_Status Stat;
MPI_Init(&argc,&argv);
MPI_Comm_size(MPI_COMM_WORLD, &numtasks);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
if (rank == 0) //for process 0 we print received messages
{
for(i=0; i< 9; i ++){
printf("value of i is: %d\n",i );
rc = MPI_Recv(&inmsg, 1, MPI_CHAR, MPI_ANY_SOURCE, tag, MPI_COMM_WORLD, &Stat);
printf("Task %d: Received %d char(s) from task %d with tag %d \n", rank, count, Stat.MPI_SOURCE, Stat.MPI_TAG);
}
}
else //for the other 9 processes
{
if(rank % 2 == 0){ //if rank is an even number
rc = MPI_Send(&outmsg, 1, MPI_CHAR, 0, tag, MPI_COMM_WORLD); //send message to process with rank 0
}
}
MPI_Finalize();
}
This program is run with 10 processes. The process with rank 0 receives messages and prints them out if the source process has an even-numbered rank. Processes with a rank other than 0 send the process with rank 0 a message containing the character 'x'.
Now, regarding rank 0: it has a for loop that iterates 9 times. In the loop it prints the value of the iterating variable i, then the received character and the source process.
However, when I run my program it does not terminate.
The output looks like this:
Task 0: Received 0 char(s) from task 2 with tag 1
value of i is: 1
Task 0: Received 0 char(s) from task 6 with tag 1
value of i is: 2
Task 0: Received 0 char(s) from task 4 with tag 1
value of i is: 3
Task 0: Received 0 char(s) from task 8 with tag 1
value of i is: 4
How do I get it to print the other values of i such as 5,6,7,8,9?
You're using a master-slave architecture for parallel processing: your process 0 is the master and waits for input from the 9 other processes, but in your code only the processes with an even rank fire an output, namely processes 2, 4, 6 and 8.
You didn't define any behavior for processes 1, 3, 5, 7 and 9, so the master is still waiting for them; hence the program hangs, waiting for the parallel processes to finish.
You need to complete your source code here:
if(rank % 2 == 0){ //if rank is an even number
rc = MPI_Send(&outmsg, 1, MPI_CHAR, 0, tag, MPI_COMM_WORLD); //send message to process with rank 0
}else{
//logic for process 1,3,5,7,9
}
This program is written using the C language and MPI. I am new to MPI and want to use all processes to do some calculations, including process 0. To learn this concept, I have written the following simple program. But the program hangs after receiving input from process 0 and won't send results back to process 0.
#include <mpi.h>
#include <stdio.h>
int main(int argc, char** argv) {
MPI_Init(&argc, &argv);
int world_rank;
MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
int world_size;
MPI_Comm_size(MPI_COMM_WORLD, &world_size);
int number;
int result;
if (world_rank == 0)
{
number = -2;
int i;
for(i = 0; i < 4; i++)
{
MPI_Send(&number, 1, MPI_INT, i, 0, MPI_COMM_WORLD);
}
for(i = 0; i < 4; i++)
{ /*Error: can't get the results sent by the other processes below*/
MPI_Recv(&number, 1, MPI_INT, i, 99, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
printf("Process 0 received number %d from i:%d\n", number, i);
}
}
/*I want to do this without using an else statement here, so that I can use process 0 to do some calculations as well*/
MPI_Recv(&number, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
printf("*Process %d received number %d from process 0\n",world_rank, number);
result = world_rank + 1;
MPI_Send(&result, 1, MPI_INT, 0, 99, MPI_COMM_WORLD); /* problem happens here when trying to send result back to process 0*/
MPI_Finalize();
}
Running it and getting results:
:$ mpicc test.c -o test
:$ mpirun -np 4 test
*Process 1 received number -2 from process 0
*Process 2 received number -2 from process 0
*Process 3 received number -2 from process 0
/* hangs here and will not continue */
If you can, please show me with an example or edit the above code if possible.
I don't really get what would be wrong with using 2 if statements, surrounding the working domain. But anyway, here is an example of what could be done.
I modified your code to use collective communications, as they make much more sense than the series of sends and receives you used. Since the initial communication sends the same value to everyone, I use an MPI_Bcast(), which does that in one single call.
Conversely, since the result values are all different, a call to MPI_Gather() is perfectly appropriate.
I also introduced a call to sleep() just to simulate that the processes work for a while before sending back their results.
The code now looks like this:
#include <mpi.h>
#include <stdlib.h> // for malloc and free
#include <stdio.h> // for printf
#include <unistd.h> // for sleep
int main( int argc, char *argv[] ) {
MPI_Init( &argc, &argv );
int world_rank;
MPI_Comm_rank( MPI_COMM_WORLD, &world_rank );
int world_size;
MPI_Comm_size( MPI_COMM_WORLD, &world_size );
// sending the same number to all processes via broadcast from process 0
int number = world_rank == 0 ? -2 : 0;
MPI_Bcast( &number, 1, MPI_INT, 0, MPI_COMM_WORLD );
printf( "Process %d received %d from process 0\n", world_rank, number );
// Do something useful here
sleep( 1 );
int my_result = world_rank + 1;
// Now collecting individual results on process 0
int *results = world_rank == 0 ? malloc( world_size * sizeof( int ) ) : NULL;
MPI_Gather( &my_result, 1, MPI_INT, results, 1, MPI_INT, 0, MPI_COMM_WORLD );
// Process 0 prints what it collected
if ( world_rank == 0 ) {
for ( int i = 0; i < world_size; i++ ) {
printf( "Process 0 received result %d from process %d\n", results[i], i );
}
free( results );
}
MPI_Finalize();
return 0;
}
After compiling it as follows:
$ mpicc -std=c99 simple_mpi.c -o simple_mpi
It runs and gives this:
$ mpiexec -n 4 ./simple_mpi
Process 0 received -2 from process 0
Process 1 received -2 from process 0
Process 3 received -2 from process 0
Process 2 received -2 from process 0
Process 0 received result 1 from process 0
Process 0 received result 2 from process 1
Process 0 received result 3 from process 2
Process 0 received result 4 from process 3
Actually, processes 1-3 are indeed sending their results back to process 0. However, process 0 is stuck in the first iteration of this loop:
for(i=0; i<4; i++)
{
MPI_Recv(&number, 1, MPI_INT, i, 99, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
printf("Process 0 received number %d from i:%d\n", number, i);
}
In the first MPI_Recv call, process 0 blocks waiting to receive a message from itself with tag 99, a message that it has not sent yet.
Generally, it is a bad idea for a process to send/receive messages to itself, especially using blocking calls. Process 0 already has the value in memory; it does not need to send it to itself.
However, a workaround is to start the receive loop from i = 1:
for(i=1; i<4; i++)
{
MPI_Recv(&number, 1, MPI_INT, i, 99, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
printf("Process 0 received number %d from i:%d\n", number, i);
}
Running the code now will give you:
Process 1 received number -2 from process 0
Process 2 received number -2 from process 0
Process 3 received number -2 from process 0
Process 0 received number 2 from i:1
Process 0 received number 3 from i:2
Process 0 received number 4 from i:3
Process 0 received number -2 from process 0
Note that using MPI_Bcast and MPI_Gather as mentioned by Gilles is a much more efficient and standard way for data distribution/collection.
I'm trying to compute an NxN matrix multiplication using OpenMPI and C. Everything runs as expected, except for the MPI_Bcast(). As far as I understand, the MASTER must broadcast matrix_2 to the rest of the WORKER processes. At the same time, when the WORKERS reach the MPI_Bcast() they should wait there until the selected process (in this case the MASTER) does the broadcast.
The error I'm getting is a segmentation fault with "Address not mapped", so it surely has something to do with the dynamic allocation of the matrices. What I do is send parts of matrix_1 to each process, and each of them then does partial multiplications and additions with the previously broadcast matrix_2.
I know the error must be in the MPI_Bcast(), because when I comment it out the program finishes correctly (but obviously without computing the product). There must be something I'm not aware of. I leave both the code and the error message I got. Thanks in advance.
CODE
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
/* MACROS */
#define MASTER_TO_SLAVE_TAG 1
#define SLAVE_TO_MASTER_TAG 4
#define MASTER 0
#define WORKER 1
int *matrix_1;
int *matrix_2;
int *result;
double start_time;
double end_time;
int procID;
int numProc;
int size, numRows, from, to;
int i,j,k;
MPI_Status status;
MPI_Request request;
void addressMatrixMemory(int);
int main(int argc, char *argv[]){
size = atoi(argv[1]);
MPI_Init (&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &procID);
MPI_Comm_size(MPI_COMM_WORLD, &numProc);
addressMatrixMemory(size);
/* MASTER starts. */
if(procID == MASTER){
start_time = MPI_Wtime();
for(i = 1; i < numProc; i++){
numRows = size/(numProc - 1);
from = (i - 1) * numRows;
if(((i + 1) == numProc) && ((size % (numProc - 1))) != 0){
to = size;
} else {
to = from + numRows;
}
MPI_Isend(&from, 1, MPI_INT, i, MASTER_TO_SLAVE_TAG, MPI_COMM_WORLD, &request);
MPI_Isend(&to, 1, MPI_INT, i, MASTER_TO_SLAVE_TAG + 1, MPI_COMM_WORLD, &request);
MPI_Isend(matrix_1, (to - from) * size, MPI_INT, i, MASTER_TO_SLAVE_TAG + 2, MPI_COMM_WORLD, &request);
}
}
MPI_Bcast(&matrix_2, size * size, MPI_INT, MASTER, MPI_COMM_WORLD);
/* WORKERS task */
if(procID >= WORKER){
int row, col;
int *matrix = malloc(sizeof(matrix_1[0])*size*size);
MPI_Recv(&from, 1, MPI_INT, MASTER, MASTER_TO_SLAVE_TAG, MPI_COMM_WORLD, &status);
MPI_Recv(&to, 1, MPI_INT, MASTER, MASTER_TO_SLAVE_TAG + 1, MPI_COMM_WORLD, &status);
MPI_Recv(matrix, (to - from) * size, MPI_INT, MASTER, MASTER_TO_SLAVE_TAG + 2, MPI_COMM_WORLD, &status);
for(row = from; row < to; row++){
for(col = 0; col < size; col++){
result[row * size + col] = 0;
for(k = 0; k < size; k++);
result[row * size + col] += matrix[row * size + k] * matrix_2[k * size + col];
}
}
MPI_Isend(&from, 1, MPI_INT, MASTER, SLAVE_TO_MASTER_TAG, MPI_COMM_WORLD, &request);
MPI_Isend(&to, 1, MPI_INT, MASTER, SLAVE_TO_MASTER_TAG + 1, MPI_COMM_WORLD, &request);
MPI_Isend(&result[from], (to - from) * size, MPI_INT, MASTER, SLAVE_TO_MASTER_TAG + 2, MPI_COMM_WORLD, &request);
}
/* MASTER gathers WORKERS job. */
if(procID == MASTER){
for(i = 1; i < numProc; i++){
MPI_Recv(&from, 1, MPI_INT, i, SLAVE_TO_MASTER_TAG, MPI_COMM_WORLD, &status);
MPI_Recv(&to, 1, MPI_INT, i, SLAVE_TO_MASTER_TAG + 1, MPI_COMM_WORLD, &status);
MPI_Recv(&result[from], (to - from) * size, MPI_INT, i, SLAVE_TO_MASTER_TAG + 2, MPI_COMM_WORLD, &status);
}
end_time = MPI_Wtime();
printf("\nRunning Time = %f\n\n", end_time - start_time);
}
MPI_Finalize();
free(matrix_1);
free(matrix_2);
free(result);
return EXIT_SUCCESS;
}
void addressMatrixMemory(int n){
matrix_1 = malloc(sizeof(matrix_1[0])*n*n);
matrix_2 = malloc(sizeof(matrix_2[0])*n*n);
result = malloc(sizeof(result[0])*n*n);
/* Matrix init with values between 1 y 100. */
srand(time(NULL));
int r = rand() % 100 + 1;
int i;
for(i = 0; i < n*n; i++){
matrix_1[i] = r;
r = rand() % 100 + 1;
matrix_2[i] = r;
r = rand() % 100 + 1;
}
}
ERROR MESSAGE
[tuliansPC:28270] *** Process received signal ***
[tuliansPC:28270] Signal: Segmentation fault (11)
[tuliansPC:28270] Signal code: Address not mapped (1)
[tuliansPC:28270] Failing at address: 0x603680
[tuliansPC:28270] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x10340) [0x7f0a98ce0340]
[tuliansPC:28270] [ 1] /lib/x86_64-linux-gnu/libc.so.6(+0x97ffe) [0x7f0a9899fffe]
[tuliansPC:28270] [ 2] /usr/lib/libmpi.so.1(opal_convertor_pack+0x129) [0x7f0a98fef779]
[tuliansPC:28270] [ 3] /usr/lib/openmpi/lib/openmpi/mca_btl_sm.so(mca_btl_sm_prepare_src+0x1fd) [0x7f0a923c385d]
[tuliansPC:28270] [ 4] /usr/lib/openmpi/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_send_request_start_rndv+0x1dc) [0x7f0a93245c9c]
[tuliansPC:28270] [ 5] /usr/lib/openmpi/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_isend+0x8ec) [0x7f0a9323856c]
[tuliansPC:28270] [ 6] /usr/lib/openmpi/lib/openmpi/mca_coll_tuned.so(ompi_coll_tuned_bcast_intra_generic+0x3fc) [0x7f0a914f49fc]
[tuliansPC:28270] [ 7] /usr/lib/openmpi/lib/openmpi/mca_coll_tuned.so(ompi_coll_tuned_bcast_intra_pipeline+0xbc) [0x7f0a914f4d5c]
[tuliansPC:28270] [ 8] /usr/lib/openmpi/lib/openmpi/mca_coll_tuned.so(ompi_coll_tuned_bcast_intra_dec_fixed+0x134) [0x7f0a914ec7a4]
[tuliansPC:28270] [ 9] /usr/lib/openmpi/lib/openmpi/mca_coll_sync.so(mca_coll_sync_bcast+0x64) [0x7f0a917096a4]
[tuliansPC:28270] [10] /usr/lib/libmpi.so.1(MPI_Bcast+0x13d) [0x7f0a98f5678d]
[tuliansPC:28270] [11] ej5Exec() [0x400e8c]
[tuliansPC:28270] [12] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f0a98929ec5]
[tuliansPC:28270] [13] ej5Exec() [0x400ac9]
[tuliansPC:28270] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 28270 on node tuliansPC exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
Let's start with the first problem that jumps out: you're using non-blocking communication incorrectly. MPI_Isend is a non-blocking send, which means that when you call MPI_Isend, all you are really doing is telling MPI about a message that you'd like to send at some point in the future. It may get sent right then, it may not. To guarantee that the data is actually sent, you need to complete the call with something like MPI_Wait. Usually when people use non-blocking calls (MPI_Isend), they don't mix them with blocking calls (MPI_Recv); if you use all non-blocking calls, you can complete all of them with a single function, MPI_Waitall.
There is also a direct cause for the crash inside the broadcast itself: matrix_2 is already a pointer, so the call should be MPI_Bcast(matrix_2, ...), not MPI_Bcast(&matrix_2, ...). Passing &matrix_2 broadcasts from the address of the pointer variable rather than the buffer it points to, which matches the "Address not mapped" failure in MPI_Bcast.
Try fixing these issues first and see if that solves your problem. Just because you commented out the collective doesn't mean the other issues weren't there; MPI programs can be notoriously difficult to debug because of weird behavior like this.
I wrote a program in MPI where a message goes around each processor in a ring fashion a given number of times (for example, if I wanted it to go twice around the "ring" of four processors, it would go to 0, 1, 2, 3, 0, 1, ..., 3).
Everything compiled fine, but when I ran the program on my Ubuntu VM it never output anything. It wouldn't even print the first line. Can anyone explain what's going on?
This is my code:
#include <stdio.h>
#include <mpi.h>
int main(int argc, char **argv){
int rank, size, tag, next, from, num;
tag = 201;
MPI_Status status;
MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
next = (rank + 1)/ size;
from = (rank - 1)/size;
if (rank == 0){
printf("How many times around the ring? :: ");
scanf ("%d", &num);
MPI_Send(&num, 1, MPI_INT, 1, tag, MPI_COMM_WORLD);
}
do{
MPI_Recv(&num, 1, MPI_INT, from, tag, MPI_COMM_WORLD, &status);
printf("Process %d received %d from process %d\n", rank, num, status.MPI_SOURCE);
if (rank == 0){
num--;
printf("Process 0 has decremented the number\n");
}
printf("Process %d sending %d to process %d\n", rank, num ,next);
MPI_Send(&num, 1, MPI_INT, next, tag, MPI_COMM_WORLD);
}while (num > 0);
printf("Process %d has exited", rank);
if (rank == 0){
MPI_Recv(&num, 1, MPI_INT, size - 1, tag, MPI_COMM_WORLD, &status);
printf("Process 0 has received the last round, exiting");
}
MPI_Finalize();
return 0;
}
There's a problem with your neighbour assignment. If we insert the following line after the next/from calculation
printf("Rank %d: from = %d, next = %d\n", rank, from, next);
we get:
$ mpirun -np 4 ./ring
Rank 0: from = 0, next = 0
Rank 1: from = 0, next = 0
Rank 2: from = 0, next = 0
Rank 3: from = 0, next = 1
You want something more like
next = (rank + 1) % size;
from = (rank - 1 + size) % size;
which gives
$ mpirun -np 4 ./ring
Rank 0: from = 3, next = 1
Rank 1: from = 0, next = 2
Rank 2: from = 1, next = 3
Rank 3: from = 2, next = 0
and after that your code seems to work.
Whether your code is correct or not, your first printf should appear.
If no messages are printed at all, not even the printf in the if (rank == 0) block, it could be a problem with your VM. Are you sure you have a network interface activated on that VM?
If the answer is yes, it might be useful to check its compatibility with MPI via the Open MPI FAQ on TCP questions. Sections 7 ("How do I tell Open MPI which TCP networks to use?") and 13 ("Does Open MPI support virtual IP interfaces?") both seem relevant to possible problems with running MPI in a virtual machine.