MPI get the number of non-finalized processes - c

I have the following code:
int main(void) {
    ...
    int numtasks, rank;
    MPI_Comm_size(MPI_COMM_WORLD, &numtasks);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    int k = 2;
    if (rank == 3) {
        do {
            ...
        } while (number_of_not_finalized_processes >= numtasks - k);
    }
    ...
    MPI_Finalize();
}
and I want to keep the process with rank 3 looping until k processes have finalized with MPI_Finalize(). How can I find out how many processes have finalized, and how should I define number_of_not_finalized_processes? Or is there a better way to find out how many processes are still alive or have passed a certain part of the code?

You can solve this with one-sided communication. Create a window, and have every process, just before it finalizes:
acquire a lock on the window on rank 3
use MPI_Accumulate to increment the counter
This is a very asynchronous solution. If your processes do something synchronized, say a loop where they dynamically decide to exit the loop, you could for instance create, at the synchronization point, a subcommunicator of only the active processes.
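Here is a rough sketch of the counter approach, assuming at least 4 ranks and with placeholder comments standing in for the real work. The use of MPI_Fetch_and_op to read the counter atomically and the exact loop condition are illustrative choices, not the only way to do it.

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int numtasks, rank, k = 2;
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &numtasks);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* every rank exposes one int, but only the counter on rank 3 is ever used */
    int counter = 0;
    MPI_Win win;
    MPI_Win_create(&counter, sizeof(int), sizeof(int),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    if (rank != 3) {
        /* ... the actual work of this rank ... */

        /* just before finishing: lock the window on rank 3 and bump its counter */
        int one = 1;
        MPI_Win_lock(MPI_LOCK_SHARED, 3, 0, win);
        MPI_Accumulate(&one, 1, MPI_INT, 3, 0, 1, MPI_INT, MPI_SUM, win);
        MPI_Win_unlock(3, win);
    } else {
        int done = 0, dummy = 0;
        do {
            /* ... rank 3's repeated work ... */

            /* atomically read how many other ranks have reached the counter update */
            MPI_Win_lock(MPI_LOCK_SHARED, 3, 0, win);
            MPI_Fetch_and_op(&dummy, &done, MPI_INT, 3, 0, MPI_NO_OP, win);
            MPI_Win_unlock(3, win);
            /* numtasks - done = rank 3 plus the ranks that are not yet done */
        } while (numtasks - done >= numtasks - k);
    }

    MPI_Win_free(&win);   /* collective: every rank calls it before finalizing */
    MPI_Finalize();
    return 0;
}

Note that MPI_Win_free is collective, so every rank still calls it before MPI_Finalize; strictly speaking the counter therefore counts ranks that have passed the accumulate rather than ranks that have literally finalized, which is usually what you want anyway.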

Related

When do I free an MPI sub-group?

I'm creating MPI groups in a loop to perform a task, but when I try to free the group, the computation aborts. When should I free the group?
The error I get is:
[KLArch:13617] *** An error occurred in MPI_Comm_free
[KLArch:13617] *** reported by process [1712324609,2]
[KLArch:13617] *** on communicator MPI_COMM_WORLD
[KLArch:13617] *** MPI_ERR_COMM: invalid communicator
[KLArch:13617] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[KLArch:13617] *** and potentially your MPI job)
[KLArch:13611] 2 more processes have sent help message help-mpi-errors.txt / mpi_errors_are_fatal
[KLArch:13611] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
What I intend to do is a calculation with an increasing number of processes in parallel, as a benchmark for MPI: run the whole calculation with only 1 process and take the time, run the same calculation with 2 processes and take the time, run it with 4 processes and take the time, and so on, then compare how the problem scales with the number of processes.
MWE:
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
int main(int argc, char *argv[])
{
    int j = 0;
    int size = 0;
    int rank = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    // Get the group of processes in MPI_COMM_WORLD
    MPI_Group world_group;
    MPI_Comm_group(MPI_COMM_WORLD, &world_group);
    // Construct a group containing all of the ranks smaller than i in MPI_COMM_WORLD
    for (int i = 1; i <= size; i++)
    {
        int group_ranks[i];
        for (int j = 0; j < i; j++)
        {
            group_ranks[j] = j;
        }
        // Construct a group with all the ranks smaller than i
        MPI_Group sub_group;
        MPI_Group_incl(world_group, i, group_ranks, &sub_group);
        // Create a communicator based on the group
        MPI_Comm sub_comm;
        MPI_Comm_create(MPI_COMM_WORLD, sub_group, &sub_comm);
        int sub_rank = -1;
        int sub_size = -1;
        // If this rank isn't in the new communicator, it will be
        // MPI_COMM_NULL. Using MPI_COMM_NULL for MPI_Comm_rank or
        // MPI_Comm_size is erroneous
        if (MPI_COMM_NULL != sub_comm)
        {
            MPI_Comm_rank(sub_comm, &sub_rank);
            MPI_Comm_size(sub_comm, &sub_size);
        }
        // Do some work
        printf("WORLD RANK/SIZE: %d/%d \t Group RANK/SIZE: %d/%d\n",
               rank, size, sub_rank, sub_size);
        // Free the communicator and group
        MPI_Barrier(MPI_COMM_WORLD);
        //MPI_Comm_free(&sub_comm);
        //MPI_Group_free(&sub_group);
        j = 0;
    }
    MPI_Group_free(&world_group);
    MPI_Finalize();
    return 0;
}
The code throws the error if I uncomment MPI_Comm_free(&sub_comm); and MPI_Group_free(&sub_group); at the end of the loop.
Let me collect some remarks about this code.
That barrier is not needed.
If you test this on a multi-node system, be aware that your processes are not spread evenly: 13 processes on 3 six-core nodes would give you 6+6+1, which is unbalanced; you would want 5+4+4 or so. Your mpiexec would do this correctly; achieving this in your code is a little harder. Just be aware of this since you are doing benchmarking.
It's a little tricky to get this code right. When you make a subgroup, all processes have the same value for the group, including the ones that are not in the group; for instance, they do not get MPI_GROUP_NULL. You then have to call MPI_Comm_create collectively on the large communicator; processes that are not in the group get MPI_COMM_NULL as the result, and they do not participate in the actions on the subcommunicator. Also, and this was your problem: they must not free the subcommunicator, but they do free the subgroup.
(That last point was also pointed out by @GillesGouaillardet.)
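Based on those remarks, the cleanup at the end of the loop could look like this minimal sketch: every rank frees the group it created, but only the ranks that actually received a communicator free it.

// every rank created sub_group, so every rank frees it
MPI_Group_free(&sub_group);
// ranks outside the subcommunicator got MPI_COMM_NULL, which
// must not be passed to MPI_Comm_free
if (sub_comm != MPI_COMM_NULL)
    MPI_Comm_free(&sub_comm);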

Recursive Function using MPI library

I am using a recursive function to find the determinant of a 5x5 matrix. This sounds like a trivial problem, and it could be solved with OpenMP if the dimensions were huge, but I am trying to use MPI to solve it, and I can't understand how to deal with the recursion that is accumulating the results.
So my question is: how do I use MPI for this?
PS: The matrix is a Hilbert matrix, so the answer would be 0.
I have written the code below, but I think it simply does the same work n times rather than dividing the problem and then accumulating the result.
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#define ROWS 5
double getDeterminant(double matrix[5][5], int iDim);
double matrix[ROWS][ROWS] = {
    {1.0, -0.5, -0.33, -0.25, -0.2},
    {-0.5, 0.33, -0.25, -0.2, -0.167},
    {-0.33, -0.25, 0.2, -0.167, -0.1428},
    {-0.25, -0.2, -0.167, 0.1428, -0.125},
    {-0.2, -0.167, -0.1428, -0.125, 0.111},
};
int rank, size, tag = 0;
int main(int argc, char** argv)
{
    //Status of messages for each individual row
    MPI_Status status[ROWS];
    //Message ID or Rank
    MPI_Request req[ROWS];
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    double result;
    result = getDeterminant(matrix, ROWS);
    printf("The determinant is %lf\n", result);
    //Set barrier to wait for all processes
    MPI_Barrier(MPI_COMM_WORLD);
    MPI_Finalize();
    return 0;
}
double getDeterminant(double matrix[ROWS][ROWS], int iDim)
{
    int iCols, iMinorRow, iMinorCol, iTemp, iSign;
    double c[5];
    double tempMat[5][5];
    double dDet;
    dDet = 0;
    if (iDim == 2)
    {
        dDet = (matrix[0][0] * matrix[1][1]) - (matrix[0][1] * matrix[1][0]);
        return dDet;
    }
    else
    {
        for (iCols = 0; iCols < iDim; iCols++)
        {
            int temp_row = 0, temp_col = 0;
            for (iMinorRow = 0; iMinorRow < iDim; iMinorRow++)
            {
                for (iMinorCol = 0; iMinorCol < iDim; iMinorCol++)
                {
                    if (iMinorRow != 0 && iMinorCol != iCols)
                    {
                        tempMat[temp_row][temp_col] = matrix[iMinorRow][iMinorCol];
                        temp_col++;
                        if (temp_col >= iDim - 1)
                        {
                            temp_row++;
                            temp_col = 0;
                        }
                    }
                }
            }
            //Handling the alternating signs while calculating the determinant
            for (iTemp = 0, iSign = 1; iTemp < iCols; iTemp++)
            {
                iSign = (-1) * iSign;
            }
            //Evaluating what has been calculated once the resulting matrix is 2x2
            c[iCols] = iSign * getDeterminant(tempMat, iDim - 1);
        }
        for (iCols = 0, dDet = 0.0; iCols < iDim; iCols++)
        {
            dDet = dDet + (matrix[0][iCols] * c[iCols]);
        }
        return dDet;
    }
}
The expected result should be a very small value close to 0. I am getting that result, but without actually using MPI.
The provided program will be executed in n processes. mpirun launches the n processes, and they all execute the provided code; that is the expected behaviour. Unlike OpenMP, MPI is not a shared-memory programming model but a distributed-memory programming model: it uses message passing to communicate between processes. There are no global variables in MPI; all the data in your program is local to your process. If you need to share data between processes, you have to send it explicitly with MPI_Send, MPI_Bcast, etc. You can use collective operations like MPI_Bcast to send it to all processes, or point-to-point operations like MPI_Send to send it to specific processes.
For your application to behave as expected, you have to tailor it for MPI (unlike in OpenMP, where you can just use pragmas). Every process has an identifier, or rank. Typically, rank 0 (let's call it your main process) should pass the data to the other processes using MPI_Send (or another method), and the remaining processes should receive it using MPI_Recv (MPI_Recv matches MPI_Send). After receiving its local data from the main process, each process should perform some computation on it and then send the results back to the main process, which aggregates them. This is a very basic MPI scenario; you can also use MPI I/O, etc.
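For the determinant question above, a rough sketch of this pattern could replace main as follows. It reuses the global matrix, rank, size and the getDeterminant function from the posted code, and it simply deals the columns of the first row out to the ranks; that split is an illustrative assumption, not the only way to decompose the problem.

int main(int argc, char** argv)
{
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double partial = 0.0, det = 0.0;

    /* columns of the top row are dealt out round-robin over the ranks */
    for (int col = rank; col < ROWS; col += size) {
        double minor[5][5];
        int r = 0;
        for (int i = 1; i < ROWS; i++) {       /* build the minor of entry (0, col) */
            int c = 0;
            for (int j = 0; j < ROWS; j++)
                if (j != col)
                    minor[r][c++] = matrix[i][j];
            r++;
        }
        double sign = (col % 2 == 0) ? 1.0 : -1.0;
        partial += sign * matrix[0][col] * getDeterminant(minor, ROWS - 1);
    }

    /* rank 0 collects the cofactor contributions from every rank */
    MPI_Reduce(&partial, &det, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("The determinant is %lf\n", det);

    MPI_Finalize();
    return 0;
}

Each rank computes only its share of the cofactor expansion along the first row, and MPI_Reduce sums the partial results on rank 0, which prints the determinant once.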
MPI does not do anything by itself for synchronization or data sharing. It just launches n instances of the application and provides the required routines. The application developer is in charge of communication (data structures, etc.), synchronization (using MPI_Barrier), and so on among the processes.
The following is a simple send/receive program using MPI. When you run the code below, say with n as 2, two copies of this program will be launched. In the program, each process gets its id using MPI_Comm_rank(); we can use this ID for further computations and for controlling the flow of the code. In the code below, the process with rank 0 sends the variable number using MPI_Send, and the process with rank 1 receives this value using MPI_Recv. The if and else if branches differentiate between the processes and change the control flow so that one sends and the other receives the data. This is a very basic MPI program that shares data between processes.
// Find out rank, size
int world_rank;
MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
int world_size;
MPI_Comm_size(MPI_COMM_WORLD, &world_size);
int number;
if (world_rank == 0) {
    number = -1;
    MPI_Send(&number, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
} else if (world_rank == 1) {
    MPI_Recv(&number, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
             MPI_STATUS_IGNORE);
    printf("Process 1 received number %d from process 0\n",
           number);
}
Here is a tutorial on MPI.

MPI Deadlock with collective functions

I'm writing a simple program in C with MPI library.
The intent of this program is the following:
I have a group of processes that perform an iterative loop; at the end of each iteration, all processes in the communicator must call two collective functions (MPI_Allreduce and MPI_Bcast). The first one determines the rank of the process that generated the minimum value of the num.val variable, and the second one broadcasts from that source rank, num_min.idx_v, to all processes in the communicator MPI_COMM_WORLD.
The problem is that I don't know whether the i-th process will have finalized before calling the collective functions. In each iteration every process has a probability of 1/10 of terminating; this simulates the behaviour of the real program that I'm implementing. When the first process terminates, the others deadlock.
This is the code:
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
typedef struct double_int{
    double val;
    int idx_v;
}double_int;
int main(int argc, char **argv)
{
    int n = 10;
    int max_it = 4000;
    int proc_id, n_proc;
    double *x = (double *)malloc(n*sizeof(double));
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &n_proc);
    MPI_Comm_rank(MPI_COMM_WORLD, &proc_id);
    srand(proc_id);
    double_int num_min;
    double_int num;
    int k;
    for(k = 0; k < max_it; k++){
        num.idx_v = proc_id;
        num.val = rand()/(double)RAND_MAX;
        if((rand() % 10) == 0){
            printf("iter %d: proc %d terminated\n", k, proc_id);
            MPI_Finalize();
            exit(EXIT_SUCCESS);
        }
        MPI_Allreduce(&num, &num_min, 1, MPI_DOUBLE_INT, MPI_MINLOC, MPI_COMM_WORLD);
        MPI_Bcast(x, n, MPI_DOUBLE, num_min.idx_v, MPI_COMM_WORLD);
    }
    MPI_Finalize();
    exit(EXIT_SUCCESS);
}
Perhaps I should create a new group and a new communicator before calling MPI_Finalize in the if statement? How should I solve this?
If you have control over a process before it terminates, you should send a non-blocking flag to a rank that cannot terminate early (let's call it the root rank). Then, instead of a blocking MPI_Allreduce, you could have every rank send its value to the root rank.
The root rank could post non-blocking receives for a possible flag and for the value; every rank would have to send one or the other. Once all ranks are accounted for, you can do the reduction on the root rank, remove the exited ranks from the communication, and broadcast the result.
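A rough sketch of that idea for a single iteration is below, with blocking receives on the root for brevity (the answer suggests non-blocking ones). The tags, the encoding of {value, rank} as a pair of doubles, and the point-to-point "broadcast" back to the surviving ranks are all illustrative assumptions.

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define TAG_VALUE 1   /* message carries a {val, rank} pair             */
#define TAG_EXIT  2   /* message means "this rank is about to finalize" */

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    srand(rank + 1);

    double my[2] = { rand() / (double)RAND_MAX, (double)rank };

    if (rank != 0) {                    /* rank 0 is the root and never exits early */
        if (rand() % 10 == 0) {         /* this rank decides to terminate           */
            MPI_Send(my, 0, MPI_DOUBLE, 0, TAG_EXIT, MPI_COMM_WORLD);
            MPI_Finalize();
            return EXIT_SUCCESS;
        }
        MPI_Send(my, 2, MPI_DOUBLE, 0, TAG_VALUE, MPI_COMM_WORLD);
        double min[2];
        MPI_Recv(min, 2, MPI_DOUBLE, 0, TAG_VALUE, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank %d: minimum %.3f came from rank %d\n", rank, min[0], (int)min[1]);
    } else {
        double min[2] = { my[0], my[1] };
        int active[size];               /* which ranks are still alive (C99 VLA) */
        active[0] = 1;
        for (int src = 1; src < size; src++) {
            double buf[2];
            MPI_Status st;
            /* exactly one message per rank: either a value or an exit notification */
            MPI_Recv(buf, 2, MPI_DOUBLE, src, MPI_ANY_TAG, MPI_COMM_WORLD, &st);
            active[src] = (st.MPI_TAG == TAG_VALUE);
            if (active[src] && buf[0] < min[0]) {
                min[0] = buf[0];
                min[1] = buf[1];
            }
        }
        /* "broadcast" the result with point-to-point sends to the survivors only */
        for (int dst = 1; dst < size; dst++)
            if (active[dst])
                MPI_Send(min, 2, MPI_DOUBLE, dst, TAG_VALUE, MPI_COMM_WORLD);
    }
    MPI_Finalize();
    return EXIT_SUCCESS;
}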
If your ranks exit without notice, I am not sure what options you have.

the error of MPI initialization

I have a question about parallelism. I have a portion of code to which I have applied the concept of parallelism, and this part of the code must be repeated N times in a loop. However, I cannot initialize MPI inside the loop, because it shows "MPI_Init(89): Cannot call MPI_INIT or MPI_INIT_THREAD more than once", and if I initialize it before the loop, each process will handle the whole loop, which is not the goal.
for (int i = 0; i < N; i++)
{
    the parallel area
}
What I want is that, for every i in the loop, the K processes execute the parallel area.
The canonical way to distribute calculations in your case is to run the loop in all MPI processes and make each rank except rank 0 skip the serial parts:
// Obtain the rank
int rank;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
for (int i = 0; i < N; i++)
{
    if (rank == 0)
    {
        // Some serial calculations
    }
    //
    // Parallel part
    //
    if (rank == 0)
    {
        // More serial calculations within the loop
    }
}
If you need to communicate data produced in rank 0 in the serial calculations, you could either use point-to-point operations or collectives like MPI_Bcast or MPI_Scatter at the beginning of the parallel part. You could then bring the distributed data back to rank 0 for further processing in the second serial part with MPI_Gather or MPI_Reduce at the end of the parallel block.
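A minimal sketch of such a loop, with made-up data and a plain sum standing in for the real serial and parallel work:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size, N = 4, n = 1000;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double *data = malloc(n * sizeof(double));

    for (int i = 0; i < N; i++) {
        if (rank == 0) {
            /* serial part: rank 0 produces the data for this iteration */
            for (int j = 0; j < n; j++)
                data[j] = i + j * 0.001;
        }

        /* parallel part: share the data, then each rank works on its slice */
        MPI_Bcast(data, n, MPI_DOUBLE, 0, MPI_COMM_WORLD);
        double local = 0.0, total = 0.0;
        for (int j = rank; j < n; j += size)
            local += data[j];
        MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0) {
            /* more serial work on the combined result */
            printf("iteration %d: total = %f\n", i, total);
        }
    }
    free(data);
    MPI_Finalize();
    return 0;
}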
Another canonical approach is to use the master-worker pattern where one process (the master) distributes work items to a set of worker processes that are simply spinning in a loop of receive work -> process work -> return results.
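A compact sketch of that master-worker pattern, where a "work item" is just an integer to square (the tags and the squaring task are illustrative assumptions; run it with at least two processes):

#include <stdio.h>
#include <mpi.h>

#define TAG_WORK 1
#define TAG_STOP 2

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int n_items = 20;

    if (rank == 0) {                                  /* master */
        int next = 0, busy = 0, sum = 0;
        /* deal one item to every worker (or stop it if there is nothing left) */
        for (int w = 1; w < size; w++) {
            if (next < n_items) {
                MPI_Send(&next, 1, MPI_INT, w, TAG_WORK, MPI_COMM_WORLD);
                next++;
                busy++;
            } else {
                int dummy = 0;
                MPI_Send(&dummy, 1, MPI_INT, w, TAG_STOP, MPI_COMM_WORLD);
            }
        }
        /* collect results and hand out remaining items to whoever finishes first */
        while (busy > 0) {
            int result;
            MPI_Status st;
            MPI_Recv(&result, 1, MPI_INT, MPI_ANY_SOURCE, TAG_WORK,
                     MPI_COMM_WORLD, &st);
            sum += result;
            if (next < n_items) {
                MPI_Send(&next, 1, MPI_INT, st.MPI_SOURCE, TAG_WORK, MPI_COMM_WORLD);
                next++;
            } else {
                int dummy = 0;
                MPI_Send(&dummy, 1, MPI_INT, st.MPI_SOURCE, TAG_STOP, MPI_COMM_WORLD);
                busy--;
            }
        }
        printf("master: sum of squares of 0..%d = %d\n", n_items - 1, sum);
    } else {                                          /* worker */
        for (;;) {
            int item;
            MPI_Status st;
            MPI_Recv(&item, 1, MPI_INT, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &st);
            if (st.MPI_TAG == TAG_STOP)
                break;
            int result = item * item;                 /* the "work": square the item */
            MPI_Send(&result, 1, MPI_INT, 0, TAG_WORK, MPI_COMM_WORLD);
        }
    }
    MPI_Finalize();
    return 0;
}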
As to the multiple initialisation of MPI, one could check if it has already been done:
int done_already;
MPI_Initialized(&done_already);
if (!done_already)
    MPI_Init(NULL, NULL);
Note that once you finalise MPI by calling MPI_Finalize, it cannot be reinitialised for the duration of the current execution of the program.
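The companion call MPI_Finalized lets you guard against the opposite situation, for example in cleanup code that might run after MPI has already been shut down:

int finished;
MPI_Finalized(&finished);
if (!finished)
    MPI_Finalize();   // only finalise if that has not been done already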

Changing value of a variable with MPI

#include<stdio.h>
#include<mpi.h>
int a=1;
int *p=&a;
int main(int argc, char **argv)
{
    MPI_Init(&argc,&argv);
    int rank,size;
    MPI_Comm_rank(MPI_COMM_WORLD,&rank);
    MPI_Comm_size(MPI_COMM_WORLD,&size);
    //printf("Address val: %u \n",p);
    *p=*p+1;
    MPI_Barrier(MPI_COMM_WORLD);
    MPI_Finalize();
    printf("Value of a : %d\n",*p);
    return 0;
}
Here I am trying to execute the program with 3 processes, where each tries to increment the value of a by 1, so the value at the end of execution of all processes should be 4. Why, then, is the value printed as only 2 at the printf statement after MPI_Finalize()? And isn't it the case that the parallel execution stops at MPI_Finalize(), so there should be only one process running after it? Then why do I get the print statement 3 times, once for each process, during execution?
It is a common misunderstanding to think that mpi_init starts up the requested number of processes (or whatever mechanism is used to implement MPI) and that mpi_finalize stops them. It is better to think of mpi_init as starting the MPI system on top of a set of operating-system processes. The MPI standard is silent on what MPI actually runs on top of and how the underlying mechanism is started. In practice a call to mpiexec (or mpirun) is likely to fire up the requested number of processes, all of which are alive when the program starts. It is also likely that the processes will continue to live after the call to mpi_finalize, until the program finishes.
This means that prior to the call to mpi_init, and after the call to mpi_finalize it is likely that there is a number of o/s processes running, each of them executing the same program. This explains why you get the printf statement executed once for each of your processes.
As to why the value of a is set to 2 rather than to 4, well, essentially you are running n copies of the same program (where n is the number of processes) each of which adds 1 to its own version of a. A variable in the memory of one process has no relationship to a variable of the same name in the memory of another process. So each process sets a to 2.
To get any data from one process to another the processes need to engage in message-passing.
EDIT, in response to OP's comment
Just as a variable in the memory of one process has no relationship to a variable of the same name in the memory of another process, a pointer (which is a kind of variable) has no relationship to a pointer of the same name in the memory of another process. Do not be fooled: even if the "same" pointer holds the "same" address in multiple processes, those addresses are in different address spaces and are not the same; the pointers don't point to the same place.
An analogy: 1 High Street, Toytown is not the same address as 1 High Street, Legotown; there is a coincidence in names across address spaces.
To get any data (pointer or otherwise) from one process to another, the processes need to engage in message passing. You seem to be clinging to the notion that MPI processes share memory in some way. They don't; let go of that notion.
Since MPI only gives you the option to communicate between separate processes, you have to do message passing. For your purpose there is something like MPI_Allreduce, which can sum data over the separate processes. Note that this adds the values, so in your case you want to sum the increment, and add the sum to *p afterwards:
int inc = 1;
MPI_Allreduce(MPI_IN_PLACE, &inc, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);
*p += inc;
In your implementation there is no communication between the spawned processes. Each process has its own int a variable, which it increments and prints to the screen. Making the variable global doesn't make it shared between processes, and all the pointer gimmicks show me that you don't know what you are doing. I would suggest learning a little more about C and operating systems before you move on.
Anyway, you have to make the processes communicate. Here's what an example might look like:
#include<stdio.h>
#include<mpi.h>
// this program will count the number of spawned processes in a *very* bad way
int main(int argc, char **argv)
{
    int partial = 1;
    int sum;
    int my_id = 0;
    // let's just assume the process with id 0 is root
    int root_process = 0;
    // spawn processes, etc.
    MPI_Init(&argc,&argv);
    // every process learns its id
    MPI_Comm_rank(MPI_COMM_WORLD, &my_id);
    // all processes add their 'partial' to the 'sum'
    MPI_Reduce(&partial, &sum, 1, MPI_INT, MPI_SUM, root_process, MPI_COMM_WORLD);
    // de-init MPI
    MPI_Finalize();
    // the root process communicates the summation result
    if (my_id == root_process)
    {
        printf("Sum total : %d\n", sum);
    }
    return 0;
}

Resources