The reference for MPI_Init states:
This routine must be called by one thread only. That thread is called the main thread and must be the thread that calls MPI_Finalize.
How do I do this? Every example I have seen looks like this, and in my code I tried:
MPI_Comm_rank(MPI_COMM_WORLD, &mpirank);
bool mpiroot = (mpirank == 0);
if(mpiroot)
MPI_Init(&argc, &argv);
but I got:
Attempting to use an MPI routine before initializing MPICH
However, it works fine if I leave it as in the example; I only re-checked because my own code failed here.
My thinking is that because we call mpiexec -n 4 ./test, 4 processes are spawned, so all of them call MPI_Init. I printed a message at the very first line of main(), and it appears as many times as there are processes.
MPI_Init must be the first MPI function called by your MPI program. It must be called by each process. Note that a process is not the same as a thread! If you go on to spawn threads from a process, those threads must not call MPI_Init again.
So your program should be something like this:
int main(int argc, char **argv)
{
MPI_Init(&argc, &argv);
int mpirank;
MPI_Comm_rank(MPI_COMM_WORLD, &mpirank);
// No more calls to MPI_Init in here
...
MPI_Finalize();
}
Related
I'm messing around with openMPI, and I have a weird bug.
It seems that even after MPI_Finalize(), each of the threads keeps running.
I have followed a guide for a simple Hello World program, and it looks like this:
#include <mpi.h>
#include <stdio.h>
int main(int argc, char** argv) {
// Initialize the MPI environment
MPI_Init(NULL, NULL);
// Get the number of processes
int world_size;
MPI_Comm_size(MPI_COMM_WORLD, &world_size);
// Get the rank of the process
int world_rank;
MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
// Get the name of the processor
char processor_name[MPI_MAX_PROCESSOR_NAME];
int name_len;
MPI_Get_processor_name(processor_name, &name_len);
// Print off a hello world message
printf("Hello world from processor %s, rank %d"
" out of %d processors\n",
processor_name, world_rank, world_size);
// Finalize the MPI environment.
MPI_Finalize();
printf("This is after finalize");
}
Notice the last printf()... This should only be printed once, since the parallel part is finalized, right?!
However, the output from this program, if I run it with 6 processors for example, is:
mpirun -np 6 ./hello_world
Hello world from processor ubuntu, rank 2 out of 6 processors
Hello world from processor ubuntu, rank 1 out of 6 processors
Hello world from processor ubuntu, rank 3 out of 6 processors
Hello world from processor ubuntu, rank 0 out of 6 processors
Hello world from processor ubuntu, rank 4 out of 6 processors
Hello world from processor ubuntu, rank 5 out of 6 processors
This is after finalize...
This is after finalize...
This is after finalize...
This is after finalize...
This is after finalize...
This is after finalize...
Am I misunderstanding how MPI works? Should each thread/process not be stopped by the finalize?
This is just undefined behavior.
The number of processes running after this routine is called is
undefined; it is best not to perform much more than a return rc after
calling MPI_Finalize.
http://www.mpich.org/static/docs/v3.1/www3/MPI_Finalize.html
The MPI standard only requires that rank 0 return from MPI_FINALIZE. I won't copy the entire text here because it's rather lengthy, but you can find it in version 3.0 of the standard (the latest for a few more days) in Chapter 8, Section 8.7 (Startup), on pages 359-361. Here are the most relevant parts:
Although it is not required that all processes return from MPI_FINALIZE, it is required that at least process 0 in MPI_COMM_WORLD return, so that users can know that the MPI portion of the computation is over. In addition, in a POSIX environment, users may desire to supply an exit code for each process that returns from MPI_FINALIZE.
There's even an example that's trying to do exactly what you said:
Example 8.10 The following illustrates the use of requiring that at least one process return and that it be known that process 0 is one of the processes that return. One wants code like the following to work no matter how many processes return.
...
MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
...
MPI_Finalize();
if (myrank == 0) {
resultfile = fopen("outfile","w");
dump_results(resultfile);
fclose(resultfile);
}
exit(0);
The MPI standard doesn't say anything else about the behavior of an application after calling MPI_FINALIZE. All this function is required to do is clean up internal MPI state, complete communication operations, etc. While it's certainly possible (and allowed) for MPI to kill the other ranks of the application after a call to MPI_FINALIZE, in practice, that is almost never the way that it is done. There's probably a counter example, but I'm not aware of it.
When I started with MPI, I had the same problem with MPI_Init and MPI_Finalize. I thought that the code between these two calls runs in parallel and the code outside them runs serially. Finally I saw this answer and figured out how it actually works.
J Teller's answer:
https://stackoverflow.com/a/2290951/893863
int main(int argc, char *argv[]) {
MPI_Init(&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
MPI_Comm_rank(MPI_COMM_WORLD,&myid);
if (myid == 0) { // Do the serial part on a single MPI thread
printf("Performing serial computation on cpu %d\n", myid);
PreParallelWork();
}
ParallelWork(); // Every MPI thread will run the parallel work
if (myid == 0) { // Do the final serial part on a single MPI thread
printf("Performing the final serial computation on cpu %d\n", myid);
PostParallelWork();
}
MPI_Finalize();
return 0;
}
Currently I'm trying to create a master-slave program with a "listener" loop where the master waits for a message from slaves to make a decision from there. But, despite using non-blocking MPI routines, I am experiencing an error. Do I need to use some blocking routine?
#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int main(int argc, char** argv)
{
// Variable Declarations
int rank, size;
MPI_Request *requestList,requestNull;
MPI_Status status;
// Start MPI
MPI_Init(&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD, &size);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
if( rank == 0 )
{
// Process Zero
int dataOut=13, pr;
float dataIn = -1;
requestList =(MPI_Request*)malloc((size-1)*sizeof(MPI_Request));
while(1){
dataIn = -1;
// We do NOT need to wait for the MPI_Isend(s), it is the job of the receiver processes.
for(pr=1;pr<size;pr++)
{
MPI_Irecv(&dataIn,1,MPI_FLOAT,pr,1,MPI_COMM_WORLD,&(requestList[pr-1]));
}
if((dataIn > 1.5)){
printf("From the process: %f\n", dataIn);
break;
}
}
}
else
{
// Receiver Process
float message;
int index;
//MPI_Request request;
MPI_Status status;
while(1){
message = random()/(double)1147483648;
// Send the message back to the process zero
MPI_Isend(&message,1,MPI_FLOAT,0,1,MPI_COMM_WORLD, &requestNull);
if(message > 1.5)
break;
}
}
MPI_Finalize();
return 0;
}
The problem seems to be that you're never waiting for your MPI calls to complete.
The way non-blocking calls work in MPI is that when you issue a non-blocking call (like MPI_IRECV), the last parameter is an MPI_REQUEST object. When the initiating call (MPI_IRECV) returns, that request object holds the information for the non-blocking operation. However, the operation itself isn't complete yet, and you have no guarantee that it is complete until you use a completion call (MPI_WAIT, MPI_TEST, and friends) on the request.
In your case, you are probably using the non-blocking calls unnecessarily, since you rely on the data received by MPI_IRECV in the very next line. You should probably just replace your non-blocking call with a blocking MPI_RECV call and use MPI_ANY_SOURCE so you don't have to post a separate receive for each rank in the communicator. Alternatively, you can use MPI_WAITANY to complete your non-blocking calls, but then you'll need to worry about cleaning up all of the extra operations when you finish.
#include<stdio.h>
#include<mpi.h>
int a=1;
int *p=&a;
int main(int argc, char **argv)
{
MPI_Init(&argc,&argv);
int rank,size;
MPI_Comm_rank(MPI_COMM_WORLD,&rank);
MPI_Comm_size(MPI_COMM_WORLD,&size);
//printf("Address val: %u \n",p);
*p=*p+1;
MPI_Barrier(MPI_COMM_WORLD);
MPI_Finalize();
printf("Value of a : %d\n",*p);
return 0;
}
Here, I am trying to execute the program with 3 processes, where each tries to increment the value of a by 1, so the value at the end of execution of all processes should be 4. Then why is the value printed as only 2 at the printf statement after MPI_Finalize()? And isn't it the case that parallel execution stops at MPI_Finalize() and only one process should be running after it? Then why do I get the print statement 3 times, once for each process, during execution?
It is a common misunderstanding to think that mpi_init starts up the requested number of processes (or whatever mechanism is used to implement MPI) and that mpi_finalize stops them. It's better to think of mpi_init starting the MPI system on top of a set of operating-system processes. The MPI standard is silent on what MPI actually runs on top of and how the underlying mechanism(s) is/are started. In practice a call to mpiexec (or mpirun) is likely to fire up a requested number of processes, all of which are alive when the program starts. It is also likely that the processes will continue to live after the call to mpi_finalize until the program finishes.
This means that prior to the call to mpi_init, and after the call to mpi_finalize it is likely that there is a number of o/s processes running, each of them executing the same program. This explains why you get the printf statement executed once for each of your processes.
As to why the value of a is set to 2 rather than to 4, well, essentially you are running n copies of the same program (where n is the number of processes) each of which adds 1 to its own version of a. A variable in the memory of one process has no relationship to a variable of the same name in the memory of another process. So each process sets a to 2.
To get any data from one process to another the processes need to engage in message-passing.
EDIT, in response to OP's comment
Just as a variable in the memory of one process has no relationship to a variable of the same name in the memory of another process, a pointer (which is a kind of variable) has no relationship to a pointer of the same name in the memory of another process. Do not be fooled: if the "same" pointer has the "same" address in multiple processes, those addresses are in different address spaces and are not the same; the pointers don't point to the same place.
An analogy: 1 High Street, Toytown is not the same address as 1 High Street, Legotown; there is a coincidence in names across address spaces.
To get any data (pointer or otherwise) from one process to another the processes need to engage in message-passing. You seem to be clinging to a notion that MPI processes share memory in some way. They don't, let go of that notion.
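The point about separate address spaces can be tried without MPI. The following is a minimal sketch, assuming a POSIX system: it uses fork() instead of an MPI launcher (an assumption made only so it runs without an MPI installation; mpiexec is not required to create processes this way), and the function name increment_in_two_processes is hypothetical. Both processes start with a equal to 1, each increments its own copy, and each ends up with 2.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int a = 1;  /* global, like the variable in the question */

/*
 * Forks one child. The child increments its copy of `a` and reports the
 * result through its exit status. The parent then increments its own copy.
 * Returns 0 and stores both results on success, -1 if fork or wait fails.
 */
int increment_in_two_processes(int *parent_value, int *child_value)
{
    assert(parent_value != NULL && child_value != NULL);

    a = 1;
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        a = a + 1;  /* changes only the child's copy */
        _exit(a);   /* the exit status carries the child's value */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;

    a = a + 1;      /* the parent's copy was untouched by the child */
    *parent_value = a;
    *child_value = WEXITSTATUS(status);
    return 0;
}
```

Calling increment_in_two_processes(&parent, &child) should leave both parent and child at 2, never 3, because the child's increment is invisible to the parent. The same address of a in both processes refers to different memory.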
Since MPI only gives you the option to communicate between separate processes, you have to do message passing. For your purpose there is MPI_Allreduce, which can sum data across the separate processes. Note that this adds the values, so in your case you want to sum the increments and then add the sum to *p:
int inc = 1;
MPI_Allreduce(MPI_IN_PLACE, &inc, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);
*p += inc;
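To see why this gives 4 with 3 processes, here is a small worked sketch of the arithmetic only. It does not call MPI; sum_contributions and value_after_reduction are hypothetical helper names that mimic what every rank would hold after a sum reduction over one int per rank.

```c
#include <assert.h>
#include <stddef.h>

/* Sum of all contributions: what a sum reduction hands back to every rank. */
int sum_contributions(const int *contributions, size_t count)
{
    assert(contributions != NULL || count == 0);

    int total = 0;
    for (size_t i = 0; i < count; ++i)
        total += contributions[i];
    return total;
}

/* Value of `a` on each rank once the reduced increment is added to the start value. */
int value_after_reduction(int start, const int *increments, size_t count)
{
    return start + sum_contributions(increments, count);
}
```

With three ranks each contributing an increment of 1 and a starting at 1, value_after_reduction(1, increments, 3) gives 1 + 3 = 4 on every rank, which is the result the question expected.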
In your implementation there is no communication between the processes. Each process has its own int a variable which it increments and prints to the screen. Making the variable global doesn't make it shared between processes and all the pointer gimmicks show me that you don't know what you are doing. I would suggest learning a little more C and Operating Systems before you move on.
Anyway, you have to make the processes communicate. Here's what an example might look like:
#include<stdio.h>
#include<mpi.h>
// this program will count the number of spawned processes in a *very* bad way
int main(int argc, char **argv)
{
int partial = 1;
int sum;
int my_id = 0;
// let's just assume the process with id 0 is root
int root_process = 0;
// start up MPI
MPI_Init(&argc,&argv);
// every process learns its id
MPI_Comm_rank(MPI_COMM_WORLD, &my_id);
// all processes add their 'partial' to the 'sum'
MPI_Reduce(&partial, &sum, 1, MPI_INT, MPI_SUM, root_process, MPI_COMM_WORLD);
// de-init MPI
MPI_Finalize();
// the root process communicates the summation result
if (my_id == root_process)
{
printf("Sum total : %d\n", sum);
}
return 0;
}
I'm studying a bit of MPI and decided to do a test by writing a program split across object files, e.g. main.c -> main program, function.c -> any function.
Only function.c uses MPI. I compile as follows:
gcc -c main.c
to create main.o, and mpicc -c function.c to create function.o. Of course I create the file function.h too.
I link with mpicc -o program main.o function.o
Here is main.c
#include <stdio.h>
#include "function.h"
void main(int argc, char *argv[])
{
printf("Hello\n");
function();
printf("Bye\n");
}
Only function has the MPI code, but when I run the program with mpiexec -np 2 I get
Hello
Hello
----- function job here -----
Bye
Bye
But I wanted it to be
Hello
------ function job -----
Bye
What can I do?
Your whole program is run by both of the processes you requested with -np 2. A common way to prevent duplicate printouts, final results, etc., is to have a single process do those things by checking its rank first. Like:
int id;
MPI_Comm_rank(MPI_COMM_WORLD, &id);
if (id == 0) {
printf("only process %d does this one\n", id);
}
printf("hello from process %d\n", id); // all processes do this one
When starting out in MPI I found it helpful to print out those rank numbers along with whatever partial results or data each process was dealing with. It helped me make more sense of what was happening.
Basically mpirun -np 2 starts 2 identical processes and you have to use MPI_Comm_rank function to check process rank.
Here is a quick snippet:
int main(int argc, char **argv)
{
int myrank;
MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
if (myrank == 0) {
printf("Hello\n");
function();
MPI_Barrier(MPI_COMM_WORLD);
printf("Done\n");
} else {
function();
MPI_Barrier(MPI_COMM_WORLD);
}
MPI_Finalize();
return 0;
}
I generally prefer this method for printing data.
It involves barriers. So you must be careful while using it.
if(1)
do
for(i = 0 to num_threads)
do
if(i==my_rank)
do
do_printf
done
******* barrier ********
end for
done
If the set of threads printing the value does not include all threads, just add the relevant threads to barrier.
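The pseudocode above can be sketched in C as follows. To keep the sketch runnable without an MPI installation, the barrier is passed in as a function pointer; barrier_fn, print_in_rank_order and dry_run_barrier are hypothetical names, and dry_run_barrier only counts calls. In a real MPI program the barrier argument would be a small wrapper whose body is MPI_Barrier(MPI_COMM_WORLD).

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for a collective barrier such as MPI_Barrier(MPI_COMM_WORLD). */
typedef void (*barrier_fn)(void);

/* Hypothetical barrier for dry runs without MPI: it only counts calls. */
int dry_run_barrier_calls = 0;
void dry_run_barrier(void) { ++dry_run_barrier_calls; }

/*
 * Every rank walks through all turns; a rank prints only on its own turn,
 * and all ranks meet at the barrier after each turn.
 * Returns how many times this rank printed (1 for a valid rank).
 */
int print_in_rank_order(FILE *out, int my_rank, int num_ranks,
                        const char *message, barrier_fn barrier)
{
    assert(out != NULL && message != NULL && barrier != NULL);
    assert(my_rank >= 0 && my_rank < num_ranks);

    int printed = 0;
    for (int turn = 0; turn < num_ranks; ++turn) {
        if (turn == my_rank) {
            fprintf(out, "[rank %d] %s\n", my_rank, message);
            fflush(out);  /* push the line out before the others proceed */
            ++printed;
        }
        barrier();        /* every rank must reach this, printing or not */
    }
    return printed;
}
```

Each rank calls the barrier once per turn, so num_ranks barriers are paid for a single round of output. The barrier orders the printf calls themselves; since the launcher usually forwards each process's output separately, the lines may still occasionally show up out of order on the terminal.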
Another method is for every thread to write its output to a dedicated file. This way:
you don't have to go through a barrier
you do not lose the printfs of any thread
your output is explicit, so there is no clutter while debugging programs.
Code :
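// needs <stdio.h>, <fcntl.h> and <unistd.h>
sprintf(my_op_file_str, "output%d", myThreadID);
close(1); // free file descriptor 1 (stdout)
open(my_op_file_str, O_WRONLY | O_CREAT | O_TRUNC, 0644); // takes the lowest free descriptor, which is now 1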
Now use printf's anywhere you may like.
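A slightly more defensive variant of the same idea is sketched below, using freopen from the C standard library instead of close and open. The helper names rank_log_name and redirect_stdout_to_rank_file are hypothetical; the "output%d" file naming follows the snippet above, and the rank is assumed to come from MPI_Comm_rank.

```c
#include <assert.h>
#include <stdio.h>

/* Builds "output<rank>" into buf. Returns 0 on success, -1 if it does not fit. */
int rank_log_name(char *buf, size_t size, int rank)
{
    assert(buf != NULL && size > 0);

    int n = snprintf(buf, size, "output%d", rank);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Points stdout at this rank's own file. Returns 0 on success, -1 on failure. */
int redirect_stdout_to_rank_file(int rank)
{
    char name[64];

    if (rank_log_name(name, sizeof name, rank) != 0)
        return -1;
    return freopen(name, "w", stdout) != NULL ? 0 : -1;
}
```

A process would call redirect_stdout_to_rank_file(rank) once, right after it learns its rank, and check the return value; every later printf in that process then lands in its own file.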
This is a pretty basic MPI question, but I can't wrap my head around it. I have a main function that calls another function that uses MPI. I want the main function to execute in serial, and the other function to execute in parallel. My code is like this:
int main (int argc, char *argv[])
{
//some serial code goes here
parallel_function(arg1, arg2);
//some more serial code goes here
}
void parallel_function(int arg1, int arg2)
{
//init MPI and do some stuff in parallel
MPI_Init(NULL, NULL);
MPI_Comm_size(MPI_COMM_WORLD, &p);
MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
//now do some parallel stuff
//....
//finalize to end MPI??
MPI_Finalize();
}
My code runs fine and gets the expected output, but the issue is that the main function is also being run in separate processes and so the serial code executes more than once. I don't know how it's running multiple times, because I haven't even called MPI_Init yet (if I printf in main before I call parallel_function, I see multiple printf's)
How can I stop my program running in parallel after I'm done?
Thanks for any responses!
Have a look at this answer.
Short story: MPI_Init and MPI_Finalize do not mark the beginning and end of parallel processing. MPI processes run in parallel in their entirety.
@suszterpatt is correct to state that "MPI processes run in parallel in their entirety". When you run a parallel program using, for example, mpirun or mpiexec, this starts the number of processes you requested (with the -n flag) and each process begins execution at the start of main. So in your example code
int main (int argc, char *argv[])
{
//some serial code goes here
parallel_function(arg1, arg2);
//some more serial code goes here
}
every process will execute the //some serial code goes here and //some more serial code goes here parts (and of course they will all call parallel_function). There isn't one master process which calls parallel_function and then spawns other processes once MPI_Init is called.
Generally it is best to avoid doing what you are doing: MPI_Init should be one of the first function calls in your program (ideally it should be the first). In particular, take note of the following (from here):
The MPI standard does not say what a program can do before an MPI_INIT or after an MPI_FINALIZE. In the MPICH implementation, you should do as little as possible. In particular, avoid anything that changes the external state of the program, such as opening files, reading standard input or writing to standard output.
Not respecting this can lead to some nasty bugs.
It is better practice to rewrite your code to something like the following:
int main (int argc, char *argv[])
{
int p, my_rank;
// Initialise MPI
MPI_Init(NULL, NULL);
MPI_Comm_size(MPI_COMM_WORLD, &p);
MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
// Serial part: executed only by process with rank 0
if (my_rank==0)
{
// Some serial code goes here
}
// Parallel part: executed by all processes.
// Serial part: executed only by process with rank 0
if (my_rank==0)
{
// Some more serial code goes here
}
// Finalize MPI
MPI_Finalize();
return 0;
}
Note: I am not a C programmer, so use the above code with care. Also, shouldn't main always return something, especially when defined as int main()?