#include <stdio.h>
#include <mpi.h>

int a = 1;
int *p = &a;

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    //printf("Address val: %u \n", p);
    *p = *p + 1;
    MPI_Barrier(MPI_COMM_WORLD);
    MPI_Finalize();
    printf("Value of a : %d\n", *p);
    return 0;
}
Here, I am trying to execute the program with 3 processes, where each one tries to increment the value of a by 1, so the value at the end of execution of all processes should be 4. Why, then, is the value printed as only 2 at the printf statement after MPI_Finalize()? And isn't it the case that parallel execution stops at MPI_Finalize(), so there should be only one process running after it? Then why do I get the print statement 3 times, once for each process, during execution?
It is a common misunderstanding to think that mpi_init starts up the requested number of processes (or whatever mechanism is used to implement MPI) and that mpi_finalize stops them. It's better to think of mpi_init as starting the MPI system on top of a set of operating-system processes. The MPI standard is silent on what MPI actually runs on top of and how the underlying mechanism(s) is/are started. In practice a call to mpiexec (or mpirun) is likely to fire up a requested number of processes, all of which are alive when the program starts. It is also likely that the processes will continue to live after the call to mpi_finalize, until the program finishes.
This means that prior to the call to mpi_init, and after the call to mpi_finalize, there are likely a number of OS processes running, each of them executing the same program. This explains why the printf statement gets executed once for each of your processes.
As to why the value of a is set to 2 rather than to 4, well, essentially you are running n copies of the same program (where n is the number of processes) each of which adds 1 to its own version of a. A variable in the memory of one process has no relationship to a variable of the same name in the memory of another process. So each process sets a to 2.
To get any data from one process to another the processes need to engage in message-passing.
EDIT, in response to OP's comment
Just as a variable in the memory of one process has no relationship to a variable of the same name in the memory of another process, a pointer (which is a kind of variable) has no relationship to a pointer of the same name in the memory of another process. Do not be fooled: even if the "same" pointer has the "same" address in multiple processes, those addresses are in different address spaces and are not the same; the pointers do not point to the same place.
An analogy: 1 High Street, Toytown is not the same address as 1 High Street, Legotown; there is a coincidence in names across address spaces.
To get any data (pointer or otherwise) from one process to another the processes need to engage in message-passing. You seem to be clinging to a notion that MPI processes share memory in some way. They don't; let go of that notion.
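To see this concretely, here is a minimal sketch (my own illustration, not the original program): each rank increments and prints its own copy of a together with its address. The value is always 2, and whatever the printed addresses look like, they belong to separate address spaces.
#include <stdio.h>
#include <mpi.h>

int a = 1;
int *p = &a;

int main(int argc, char **argv)
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    *p = *p + 1;   /* each process increments its own copy of a */
    printf("rank %d: a = %d at address %p\n", rank, *p, (void *)p);
    MPI_Finalize();
    return 0;
}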
Since MPI only gives you the option to communicate between separate processes, you have to do message passing. For your purpose there is something like MPI_Allreduce, which can sum data over the separate processes. Note that this adds up the values, so in your case you want to sum the increments and add the sum to *p afterwards:
int inc = 1;
MPI_Allreduce(MPI_IN_PLACE, &inc, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);
*p += inc;
In your implementation there is no communication between the spawned processes. Each process has its own int a variable, which it increments and prints to the screen. Making the variable global doesn't make it shared between processes, and all the pointer gimmicks suggest some confusion about what is going on. I would suggest learning a little more C and operating systems before you move on.
Anyway, you have to make the processes communicate. Here's what an example might look like:
#include <stdio.h>
#include <mpi.h>

// this program will count the number of spawned processes in a *very* bad way
int main(int argc, char **argv)
{
    int partial = 1;
    int sum;
    int my_id = 0;
    // let's just assume the process with id 0 is root
    int root_process = 0;
    // spawn processes, etc.
    MPI_Init(&argc, &argv);
    // every process learns its id
    MPI_Comm_rank(MPI_COMM_WORLD, &my_id);
    // all processes add their 'partial' to the 'sum'
    MPI_Reduce(&partial, &sum, 1, MPI_INT, MPI_SUM, root_process, MPI_COMM_WORLD);
    // de-init MPI
    MPI_Finalize();
    // the root process communicates the summation result
    if (my_id == root_process)
    {
        printf("Sum total : %d\n", sum);
    }
    return 0;
}
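If you run it with, say, mpiexec -n 4 ./count (a hypothetical build name and invocation), the root should print Sum total : 4, one unit contributed by each process.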
Say I have 2 processes, P1 and P2, and both P1 and P2 are printing an array of 1000 data points. As we know, we can't guarantee anything about the order of output: it may be that P1 prints its data first followed by P2, or vice versa, or the two outputs may get interleaved. Now say I want to output the values of P1 first, followed by P2. Is there any way I can guarantee that?
I am attaching a Minimal Reproducible Example, in which the output gets mixed, herewith:
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);
    int myrank, size; //size will take care of number of processes
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (myrank == 0)
    {
        int a[1000];
        for (int i = 0; i < 1000; i++)
        {
            a[i] = i + 1;
        }
        for (int i = 0; i < 1000; i++)
        {
            printf(" %d", a[i]);
        }
    }
    if (myrank == 1)
    {
        int a[1000];
        for (int i = 0; i < 1000; i++)
        {
            a[i] = i + 1;
        }
        for (int i = 0; i < 1000; i++)
        {
            printf(" %d", a[i]);
        }
    }
    MPI_Finalize();
    return 0;
}
The only way I can think of to output the data sequentially is sending the data from, say, P1 to P0 and then printing it all from P0. But then we incur the extra cost of sending data from one process to another.
You have some additional options:
pass a token. Processes can block waiting for a message, print whatever they need to, then send to the next rank (see the sketch after this list).
let something else deal with ordering. Each process prefixes its rank to the output, and then you can sort the output by rank.
let's say this was a file. Each rank could compute where it should write, and then everybody can carry out a write to the file at the correct location in parallel (which is what MPI-IO apps will do).
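A minimal sketch of the token-passing idea (my own illustration; the tag and the message content are arbitrary): each rank waits for a message from the previous rank, prints its data, then signals the next rank.
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size, token = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank > 0)   /* wait until the previous rank has finished printing */
        MPI_Recv(&token, 1, MPI_INT, rank - 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    for (int i = 0; i < 1000; i++)
        printf(" %d", i + 1);
    printf("\n");
    fflush(stdout);

    if (rank < size - 1)   /* pass the token to the next rank */
        MPI_Send(&token, 1, MPI_INT, rank + 1, 0, MPI_COMM_WORLD);

    MPI_Finalize();
    return 0;
}
Bear in mind the caveat quoted further down: this only gives truly ordered output when the launcher funnels every rank's stdout through the same place and forwards it in order.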
Now say I want to output the values of P1 first followed by P2. Is
there any way by which I can guarantee that?
This is not how MPI is meant to be used; actually, not how parallelism in general is meant to be used, IMO. Coordinating the printing of output to the console among processes will greatly degrade the performance of the parallel version, which defeats one of the purposes of parallelism, i.e., reducing the overall execution time.
Most of the time one is better off just making one process responsible for printing the output to the console (typically the master process, i.e., the process with rank = 0).
Citing @Gilles Gouaillardet:
The only safe option is to send all the data to a given rank, and then print the data from that rank.
You could try using MPI_Barrier to coordinate the processes in a way that would print the output as you want; however (citing @Hristo Iliev):
Using barriers like that only works for local launches when (and if)
the processes share the same controlling terminal. Otherwise, it is
entirely to the discretion of the I/O redirection mechanism of the MPI
implementation.
If it is for debugging purposes, you can either use a good MPI-aware debugger that lets you look at the data of each process, or you can limit the output to one process at a time per run so that you can check whether all the processes have the data they should have.
I am trying to learn MPI programming and have written the following program. It adds an entire row of an array and outputs the sum. At rank 0 (or process 0), it calls all its slave ranks to do the calculation. I want to do this using only two other slave ranks/processes. Whenever I try to send to the same rank twice, as shown in the code below, my code just hangs in the middle and doesn't finish. If I don't send to the same rank twice, the code works correctly.
#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>
int main (int argc, char *argv[])
{
MPI_Init(&argc, &argv);
int world_rank;
MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
int world_size;
MPI_Comm_size(MPI_COMM_WORLD, &world_size);
int tag2 = 1;
int arr[30] = {0};
MPI_Request request;
MPI_Status status;
printf ("\n--Current Rank: %d\n", world_rank);
int index;
int source = 0;
int dest;
if (world_rank == 0)
{
int i;
printf("* Rank 0 excecuting\n");
index = 0;
dest = 1;
for ( i = 0; i < 30; i++ )
{
arr[ i ] = i + 1;
}
MPI_Send(&arr[0], 30, MPI_INT, dest, tag2, MPI_COMM_WORLD);
index = 0;
dest = 2;
for ( i = 0; i < 30; i++ )
{
arr[ i ] = 0;
}
MPI_Send(&arr[0], 30, MPI_INT, dest, tag2, MPI_COMM_WORLD);
index = 0;
dest = 2; //Problem happens here when I try to call the same destination(or rank 2) twice
//If I change this dest value to 3 and run using: mpirun -np 4 test, this code will work correctly
for ( i = 0; i < 30; i++ )
{
arr[ i ] = 1;
}
MPI_Send(&arr[0], 30, MPI_INT, dest, tag2, MPI_COMM_WORLD);
}
else
{
int sum = 0;
int i;
MPI_Irecv(&arr[0], 30, MPI_INT, source, tag2, MPI_COMM_WORLD, &request);
MPI_Wait (&request, &status);
for(i = 0; i<30; i++)
{
sum = arr[i]+sum;
}
printf("\nSum is: %d at rank: %d\n", sum, world_rank);
}
MPI_Finalize();
}
Result when using: mpirun -np 3 test
--Current Rank: 2
Sum is: 0 at rank: 2
--Current Rank: 0
* Rank 0 excecuting
--Current Rank: 1
Sum is: 524800 at rank: 1
//Program hangs here and wouldn't show sum of 30
Please let me know how I can send to the same rank twice, for example, when I only have two other slave processes available.
Please show with an example if possible.
In MPI, each process executes the same code and, as you're doing, you differentiate the processes primarily by checking their rank in if/else statements. The master process with rank 0 does 3 sends: a send to process 1, then two sends to process 2. The slave processes each do only one receive, which means that rank 1 receives its first message and rank 2 receives its first message. When you call the third MPI_Send on process 0, there is not and will not be any slave waiting to receive the message, because the slaves have already finished executing their else block. The program blocks while the master waits to send the final message.
In order to fix that, you have to make sure the slave of rank 2 performs two receives, either by adding a loop for that process only, or by repeating the following block of code for that process only (i.e., inside an if (world_rank == 2) check); a sketch of the resulting else branch follows the snippet:
sum = 0; //resetting sum
MPI_Irecv(&arr[0], 30, MPI_INT, source, tag2, MPI_COMM_WORLD, &request);
MPI_Wait(&request, &status);
for (i = 0; i < 30; i++)
{
    sum = arr[i] + sum;
}
printf("\nSum is: %d at rank: %d\n", sum, world_rank);
TL;DR: Just a remark that the master/slave approach isn't to be favoured and might durably shape a programmer's habits, leading to poor code when put into production.
Although Clarissa is perfectly correct and her answer is very clear, I'd like to add a few general remarks, not about the code itself, but about parallel computing philosophy and good habits.
First a quick preamble: when one wants to parallelise one's code, it can be for two main reasons: making it faster and/or permitting it to handle larger problems by overcoming the limitations (like memory limitation) found on a single machine. But in all cases, performance matters and I will always assume that MPI (or generally speaking parallel) programmers are interested and concerned by performance. So the rest of my post will suppose that you are.
Now the main reason for this post: over the past few days, I've seen a few questions here on SO about MPI and parallelisation, obviously coming from people eager to learn MPI (or OpenMP for that matter). This is great! Parallel programming is great and there'll never be enough parallel programmers. So I'm happy (and I'm sure many SO members are too) to answer questions that help people learn how to program in parallel. And in the context of learning how to program in parallel, you have to write some simple code, doing simple things, in order to understand what the API does and how it works. These programs might look stupid from a distance and very inefficient, but that's fine, that's how learning works. Everybody learned this way.
However, you have to keep in mind that these programs you write are only that: API learning exercises. They're not the real thing and they do not reflect the philosophy of what an actual parallel program is or should be. And what drives my answer here is that I've seen, here and in other questions and answers, the status of a "master" process and its "slaves" put forward recurrently. And that is wrong, fundamentally wrong! Let me explain why:
As Clarissa perfectly pinpointed, "in MPI, each process executes the same code". The idea is to find a way of making several processes interact so that they work together on solving a (possibly larger) problem (hopefully faster). But amongst these processes, none gets any special status; they are all equal. They are given an id so they can be addressed, but rank 0 is no better than rank 1 or rank 1025... By artificially deciding that process #0 is the "master" and the others are its "slaves", you break this symmetry, and that has consequences:
Now that rank #0 is the master, it commands, right? That's what a master does. So it will be the one getting the information necessary to run the code, it will distribute its share to the workers, and it will instruct them to do the processing. Then it will wait for the processing to be concluded (possibly keeping itself busy in between, but more likely just waiting or poking the workers, since that's what a master does), collect the results, do a bit of reassembling and output them. Job done! What's wrong with that?
Well, the following are wrong:
During the time the master gets the data, the slaves sit idle. This is sequential and inefficient processing...
Then the distribution of the data and of the work to do implies a lot of transfers. This takes time, and since it is solely between process #0 and all the others, it might create a lot of congestion on the network over a single link.
While the workers do their work, what should the master do? Work as well? If yes, it might not be readily available to handle requests from the slaves when they come, delaying the whole parallel processing. Wait for these requests? Then it wastes a lot of computing power by sitting idle... Ultimately, there is no good answer.
Then points 1 and 2 are repeated in reverse order, with the gathering of the results and the outputting of the results. That's a lot of data transfers and sequential processing, which will badly damage the global scalability, effectiveness and performance.
So I hope you now see why the master/slave approach is (usually, not always, but very often) wrong. And the danger I see from the questions and answers I've read over the past days is that you might get your mind formatted in this approach, as if it were the "normal" way of thinking in parallel. Well, it is not! Parallel programming is symmetry. It is handling the problem globally, in all places at the same time. You have to think parallel from the start and see your code as a global parallel entity, not just a bunch of processes that need to be instructed on what to do. Each process is its own master, dealing with its peers on an equal footing. Each process should (as much as possible) acquire its data by itself (making it a parallel process); decide what to do based on the number of peers involved in the processing and its id; exchange information with its peers when necessary, be it locally (point-to-point communications) or globally (collective communications); and issue its own share of the result (again leading to parallel processing)...
OK, that's a bit of an extreme requirement for people just starting to learn parallel programming, and I by no means want to tell you that your learning exercises should be like this. But keep the goal in mind and don't forget that API learning examples are only API learning examples, not reduced models of actual codes. So keep on experimenting with MPI calls to understand what they do and how they work, but try to tend slowly towards a symmetrical approach in your examples. That can only be beneficial for you in the long term.
Sorry for this lengthy and somewhat off topic answer, and good luck with your parallel programming.
I am trying to let one process handle all the printf operations so the printing happens in the order I want. I am trying to store the data that process 1 generates and let process 0 print the data that both process 0 and process 1 generate, but I only get what process 0 is generating.
Here is the relevant part of the code
char str3[33];
char str4[66];
MPI_Barrier(MPI_COMM_WORLD);
while (count < 5) {
    MPI_Recv(&count, 1, MPI_INT, (my_rank+1)%2, tag, MPI_COMM_WORLD, &status);
    char str[30];
    if (my_rank == 0)
        sprintf(str, "Process %d received the count\n", my_rank);
    if (my_rank == 1)
        sprintf(str3, "Process %d received the count\n", my_rank);
    count++;
    char str2[66];
    if (my_rank == 0)
        sprintf(str2, "Process %d incremented the count(%d) and sent it back to process %d\n", my_rank, count, (my_rank+1)%2);
    if (my_rank == 1)
        sprintf(str4, "Process %d incremented the count(%d) and sent it back to process %d\n", my_rank, count, (my_rank+1)%2);
    if (my_rank == 0) {
        printf(str3);
        printf(str4);
        printf(str);
        printf(str2);
        memset(str3, '\0', sizeof(str3));
        memset(str4, '\0', sizeof(str4));
        memset(str, '\0', sizeof(str));
        memset(str2, '\0', sizeof(str2));
    }
    MPI_Send(&count, 1, MPI_INT, (my_rank+1)%2, tag, MPI_COMM_WORLD);
}
First, note that, as pointed out by @grancis, your code appears to deadlock, since both processes 0 and 1 block in MPI_Recv before sending. Maybe you are sending data before entering the code snippet shown in the question, which allows the code to continue.
Anyway, the problem is that the buffers str3 and str4 are modified by process 1, so when process 0 tries to print them it obviously cannot print anything other than what these buffers originally contained (which is uninitialized memory before the first iteration, and zeros in the subsequent iterations since you used memset). Remember that MPI processes DO NOT share memory.
If you do not want process 1 to print its information itself, then process 1 must send that information to process 0 (through ordinary MPI_Send/MPI_Recv in MPI-1, or one-sided communication in MPI-2 or MPI-3), and only then can process 0 print it.
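As a rough sketch of that (my own illustration; the tags are arbitrary, and it assumes str and str2 have already been filled in by rank 0 inside the loop, as in the original code), the printing part of the loop could become:
if (my_rank == 1)
{
    /* rank 1 ships its formatted lines to rank 0 instead of printing them */
    MPI_Send(str3, sizeof(str3), MPI_CHAR, 0, 1, MPI_COMM_WORLD);
    MPI_Send(str4, sizeof(str4), MPI_CHAR, 0, 2, MPI_COMM_WORLD);
}
if (my_rank == 0)
{
    char buf3[33], buf4[66];
    MPI_Recv(buf3, sizeof(buf3), MPI_CHAR, 1, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    MPI_Recv(buf4, sizeof(buf4), MPI_CHAR, 1, 2, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    printf("%s", buf3);   /* what rank 1 produced */
    printf("%s", buf4);
    printf("%s", str);    /* what rank 0 produced itself */
    printf("%s", str2);
}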
#include <stdio.h>
#include <unistd.h>

int main()
{
    int i, n = 1;
    for (i = 0; i < n; i++) {
        fork();
        printf("Hello!");
    }
}
I am confused: if I put n=1, it prints Hello 2 times.
If n=2, it prints Hello 8 times
If n=3, it prints Hello 24 times..
and so on..
There's no single "formula" for how it's done, as different operating systems do it in different ways. But what fork() does is make a copy of the process. The rough steps that are usually involved:
Stop current process.
Make a new process, and initialize or copy related internal structures.
Copy process memory.
Copy resources, like open file descriptors, etc.
Copy CPU/FPU registers.
Resume both processes.
you don't only fork the 'main' process, you also fork the children!
first iteration:
m -> c1
//then
m -> c2    c1 -> c1.1
m -> c3    c1 -> c1.2    c2 -> c2.1    c1.1 -> c1.1.1
for i = ....
write it down this way:
main
fork
child(1)
fork child(2)
fork child(1.1)
fork child(3)
fork child(1.2)
fork child(2.1)
and so on ...
fork() always makes a perfect copy of the current process -- the only difference is the process ID.
Indeed, in most modern operating systems, immediately after fork() the two processes share exactly the same chunk of memory. The OS makes a new copy only when one of the processes writes to that memory (a mode of operation called "copy-on-write"); if neither process ever changes the memory, then they can carry on sharing that chunk, saving overall system memory.
But all of that is hidden from you as a programmer. Conceptually you can think of it as a perfect copy.
That copy includes all the values of variables at the moment the fork() happened. So if the fork() happens inside a for() loop, with i==2, then the new process will also be mid-way through a for() loop, with i==2.
However, the processes do not share i. When one process changes the value of i after the fork(), the other process's i is not affected.
To understand your program's results, modify the program so that you know which process is printing each line, and which iteration it is on.
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main() {
    int i, n = 4;
    for (i = 0; i < n; i++) {
        int pid = getpid();
        if (fork() == 0) {
            printf("New process forked from %d with i=%d and pid=%d\n", pid, i, getpid());
        }
        printf("Hello number %d from pid %d\n", i, getpid());
    }
}
Since the timing will vary, you'll get output in different orders, but you should be able to make sense of where all the "Hellos" are coming from, and why there are as many as there are.
When you fork() you create a new child process that has its own virtual memory but initially maps onto the same physical memory as its parent. At that point both processes can only read data from that memory. As soon as either of them wants to write to or change anything in that common memory, it gets its own physical copy of the affected pages, with write permission on them. Therefore, when you fork(), your child initially has the same value of i as its parent, but from then on that value changes separately in every child.
Yes, there is a formula:
fork() makes exactly one near-identical copy of the process. It returns twice, however: once in the original process (the parent, where it returns the child's pid) and once in the child (where it returns 0).
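A minimal sketch (my own illustration, not from the question) showing those two return values:
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();
    if (pid == 0) {
        /* child: fork() returned 0 */
        printf("child:  my pid is %d\n", (int)getpid());
    } else {
        /* parent: fork() returned the child's pid */
        printf("parent: fork() returned %d\n", (int)pid);
        wait(NULL);   /* wait for the child to finish */
    }
    return 0;
}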
Usually when we call MPI_Bcast, the root must be decided in advance. But now I have an application that does not care who broadcasts the message. That is, the node that will broadcast the message is random, but once the message is broadcast, a global variable must be made consistent. My understanding is that since MPI_Bcast is a collective function, all nodes need to call it, but the order may differ, so whichever node arrives at MPI_Bcast first will broadcast its message to the others. I ran the following code with 3 nodes. I thought that if node 1 (rank==1) arrives at MPI_Bcast first, it would send its local_count value (1) to the other nodes, and then all nodes would update global_count with the same local_count, so one of my expected results is (the output order does not matter):
node 0, global count is 1
node 1, global count is 1
node 2, global count is 1
But the actual result is always (the output order does not matter):
node 1, global count is 1
node 0, global count is 0
node 2, global count is 2
This result is exactly the same as with the code without MPI_Bcast. So is there anything wrong with my understanding of MPI_Bcast, or with my code? Thanks.
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size;
    int local_count, global_count;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    global_count = 0;
    local_count = rank;
    MPI_Bcast(&local_count, 1, MPI_INT, rank, MPI_COMM_WORLD);
    global_count += local_count;
    printf("node %d, global count is: %d\n", rank, global_count);
    MPI_Finalize();
}
The code is a simplified case. In my application, there is some computation before MPI_Bcast and I don't know who will finish the computation first. Whenever a node reaches the MPI_Bcast point, it needs to broadcast its own local result, and all nodes then update the global variable. So all nodes need to broadcast a message, but we don't know the order. How can I implement this idea?
Normally, what you have written would lead to a deadlock. In a typical MPI_Bcast case, the process with rank root sends its data to all other processes in the communicator. You need to specify the same root rank in those receiving processes so they know whom to "listen" to. This is an oversimplified description, since usually a hierarchical broadcast is used in order to reduce the total operation time, but with three processes this hierarchical implementation reduces to a very simple linear one. In your case, the process with rank 0 will try to send the same message to both processes 1 and 2. At the same time, process 1 will not receive that message but will instead try to send its own to processes 0 and 2. Process 2 will also be trying to send to 0 and 1. In the end, every process is sending messages that no other process is willing to receive. This is an almost sure recipe for disaster.
Why doesn't your program hang? Because the messages being sent are very small, only one MPI_INT element, and the number of processes is also small, so those sends are all buffered internally by the MPI library in every process; they never reach their destinations, but nevertheless the calls made internally by MPI_Bcast do not block and your code gets execution control back even though the operation is still in progress. This is undefined behaviour, since the MPI library is not required by the standard to buffer anything: some implementations might buffer, others might not.
If you are trying to compute the sum of all local_count variables in all processes, then just use MPI_Allreduce. Replace this:
global_count = 0;
local_count = rank;
MPI_Bcast(&local_count, 1, MPI_INT, rank, MPI_COMM_WORLD);
global_count += local_count;
with this:
local_count = rank;
MPI_Allreduce(&local_count, &global_count, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);
The correct usage of MPI_Bcast is that all processes call the function with the same root. Even if you don't care about who the broadcaster is, all the processes must call the function with the same root rank so they listen to the same broadcaster. In your code, each process calls MPI_Bcast with its own rank, all different from each other.
I haven't looked at the standard document, but it is likely that you trigger undefined behaviour by calling MPI_Bcast with different roots.
From the Open MPI docs:
"MPI_Bcast broadcasts a message from the process with rank root to all processes of the group, itself included. It is called by all members of group using the same arguments for comm, root. On return, the contents of root’s communication buffer has been copied to all processes. "
Calling it from ranks != root might cause some problems.
A safer way is to hand-code your own broadcast function and call that.
It's basically a for loop and an MPI_Send call, and shouldn't be too difficult to implement yourself.
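For what it's worth, a minimal sketch of such a hand-rolled linear broadcast (my own illustration, not a drop-in fix for the original problem, where MPI_Allreduce above is the better tool):
#include <mpi.h>

/* naive linear broadcast: 'root' sends its buffer to every other rank,
   everybody else receives from 'root' */
void my_bcast(int *buf, int count, int root, MPI_Comm comm)
{
    int rank, size, i;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);
    if (rank == root) {
        for (i = 0; i < size; i++)
            if (i != root)
                MPI_Send(buf, count, MPI_INT, i, 0, comm);
    } else {
        MPI_Recv(buf, count, MPI_INT, root, 0, comm, MPI_STATUS_IGNORE);
    }
}
Note that, just like MPI_Bcast itself, this still requires every rank to agree on who root is; writing the broadcast by hand does not remove that requirement.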