Every processor holds a list called range containing numbers. I want to determine the maximum of every row across these lists.
The first four lists belong to processors P0-P3. The red list contains the maximum values of each row, which every processor gets after MPI_Allreduce.
Here is a working version of my code:
#include <stdlib.h>
#include <time.h>
#include <stdio.h>
#include <mpi.h>
//#define KEY_MAX 100
typedef struct{
    int myrank;
    int numprocs;
    int *range;
} SubDomainKeyTree;
void compRange(SubDomainKeyTree *s, int myrank, int numprocs){
    s->myrank = myrank;
    s->numprocs = numprocs;
    // Allocate memory for (numprocs+1) ranges
    s->range = malloc((numprocs+1) * sizeof(int));
    // Compute range values
    for(int p=0; p<=numprocs; p++){
        s->range[p] = rand()%100;
    }
    for(int p=0; p<s->numprocs; p++){
        if(s->myrank == p){
            for(int k=0; k<=s->numprocs; k++){
                printf("Processor %d: %d random number is %d\n", p, k, s->range[k]);
            }
            printf("\n");
        }
    }
}
void compDynRange(SubDomainKeyTree *s){
    int rangeForAll[s->numprocs+1];
    //////////////////////////////////
    // This is not really efficient //
    //////////////////////////////////
    for(int r=0; r<=s->numprocs; r++){
        MPI_Allreduce(&s->range[r], &rangeForAll[r], 1, MPI_INT, MPI_MAX, MPI_COMM_WORLD);
    }
    for(int p=0; p<s->numprocs; p++){
        if(s->myrank == p){
            for(int k=0; k<=s->numprocs; k++){
                s->range[k] = rangeForAll[k];
                printf("Processor %d: %d random number after MPI_Allreduce is %d\n", p, k, s->range[k]);
            }
            printf("\n");
        }
    }
}
int main(int argc, char **argv){
    int nameLen;
    char processorName[MPI_MAX_PROCESSOR_NAME];
    int myrank;   // Rank of processor
    int numprocs; // Number of processes
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
    MPI_Get_processor_name(processorName, &nameLen);
    srand((unsigned)time(NULL)+myrank*numprocs+nameLen);
    SubDomainKeyTree s;
    compRange(&s, myrank, numprocs);
    compDynRange(&s);
    MPI_Finalize();
    return 0;
}
I use a for-loop, which seems highly inefficient to me: it computes the maximum of every row of all lists one after another.
But can I use MPI_Allreduce without that for-loop?
I already tried the following instead of the for-loop, but it does not work:
MPI_Allreduce(&s->range, &rangeForAll, s->numprocs+1, MPI_INT, MPI_MAX, MPI_COMM_WORLD);
Can someone give me a hint on how to do that?
As already hinted in a comment, the error in your code was that instead of passing the arrays containing your send and receive buffers, you were passing pointers to them. I imagine that this error simply came from the change from a single element used initially (like &s->range[r]), which was perfectly correct, to the full array by just removing the indexed access (i.e. &s->range), which was wrong.
So as explained, using:
MPI_Allreduce(s->range, rangeForAll, s->numprocs+1, MPI_INT, MPI_MAX, MPI_COMM_WORLD)
just does the trick. However, since you want the results to end up in the s->range arrays rather than in the temporary rangeForAll ones, you are better off not defining the latter at all, and instead using the MPI_IN_PLACE keyword as the sending parameter and s->range as the receiving one. The call becomes:
MPI_Allreduce(MPI_IN_PLACE, s->range, s->numprocs+1, MPI_INT, MPI_MAX, MPI_COMM_WORLD)
and s->range acts both as sending and receiving buffer. Therefore, the final results will all be in the s->range buffers after the call, sparing you the need of doing the copy explicitly.
Related
As an example, lets say I have
int a = ...;
int b = ...;
int c;
where a is the result of some complex local calculation and b is some metric for the quality of a.
I'd like to send the best value of a to every process and store it in c where best is defined by having the largest value of b.
I guess I'm just wondering if there is a more efficient way of doing this than doing an allgather on a and b and then searching through the resulting arrays.
The actual code involves sending and comparing several hundred values on up to several hundred or thousand processes, so any efficiency gains would be welcome.
I guess I'm just wondering if there is a more efficient way of doing
this than doing an allgather on a and b and then searching through the
resulting arrays.
This can be achieved with only a single MPI_Allreduce.
I will present two approaches: a simpler one, suitable for your use case, and a more generic one for more complex use cases. The latter will also be useful to showcase MPI functionality such as custom MPI datatypes and custom MPI reduction operators.
Approach 1
To represent
int a = ...;
int b = ...;
you could use the following struct:
typedef struct MyStruct {
    int b;
    int a;
} S;
then you can use the MPI datatype MPI_2INT and the MPI operator MPI_MAXLOC:
The operator MPI_MINLOC is used to compute a global minimum and also
an index attached to the minimum value. MPI_MAXLOC similarly computes
a global maximum and index. One application of these is to compute a
global minimum (maximum) and the rank of the process containing this
value.
In your case, instead of the rank we will be using the value of a. Hence, the MPI_Allreduce call:
S local, global;
...
MPI_Allreduce(&local, &global, 1, MPI_2INT, MPI_MAXLOC, MPI_COMM_WORLD);
The complete code would look like the following:
#include <stdio.h>
#include <mpi.h>
typedef struct MyStruct {
    int b;
    int a;
} S;
int main(int argc, char *argv[]){
    MPI_Init(NULL, NULL); // Initialize the MPI environment
    int world_rank;
    int world_size;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);
    // Some fake data
    S local, global;
    local.a = world_rank;
    local.b = world_size - world_rank;
    MPI_Allreduce(&local, &global, 1, MPI_2INT, MPI_MAXLOC, MPI_COMM_WORLD);
    if(world_rank == 0){
        printf("%d %d\n", global.b, global.a);
    }
    MPI_Finalize();
    return 0;
}
Second Approach
MPI_MAXLOC only works with a certain number of predefined datatypes. Nonetheless, for the remaining cases you can use the following approach (based on this SO thread):
Create a struct that will contain the values a and b;
Create a custom MPI_Datatype representing that struct, to be sent across processes;
Use MPI_Allreduce:
int MPI_Allreduce(const void *sendbuf, void *recvbuf, int count,
MPI_Datatype datatype, MPI_Op op, MPI_Comm comm)
Combines values from all processes and distributes the result back to
all processes
Use the operation MAX;
I'd like to send the best value of 'a' to every process and store it in
'c' where best is defined by having the largest value of 'b'.
Then you have to tell MPI to only consider the element b of the struct. Hence, you need to create a custom MPI_Op max operation.
Coding the approach
So let us break down the aforementioned implementation step by step:
First, define the struct:
typedef struct MyStruct {
    double a, b;
} S;
Second, create the custom MPI_Datatype (note that offsetof requires <stddef.h>):
void defineStruct(MPI_Datatype *tstype) {
    const int count = 2;
    int blocklens[count];
    MPI_Datatype types[count];
    MPI_Aint disps[count];
    for (int i=0; i < count; i++){
        types[i] = MPI_DOUBLE;
        blocklens[i] = 1;
    }
    disps[0] = offsetof(S, a);
    disps[1] = offsetof(S, b);
    MPI_Type_create_struct(count, blocklens, disps, types, tstype);
    MPI_Type_commit(tstype);
}
Very Important
Note that since we are using a struct you have to be careful with the fact that (source)
the C standard allows arbitrary padding between the fields.
So reducing a struct with two doubles is NOT the same as reducing an array with two doubles.
In the main you have to do:
MPI_Datatype structtype;
defineStruct(&structtype);
Third, create the custom max operation. The element a has to be carried along with b; otherwise the reduction would propagate the maximum b but leave a untouched:
void max_struct(void *in, void *inout, int *len, MPI_Datatype *type){
    S *invals = in;
    S *inoutvals = inout;
    for (int i=0; i < *len; i++){
        if (invals[i].b > inoutvals[i].b){
            // Keep the pair whose b is larger, so a travels with it
            inoutvals[i].a = invals[i].a;
            inoutvals[i].b = invals[i].b;
        }
    }
}
in the main do:
MPI_Op maxstruct;
MPI_Op_create(max_struct, 1, &maxstruct);
Finally, call MPI_Allreduce:
S local, global;
...
MPI_Allreduce(&local, &global, 1, structtype, maxstruct, MPI_COMM_WORLD);
The entire code put together:
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
typedef struct MyStruct {
    double a, b;
} S;
void max_struct(void *in, void *inout, int *len, MPI_Datatype *type){
    S *invals = in;
    S *inoutvals = inout;
    for (int i=0; i<*len; i++){
        if (invals[i].b > inoutvals[i].b){
            // Keep the pair whose b is larger, so a travels with it
            inoutvals[i].a = invals[i].a;
            inoutvals[i].b = invals[i].b;
        }
    }
}
void defineStruct(MPI_Datatype *tstype) {
    const int count = 2;
    int blocklens[count];
    MPI_Datatype types[count];
    MPI_Aint disps[count];
    for (int i=0; i < count; i++) {
        types[i] = MPI_DOUBLE;
        blocklens[i] = 1;
    }
    disps[0] = offsetof(S, a);
    disps[1] = offsetof(S, b);
    MPI_Type_create_struct(count, blocklens, disps, types, tstype);
    MPI_Type_commit(tstype);
}
int main(int argc, char *argv[]){
    MPI_Init(NULL, NULL); // Initialize the MPI environment
    int world_rank;
    int world_size;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);
    MPI_Datatype structtype;
    MPI_Op maxstruct;
    S local, global;
    defineStruct(&structtype);
    MPI_Op_create(max_struct, 1, &maxstruct);
    // Just some random values
    local.a = world_rank;
    local.b = world_size - world_rank;
    MPI_Allreduce(&local, &global, 1, structtype, maxstruct, MPI_COMM_WORLD);
    if(world_rank == 0){
        double c = global.a;
        printf("%f %f\n", global.b, c);
    }
    MPI_Finalize();
    return 0;
}
You can pair the value of b with the rank of the process to find the rank that contains the maximum value of b. The MPI_DOUBLE_INT type is very useful for this purpose. You can then broadcast a from this rank in order to have the value at each process.
#include <mpi.h>
#include <stdlib.h>
#include <stdio.h>
int main(int argc, char *argv[])
{
    int my_rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    // Create random a and b on each rank.
    srand(123 + my_rank);
    double a = rand() / (double)RAND_MAX;
    double b = rand() / (double)RAND_MAX;
    struct
    {
        double value;
        int rank;
    } s_in, s_out;
    s_in.value = b;
    s_in.rank = my_rank;
    printf("before: %d, %f, %f\n", my_rank, a, b);
    // Find the maximum value of b and the corresponding rank.
    MPI_Allreduce(&s_in, &s_out, 1, MPI_DOUBLE_INT, MPI_MAXLOC, MPI_COMM_WORLD);
    b = s_out.value;
    // Broadcast from the rank with the maximum value.
    MPI_Bcast(&a, 1, MPI_DOUBLE, s_out.rank, MPI_COMM_WORLD);
    printf("after: %d, %f, %f\n", my_rank, a, b);
    MPI_Finalize();
}
My goal is to take an array of 6 integers and split them among 3 processes. However, the numbers in receiveBuffer are not correct. I don't know why the three processes don't contain integers from the original array.
#include <stdio.h>
#include <math.h>
#include <stdlib.h>
#include <time.h>
#include <assert.h>
#include "mpi.h"
#define ARRAY_SIZE 6
// simple print array method
void printArray(int arr[], int size)
{
    int i;
    for (i=0; i < size; i++)
        printf("%d ", arr[i]);
    printf("\n");
}
int main (int argc, char *argv[])
{
    srand(time(NULL));
    int array[ARRAY_SIZE];
    int rank, numNodes;
    // fill array with random numbers and print
    for(int i = 0; i < ARRAY_SIZE; i++)
        array[i] = rand();
    printArray(array, ARRAY_SIZE);
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &numNodes);
    int receiveBuffer[ARRAY_SIZE/numNodes];
    if(rank == 0)
    {
        MPI_Scatter(array, ARRAY_SIZE/numNodes, MPI_INT, &receiveBuffer, 0, MPI_INT, rank, MPI_COMM_WORLD);
    }
    printf("ID: %d with %d items.\n", rank, ARRAY_SIZE/numNodes);
    printArray(receiveBuffer, ARRAY_SIZE/numNodes);
    MPI_Finalize();
    return 0;
}
Additionally, why is the original array printing for each process? Doesn't parallelization begin after INIT?
There are two issues with your usage of MPI_Scatter():
MPI_Scatter() is a collective operation, and hence has to be invoked by all the ranks of the communicator (not only by rank zero);
since you use MPI_INT for both the send and receive datatypes, the send and receive counts should be equal (i.e. use ARRAY_SIZE/numNodes instead of 0).
The MPI standard does not specify what happens before MPI_Init(), and it is very common that mpirun spawns all the tasks, so they all execute the code before MPI_Init(). That's why MPI_Init() is generally invoked at the very beginning of an MPI program.
I'm new to MPI and I'm trying to develop a non-blocking program (with Isend and Irecv). The functionality is very basic (it's educational):
There is one process (rank 0) which is the master and receives messages from the slaves (ranks 1 to P-1). The master only receives results.
The slaves generate an array of N random numbers between 0 and R and then do some operations with those numbers (again, it's just for educational purposes, the operations don't make any sense).
This whole process (operations + sending data) is done M times (this is just for comparing different implementations: blocking and non-blocking).
I get a segmentation fault in the master process when I call the MPI_Waitall() function.
#include <stdio.h>
#include <stdlib.h>
#include "mpi.h"
#include <math.h>
#include <time.h>
#define M 1000 // Number of times
#define N 2000 // Quantity of random numbers
#define R 1000 // Max value of random numbers
double SumaDeRaices (double*);
int main(int argc, char* argv[]) {
    int yo;   /* rank of process */
    int p;    /* number of processes */
    int dest; /* rank of receiver */
    /* Start up MPI */
    MPI_Init(&argc, &argv);
    /* Find out process rank */
    MPI_Comm_rank(MPI_COMM_WORLD, &yo);
    /* Find out number of processes */
    MPI_Comm_size(MPI_COMM_WORLD, &p);
    MPI_Request reqs[p-1];
    MPI_Status stats[p-1];
    if (yo == 0) {
        int i,j;
        double result;
        clock_t inicio,fin;
        inicio = clock();
        for(i = 0; i<M; i++){ // M times
            for(j = 1; j<p; j++){ // for every slave
                MPI_Irecv(&result, sizeof(double), MPI_DOUBLE, j, i, MPI_COMM_WORLD, &reqs[j-1]);
            }
            MPI_Waitall(p-1, reqs, stats); // wait for all slaves (SEG_FAULT)
        }
        fin = clock()-inicio;
        printf("Tiempo total de ejecucion %f segundos \n", ((double)fin)/CLOCKS_PER_SEC);
    }
    else {
        double* numAleatorios = (double*) malloc( sizeof(double) * ((double) N) ); // array with numbers
        int i,j;
        double resultado;
        dest=0;
        for(i=0; i<M; i++){ // again, M times
            for(j=0; j<N; j++){
                numAleatorios[j] = rand() % R;
            }
            resultado = SumaDeRaices(numAleatorios);
            MPI_Isend(&resultado, sizeof(double), MPI_DOUBLE, dest, i, MPI_COMM_WORLD, &reqs[p-1]); // send result to master
        }
    }
    /* Shut down MPI */
    MPI_Finalize();
    exit(0);
} /* main */
double SumaDeRaices (double* valores){
    int i;
    double sumaTotal = 0.0;
    // Square roots of the values, summed up
    for(i=0; i<N; i++){
        sumaTotal = sqrt(valores[i]) + sumaTotal;
    }
    return sumaTotal;
}
There are several issues with your code. First and foremost, in your Isend you pass &resultado several times without waiting until the previous non-blocking operation has finished. You are not allowed to reuse the buffer you pass to Isend before you make sure the operation has finished.
Instead, I recommend using a normal Send, because in contrast to a synchronous send (Ssend), a normal blocking send returns as soon as you can reuse the buffer.
Second, there is no need to use message tags. I recommend just setting the tag to 0. In terms of performance it is simply faster.
Third, result shouldn't be a simple variable, but an array of size at least (p-1).
Fourth, I do not recommend allocating arrays on the stack, like MPI_Request and MPI_Status, if the size is not a known small number. In this case the size of the array is (p-1), so you'd better use malloc for this data structure.
Fifth, if you do not check the statuses, use MPI_STATUSES_IGNORE.
Also, instead of sizeof(double) you should specify the number of items (1).
But of course the absolute best version is just to use MPI_Gather.
Moreover, generally there is no reason not to run the computations on the root node as well.
Here is a slightly rewritten example:
#include <stdio.h>
#include <stdlib.h>
#include "mpi.h"
#include <math.h>
#include <time.h>
#define M 1000 // Number of times
#define N 2000 // Quantity of random numbers
#define R 1000 // Max value of random numbers
double SumaDeRaices (double* valores)
{
    int i;
    double sumaTotal = 0.0;
    // Square roots of the values, summed up
    for(i=0; i<N; i++) {
        sumaTotal = sqrt(valores[i]) + sumaTotal;
    }
    return sumaTotal;
}
int main(int argc, char* argv[]) {
    int yo; /* rank of process */
    int p;  /* number of processes */
    /* Start up MPI */
    MPI_Init(&argc, &argv);
    /* Find out process rank */
    MPI_Comm_rank(MPI_COMM_WORLD, &yo);
    /* Find out number of processes */
    MPI_Comm_size(MPI_COMM_WORLD, &p);
    double *result;
    clock_t inicio, fin;
    double *numAleatorios;
    if (yo == 0) {
        inicio = clock();
    }
    numAleatorios = (double*) malloc( sizeof(double) * N ); // array with numbers
    result = (double *) malloc(sizeof(double) * p);
    for(int i = 0; i<M; i++){ // M times
        for(int j=0; j<N; j++) {
            numAleatorios[j] = rand() % R;
        }
        double local_result = SumaDeRaices(numAleatorios);
        MPI_Gather(&local_result, 1, MPI_DOUBLE, result, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD); // send results to root
    }
    if (yo == 0) {
        fin = clock()-inicio;
        printf("Tiempo total de ejecucion %f segundos \n", ((double)fin)/CLOCKS_PER_SEC);
    }
    free(numAleatorios);
    free(result);
    /* Shut down MPI */
    MPI_Finalize();
} /* main */
I am new to this area and use OpenMPI and C. I am trying to find out why my code leads to a segmentation fault. I have already read a lot about MPI but did not find any help. It has already taken me hours, so I decided to ask here for help.
I get the expected result of my code, but I also get an error message every time.
Is my usage of MPI_Scatter correct in this case?
Here is my simple code:
#include <stdio.h>
#include "mpi.h"
#include <stdlib.h>
const int MASTER_RANK = 0;
#define DIM 3
int main(int argc, char* argv[])
{
    int numProc, rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &numProc);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    int n = 9;
    double *m;
    double *m_client;
    m_client = (double *)malloc(3);
    if(rank == MASTER_RANK)
    {
        m = (double* )malloc(n);
        for(int i=0; i<n; i++)
        {
            m[i] = (double)i+1.0;
        }
    }
    MPI_Scatter(m, 3, MPI_DOUBLE, m_client, 3, MPI_DOUBLE, MASTER_RANK, MPI_COMM_WORLD);
    printf("Process %d:\n", rank);
    for(int i=0; i < 3; i++)
    {
        printf(" (%lf", m_client[i]);
        m_client[i] += 1000*rank;
        printf(" -> %lf)", m_client[i]);
        printf("\n");
    }
    printf( "\n" );
    MPI_Gather(m_client, 3, MPI_DOUBLE, m, 3, MPI_DOUBLE, MASTER_RANK, MPI_COMM_WORLD);
    if(rank == MASTER_RANK)
    {
        printf("Master: Received= \n");
        for(int i=0; i<numProc; i++)
        {
            for(int j=0; j < 3; j++)
            {
                int idx = i*3 + j;
                printf("%lf ", m[idx]);
            }
            printf("from Process %d\n", i);
        }
    }
    free(m);
    free(m_client);
    MPI_Finalize();
    exit(0);
}
I build my MPI file using mpicc mpifile.c -o mpifile and run it with mpirun -np 3 ./mpifile, i.e. with 3 processes.
The error I get is:
[Samuel-Z97-HD3:14361] *** Process received signal ***
[Samuel-Z97-HD3:14361] Signal: Segmentation fault (11)
[Samuel-Z97-HD3:14361] Signal code: (128)
[Samuel-Z97-HD3:14361] Failing at address: (nil)
I am using Ubuntu and vim / Geany.
Your code has two problems.
Both your malloc() calls use wrong sizes. You should allocate a number of bytes, not a number of doubles. E.g. instead of calling malloc(3), call malloc(3*sizeof(double)).
Another problem is that your variable m should be initialized to NULL. Alternatively, you could surround free(m) with if(rank == MASTER_RANK). As is, a non-master process calls free(m) where m is uninitialized and could contain an arbitrary value.
I want to write a code in which:
the P0 processor reads an array from the keyboard and sends it to the P1 processor;
the P1 processor prints all of the values to the screen. For example:
[P0]: Enter the size of array: 1
[P0]: Enter the elements of array: 3
[P1]: The array is: 3
[P0]: Enter the size of array: 3
[P0]: Enter the elements of array: 5 7 5
[P1]: The array is: 5 7 5
.
.
.
and here is my first attempt. Too many faults, I think. But I'm new and want to learn how to code.
#include <stdio.h>
#include <mpi.h>
#define n 100
int main(int argc, char *argv[]){
    int my_rank, size_cm;
    int value, i;
    int dizi[n];
    double senddata[n];
    double recvdata[n];
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size_cm);
    value=0;
    if(my_rank == 0){
        printf("[%d]: Enter the size of array: ",&my_rank);
        scanf("%d",value);
        printf("[%d]: Enter the elements of array",&my_rank);
        for(i=0; i<n; i++){
            scanf("%d", &dizi[n]);
            senddata[0] = dizi[];
            MPI_Send(senddata, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        }
    }
    if(my_rank == 1){
        MPI_Recv(recvdata, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
        printf("[%d]: The array is: %d ",&my_rank, dizi[n]);
    }
    MPI_Finalize();
    return 0;
}
To get a minimal example that compiles, I added the missing argument to MPI_Recv():
MPI_Status status;
MPI_Recv(recvdata, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,&status);
I also modified senddata[0] = dizi[]; to senddata[0] = dizi[i];
As I tried to compile the code you provided, I got a warning:
format ‘%d’ expects argument of type ‘int’, but argument 2 has type ‘int *’
The function scanf() needs a pointer to the data in order to modify it, so int a; scanf("%d",&a); is correct. But printf() just needs the data, since it will not modify it: int a; printf("%d",a); is the right way to go.
If you want the array to be populated, use scanf("%d", &dizi[i]);, not scanf("%d", &dizi[n]);. n is the length of the array dizi. Hence, the index n is outside the array, since array indexes start at 0. This can trigger undefined behavior (strange values, a segmentation fault, or even a correct result!).
Since MPI_Send() is called inside for(i=0; i<n; i++), process 0 tries to send n messages to process 1, but process 1 only receives one. Hence, process 0 will be blocked at i=1, waiting for process 1 to receive the second message. This is a deadlock.
I assume you are trying to send an array from process 0 to process 1. The following code, based on yours, should do the trick. The actual length of the array is n_arr:
#include <stdio.h>
#include <mpi.h>
#include <stdlib.h>
#define n 100
int main(int argc, char *argv[]){
    int my_rank, size_cm;
    int n_arr;
    int i;
    int dizi[n];
    MPI_Status status;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size_cm);
    if(my_rank == 0){
        // fflush(stdout) because the standard output is buffered...
        printf("[%d]: Enter the size of array: ", my_rank); fflush(stdout);
        if(scanf("%d", &n_arr) != 1){ fprintf(stderr, "input error\n"); exit(1); }
        if(n_arr > n){
            fprintf(stderr, "The size of the array is too large\n"); exit(1);
        }
        printf("[%d]: Enter the elements of array: ", my_rank); fflush(stdout);
        for(i=0; i<n_arr; i++){
            if(scanf("%d", &dizi[i]) != 1){ fprintf(stderr, "input error\n"); exit(1); }
        }
        // sending the length of the array
        MPI_Send(&n_arr, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        // sending the array
        MPI_Send(dizi, n_arr, MPI_INT, 1, 0, MPI_COMM_WORLD);
    }
    if(my_rank == 1){
        // receiving the length of the array
        MPI_Recv(&n_arr, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
        // receiving the array
        MPI_Recv(dizi, n_arr, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
        printf("[%d]: The array of size %d is: ", my_rank, n_arr);
        for(i=0; i<n_arr; i++){
            printf("%d ", dizi[i]);
        }
        printf("\n");
    }
    MPI_Finalize();
    return 0;
}
It is compiled by running mpicc main.c -o main and run with mpirun -np 2 ./main.
I added some checks for correct input (always a good thing) and handled the case of n_arr being larger than n=100. The latter could be avoided by using malloc() to allocate memory for the array: that part is left to you!