MPI Matrix Multiplication with Dynamic Allocation: Seg. Fault

MPI Matrix Multiplication with Dynamic Allocation: Seg. Fault - c

I'm making a matriz multiplication program in OpenMPI, and I got this error message:
[Mecha Liberta:12337] *** Process received signal ***
[Mecha Liberta:12337] Signal: Segmentation fault (11)
[Mecha Liberta:12337] Signal code: Address not mapped (1)
[Mecha Liberta:12337] Failing at address: 0xbfe4f000
--------------------------------------------------------------------------
mpirun noticed that process rank 1 with PID 12337 on node Mecha Liberta exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
That's how I define the matrices:
int **a, **b, **r;
a = (int **)calloc(l,sizeof(int));
b = (int **)calloc(l,sizeof(int));
r = (int **)calloc(l,sizeof(int));
for (i = 0; i < l; i++)
a[i] = (int *)calloc(c,sizeof(int));
for (i = 0; i < l; i++)
b[i] = (int *)calloc(c,sizeof(int));
for (i = 0; i < l; i++)
r[i] = (int *)calloc(c,sizeof(int));
And here's my Send/Recv (i'm pretty sure my problem should be here):
MPI_Send(&sent, 1, MPI_INT, dest, tag, MPI_COMM_WORLD);
MPI_Send(&lines, 1, MPI_INT, dest, tag, MPI_COMM_WORLD);
MPI_Send(&(a[sent][0]), lines*NCA, MPI_INT, dest, tag, MPI_COMM_WORLD);
MPI_Send(&b, NCA*NCB, MPI_INT, dest, tag, MPI_COMM_WORLD);
and:
MPI_Recv(&sent, 1, MPI_INT, 0, tag, MPI_COMM_WORLD, &status);
MPI_Recv(&lines, 1, MPI_INT, 0, tag, MPI_COMM_WORLD, &status);
MPI_Recv(&a, lines*NCA, MPI_INT, 0, tag, MPI_COMM_WORLD, &status);
MPI_Recv(&b, NCA*NCB, MPI_INT, 0, tag, MPI_COMM_WORLD, &status);
Can anyone see where is the problem?

This is a common problem with C and multidimensional arrays and MPI.
In this line, say:
MPI_Send(&b, NCA*NCB, MPI_INT, dest, tag, MPI_COMM_WORLD);
you're telling MPI to send NCAxNCB integers starting at b to dest,MPI_COMM_WORLD with tag tag. But, b isn't a pointer to NCAxNCB integers; it's a pointer to NCA pointers to NCB integers.
So what you want to do is to ensure your arrays are contiguous (probably better for performance anyway), using something like this:
int **alloc_2d_int(int rows, int cols) {
int *data = (int *)malloc(rows*cols*sizeof(int));
int **array= (int **)malloc(rows*sizeof(int*));
for (int i=0; i<rows; i++)
array[i] = &(data[cols*i]);
return array;
}
/* .... */
int **a, **b, **r;
a = alloc_2d_int(l, c);
b = alloc_2d_int(l, c);
r = alloc_2d_int(l, c);
and then
MPI_Send(&sent, 1, MPI_INT, dest, tag, MPI_COMM_WORLD);
MPI_Send(&lines, 1, MPI_INT, dest, tag, MPI_COMM_WORLD);
MPI_Send(&(a[sent][0]), lines*NCA, MPI_INT, dest, tag, MPI_COMM_WORLD);
MPI_Send(&(b[0][0]), NCA*NCB, MPI_INT, dest, tag, MPI_COMM_WORLD);
MPI_Recv(&sent, 1, MPI_INT, 0, tag, MPI_COMM_WORLD, &status);
MPI_Recv(&lines, 1, MPI_INT, 0, tag, MPI_COMM_WORLD, &status);
MPI_Recv(&(a[0][0]), lines*NCA, MPI_INT, 0, tag, MPI_COMM_WORLD, &status);
MPI_Recv(&(b[0][0]), NCA*NCB, MPI_INT, 0, tag, MPI_COMM_WORLD, &status);
should work more as expected.

Related

Error running with more than 16 tasks MPI_ERR_TRUNCATE: message truncated

I have this c code that calculates polynomial arrays, where I'm trying to run it from a cluster using MPI.
int main(int argc, char **argv)
{
int id;
int n;
int i, size, arraySize;
double *vet, valor, *vresp, resposta, tempo, a[GRAU + 1];
int hostsize;
char hostname[MPI_MAX_PROCESSOR_NAME];
MPI_Status status;
MPI_Init(&argc, &argv);
MPI_Get_processor_name(hostname, &hostsize);
MPI_Comm_rank(MPI_COMM_WORLD, &id);
MPI_Comm_size(MPI_COMM_WORLD, &n);
if (id == 0) // Master
{
MPI_Barrier(MPI_COMM_WORLD);
MPI_Bcast(&a, GRAU, MPI_DOUBLE, 0, MPI_COMM_WORLD);
for (size = TAM_INI; size <= TAM_MAX; size += TAM_INC)
{
tempo = -MPI_Wtime();
for (int dest = 1; dest < n; ++dest)
{
int ini = 0;
int fim = dest * size / (n - 1);
int tam = fim - ini;
MPI_Send(&ini, 1, MPI_INT, dest, 0, MPI_COMM_WORLD);
MPI_Send(&tam, 1, MPI_INT, dest, 0, MPI_COMM_WORLD);
MPI_Send(&x[ini], tam, MPI_DOUBLE, dest, 0, MPI_COMM_WORLD);
ini = fim;
fflush(stdout);
}
int total = 0;
for (int dest = 1; dest < n; ++dest)
{
int ini_escravo;
int tam_escravo;
MPI_Recv(&ini_escravo, 1, MPI_INT, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, &status);
MPI_Recv(&tam_escravo, 1, MPI_INT, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, &status);
MPI_Recv(&y[ini_escravo], tam_escravo, MPI_DOUBLE, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, &status);
}
tempo += MPI_Wtime();
}
}
else
{ // Slave
MPI_Barrier(MPI_COMM_WORLD);
MPI_Bcast(&a, GRAU, MPI_DOUBLE, 0, MPI_COMM_WORLD);
for (arraySize = TAM_INI; arraySize <= TAM_MAX; arraySize += TAM_INC)
{
int ini, tam;
MPI_Recv(&ini, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
MPI_Recv(&tam, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
MPI_Recv(&x[0], tam, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &status);
for (i = 0; i < tam; ++i)
y[i] = polinomio(a, GRAU, x[i]);
MPI_Send(&ini, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
MPI_Send(&tam, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
MPI_Send(&y[0], tam, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
fflush(stdout);
}
}
MPI_Finalize();
return 0;
}
The code works fine when I run using 16 tasks or less per node. If I try to run using 32 tasks (16 per node, with 2 nodes), I get the following message:
[06:272259] *** An error occurred in MPI_Recv [06:272259] *** reported
by process [2965045249,0] [06:272259] *** on communicator
MPI_COMM_WORLD [06:272259] *** MPI_ERR_TRUNCATE: message truncated
[06:272259] *** MPI_ERRORS_ARE_FATAL (processes in this communicator
will now abort, [06:272259] *** and potentially your MPI job)
[07][[45243,1],31][btl_tcp.c:559:mca_btl_tcp_recv_blocking] recv(20)
failed: Connection reset by peer (104)
Any idea about what I am missing here?

Sharing a dynamically allocated 2D array with MPI [duplicate]

This question already has answers here:
Sending and receiving 2D array over MPI
(3 answers)
Closed 2 years ago.
I am trying to share a dynamically allocated 2D array from a master thread to several other threads using MPI in c, from within a function.
A simplified representation of the relevant code is as follows:
//Initialize program, start up the desired number of threads.
//Master thread takes input from user, dynamically allocates and constructs 2d array.
//All threads call method analyze_inputs(**array), which takes the array as input (all threads other than master simply pass NULL as argument)
//The master thread shares the array, along with work division to all other threads:
{//Master thread
MPI_Send(&x, 1, MPI_INT, recievingThread, 0, MPI_COMM_WORLD);
MPI_Send(&y, 1, MPI_INT, recievingThread, 0, MPI_COMM_WORLD);
MPI_Send(&(array[0][0]), x*y, MPI_INT, recievingThread, 0, MPI_COMM_WORLD);
}
{//Subthreads
MPI_Recv(&x, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
MPI_Recv(&y, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
MPI_Recv(&(array[0][0]), x*y, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
}
This is a soulution i found on this site for sending dynamically allocated 2d arrays, but i get segmentation error for the array recieve.
How can i do this?
edit: Minimal reproducible example
#include <mpi.h>
#include <stdlib.h>
#include <stdio.h>
int analyze_inputs (int x, int y, int** array);
int main (int argc, char **argv)
{
int x = 10;
int y = 8;
int rank;
int **array = NULL;
MPI_Init (&argc, &argv);
MPI_Comm_rank (MPI_COMM_WORLD, &rank);
if (rank == 0)
{
array = malloc(x * sizeof(int*));
for (int i = 0; i < x; i++)
{
array[i] = malloc(y * sizeof(int));
}
for (int i = 0; i < x; i++)
{
for (int j = 0; j < y; j++)
{
array[i][j] = rand();
}
}
}
analyze_inputs(x,y,array);
MPI_Finalize ();
}
int analyze_inputs(int x,int y, int** array)
{
int rank, x_temp, y_temp, **array_temp;
MPI_Comm_rank (MPI_COMM_WORLD, &rank);
if (rank == 0)
{
MPI_Send(&x, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
MPI_Send(&y, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
MPI_Send(&(array[0][0]), x*y, MPI_INT, 1, 0, MPI_COMM_WORLD);
}
else
{
MPI_Recv(&x_temp, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
MPI_Recv(&y_temp, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
printf("Works to here.\n");
MPI_Recv(&(array_temp[0][0]), x_temp*y_temp, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
printf("Crashes before here.\n");
}
}

Each row of array is allocated separately in your code, so simple
MPI_Send(&(array[0][0]), x*y, MPI_INT, 1, 0, MPI_COMM_WORLD);
won't work in this case.
An simple solution is to allocate a single block of memory like this:
array = malloc(x * sizeof(int*));
array[0] = malloc(y * x * sizeof(int));
for (int i = 1; i < x; i++)
{
array[i] = array[0] + y * i;
}
And freeing this array will be
free(array[0]);
free(array);
Do not free array[1], array[2], ... in this case because they are already freed by free(array[0]);.

Vigenere on MPI error MPI_COMM_WORLD can not encrypt or decrypt

I'm newbie using open-mpi, i want to use MPI to solve viginere cipher problem, my problems are :
1. doesn't even line up
2. After i insert words and key that i want to encrypt, this error comes up
[mpi-VirtualBox:1646] *** An error occurred in MPI_Recv
[mpi-VirtualBox:1646] *** reported by process [2610495489,0]
[mpi-VirtualBox:1646] *** on communicator MPI_COMM_WORLD
[mpi-VirtualBox:1646] *** MPI_ERR_RANK: invalid rank
[mpi-VirtualBox:1646] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[mpi-VirtualBox:1646] *** and potentially your MPI job)
here is my code
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include<string.h>
#include <time.h>
void my_bcast(void* data, int count, MPI_Datatype datatype, int root,
MPI_Comm communicator) {
int world_rank;
MPI_Comm_rank(communicator, &world_rank);
int world_size;
MPI_Comm_size(communicator, &world_size);
if (world_rank == root) {
// If we are the root process, send our data to everyone
int i;
for (i = 0; i < world_size; i++) {
if (i != world_rank) {
MPI_Send(data, count, datatype, i, 0, communicator);
}
}
} else {
// If we are a receiver process, receive the data from the root
MPI_Recv(data, count, datatype, root, 0, communicator, MPI_STATUS_IGNORE);
}
}
int k,j,lenPesan, lenKunci;
char pesan[1000];
char kunci[1000];
char kunciBaru[1000];
char encryptedPesan[1000];
char decryptedPesan[1000];
int main(int argc, char** argv) {
MPI_Init(NULL, NULL);
int world_rank;
MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
int world_size;
MPI_Comm_size(MPI_COMM_WORLD, &world_size);
if (world_rank == 0) {
int i;
for (i=0; i<world_size;i++){
printf("Program Vigenere Cipher\n");
printf("Encryption dan Decryption\n");
printf("Menggunakan implementasi Message Passing Computing\n");
printf("--------------------------------------------------\n");
printf("Masukkan Pesan (huruf besar tanpa spasi) = ");
scanf("%s",pesan);
printf("\nMasukkan Key (huruf kecil tanpa spasi) = ");
scanf("%s",kunci);
char kunciBaru[lenPesan],encryptedPesan[lenPesan],decryptedPesan[lenPesan];
lenPesan = strlen(pesan);
lenKunci = strlen(kunci);
for(k = 0, j = 0; k < lenPesan; ++k, ++j){
if(j == lenKunci)
j = 0;
kunciBaru[k] = kunci[j];
}
kunciBaru[k] = '\0';
MPI_Status status;
my_bcast(&pesan, 1, MPI_CHAR, 0, MPI_COMM_WORLD);
my_bcast(&kunci, 1, MPI_CHAR, 0, MPI_COMM_WORLD);
my_bcast(&lenPesan, 1, MPI_CHAR, 0, MPI_COMM_WORLD);
my_bcast(&lenKunci, 1, MPI_CHAR, 0, MPI_COMM_WORLD);
my_bcast(&kunciBaru, 1, MPI_CHAR, 0, MPI_COMM_WORLD);
MPI_Recv(&encryptedPesan, 1, MPI_CHAR, i, 0, MPI_COMM_WORLD, &status);
MPI_Recv(&decryptedPesan, 1, MPI_CHAR, i, 0, MPI_COMM_WORLD, &status);
}
printf("Original Message: %s", pesan);
printf("\nKey: %s", kunci);
printf("\nNew Generated Key: %s", kunciBaru);
printf("\nEncrypted Message: %s", encryptedPesan);
printf("\nDecrypted Message: %s\n", decryptedPesan);
return 0;
} else {
my_bcast(&pesan, 1, MPI_CHAR, 0, MPI_COMM_WORLD);
my_bcast(&kunci, 1, MPI_CHAR, 0, MPI_COMM_WORLD);
my_bcast(&lenPesan, 1, MPI_CHAR, 0, MPI_COMM_WORLD);
my_bcast(&lenKunci, 1, MPI_CHAR, 0, MPI_COMM_WORLD);
my_bcast(&kunciBaru, 1, MPI_CHAR, 0, MPI_COMM_WORLD);
for(k = 0; k < lenPesan; ++k)
encryptedPesan[k] = ((pesan[k] + kunciBaru[k]) % 26) + 'A';
encryptedPesan[k] = '\0';
MPI_Send(&encryptedPesan, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
for(k = 0; k < lenPesan; ++k)
decryptedPesan[k] = (((encryptedPesan[k] - kunciBaru[k]) + 26) % 26) + 'A';
decryptedPesan[k] = '\0';
MPI_Send(&decryptedPesan, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
}
MPI_Finalize();
}
So far i've tried changing rank of source process to root, the problem still exist, and change rank of source to i still have the same problem.
i know it's a mess please don't judge me. If someone can help i will appreciate your help a lot, sorry for my bad language. Thanks

Trouble Resolving Deadlock in MPI Program dealing with a Cartesian mesh

I am implementing the cannon's algorithm. I run it using 4 processors. I hit a dead lock when I enter the loop:
for (i=0; i<dims[0]; i++) {
Multiply(nlocal, a, b, c);
MPI_Sendrecv_replace(a, nlocal*nlocal, MPI_DOUBLE,leftrank, 1, rightrank, 1, comm_2d, &status);
MPI_Sendrecv_replace(b, nlocal*nlocal, MPI_DOUBLE,uprank, 1, downrank, 1, comm_2d, &status);
}
The entire code is here:
#include <math.h>
#include <mpi.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
void Multiply(int n, double *a, double *b, double *c);
double* readMatrix(char* filename, int* size);
void writeMatrix(double* matrix, char* filename, int size);
int main(int argc, char* argv[])
{
MPI_Init(&argc, &argv);
double* a,*b,*c;
int i, t, n;
int nlocal;
int npes, dims[2], periods[2];
int myrank, my2drank, mycoords[2];
int uprank, downrank, leftrank, rightrank, coords[2];
int shiftsource, shiftdest;
MPI_Status status;
MPI_Comm comm_2d;
MPI_Comm_size(MPI_COMM_WORLD, &npes);
MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
MPI_Barrier(MPI_COMM_WORLD);
t = -MPI_Wtime();
if (myrank == 0) {
int sizeA,sizeB;
printf("Reading %s\n", argv[1]);
a = readMatrix(argv[1], &sizeA);
b = readMatrix(argv[2], &sizeB);
printf("Reading %s\n", argv[2]);
c = calloc(sizeA*sizeB, sizeof(double));
n = sizeA;
MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
MPI_Bcast(a, n*n, MPI_DOUBLE, 0, MPI_COMM_WORLD);
MPI_Bcast(b, n*n, MPI_DOUBLE, 0, MPI_COMM_WORLD);
MPI_Bcast(c, n*n, MPI_DOUBLE, 0, MPI_COMM_WORLD);
if (sizeA != sizeB) {
printf("Matrix not sized n^2\n");
MPI_Abort(MPI_COMM_WORLD, 0);
}
}
else {
a = calloc(n*n, sizeof(double));
b = calloc(n*n, sizeof(double));
c = calloc(n*n, sizeof(double));
MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
MPI_Bcast(a, n*n, MPI_DOUBLE, 0, MPI_COMM_WORLD);
MPI_Bcast(b, n*n, MPI_DOUBLE, 0, MPI_COMM_WORLD);
MPI_Bcast(c, n*n, MPI_DOUBLE, 0, MPI_COMM_WORLD);
}
dims[0] = dims[1] = sqrt(npes);
periods[0] = periods[1] = 1;
MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, 1, &comm_2d);
MPI_Comm_rank(comm_2d, &my2drank);
MPI_Cart_coords(comm_2d, my2drank, 2, mycoords);
MPI_Cart_shift(comm_2d, 0, -1, &rightrank, &leftrank);
MPI_Cart_shift(comm_2d, 1, -1, &downrank, &uprank);
nlocal = n/dims[0];
MPI_Cart_shift(comm_2d, 0, -mycoords[0], &shiftsource, &shiftdest);
MPI_Sendrecv_replace(a, nlocal*nlocal, MPI_DOUBLE, shiftdest,1, shiftsource, 1, comm_2d, &status);
MPI_Cart_shift(comm_2d, 1, -mycoords[1], &shiftsource, &shiftdest);
MPI_Sendrecv_replace(b, nlocal*nlocal, MPI_DOUBLE,shiftdest, 1, shiftsource, 1, comm_2d, &status);
printf("rank[%d] has entered loop\n", myrank);
for (i=0; i<dims[0]; i++) {
Multiply(nlocal, a, b, c);
MPI_Sendrecv_replace(a, nlocal*nlocal, MPI_DOUBLE,leftrank, 1, rightrank, 1, comm_2d, &status);
MPI_Sendrecv_replace(b, nlocal*nlocal, MPI_DOUBLE,uprank, 1, downrank, 1, comm_2d, &status);
}
printf("rank[%d] has left loop\n", myrank);
MPI_Cart_shift(comm_2d, 0, +mycoords[0], &shiftsource, &shiftdest);
MPI_Sendrecv_replace(a, nlocal*nlocal, MPI_DOUBLE,shiftdest, 1, shiftsource, 1, comm_2d, &status);
MPI_Cart_shift(comm_2d, 1, +mycoords[1], &shiftsource, &shiftdest);
MPI_Sendrecv_replace(b, nlocal*nlocal, MPI_DOUBLE,shiftdest, 1, shiftsource, 1, comm_2d, &status);
printf("rank[%d] has reached the barrier...\n", myrank);
MPI_Barrier(MPI_COMM_WORLD);
if (myrank == 0) {
t += MPI_Wtime();
writeMatrix(c, argv[3], n);
printf("%s %d second(s)\n", "Finshed in", t);
}
free(a); free(b); free(c);
MPI_Comm_free(&comm_2d);
MPI_Finalize();
}
double* readMatrix(char* filename, int* size) {
FILE* file_handle = fopen(filename, "r");
int row;
int col;
fread(&row, sizeof(int), 1, file_handle);
fread(&col, sizeof(int), 1, file_handle);
if (row == col) {
*size = row;
}
else {
*size = -1;
return NULL;
}
double* buffer = calloc(row*col, sizeof(double));
for(int i = 0; i < row; i++) {
for(int j = 0; j < col; j++) {
double x;
fread(&x, sizeof(double), 1, file_handle);
buffer[row * i + j] = x;
}
}
fclose(file_handle);
printf("Buffer has size %d\n", row*col);
return buffer;
}
void writeMatrix(double* matrix, char* filename, int size) {
FILE* file_handle = fopen(filename, "w");
fwrite(&size, sizeof(int), 1, file_handle);
fwrite(&size, sizeof(int), 1, file_handle);
for(int i = 0; i < size; i++) {
for(int j = 0; j < size; j++) {
double x = matrix[size * i + j];
fwrite(&x, sizeof(double), 1, file_handle);
}
}
fclose(file_handle);
}
void Multiply(int n, double *a, double *b, double *c)
{
int i, j, k;
for (i=0; i<n; i++)
for (j=0; j<n; j++)
for (k=0; k<n; k++)
c[i*n+j] += a[i*n+k]*b[k*n+j];
}
If it is too much code I can easily remove certain parts. I am just wondering what is causing the deadlock and how to resolve it. Thank you for your time, in advance.
Important information:
Rank 0 always hits the barrier. But since the other 3 are deadlocked rank 0 is stuck until all have hit barrier.
Output
Reading 10
Buffer has size 100
Buffer has size 100
Reading 10
rank[0] has entered loop
rank[0] has left loop
rank[0] has reached the barrier...
rank[1] has entered loop
rank[2] has entered loop
rank[3] has entered loop

There are two little issues to get something working :
In lines :
a = calloc(n*n, sizeof(double));
b = calloc(n*n, sizeof(double));
c = calloc(n*n, sizeof(double));
MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
n should be broadcasted before allocating a. Otherwise, n is not initialized and the output is undefined. It can trigger a segmentation fault.
In the function MPI_Cart_shift, the third argument is the displacement : negative for downward and positive for upward. I changed it to set the same displacement for everyone and it worked fine. Even if MPI_Sendrecv_replace() is used, the number of messages received by a process must match the number of messages sent to this process. It is likely not the case in your call to MPI_Sendrecv_replace() :
MPI_Cart_shift(comm_2d, 0, -mycoords[0], &shiftsource, &shiftdest);
MPI_Sendrecv_replace(a, nlocal*nlocal, MPI_DOUBLE, shiftdest,1, shiftsource, 1, comm_2d, &status);
In the "skew" example of open-mpi, it is slightly different :
C compute shift source and destination
CALL MPI_CART_SHIFT(comm, 0, coords(2), source,
dest, ierr)
C skew array
CALL MPI_SENDRECV_REPLACE(A, 1, MPI_REAL, dest, 0,
source, 0, comm, status,
ierr)
In this case, all processes in each line get the same displacement. Hence, each process should send a message and each process should receive one. Yet, the displacement depends on the line and the matrix is skewed.
Here is the resulting code. It is compiled by mpicc main.c -o main -lm -Wall and run by mpirun -np 4 main :
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include "mpi.h"
int main(int argc, char **argv)
{
MPI_Init(&argc, &argv);
double* a,*b,*c;
int i, t, n;
int nlocal;
int npes, dims[2], periods[2];
int myrank, my2drank, mycoords[2];
int uprank, downrank, leftrank, rightrank;
int shiftsource, shiftdest;
MPI_Status status;
MPI_Comm comm_2d;
MPI_Comm_size(MPI_COMM_WORLD, &npes);
MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
MPI_Barrier(MPI_COMM_WORLD);
t = -MPI_Wtime();
if (myrank == 0) {
int sizeA,sizeB;
printf("Reading \n");
// a = readMatrix(argv[1], &sizeA);
sizeA=16;
a=malloc(sizeA*sizeA*sizeof(double));
// b = readMatrix(argv[2], &sizeB);
sizeB=16;
b=malloc(sizeB*sizeB*sizeof(double));
printf("Reading \n");
c = calloc(sizeA*sizeB, sizeof(double));
n = sizeA;
MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
MPI_Bcast(a, n*n, MPI_DOUBLE, 0, MPI_COMM_WORLD);
MPI_Bcast(b, n*n, MPI_DOUBLE, 0, MPI_COMM_WORLD);
MPI_Bcast(c, n*n, MPI_DOUBLE, 0, MPI_COMM_WORLD);
if (sizeA != sizeB) {
printf("Matrix not sized n^2\n");
MPI_Abort(MPI_COMM_WORLD, 0);
}
}
else {
MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);//n should be broadcast before allocation
a = calloc(n*n, sizeof(double));
b = calloc(n*n, sizeof(double));
c = calloc(n*n, sizeof(double));
MPI_Bcast(a, n*n, MPI_DOUBLE, 0, MPI_COMM_WORLD);
MPI_Bcast(b, n*n, MPI_DOUBLE, 0, MPI_COMM_WORLD);
MPI_Bcast(c, n*n, MPI_DOUBLE, 0, MPI_COMM_WORLD);
}
dims[0] = dims[1] = sqrt(npes);
periods[0] = periods[1] = 1;
MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, 1, &comm_2d);
MPI_Comm_rank(comm_2d, &my2drank);
MPI_Cart_coords(comm_2d, my2drank, 2, mycoords);
MPI_Cart_shift(comm_2d, 0, -1, &rightrank, &leftrank);
MPI_Cart_shift(comm_2d, 1, -1, &downrank, &uprank);
nlocal = n/dims[0];
MPI_Cart_shift(comm_2d, 0, -1, &shiftsource, &shiftdest);
// MPI_Cart_shift(comm_2d, 0, -mycoords[0], &shiftsource, &shiftdest);
MPI_Sendrecv_replace(a, nlocal*nlocal, MPI_DOUBLE, shiftdest,5, shiftsource, 5, comm_2d, &status);
// MPI_Cart_shift(comm_2d, 1, -mycoords[1], &shiftsource, &shiftdest);
MPI_Cart_shift(comm_2d, 1, -1, &shiftsource, &shiftdest);
MPI_Sendrecv_replace(b, nlocal*nlocal, MPI_DOUBLE,shiftdest, 6, shiftsource, 6, comm_2d, &status);
printf("rank[%d] has entered loop dim %d\n", myrank,dims[0]);fflush(stdout);
for (i=0; i<dims[0]; i++) {
// Multiply(nlocal, a, b, c);
MPI_Sendrecv_replace(a, nlocal*nlocal, MPI_DOUBLE,leftrank, 1, rightrank, 1, comm_2d, &status);
MPI_Sendrecv_replace(b, nlocal*nlocal, MPI_DOUBLE,uprank, 2, downrank, 2, comm_2d, &status);
}
printf("rank[%d] has left loop\n", myrank);fflush(stdout);
MPI_Barrier(MPI_COMM_WORLD);
// MPI_Cart_shift(comm_2d, 0, +mycoords[0], &shiftsource, &shiftdest);
MPI_Cart_shift(comm_2d, 0, 1, &shiftsource, &shiftdest);
MPI_Sendrecv_replace(a, nlocal*nlocal, MPI_DOUBLE,shiftdest, 3, shiftsource, 3, comm_2d, &status);
MPI_Cart_shift(comm_2d, 1, 1, &shiftsource, &shiftdest);
//MPI_Cart_shift(comm_2d, 1, +mycoords[1], &shiftsource, &shiftdest);
MPI_Sendrecv_replace(b, nlocal*nlocal, MPI_DOUBLE,shiftdest, 4, shiftsource, 4, comm_2d, &status);
printf("rank[%d] has reached the barrier...\n", myrank);fflush(stdout);
MPI_Barrier(MPI_COMM_WORLD);
if (myrank == 0) {
t += MPI_Wtime();
// writeMatrix(c, argv[3], n);
printf("Finshed in %d second(s)\n",t);
}
free(a); free(b); free(c);
MPI_Comm_free(&comm_2d);
MPI_Finalize();
return 0;
}

Undefined struct timeval declaration in C

I have a code below
#include <stdio.h>
#include "mpi.h"
#define NRA 512 /* number of rows in matrix A */
#define NCA 512 /* number of columns in matrix A */
#define NCB 512 /* number of columns in matrix B */
#define MASTER 0 /* taskid of first task */
#define FROM_MASTER 1 /* setting a message type */
#define FROM_WORKER 2 /* setting a message type */
MPI_Status status;
double a[NRA][NCA], /* matrix A to be multiplied */
b[NCA][NCB], /* matrix B to be multiplied */
c[NRA][NCB]; /* result matrix C */
main(int argc, char **argv)
{
int numtasks, /* number of tasks in partition */
taskid, /* a task identifier */
numworkers, /* number of worker tasks */
source, /* task id of message source */
dest, /* task id of message destination */
nbytes, /* number of bytes in message */
mtype, /* message type */
intsize, /* size of an integer in bytes */
dbsize, /* size of a double float in bytes */
rows, /* rows of matrix A sent to each worker */
averow, extra, offset, /* used to determine rows sent to each worker */
i, j, k, /* misc */
count;
struct timeval start, stop;
intsize = sizeof(int);
dbsize = sizeof(double);
MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &taskid);
MPI_Comm_size(MPI_COMM_WORLD, &numtasks);
numworkers = numtasks-1;
//printf(" size of matrix A = %d by %d\n",NRA,NCA);
//printf(" size of matrix B = %d by %d\n",NRA,NCB);
/*---------------------------- master ----------------------------*/
if (taskid == MASTER) {
printf("Number of worker tasks = %d\n",numworkers);
for (i=0; i<NRA; i++)
for (j=0; j<NCA; j++)
a[i][j]= i+j;
for (i=0; i<NCA; i++)
for (j=0; j<NCB; j++)
b[i][j]= i*j;
gettimeofday(&start, 0);
/* send matrix data to the worker tasks */
averow = NRA/numworkers;
extra = NRA%numworkers;
offset = 0;
mtype = FROM_MASTER;
for (dest=1; dest<=numworkers; dest++) {
rows = (dest <= extra) ? averow+1 : averow;
//printf(" Sending %d rows to task %d\n",rows,dest);
MPI_Send(&offset, 1, MPI_INT, dest, mtype, MPI_COMM_WORLD);
MPI_Send(&rows, 1, MPI_INT, dest, mtype, MPI_COMM_WORLD);
count = rows*NCA;
MPI_Send(&a[offset][0], count, MPI_DOUBLE, dest, mtype, MPI_COMM_WORLD);
count = NCA*NCB;
MPI_Send(&b, count, MPI_DOUBLE, dest, mtype, MPI_COMM_WORLD);
offset = offset + rows;
}
/* wait for results from all worker tasks */
mtype = FROM_WORKER;
for (i=1; i<=numworkers; i++) {
source = i;
MPI_Recv(&offset, 1, MPI_INT, source, mtype, MPI_COMM_WORLD, &status);
MPI_Recv(&rows, 1, MPI_INT, source, mtype, MPI_COMM_WORLD, &status);
count = rows*NCB;
MPI_Recv(&c[offset][0], count, MPI_DOUBLE, source, mtype, MPI_COMM_WORLD,
&status);
}
#ifdef PRINT
printf("Here is the result matrix\n");
for (i=0; i<NRA; i++) {
printf("\n");
for (j=0; j<NCB; j++)
printf("%6.2f ", c[i][j]);
}
printf ("\n");
#endif
gettimeofday(&stop, 0);
fprintf(stdout,"Time = %.6f\n\n",
(stop.tv_sec+stop.tv_usec*1e-6)-(start.tv_sec+start.tv_usec*1e-6));
} /* end of master section */
/*---------------------------- worker (slave)----------------------------*/
if (taskid > MASTER) {
mtype = FROM_MASTER;
source = MASTER;
#ifdef PRINT
printf ("Master =%d, mtype=%d\n", source, mtype);
#endif
MPI_Recv(&offset, 1, MPI_INT, source, mtype, MPI_COMM_WORLD, &status);
#ifdef PRINT
printf ("offset =%d\n", offset);
#endif
MPI_Recv(&rows, 1, MPI_INT, source, mtype, MPI_COMM_WORLD, &status);
#ifdef PRINT
printf ("row =%d\n", rows);
#endif
count = rows*NCA;
MPI_Recv(&a, count, MPI_DOUBLE, source, mtype, MPI_COMM_WORLD, &status);
#ifdef PRINT
printf ("a[0][0] =%e\n", a[0][0]);
#endif
count = NCA*NCB;
MPI_Recv(&b, count, MPI_DOUBLE, source, mtype, MPI_COMM_WORLD, &status);
#ifdef PRINT
printf ("b=\n");
#endif
for (k=0; k<NCB; k++)
for (i=0; i<rows; i++) {
c[i][k] = 0.0;
for (j=0; j<NCA; j++)
c[i][k] = c[i][k] + a[i][j] * b[j][k];
}
//mtype = FROM_WORKER;
#ifdef PRINT
printf ("after computer\n");
#endif
//MPI_Send(&offset, 1, MPI_INT, MASTER, mtype, MPI_COMM_WORLD);
MPI_Send(&offset, 1, MPI_INT, MASTER, FROM_WORKER, MPI_COMM_WORLD);
//MPI_Send(&rows, 1, MPI_INT, MASTER, mtype, MPI_COMM_WORLD);
MPI_Send(&rows, 1, MPI_INT, MASTER, FROM_WORKER, MPI_COMM_WORLD);
//MPI_Send(&c, rows*NCB, MPI_DOUBLE, MASTER, mtype, MPI_COMM_WORLD);
MPI_Send(&c, rows*NCB, MPI_DOUBLE, MASTER, FROM_WORKER, MPI_COMM_WORLD);
#ifdef PRINT
printf ("after send\n");
#endif
} /* end of worker */
MPI_Finalize();
} /* end of main */
when i try to compile it, the warning was :
matriks.c(43): error C2079: 'start' uses undefined struct 'timeval'
matriks.c(43): error C2079: 'stop' uses undefined struct 'timeval'
matriks.c(65): warning C4013: 'gettimeofday' undefined; assuming extern returning int
matriks.c(111): error C2224: left of '.tv_sec' must have struct/union type
matriks.c(111): error C2224: left of '.tv_usec' must have struct/union type
matriks.c(111): error C2224: left of '.tv_sec' must have struct/union type
matriks.c(111): error C2224: left of '.tv_usec' must have struct/union type
Please help, I don't know where the error is. Thank you.

I think you'll probably find that timeval needs you to include sys/time.h under POSIX systems (it's not standard C). See the POSIX SUSv2 page for details.

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight

MPI Matrix Multiplication with Dynamic Allocation: Seg. Fault - c

Related

Error running with more than 16 tasks MPI_ERR_TRUNCATE: message truncated

Sharing a dynamically allocated 2D array with MPI [duplicate]

Vigenere on MPI error MPI_COMM_WORLD can not encrypt or decrypt

Trouble Resolving Deadlock in MPI Program dealing with a Cartesian mesh

Undefined struct timeval declaration in C

Categories

Resources