I'm having trouble implementing a non-blocking send and receive in my code below, and I am getting this error:
Reading <edge192x128.pgm>
Rank 2 [Sat Apr 28 11:24:58 2018] [c6-0c0s13n1] Fatal error in PMPI_Wait: Request pending due to failure, error stack:
PMPI_Wait(207): MPI_Wait(request=0x7ffffff95534, status=0x7fffffff74b0) failed
PMPI_Wait(158): Invalid MPI_Request
Rank 3 [Sat Apr 28 11:24:58 2018] [c6-0c0s13n1] Fatal error in PMPI_Wait: Request pending due to failure, error stack:
PMPI_Wait(207): MPI_Wait(request=0x7ffffff95534, status=0x7fffffff74b0) failed
PMPI_Wait(158): Invalid MPI_Request
_pmiu_daemon(SIGCHLD): [NID 01205] [c6-0c0s13n1] [Sat Apr 28 11:24:58 2018] PE RANK 2 exit signal Aborted
[NID 01205] 2018-04-28 11:24:58 Apid 30656034: initiated application termination
Application 30656034 exit codes: 134
Application 30656034 resources: utime ~0s, stime ~0s, Rss ~7452, inblocks ~7926, outblocks ~19640
My program attempts to perform the following (assuming 4 processes for this example):
The root process reads an image file into masterbuf as a two-dimensional M x N array;
The root process uses MPI_Issend to transfer subsections of masterbuf (MP x NP, i.e. M/2 x N/2) to all 4 processes (including itself). I have used a strided datatype to split the original array into 4 sections.
All processes use MPI_Irecv to store their MP x NP subsection in their own copy of buf.
MPI_Wait is called to prevent the program from continuing until distribution of the data is complete (I understand I could have used MPI_Waitall here, which I intend to do once this is working).
I've been playing with the code for hours now and just can't fix this issue, so any help would be appreciated. The code is below; I've removed some non-relevant blocks.
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include <math.h>
#include "pgmio.h"

#define M 192
#define N 128

#define PX 2 // number of processes in X dimension
#define PY 2 // number of processes in Y dimension
#define MP M/PX
#define NP N/PY


#define FILEIN "edge192x128.pgm"
#define FILEOUT "ex7_0_192x128.pgm"

int main(int argc, char **argv)
{
    double buf[MP][NP];
    double old[MP + 2][NP + 2];
    double new[MP + 2][NP + 2];
    double edge[MP + 2][NP + 2];
    double masterbuf[M][N];
    double delta, delta_max, master_delta;

    int rank, cart_rank, size, left, right, up, down, iter;
    int dims[] = {2, 2};
    int periods[] = {0, 0};
    int reorder = 0;
    int tag = 0;

    MPI_Status status;
    MPI_Comm comm = MPI_COMM_WORLD;
    MPI_Comm cart_comm;

    /* initialise MPI */
    MPI_Init(&argc, &argv);
    MPI_Comm_size(comm, &size);
    MPI_Comm_rank(comm, &rank);
    MPI_Request request[2 * size];
    int coords[size][2];

    /* initialise cartesian topology */
    MPI_Cart_create(comm, 2, dims, periods, reorder, &cart_comm);
    MPI_Comm_rank(cart_comm, &cart_rank);
    MPI_Cart_shift(cart_comm, 1, 1, &left, &right);
    MPI_Cart_shift(cart_comm, 0, 1, &up, &down);
    printf("cart_rank: %d\n", cart_rank);

    /* ... non-relevant code removed ... */

    /* create block datatype for allocation of subsections of image to processes */
    MPI_Datatype MPI_block;
    MPI_Type_vector(M / PX, N / PY, N, MPI_DOUBLE, &MPI_block);
    MPI_Type_commit(&MPI_block);

    /* ... non-relevant code removed ... */

    /* master process: read edges data file into masterbuf and distribute */
    if (rank == 0)
    {
        printf("Reading <%s>\n", FILEIN);
        pgmread(FILEIN, masterbuf, M, N);

        printf("Distributing data to processes...\n");
        for (int i = 0; i < size; i++)
        {
            /* send chunk to each process: i refers to cart_rank */
            MPI_Cart_coords(cart_comm, i, 2, &coords[i][0]);
            printf("coords = (%d, %d), rank = %d\n", coords[i][0], coords[i][1],
                   cart_rank);
            MPI_Issend(&masterbuf[coords[i][0] * MP][coords[i][1] * NP], MP * NP,
                       MPI_block, i, tag, cart_comm, &request[i]);
        }

        MPI_Wait(&request[0], &status);
        MPI_Wait(&request[1], &status);
        MPI_Wait(&request[2], &status);
        MPI_Wait(&request[3], &status);
    }

    /* all processes: receive data sent by master process */
    MPI_Irecv(buf, MP * NP, MPI_block, cart_rank, tag, cart_comm,
              &request[cart_rank + size]);

    /* Could change this to MPI_Waitall */
    MPI_Wait(&request[5], &status);
    MPI_Wait(&request[4], &status);
    MPI_Wait(&request[7], &status);
    MPI_Wait(&request[6], &status);

    if (rank == 0)
    {
        printf("...complete.\n");
    }
Your application deadlocks when rank 0 sends to itself and no receive has been posted yet.
Also, each process calls MPI_Wait() on 4 requests but posts only a single MPI_Irecv(), so three of those requests are never initialised, which is what the "Invalid MPI_Request" error is complaining about.
As a side note, you can use MPI_Waitall() instead of calling several consecutive MPI_Wait().
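For what it's worth, here is a minimal sketch of that restructuring, not your full program: it uses a plain contiguous chunk instead of the strided MPI_block type to stay short, every rank posts its MPI_Irecv before rank 0 starts the MPI_Issend calls, and each side waits only on the requests it actually created.

/* Sketch only: contiguous chunks instead of the strided subarray type. */
#include <stdio.h>
#include <mpi.h>

#define CHUNK 16

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double master[CHUNK * size];   /* only filled on rank 0 */
    double buf[CHUNK];
    MPI_Request recv_req, send_req[size];

    /* 1. every rank (including rank 0) posts its receive first */
    MPI_Irecv(buf, CHUNK, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &recv_req);

    /* 2. rank 0 then issues one synchronous-mode send per rank */
    if (rank == 0) {
        for (int i = 0; i < CHUNK * size; i++)
            master[i] = i;
        for (int i = 0; i < size; i++)
            MPI_Issend(&master[i * CHUNK], CHUNK, MPI_DOUBLE,
                       i, 0, MPI_COMM_WORLD, &send_req[i]);
        /* wait only on the 'size' send requests that really exist */
        MPI_Waitall(size, send_req, MPI_STATUSES_IGNORE);
    }

    /* 3. each rank waits on the single receive request it owns */
    MPI_Wait(&recv_req, MPI_STATUS_IGNORE);
    printf("rank %d got a chunk starting with %.0f\n", rank, buf[0]);

    MPI_Finalize();
    return 0;
}

The same ordering (receives posted before the synchronous sends, waits only on requests you own) carries over directly to your strided-datatype version.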
I am attempting to pixelate a P6 PPM format image via the following steps:
Read a PPM image in grids of 4x4
Find the average RGB colour value of each 4x4 grid
Write to a new file by setting each 4x4 grid of pixels in the new image to have the average RGB colour value.
The PPM file begins in the following format:
P6
# ignores comments in header
width
height
max colour value
My problem:
The output PPM image file (which I am opening in the GIMP image editor, and which can also be opened in any text editor like Notepad to view the raw data) consists of one flat colour block, when instead it should resemble a sort of mosaic.
Note: the 4x4 grid size can be varied, i.e. the larger the grid dimensions, the more pixelated the output image becomes. My code has mostly been influenced by a Code Review Stack Exchange question where the user attempted a similar implementation in C#. The link to that question:
https://codereview.stackexchange.com/questions/140162/pixelate-image-with-average-cell-color
UPDATE: The output now seems to pixelate the first 1/5 of the image, but the rest of the output image remains one flat colour block. I think my issue is that I am treating the cells as if the pixels were in linear order.
My attempt:
#include <stdio.h>
#include <stdlib.h> /* for exit() */
#include <assert.h>
//Struct to store RGB values
typedef struct {
unsigned char r, g, b;
} pixel;
int main()
{
int y, x; //Loop iteration variables
int yy = 0; //Loop iteration variables
int xx = 0; //Loop iteration variables
char magic_number[10]; //Variable which reads P6 in the header
int w, h, m; //Image dimension variables
pixel currentPix; //Current pixel variable
int avR; //Red declaration
int avG; //Green declaration
int avB; //Blue declarataion
int total; //Loop iteration counter declaration
//Input file
FILE* f;
f = fopen("Dog2048x2048.ppm", "r"); //Read PPM file
if (f == NULL) //Error notifiaction if file cannot be found
{
fprintf(stderr, "ERROR: cannot open input file");
getchar();
exit(1);
}
//Scan the header of the PPM file to get the magic number (P6), width
//height and max colour value
fscanf(f, "%s %d %d %d", &magic_number, &w, &h, &m);
//initialize file for writing (open and header)
FILE* f_output;
f_output = fopen("some_file.ppm", "w");
//fprintf(f_output, "%s %d %d %d", magic_number, w, h, m);
fprintf(f_output, "P6\n%d %d\n255\n", w, h);
if (f_output == NULL) //Error notifiaction if file cannot be found
{
fprintf(stderr, "ERROR: cannot open output file");
getchar();
exit(1);
}
// Loop through the image in 4x4 cells.
for (int yy = 0; yy < h && yy < h; yy += 4)
{
for (int xx = 0; xx < w && xx < w; xx += 4)
{
avR = 0;
avG = 0;
avB = 0;
total = 0;
// Store each color from the 4x4 cell into cellColors.
for (int y = yy; y < yy + 4 && y < h; y++)
{
for (int x = xx; x < xx + 4 && x < w; x++)
{
//Reads input file stream
fread(&currentPix, 3, 1, f);
//Current pixels
avR += currentPix.r;
avG += currentPix.g;
avB += currentPix.b;
//Counts loop iterations for later use in colour averaging
total++;
}
}
//Average RGB values
avR /= total;
avG /= total;
avB /= total;
// Go BACK over the 4x4 cell and set each pixel to the average color.
for (int y = yy; y < yy + 4 && y < h; y++)
{
for (int x = xx; x < xx + 4 && x < w; x++)
{
//Print out to new file
fprintf(f_output, "%i %i %i\t", avR, avG, avB);
}
}
}
fprintf(f_output, "\n");
}
return 0;
}
Your suspicion is correct: your main mistake is that you assume you are reading and writing 4×4 blocks of pixels while actually accessing the pixel data linearly.
Consider the following example. Let there be a 12×4 1-channel image:
01 02 03 04 05 06 07 08 09 10 11 12
13 14 15 16 17 18 19 20 21 22 23 24
25 26 27 28 29 30 31 32 33 34 35 36
37 38 39 40 41 42 43 44 45 46 47 48
Each pixel's color value equals its position in the PPM file.
Now these are the pixels which you are expecting to read when pixelating the first 4×4 block:
01 02 03 04
13 14 15 16
25 26 27 28
37 38 39 40
And these are the pixels which are actually being read by sequentially executing fread():
01 02 03 04 05 06 07 08 09 10 11 12
13 14 15 16
So eventually you are treating the input image as if it looked like this:
01 02 03 04 05 06 07 08 09 10 11 12
13 14 15 16                                 01 02 03 04 17 18 19 20 33 34 35 36
17 18 19 20 21 22 23 24        -->          05 06 07 08 21 22 23 24 37 38 39 40
25 26 27 28 29 30 31 32                     09 10 11 12 25 26 27 28 41 42 43 44
33 34 35 36                                 13 14 15 16 29 30 31 32 45 46 47 48
37 38 39 40 41 42 43 44 45 46 47 48
instead of:
01 02 03 04 05 06 07 08 09 10 11 12
13 14 15 16 17 18 19 20 21 22 23 24
25 26 27 28 29 30 31 32 33 34 35 36
37 38 39 40 41 42 43 44 45 46 47 48
One of the simpler methods of resolving that issue is to allocate an array into which the data is to be read. Once you have that array filled with your data, you will be able to access its elements in any order, instead of the strictly linear order fread() implies.
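For illustration only (this is not your code, and read_pixels is a name I made up), a sketch of that approach for a binary P6 file whose header has already been parsed into w, h and m:

#include <stdio.h>
#include <stdlib.h>

typedef struct {
    unsigned char r, g, b;
} pixel;

/* Hypothetical helper: read all w*h pixels of the P6 payload into one
 * malloc'd array, so that any pixel can later be addressed as
 * pixels[y * w + x] regardless of the order the cells are processed in. */
pixel *read_pixels(FILE *f, int w, int h)
{
    pixel *pixels = malloc((size_t)w * h * sizeof *pixels);
    if (pixels == NULL)
        return NULL;
    if (fread(pixels, sizeof *pixels, (size_t)w * h, f) != (size_t)w * h) {
        free(pixels);   /* short read: not enough pixel data in the file */
        return NULL;
    }
    return pixels;
}

With the data in memory, both the averaging pass and the writing pass can fetch pixels[y * w + x] for every (x, y) inside a 4x4 cell, instead of relying on consecutive fread() calls.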
I want to run this C MPI program on my local PC. The receiver processes have to print their portion of the array in rank order. This is my code:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include "mpi.h"
#define MAX_DIM 22
#define NPROC 4
#define TRUE 1
#define FALSE 0
void print(int*,int*,int*,int*,int*,int*);
int main(int argc, char* argv[]){
int i,rank,*x,*y,nproc;
int partition=MAX_DIM/NPROC;
int sendcount[NPROC],offset[NPROC];
int k=MAX_DIM%NPROC;
for(i=0;i<NPROC;i++){
sendcount[i]=partition;
if(i<k)
sendcount[i]++;
}
offset[0]=0;
for(i=1;i<NPROC;i++)
offset[i]=offset[i-1]+sendcount[i-1];
MPI_Init(&argc,&argv);
MPI_Comm_size(MPI_COMM_WORLD,&nproc);
MPI_Comm_rank(MPI_COMM_WORLD,&rank);
if(rank==0){
srand(time(NULL));
x=(int*)malloc(sizeof(int)*MAX_DIM);
y=(int*)malloc(sizeof(int)*MAX_DIM);
for(i=0;i<MAX_DIM;i++){
x[i]=rand()%100+1;
y[i]=rand()%100+1;
}
printf("Sender[%d] => array <x>: \n",rank);
for(i=0;i<MAX_DIM;i++){
if(i%10==0)
printf("\n");
printf("%d ",x[i]);
}
printf("\nSender[%d] => array <y>: \n",rank);
for(i=0;i<MAX_DIM;i++){
if(i%10==0)
printf("\n");
printf("%d ",y[i]);
}
}
int* rvett1=(int*)malloc(sizeof(int)*sendcount[rank]);
int* rvett2=(int*)malloc(sizeof(int)*sendcount[rank]);
MPI_Scatterv(x,&sendcount[rank],&offset[rank],MPI_INT,rvett1,sendcount[rank],MPI_INT,0,MPI_COMM_WORLD);
MPI_Scatterv(y,&sendcount[rank],&offset[rank],MPI_INT,rvett2,sendcount[rank],MPI_INT,0,MPI_COMM_WORLD);
print(&rank,&nproc,rvett1,&sendcount[rank],rvett2,&sendcount[rank]);
if(rank==0){
free(x);
free(y);
}
free(rvett1);
free(rvett2);
MPI_Finalize();
printf("Exit program! \n");
return 0;
}
void print(int* rank,int* nproc,int* rvett1,int* dim1,int* rvett2,int* dim2){
int i,tag=0;
short int token=FALSE;
MPI_Status info;
if(*rank==0){
printf("\nReceiver[%d] => part of array <x>, dimension: %d \n",*rank,*dim1);
for(i=0;i<*dim1;i++){
if(i%10==0)
printf("\n");
printf("%d ",rvett1[i]);
}
printf("\nReceiver[%d] => part of array <y>, dimension: %d \n",*rank,*dim1);
for(i=0;i<*dim2;i++){
if(i%10==0)
printf("\n");
printf("%d ",rvett2[i]);
}
token=TRUE;
printf("\nStarter[%d] sends a print token \n",*rank);
MPI_Send(&token,1,MPI_SHORT_INT,1,tag+1,MPI_COMM_WORLD);
}
else{
for(i=1;i<*nproc;i++){
if(*rank==i){
MPI_Recv(&token,1,MPI_SHORT_INT,i-1,tag+i,MPI_COMM_WORLD,&info);
printf("Receiver[%d] => OK print \n ",i);
printf("\nReceiver[%d] => part of array <x>, dimension: %d \n",*rank,*dim1);
for(i=0;i<*dim1;i++){
if(i%10==0)
printf("\n");
printf("%d ",rvett1[i]);
}
printf("\nReceiver[%d] => part of array <y>, dimension: %d \n",*rank,*dim1);
for(i=0;i<*dim2;i++){
if(i%10==0)
printf("\n");
printf("%d ",rvett2[i]);
}
if(*rank<(*nproc)-1){
printf("Receiver[%d] sends next token \n",i);
MPI_Send(&token,1,MPI_SHORT_INT,i+1,tag+i+1,MPI_COMM_WORLD);
}
}
}
}
}
I use these two commands to compile and run the program:
mpicc -o sca_prod sca_prod.c
mpiexec -n 4 ./sca_prod
During execution the program crashes and returns this error:
Sender[0] => array <x>:
92 37 80 73 68 24 42 72 88 26
47 25 24 98 47 92 72 100 34 20
76 97
Sender[0] => array <y>:
17 62 55 70 53 44 73 72 19 47
11 83 29 30 56 39 80 51 24 54
96 70
Receiver[0] => part of array <x>, dimension: 6
92 37 80 73 68 24
Receiver[0] => part of array <y>, dimension: 6
17 62 55 70 53 44
Starter[0] sends a print token
Receiver[1] => OK print
Receiver[1] => part of array <x>, dimension: 6
42 72 88 26 47 25
Receiver[1] => part of array <y>, dimension: 6
73 72 19 47 11 83 Receiver[6] sends next token
Fatal error in MPI_Send: Invalid rank, error stack:
MPI_Send(174): MPI_Send(buf=0x7ffe216fb636, count=1, MPI_SHORT_INT, dest=7, tag=7, MPI_COMM_WORLD) failed
MPI_Send(100): Invalid rank has value 7 but must be nonnegative and less than 4
I'm using MPICH 3.2 with the Hydra process manager and my OS is Ubuntu 14.04; the machine has a quad-core i7 processor. Please, can you help me? Thank you so much!
You have two subtle issues:
MPI_SHORT_INT is the MPI datatype for a struct { short; int; } pair (used with MPI_MINLOC/MPI_MAXLOC); for a C short int use MPI_SHORT instead. The mismatch leads to overwriting memory.
You use i as the loop variable in two nested loops, so after the inner printing loops finish, i no longer holds the rank you think it does (hence the send to rank 7).
In general, your code could be structured better to avoid issues like the second one:
Declare variables as locally as possible, especially loop variables within the loop declaration (c99).
Use more descriptive variable names.
Hide things like printing an array in functions.
Format your code properly.
Also learn to use a (parallel) debugger.
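For illustration, here is a sketch of the token-passing part with both fixes applied and the printing hidden behind a helper, along the lines of the advice above (print_portion and ordered_print are names I made up, and this only covers one of the two arrays):

#include <stdio.h>
#include <mpi.h>

/* Print one rank's portion of an array. */
static void print_portion(int rank, const int *v, int n)
{
    printf("Receiver[%d]:", rank);
    for (int i = 0; i < n; i++)
        printf(" %d", v[i]);
    printf("\n");
}

/* Print every rank's portion in rank order by passing a token around.
 * MPI_SHORT matches a C short; MPI_SHORT_INT is a (short, int) pair
 * meant for MPI_MINLOC/MPI_MAXLOC and would overwrite memory here. */
static void ordered_print(int rank, int nproc, const int *v, int n)
{
    short token = 1;

    if (rank > 0)   /* wait for the token from the previous rank */
        MPI_Recv(&token, 1, MPI_SHORT, rank - 1, rank,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    print_portion(rank, v, n);
    fflush(stdout);

    if (rank < nproc - 1)   /* pass the token to the next rank; the rank
                               itself is used, so no loop variable can be
                               clobbered by the printing loops */
        MPI_Send(&token, 1, MPI_SHORT, rank + 1, rank + 1,
                 MPI_COMM_WORLD);
}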
Before you say it: yes, I've checked nearly all the other postings, and none of them fix this.
My program has been giving me a segmentation fault for hours and nothing is fixing it. I have debugged it to the point where I found the problem is in the file pointer. As far as I can tell, it's either the way I'm using the file pointer in the makeArray function or the fclose statement. I don't really understand why it isn't working, because I used my last program as a reference for this one and that runs perfectly fine, but this one won't.
#include <stdio.h>
#include <stdlib.h>
#define ROWS 12
#define COLS 8
void makeArray(FILE*, int [][COLS]);
int getScore(int [][COLS], int, int);
int getMonthMax(int [][COLS], int);
int getYearMax(int [][COLS]);
float getMonthAvg(int [][COLS], int);
float getYearAvg(int [][COLS]);
int toursMissed(int [][COLS]);
void displayMenu();
int processRequest(int [][COLS], int);
void printArray(int [][COLS]);
int main(){
int scoresArray[ROWS][COLS];
int choice, constant = 0;
FILE* inputPtr;
inputPtr = fopen("scores.txt", "r");
makeArray(inputPtr, scoresArray);
fclose(inputPtr);
while(constant == 0){
displayMenu();
scanf("%d", &choice);
processRequest(scoresArray, choice);
}
return 0;
}
void makeArray(FILE* inputPtr, int scoresArray[][COLS]){
int i, j;
for(i = 0; i < ROWS; i++){
for(j = 0; j < COLS; j++){
fscanf(inputPtr, "%d", &scoresArray[i][j]);
}
}
return;
}
I've tried moving the file-pointer calls to every different spot in the code and nothing changes. I don't necessarily want you to just give me the answer, but I would like an explanation of why it's happening in this specific code, because the other posts I've checked describe results that don't match up with mine.
Also the input file is
26 35 25 92 0 6 47 68 26 72 67 33 84 28
22 36 53 66 23 86 36 75 14 62 43 11 42 5
14 58 0 23 30 87 80 81 13 35 94 45 1 53
14 55 46 19 13 0 25 28 66 86 69 0 81 15
55 60 26 70 22 36 15 67 62 16 71 7 29 92
84 37 2 30 7 5 4 50 0 67 2 53 69 87
8 23 74 58 86 0 78 88 85 12 1 52 999
I wonder if your university's compiler is picky about the input file. Can you remove all the newlines from a copy of your input file, so it is just a stream of numbers, and try running with that modified copy to see if it sorts this out?
In my experience scanf and fscanf can be a bit fragile if the input does not run exactly the way you say it will in the format string; here "%d" does not tell fscanf anything about newline characters.
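As a quick way to test that theory, here is a sketch (not your assignment code; fill_array is a made-up name) that checks the return value of fscanf, which reports how many items it converted, and also guards against fopen having failed:

#include <stdio.h>

#define ROWS 12
#define COLS 8

/* Same fill loop as makeArray(), but defensive: returns 0 on success,
 * -1 if the file could not be opened or a number could not be read. */
static int fill_array(FILE *fp, int a[][COLS])
{
    if (fp == NULL)     /* fopen() may have failed, e.g. wrong path */
        return -1;
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            if (fscanf(fp, "%d", &a[i][j]) != 1)
                return -1;   /* bad token or premature end of file */
    return 0;
}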
I have the following code which I compile and run with:
mpicc -std=c99 region.c
mpirun -n 4 region
$mpirun -version
mpirun (Open MPI) 1.6.5
$mpicc --version
gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
int rank,
size,
dims[2],
coords[2],
image_size[2] = {8,8},
local_image_size[2];
MPI_Datatype border_row_t,
border_col_t,
subarray_type,
recv_type;
unsigned char *image,
*region,
*local_region;
void create_types() {
int starts[2] = {0, 0};
MPI_Type_create_subarray(2, image_size, local_image_size, starts, MPI_ORDER_C, MPI_UNSIGNED_CHAR, &subarray_type);
MPI_Type_commit(&subarray_type);
MPI_Type_vector(local_image_size[0], local_image_size[1], image_size[1], MPI_UNSIGNED_CHAR, &recv_type);
MPI_Type_commit(&recv_type);
}
void distribute_image(){
if (0 == rank) {
MPI_Request request;
int num_hor_segments = image_size[0] / local_image_size[0];
int num_vert_segments = image_size[1] / local_image_size[1];
int dest_rank=0;
for (int vert=0; vert<num_vert_segments; vert++) {
for (int hor=0; hor<num_hor_segments; hor++) {
MPI_Isend((image+(local_image_size[0]*hor)+(local_image_size[1]*image_size[1]*vert)), 1, subarray_type, dest_rank, 0, MPI_COMM_WORLD, &request);
dest_rank++;
}
}
}
MPI_Status status;
MPI_Recv(local_region, local_image_size[0]*local_image_size[1], MPI_UNSIGNED_CHAR, 0, 0, MPI_COMM_WORLD, &status);
}
void gather_region(){
int counts[4]={1,1,1,1};
int disps[4]={0,4,32,36};
MPI_Gatherv(local_region,local_image_size[0]*local_image_size[1], MPI_UNSIGNED_CHAR, region,counts,disps,recv_type,0,MPI_COMM_WORLD);
if (0==rank) {
printf("Actually returned:\n");
for (int i=0; i<image_size[0]*image_size[1]; i++) {
printf("%d\t", *(region+i));
if ((i+1)%image_size[0]==0) printf("\n");
}
}
}
void init_mpi(int argc, char** argv){
MPI_Init(&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD, &size);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Dims_create(size, 2, dims);
}
void load_and_allocate_images(int argc, char** argv){
if(rank == 0){
image = (unsigned char*) malloc(sizeof(unsigned char*) * image_size[0] * image_size[1]);
for (unsigned char i=0; i<image_size[0]*image_size[1]; i++) {
image[i] = i;
printf("%d\t", *(image+i));
if((i+1)%image_size[0]==0) printf("\n");
}
printf("\n\n");
region = (unsigned char*)calloc(sizeof(unsigned char),image_size[0]*image_size[1]);
}
local_image_size[0] = image_size[0]/dims[0];
local_image_size[1] = image_size[1]/dims[1];
int lsize = local_image_size[0]*local_image_size[1];
int lsize_border = (local_image_size[0] + 2)*(local_image_size[1] + 2);
local_region = (unsigned char*)calloc(sizeof(unsigned char),lsize_border);
}
void cleanup() {
MPI_Type_free(&subarray_type);
MPI_Type_free(&recv_type);
}
int main(int argc, char** argv){
init_mpi(argc, argv);
load_and_allocate_images(argc, argv);
create_types();
distribute_image();
gather_region();
cleanup();
MPI_Finalize();
exit(0);
}
When I run MPI_Gatherv with displacements of 0, 4, 32 and 36 I get the following:
Distributed vector:
0 1 2 3 4 5 6 7
8 9 10 11 12 13 14 15
16 17 18 19 20 21 22 23
24 25 26 27 28 29 30 31
32 33 34 35 36 37 38 39
40 41 42 43 44 45 46 47
48 49 50 51 52 53 54 55
56 57 58 59 60 61 62 63
Actually returned:
0 1 2 3 0 0 0 0
8 9 10 11 0 0 0 0
16 17 18 19 0 0 0 0
24 25 26 27 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
If I change the displacements to 0, 1, 32, 36 I get the following:
Distributed vector:
0 1 2 3 4 5 6 7
8 9 10 11 12 13 14 15
16 17 18 19 20 21 22 23
24 25 26 27 28 29 30 31
32 33 34 35 36 37 38 39
40 41 42 43 44 45 46 47
48 49 50 51 52 53 54 55
56 57 58 59 60 61 62 63
Actually returned:
0 1 2 3 0 0 0 0
8 9 10 11 0 0 0 0
16 17 18 19 0 0 0 0
24 25 26 27 4 5 6 7
0 0 0 0 12 13 14 15
0 0 0 0 20 21 22 23
0 0 0 0 28 29 30 31
0 0 0 0 0 0 0 0
Why does a displacement of 1 translate to 28 in the returned vector? This confuses me.
Displacements in MPI_GATHERV are specified in units of the extent of the datatype. The datatype created by MPI_Type_vector(local_image_size[0], local_image_size[1], image_size[1], MPI_UNSIGNED_CHAR, &recv_type); has an extent of {(local_image_size[0]-1) * image_size[1] + local_image_size[1]} * extent(MPI_UNSIGNED_CHAR). Given the following:
local_image_size[0] = 4
local_image_size[1] = 4
image_size[1] = 8
extent(MPI_UNSIGNED_CHAR) = 1 byte
this results in the extent of recv_type being (4-1) * 8 + 4 = 28 bytes. Therefore, a displacement of 1 specifies a location 28 bytes past the beginning of the receive buffer.
It is possible to "resize" a type by forcing a different "visible" extent on it with MPI_Type_create_resized. The whole procedure of properly performing 2D decomposition is well described in this answer.
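For illustration, applied to this question's types and reusing the variable names from the code above (so this is a fragment meant to slot into create_types() and gather_region(), not a standalone program), the resize could look roughly like this; with the visible extent forced to one unsigned char, the displacements 0, 4, 32 and 36 are counted in single elements again:

/* In create_types(): shrink the visible extent of recv_type to 1 byte. */
MPI_Datatype recv_type_resized;
MPI_Type_vector(local_image_size[0], local_image_size[1], image_size[1],
                MPI_UNSIGNED_CHAR, &recv_type);
MPI_Type_create_resized(recv_type, 0, sizeof(unsigned char),
                        &recv_type_resized);
MPI_Type_commit(&recv_type_resized);

/* In gather_region(): gather with the resized type and element displacements. */
int counts[4] = {1, 1, 1, 1};
int disps[4]  = {0, 4, 32, 36};
MPI_Gatherv(local_region, local_image_size[0] * local_image_size[1],
            MPI_UNSIGNED_CHAR,
            region, counts, disps, recv_type_resized,
            0, MPI_COMM_WORLD);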
I want to partition a matrix into blocks (not stripes) and then distribute these blocks using MPI_Scatter.
I came up with a solution which works, but I think it is far from "best practice". I have an 8x8 matrix, filled with numbers from 0 to 63. I divide it into four 4x4 blocks using MPI_Type_vector and distribute them via MPI_Send, but this requires some extra computation since I have to compute the offset of each block in the big matrix.
If I use MPI_Scatter, the first (top-left) block is transferred OK, but the other blocks are not (the start offsets of the blocks are wrong).
So is it possible to transfer blocks of a matrix using MPI_Scatter, or what is the best way to do the desired decomposition?
This is my code:
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#define SIZE 8
int main(void) {
MPI_Init(NULL, NULL);
int p, rank;
MPI_Comm_size(MPI_COMM_WORLD, &p);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
char i;
char a[SIZE*SIZE];
char b[(SIZE/2)*(SIZE/2)];
MPI_Datatype columntype;
MPI_Datatype columntype2;
MPI_Type_vector(4, 4, SIZE, MPI_CHAR, &columntype2);
MPI_Type_create_resized( columntype2, 0, sizeof(MPI_CHAR), &columntype );
MPI_Type_commit(&columntype);
if(rank == 0) {
for( i = 0; i < SIZE*SIZE; i++) {
a[i] = i;
}
for(int rec=0; rec < p; rec++) {
int offset = (rec%2)*4 + (rec/2)*32;
MPI_Send (a+offset, 1, columntype, rec, 0, MPI_COMM_WORLD);
}
}
MPI_Recv (b, 16, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
//MPI_Scatter(&a, 1, boki, &b, 16, MPI_CHAR , 0, MPI_COMM_WORLD);
printf("rank= %d b= \n%d %d %d %d\n%d %d %d %d\n%d %d %d %d\n%d %d %d %d\n", rank, b[0], b[1], b[2], b[3], b[4], b[5], b[6], b[7], b[8], b[9], b[10], b[11], b[12], b[13], b[14], b[15]);
MPI_Finalize();
return 0;
}
What you've got is pretty much "best practice"; it's just a bit confusing until you get used to it.
Two things, though:
First, be careful with this: sizeof(MPI_CHAR) is, I assume, 4 bytes, not 1. MPI_CHAR is an (integer) constant that describes (to the MPI library) a character. You probably want sizeof(char), or SIZE/2*sizeof(char), or anything else convenient. But the basic idea of doing a resize is right.
Second, I think you're stuck using MPI_Scatterv, though, because there's no easy way to make the offset between each block the same size. That is, the first element in the first block is at a[0], the second is at a[SIZE/2] (jump of size/2), the next is at a[SIZE*(SIZE/2)] (jump of (SIZE-1)*(SIZE/2)). So you need to be able to manually generate the offsets.
The following seems to work for me (I generalized it a little bit to make it clearer when "size" means "number of rows" vs "number of columns", etc):
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#define COLS 12
#define ROWS 8
int main(int argc, char **argv) {
MPI_Init(&argc, &argv);
int p, rank;
MPI_Comm_size(MPI_COMM_WORLD, &p);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
char i;
char a[ROWS*COLS];
const int NPROWS=2; /* number of rows in _decomposition_ */
const int NPCOLS=3; /* number of cols in _decomposition_ */
const int BLOCKROWS = ROWS/NPROWS; /* number of rows in _block_ */
const int BLOCKCOLS = COLS/NPCOLS; /* number of cols in _block_ */
if (rank == 0) {
for (int ii=0; ii<ROWS*COLS; ii++) {
a[ii] = (char)ii;
}
}
if (p != NPROWS*NPCOLS) {
fprintf(stderr,"Error: number of PEs %d != %d x %d\n", p, NPROWS, NPCOLS);
MPI_Finalize();
exit(-1);
}
char b[BLOCKROWS*BLOCKCOLS];
for (int ii=0; ii<BLOCKROWS*BLOCKCOLS; ii++) b[ii] = 0;
MPI_Datatype blocktype;
MPI_Datatype blocktype2;
MPI_Type_vector(BLOCKROWS, BLOCKCOLS, COLS, MPI_CHAR, &blocktype2);
MPI_Type_create_resized( blocktype2, 0, sizeof(char), &blocktype);
MPI_Type_commit(&blocktype);
int disps[NPROWS*NPCOLS];
int counts[NPROWS*NPCOLS];
for (int ii=0; ii<NPROWS; ii++) {
for (int jj=0; jj<NPCOLS; jj++) {
disps[ii*NPCOLS+jj] = ii*COLS*BLOCKROWS+jj*BLOCKCOLS;
counts [ii*NPCOLS+jj] = 1;
}
}
MPI_Scatterv(a, counts, disps, blocktype, b, BLOCKROWS*BLOCKCOLS, MPI_CHAR, 0, MPI_COMM_WORLD);
/* each proc prints its "b" out, in order */
for (int proc=0; proc<p; proc++) {
if (proc == rank) {
printf("Rank = %d\n", rank);
if (rank == 0) {
printf("Global matrix: \n");
for (int ii=0; ii<ROWS; ii++) {
for (int jj=0; jj<COLS; jj++) {
printf("%3d ",(int)a[ii*COLS+jj]);
}
printf("\n");
}
}
printf("Local Matrix:\n");
for (int ii=0; ii<BLOCKROWS; ii++) {
for (int jj=0; jj<BLOCKCOLS; jj++) {
printf("%3d ",(int)b[ii*BLOCKCOLS+jj]);
}
printf("\n");
}
printf("\n");
}
MPI_Barrier(MPI_COMM_WORLD);
}
MPI_Finalize();
return 0;
}
Running:
$ mpirun -np 6 ./matrix
Rank = 0
Global matrix:
0 1 2 3 4 5 6 7 8 9 10 11
12 13 14 15 16 17 18 19 20 21 22 23
24 25 26 27 28 29 30 31 32 33 34 35
36 37 38 39 40 41 42 43 44 45 46 47
48 49 50 51 52 53 54 55 56 57 58 59
60 61 62 63 64 65 66 67 68 69 70 71
72 73 74 75 76 77 78 79 80 81 82 83
84 85 86 87 88 89 90 91 92 93 94 95
Local Matrix:
0 1 2 3
12 13 14 15
24 25 26 27
36 37 38 39
Rank = 1
Local Matrix:
4 5 6 7
16 17 18 19
28 29 30 31
40 41 42 43
Rank = 2
Local Matrix:
8 9 10 11
20 21 22 23
32 33 34 35
44 45 46 47
Rank = 3
Local Matrix:
48 49 50 51
60 61 62 63
72 73 74 75
84 85 86 87
Rank = 4
Local Matrix:
52 53 54 55
64 65 66 67
76 77 78 79
88 89 90 91
Rank = 5
Local Matrix:
56 57 58 59
68 69 70 71
80 81 82 83
92 93 94 95