MPI in C: I don't receive what I send

I am new to MPI and am trying to write some simple code. Here it is:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <math.h>
#include <mpi.h>
#include <unistd.h>
#define ONE 0
#define TWO 1
int main(int argc, char * argv[])
{
int dimension = 5;
float ** matrix;
float * mat1;
float * mat2;
int i,j,numNeighbor, processReceived,rank,size,retval;
int k = 0;
retval = MPI_Init(&argc, &argv);
MPI_Request sendRequest[2], recvRequest[2];
MPI_Status status[2];
MPI_Datatype row;
MPI_Type_vector(dimension, 1, dimension, MPI_FLOAT, &row);
MPI_Type_commit(&row);
if(retval != MPI_SUCCESS)
{
MPI_Abort(MPI_COMM_WORLD, retval);
return EXIT_FAILURE;
}
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
mat1 = malloc(dimension*sizeof(float));
mat2 = malloc(dimension*sizeof(float));
matrix = malloc(dimension*sizeof(float*));
for(i=0; i<dimension; i++)
{
matrix[i] = malloc(dimension*sizeof(float));
}
printf("MATRIX OF RANK %d\n", rank);
for(i=0; i<dimension; i++)
{
for(j=0; j<dimension; j++)
{
matrix[i][j] = (float)(rank+1)*(i*2+j);
printf("%2.1f ",matrix[i][j]);
}
printf("\n");
}
printf("\n");
MPI_Isend(&(matrix[0][0]), 1, row, 1-rank, rank, MPI_COMM_WORLD, sendRequest + ONE);
MPI_Isend(&(matrix[0][0]), dimension, MPI_FLOAT, 1-rank, rank, MPI_COMM_WORLD, sendRequest + TWO);
MPI_Irecv(mat1,dimension, MPI_FLOAT, 1-rank, 1-rank, MPI_COMM_WORLD, recvRequest + ONE);
MPI_Irecv(mat2,dimension, MPI_FLOAT, 1-rank, 1-rank, MPI_COMM_WORLD, recvRequest + TWO);
for(i=0; i<2; i++)
{
MPI_Waitany(2,recvRequest, &processReceived, status);
printf("Process Received : %d of rank : %d\n", processReceived,rank);
if(processReceived == ONE)
{
printf("%d ",rank);
for(j=0; j<dimension; j++) printf("# %6.1f ",mat1[j]);
printf("\n");
}
if(processReceived == TWO)
{
printf("%d ",rank);
for(j=0; j<dimension; j++) printf("# %6.1f ",mat2[j]);
printf("\n");
}
}
MPI_Waitall(2, sendRequest, status);
free(mat1);
free(mat2);
for(i=0;i<dimension;i++) free(matrix[i]);
free(matrix);
MPI_Type_free(&row);
MPI_Finalize();
return 0;
}
This is my output:
MATRIX OF RANK 1
0.0 2.0 4.0 6.0 8.0
4.0 6.0 8.0 10.0 12.0
8.0 10.0 12.0 14.0 16.0
12.0 14.0 16.0 18.0 20.0
16.0 18.0 20.0 22.0 24.0
MATRIX OF RANK 0
0.0 1.0 2.0 3.0 4.0
2.0 3.0 4.0 5.0 6.0
4.0 5.0 6.0 7.0 8.0
6.0 7.0 8.0 9.0 10.0
8.0 9.0 10.0 11.0 12.0
Process Received : 0 of rank : 0
0 # 0.0 # 0.0 # 12.0 # 14.0 # 16.0
Process Received : 1 of rank : 0
0 # 0.0 # 0.0 # 12.0 # 14.0 # 16.0
Process Received : 0 of rank : 1
1 # 0.0 # 0.0 # 6.0 # 7.0 # 8.0
Process Received : 1 of rank : 1
1 # 0.0 # 0.0 # 6.0 # 7.0 # 8.0
What I think this program does is send one row and one column of each process's matrix to the other process, then print whatever each process received. But as you can see, this is not the output I expected. What I expected is something like:
example of output:
Process Received : 0 of rank : 0
0 # 0.0 # 2.0 # 4.0 # 6.0 # 8.0
Process Received : 1 of rank : 0
0 # 0.0 # 4.0 # 8.0 # 12.0 # 16.0
Can anyone explain what I have misunderstood? This is the only function I use, so you can run it on your own machine. For this example you only need 2 processes:
mpiexec -n 2 ./name_exe

The issue comes from the fact that the data storage in matrix isn't linear. As it is defined in the code at the moment, matrix is an array of pointers, all of which point to some independent memory segments.
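However, your code assumes that &matrix[0][0] points to the beginning of a linearised row-major matrix: the row datatype built with MPI_Type_vector reads one float every dimension elements starting from that address, which runs past the end of the first row's allocation.
To solve the issue, make the storage match that assumption by allocating matrix as one contiguous block: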
matrix = malloc(dimension*sizeof(float*)); //nothing new here
matrix[0] = malloc(dimension*dimension*sizeof(float));
for(i=1; i<dimension; i++) matrix[i]=matrix[i-1]+dimension;
Then you use matrix exactly as before, except for the freeing part, which becomes:
free(matrix[0]);
free(matrix);
With this, the code should work.
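The allocation scheme above can be wrapped in a pair of helpers. This is only a sketch: the names alloc_matrix, free_matrix and copy_column are hypothetical and not part of MPI or of the original code. copy_column walks the flat block with the same count, block length and stride that MPI_Type_vector(dimension, 1, dimension, MPI_FLOAT, &row) describes, so it shows what the derived datatype would pick up once the storage is contiguous.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate a rows x cols float matrix whose elements sit in ONE contiguous
   block, so &m[0][0] really is the start of a row-major array.
   Returns NULL on allocation failure. (Hypothetical helper name.) */
float **alloc_matrix(int rows, int cols)
{
    assert(rows > 0 && cols > 0);
    float **m = malloc((size_t)rows * sizeof *m);
    if (m == NULL)
        return NULL;
    m[0] = malloc((size_t)rows * (size_t)cols * sizeof **m);
    if (m[0] == NULL) {
        free(m);
        return NULL;
    }
    for (int i = 1; i < rows; i++)
        m[i] = m[i - 1] + cols;   /* row i starts cols elements after row i-1 */
    return m;
}

/* Release a matrix obtained from alloc_matrix:
   the data block first, then the array of row pointers. */
void free_matrix(float **m)
{
    if (m != NULL) {
        free(m[0]);
        free(m);
    }
}

/* Copy column `col` into out[0..rows-1] by striding through the flat block:
   one element every `cols` floats, starting at &m[0][col]. */
void copy_column(float **m, int rows, int cols, int col, float *out)
{
    const float *flat = &m[0][0];
    for (int i = 0; i < rows; i++)
        out[i] = flat[(size_t)i * (size_t)cols + (size_t)col];
}
```

With the matrix of rank 1 from the question, copy_column(m, 5, 5, 0, out) fills out with 0, 4, 8, 12, 16, which is the first column the other process was expected to receive.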

Related

How should I make this logic in C?

There is this linear system given by the following 2d array:
1.0 0.0 -1.0 -4.9 -5.9 -6.9 -7.9
0.0 1.0 2.0 4.4 5.4 6.4 7.4
0.0 0.0 0.0 5.7 5.7 -3.3 -3.3
0.0 0.0 0.0 2.9 2.9 2.9 2.9
0.0 0.0 0.0 7.0 -1.0 -3.0 -3.0
0.0 0.0 -20.0 -65.9 -89.9 -100.9 128.9
Whenever I get a 0 on my main diagonal (where the row index equals the column index), I want to change the order of the rows so that there are no zeros on the main diagonal.
In this case, row 2 (counting from 0) should be swapped with row 5 (also counting from 0), because then there are no 0s on the main diagonal.
I'm already doing that, but I'm "deleting" the first line and appending it to the end of the linear system. How should I write this logic so that it knows exactly which rows to swap?
The code is as follows:
void change_order(double linear[6][7], unsigned int qty) {
double aux[100];
// dynamically create an array of pointers of size `m`
double **matrix = (double **)malloc((qty + 1) * sizeof(double *));
// dynamically allocate memory of size `n` for each row
for (int r = 0; r < qty+ 1; r++) {
matrix[r] = (double *)malloc((qty + 1) * sizeof(double));
}
for (int i = 0; i < qty; i++) {
for (int j = 0; j < qty+ 1; j++) {
if (i == 0)
aux[j] = linear[i][j];
}
}
for (int i = 0; i < qty; i++) {
for (int j = 0; j < qty+ 1; j++) {
matrix[i][j] = linear[i][j];
}
}
remove_line(matrix, 0, qty);
for (int i = 0; i < qty; i++) {
for (int j = 0; j < qty+ 1; j++) {
linear[i][j] = matrix[i][j];
}
}
for (int i = 0; i < qty; i++) {
for (int j = 0; j < qty+ 1; j++) {
if (i == qty- 1) {
linear[i][j] = aux[j];
}
}
}
}
void remove_line(double ** linear, int row, unsigned int qty) {
qty--;
free(linear[row]);
while (row < qty) {
linear[row] = linear[row + 1];
row++;
}
}
int main() {
double matrix[][7] = {
{1.0, 0.0, -1.0, -4.9, -5.9, -6.9, -7.9},
{0.0, 1.0, 2.0, 4.4, 5.4, 6.4, 7.4},
{0.0 , 0.0, 0.0, 5.7, 5.7, -3.3, -3.3},
{0.0 , 0.0, 0.0, 2.9, 2.9, 2.9, 2.9},
{0.0 , 0.0, 0.0, 7.0, -1.0, -3.0, -3.0},
{0.0 , 0.0, -20.0, -65.9, -89.9, -100.9, 128.9}
};
change_order(matrix, 6);
}
Example input:
0 3 2 28
4 0 2 24
2 3 0 16
4 2 1 0
Can be exchanged for:
4 0 2 24
2 3 0 16
4 2 1 0
0 3 2 28
If I'm understanding your requirements correctly, would you please try the following:
#include <stdio.h>
#include <stdlib.h>
#define ROWS 6
#define COLS 7
/*
* search for a trade line to be swapped below the n'th row
*/
int search_trade(double matrix[][COLS], int qty, int n)
{
for (int i = n + 1; i < qty; i++) {
if (matrix[i][n] != 0.0) {
return i; // i'th row is a nice trade
}
}
return -1; // not found
}
/*
* swap m'th row and n'th row
*/
void swap(double matrix[][COLS], int qty, int m, int n)
{
int j;
double tmp;
for (j = 0; j < qty + 1; j++) {
tmp = matrix[m][j];
matrix[m][j] = matrix[n][j];
matrix[n][j] = tmp;
}
}
void change_order(double linear[][COLS], int qty) {
for (int i = 0; i < qty; i++) {
if (linear[i][i] == 0.0) { // found 0 in the diagonal
int k = search_trade(linear, qty, i); // search for the trade row
if (k < 0) { // no applicable trade
fprintf(stderr, "cannot find the row to swap. abort.\n");
exit(1);
} else {
swap(linear, qty, i, k); // swap i'th row and k'th row
}
}
}
}
/*
* print the elements of the matrix
*/
void matprint(double matrix[][COLS], int qty)
{
for (int i = 0; i < qty; i++) {
for (int j = 0; j < qty + 1; j++) {
printf("%.2f%s", matrix[i][j], j == qty ? "\n" : " ");
}
}
printf("\n");
}
int main() {
double matrix[][COLS] = {
{1.0, 0.0, -1.0, -4.9, -5.9, -6.9, -7.9},
{0.0, 1.0, 2.0, 4.4, 5.4, 6.4, 7.4},
{0.0 , 0.0, 0.0, 5.7, 5.7, -3.3, -3.3},
{0.0 , 0.0, 0.0, 2.9, 2.9, 2.9, 2.9},
{0.0 , 0.0, 0.0, 7.0, -1.0, -3.0, -3.0},
{0.0 , 0.0, -20.0, -65.9, -89.9, -100.9, 128.9}
};
matprint(matrix, ROWS);
change_order(matrix, ROWS);
matprint(matrix, ROWS);
}
Output:
1.00 0.00 -1.00 -4.90 -5.90 -6.90 -7.90
0.00 1.00 2.00 4.40 5.40 6.40 7.40
0.00 0.00 0.00 5.70 5.70 -3.30 -3.30
0.00 0.00 0.00 2.90 2.90 2.90 2.90
0.00 0.00 0.00 7.00 -1.00 -3.00 -3.00
0.00 0.00 -20.00 -65.90 -89.90 -100.90 128.90
1.00 0.00 -1.00 -4.90 -5.90 -6.90 -7.90
0.00 1.00 2.00 4.40 5.40 6.40 7.40
0.00 0.00 -20.00 -65.90 -89.90 -100.90 128.90
0.00 0.00 0.00 2.90 2.90 2.90 2.90
0.00 0.00 0.00 7.00 -1.00 -3.00 -3.00
0.00 0.00 0.00 5.70 5.70 -3.30 -3.30
You'll see the 2nd row and the 5th row are swapped.
The main concept is:
Seek the diagonal elements for value 0.
If 0 is found, search for a trade row which has a non-zero value in the same column.
If no trade rows are found, the program prints an error message and aborts.
If a trade row is found, swap the rows.
[Edit]
Answering your comment: the code assumes that the number of columns equals the number of rows + 1.
As the example you provided is a 4x4 matrix, let me add an extra column:
double matrix[][COLS] = {
{0, 3, 2, 28, -1},
{4, 0, 2, 24, -1},
{2, 3, 0, 16, -1},
{4, 2, 1, 0, -1}
};
(Please note the value -1 is a dummy value and meaningless so far.)
And modify the #define lines as:
#define ROWS 4
#define COLS 5
Then the program will output:
0.00 3.00 2.00 28.00 -1.00
4.00 0.00 2.00 24.00 -1.00
2.00 3.00 0.00 16.00 -1.00
4.00 2.00 1.00 0.00 -1.00
4.00 0.00 2.00 24.00 -1.00
0.00 3.00 2.00 28.00 -1.00
4.00 2.00 1.00 0.00 -1.00
2.00 3.00 0.00 16.00 -1.00
which shows the rows are properly rearranged having no 0 values in the diagonal.
(BTW, your expected result still has a 0 on the diagonal, in the last row.)
You can also write the logic directly, as follows:
// after copying linear into matrix
int i, j, k;
double temp;
for (i = 0; i < qty; i++)
{
    if (matrix[i][i] != 0)
        continue;
    // here we have found a zero on the diagonal
    for (k = i + 1; k < qty; k++)
    {
        if (matrix[k][i] != 0)
            break;
    }
    // here we have advanced through the rows below until we get a non-zero element
    if (k == qty)
        continue; // no row below can be traded
    for (j = 0; j < qty + 1; j++)
    {
        temp = matrix[i][j];
        matrix[i][j] = matrix[k][j];
        matrix[k][j] = temp;
    }
    // here we have interchanged the values of rows i and k
}

Create image kernel in C freeing memory error

I am trying to create a Gaussian filter kernel in C to do some image processing. I am using a 2D float array on the heap, but when I call free() on the rows I keep getting a free(): invalid pointer error. I have printed the memory locations and values of the filter to standard output, and everything seems to be what I expect.
//kernel->kernel = float **
//kernel->row_len = kernel->col_len = 5
float total_weight = 0.0;
//build the holding col
kernel->kernel = malloc(sizeof(float *) * kernel->col_len);
//get mem for each row and set the values
for (int j = 0; j < kernel->col_len; j++)
{
kernel->kernel[j] = malloc(sizeof(float) * kernel->row_len);
for (int i = 0; i < kernel->row_len; i++)
{
kernel->kernel[j][i] = ken_ComputeGuassianVal(i, j, sigma, size);
total_weight += kernel->kernel[j][i];
}
//print debugging info
printf("Create - %p\n", (kernel->kernel + j));
for (int i = 0; i < kernel->row_len; i++)
{
printf("%d, %d - %f \n", i, j, kernel->kernel[j][i]);
printf("%p\n", (*(kernel->kernel + j) + i));
}
printf("\n");
}
//Normalise the kernel otherwise brightness will be added to the image
for (int j = 0; j < kernel->col_len; j++)
{
for (int i = 0; i < kernel->row_len; i++)
{
kernel->kernel[j][i] /= total_weight;
}
}
for (int j = 0; j < kernel->col_len; j++)
{
printf("Attempting to free memory at location %p\n", (kernel->kernel + j));
free(kernel->kernel + j);
printf("\n");
}
free(kernel->kernel);
Here is the output I am getting to standard output
Create - 0x55aa2a80d4e0
0, 0 - 0.000000
0x55aa2a7ed8d0
1, 0 - 0.000000
0x55aa2a7ed8d4
2, 0 - 0.000001
0x55aa2a7ed8d8
3, 0 - 0.000000
0x55aa2a7ed8dc
4, 0 - 0.000000
0x55aa2a7ed8e0
Create - 0x55aa2a80d4e8
0, 1 - 0.000000
0x55aa2a84c6a0
1, 1 - 0.001083
0x55aa2a84c6a4
2, 1 - 0.034551
0x55aa2a84c6a8
3, 1 - 0.001083
0x55aa2a84c6ac
4, 1 - 0.000000
0x55aa2a84c6b0
Create - 0x55aa2a80d4f0
0, 2 - 0.000001
0x55aa2a7f96a0
1, 2 - 0.034551
0x55aa2a7f96a4
2, 2 - 1.102181
0x55aa2a7f96a8
3, 2 - 0.034551
0x55aa2a7f96ac
4, 2 - 0.000001
0x55aa2a7f96b0
Create - 0x55aa2a80d4f8
0, 3 - 0.000000
0x55aa2a80d510
1, 3 - 0.001083
0x55aa2a80d514
2, 3 - 0.034551
0x55aa2a80d518
3, 3 - 0.001083
0x55aa2a80d51c
4, 3 - 0.000000
0x55aa2a80d520
Create - 0x55aa2a80d500
0, 4 - 0.000000
0x55aa2a7eddf0
1, 4 - 0.000000
0x55aa2a7eddf4
2, 4 - 0.000001
0x55aa2a7eddf8
3, 4 - 0.000000
0x55aa2a7eddfc
4, 4 - 0.000000
0x55aa2a7ede00
Destroy - 0x55aa2a7ed8d0
Attempting to free memory at location 0x55aa2a80d4e0
Destroy - 0x55aa2a84c6a0
Attempting to free memory at location 0x55aa2a80d4e8
free(): invalid pointer
[1] 13936 abort ./dipcw
I have tried both array notation kernel->kernel[j] and (kernel->kernel + j). I am using elementary linux 5.0 and gcc version 7.3.0 (Ubuntu 7.3.0-27ubuntu1~18.04)
Edit: Changed the stop condition variable in the normalization loop to use the same variables as the other loops. Added free statement for the double pointer at the end
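The rule is: there should be a corresponding free() for every malloc(), and each free() must be given exactly the pointer that malloc() returned. In the code as posted, free(kernel->kernel + j) passes the address of the j-th slot of the pointer array rather than the row pointer stored in that slot (kernel->kernel[j]). For j == 0 that address is the pointer array itself, so the first call frees the whole array; for j == 1 it points into the middle of that block, which glibc reports as free(): invalid pointer.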
struct matrix {
unsigned nrow;
unsigned ncol;
float ** ptrs;
} data;
/* allocate */
unsigned row, col;
data.ptrs = malloc (data.nrow * sizeof *data.ptrs); // <<< [A]
for (row=0; row < data.nrow; row++) { // <<< [B]
data.ptrs[row] = malloc (data.ncol * sizeof *data.ptrs[0]); // <<< [C]
}
/* free: there should be a corresponding free() for every malloc(),
but in the "inside-out" order (row is still in scope from above):
*/
for (row=0; row < data.nrow; row++) { // <<< [B]
free( data.ptrs[row] ); // <<< [C]
}
free(data.ptrs); // <<< [A]
Note: for simplicity, I swapped rows/columns, and I used matrix.field instead of pointer->field.
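To make the pairing of malloc() and free() explicit, here is a minimal sketch with two hypothetical helpers, rows_alloc and rows_free (these names are not from the question). The comment in rows_free marks the one line that differs from the failing code: it frees the value stored in slot r, not the address of slot r.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate nrow row pointers plus one zero-filled block per row.
   Returns NULL on failure, after releasing anything already allocated. */
float **rows_alloc(unsigned nrow, unsigned ncol)
{
    assert(nrow > 0 && ncol > 0);
    float **ptrs = malloc(nrow * sizeof *ptrs);
    if (ptrs == NULL)
        return NULL;
    for (unsigned r = 0; r < nrow; r++) {
        ptrs[r] = calloc(ncol, sizeof *ptrs[r]);
        if (ptrs[r] == NULL) {
            while (r > 0)
                free(ptrs[--r]);   /* undo the rows allocated so far */
            free(ptrs);
            return NULL;
        }
    }
    return ptrs;
}

/* Release everything rows_alloc created, rows first ("inside-out"). */
void rows_free(float **ptrs, unsigned nrow)
{
    if (ptrs == NULL)
        return;
    for (unsigned r = 0; r < nrow; r++)
        free(ptrs[r]);   /* the pointer the allocator returned for row r;
                            free(ptrs + r) would pass the address of the slot */
    free(ptrs);          /* finally the array of row pointers itself */
}
```

Note that ptrs + 0 has the same address as ptrs, which is why the first iteration of the failing loop appears to succeed: it releases the pointer array, and the next iteration then passes an address inside that already freed block.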

Program to print and display identity matrix in C

I am having trouble writing a program that prints a matrix and then generates the identity matrix. Here is my code below; any help would be greatly appreciated.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int PrintMatrix(int dim, double matrix[dim][dim]);
int main()
int PrintMatrix(int dim, double matrix[dim][dim]) {
int aa, bb;
for (aa = 0; aa <= dim; aa++) {
for (bb = 0; bb <= dim; bb++) {
printf("%lf ", matrix[aa][bb]);
}
printf("\n");
}
}
double TestMatrix[7][7] = {
{1,0,0,0,0,0,0},
{0,1,0,0,0,0,0},
{0,0,1,0,0,0,0},
{0,0,0,1,0,0,0},
{0,0,0,0,1,0,0},
{0,0,0,0,0,1,0},
{0,0,0,0,0,0,1}
};
PrintMatrix(7, TestMatrix);
return 0;
Your code won't compile successfully.
After main there is no opening brace.
You are defining a function inside main, which is not allowed in standard C.
Check the braces throughout the code.
I also changed the loop conditions from <= to <, so that the loops stay within the array bounds.
Here is the modified code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int PrintMatrix(int dim, double matrix[dim][dim]);
int main()
{
double TestMatrix[7][7] = {
{1,0,0,0,0,0,0},
{0,1,0,0,0,0,0},
{0,0,1,0,0,0,0},
{0,0,0,1,0,0,0},
{0,0,0,0,1,0,0},
{0,0,0,0,0,1,0},
{0,0,0,0,0,0,1},
};
PrintMatrix(7, TestMatrix);
return 0;
}
int PrintMatrix(int dim, double matrix[dim][dim]) {
int aa, bb;
for (aa = 0; aa < dim; aa++) {
for (bb = 0; bb < dim; bb++) {
printf("%lf ", matrix[aa][bb]);
}
printf("\n");
}
return 0;
}
The code in the question is an appalling non-compiling mess. One of the comments is:
It still isn't returning the identity for dim = 2 up to 7; any thoughts?
As BluePixy hinted, if you lie to your compiler about the size of the input matrix to the function, for example by passing a 7x7 matrix but telling that it has a 3x3 matrix, it gets its revenge by printing different information from what you wanted. Don't lie to the compiler!
If you want to print identity matrices of sizes 1..7 from a 7x7 matrix, tell the compiler (function) both the actual size of the matrix and the size you want printed. For an identity matrix, you don't actually need the original matrix — you could synthesize the data.
#include <stdio.h>
static void printIdentityMatrix(int size)
{
for (int i = 0; i < size; i++)
{
for (int j = 0; j < size; j++)
printf("%4.1f", (i == j) ? 1.0 : 0.0);
putchar('\n');
}
}
int main(void)
{
for (int i = 1; i < 8; i++)
printIdentityMatrix(i);
return 0;
}
For printing the top left square subset of an arbitrarily sized square matrix, you must pass both the size of the data to be printed and the actual size of the matrix.
#include <assert.h>
#include <stdio.h>
static void PrintMatrix(int size, int dim, double matrix[dim][dim])
{
assert(size <= dim);
for (int aa = 0; aa < size; aa++)
{
for (int bb = 0; bb < size; bb++)
printf("%lf ", matrix[aa][bb]);
putchar('\n');
}
}
int main(void)
{
double TestMatrix[7][7] =
{
{1,0,0,0,0,0,0},
{0,1,0,0,0,0,0},
{0,0,1,0,0,0,0},
{0,0,0,1,0,0,0},
{0,0,0,0,1,0,0},
{0,0,0,0,0,1,0},
{0,0,0,0,0,0,1},
};
for (int i = 1; i < 8; i++)
{
PrintMatrix(i, 7, TestMatrix);
putchar('\n');
}
return 0;
}
Printing an arbitrary rectangular submatrix of an arbitrarily sized rectangular matrix requires many more function parameters (7 if I am counting correctly):
void PrintSubMatrix(int x_off, int y_off, int x_len, int y_len, int x_size, int y_size,
double matrix[x_size][y_size]);
and that's before you specify the file stream to write on.
#include <assert.h>
#include <stdio.h>
static void PrintSubMatrix(int x_off, int y_off, int x_len, int y_len, int x_size, int y_size,
double matrix[x_size][y_size])
{
assert(x_off >= 0 && x_off < x_size && x_off + x_len <= x_size);
assert(y_off >= 0 && y_off < y_size && y_off + y_len <= y_size);
printf("SubMatrix size %dx%d at (%d,%d) in M[%d][%d]\n",
x_len, y_len, x_off, y_off, x_size, y_size);
for (int x = x_off; x < x_off + x_len; x++)
{
for (int y = y_off; y < y_off + y_len; y++)
printf("%4.1f ", matrix[x][y]);
putchar('\n');
}
putchar('\n');
}
int main(void)
{
double TestMatrix[7][9] =
{
{ 1, 2, 3, 4, 3, 2, 1, 2, 3 },
{ 2, 1, 9, 8, 4, 6, 0, 0, 1 },
{ 3, 0, 8, 7, 5, 5, 0, 0, 1 },
{ 4, 0, 5, 6, 6, 8, 4, 4, 4 },
{ 5, 0, 1, 4, 7, 9, 0, 0, 1 },
{ 6, 0, 1, 0, 8, 1, 0, 0, 1 },
{ 7, 0, 0, 0, 9, 0, 1, 0, 1 },
};
PrintSubMatrix(0, 0, 7, 9, 7, 9, TestMatrix);
for (int i = 1; i < 4; i++)
{
for (int j = 2; j < 4; j++)
PrintSubMatrix(i, j, 3 + j - i, i + j, 7, 9, TestMatrix);
}
return 0;
}
Sample run:
SubMatrix size 7x9 at (0,0) in M[7][9]
1.0 2.0 3.0 4.0 3.0 2.0 1.0 2.0 3.0
2.0 1.0 9.0 8.0 4.0 6.0 0.0 0.0 1.0
3.0 0.0 8.0 7.0 5.0 5.0 0.0 0.0 1.0
4.0 0.0 5.0 6.0 6.0 8.0 4.0 4.0 4.0
5.0 0.0 1.0 4.0 7.0 9.0 0.0 0.0 1.0
6.0 0.0 1.0 0.0 8.0 1.0 0.0 0.0 1.0
7.0 0.0 0.0 0.0 9.0 0.0 1.0 0.0 1.0
SubMatrix size 4x3 at (1,2) in M[7][9]
9.0 8.0 4.0
8.0 7.0 5.0
5.0 6.0 6.0
1.0 4.0 7.0
SubMatrix size 5x4 at (1,3) in M[7][9]
8.0 4.0 6.0 0.0
7.0 5.0 5.0 0.0
6.0 6.0 8.0 4.0
4.0 7.0 9.0 0.0
0.0 8.0 1.0 0.0
SubMatrix size 3x4 at (2,2) in M[7][9]
8.0 7.0 5.0 5.0
5.0 6.0 6.0 8.0
1.0 4.0 7.0 9.0
SubMatrix size 4x5 at (2,3) in M[7][9]
7.0 5.0 5.0 0.0 0.0
6.0 6.0 8.0 4.0 4.0
4.0 7.0 9.0 0.0 0.0
0.0 8.0 1.0 0.0 0.0
SubMatrix size 2x5 at (3,2) in M[7][9]
5.0 6.0 6.0 8.0 4.0
1.0 4.0 7.0 9.0 0.0
SubMatrix size 3x6 at (3,3) in M[7][9]
6.0 6.0 8.0 4.0 4.0 4.0
4.0 7.0 9.0 0.0 0.0 1.0
0.0 8.0 1.0 0.0 0.0 1.0
It would be better if the code was fixed not to print a blank at the end of each line; that's left as an exercise for the reader.
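One possible approach to that exercise is to emit the separator before every element except the first, rather than after every element. The sketch below formats one row into a caller-supplied buffer so that the result can be checked; format_row is a hypothetical helper, not part of the code above.

```c
#include <assert.h>
#include <stdio.h>

/* Format n values into buf as "%4.1f" fields separated by single blanks,
   with no blank after the last value. Returns the number of characters
   written (excluding the terminating NUL), or -1 if buf is too small. */
int format_row(char *buf, size_t cap, int n, const double row[])
{
    assert(buf != NULL && cap > 0 && n >= 0);
    size_t used = 0;
    buf[0] = '\0';
    for (int i = 0; i < n; i++) {
        /* separator goes in front of every element except the first */
        int len = snprintf(buf + used, cap - used, "%s%4.1f",
                           i == 0 ? "" : " ", row[i]);
        if (len < 0 || (size_t)len >= cap - used)
            return -1;   /* output error or truncated output */
        used += (size_t)len;
    }
    return (int)used;
}
```

Inside PrintSubMatrix, the inner loop could then be replaced by a call such as format_row(line, sizeof line, y_len, &matrix[x][y_off]) followed by puts(line), assuming a local char line[] buffer that is large enough for one row.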

MPI C Tree Structured Global Sum

How do I pair processes using MPI in C? It's a tree structured approach. Process 0 should be adding from all of the other even processes, which they are paired with. I only need it to work for powers of 2.
Should I be using MPI_Reduce instead of MPI Send/Receive? If so, why?
My program doesn't seem to get past for loop inside the first if statement. Why?
#include <stdio.h>
#include <time.h>
#include <stdlib.h>
#include <mpi.h>
int main(void){
int sum, comm_sz, my_rank, i, next, value;
int divisor = 2;
int core_difference = 1;
MPI_Init(NULL, NULL);
MPI_Comm_size(MPI_COMM_WORLD, &comm_sz);
MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
srandom((unsigned)time(NULL) + my_rank);
value = random() % 10;
//process should recieve and add
if (my_rank % divisor == 0){
printf("IF----");
printf("Process %d generates: %d\n", my_rank, value);
for (i = 0; i < comm_sz; i++)
{
MPI_Recv(&value, 1, MPI_INT, i, my_rank , MPI_COMM_WORLD, MPI_STATUS_IGNORE);
sum += value;
printf("Current Sum=: %d\n", sum);
}
printf("The NEW divisor is:%d\n", divisor);
divisor *= 2;
core_difference *= 2;
}
//sending the random value - no calculation
else if (my_rank % divisor == core_difference){
printf("ELSE----");
printf("Process %d generates: %d\n", my_rank, value);
MPI_Send(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
}
else
if (my_rank==0)
printf("Sum=: %d\n", sum);
MPI_Finalize();
return 0;
}
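The problem is that every receiving process starts by receiving from rank 0: the loop begins at i = 0, so the first MPI_Recv on each even rank names rank 0 as its source, and rank 0 itself never sends. If I add a print statement before each send and receive with the processes involved in the operation, here's the output: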
$ mpiexec -n 8 ./a.out
IF----Process 0 generates: 5
ELSE----Process 1 generates: 1
ELSE----Process 3 generates: 1
IF----Process 4 generates: 9
ELSE----Process 5 generates: 7
IF----Process 6 generates: 2
ELSE----Process 7 generates: 0
0 RECV FROM 0
1 SEND TO 0
3 SEND TO 0
4 RECV FROM 0
5 SEND TO 0
6 RECV FROM 0
7 SEND TO 0
IF----Process 2 generates: 7
2 RECV FROM 0
1 SEND TO 0 DONE
3 SEND TO 0 DONE
5 SEND TO 0 DONE
7 SEND TO 0 DONE
Obviously, everyone is hanging while waiting for rank 0, including rank 0. If you want to send to yourself, you'll need to use either MPI_Sendrecv to do both the send and receive at the same time or use nonblocking sends and receives (MPI_Isend/MPI_Irecv).
As you said, another option would be to use collectives, but if you do that, you'll need to create new subcommunicators. Collectives require all processes in the communicator to participate. You can't pick just a subset.
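For the pairing itself, here is a sketch of the usual tree scheme, written as plain C without MPI calls so that the logic can be checked on its own. The function names tree_partner and tree_sum are hypothetical. In round stride = 1, 2, 4, ..., a rank that is a multiple of 2 * stride receives from rank + stride, and a rank whose remainder is stride sends to rank - stride and is then finished. In a real program the line marked below would be an MPI_Recv from partner followed by the addition, the sender would call MPI_Send to partner, and both would have to use the same tag.

```c
#include <assert.h>

/* Role of `rank` in the round with the given stride (1, 2, 4, ...).
   Returns the partner rank and sets *receives to 1 for a receiver or
   0 for a sender; returns -1 if the rank has nothing to do this round. */
int tree_partner(int rank, int stride, int comm_sz, int *receives)
{
    assert(rank >= 0 && rank < comm_sz && stride > 0);
    *receives = 0;
    if (rank % (2 * stride) == 0) {
        *receives = 1;
        return (rank + stride < comm_sz) ? rank + stride : -1;
    }
    if (rank % (2 * stride) == stride)
        return rank - stride;
    return -1;   /* this rank already sent its value in an earlier round */
}

/* Serial simulation of the whole reduction: values[r] plays the role of
   the local value on rank r. The total ends up in values[0]. */
int tree_sum(int values[], int comm_sz)
{
    for (int stride = 1; stride < comm_sz; stride *= 2) {
        for (int rank = 0; rank < comm_sz; rank++) {
            int receives;
            int partner = tree_partner(rank, stride, comm_sz, &receives);
            if (partner >= 0 && receives)
                values[rank] += values[partner];   /* stands in for MPI_Recv + add */
        }
    }
    return values[0];
}
```

With 8 processes this gives three rounds: 1, 3, 5, 7 send to 0, 2, 4, 6; then 2 and 6 send to 0 and 4; finally 4 sends to 0. No rank ever names itself as a source, which avoids the hang described above.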

Intel MKL Sparse BLAS mm CSR one-based indexing not working

I am testing the functions of Intel MKL in a C test program, and I found that I just can't make CSR one-based indexing work with the Sparse BLAS function mkl_scsrmm. I am using the CSR variation with val, columns, pntrb and pntre. The original examples were located in:
"...mkl\examples\examples_core_c\spblasc\source\cspblas_scsr.c"
This is the first code, shown here with the one-based settings (columns and rowIndex start at 1, and matdescra[3] is 'F'):
Example #1
#include <stdio.h>
#include "mkl_types.h"
#include "mkl_spblas.h"
int main() {
#define M 2
#define NNZ 4
MKL_INT m = M, nnz = NNZ;
float values[NNZ] = {2.0,4.0,4.0,2.0};
MKL_INT columns[NNZ] = {1,2,1,2};
MKL_INT rowIndex[M+1] = {1,3,5};
#define N 2
MKL_INT n = N;
float b[M][N] = {2.0, 1.0, 5.0, 2.0};
float c[M][N] = {0.0, 0.0, 0.0, 0.0};
float alpha = 1.0, beta = 0.0;
char transa, uplo, nonunit;
char matdescra[6];
MKL_INT i, j, is;
transa = 'N';
matdescra[0] = 'S';
matdescra[1] = 'L';
matdescra[2] = 'N';
matdescra[3] = 'F';
mkl_scsrmm(&transa, &m, &n, &m, &alpha, matdescra, values, columns, rowIndex, &(rowIndex[1]), &(b[0][0]), &n, &beta, &(c[0][0]), &n);
printf(" \n");
printf(" OUTPUT DATA FOR MKL_SCSRMM\n");
for (i = 0; i < m; i++) {
for (j = 0; j < n; j++) {
printf("%7.1f", c[i][j]);
};
printf("\n");
};
return 0;
}
The results I get are this:
Zero-based indexing (the right one):
24.0 10.0
18.0 8.0
One-based indexing:
8.0 10.0
18.0 24.0
Though it looks as if only the positions of the diagonal elements change, with a 3x3 matrix the solution is completely different from the right one. I suspected that it might have something to do with the input format of the matrix b. I think the description of the array b for the function mkl_scsrmm in the MKL reference manual lacks clarity. So I changed the format of b in this example and it worked (I placed the elements in this order: {2.0, 5.0, 1.0, 2.0}). But I did the same for another 3x3 example I coded and it didn't work, so I think it may be just a coincidence. I don't really know what to do about this problem, and I would like to understand what happens here.
References:
CSR format
https://software.intel.com/en-us/node/471374
Sparse BLAS mkl_scsrmm function
https://software.intel.com/sites/products/documentation/doclib/iss/2013/mkl/mklman/hh_goto.htm#GUID-78C55D9B-86FF-4A9F-B5D5-D2F61B9314FC.htm
Sparse BLAS Interface Considerations
https://software.intel.com/sites/products/documentation/doclib/iss/2013/mkl/mklman/hh_goto.htm#GUID-34C8DB79-0139-46E0-8B53-99F3BEE7B2D4.htm
And here is another example, the 3x3 one:
Example #2
// matrix A
//
// 2 4 3
// 4 2 1
// 3 1 6
//
// matrix B
//
// 2 1 3
// 4 5 6
// 7 8 9
//
// ZERO-BASED INDEXING
//
// a = {2 4 3 4 2 1 3 1 6}
// columns= {0 1 2 0 1 2 0 1 2}
// indexRow = {0 3 6 9}
//
// b = {2 1 3 4 5 6 7 8 9} (row order array)
//
// We print the array in row-major order
//
// ONE-BASED INDEXING
//
// a = {2 4 3 4 2 1 3 1 6}
// columns={1 2 3 1 2 3 1 2 3}
// indexRow = {1 4 7 10}
//
// b = {2 4 7 1 5 8 3 6 9} (column order array)
//
// We print the array in column-major order (because the result is in column-major order, i.e. transposed)
//
//
//
#include <stdio.h>
#include "mkl_types.h"
#include "mkl_spblas.h"
int main()
{
#define M 3
#define NNZ 9
#define N 3
MKL_INT m = M, nnz = NNZ, n=N;
float a[NNZ] = {2.0,4.0,3.0,4.0,2.0,1.0,3.0,1.0,6.0};
MKL_INT columns[NNZ] = {0,1,2,0,1,2,0,1,2};
MKL_INT rowIndex[M+1] = {0,3,6,9};
float b[M][N] = {2.0, 1.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0};
float c[M][N] = {0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0};
float alpha = 1.0, beta = 0.0;
MKL_INT i, j;
char transa;
char matdescra[6];
float a1[NNZ] = {2.0,4.0,3.0,4.0,2.0,1.0,3.0,1.0,6.0};
MKL_INT columns1[NNZ] = {1,2,3,1,2,3,1,2,3};
MKL_INT rowIndex1[M+1] = {1,4,7,10};
float b1[M][N] = {2.0, 4.0, 7.0, 1.0, 5.0, 8.0, 3.0, 6.0, 9.0};
float c1[M][N] = {0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0};
//********************************
//ZERO-BASED INDEXING
//********************************
transa = 'n';
matdescra[0] = 's';
matdescra[1] = 'l';
matdescra[2] = 'n';
matdescra[3] = 'c';
mkl_scsrmm(&transa, &m, &n, &m, &alpha, matdescra, a, columns, rowIndex, &(rowIndex[1]), &(b[0][0]), &n, &beta, &(c[0][0]), &n);
printf(" \n");
printf(" Right Solution: ZERO-BASED: C \n");
for (i = 0; i < m; i++)
{
for (j = 0; j < n; j++) {
printf("%7.1f", c[i][j]);
};
printf("\n");
};
printf(" \n");
printf(" ZERO-BASED: C' \n");
for (i = 0; i < m; i++)
{
for (j = 0; j < n; j++)
{
printf("%7.1f", c[j][i]);
};
printf("\n");
};
//********************************
//ONE-BASED INDEXING
//********************************
matdescra[3] = 'f';
mkl_scsrmm(&transa, &m, &n, &m, &alpha, matdescra, a1, columns1, rowIndex1, &(rowIndex1[1]), &(b1[0][0]), &n, &beta, &(c1[0][0]), &n);
printf(" \n");
printf(" ONE-BASED: C \n");
for (i = 0; i < m; i++)
{
for (j = 0; j < n; j++)
{
printf("%7.1f", c1[i][j]);
};
printf("\n");
};
printf(" \n");
printf(" ONE-BASED: C' \n");
for (i = 0; i < m; i++)
{
for (j = 0; j < n; j++)
{
printf("%7.1f", c1[j][i]);
};
printf("\n");
};
return 0;
}
I asked the same question on Intel's forum, got some help there, and arrived at the solution. The point is that when you call the routine from the C interface with zero-based indexing, you can pass the matrices stored in row-major order (native C array storage), whereas with one-based indexing you have to store them in column-major order. This changes the way the matrices B and C need to be stored, and therefore the way the result is stored. For the matrix A, only the indexing changes (from 0 to 1). From Intel's documentation you might think that the C interface always accepts row-major ordering for both types of indexing.
Notice that, in general, the column-major array of a matrix is not the same as its row-major array: it holds the elements in the order of the transposed matrix, so for non-square matrices the array shape and the leading dimension change as well.
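As an illustration of the storage difference, here is a small sketch that copies a dense matrix from row-major order into column-major order. row_to_col_major is a hypothetical helper, not an MKL routine. Applied to the 3x3 matrix B of Example #2 it reproduces the array b1 used for the one-based call, and the same loop works for non-square matrices.

```c
#include <assert.h>

/* Copy a rows x cols matrix from row-major order (native C layout) into
   column-major order. src and dst must not overlap, and each must hold
   rows * cols elements. */
void row_to_col_major(int rows, int cols, const float *src, float *dst)
{
    assert(rows > 0 && cols > 0 && src != dst);
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < cols; j++)
            dst[j * rows + i] = src[i * cols + j];   /* element (i, j) */
}
```

For a 2x3 matrix stored row-major as {1, 2, 3, 4, 5, 6}, the column-major copy is {1, 4, 2, 5, 3, 6}. The result matrix C from a one-based call would need the reverse conversion before being printed with c[i][j].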
