Sending 2D Int Array with MPI_Send and Recv - c

I am trying to send a 2D integer array of arbitrary length from the slave processes to the master, but I keep getting a segmentation fault. Since MPI is quite difficult to debug, I'm not certain that the issue lies in the send/recv; if it doesn't, it must be in the way I am allocating the arrays themselves.
I followed a previous question here on ensuring that the memory allocated to the array is contiguous, but that still didn't fix the segmentation fault.
Below are some sections of my code:
Create array:
int** create2DArray(int sizeX, int sizeY)
{
    int* data = (int *) malloc(sizeX * sizeY * sizeof(int));
    int** array = (int **) malloc(sizeX * sizeof(int*));
    int i;
    for (i=0; i<sizeX; i++)
    {
        array[i] = &(data[sizeY * i]);
    }
    return array;
}
Initialise arrays:
if(rank==0)
{
    display = x11setup(&win, &gc, width, height);
    pixels = create2DArray(X_RESN, Y_RESN);
}
else
{
    xStart = xPixels * (rank - 1);
    xFinish = xStart + xPixels;
    pixels = create2DArray(xPixels, Y_RESN);
}
Send:
MPI_Send(&pixels[0][0], xPixels * Y_RESN, MPI_INT, 0, type, MPI_COMM_WORLD);
Recv:
for(i = 1; i < processes; i++)
{
    int** pixelChunk = create2DArray(xPixels, Y_RESN);
    MPI_Recv(&pixelChunk[0][0], xPixels * Y_RESN, MPI_INT, i, type, MPI_COMM_WORLD, &status);
    int xStart = xPixels * (i - 1);
    int xFinish = xStart + xPixels;
    int k;
    for(j = xStart; j < xFinish; j++)
    {
        for(k = 0; k < Y_RESN; k++)
        {
            pixels[j][k] = pixelChunk[j - (xPixels * i - 1)][k];
        }
    }
}

This line looks suspicious:
pixels[j][k] = pixelChunk[j - (xPixels * i - 1)][k];
For example, say we have np = 2, so we're left with a single chunk, then
i = 1;
xStart = 0;
j = 0;
xPixels = 600;
pixelChunk[0 - (600 * 1 - 1)][k] == pixelChunk[-599][k]
Doesn't look right, does it?
Did you mean this?
pixels[j][k] = pixelChunk[j - xPixels * (i - 1)][k];
The send/recv code is probably all right.
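The corrected index can be sanity-checked in isolation. The sketch below is a minimal illustration, assuming the same layout as the question (worker ranks start at 1 and each owns xPixels consecutive rows). The names chunk_row and copy_chunk are hypothetical helpers, not part of the original code.

```c
#include <assert.h>
#include <string.h>

/* Map a row of the master's image (global index j) to the row inside the
 * chunk received from `worker`. Worker ranks start at 1; rank 0 is the master. */
int chunk_row(int j, int xPixels, int worker)
{
    return j - xPixels * (worker - 1);
}

/* Copy one received chunk into its slot in the full image.
 * Hypothetical helper; both arrays are indexed as array[row][column]. */
void copy_chunk(int **pixels, int **pixelChunk, int worker,
                int xPixels, int yResn)
{
    int xStart = xPixels * (worker - 1);
    for (int j = xStart; j < xStart + xPixels; j++) {
        int local = chunk_row(j, xPixels, worker);
        /* The original expression j - (xPixels * i - 1) would trip this check. */
        assert(local >= 0 && local < xPixels);
        memcpy(pixels[j], pixelChunk[local], (size_t)yResn * sizeof(int));
    }
}
```

With np = 2 and xPixels = 600, chunk_row(0, 600, 1) is 0, where the original expression produced -599. Since create2DArray makes two allocations, each pixelChunk created in the receive loop would also need free(pixelChunk[0]) followed by free(pixelChunk) to avoid leaking.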

Related

C - Function to allocate dynamically a 3 dimension array (using malloc)

I created this function to dynamically allocate a 3D array:
int ***create_3D_Array(int nb_block, int nb_lin, int nb_col) {
    int i, j;
    int ***A = (int***)malloc(nb_block * sizeof(int**));
    for (i = 0; i <nb_col; i++) {
        A[i] = (int**)malloc(nb_col * sizeof(int*));
        for (j = 0; j < nb_lin; j++) {
            A[i][j] = (int*)malloc(nb_lin * sizeof(int));
        }
    }
    return A;
}
I then used it here:
int ***all_blocks = NULL;
all_blocks = create_3D_Array(54, 5, 5);
However, it does not work correctly: when I try to assign a value in my 6th block, all_blocks[5], the program stops working.
Is there an error in my function?
The dimensions are incorrect in your allocation loops. The outer loop should run to nb_block, the second malloc should allocate nb_lin * sizeof(int*) and the third malloc should allocate nb_col * sizeof(int).
Here is a corrected version:
int ***create_3D_Array(int nb_block, int nb_lin, int nb_col) {
    int i, j;
    int ***A = (int***)malloc(nb_block * sizeof(int**));
    for (i = 0; i < nb_block; i++) {
        A[i] = (int**)malloc(nb_lin * sizeof(int*));
        for (j = 0; j < nb_lin; j++) {
            A[i][j] = (int*)malloc(nb_col * sizeof(int));
        }
    }
    return A;
}
Note that it might be simpler to use a direct 3D array:
int (*all_blocks)[5][5] = malloc(54 * sizeof(*all_blocks));
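The direct 3D array can be wrapped in a small helper. This is a sketch only, assuming the two inner dimensions are fixed at compile time as in the line above; block_t, NB_LIN, NB_COL and create_blocks are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdlib.h>

enum { NB_LIN = 5, NB_COL = 5 };       /* inner dimensions, fixed at compile time */
typedef int block_t[NB_LIN][NB_COL];   /* hypothetical name for one 5x5 block */

/* Allocate nb_block zero-initialised blocks in one contiguous chunk.
 * Returns NULL on failure; release the result with a single free(). */
block_t *create_blocks(size_t nb_block)
{
    assert(nb_block > 0);
    return calloc(nb_block, sizeof(block_t));
}
```

The result is indexed exactly like the pointer-based version, for example blocks[5][4][4] = 7, but there is only one allocation to check and one free(blocks) at the end.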

How to handle MPI sendcount of zero

What is the correct way to handle a sendcount = 0 when using MPI_Gatherv (or any other function that requires a sendcount) when setting up the displs argument?
I have data that needs to be received by all processors, but all processors might not have any data to send themselves. As an MWE, I tried (on just two processors):
#include <stdlib.h>
#include <stdio.h>
#include <mpi.h>
int main(void)
{
    int ntasks;
    int thistask;
    int n = 0;
    int i;
    int totcounts = 0;
    int *data;
    int *rbuf;
    int *rcnts;
    int *displs;
    int *master_data;
    int *master_displs;
    // Set up mpi
    MPI_Init(NULL, NULL);
    MPI_Comm_size(MPI_COMM_WORLD, &ntasks);
    MPI_Comm_rank(MPI_COMM_WORLD, &thistask);
    // Allocate memory for arrays needed by allgatherv
    rbuf = calloc(ntasks, sizeof(int));
    rcnts = calloc(ntasks, sizeof(int));
    displs = calloc(ntasks, sizeof(int));
    master_displs = calloc(ntasks, sizeof(int));
    // Initialize the counts and displacement arrays
    for(i = 0; i < ntasks; i++)
    {
        rcnts[i] = 1;
        displs[i] = i;
    }
    // Allocate data on just one task, but not others
    if(thistask == 1)
    {
        n = 3;
        data = calloc(n, sizeof(int));
        for(i = 0; i < n; i++)
        {
            data[i] = i;
        }
    }
    // Get n so each other processor knows about what others are sending
    MPI_Allgatherv(&n, 1, MPI_INT, rbuf, rcnts, displs, MPI_INT, MPI_COMM_WORLD);
    // Now that we know how much data each processor is sending, we allocate the array
    // to hold it all
    for(i = 0; i < ntasks; i++)
    {
        totcounts += rbuf[i];
    }
    master_data = calloc(totcounts, sizeof(int));
    // Get displs for master data
    master_displs[0] = 0;
    for(i = 1; i < ntasks; i++)
    {
        master_displs[i] = master_displs[i - 1] + rbuf[i - 1];
    }
    // Send each processor's data to all others
    MPI_Allgatherv(&data, n, MPI_INT, master_data, rbuf, master_displs, MPI_INT, MPI_COMM_WORLD);
    // Print it out to see if it worked
    if(thistask == 0)
    {
        for(i = 0; i < totcounts; i++)
        {
            printf("master_data[%d] = %d\n", i, master_data[i]);
        }
    }
    // Free
    if(thistask == 1)
    {
        free(data);
    }
    free(rbuf);
    free(rcnts);
    free(displs);
    free(master_displs);
    free(master_data);
    MPI_Finalize();
    return 0;
}
The way that I've set up master_displs works when every processor has a non-zero n (that is, they have data to send). In this case, both entries will be zero. However, the results of this program are garbage. How would I set up the master_displs array to ensure that master_data holds the correct information (in this case, just master_data[i] = i, as received from task 1)?
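The thread records no answer, so what follows is a hedged sketch rather than a confirmed fix. The exclusive prefix sum that the question already uses for master_displs stays valid when some counts are zero: a rank that sends nothing simply gets the same displacement as the next rank, and nothing is written there. The helper name build_displs is hypothetical.

```c
#include <assert.h>

/* Fill displs with the exclusive prefix sum of counts and return the total
 * number of elements. Hypothetical helper; zero counts are allowed. */
int build_displs(const int *counts, int *displs, int ntasks)
{
    int total = 0;
    for (int i = 0; i < ntasks; i++) {
        assert(counts[i] >= 0);
        displs[i] = total;    /* where rank i's data starts in the receive buffer */
        total += counts[i];   /* a zero count leaves the offset unchanged */
    }
    return total;
}
```

For the two-rank example (counts 0 and 3) this gives displacements 0 and 0 and a total of 3, which matches what the question computes. One thing that seems worth checking instead, as an assumption about the cause of the garbage output: the second MPI_Allgatherv passes &data, the address of the pointer variable, where the send buffer data itself would be expected, and data is left uninitialised on ranks that allocate nothing.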

MPI Subarray Sending Error

I first initialize a 4x4 matrix and then try to send its first 2x2 block to the slave process using MPI in C. However, the slave process receives only the first row of the block; the second row is filled with random numbers from memory. I can't find what is missing. The code is below:
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#define SIZE 4
int main(int argc, char** argv)
{
    int rank, nproc;
    const int root = 0;
    const int tag = 3;
    int** table;
    int* datas;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nproc);
    datas = malloc(SIZE * SIZE * sizeof(int));
    table = malloc(SIZE * sizeof(int*));
    for (int i = 0; i < SIZE; i++)
        table[i] = &(datas[i * SIZE]);
    for (int i = 0; i < SIZE; i++)
        for (int k = 0; k < SIZE; k++)
            table[i][k] = 0;
    table[0][1] = 1;
    table[0][2] = 2;
    table[1][0] = 3;
    table[2][3] = 2;
    table[3][1] = 3;
    table[3][2] = 4;
    if (rank == root){
        MPI_Datatype newtype;
        int sizes[2] = { 4, 4 }; // size of table
        int subsizes[2] = { 2, 2 }; // size of sub-region
        int starts[2] = { 0, 0 };
        MPI_Type_create_subarray(2, sizes, subsizes, starts, MPI_ORDER_C, MPI_INT, &newtype);
        MPI_Type_commit(&newtype);
        MPI_Send(&(table[0][0]), 1, newtype, 1, tag, MPI_COMM_WORLD);
    }
    else{
        int* local_datas = malloc(SIZE * SIZE * sizeof(int));
        int** local = malloc(SIZE * sizeof(int*));
        for (int i = 0; i < SIZE; i++)
            local[i] = &(local_datas[i * SIZE]);
        // A single receive takes MPI_STATUS_IGNORE; MPI_STATUSES_IGNORE is for status arrays
        MPI_Recv(&(local[0][0]), 4, MPI_INT, root, tag, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        for (int i = 0; i < 2; i++){
            for (int k = 0; k < 2; k++)
                printf("%3d ", local[i][k]);
            printf("\n");
        }
    }
    MPI_Finalize();
    return 0;
}
You have instructed the receive operation to put four integer values consecutively in memory and therefore the 2x2 block is converted to a 1x4 row upon receive (since local is 4x4). The second row of local contains random values since the memory is never initialised.
You should either make use of MPI_Type_create_subarray in both the sender and the receiver in order to place the received data in a 2x2 block or redefine local to be a 2x2 matrix instead of 4x4.
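To see why the receive side matters, the sketch below emulates in plain C (no MPI calls) what the subarray type selects on the sender and how the four values land when they are placed back as a 2x2 block. The helpers pack_block and unpack_block are hypothetical; this illustrates the data layout under the question's row-major 4x4 assumption and is not how an MPI library is implemented.

```c
#include <assert.h>

/* Gather a rows x cols block starting at (r0, c0) from a row-major matrix
 * with ncols columns into a packed, contiguous buffer. */
void pack_block(const int *matrix, int ncols, int r0, int c0,
                int rows, int cols, int *packed)
{
    assert(c0 + cols <= ncols);
    for (int i = 0; i < rows; i++)
        for (int k = 0; k < cols; k++)
            packed[i * cols + k] = matrix[(r0 + i) * ncols + (c0 + k)];
}

/* Scatter a packed rows x cols block back into position (r0, c0) of a
 * row-major matrix with ncols columns. */
void unpack_block(const int *packed, int rows, int cols,
                  int *matrix, int ncols, int r0, int c0)
{
    assert(c0 + cols <= ncols);
    for (int i = 0; i < rows; i++)
        for (int k = 0; k < cols; k++)
            matrix[(r0 + i) * ncols + (c0 + k)] = packed[i * cols + k];
}
```

For the table in the question, the packed block is 0 1 3 0. Writing those four values consecutively fills local[0][0] through local[0][3] and never touches local[1], which is what the 4-element MPI_INT receive does. Unpacking them as a block puts 0 1 in the first row and 3 0 in the second, which is the effect of using the subarray type on the receiver as well.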

Dynamic multidimensional array on the heap

I want to create a function which can allocate a multidimensional array on the heap with only one call to malloc. (Pointer array) So a function call would look like this:
size_t dim[2] = {2, 4};
int **_2darray = alloc_array(sizeof(int), dim, 2);
// ^ should be the "same" as:
int __2darray[2][4];
What I have so far is the SIZE computation of the whole block needed to hold the array and the pointers:
void *alloc_array(size_t element_size, size_t dimensions[static 1], size_t ndims)
{
    unsigned char *DATA = NULL;
    size_t SIZE = 0;
    size_t multiplicators[ndims];
    // Calculate for each dimension the multiplier
    // SIZE for a 3d array:
    //     N1 * sizeof(T **)               (first multiplier:  N1)
    //   + N1 * N2 * sizeof(T *)           (second multiplier: N1 * N2)
    //   + N1 * N2 * N3 * sizeof(T)        (third multiplier:  N1 * N2 * N3)
    for (size_t i = 0; i < ndims; ++i) {
        multiplicators[i] = dimensions[i];
        for (size_t j = 0; j < i; ++j) {
            multiplicators[i] *= dimensions[j];
        }
    }
    SIZE = 0;
    for (size_t dimI = 0; dimI < ndims; ++dimI) {
        size_t mulval = multiplicators[dimI];
        // The elements are in the "last" dimension
        if (dimI+1 == ndims) {
            SIZE += element_size * mulval;
        } else {
            // All other elements are pointers to the specific element
            SIZE += sizeof(void *) * mulval;
        }
    }
    DATA = malloc(SIZE);
    return DATA;
}
So by now the SIZE calculation works. But now I'm stuck on setting the pointers to the right elements. I know it's easy when dealing with static dimensions, but I want this to work with dynamic dimensions.
#include <stdlib.h>
#include <stdio.h>
void fill_array_pointers (void** pointers, char* elements,
                          size_t element_size, size_t total_elements_size,
                          size_t dimensions[], size_t ndims)
{
    if (ndims == 2)
    {
        size_t i;
        for (i = 0; i < dimensions[0]; ++i)
        {
            pointers[i] = elements + i * element_size * dimensions[1];
        }
    }
    else
    {
        size_t i;
        size_t block_size = total_elements_size / dimensions[0];
        for (i = 0; i < dimensions[0]; ++i)
        {
            pointers[i] = pointers + dimensions[0] + i * dimensions[1];
            fill_array_pointers (pointers + dimensions[0]
                                     + i * dimensions[1],
                                 elements + block_size * i,
                                 element_size, block_size,
                                 dimensions+1, ndims-1);
        }
    }
}
void* alloc_array (size_t element_size, size_t dimensions[],
                   size_t ndims)
{
    size_t total_elements_size = element_size;
    int i;
    // total size of elements
    for (i = 0; i < ndims; ++i)
        total_elements_size *= dimensions[i];
    // total size of pointers
    size_t total_pointers_size = 0;
    int mulval = 1;
    for (i = 0; i < ndims-1; ++i)
    {
        total_pointers_size += dimensions[i] * sizeof(void*) * mulval;
        mulval *= dimensions[i];
    }
    size_t total_size = total_pointers_size;
    size_t oddball = total_pointers_size % element_size;
    // really needs to be alignof but we don't have it
    if (oddball) total_size += (element_size - oddball);
    total_size += total_elements_size;
    void* block = malloc (total_size);
    void** pointers = block;
    char* elements = (char*)block + total_size - total_elements_size;
    fill_array_pointers (pointers, elements, element_size,
                         total_elements_size, dimensions, ndims);
    return block;
}
Test drive:
int main ()
{
    size_t dims[] = { 2, 3, 4 };
    int*** arr = alloc_array(sizeof(int), dims, 3);
    int i, j, k;
    for (i = 0; i < dims[0]; ++i)
        for (j = 0; j < dims[1]; ++j)
            for (k = 0; k < dims[2]; ++k)
            {
                arr[i][j][k] = i*100+j*10+k;
            }
    for (i = 0; i < dims[0]*dims[1]*dims[2]; ++i)
    {
        printf ("%03d ", (&arr[0][0][0])[i]);
    }
    printf ("\n");
    free (arr);
}
This will not work for multidimensional char arrays on systems where sizeof(char*) != sizeof(char**); such systems exist but are rare. Multidimensional char arrays are pointless anyway.
The test runs cleanly under valgrind.
This is more an intellectual exercise than anything else. If you need maximum performance, don't use arrays of pointers, use a flat array and ugly but efficient explicit index calculations. If you need clear and concise code, you are probably better off allocating each level separately.
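The flat-array alternative mentioned above can be sketched as follows. This is a minimal illustration for int elements and three dimensions; the type flat3d and its functions are hypothetical names, not part of the answer's code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical wrapper: one flat block plus the three extents. */
typedef struct {
    size_t n0, n1, n2;
    int *data;
} flat3d;

/* Allocate a zero-initialised n0 x n1 x n2 array. Returns 0 on success,
 * -1 if the allocation fails. */
int flat3d_init(flat3d *a, size_t n0, size_t n1, size_t n2)
{
    a->n0 = n0;
    a->n1 = n1;
    a->n2 = n2;
    a->data = calloc(n0 * n1 * n2, sizeof(int));
    return a->data ? 0 : -1;
}

/* Row-major offset of element (i, j, k): the explicit index calculation. */
size_t flat3d_index(const flat3d *a, size_t i, size_t j, size_t k)
{
    assert(i < a->n0 && j < a->n1 && k < a->n2);
    return (i * a->n1 + j) * a->n2 + k;
}
```

An element is then read or written as a.data[flat3d_index(&a, i, j, k)], and the whole array is released with free(a.data). There are no pointer tables to size, align, or fill in.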

Creating *** object (pointer to pointer to pointer) in C

I am trying to write a function that creates a contiguous block of memory and assigns it to a 3d array. The code works in that it allows me to use the memory, and, when I use data stored in objects created with this function, the results appear correct. However, when I try to free the memory I have allocated with this function, I immediately get a glibc error. Here is the function:
void *** matrix3d(int size, int rows, int cols, int depth) {
    void ***result;
    int col_size = depth * size;
    int row_size = (sizeof(void *) + col_size) * cols;
    int data_size = (rows * cols * depth + 1) * size;
    int pointer_size = rows * sizeof(void **) + cols * sizeof(void *);
    int i, j;
    char *pdata, *pdata2;
    if((result = (void ***) malloc(pointer_size + data_size)) == NULL)
        nerror("ERROR: Memory error.\nNot enough memory available.\n", 1);
    pdata = (char *) result + rows * sizeof(void **);
    if((long) pdata % (col_size + sizeof(void *)))
        pdata += col_size + sizeof(void *) - (long) pdata % (col_size + sizeof(void *));
    for(i = 0; i < rows; i++) {
        result[i] = pdata;
        pdata2 = pdata + cols * sizeof(void *);
        for(j = 0; j < cols; j++) {
            result[i][j] = pdata2;
            pdata2 += col_size;
        }
        pdata += row_size;
    }
    return result;
}
It is called in this manner:
double ***positions = (double ***) matrix3d(sizeof(double), numResidues, numChains, numTimesteps);
for(i = 0; i < numResidues; i++)
    for(j = 0; j < numChains; j++)
        for(k = 0; k < numTimesteps; k++)
            positions[i][j][k] = 3.2;
free(positions);
What have I done wrong? Thank you for the help.
"What have I done wrong?"
Your code is hard to follow (you manipulate pdata a lot), but you are almost certainly writing past the allocated space and corrupting the bookkeeping data that glibc keeps alongside the block.
"I can use the data I've written just fine. The only issue is when I try to use free."
That's because glibc only gets a chance to notice the corruption when you call it.
Please excuse my dear Aunt Sally.
int data_size = (rows * cols * depth + 1) * size;
This should be:
int data_size = (rows * cols * (depth + 1)) * size;
Running the code under valgrind identified the error immediately.
What you are doing is making one single allocation and then casting it to a triple pointer, which means you have to deal with a lot of offsets.
It would probably be better to make a larger number of allocations:
char ***result = malloc(sizeof(char **) * rows);
for(i = 0; i < rows; i++) {
    result[i] = malloc(sizeof(char *) * cols);
    for(j = 0; j < cols; j++) {
        result[i][j] = malloc(sizeof(char) * size);
        /* Copy data to `result[i][j]` */
    }
}
When freeing, you have to free all of the allocations:
for(i = 0; i < rows; i++) {
    for(j = 0; j < cols; j++) {
        free(result[i][j]);
    }
    free(result[i]);
}
free(result);
Things like this are magnificent candidates for getting things wrong:
pdata = (char *) result + rows * sizeof(void **);
There is no reason at all to circumvent the address computation that the compiler does for you.
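One way to let the compiler do the address arithmetic while still keeping the data contiguous is to use typed pointers and three separate allocations. This is a sketch of an alternative, not the asker's or the answerers' code; it assumes double elements, and alloc3d and free3d are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate a rows x cols x depth array of double, zero-initialised.
 * Three allocations: top-level table, row-pointer table, contiguous data.
 * Returns NULL on failure. */
double ***alloc3d(size_t rows, size_t cols, size_t depth)
{
    assert(rows > 0 && cols > 0 && depth > 0);
    double ***a = malloc(rows * sizeof *a);
    double **rowptrs = malloc(rows * cols * sizeof *rowptrs);
    double *data = calloc(rows * cols * depth, sizeof *data);
    if (!a || !rowptrs || !data) {
        free(a);
        free(rowptrs);
        free(data);
        return NULL;
    }
    for (size_t i = 0; i < rows; i++) {
        a[i] = rowptrs + i * cols;                    /* typed pointer arithmetic */
        for (size_t j = 0; j < cols; j++)
            a[i][j] = data + (i * cols + j) * depth;  /* no byte offsets or casts */
    }
    return a;
}

/* Release everything allocated by alloc3d. */
void free3d(double ***a)
{
    if (a) {
        free(a[0][0]);  /* the contiguous data block */
        free(a[0]);     /* the row-pointer table */
        free(a);        /* the top-level table */
    }
}
```

Because each table has its own element type, there is no manual alignment or sizeof bookkeeping, and &a[0][0][0] still points at one contiguous block of rows * cols * depth values.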
