Assertion failure using MPI_Gather - c

I am trying to write MPI C code that repeatedly performs a calculation and saves its outcome into a single array for outputting less frequently. Example code below (the size of var, 200, is sufficient for the number of CPUs in use):
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
int main(int argc, char **argv){
float *phie, *phitemp, var[200];
int time=0, gatherphi=10, gatherfile = 200, j, iter=0, lephie, x;
int nelecrank = 2, size, rank, Tmax = 2000;
FILE *out;
MPI_Init(&argc, &argv) ;
MPI_Comm_size(MPI_COMM_WORLD, &size);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
lephie = gatherfile/gatherphi; // number of runs of calculation before output
// allocate memory
//printf("%d Before malloc.\n", rank);
if (rank==1) phie=(float *) malloc(nelecrank*size*gatherfile/gatherphi*sizeof(float));
phitemp=(float *) malloc(nelecrank*sizeof(float));
//printf("%d After malloc.\n", rank);
for(j=0;j<200;j++) var[j]=rank;
for(time=0;time<Tmax;time++){
if (!time%gatherphi) {// do calculation
for (j=0;j<nelecrank;j++) { // each processor does the calculation nelecrank times
phitemp[j]=0;
for (x = 0; x<=4; x++) {
phitemp[j]=phitemp[j]+var[rank+j*size];
}
}
} // end of j for loop
printf("iter: %d, %d Before MPI_Gather.\n", iter, rank);
MPI_Gather(phitemp, nelecrank, MPI_FLOAT, phie+iter*nelecrank*size*sizeof(float), nelecrank, MPI_FLOAT, 1, MPI_COMM_WORLD);
iter++;
} // end of gatherphi condition
if (time % gatherfile) { //output result of calculation
iter=0;
if (rank==1) {
out = fopen ("output.txt", "wt+");
if (out == NULL) {
printf("Could not open output file.\n");
exit(1);
}
else printf("Have opened output file.\n");
for (j=0;j<nelecrank*size*lephie;j++) {
fprintf(out,"%f ",*(phie+j*sizeof(float)));
}
fclose(out);
}
} // end of file output
if (rank==1) {
if (phie) free (phie);
}
if (phitemp) free (phitemp);
MPI_Finalize();
return 0;
}
It gives me repeated memory allocation problems until it finally exits. I am not experienced using memory allocation in MPI - can you help?
Many thanks,
Marta

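Basically, phie is not big enough.
You allocated nelecrank*size*gatherfile/gatherphi floats for phie, which is 40*size floats (80 floats on two processes).
But your MPI_Gather call writes nelecrank*size floats starting at phie+iter*nelecrank*size*sizeof(float). Because phie is a float*, pointer arithmetic already counts in floats, so the extra sizeof(float) makes every offset sizeof(float) times (typically four times) larger than intended. In the code as posted, MPI_Gather and iter++ sit outside the if block, so they run on every pass of the time loop and iter reaches Tmax-1 = 1999. The last write therefore starts roughly 16000*size floats past the start of phie, far beyond the allocation.
Maybe you want to reset iter=0 inside your loop somewhere? Note also that !time%gatherphi parses as (!time)%gatherphi; you probably meant time%gatherphi == 0, and likewise time%gatherfile == 0 for the output test.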
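To make the offset and precedence points concrete, here is a minimal sketch in plain C (no MPI needed). The helper names gather_offset and is_gather_step are hypothetical and not part of the original code; they only illustrate the arithmetic under the assumption that each gather stores nelecrank*size floats.

```c
#include <stddef.h>

/* Hypothetical helper: offset, counted in floats, at which the data of
   gather number `iter` should start inside a float* receive buffer.
   Pointer arithmetic on float* already scales by sizeof(float), so the
   offset must NOT be multiplied by sizeof(float) again. */
size_t gather_offset(int iter, int nelecrank, int size)
{
    return (size_t)iter * (size_t)nelecrank * (size_t)size;
}

/* Hypothetical helper: true on every gatherphi-th time step.
   `!time % gatherphi` parses as `(!time) % gatherphi`, which is a
   different test. */
int is_gather_step(int time, int gatherphi)
{
    return time % gatherphi == 0;
}
```

With these, the receive address would be phie + gather_offset(iter, nelecrank, size), and iter would have to return to 0 before it reaches gatherfile/gatherphi so that the writes stay inside the allocation.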

Related

Quick-sorting with multiple threads leaves last 1/6 unsorted

My goal is to create a program that takes a large list of unsorted integers (1-10 million) and divides it into 6 parts where a thread concurrently sorts it. After sorting I merge it into one sorted array so I can find the median and mode quicker.
The input file will be something like this:
# 1000000
314
267
213
934
where the number following the # identifies the number of integers in the list.
Currently I can sort correctly and quickly without threading. However, when I began threading I ran into an issue: for a 1,000,000-integer data set it only sorts the first 833,333 integers, leaving the last 166,666 (1/6) unsorted.
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>
#define BUF_SIZE 1024
int sum; /* this data will be shared by the thread(s) */
int * bigArr;
int size;
int findMedian(int array[], int size)
{
if (size % 2 != 0)
return array[size / 2];
return (array[(size - 1) / 2] + array[size / 2]) / 2;
}
/*compare function for quicksort*/
int _comp(const void* a, const void* b) {
return ( *(int*)a - *(int*)b);
}
/*This function is the problem method*/
/*indicate range of array to be processed with the index(params)*/
void *threadFct(int param)
{
int x= size/6;
if(param==0)x= size/6;
if(param>0&&param<5)x= (size/6)*param;
if(param==5)x= (size/6)*param+ (size%size/6);/*pass remainder into last thread*/
qsort((void*)bigArr, x, sizeof(bigArr[param]), _comp);
pthread_exit(0);
}
int main(int argc, char *argv[])
{
FILE *source;
int i =0;
char buffer[BUF_SIZE];
if(argc!=2){
printf("Error. please enter ./a followed by the file name");
return -1;}
source= fopen(argv[1], "r");
if (source == NULL) { /*reading error msg*/
printf("Error. File not found.");
return 1;
}
int count= 0;
while (!feof (source)) {
if (fgets(buffer, sizeof (buffer), source)) {
if(count==0){ /*Convert string to int using atoi*/
char str[1];
sprintf(str, "%c%c%c%c%c%c%c%c%c",buffer[2],buffer[3],buffer[4],buffer[5],buffer[6],buffer[7],buffer[8],buffer[9],buffer[10]);/*get string of first */
size= atoi(str); /* read the size of file--> FIRST LINE of file*/
printf("SIZE: %d\n",size);
bigArr= malloc(size*sizeof(int));
}
else{
//printf("[%d]= %s\n",count-1, buffer); /*reads in the rest of the file*/
bigArr[count-1]= atoi(buffer);
}
count++;
}
}
/*thread the unsorted array*/
pthread_t tid[6]; /* the thread identifier */
pthread_attr_t attr; /* set of thread attributes */
// qsort((void*)bigArr, size, sizeof(bigArr[0]), _comp); <---- sorts array without threading
for(i=0; i<6;i++){
pthread_create(&tid[i], NULL, &threadFct, i);
pthread_join(tid[i], NULL);
}
printf("Sorted array:\n");
for(i=0; i<size;i++){
printf("%i \n",bigArr[i]);
}
fclose(source);
}
So to clarify the problem function is in my threadFct().
To explain what the function is doing: the param (thread number) identifies which chunk of the array to quicksort. I divide the size into 6 parts and, because it does not divide evenly, the remainder of the numbers goes into the last chunk. So, for example, with 1,000,000 integers the first five threads would each sort 166,666 and the last one would sort the remainder (166,670).
I am aware that:
- Multi-threading will not speed things up much at all, even for 10 million integers
- This is not the most efficient way to find the median/mode
Thanks for reading this and any help is received with gratitude.
You're sorting the beginning of the array in every call to qsort. You're only changing the number of elements that each thread sorts, by setting x. You're also setting x to the same value in threads 0 and 1.
You need to calculate an offset into the array for each thread, which is just size/6 * param. The number of elements will be size/6 except for the last chunk, which uses a modulus to get the remainder.
As mentioned in the comments, the argument to the thread function should be a pointer, not int. You can hide an integer in the pointer, but you need to use explicit casts.
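/* needs #include <stdint.h> for intptr_t */
void *threadFct(void* param_ptr)
{
int param = (int)(intptr_t)param_ptr; /* recover the integer hidden in the pointer */
int start = size/6 * param; /* offset of this thread's chunk */
int length;
if (param < 5) {
length = size/6;
} else {
length = size - 5 * (size/6); /* last chunk also takes the remainder */
}
qsort((void*)(bigArr+start), length, sizeof(*bigArr), _comp);
pthread_exit(0);
}
and later
pthread_create(&tid[i], NULL, &threadFct, (void*)(intptr_t)i);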
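An alternative to hiding an integer in the pointer is to pass each thread a small struct describing its chunk. The sketch below assumes that approach; chunk_t, sort_chunk and sort_in_chunks are hypothetical names, not from the original post. It also starts all six threads before joining any of them, since calling pthread_join right after each pthread_create (as in the question's loop) makes the threads run one after another.

```c
#include <pthread.h>
#include <stdlib.h>

#define NTHREADS 6

/* Hypothetical per-thread argument: which slice of the array to sort. */
typedef struct {
    int *base;    /* first element of this thread's chunk */
    size_t count; /* number of elements in the chunk */
} chunk_t;

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y); /* avoids the overflow risk of x - y */
}

static void *sort_chunk(void *arg)
{
    chunk_t *c = arg;
    qsort(c->base, c->count, sizeof *c->base, cmp_int);
    return NULL;
}

/* Sorts NTHREADS chunks of arr concurrently; the last chunk absorbs the
   remainder. Returns 0 on success, -1 if a thread could not be created.
   Each chunk ends up sorted, but the chunks still need merging. */
int sort_in_chunks(int *arr, size_t n)
{
    pthread_t tid[NTHREADS];
    chunk_t chunk[NTHREADS];
    size_t per = n / NTHREADS;
    int started = 0, rc = 0;

    for (int i = 0; i < NTHREADS; i++) {
        chunk[i].base = arr + (size_t)i * per;
        chunk[i].count = (i < NTHREADS - 1) ? per : n - (NTHREADS - 1) * per;
        if (pthread_create(&tid[i], NULL, sort_chunk, &chunk[i]) != 0) {
            rc = -1;
            break;
        }
        started++;
    }
    /* Join only after every thread has been started. */
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return rc;
}
```

A caller would use sort_in_chunks(bigArr, size) in place of the create/join loop and then merge the six sorted runs before looking for the median.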

C-MPI Send created typedef struct with array of chars

Hello, I have a file containing data lines of this kind:
AsfAGHM5om 00000000000000000000000000000000 0000222200002222000022220000222200002222000000001111
I want to read this kind of data and send them over using C and MPI. So I've reached the following C code:
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stddef.h> // used for offsetof
typedef struct tuple_str{
char *key;
char *index;
char *value;
} tuple;
int main(int argc, char** argv) {
// Initialize the MPI environment
MPI_Init(&argc, &argv);
// Initialize file pointer
FILE *fp = fopen("tuples","r");
// define original structure that stores file and temp used by each process
tuple A[10000],B[10000];
// mpi structure name
MPI_Datatype mpi_tuples_str;
// number of structure members
const int nitems = 3;
// array of structure member sizes
int blocklengths[3];
blocklengths[0] = sizeof(A->key);
blocklengths[1] = sizeof(A->index);
blocklengths[2] = sizeof(A->value);
// structure member types
MPI_Datatype types[3] = {MPI_CHAR,MPI_CHAR,MPI_CHAR};
// status
MPI_Status status;
// offset of structure members
MPI_Aint offsets[3];
offsets[0] = offsetof(tuple,key);
offsets[1] = offsetof(tuple,index);
offsets[2] = offsetof(tuple,value);
// create mpi struct
MPI_Type_create_struct(nitems,blocklengths, offsets, types, &mpi_tuples_str);
MPI_Type_commit(&mpi_tuples_str);
// Get the number of processes
int size;
MPI_Comm_size(MPI_COMM_WORLD, &size);
// Get the rank of the process
int my_rank;
MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
int index = 0;
int i;
int local_A_size = (10000%size == 0) ? 10000/size : 0;
if ( my_rank == 0){
char text[10000];
char *p;
p=strtok(NULL," ");
// node0 reads file form hard drive and saves file to struct
while( fgets(text,10000,fp)!=NULL){
p = strtok (text," ");
char *temp[3];
temp[0]=p;
A[index].key=temp[0];
p = strtok (NULL, " ");
temp[1] = p;
A[index].index=temp[1];
p = strtok (NULL, " ");
temp[2] = p;
A[index].value=temp[2];
// printf("%s ",A[index].key);
// printf("%s ",A[index].index);
// printf("%s\n",A[index].value);
index++;
}
fclose(fp);
}
if ( local_A_size != 0){
if (my_rank == 0) {
printf("File saved to memory of process %d!\n",my_rank);
printf("Process %d sending struct data to others...\n",my_rank);
}
// send struct to all processes
MPI_Scatter(&A,local_A_size,mpi_tuples_str,B,local_A_size,mpi_tuples_str,0,MPI_COMM_WORLD);
// MPI_Bcast(&A,index,mpi_tuples_str,0,MPI_COMM_WORLD);
for(i=0;i<=local_A_size;i++)
printf("I'm process %d and my result is: %s\n",my_rank,B[i].key);
if (my_rank == 0) printf("Data sent from process %d to others...\n",my_rank);
}
else
{
if (my_rank == 0) printf("Number of processes must be an exact divisor of %d, %d in not %ds divisor\n",index,size,index);
}
// free memory used by mpi_tuples_str
MPI_Type_free(&mpi_tuples_str);
// Finalize
MPI_Finalize();
return 0;
}
So the problem here is, as far as I can understand, first the creation and memory allocation of my struct, and second the packing and sending of it.
As you can see, I've tried both MPI_Scatter and MPI_Bcast, but neither of them helped.
The result is that process 0, which reads the file, has the data as it is supposed to, but none of the others do. I'm also getting this weird message:
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= EXIT CODE: 11
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
I'll be so grateful if someone can enlighten me!!
Alright, I've changed my code to the following:
`#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stddef.h> // used for offsetof
typedef struct tuple_str{
char key[10];
char index[12];
char value[52];
} tuple;
int main(int argc, char** argv) {
// Initialize the MPI environment
MPI_Init(NULL, NULL);
// Initialize file pointer
FILE *fp = fopen("tuples_mini","r");
// define original structure that stores file and temp used by each process
tuple A[10000],B[10000];
// mpi structure name
MPI_Datatype mpi_tuples_str;
// number of structure members
const int nitems = 3;
// array of structure member sizes
int blocklengths[3];
blocklengths[0] = sizeof(10);
blocklengths[1] = sizeof(12);
blocklengths[2] = sizeof(52);
// structure member types
MPI_Datatype types[3] = {MPI_CHAR,MPI_CHAR,MPI_CHAR};
// status
MPI_Status status;
// offset of structure members
MPI_Aint offsets[3];
offsets[0] = offsetof(tuple,key);
offsets[1] = offsetof(tuple,index);
offsets[2] = offsetof(tuple,value);
// create mpi struct
MPI_Type_create_struct(nitems,blocklengths, offsets, types, &mpi_tuples_str);
MPI_Type_commit(&mpi_tuples_str);
// Get the number of processes
int size;
MPI_Comm_size(MPI_COMM_WORLD, &size);
// Get the rank of the process
int my_rank;
MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
int index = 0;
int i;
int local_A_size = (10000%size == 0) ? 10000/size : 0;
char *tmp[10000],*b[10000];
if ( my_rank == 0){
char text[10000];
char *p;
p=strtok(NULL," ");
// node0 reads file form hard drive and saves file to struct
while( fgets(text,10000,fp)!=NULL){
p = strtok (text," ");
char *temp[3];
temp[0]=p;
strcpy(A[index].key,temp[0]);
p = strtok (NULL, " ");
temp[1] = p;
strcpy(A[index].index,temp[1]);
p = strtok (NULL, " ");
temp[2] = p;
strcpy(A[index].value,temp[2]);
printf("%s ",A[index].key);
printf("%s ",A[index].index);
printf("%s\n",A[index].value);
index++;
}
fclose(fp);
}
if ( local_A_size != 0){
if (my_rank == 0) {
printf("File saved to memory of process %d!\n",my_rank);
printf("Process %d sending struct data to others...\n",my_rank);
}
// send struct to all processes
MPI_Scatter(&A,index,mpi_tuples_str,B,index,mpi_tuples_str,0,MPI_COMM_WORLD);
// MPI_Bcast(&tmp,index,MPI_CHAR,0,MPI_COMM_WORLD);
for(i=0;i<=local_A_size;i++){
// MPI_Recv(&tmp,index,MPI_CHAR,0,10,MPI_COMM_WORLD,&status);
printf("I'm process %d and my result is: %s\n",my_rank,B[i].key);
}
if (my_rank == 0) printf("Data sent from process %d to others...\n",my_rank);
}
else
{
if (my_rank == 0) printf("Number of processes must be an exact divisor of %d, %d in not %ds divisor\n",index,size,index);
}
// free memory used by mpi_tuples_str
MPI_Type_free(&mpi_tuples_str);
// Finalize
MPI_Finalize();
return 0;
}`
but that led me to a new error:
============================================================================== =====
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= EXIT CODE: 6
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
============================================================================== =====
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Aborted (signal 6)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions
After the latest suggestions:
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stddef.h> // used for offsetof
typedef struct tuple_str{
char key[10];
char index[12];
char value[52];
} tuple;
int main(int argc, char** argv) {
// Initialize the MPI environment
MPI_Init(NULL, NULL);
// Initialize file pointer
FILE *fp = fopen("tuples_mini","r");
// define original structure that stores file and temp used by each process
tuple A[10000],B[10000];
// mpi structure name
MPI_Datatype mpi_tuples_str;
// number of structure members
const int nitems = 3;
// array of structure member sizes
int blocklengths[3];
blocklengths[0] = 11;
blocklengths[1] = 33;
blocklengths[2] = 53;
// structure member types
MPI_Datatype types[3] = {MPI_CHAR,MPI_CHAR,MPI_CHAR};
// status
MPI_Status status;
// offset of structure members
MPI_Aint offsets[3];
offsets[0] = offsetof(tuple,key);
offsets[1] = offsetof(tuple,index);
offsets[2] = offsetof(tuple,value);
// create mpi struct
MPI_Type_create_struct(nitems,blocklengths, offsets, types, &mpi_tuples_str);
MPI_Type_commit(&mpi_tuples_str);
// Get the number of processes
int size;
MPI_Comm_size(MPI_COMM_WORLD, &size);
// Get the rank of the process
int my_rank;
MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
int index = 0;
int i;
int local_A_size = (10000%size == 0) ? 10000/size : 0;
char *tmp[10000],*b[10000];
if ( my_rank == 0){
char text[10000];
char *p;
// p=strtok(NULL," ");
// node0 reads file form hard drive and saves file to struct
while( fgets(text,10000,fp) != NULL && fp != NULL){
p = strtok (text," ");
char *temp[3];
temp[0]=p;
strcpy(A[index].key,temp[0]);
p = strtok (NULL, " ");
temp[1] = p;
strcpy(A[index].index,temp[1]);
p = strtok (NULL, " ");
temp[2] = p;
strcpy(A[index].value,temp[2]);
printf("%s ",A[index].key);
printf("%s ",A[index].index);
printf("%s\n",A[index].value);
tmp[index] = temp[0];
// printf("%s\n",tmp[index]);
index++;
}
fclose(fp);
}
if ( local_A_size != 0){
if (my_rank == 0) {
printf("File saved to memory of process %d!\n",my_rank);
printf("Process %d sending struct data to others...\n",my_rank);
// MPI_Send(&A,index,mpi_tuples_str,0,10,MPI_COMM_WORLD);
}
// send struct to all processes
MPI_Scatter(&A,index,mpi_tuples_str,B,index,mpi_tuples_str,0,MPI_COMM_WORLD);
// MPI_Bcast(&tmp,index,MPI_CHAR,0,MPI_COMM_WORLD);
// MPI_Bcast(&A,index,mpi_tuples_str,0,MPI_COMM_WORLD);
for(i=0;i<=local_A_size;i++){
// MPI_Recv(&tmp,index,MPI_CHAR,0,10,MPI_COMM_WORLD,&status);
// MPI_Recv(&A,index,mpi_tuples_str,0,10,MPI_COMM_WORLD,&status);
printf("I'm process %d and my result is: %s\n",my_rank,B[i].key);
}
if (my_rank == 0) printf("Data sent from process %d to others...\n",my_rank);
}
else
{
if (my_rank == 0) printf("Number of processes must be an exact divisor of %d, %d in not %ds divisor\n",index,size,index);
}
// free memory used by mpi_tuples_str
MPI_Type_free(&mpi_tuples_str);
// Finalize
MPI_Finalize();
return 0;
}
I applied all the comments about problems in the posted code. This is the result:
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stddef.h> // used for offsetof
typedef struct tuple_str
{
char key[11];
char index[33];
char value[53];
} tuple;
int main( void )
{
// Initialize the MPI environment
MPI_Init( NULL, NULL );
// Initialize file pointer
FILE *fp = NULL;
if( NULL == ( fp = fopen( "tuples_mini" ,"r" ) ) )
{
perror( "fopen for read of tuples_mini failed" );
exit( EXIT_FAILURE );
}
// implied else, fopen successful
// define original structure that stores file and temp used by each process
tuple A[10000];
tuple B[10000];
// mpi structure name
MPI_Datatype mpi_tuples_str;
// number of structure members
const int nitems = 3;
// array of structure member sizes
int blocklengths[3];
blocklengths[0] = 11;
blocklengths[1] = 33;
blocklengths[2] = 53;
// structure member types
MPI_Datatype types[3] = { MPI_CHAR, MPI_CHAR, MPI_CHAR };
// status
//MPI_Status status;
// offset of structure members
MPI_Aint offsets[3];
offsets[0] = offsetof( tuple,key);
offsets[1] = offsetof( tuple,index);
offsets[2] = offsetof( tuple,value);
// create mpi struct
MPI_Type_create_struct( nitems, blocklengths, offsets, types, &mpi_tuples_str);
MPI_Type_commit( &mpi_tuples_str);
// Get the number of processes
int size;
MPI_Comm_size( MPI_COMM_WORLD, &size);
// Get the rank of the process
int my_rank;
MPI_Comm_rank( MPI_COMM_WORLD, &my_rank);
int index = 0;
int i;
int local_A_size = (10000%size == 0) ? 10000/size : 0;
//char *tmp[10000];
//char *b[10000];
if ( my_rank == 0)
{
char text[10000];
char *p;
//p=strtok(NULL," ");
// node0 reads file from hard drive and saves file to struct
while( fgets( text, sizeof text, fp ) )
{
p = strtok (text," ");
char *temp[3];
temp[0]=p;
strcpy( A[index].key,temp[0]);
p = strtok (NULL, " ");
temp[1] = p;
strcpy( A[index].index,temp[1]);
p = strtok (NULL, " ");
temp[2] = p;
strcpy( A[index].value,temp[2]);
printf( "%s ",A[index].key);
printf( "%s ",A[index].index);
printf( "%s\n",A[index].value);
index++;
}
fclose(fp);
}
if ( local_A_size != 0)
{
if (my_rank == 0)
{
printf( "File saved to memory of process %d!\n",my_rank);
printf( "Process %d sending struct data to others...\n",my_rank);
}
// send struct to all processes
MPI_Scatter( &A,index, mpi_tuples_str, B, index, mpi_tuples_str, 0, MPI_COMM_WORLD);
// MPI_Bcast( &tmp,index, MPI_CHAR, 0, MPI_COMM_WORLD);
for(i=0;i<=local_A_size;i++)
{
// MPI_Recv( &tmp, index, MPI_CHAR, 0, 10, MPI_COMM_WORLD, &status);
printf( "I'm process %d and my result is: %s\n", my_rank, B[i].key);
}
if (my_rank == 0)
printf("Data sent from process %d to others...\n", my_rank);
}
else
{
if (my_rank == 0)
printf( "Number of processes must be an exact divisor of %d, %d in not %ds divisor\n", index, size,index);
}
// free memory used by mpi_tuples_str
MPI_Type_free( &mpi_tuples_str);
// Finalize
MPI_Finalize();
return 0;
}
I set up the tuples_mini file to contain:
AsfAGHM5om 00000000000000000000000000000000 0000222200002222000022220000222200002222000000001111
When I ran the program on Ubuntu Linux 14.04 with a 4-core processor, this was the output:
AsfAGHM5om 00000000000000000000000000000000 0000222200002222000022220000222200002222000000001111
File saved to memory of process 0!
Process 0 sending struct data to others...
I'm process 0 and my result is: AsfAGHM5om
I'm process 0 and my result is:
then several dozens of this line:
I'm process 0 and my result is:
followed by these lines:
I'm process 0 and my result is:
I'm process 0 and my result is: AsfAGHM5om
Data sent from process 0 to others...
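So there seems to be a logic problem in the code, but it does not seg fault.
On the second version of the posted code:
The most likely cause of the signal 6 (SIGABRT) is that the first call to strtok() has NULL as its first parameter rather than the address of a buffer to tokenize. Another candidate is the strcpy() of the 32-character index field into char index[12], which writes past the end of the array; when the runtime detects that kind of overflow it typically aborts the process.
IMO: that first call to strtok() should be completely removed from the program.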
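As an illustration of the two points about strtok() and field sizes, here is a minimal sketch of a line parser that hands the buffer to the first strtok() call and checks each field's length before copying. The helper names copy_field and parse_tuple are hypothetical; the struct sizes are the 11/33/53 ones from the last version above, and including newline characters among the delimiters is an assumption so that the trailing newline left by fgets() does not end up in value.

```c
#include <string.h>

typedef struct tuple_str {
    char key[11];
    char index[33];
    char value[53];
} tuple;

/* Copies tok into dst (capacity cap) only if it fits together with its
   terminating NUL. Returns 0 on success, -1 otherwise. */
static int copy_field(char *dst, size_t cap, const char *tok)
{
    if (tok == NULL || strlen(tok) >= cap)
        return -1;
    strcpy(dst, tok); /* safe: length checked above */
    return 0;
}

/* Hypothetical helper: splits one "key index value" line into *out.
   The line is modified by strtok(). Returns 0 on success, -1 if a
   field is missing or too long for its array. */
int parse_tuple(char *line, tuple *out)
{
    const char *delims = " \t\r\n";    /* also strips the trailing newline */
    char *key = strtok(line, delims);  /* first call receives the buffer */
    char *idx = strtok(NULL, delims);  /* later calls continue with NULL */
    char *val = strtok(NULL, delims);

    if (copy_field(out->key, sizeof out->key, key) != 0 ||
        copy_field(out->index, sizeof out->index, idx) != 0 ||
        copy_field(out->value, sizeof out->value, val) != 0)
        return -1;
    return 0;
}
```

In the reading loop this would replace the three strcpy() calls with something like if (parse_tuple(text, &A[index]) == 0) index++;, so a malformed or oversized line is skipped rather than overflowing a field.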

C MPI Array Search Help (MPI_Scatter)

I'm new to using MPI, and I'm having an issue understanding why my code isn't working correctly.
#include "mpi.h"
#include <stdio.h>
#include <string.h>
int main(int argc, char* argv[]) {
int list_size = 1000
int threads;
int th_nums;
int slice;
char* the_array[list_size];
char* temp_array[list_size];
char str_to_search[10];
FILE *in = fopen("inputfile", "r");
char parse[10];
MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &threads);
MPI_Comm_size(MPI_COMM_WORLD, &th_nums);
if (threads == 0) { // if master task
fgets(parse, 10, in);
slice = atoi(parse); // How many slices to cut the array into
fgets(parse, 10, in);
sscanf(parse, "%s", search); // gives us the string we want to search
int i;
for (i = 0; i < list_size; i++) {
char temp[10];
fgets(parse, 10, in);
sscanf(parse, "%s", temp);
the_array[i] = strdup(temp);
}
int index = list_size/slice; //
MPI_Bcast(&str_to_search, 10, MPI_CHAR, 0, MPI_COMM_WORLD);
}
MPI_Bcast(&str_to_search, 10, MPI_CHAR, 0, MPI_COMM_WORLD);
MPI_Scatter(the_array, index, MPI_CHAR, temp_array, index, 0, MPI_COMM_WORLD);
// Search for string occurs here
MPI_Finalize();
return 0;
}
However, I'm finding that when I try to search, only the master task receives some of the slice, the rest is null. All other tasks don't receive anything. I've tried placing MPI_Scatter outside of the if(master task) statement, but I have no luck with this. Also, when the list_size increases, I find the program basically gets stuck at the MPI_Scatter line. What am I doing wrong, and what can I do to correct this?
You should go look up some tutorials on MPI collectives. They require all processes to participate collectively. So if any process calls MPI_Scatter, then all processes must call MPI_Scatter. I'd recommend looking at some sample code and playing with it until you understand what's going on. Then try coming back to your own code and seeing if you can figure out what's going on.
My favorite reference for anything pre-MPI-3 is DeinoMPI. I've never used the implementation, but the documentation is great since it has a complete example for each function in the MPI-2 Spec.
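One further detail in the posted code, beyond the collective-call rule: the_array holds char* pointers, and a pointer value is only meaningful inside the process that created it, so scattering the pointer array would not move the strings themselves. A common workaround is to copy the strings into one contiguous buffer of fixed-width records first. The sketch below is plain C and only shows that packing step; pack_fixed_width and WIDTH are hypothetical names, with WIDTH chosen to match the char temp[10] buffers in the question.

```c
#include <stdlib.h>
#include <string.h>

#define WIDTH 10 /* assumed record width, matching char temp[10] */

/* Hypothetical helper: copies n strings into one contiguous,
   zero-filled buffer of n * WIDTH chars, so record i starts at
   packed + i * WIDTH. Strings longer than WIDTH - 1 are truncated.
   Returns NULL if allocation fails; the caller frees the result. */
char *pack_fixed_width(char *const strings[], size_t n)
{
    char *packed = calloc(n ? n : 1, WIDTH);
    if (packed == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        strncpy(packed + i * WIDTH, strings[i], WIDTH - 1);
    return packed;
}
```

Every rank, not only the master, would then call MPI_Scatter, with this buffer as the send buffer on the root, a count of (records per rank) * WIDTH, and MPI_CHAR as both the send and the receive type.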

program spinning on pthread lock

After banging my head against a wall for a few hours during this exercise, I am stuck at that wall.
First off, this is a program designed to find and print all prime numbers between 1 and ceiling, where ceiling is some user input. The design is to implement POSIX threads.
My program runs successfully until one of the later iterations of the thread's method. When it gets to that iteration, it steps to the line pthread_mutex_lock(lock); and spins, forcing me to kill it with Ctrl+z. The two inputs I've been using are 1 for the number of threads and 10 for the ceiling. The flaw is reproducible: it has happened every time I've tried it. Note: although this code should be able to use multiple threads, I'd like to get it working correctly with one child thread before adding more.
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
int* numbermarker = NULL;
int* buffer = NULL;
int* checked = NULL;
int pullposition = 0;
int placeposition = 0;
pthread_mutex_t* lock;
int ceiling;
/*This method places one of the primes in the buffer. It
offers a safe way to manage where the next value will be placed*/
void placevalue(int value){
buffer[placeposition] = value;
placeposition++;
}
void* threadmethod(){
int i;
int k;
int l;
while(1){
printf("pull %d number %d \n",pullposition, buffer[pullposition]);
pthread_mutex_lock(lock);
printf("FLAG\n");
l = buffer[pullposition];
pullposition++;
printf("pullTWO %d number %d \n",pullposition, buffer[pullposition-1]);
pthread_mutex_unlock(lock);
for(k=l+1;k<=ceiling;k++){
if(k%l){
if(k%2){
checked[k]=1;
placevalue(k);
}
}
else{
numbermarker[k-1] = 1;
}
}
int sum=0;
for(i=0; i<ceiling; i++){
if(numbermarker[i]){
checked[i] = numbermarker[i];
}
printf("checked|%d|%d|%d|%d|%d|%d|%d|%d|%d|%d|\n",
checked[0], checked[1], checked[2], checked[3], checked[4], checked[5], checked[6], checked[7], checked[8], checked[9]);
sum += checked[i];
printf("sum %d ceiling %d\n",sum,ceiling);
}
printf("number |%d|%d|%d|%d|%d|%d|%d|%d|%d|%d|\n",
numbermarker[0], numbermarker[1], numbermarker[2], numbermarker[3], numbermarker[4], numbermarker[5], numbermarker[6], numbermarker[7], numbermarker[8], numbermarker[9]);
if(sum == ceiling){
return NULL;
}
}
}
int main()
{
int numthreads;
int i;
printf("Enter number of threads: \n");
scanf("%d", &numthreads);
printf("Enter the highest value to check \n");
scanf("%d", &ceiling);
/* This will hold 1's and 0's.
1 = number has been checked or is
confirmed not to be a prime
0 = number is a possible prime
The idea behind these values is that the next
prime can always be identified by the 0 with
the lowest index
*/
numbermarker = (int*)malloc(sizeof(int)*(ceiling));
checked = (int*)malloc(sizeof(int)*(ceiling));
/*This will hold the primes as they are found*/
buffer = (int*)malloc(sizeof(int)*(ceiling));
/*allocate space for the lock*/
lock = (pthread_mutex_t *) malloc(sizeof(pthread_mutex_t));
pthread_mutex_init(lock,NULL);
for(i=0; i<ceiling; i++){
if(i<1){
numbermarker[i] = 1;
}
else{
numbermarker[i] = 0;
}
checked[i]=0;
buffer[i]=0;
printf("%d \n",numbermarker[i]);
}
checked[0]=1;
placevalue(2);
printf("checked|%d|%d|%d|%d|%d|%d|%d|%d|%d|%d|\n", checked[0], checked[1], checked[2], checked[3], checked[4], checked[5], checked[6], checked[7], checked[8], checked[9]);
pthread_t **tid = (pthread_t **) malloc(sizeof(pthread_t *) * numthreads);
for(i=0;i<numthreads;i++){
tid[i] = (pthread_t *) malloc(sizeof(pthread_t));
}
for(i=0;i<numthreads;i++){
if(pthread_create(tid[i],
NULL,
threadmethod,
NULL)){
printf("Could not create thread \n");
exit(-1);
}
}
for(i=0;i<numthreads;i++){
if(pthread_join(*tid[i], NULL)){
printf("Error Joining with thread \n");
exit(-1);
}
free(tid[i]);
}
free(tid);
for(i=0;i<ceiling;i++){
if(numbermarker[i] == 0){
printf("%d sdfsddd \n", numbermarker[i]);
printf("%d \n", i+1);
}
}
free(buffer);
free(numbermarker);
buffer=NULL;
numbermarker=NULL;
return(0);
}
I've tried your code and in
void placevalue(int value)
{
buffer[placeposition] = value;
placeposition++;
}
placeposition goes beyond the size of buffer. This results in undefined behaviour, a very plausible outcome of which is the trashing of the mutex (which is malloc()ed right after buffer).
On top of that, there's a race condition in placevalue(). However, if you're using a single worker thread, you are not (yet) running into it.
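Both problems can be addressed in one place by keeping the capacity next to the buffer and doing the bounds check while holding a mutex. The following is a minimal sketch of that idea, not the original program's design; prime_buffer and its three functions are hypothetical names.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical bounded buffer for the primes found so far. */
typedef struct {
    int *data;
    size_t capacity;      /* number of slots allocated in data */
    size_t count;         /* number of slots already filled */
    pthread_mutex_t lock; /* protects data and count */
} prime_buffer;

/* Returns 0 on success, -1 if allocation or mutex setup fails. */
int prime_buffer_init(prime_buffer *b, size_t capacity)
{
    b->data = malloc((capacity ? capacity : 1) * sizeof *b->data);
    if (b->data == NULL)
        return -1;
    b->capacity = capacity;
    b->count = 0;
    if (pthread_mutex_init(&b->lock, NULL) != 0) {
        free(b->data);
        return -1;
    }
    return 0;
}

/* Appends value under the lock. Returns 0 on success, -1 if the
   buffer is already full (instead of writing past the end). */
int prime_buffer_place(prime_buffer *b, int value)
{
    int rc = -1;
    pthread_mutex_lock(&b->lock);
    if (b->count < b->capacity) {
        b->data[b->count++] = value;
        rc = 0;
    }
    pthread_mutex_unlock(&b->lock);
    return rc;
}

void prime_buffer_destroy(prime_buffer *b)
{
    pthread_mutex_destroy(&b->lock);
    free(b->data);
}
```

A -1 from prime_buffer_place signals that the calling code is producing more values than the buffer was sized for, which is worth investigating rather than silently ignoring.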

Reading multiple files using MPI-IO

I'm trying to read in multiple files using MPI-IO in C. I'm following this example : http://users.abo.fi/Mats.Aspnas/PP2010/examples/MPI/readfile1.c
However, I'm reading in a matrix of doubles instead of a string of chars. Here is that implementation:
/*
Simple MPI-IO program that demonstrate parallel reading from a file.
Compile the program with 'mpicc -O2 readfile1.c -o readfile1'
*/
#include <stdlib.h>
#include <stdio.h>
#include "mpi.h"
#define FILENAME "filename.dat"
double** ArrayAllocation() {
int i;
double** array2D;
array2D= (double**) malloc(num_procs*sizeof(double*));
for(i = 0; i < num_procs; i++) {
array2D[i] = (double*) malloc(column_size*sizeof(double));
}
return array2D;
}
int main(int argc, char* argv[]) {
int i, np, myid;
int bufsize, nrchar;
double *buf; /* Buffer for reading */
double **matrix = ArrayAllocation();
MPI_Offset filesize;
MPI_File myfile; /* Shared file */
MPI_Status status; /* Status returned from read */
/* Initialize MPI */
MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &myid);
MPI_Comm_size(MPI_COMM_WORLD, &np);
/* Open the files */
MPI_File_open (MPI_COMM_WORLD, FILENAME, MPI_MODE_RDONLY,
MPI_INFO_NULL, &myfile);
/* Get the size of the file */
MPI_File_get_size(myfile, &filesize);
/* Calculate how many elements that is */
filesize = filesize/sizeof(double);
/* Calculate how many elements each processor gets */
bufsize = filesize/np;
/* Allocate the buffer to read to, one extra for terminating null char */
buf = (double *) malloc((bufsize)*sizeof(double));
/* Set the file view */
MPI_File_set_view(myfile, myid*bufsize*sizeof(double), MPI_DOUBLE,
MPI_DOUBLE,"native", MPI_INFO_NULL);
/* Read from the file */
MPI_File_read(myfile, buf, bufsize, MPI_DOUBLE, &status);
/* Find out how many elements were read */
MPI_Get_count(&status, MPI_DOUBLE, &nrchar);
/* Set terminating null char in the string */
//buf[nrchar] = (double)0;
printf("Process %2d read %d characters: ", myid, nrchar);
int j;
for (j = 0; j <bufsize;j++){
matrix[myid][j] = buf[j];
}
/* Close the file */
MPI_File_close(&myfile);
if (myid==0) {
printf("Done\n");
}
MPI_Finalize();
exit(0);
}
However when I try to call MPI_File_open after I close the first file, I get an error. Do I need multiple communicators to perform this? Any tips will be appreciated.
The code in ArrayAllocation above does not quite match the logic of the main program. The matrix is allocated as an array of pointers to vectors of doubles before MPI is initialized, therefore it is impossible to set the number of rows to the number of MPI processes.
The column_size is also not known until the file size is determined.
It is a general convention in the C language to store matrices by rows. Violating this convention might confuse you or the reader of your code.
All in all, to get this program working you need to declare
int num_procs, column_size;
as global variables before the definition of ArrayAllocation and move the call to this function down below the line where bufsize is calculated:
...
/* Calculate how many elements each processor gets */
bufsize = filesize/np;
num_procs = np;
column_size = bufsize;
double **matrix = ArrayAllocation();
...
With the above modifications this example should work on any MPI implementation that supports MPI-IO. I have tested it with OpenMPI 1.2.8.
In order to generate a test file you could use for instance the following code:
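FILE* f = fopen(FILENAME,"wb"); /* binary mode, so the doubles are written byte for byte */
double x = 0;
int i;
for(i=0;i<100;i++){
fwrite(&x, sizeof(double), 1, f); /* one element of sizeof(double) bytes */
x +=0.1;
}
fclose(f);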
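One thing to keep in mind with bufsize = filesize/np is that the last filesize % np elements are never read when the element count does not divide evenly by the number of processes. The sketch below shows one possible way to split the elements so that nothing is dropped; rank_count and rank_first are hypothetical helpers and not part of the original example.

```c
#include <stddef.h>

/* Hypothetical helper: number of elements rank `rank` reads when
   `total` elements are split over `np` ranks. The first total % np
   ranks take one extra element, so nothing at the end of the file
   is dropped. */
size_t rank_count(size_t total, int np, int rank)
{
    size_t base = total / (size_t)np;
    size_t extra = total % (size_t)np;
    return base + ((size_t)rank < extra ? 1 : 0);
}

/* Hypothetical helper: index of the first element read by `rank`. */
size_t rank_first(size_t total, int np, int rank)
{
    size_t base = total / (size_t)np;
    size_t extra = total % (size_t)np;
    size_t r = (size_t)rank;
    return r * base + (r < extra ? r : extra);
}
```

The displacement passed to MPI_File_set_view would then be rank_first(filesize, np, myid) * sizeof(double), and the read count rank_count(filesize, np, myid). Rows of the matrix would no longer all have the same length, so column_size would need to be the largest count.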
