File read issues - Parallel version of Game of Life - c

For my Parallel Computing class, I am working on a project that parallelizes the Game of Life using MPI. I am specifically implementing exercise 6.13 in "Parallel Programming in C with MPI and OpenMP" by Michael J. Quinn.
I am using the author's pre-written library function, "read_row_striped_matrix". The following is the code for the function:
/*
* Process p-1 opens a file and inputs a two-dimensional
* matrix, reading and distributing blocks of rows to the
* other processes.
*/
void read_row_striped_matrix (
char *s, /* IN - File name */
void ***subs, /* OUT - 2D submatrix indices */
void **storage, /* OUT - Submatrix stored here */
MPI_Datatype dtype, /* IN - Matrix element type */
int *m, /* OUT - Matrix rows */
int *n, /* OUT - Matrix cols */
MPI_Comm comm) /* IN - Communicator */
{
int datum_size; /* Size of matrix element */
int i;
int id; /* Process rank */
FILE *infileptr; /* Input file pointer */
int local_rows; /* Rows on this proc */
void **lptr; /* Pointer into 'subs' */
int p; /* Number of processes */
void *rptr; /* Pointer into 'storage' */
MPI_Status status; /* Result of receive */
int x; /* Result of read */
MPI_Comm_size (comm, &p);
MPI_Comm_rank (comm, &id);
datum_size = get_size (dtype);
/* Process p-1 opens file, reads size of matrix,
and broadcasts matrix dimensions to other procs */
if (id == (p-1)) {
infileptr = fopen (s, "r");
if (infileptr == NULL) *m = 0;
else {
fread (m, sizeof(int), 1, infileptr);
fread (n, sizeof(int), 1, infileptr);
}
}
MPI_Bcast (m, 1, MPI_INT, p-1, comm);
if (!(*m)) MPI_Abort (MPI_COMM_WORLD, OPEN_FILE_ERROR);
MPI_Bcast (n, 1, MPI_INT, p-1, comm);
local_rows = BLOCK_SIZE(id,p,*m);
/* Dynamically allocate matrix. Allow double subscripting
through 'a'. */
*storage = (void *) my_malloc (id,
local_rows * *n * datum_size);
*subs = (void **) my_malloc (id, local_rows * PTR_SIZE);
lptr = (void *) &(*subs[0]);
rptr = (void *) *storage;
for (i = 0; i < local_rows; i++) {
*(lptr++)= (void *) rptr;
rptr += *n * datum_size;
}
/* Process p-1 reads blocks of rows from file and
sends each block to the correct destination process.
The last block it keeps. */
if (id == (p-1)) {
for (i = 0; i < p-1; i++) {
x = fread (*storage, datum_size,
BLOCK_SIZE(i,p,*m) * *n, infileptr);
MPI_Send (*storage, BLOCK_SIZE(i,p,*m) * *n, dtype,
i, DATA_MSG, comm);
}
x = fread (*storage, datum_size, local_rows * *n,
infileptr);
fclose (infileptr);
} else
MPI_Recv (*storage, local_rows * *n, dtype, p-1,
DATA_MSG, comm, &status);
}
In the beginning of my code, I call "read_row_striped_matrix" like this:
#include <stdio.h>
#include <mpi.h>
#include "MyMPI.h"
typedef int dtype;
#define MPI_TYPE MPI_INT
int main(int argc, char *argv[]) {
dtype** matrix; /* Doubly-subscripted array */
dtype* storage; /* Local portion of array elements */
int proc_id; /* Process Rank */
int row_count; /* Number of rows in matrix */
int col_count; /* Number of columns in matrix */
int proc_count; /* Number of processes */
int i; /* Used with for loop */
MPI_Init (&argc, &argv);
MPI_Comm_rank (MPI_COMM_WORLD, &proc_id);
MPI_Comm_size (MPI_COMM_WORLD, &proc_count);
read_row_striped_matrix (argv[3], (void *) &matrix, (void *) &storage, MPI_TYPE,
&row_count, &col_count, MPI_COMM_WORLD);
....
The problem is, my implementation was getting stuck in an infinite loop. So I started debugging by testing to see if the data was being read from the text file correctly. My text file named "file_input.txt" contains the following input, where the first number (5) represents the number of rows, and the second number (also 5) represents the number of cols, and the rest of the data are the values in the matrix:
5 5 0 0 1 0 1 0 0 1 ...
I inserted the following printf statements in the section of library code where the row and column counts are read from the text file:
if (id == (p-1)) {
printf("The name of the file is %s\n", s);
infileptr = fopen (s, "r");
if (infileptr == NULL) *m = 0;
else {
printf("The value of m is %d\n", *m);
size_t ret_val = fread (m, sizeof(int), 1, infileptr);
size_t next_ret_val = fread (n, sizeof(int), 1, infileptr);
printf("The total # of elements successfully read is: %zu\n", ret_val); /* %zu is the size_t format */
printf("The total # of elements successfully read is: %zu\n", next_ret_val);
printf("The value of m is %d\n", *m);
printf("The value of n is %d\n", *n);
}
}
When I run "project_3 5 5 file_input.txt", the output of the program is:
The name of the file is: file_input.txt
The value of m is 0
The total number of elements successfully read is: 1
The total number of elements successfully read is: 1
The value of m is: 540549176
The value of n is: 540090416
...
From what I observe, the file name was read correctly, and the value of m (0) is correct before the call to fread. fread reports the correct number of elements read for both m and n, but the values are "540549176" and "540090416" instead of 5 and 5. When I change the numbers at the beginning of the text file to, say, 3 and 4, the values of m and n do not change.
Does anybody have any idea why the first two integers are not being read in from the text file correctly? Thanks in advance.

You have two options here:
1. This program expects binary input, so you need to produce binary input somehow. "5" is an ASCII character with the hex value 0x35 (decimal 53). When you fread sizeof(int) bytes (typically 4), you actually pull in 4 characters of text, not the number 5.
2. You can edit the program to parse ASCII text, but this is kind of annoying. First you read in a line of the file, then you tokenize it, then you convert each token into an integer. Are you coming from a Perl/Python background? This text conversion is nearly automatic in scripting languages; nothing is automatic in C.
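To make the difference between the two options concrete, here is a minimal sketch. The helper names raw_bytes_as_int and parse_dims are made up for illustration and are not part of the book's library. The first helper mimics what fread(m, sizeof(int), 1, fp) does to the start of a text file; the second parses the same characters as decimal numbers. Incidentally, the values reported in the question, 540549176 and 540090416, are 0x20382038 and 0x20312030, which is what ASCII digits (0x30-0x39) and spaces (0x20) look like when four of them are read as one int.

```c
#include <stdio.h>
#include <string.h>

/* Reinterpret the first sizeof(int) bytes of a text buffer as an int.
 * This is effectively what fread(m, sizeof(int), 1, fp) does.
 * Assumption: text holds at least sizeof(int) bytes. */
int raw_bytes_as_int(const char *text)
{
    int value = 0;
    memcpy(&value, text, sizeof value);
    return value;
}

/* Parse "rows cols" from ASCII text.
 * Returns 1 if both numbers were converted, 0 otherwise. */
int parse_dims(const char *text, int *rows, int *cols)
{
    return sscanf(text, "%d %d", rows, cols) == 2;
}
```

With the input "5 5 0 0 1", parse_dims stores 5 and 5, while raw_bytes_as_int returns a large number made of character codes (0x20352035 on a little-endian machine with 4-byte int).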

You need to rip out the library function and rewrite it to read and interpret text files. At present it reads binary data: when it reads into m and n, it reads sizeof(int) (probably 4) raw bytes. For m = 5 it expects the bytes 05,00,00,00 (on a little-endian machine) to be in your file, but the first 4 bytes of your text file are something like 0x35,0x20,0x35,0x20 (the characters '5', space, '5', space).
Rather than rewrite the library function, it probably makes sense to write a small file converter that reads the text file as input and writes it out as binary data.
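A converter along those lines could look roughly like the sketch below. The function name text_to_binary is hypothetical, and it assumes the matrix elements are int (as in the question's typedef) and that the binary file is read on the same machine that wrote it, so byte order and int size match.

```c
#include <stdio.h>

/* Convert a whitespace-separated text matrix ("rows cols v v v ...")
 * into raw ints in the layout an fread-based reader expects.
 * Returns the number of matrix values written, or -1 on error. */
long text_to_binary(const char *text_path, const char *bin_path)
{
    FILE *in = fopen(text_path, "r");
    if (in == NULL) return -1;
    FILE *out = fopen(bin_path, "wb");   /* "b": no text-mode translation */
    if (out == NULL) { fclose(in); return -1; }

    int rows, cols, value;
    long written = -1;

    /* Header first: two ints, exactly what the reader pulls in for m and n. */
    if (fscanf(in, "%d %d", &rows, &cols) == 2 &&
        fwrite(&rows, sizeof rows, 1, out) == 1 &&
        fwrite(&cols, sizeof cols, 1, out) == 1) {
        written = 0;
        /* Then every remaining number, one raw int each. */
        while (fscanf(in, "%d", &value) == 1) {
            if (fwrite(&value, sizeof value, 1, out) != 1) {
                written = -1;
                break;
            }
            written++;
        }
    }

    fclose(in);
    if (fclose(out) != 0) written = -1;
    return written;
}
```

A caller would run text_to_binary("file_input.txt", "file_input.bin") once and then pass the .bin file to the program; checking that the return value equals rows * cols catches truncated input.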

Related

Quick-sorting with multiple threads leaves last 1/6 unsorted

My goal is to create a program that takes a large list of unsorted integers (1-10 million) and divides it into 6 parts, each sorted concurrently by its own thread. After sorting, I merge the parts into one sorted array so I can find the median and mode more quickly.
The input file will be something like this:
# 1000000
314
267
213
934
where the number following the # identifies the number of integers in the list.
Currently I can sort perfectly and quickly without threading; however, when I began threading I ran into an issue. For a 1,000,000-element data set it only sorts the first 833,333 integers, leaving the last 166,666 (1/6) unsorted.
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>
#define BUF_SIZE 1024
int sum; /* this data will be shared by the thread(s) */
int * bigArr;
int size;
int findMedian(int array[], int size)
{
if (size % 2 != 0)
return array[size / 2];
return (array[(size - 1) / 2] + array[size / 2]) / 2;
}
/*compare function for quicksort*/
int _comp(const void* a, const void* b) {
return ( *(int*)a - *(int*)b);
}
/*This function is the problem method*/
/*indicate range of array to be processed with the index(params)*/
void *threadFct(int param)
{
int x= size/6;
if(param==0)x= size/6;
if(param>0&&param<5)x= (size/6)*param;
if(param==5)x= (size/6)*param+ (size%size/6);/*pass remainder into last thread*/
qsort((void*)bigArr, x, sizeof(bigArr[param]), _comp);
pthread_exit(0);
}
int main(int argc, char *argv[])
{
FILE *source;
int i =0;
char buffer[BUF_SIZE];
if(argc!=2){
printf("Error. please enter ./a followed by the file name");
return -1;}
source= fopen(argv[1], "r");
if (source == NULL) { /*reading error msg*/
printf("Error. File not found.");
return 1;
}
int count= 0;
while (!feof (source)) {
if (fgets(buffer, sizeof (buffer), source)) {
if(count==0){ /*Convert string to int using atoi*/
char str[1];
sprintf(str, "%c%c%c%c%c%c%c%c%c",buffer[2],buffer[3],buffer[4],buffer[5],buffer[6],buffer[7],buffer[8],buffer[9],buffer[10]);/*get string of first */
size= atoi(str); /* read the size of file--> FIRST LINE of file*/
printf("SIZE: %d\n",size);
bigArr= malloc(size*sizeof(int));
}
else{
//printf("[%d]= %s\n",count-1, buffer); /*reads in the rest of the file*/
bigArr[count-1]= atoi(buffer);
}
count++;
}
}
/*thread the unsorted array*/
pthread_t tid[6]; /* the thread identifier */
pthread_attr_t attr; /* set of thread attributes */
// qsort((void*)bigArr, size, sizeof(bigArr[0]), _comp); <---- sorts array without threading
for(i=0; i<6;i++){
pthread_create(&tid[i], NULL, &threadFct, i);
pthread_join(tid[i], NULL);
}
printf("Sorted array:\n");
for(i=0; i<size;i++){
printf("%i \n",bigArr[i]);
}
fclose(source);
}
So to clarify, the problem is in my threadFct().
To explain what the function is doing: the param (thread number) identifies which chunk of the array to quicksort. I divide the size into 6 parts, and because the division is not always even, the remaining numbers go into the last chunk. So, for example, with 1,000,000 integers the first five threads would sort 166,666 each and the last would sort the remainder (166,670).
I am aware that:
- Multi-threading will not speed this up much, even for 10 million integers
- This is not the most efficient way to find the median/mode
Thanks for reading this; any help is received with gratitude.
You're sorting the beginning of the array in every call to qsort. You're only changing the number of elements that each thread sorts, by setting x. You're also setting x to the same value in threads 0 and 1.
You need to calculate an offset into the array for each thread, which is just size/6 * param. The number of elements will be size/6 except for the last chunk, which uses a modulus to get the remainder.
As mentioned in the comments, the argument to the thread function should be a pointer, not int. You can hide an integer in the pointer, but you need to use explicit casts.
void *threadFct(void* param_ptr)
{
int param = (int)(intptr_t)param_ptr; /* intptr_t (from <stdint.h>) avoids a size-mismatch cast */
int start = size/6 * param;
int length;
if (param < 5) {
length = size/6;
} else {
length = size - 5 * (size/6);
}
qsort((void*)(bigArr+start), length, sizeof(*bigArr), _comp);
pthread_exit(0);
}
and later
pthread_create(&tid[i], NULL, &threadFct, (void*)i);
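The offset and length arithmetic can be checked in isolation from the threading. The sketch below uses hypothetical helper names (cmp_int, chunk_bounds, sort_chunk) and takes the number of parts as a parameter instead of hard-coding 6; sort_chunk is the work a single thread would do for its index.

```c
#include <stdlib.h>

/* Comparison for qsort that cannot overflow, unlike returning a - b. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Start offset and length of chunk idx when size elements are split
 * into parts chunks; the last chunk absorbs the remainder. */
void chunk_bounds(int size, int parts, int idx, int *start, int *length)
{
    int base = size / parts;
    *start = base * idx;
    *length = (idx < parts - 1) ? base : size - base * (parts - 1);
}

/* Sort only chunk idx of arr, leaving the other chunks untouched. */
void sort_chunk(int *arr, int size, int parts, int idx)
{
    int start, length;
    chunk_bounds(size, parts, idx, &start, &length);
    qsort(arr + start, (size_t)length, sizeof *arr, cmp_int);
}
```

For size 1,000,000 and 6 parts this gives start 0, length 166,666 for chunk 0 and start 833,330, length 166,670 for chunk 5. Two things remain outside this sketch: the six sorted chunks still have to be merged, and calling pthread_join right after each pthread_create (as in the question's loop) runs the threads one after another, so the joins belong in a second loop.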

Reading elements into a matrix from binary file to separate processors in MPI causes segmentation faults

My goal is to have one processor read a binary file into storage for a matrix and then send each part to the processor it is assigned to for calculations. The program needs to accommodate a variable matrix size and a variable number of processors.
This is all in C
The actual allocation for the matrix, in case I'm doing it wrong:
void alloc_matrix(
int nrows,
int ncols,
size_t element_size,
void **matrix_storage,
void ***matrix,
int *errvalue){
int i;
void *ptr_to_row_in_storage;
void **matrix_row_start;
const int MSG_TAG =1;
//1. Allocate array of full matrix
*matrix_storage = malloc(nrows *ncols * element_size);
//2. create 2d matrix based on nrows.
*matrix = malloc(nrows * sizeof(void*));
//3. put addresses in each row.
//get address of start of array of pointers to linear storage, which is address of (*matrix)[0]
matrix_row_start = (void*) &(*matrix[0]);
ptr_to_row_in_storage = (void*) *matrix_storage;
//for each matrix pointer, *matrix[i], i = 0... nrows-1,
//set it to start of the ith row in linear storage
for(i = 0; i<nrows; i++){
//matrix_row_start is address of *matrix[i] and ptr_to_row_in_storage is address of start of ith row.
//changes content of (*matrix)[i] to store the start of the ith row in linear storage.
*matrix_row_start = (void*) ptr_to_row_in_storage;
matrix_row_start++;
ptr_to_row_in_storage += ncols * element_size;
}
*errvalue = SUCCESS;
}
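For comparison, the same three steps (one contiguous block, one table of row pointers, fill in the row addresses) can be written without arithmetic on void pointers, which is a compiler extension rather than standard C. This is only a sketch under assumptions: the name alloc_matrix_sketch and the 0/-1 return convention are made up here, and nrows and ncols are assumed to be positive.

```c
#include <stdlib.h>

/* Allocate nrows x ncols elements contiguously, plus a table of row
 * pointers into that block. Returns 0 on success, -1 on failure. */
int alloc_matrix_sketch(int nrows, int ncols, size_t element_size,
                        void **storage, void ***rows)
{
    size_t row_bytes = (size_t)ncols * element_size;
    char *data = malloc((size_t)nrows * row_bytes);
    void **table = malloc((size_t)nrows * sizeof *table);

    if (data == NULL || table == NULL) {
        free(data);    /* free(NULL) is a no-op */
        free(table);
        return -1;
    }

    /* Row i starts i * row_bytes into the block; char* arithmetic
     * counts in bytes and is well defined. */
    for (int i = 0; i < nrows; i++)
        table[i] = data + (size_t)i * row_bytes;

    *storage = data;
    *rows = table;
    return 0;
}
```

Because the elements are contiguous, *storage can be handed directly to fread, MPI_Send, or MPI_Recv with a count of nrows * ncols, while the row table gives [i][j] style access after a cast to the element type.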
The code for reading the file into the matrices to be sent:
void read_and_distribute_matrix(
char *filename,
void ***matrix, //matrix to fill
void **matrix_storage, //linear storage for matrix
MPI_Datatype dtype,
int *nrows, // rows in matrix
int *ncols, // columns in matrix
MPI_Comm comm
){
...
if(p-1 == id){
int nrows_to_send;
int num_elements;
size_t nelements_read;
int i;
//for process i go up to p-2 and send chunk of bytes to process i from p-1.
for(int i=0; i< p-1; i++){
nrows_to_send = number_of_rows (i, *nrows, p);
num_elements = nrows_to_send * (*ncols);
nelements_read = fread(*matrix_storage, element_size,
num_elements, file);
//check number read
MPI_Send (*matrix_storage, num_elements, dtype,
i, MSG_TAG, comm);
}
nelements_read = fread(*matrix_storage, element_size,
nlocal_rows * (*ncols), file);
fclose(file);
}
else{
MPI_Recv( *matrix_storage, nlocal_rows * (*ncols),
dtype, p-1, MSG_TAG, comm, &status);
}
So the last processor is in charge of reading the file and sending the data to the appropriate processor using MPI_Send. Each receiving processor then obtains the data with MPI_Recv and places it in its local matrix_storage. But when I try to print the matrices, I receive this message:
*** An error occurred in MPI_Recv
*** reported by process
*** on communicator MPI_COMM_WORLD
*** MPI_ERR_TAG: invalid tag
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
1 more process has sent help message help-mpi-errors.txt / mpi_errors_are_fatal
Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
I am still a novice in MPI. I have traced the error to MPI_Send, but I am not sure how to fix it. I have tried changing the method of allocation, with disastrous results, and comparing the matrix differently. Is there an area I should look at?
Thanks in advance.

C: playing audio loops in linux

I have a buffer of int16_t with some audio PCM data in it. I need to play the buffer repetitively from a point a to a point b, so that you hear an infinite audio loop.
I found that the easiest way to play sound is with libao, but I am open to any other method.
This is my code:
int play(int a, int b, char *buf);
int main()
{
int16_t *buf; /*my buffer*/
int a, b;
/* a and b are the indexes of the buffer;
* because libao wants a buffer of char,
* and buf points to int16_t, I'll pass
* the values of a and b multiplied by 2.
*/
[...]
play(2*a, 2*b, (char *) buf);
return 0;
}
int play(int a, int b, char *buf)
{
ao_device *device;
ao_sample_format format;
int default_driver;
/* -- Initialize -- */
fprintf(stderr, "libao example program\n");
ao_initialize();
/* -- Setup for default driver -- */
default_driver = ao_default_driver_id();
memset(&format, 0, sizeof(format));
format.bits = 16;
format.channels = 1;
format.rate = 44100;
format.byte_format = AO_FMT_LITTLE;
/* -- Open driver -- */
device = ao_open_live(default_driver, &format, NULL /* no options */);
if (device == NULL) {
fprintf(stderr, "Error opening device.\n");
exit(1);
}
/* -- Play the infinite loop -- */
for (;;){
ao_play(device, buf+a, b-a+1);
/*buf+a is the start of the loop, b-a+1 the number of byte to play--edited*/
}
/* -- Close and shutdown -- */
ao_close(device);
ao_shutdown();
return 0;
}
The problem is that I hear a period of silence between the end and the start of the loop. Because I'm using this code to test other code, I absolutely need to know whether it could be caused by incorrect use of libao.
Yes, it absolutely could be caused by incorrect use of libao. Since a and b are already byte offsets (twice the sample indices), b-a+1 is an odd number of bytes, so each pass ends in the middle of a 16-bit sample. Please remove the +1 from the ao_play() call, like so:
ao_play(device, buf+a, b-a);
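One way to avoid this kind of off-by-one is to keep a and b as sample indices and derive the byte values in one place. The two helpers below are hypothetical names for illustration, and they assume 16-bit mono samples as in the question's format setup.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte offset of sample index a in an int16_t buffer. */
size_t loop_start_bytes(size_t a)
{
    return a * sizeof(int16_t);
}

/* Number of bytes covering samples a .. b-1 (b itself excluded).
 * The result is always a whole number of samples. */
size_t loop_length_bytes(size_t a, size_t b)
{
    return (b - a) * sizeof(int16_t);
}
```

The call then becomes ao_play(device, (char *)buf + loop_start_bytes(a), loop_length_bytes(a, b)); to include sample b in the loop, pass b + 1 as the end index.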

Reading multiple files using MPI-IO

I'm trying to read in multiple files using MPI-IO in C. I'm following this example : http://users.abo.fi/Mats.Aspnas/PP2010/examples/MPI/readfile1.c
However, I'm reading in a matrix of doubles instead of a string of chars. Here is my implementation:
/*
Simple MPI-IO program that demonstrate parallel reading from a file.
Compile the program with 'mpicc -O2 readfile1.c -o readfile1'
*/
#include <stdlib.h>
#include <stdio.h>
#include "mpi.h"
#define FILENAME "filename.dat"
double** ArrayAllocation() {
int i;
double** array2D;
array2D= (double**) malloc(num_procs*sizeof(double*));
for(i = 0; i < num_procs; i++) {
array2D[i] = (double*) malloc(column_size*sizeof(double));
}
return array2D;
}
int main(int argc, char* argv[]) {
int i, np, myid;
int bufsize, nrchar;
double *buf; /* Buffer for reading */
double **matrix = ArrayAllocation();
MPI_Offset filesize;
MPI_File myfile; /* Shared file */
MPI_Status status; /* Status returned from read */
/* Initialize MPI */
MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &myid);
MPI_Comm_size(MPI_COMM_WORLD, &np);
/* Open the files */
MPI_File_open (MPI_COMM_WORLD, FILENAME, MPI_MODE_RDONLY,
MPI_INFO_NULL, &myfile);
/* Get the size of the file */
MPI_File_get_size(myfile, &filesize);
/* Calculate how many elements that is */
filesize = filesize/sizeof(double);
/* Calculate how many elements each processor gets */
bufsize = filesize/np;
/* Allocate the buffer to read to, one extra for terminating null char */
buf = (double *) malloc((bufsize)*sizeof(double));
/* Set the file view */
MPI_File_set_view(myfile, myid*bufsize*sizeof(double), MPI_DOUBLE,
MPI_DOUBLE,"native", MPI_INFO_NULL);
/* Read from the file */
MPI_File_read(myfile, buf, bufsize, MPI_DOUBLE, &status);
/* Find out how many elements were read */
MPI_Get_count(&status, MPI_DOUBLE, &nrchar);
/* Set terminating null char in the string */
//buf[nrchar] = (double)0;
printf("Process %2d read %d characters: ", myid, nrchar);
int j;
for (j = 0; j <bufsize;j++){
matrix[myid][j] = buf[j];
}
/* Close the file */
MPI_File_close(&myfile);
if (myid==0) {
printf("Done\n");
}
MPI_Finalize();
exit(0);
}
However when I try to call MPI_File_open after I close the first file, I get an error. Do I need multiple communicators to perform this? Any tips will be appreciated.
The code in ArrayAllocation above does not quite match the logic of the main program. The matrix is allocated as an array of pointers to vectors of doubles before MPI is initialized, therefore it is impossible to set the number of rows to the number of MPI processes.
The column_size is also not known until the file size is determined.
It is a general convention in the C language to store matrices by rows. Violating this convention might confuse you or the reader of your code.
All in all, in order to get this program working, you need to declare
int num_procs, column_size;
as global variables before the definition of ArrayAllocation and move the call to this function down below the line where bufsize is calculated:
...
/* Calculate how many elements each processor gets */
bufsize = filesize/np;
num_procs = np;
column_size = bufsize;
double **matrix = ArrayAllocation();
...
With the above modifications this example should work on any MPI implementation that supports MPI-IO. I have tested it with OpenMPI 1.2.8.
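An alternative to the global variables is to pass the dimensions as arguments, which makes the ordering requirement visible at the call site. The sketch below is an assumption-labeled variant, not the code that was tested above: the name array_allocation and the NULL-on-failure convention are made up for illustration.

```c
#include <stdlib.h>

/* Allocate rows x cols doubles, one malloc per row.
 * Returns NULL if any allocation fails, after freeing what was
 * already allocated. */
double **array_allocation(int rows, int cols)
{
    double **array2D = malloc((size_t)rows * sizeof *array2D);
    if (array2D == NULL)
        return NULL;

    for (int i = 0; i < rows; i++) {
        array2D[i] = malloc((size_t)cols * sizeof **array2D);
        if (array2D[i] == NULL) {
            /* Undo rows 0 .. i-1 before giving up. */
            while (i-- > 0)
                free(array2D[i]);
            free(array2D);
            return NULL;
        }
    }
    return array2D;
}
```

In the main program the call would be double **matrix = array_allocation(np, bufsize); placed after bufsize has been computed.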
In order to generate a test file you could use for instance the following code:
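FILE* f = fopen(FILENAME, "wb"); /* binary mode: the file holds raw doubles */
double x = 0;
int i;
for (i = 0; i < 100; i++) {
    fwrite(&x, sizeof(double), 1, f); /* write one double per iteration */
    x += 0.1;
}
fclose(f);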

fwrite not writing entire buffer

I am currently making a small test program for simple file checking. The program writes two small matrices(A and B) to files, closes and reopens them, reads in the matrices from the files, multiplies them and writes the resulting matrix(C) to a new file. It then closes and reopens this file containing the answer and prints it out for me to check if the IO operation proceeded correctly.
My problem is that the result matrix reads differently than expected.
I consider myself a beginner in C and of file input/output operations and this is the code that is causing me trouble. I am using WinXP, Codeblocks and Mingw.
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#define bufferA(i,k) (bufferA[i*cols+k])
#define bufferB(k,j) (bufferB[k*cols+j])
#define bufferC(i,j) (bufferC[i*cols+j])
void printMatrix(int *nMatrixToPrint, int nNumberOfElements, int nDimension) {
// This function prints out the element of an Array. This array represents a matrix in memory.
int nIndex;
printf("\n");
for (nIndex = 0; nIndex < nNumberOfElements; nIndex++) {
if (nIndex % nDimension == 0)
printf("\n");
printf("%d,",nMatrixToPrint[nIndex]);
}
return;
}
int main(int argc, char *argv[]) {
int nElements = 16, nDim = 4;
int A[4][4] = {{1,2,3,1},{2,2,1,2},{4,2,3,1},{5,1,1,3}};
int B[4][4] = {{3,2,1,4},{2,2,3,3},{4,1,3,2},{2,2,5,1}};
// Create files of A and B, delete old ones if present
FILE *fpA = fopen("A.dat", "w+");
FILE *fpB = fopen("B.dat", "w+");
// Write data to them
fwrite((int*)A, sizeof(*A), nElements, fpA);
fwrite((int*)B, sizeof(*B), nElements, fpB);
// and close them
fclose(fpA);
fclose(fpB);
// Reopen files
fpA = fopen("A.dat", "r");
fpB = fopen("B.dat", "r");
// Allocate memory
int *bufferA = (int*)malloc(nElements * sizeof(*bufferA));
int *bufferB = (int*)malloc(nElements * sizeof(*bufferB));
int *bufferC = (int*)calloc(nElements, sizeof(*bufferC));
// Read files
fread(bufferA, sizeof(int), nElements, fpA);
fread(bufferB, sizeof(int), nElements, fpB);
printf("\nA");
printMatrix(bufferA, nElements, nDim);
printf("\n\nB");
printMatrix(bufferB, nElements, nDim);
// Matrix multiplication
// Calculate and write to C
int i,j,k = 0; // Loop indices
int n = nDim,l = nDim, m = nDim, cols = nDim;
// multiply
for (i = 0; i < n; i++) { // Columns
for (j = 0; j < m; j++) { // Rows
//C(i,j) = 0;
for (k = 0; k < l; k++) {
bufferC(i,j) += bufferA(i,k) * bufferB(k,j);
}
}
}
printf("\n\nC_buffer");
printMatrix(bufferC, nElements, nDim);
// Create C and write to it
FILE* Cfile = fopen("C.dat", "w");
fwrite(bufferC, sizeof(*bufferC), nElements, Cfile);
// Close files
fclose(fpA);
fclose(fpB);
fclose(Cfile);
// reopen C for reading
Cfile = fopen("C.dat", "r");
// Obtain file size
fseek(Cfile , 0 , SEEK_END);
long lSize = ftell(Cfile);
rewind(Cfile);
printf("\nC file length is: %ld", lSize);
// read data into bufferA
fread(bufferA, sizeof(int), lSize, Cfile);
fclose(Cfile);
printf("\n\nC_file");
printMatrix(bufferA, nElements, nDim);
// Free allocated memory and remove dangling pointers
free(bufferA); bufferA = NULL;
free(bufferB); bufferB = NULL;
free(bufferC); bufferC = NULL;
exit(0);
}
Which gives me the following output:
A
1,2,3,1,
2,2,1,2,
4,2,3,1,
5,1,1,3,
B
3,2,1,4,
2,2,3,3,
4,1,3,2,
2,2,5,1,
C_buffer
21,11,21,17,
18,13,21,18,
30,17,24,29,
27,19,26,28,
C file length is: 64
C_file
21,11,21,17,
18,13,21,18,
30,17,24,29,
27,19,1,3,
As you can see, the last two elements in C_file are wrong, instead the output shows the last two elements in A as I was writing the file contents into bufferA. A switch to bufferB would swap the last two characters with the last elements in B which is still erroneous. A filecopy into another project would yield the last two integers as whatever was in ram at that malloc address.
My question is as follows: Why doesn't fwrite write the proper data into the file? Why does it manage the first 14 elements but not the last two? And how does this differ from my previous correct uses of fwrite and fread when I wrote and retrieved the elements of A and B?
You are writing binary data, so you have to open the files in binary mode; the default is text mode. This makes a difference on Windows, but not on *nix, which explains why it works for the other people here. In text mode on Windows, a byte with the value 26 (0x1A, Ctrl-Z) is treated as end-of-file when reading, and 26 is exactly the value of the 15th element of C, which most likely explains why only the first 14 elements come back.
For all your fopen calls, include the letter 'b' in the mode argument, e.g. replace "w+" with "w+b", replace "r" with "rb", and so on.
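As a minimal sketch of that advice, the two helpers below (hypothetical names write_ints and read_ints) open the file with "wb" and "rb" and return the number of ints actually transferred, so the caller can compare it against the expected element count instead of the file size in bytes.

```c
#include <stdio.h>

/* Write count ints as raw bytes. "wb" disables text-mode translation.
 * Returns the number of ints written, or 0 if the file could not be
 * opened or closed cleanly. */
size_t write_ints(const char *path, const int *data, size_t count)
{
    FILE *fp = fopen(path, "wb");
    if (fp == NULL) return 0;
    size_t n = fwrite(data, sizeof *data, count, fp);
    if (fclose(fp) != 0) return 0;
    return n;
}

/* Read up to count ints of raw bytes. Returns the number of ints read. */
size_t read_ints(const char *path, int *data, size_t count)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL) return 0;
    size_t n = fread(data, sizeof *data, count, fp);
    fclose(fp);
    return n;
}
```

With the 16-element result matrix, read_ints("C.dat", bufferA, 16) should return 16; a smaller return value means the read stopped early.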
Your program runs just fine on my Mac.
The results would look better if printMatrix() output a final newline. Perhaps the unterminated line is causing some sort of confusion on your system?
