C: Read and write between parent and multiple child processes

Say I have a function called worker(int in, int out) that reads from the file descriptor in, performs some task on what it reads, and writes the result to out.
It might look something like:
while (read(in, buffer, some_max_length) > 0){
    // Do something with buffer
    .
    .
    write(out, some_other_info, some_size);
}
Say I also have a main process that spawns a variable number of child processes and might look something like:
// Assume I have done error checking.
array_of_write_pipes[n][2];
array_of_read_pipes[n][2]; // Assume these have already been populated.
while(there is a word from stdin){
    // Spawn n processes
    for(int i = 0; i < n; i++){
        n = fork();
        if(n == 0){
            write(array_of_read_pipes[i][1], some_data, some_max_length);
            worker(some_other_data, array_of_read_pipes[i][0], array_of_write_pipes[i][1]);
            exit(0);
        }
    }
    // Block of code waiting for all children to terminate.
    // Collect the results of each child process
    for(i = 0; i < n; i++){
        while (read(array_of_write_pipes[i][0], buffer, some_max_length) > 0){
            // Do something with buffer
            .
            .
        }
    }
}
At the moment, it seems to hang.
My end goal is to have the main process send the same word to each child process. Each child process then performs the task via worker(). Then, the child process sends its results back to the main process for further processing.
At this moment, I'm not sure whether I'm even remotely going in the right direction.
I tried to keep this question as general as possible except for the parts dealing with piping. If I'm missing any information, please let me know. Just a disclaimer, this is a homework related problem and I do not want the full answer, just whether or not my thought process is correct. If not, what am I missing?
Any help is appreciated.
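For the goal described above (the same word goes to every child, one result comes back from each), the following is only a minimal sketch under stated assumptions, not a full solution to the assignment. The names broadcast_word, MAX_CHILDREN and the byte-counting worker are hypothetical. Two details are worth noting: the return value of fork() is kept in its own variable rather than in n, and every write end is closed as soon as it is no longer needed. A read loop such as the one in worker() only ends when every copy of the write end is closed, and a leftover open write end is a frequent cause of a hang.

```c
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

#define MAX_CHILDREN 16   /* hypothetical limit for this sketch */

/* Hypothetical worker: counts the bytes it reads from `in` until
 * end-of-file, then writes that count to `out`. */
static int worker(int in, int out)
{
    char buffer[64];
    long total = 0;
    ssize_t n;
    while ((n = read(in, buffer, sizeof buffer)) > 0)
        total += n;
    return write(out, &total, sizeof total) == (ssize_t)sizeof total ? 0 : 1;
}

/* Send `word` to n children and collect one long from each.
 * Returns 0 on success, -1 on failure. */
int broadcast_word(const char *word, int n, long results[])
{
    int result_fd[MAX_CHILDREN];
    if (n <= 0 || n > MAX_CHILDREN)
        return -1;

    for (int i = 0; i < n; i++) {
        int to_child[2], from_child[2];
        if (pipe(to_child) == -1 || pipe(from_child) == -1)
            return -1;
        pid_t pid = fork();          /* keep the pid separate from n */
        if (pid == -1)
            return -1;
        if (pid == 0) {              /* child */
            close(to_child[1]);      /* otherwise its own read() never sees EOF */
            close(from_child[0]);
            _exit(worker(to_child[0], from_child[1]));
        }
        close(to_child[0]);          /* parent: close the ends it does not use */
        close(from_child[1]);
        size_t len = strlen(word);
        int sent = write(to_child[1], word, len) == (ssize_t)len;
        close(to_child[1]);          /* signals end-of-input to the child */
        result_fd[i] = from_child[0];
        if (!sent)
            return -1;
    }

    int ok = 0;
    for (int i = 0; i < n; i++) {    /* collect results, then reap a child */
        if (read(result_fd[i], &results[i], sizeof results[i])
                != (ssize_t)sizeof results[i])
            ok = -1;
        close(result_fd[i]);
        wait(NULL);
    }
    return ok;
}
```

Calling broadcast_word("hello", 3, results) should leave the value 5 in each of the three result slots, since each hypothetical worker just counts the bytes of the word it received.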

Related

Averaging the output of multiple processes in C

I've been searching around about the topic, but the number of different techniques is confusing, and I'm not sure how to approach my problem.
I have a function that computes some value based on random numbers. I want to compute that value multiple times, say a few dozen or a few hundred times, and take the average. Since each run takes quite a while, I want to use multiprocessing: each process executes the function and saves its result, and the main process then sums the results and divides by the number of worker processes.
Quite simple in theory, but I have no idea how to do it. It seems that a simple way would be to do something like
loop that creates pipes
if (fork())
loop that reads the outputs of pipes
else
code of function that computes the desired value
but that somehow seems wrong. I'm really not sure how to do it.
EDIT:
To address the comments, I've been thinking about something like this:
for (int i = 0; i < n_children; ++i) {
    if (fork() == 0) { //child process
        x += estimation();
    }
}
for (int i = 0; i < n_children; ++i) //waiting for each process to end
    wait(NULL);
x /= n_children;
but I know that it won't work properly. I don't know how to store or synchronize the results.
As William Pursell mentioned in the comments, a single pipe is what you want. The parent will close the write end, and each forked child will close the read end. Each child writes its result to the pipe. The parent calls wait(2) on each child and, if the status indicates data was written to the pipe, reads the pipe and updates the average.
It could also be done with POSIX anonymous shared memory. Allocate an array of results in shared memory. Each child will have a unique value of the loop variable i when its process is created. The child writes to array[i]. The parent waits for each child. When they have all completed, iterate over the array and compute.
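The single-pipe approach can be sketched as follows. This is a simplified variation under stated assumptions rather than a definitive implementation: the parent reads until end-of-file and only then reaps the children, instead of checking each exit status before reading. The function average_over_children is a hypothetical name, and estimation() here is a deterministic stand-in for the asker's function so the result is easy to check. Each child writes a single double, which is far smaller than PIPE_BUF, so the writes do not interleave.

```c
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical stand-in for the asker's estimation(): returns a value
 * derived from the child's index instead of a random estimate. */
static double estimation(int child_index)
{
    return (double)(child_index + 1);
}

/* Fork n_children workers that each write one double to a shared pipe,
 * then average what arrives. Returns 0 on success, -1 on failure. */
int average_over_children(int n_children, double *average)
{
    int fds[2];
    if (n_children <= 0 || pipe(fds) == -1)
        return -1;

    fflush(NULL);  /* do not let children inherit unflushed stdio buffers */

    int started = 0;
    for (int i = 0; i < n_children; ++i) {
        pid_t pid = fork();
        if (pid == -1)
            break;
        if (pid == 0) {                    /* child */
            close(fds[0]);                 /* child never reads */
            double x = estimation(i);
            ssize_t w = write(fds[1], &x, sizeof x);
            _exit(w == (ssize_t)sizeof x ? 0 : 1);
        }
        started++;
    }
    close(fds[1]);  /* parent never writes; required for read() to see EOF */

    double sum = 0.0, x;
    int received = 0;
    while (read(fds[0], &x, sizeof x) == (ssize_t)sizeof x) {
        sum += x;
        received++;
    }
    close(fds[0]);

    for (int i = 0; i < started; ++i)      /* reap every child */
        wait(NULL);

    if (received != n_children)
        return -1;
    *average = sum / received;
    return 0;
}
```

With four children the stand-in values are 1, 2, 3 and 4, so average_over_children(4, &avg) should store 2.5 in avg. The order in which the results arrive does not matter for an average, which is why one shared pipe is enough here.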

Managing multi processes to do the same jobs in C

I wrote a program in which several processes each take an array, sort it, and then print it. What I would like is for each process to sort a line of integers that the main process gives it, print the sorted line, and then send it back to the main process. The algorithm works fine without processes and forking. But when I add forking, some processes print or execute some instructions more than once. Please let me know how to manage this.
Here is the code:
if (N<=NumberOfLines)
{
PortionOfProcess=NumberOfLines/N;
for (int i=0;i<N;i++)//making N process using fork
{
for (int j=0; j<PortionOfProcess; j++)//For using from function by a single process
{
int pointer=i*PortionOfProcess+j;//this points to the line for each process
printf("poniter: %d . the i is %d, the j is: %d and the portionprocess is : %d\n",pointer,i,j,PortionOfProcess);
fileopener(B_result,pointer);
mypid=fork();
if (mypid==0)//child
{
///////////do the sorting
for (int j=0 ; j<(y-1) ; j++)
{
for (int i=0 ; i<(y-1) ; i++)
{
if (B_result[i+1] < B_result[i])
{
t = B_result[i];
B_result[i] = B_result[i + 1];
B_result[i + 1] = t;
}
}
}
for (int j=0 ; j<y ; j++)
{
printf("SORTED %d \n",B_result[j]);
}
//////////////////end sorting
}
}
}
}
I am new to C programming.
Here is what fork() does: it creates an entire new copy of the process that, in most ways, is completely independent of the original. However, the original parent process does not wait for the children to finish. Nor has it any way of communicating with the children.
What you want to do is actually quite complex. The parent and child processes need to create some sort of communication channel. This is usually done by creating a pipe between them. The child then writes to the pipe like a normal file and the parent reads from the pipe. The logic will look something like this:
create pipe
fork
if parent close write end of the pipe
if child close read end of pipe
The children then do their stuff and exit normally. The parent, however, has a load of files to read and it doesn't know which order to read them in. In your case the children are fairly simple, so you could probably just read each one in the order you create it, but you may also want to look at select so that you read the results in the order they are ready.
Finally, you need to call wait or waitpid so that you get the return status of each child and do not end up with zombie processes. Take care here: because the parent blocks on input from the various pipes, any mistake you make could leave it waiting forever (or until killed).
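The steps above (one pipe per child, close the unused ends, read in creation order, then waitpid) can be sketched like this. It is a minimal illustration under stated assumptions, not the asker's program: ROWS, COLS, sort_rows_in_children and the helper functions are hypothetical names, and qsort replaces the hand-written bubble sort. Note that the child ends with _exit, so it never falls back into the forking loop, which is one way a child can end up repeating work that was meant for the parent.

```c
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

#define ROWS 3   /* hypothetical sizes for this sketch */
#define COLS 4

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Read exactly len bytes unless EOF or an error occurs first. */
static int read_full(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Sort each row in its own child process; each child sends its sorted
 * row back through its own pipe. Returns 0 on success, -1 on failure. */
int sort_rows_in_children(int rows[ROWS][COLS])
{
    int fds[ROWS][2];
    pid_t pids[ROWS];

    for (int i = 0; i < ROWS; i++) {
        if (pipe(fds[i]) == -1)
            return -1;
        pids[i] = fork();
        if (pids[i] == -1)
            return -1;
        if (pids[i] == 0) {                /* child i */
            close(fds[i][0]);              /* child closes the read end */
            qsort(rows[i], COLS, sizeof(int), cmp_int);
            ssize_t w = write(fds[i][1], rows[i], sizeof rows[i]);
            _exit(w == (ssize_t)sizeof rows[i] ? 0 : 1);
        }
        close(fds[i][1]);                  /* parent closes the write end */
    }

    int ok = 0;
    for (int i = 0; i < ROWS; i++) {       /* read in creation order */
        if (read_full(fds[i][0], rows[i], sizeof rows[i]) == -1)
            ok = -1;
        close(fds[i][0]);
        int status;
        if (waitpid(pids[i], &status, 0) == -1 ||
            !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            ok = -1;
    }
    return ok;
}
```

Reading in creation order is the simple choice suggested above; if the children could take very different amounts of time, select (or poll) would let the parent read whichever pipe is ready first.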

Pipeline idea not working due to pointers (C)

I was working on a little program I had to write some time ago, and I wanted to shorten it.
The program has one process create several children that redirect their standard input/output from one to the next with pipes, in order, except for the last child, which doesn't redirect its standard output, like this:
Parent pipe  child1  pipe  child2  pipe  last child
       __             __             __
O-----|__|-----O-----|__|-----O-----|__|-----O -> Stdout
The first time I faced this problem, I made a matrix with dimensions [n_child][2] and created a pipe in every row of that matrix, so it was very easy to connect every pipe to every child when needed. But now I want to try it with only 2 pipes, "playing" with inheritance.
Maybe I'm not explaining myself very well, so I think everything will be easier to understand with my code, so here we go.
piping_function(int number_of_child){
int i;
int olddesc[2];
int newdesc[2]; //Here I create the descriptors of the pipes I'll use
int *olddir;
int *newdir;
if(pipe(olddesc)<0 || pipe(newdesc) < 0){ //I create the pipes
//Error
}
olddir = olddesc;
newdir = newdesc; //And attach the directions to the array's direction (we will see later why)
for(i = 0; i < number_of_child; i++){
chaddr[i] = fork();
switch(chaddr[i]){
case -1:
//Error trace
case 0:
dup2(olddesc[0],0); //Here I redirect the pipe who connect the previous child's pipe to the standard input
if(i != number_of_child - 1)
dup2(newdesc[1],1); //And here, except from the last child, I redirect the standard output to the pipe who will connect to the standard input of the next child
close(olddesc[0]);
close(newdesc[1]); //I close the descriptors I don't need
//Several child operations with standard input-output (end up returning 0/1 after the pipeline is connected, so no child will create any child)
default:
if(i == 0)
dup2(olddesc[1], 1); //I want the standard output of the principal proccess only on the first pipe
olddir = newdir; //Here I would want the direction of the "old" pipe to be the direction of the "new" pipe, in order to achieve the pipeline
if(pipe(newdesc)<0)
//Error
break;
}//End of switch
}//End of for
close(olddesc[0]);close(olddesc[1]);close(newdesc[0]);close(newdesc[1]); //I don't need these descriptors anymore, as they must be redirected to the standard's input/output of the process they need.
}//End of function
Well, there is my code. I think I can see the mistake I'm making: when I set olddir to newdir and then create the pipe, olddir also ends up referring to that new pipe, right? So here comes my question:
Is there any way to achieve that change? I mean, I want to set olddir (which is the address of olddesc, right? So if I change that address, olddesc's address will also change, right?) to newdir, in order to keep using the pipe I created before to redirect the standard output of the "next" child, but I also want newdir to be a NEW pipe.
I don't really know if I explained myself well. I'm not a native speaker, and it's a bit difficult to explain this kind of idea in another language. Feel free to correct any grammar mistakes, I'd appreciate it, and to ask any questions about the code, as maybe I'm not getting across the point I wanted to make.
Thanks.
Well, I finally made it work, but by changing the point of view: instead of overwriting any addresses, I only play with the iterations in order to always keep track of one of the two pipes. I didn't really solve the address problem, but it works, so I'm satisfied.
piping_function(int number_of_child){
int i;
int olddesc[2];
int newdesc[2]; //Here I create the descriptors of the pipes I'll use
//Now we'll start from the last pipe, so we can connect the parent to the first one and not connect the last child's output.
for(i = number_of_child - 1; i >= 0; i--){
//Instead of changing addresses, I'll play with the "i"
if(i%2==0){
close(newdesc[0]); //We close the last descriptors
close(newdesc[1]);
pipe(newdesc); //And create a new pipe!
}
else{
close(olddesc[0]);
close(olddesc[1]);
pipe(olddesc);
}
chaddr[i] = fork();
switch(chaddr[i]){
case -1:
//Error trace
case 0:
if(i%2==0){
dup2(newdesc[0],0); //Here I redirect the pipe who connect the previous child's pipe to the standard input
if(i != number_of_child - 1)
dup2(olddesc[1],1); //And here, except from the last child, I redirect the standard output to the pipe who will connect to the standard input of the next child
close(newdesc[1]);
close(olddesc[0]); //I close the descriptors I don't need
}
else{
dup2(olddesc[0],0); //Here I redirect the pipe who connect the previous child's pipe to the standard input
if(i != number_of_child - 1)
dup2(newdesc[1],1); //And here, except from the last child, I redirect the standard output to the pipe who will connect to the standard input of the next child
close(olddesc[1]);
close(newdesc[0]); //I close the descriptors I don't need
}
//Several child operations with standard input-output (end up exiting 0/1 after the pipeline is connected, so no child will create any child)
default:
if(i == 0)
dup2(newdesc[1], 1); //I want the standard output of the principal proccess only on the first pipe
break;
}//End of switch
}//End of for
close(olddesc[0]);close(olddesc[1]);close(newdesc[0]);close(newdesc[1]); //I don't need these descriptors anymore, as they must be redirected to the standard's input/output of the process they need.
}//End of function
*Edit:
I don't think it's good to always close the descriptors, even when nothing has been created yet. I see I left them that way in this piece of code, but to make it work correctly we should only close descriptors that have actually been created, so an if would be needed around every close.
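A different technique from the alternating-pipes solution above, shown only as a hedged sketch: keep a single int holding the read end of the previous pipe and create one fresh pipe per iteration. No pointer swapping is needed, because only the descriptor number is carried forward. The names run_pipeline and stage are hypothetical, each stage simply adds 1 to an int, and, unlike the original program, the last stage writes into a pipe that the parent reads so the result can be checked. A child that needs the descriptors as standard input/output could call dup2(prev_read, 0) and dup2(next[1], 1) at the point where stage() is called.

```c
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical stage work: read one int, add 1, pass it on. */
static int stage(int in_fd, int out_fd)
{
    int v;
    if (read(in_fd, &v, sizeof v) != (ssize_t)sizeof v)
        return 1;
    v += 1;
    return write(out_fd, &v, sizeof v) == (ssize_t)sizeof v ? 0 : 1;
}

/* Build parent -> child 0 -> child 1 -> ... -> parent.
 * Returns 0 on success, -1 on failure. */
int run_pipeline(int n_stages, int input, int *output)
{
    int first[2];
    if (n_stages <= 0 || pipe(first) == -1)
        return -1;
    int feed_fd = first[1];    /* parent writes the input here */
    int prev_read = first[0];  /* read end the next child uses as input */

    for (int i = 0; i < n_stages; i++) {
        int next[2];
        if (pipe(next) == -1)
            return -1;
        pid_t pid = fork();
        if (pid == -1)
            return -1;
        if (pid == 0) {               /* child i */
            close(feed_fd);           /* only the parent feeds the pipeline */
            close(next[0]);           /* the following stage reads this end */
            _exit(stage(prev_read, next[1]));
        }
        close(prev_read);             /* parent is done with the old read end */
        close(next[1]);               /* and with the new write end */
        prev_read = next[0];          /* carry the new read end forward */
    }

    int ok = 0;
    if (write(feed_fd, &input, sizeof input) != (ssize_t)sizeof input)
        ok = -1;
    close(feed_fd);
    if (read(prev_read, output, sizeof *output) != (ssize_t)sizeof *output)
        ok = -1;
    close(prev_read);

    for (int i = 0; i < n_stages; i++)
        wait(NULL);
    return ok;
}
```

For example, run_pipeline(3, 10, &out) passes 10 through three stages and should leave 13 in out. Because the parent closes each descriptor as soon as the next child has inherited it, later children never hold stray copies of earlier pipes.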

understanding forking - simple

if I have a program like this:
int i = 0;
int status;
bool result = true;
for(i = 0; i < numfiles; i++) { // LOOP 1
if (fork() == 0) {/* Child */
if (substLines(s1, s2, filenames[i])) {
exit(0);
} else {
exit(2);
}
}
}
for(i = 0; i < numfiles; i++) { // LOOP 2
wait(&status);
....
}
return result;
}
I have the following question.
What happens if a child process exits before the program even knows about the wait()? I guess my question is about how a program is 'read'. For example, if the first child exits while the parent is still going through LOOP 1, what happens (does it even know about LOOP 2 at this point)?
Is this a concurrent program? The parent seems to be waiting on the children after it has forked them all, so I would say yes?
The man page of wait says
If a child has already changed state, then these calls return immediately. Otherwise they block until either a child changes state or a signal handler interrupts the call
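so question 1 doesn't matter: a child that exits before the parent reaches wait() is simply reaped when wait() is eventually called.
As for question 2, the answer is yes.
Concurrency means several tasks are in progress over the same period, whether or not they execute at the same instant. Your program is multi-process, so the children run concurrently under the operating system's scheduler.
Parallelism is the stricter case in which they literally run at the same time, which needs a multi-core CPU or more than one computer, such as a distributed system. For more info: Scheduling_(computing)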
Just an addition to @simon_xia's excellent answer.
A killed or exited process becomes a zombie until its parent calls wait for it. And yes, this is the official terminology. :-) In zombie state everything is cleaned up (memory pages, open files, env, etc), just the exit status or killing signal number are kept.
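The behaviour described in these answers can be observed with a small sketch. The function name reap_after_delay is hypothetical, and the one-second sleep is only there so the children have almost certainly exited (and become zombies) before the parent reaches wait(). The exit statuses are still delivered, because the kernel keeps them until the parent collects them.

```c
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Fork n children that exit at once with codes 0..n-1, wait a moment,
 * then reap them. Returns the sum of the exit codes, or -1 on error. */
int reap_after_delay(int n)
{
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == -1)
            return -1;
        if (pid == 0)
            _exit(i);              /* child exits immediately */
    }

    sleep(1);  /* by now the children have most likely exited already */

    int sum = 0;
    for (int i = 0; i < n; i++) {
        int status;
        if (wait(&status) == -1)   /* returns at once for an exited child */
            return -1;
        if (WIFEXITED(status))
            sum += WEXITSTATUS(status);
    }
    return sum;
}
```

With three children the exit codes are 0, 1 and 2, so reap_after_delay(3) should return 3 even though every child finished before the first wait() call.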

Fork() new process and write to files for child and parent processes

I'm new to fork() and to parent and child processes, and I have some difficulty understanding the logic behind the code that I wrote, which did not do what I expected. Here is what I have:
int main (int argc, char** argv)
{
FILE *fp_parent;
FILE *fp_child;
fp_parent = fopen ("parent.out","w");
fp_child = fopen ("child.out","w");
int test_pid;
printf ("GET HERE\n");
fprintf (fp_parent,"Begin\n"); // MY CONCERN
for (int i = 0; i < 1; i++) //for simplicity, just fork 1 process.
{ // but i want to fork more processes later
test_pid = fork();
if(test_pid < 0)
{
printf ("ERROR fork\n");
exit (0);
}
else if(test_pid == 0) // CHILD
{
fprintf(fp_child,"child\n");
break;
}
else //PARENT
{
fprintf(fp_parent,"parent\n");
}
}
fclose(fp_parent);
fclose(fp_child);
}
So the output of above code is:
to stdout: GET HERE
in parent.out:
Begin
parent
Begin
in child.out:
child
My main concern is that I don't quite understand why "Begin" gets written to parent.out twice. If I remove the for loop completely, then only one "Begin" is written, which is expected.
So I think that it's because of the fork(), and I'm definitely missing or not understanding some logic behind it. Could you please help explain?
My plan is to be able to write something to parent.out before the for loop and something else to parent.out during the for loop. The child process will write to child.out.
In C, input/output operations using the FILE structure are buffered at the level of the user process. In your case, the output you wrote to fp_parent had not actually been written to disk; it was still in a local buffer at the moment of the fork. The fork creates a copy of the whole process, including the buffer containing Begin, and this is why it appears twice in your file. Try putting fflush(fp_parent); before the fork. This will flush the buffer, and the duplicate line will disappear from the file.
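The effect of the copied buffer can be reproduced with a short sketch. The function name count_begin_lines and the file path passed to it are hypothetical. The child calls exit(), which flushes its copy of the stdio buffer, just as returning from main does in the question's code. In this sketch the parent waits for the child first, so the order of lines in the file may differ from the question's output, but the number of "Begin" lines is the same.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Write "Begin", fork, then count how many "Begin" lines end up in the
 * file. Returns that count, or -1 on error. */
int count_begin_lines(const char *path, int flush_before_fork)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    fprintf(fp, "Begin\n");        /* stays in the stdio buffer for now */
    if (flush_before_fork)
        fflush(fp);                /* buffer is empty when fork() copies it */

    pid_t pid = fork();
    if (pid == -1) {
        fclose(fp);
        return -1;
    }
    if (pid == 0)
        exit(0);                   /* exit() flushes the child's buffer copy */

    waitpid(pid, NULL, 0);
    fprintf(fp, "parent\n");
    fclose(fp);                    /* flushes the parent's buffer */

    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    char line[64];
    int count = 0;
    while (fgets(line, sizeof line, fp) != NULL)
        if (strcmp(line, "Begin\n") == 0)
            count++;
    fclose(fp);
    remove(path);                  /* clean up the demo file */
    return count;
}
```

Without the flush, both processes write their own copy of the buffered line, so the count should be 2; with fflush before the fork it should be 1.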
