Blocking pipe (C, Linux)

I have a project called "consumers and producers".
In it, my main program creates a given number of producer and consumer
child processes, as specified via command-line arguments (./main number_of_producers number_of_consumers).
Every producer generates a random number of random printable characters, writes these both to a process-specific file and to a pipe designated by the main process. Every consumer reads all characters from the pipe specified to it by the main process, and writes them to its own process-specific file. After the main program finishes, the combined number of characters in the producers' files should be the same as the combined number of characters in the consumers' files.
My problem is how to inform the consumers that there is no more data. I tried having each producer write a null character after its data, but that doesn't solve the problem: when I create more producers than consumers, not all of the data gets read, because each consumer finishes its work as soon as it reads a null character. When the number of consumers is bigger than the number of producers, on the other hand, some of the consumers never read a null character and therefore never finish.
I think I need a different mechanism to signal the end of the data, but I don't have any other idea for how to inform the consumers about the end of the pipe. I've read something about fcntl, but I don't know how to use it in my code.
Here is the code:
MAIN:
int n;
int i,w,x;
int pipeT[2];
if(pipe(pipeT) != 0) exit(EXIT_FAILURE);
printf("[MAIN] Creating %d producers!\n", atoi(argv[1]));
for(i = 0; i < atoi(argv[1]); i++) {
switch(fork()) {
case -1:
exit(3);
break;
case 0:
if((n = dup(pipeT[1])) == -1) exit(5);
close(pipeT[0]);
if((execl("./producer","producer", &n, NULL)) == -1) exit(4);
break;
default:
break;
}
}
printf("[MAIN] Creating %d consumers!\n", atoi(argv[2]));
for(i = 0; i < atoi(argv[2]); i++) {
switch(fork()) {
case -1:
exit(3);
break;
case 0: // child process
if((n = dup(pipeT[0])) == -1) exit(5);
close(pipeT[1]);
if((execl("./consumer","consumer", &n, NULL)) == -1) exit(4);
break;
default:
break;
}
}
close(pipeT[0]);
close(pipeT[1]);
for (i = 0; i < atoi(argv[1]) + atoi(argv[2]); i++) {
w = wait(&x);
if (w == -1) exit(6);
}
PRODUCER:
int main(int argc, char *argv[]) {
srand(getpid());
int k,t,n;
n = *argv[1];
char sign;
char nameOfFile[20];
sprintf(nameOfFile, "P/p_%d", getpid());
FILE* f = fopen(nameOfFile, "w");
if(f == NULL) exit (1);
int i;
int numberOfSigns = rand()%100;
for(i = 0; i < numberOfSigns; i++) {
sign = rand()%94 + 32;
if (write(n, &sign, sizeof(char)) == -1) exit(EXIT_FAILURE);
fprintf(f, "%c", sign);
}
sign = 0;
write(n, &sign, sizeof(char));
fclose(f);
return 0;
}
CONSUMER:
int main(int argc, char *argv[]) {
char sign;
char nameOfFile[20];
int n = *argv[1];
sprintf(nameOfFile, "K/k_%d", getpid());
FILE* f = fopen(nameOfFile, "w");
if(f == NULL) exit (1);
do {
if ((read(n, &sign, sizeof(char))) == -1) exit(EXIT_FAILURE);
if(sign != 0) fprintf(f, "%c" , sign);
} while(sign != 0);
fclose(f);
return 0;
}
How can I signal the end of the data so that the consumers, between them, read all of the data and every one of them recognizes the end?

You're making this harder than it needs to be.
The natural way for the consumers to recognize the end of the data is as end-of-file on the pipe. They will each see this in the form of read() returning 0, which will happen after all copies of the write end of the pipe are closed and no more data are available to be read from it. Since the consumers and producers all share the same pipe, this will also provide some natural balancing of the data among the consumers.
Note well, however, that I said all copies of the write end of the pipe must be closed. This includes the copy in the main program, as well as the copies in each producer and consumer. It turns out that you do close all these, but you are relying on termination of the producers for closure of one copy in each of them. That's a bit unclean, and it's certainly situation-specific. It would be better style for the producers to explicitly close these when they are finished with them.
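In terms of the original consumer, the read loop might then look something like this (a minimal sketch that reuses the question's n and f variables):
/* Keep reading until read() reports EOF (0) or an error (-1). */
char sign;
ssize_t r;
while ((r = read(n, &sign, sizeof sign)) > 0)
    fprintf(f, "%c", sign);   /* copy each byte into the consumer's file */
if (r == -1)
    exit(EXIT_FAILURE);       /* read error */
fclose(f);                    /* r == 0: all write ends closed and the pipe is drained */
On the producer side, the trailing null byte goes away; the producer just close()s its pipe descriptor (or simply exits) once it has written its data.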
Additionally, your code exhibits some other problems and oddities, among them:
Your uses of dup() are pointless. Since you're passing the file descriptor numbers to the children anyway, you could just as well let them use the original pipe-end file descriptors. The usual use of dup() (or better: dup2()) in circumstances like yours is to make one or more specific file descriptors be copies of the ones you are duping. Those of the standard streams, for example. And note that if you do use the standard streams then they have well-known file descriptor numbers, so you don't need to tell the children which FDs to use.
The arguments to execl() and the elements of every process's argv are pointers to strings, so type char *, terminated by a NULL pointer of that type (which is not counted in argc). You are specifying some pointers of type int * in your execl() calls, and your child processes are assuming that they receive pointers of that type. It may be that this happens to work for you, but it is incorrect, in the sense that the behaviors of parent and child are both undefined as a result.
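As an illustration of that last point, the descriptor number could be passed as a decimal string and converted back in the child (a sketch, not a drop-in patch; fdstr is just an illustrative name):
/* Parent side: format the pipe-end descriptor as text for execl(). */
char fdstr[16];
snprintf(fdstr, sizeof fdstr, "%d", pipeT[1]);
execl("./producer", "producer", fdstr, (char *)NULL);
exit(4);                      /* execl() returns only if it failed */
/* Child side: recover the descriptor number from its string form. */
int n = atoi(argv[1]);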

You should probably think harder about how you're trying to solve this. For example, what if one of the producers stops much earlier than the others, should that stop one of the consumers? If not, then the producers shouldn't send a code to the consumers, only the main program should care that they've terminated.
The way I would solve this is to have the consumers be terminated by the main program, not by the producers. The easiest way to do that is by closing the sending end of the pipe. The reading end would then get an end-of-file, i.e., the size returned by read() would be 0.
In your code, this would probably mean
1) don't send 0 from the producers
2) make the main program wait for all the producers to terminate first, and close the pipe when they've all terminated (before waiting for the consumers); see the sketch after this list
3) make the consumers exit when read() returns 0
(Also note that there are other problems with your code. For example, you can't pass pointers to integer through execl, only strings.)
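A sketch of steps 2 and 3 on the main program's side, replacing the unconditional close/wait block at the end of main. It assumes the producer PIDs are recorded in a hypothetical prodPid[] array as they are forked, and that numProducers and numConsumers stand for atoi(argv[1]) and atoi(argv[2]); waitpid() comes from <sys/wait.h>:
/* Step 2: reap the producers first, keeping pipeT[1] open until they are done.
   prodPid[], numProducers and numConsumers are hypothetical bookkeeping. */
for (i = 0; i < numProducers; i++)
    if (waitpid(prodPid[i], &x, 0) == -1) exit(6);
close(pipeT[0]);              /* main never reads from the pipe */
close(pipeT[1]);              /* last write end gone: consumers now see EOF */
/* Step 3 happens in the consumers: they exit when read() returns 0. */
for (i = 0; i < numConsumers; i++)
    if (wait(&x) == -1) exit(6);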

Related

How do I use 2 child processes one for executing command and the other one for reading output and passing it to the next?

So my program needs to pipe multiple processes and read the number of bytes each process output has.
The way I implemented it, in a for loop, we have two children:
Child 1: dups output and executes the process
Child 2: reads the output and writes it for the next input
Currently, child 1 executes the process and child 2 reads its output, but child 2 doesn't seem to write it to the right place, because on the second loop iteration the output is printed to the screen and the program blocks.
for (int i = 0; i < processes; i++) {
int result = socketpair(PF_LOCAL, SOCK_STREAM, 0, apipe[i]);
if (result == -1) {
error_and_exit();
}
int pid;
int pid2;
pid = fork_or_die();
// child one points to STDOUT
if (pid == FORK_CHILD) {
if (dup2(apipe[i][1], STDOUT_FILENO) == -1)
error_and_exit();
if (close(apipe[i][1]) == -1)
error_and_exit();
if (close(apipe[i][0]) == -1)
error_and_exit();
if (execlp("/bin/sh", "sh", "-c", tabCommande[i], (char *)NULL) == -1)
error_and_exit();
}
pid2 = fork_or_die();
//CHILD 2 reads the output and writes it for the next command to use
if(pid2 == FORK_CHILD){
FILE *fp;
fp = fopen("count", "a");
close(apipe[i][1]);
int count=0;
char str[4096];
count = read(apipe[i][0], str, sizeof(str)+1);
close(apipe[i][0]);
write(STDIN_FILENO, str, count);
fprintf(fp, "%d : %d \n ", i, count);
fclose(fp);
}
}
Your second child does write(STDIN_FILENO, …); that's not a conventional way of using standard input.
If standard input is a terminal, then the device is usually opened for reading and writing and the three standard I/O channels are created using dup() or dup2(). Thus you can read from the outputs and write to the input — but only if the streams are connected to a login terminal (window). If the input is a pipe, you can't successfully write to it, nor can you read from the output if it is a pipe. (Similarly if the input is redirected from a file or the output is redirected to a file.) This terminal setup is done by the process that creates the terminal window. It is background information explaining why writing to standard input appears on the terminal.
Anyway, that's what you're doing. You are writing to the terminal via standard input. Your minimum necessary change is to replace STDIN_FILENO with STDOUT_FILENO.
You are also going to need a loop around the reading and writing code. In general, processes generate lots of output in small chunks. The close on the input pipe will be outside the loop, of course, not between the read() and write() operations. You should check that the write() operations write all the data to the output.
You should also have the second child exit after it closes the output file. In this code, I'd probably open the file after the counting loop (or what will become the counting loop), but that's mostly a stylistic change, keeping the scope of variables to a minimum.
You will probably eventually need to handle signals like SIGPIPE (or ignore it so that the output functions return errors when the pipe is closed early). However, that's a refinement for when you have the basic code working.
Bug: you have:
count = read(apipe[i][0], str, sizeof(str)+1);
This is a request to the o/s to give you a buffer overflow — you ask it to write more data into str than str can hold. Remove the +1!
Minor note: you don’t need to check the return value from execlp() or any of that family of functions. If the call succeeds, it doesn’t return; if it returns, it failed. Your code is correct to exit after the call to execlp(), though; that's good.
You said:
I replaced STDIN_FILENO with STDOUT_FILENO in the second child but it doesn't seem to solve the issue. The output is still shown in the terminal and there's a pipe blockage after.
That observation may well be correct, but it isn't something that can be resolved by studying this code alone. The change to write to an output stream is necessary — and in the absence of any alternative information, writing to STDOUT_FILENO is better than writing to STDIN_FILENO.
That is a necessary change, but it is probably not a sufficient change. There are other changes needed too.
Did you set up the inputs and outputs for the pair of children this code creates correctly? It is very hard to know from the code shown — but given that it is not working as you intended, it's a reasonable inference that you did not get all the plumbing correct. You need to draw a diagram of how the processes are intended to operate in the larger context. At a minimum, you need to know where the standard input for each process comes from, and where its standard output goes. Sometimes, you need to worry about standard error too — most likely though, in this case, you can quietly ignore it.
This is what I think your code could look like — though the comments in it describe numerous possible variants.
#include <sys/socket.h>
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
/* The code needs these declarations and definition to compile */
extern _Noreturn void error_and_exit(void);
extern pid_t fork_or_die(void);
extern void unknown_function(void);
static ssize_t copy_bytes(int fd1, int fd2);
#define FORK_CHILD 0
int processes;
int apipe[20][2];
char *tabCommande[21];
void unknown_function(void)
{
for (int i = 0; i < processes; i++)
{
int result = socketpair(PF_LOCAL, SOCK_STREAM, 0, apipe[i]);
if (result == -1)
error_and_exit();
int pid1 = fork_or_die();
// child one points to STDOUT
if (pid1 == FORK_CHILD)
{
if (dup2(apipe[i][1], STDOUT_FILENO) == -1)
error_and_exit();
if (close(apipe[i][1]) == -1)
error_and_exit();
if (close(apipe[i][0]) == -1)
error_and_exit();
execlp("/bin/sh", "sh", "-c", tabCommande[i], (char *)NULL);
error_and_exit();
}
//CHILD 2 reads the output and writes it for the next command to use
int pid2 = fork_or_die();
if (pid2 == FORK_CHILD)
{
close(apipe[i][1]);
ssize_t count = copy_bytes(apipe[i][0], STDOUT_FILENO);
FILE *fp = fopen("count", "a");
if (fp == NULL)
error_and_exit();
/*
** Using %zd for ssize_t is a reasonable guess at a format to
** print ssize_t - but it is a guess. Alternatively, change the
** type of count to long long and use %lld. There isn't a
** documented, official (fully standardized by POSIX) conversion
** specifier for ssize_t AFAIK.
*/
fprintf(fp, "%d : %zd\n ", i, count);
fclose(fp);
exit(EXIT_SUCCESS);
}
/*
** This is crucial - the parent has all the pipes open, and the
** child processes won't get EOF until the parent closes the
** write ends of the pipes, and they won't get EOF on the inputs
** until the parent closes the read ends of the pipe.
**
** It could be avoided if the first child creates the pipe or
** socketpair and then creates the second child as a grandchild
** of the main process. That also alters the process structure
** and reduces the number of processes that the original parent
** process has to wait for. If the first child creates the
** pipe, then the apipe array of arrays becomes unnecessary;
** you can have a simple int apipe[2]; array that's local to the
** two processes. However, you may need the array of arrays so
** that you can chain the outputs of one process (pair of
** processes) to the input of the next.
*/
close(apipe[i][0]);
close(apipe[i][1]);
}
}
static ssize_t copy_bytes(int fd1, int fd2)
{
ssize_t tbytes = 0;
ssize_t rbytes;
char buffer[4096];
while ((rbytes = read(fd1, buffer, sizeof(buffer))) > 0)
{
ssize_t wbytes = write(fd2, buffer, rbytes);
if (wbytes != rbytes)
{
/*
** There are many possible ways to deal with this. If
** wbytes is negative, then the write failed, presumably
** irrecoverably. The code could break the loop, reporting
** how many bytes were written successfully to the output.
** If wbytes is zero (pretty improbable), it isn't clear
** what happened. If wbytes is positive, then you could add
** the current value to tbytes and try to write the rest in
** a loop until everything has been written or an error
** occurs. You pays your money and takes your pick.
*/
error_and_exit();
}
tbytes += wbytes;
}
if (tbytes == 0 && rbytes < 0)
tbytes = rbytes;
return tbytes;
}
You could add #include <signal.h> and signal(SIGPIPE, SIG_IGN); to the code in the second child.
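In context, that refinement might look like this (a sketch that assumes the error_and_exit() helper declared in the code above):
#include <signal.h>
/*
** With SIGPIPE ignored, a write() to a pipe whose read end has gone away
** fails with EPIPE instead of killing the process, so copy_bytes() can
** report the error rather than dying silently.
*/
if (signal(SIGPIPE, SIG_IGN) == SIG_ERR)
    error_and_exit();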

Why does read() block and wait forever in parent process despite the writing end of pipe being closed?

I'm writing a program with two processes that communicate through a pipe. The child process reads some parameters from the parent, executes a shell script with them and returns the results to the parent process line by line.
My code worked just fine until I wrote the while(read()) part at the end of the parent process. The child would execute the shell script, read its output from popen() and print it to standard output.
Now I am trying to write the results to the pipe as well and read them in the while() loop at the parent's end, but it blocks, and the child process no longer prints the results to standard output either. Apparently it doesn't even reach the point after reading the data sent through the pipe by the parent.
If I comment out the while() at the parent process, the child will print the results and return, and the program will end smoothly.
Why does the while(read()) block even if I closed the writing end of the pipe in both parent and child processes?
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>
#include <errno.h>
#include <string.h>
#include <fcntl.h>
int read_from_file(char **directory, int *octal) {
FILE *file = fopen("input", "r");
if (file == NULL) {
perror("error opening file");
exit(1);
}
fscanf(file, "%s %d", *directory, octal);
}
int main(int argc, char *argv[]) {
char *directory = malloc(256);
int *octal = malloc(sizeof *octal);
pid_t pid;
int pfd[2];
char res[256];
if (pipe(pfd) < 0) {
perror("Error opening pipe");
return 1;
}
if ((pid = fork()) < 0)
perror("Error forking");
if (pid == 0) {
printf("client here\n");
if (read(pfd[0], directory, 256) < 0)
perror("error reading from pipe");
if (read(pfd[0], octal, sizeof(int)) < 0)
perror("error reading from pipe");
// This won't get printed:
printf("client just read from pipe\n");
// close(pfd[0]);
char command[256] = "./asd.sh ";
strcat(command, directory);
char octal_c[5];
sprintf(octal_c, " %d", *octal);
strcat(command, octal_c);
FILE *f = popen(command, "r");
while (fgets(res, 256, f) != NULL) {
printf("%s", res);
if (write(pfd[1], res, 256) < 0)
perror("Error writing res to pipe");
}
fclose(f);
close(pfd[1]);
close(pfd[0]);
fflush(stdout);
return 1;
}
read_from_file(&directory, octal);
if (write(pfd[1], directory, 256) < 0)
perror("Error writing dir to pipe");
if (write(pfd[1], octal, sizeof(int)) < 0)
perror("error writing octal to pipe");
int r;
close(pfd[1]);
while (r = read(pfd[0], res, 256)) {
if (r > 0) {
printf("%s", res);
}
}
close(pfd[0]);
while (wait(NULL) != -1 || errno != ECHILD);
}
Since the child demonstrably reaches ...
printf("client here\n");
... but seems not to reach ...
printf("client just read from pipe\n");
... we can suppose that it blocks indefinitely on one of the two read() calls between. With the right timing, that explains why the parent blocks on its own read() from the pipe. But how and why does that blocking occur?
There are at least three significant semantic errors in your program:
pipes do not work well for bidirectional communication. It is possible, for example, for a process to read back the bytes that it wrote itself and intended for a different process. If you want bidirectional communication then use two pipes. In your case, I think that would have avoided the apparent deadlock, though it would not, by itself, have made the program work correctly.
write and read do not necessarily transfer the full number of bytes requested, and short reads and writes are not considered erroneous. On success, these functions return the number of bytes transferred, and if you want to be sure to transfer a specific number of bytes then you need to run the read or write in a loop, using the return values to track progress through the buffer being transferred. Or use fread() and fwrite() instead. (A sketch of such a write loop follows this list.)
Pipes convey undifferentiated streams of bytes. That is, they are not message oriented. It is not safe to assume that reads from a pipe will be paired with writes to the pipe, so that each read receives exactly the bytes written by one write. Yet your code depends on that to happen.
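A helper of the kind point 2 describes might look like this (a sketch; write_fully is not a standard function, just an illustrative name):
#include <unistd.h>
/* Keep calling write() until the whole buffer has been transferred,
   or give up on error; the caller can then inspect errno. */
static int write_fully(int fd, const void *buf, size_t len) {
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n <= 0)
            return -1;        /* error (a 0 return is treated as one too) */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}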
Here's a plausible failure scenario that could explain your observations:
The parent:
fork()s the child.
after some time performs two writes to the pipe, one from variable directory and the other from variable octal. At least the first of those is a short write.
closes its copy of the write end of the pipe.
blocks attempting to read from the pipe.
The child:
reads all the bytes written via its first read (into its copy of directory).
blocks on its second read(). It can do this despite the parent closing its copy of the write end, because the write end of the pipe is still open in the child.
You then have a deadlock. Both ends of the pipe are open in at least one process, the pipe is empty, and both processes are blocked trying to read bytes that can never arrive.
There are other possibilities that arrive at substantially the same place, too, some of them not relying on a short write.
The parent process was trying to read from the pipe before the child had read from it and written the results back. Using two different pipes for the two-way communication solved the problem.
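A minimal, self-contained sketch of that two-pipe layout (the names to_child and to_parent and the "hello" payload are purely illustrative), showing which ends each process closes:
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    int to_child[2], to_parent[2];           /* one pipe per direction */
    if (pipe(to_child) == -1 || pipe(to_parent) == -1) { perror("pipe"); return 1; }

    pid_t pid = fork();
    if (pid == -1) { perror("fork"); return 1; }

    if (pid == 0) {                          /* child: read request, write reply */
        close(to_child[1]);                  /* child never writes this pipe */
        close(to_parent[0]);                 /* child never reads this pipe */
        char req[32];
        ssize_t n = read(to_child[0], req, sizeof req);
        if (n > 0)
            write(to_parent[1], req, (size_t)n);   /* echo back as a stand-in */
        close(to_child[0]);
        close(to_parent[1]);                 /* parent now sees EOF on its read */
        return 0;
    }

    close(to_child[0]);                      /* parent never reads this pipe */
    close(to_parent[1]);                     /* parent never writes this pipe */
    write(to_child[1], "hello", 5);
    close(to_child[1]);                      /* child sees EOF on the request pipe */

    char buf[32];
    ssize_t n;
    while ((n = read(to_parent[0], buf, sizeof buf)) > 0)
        fwrite(buf, 1, (size_t)n, stdout);
    close(to_parent[0]);
    wait(NULL);
    return 0;
}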

Implementation of Sieve of Eratosthenes in C

Below is an implementation of the Sieve of Eratosthenes in C:
#include "kernel/stat.h"
#include "kernel/types.h"
#include "user/user.h"
void cull(int p) {
int n;
while (read(0, &n, sizeof(n))) {
// n is not prime
if (n % p != 0) {
write(1, &n, sizeof(n));
}
}
}
void redirect(int k, int pd[]) {
close(k);
dup(pd[k]);
close(pd[0]);
close(pd[1]);
}
void right() {
int pd[2], p;
// read p
if (read(0, &p, sizeof(p))) {
printf("prime %d\n", p);
pipe(pd);
if (fork()) {
// parent
redirect(0, pd);
right();
} else {
redirect(1, pd);
cull(p);
}
}
}
int main(int argc, char *argv[]) {
int pd[2];
pipe(pd);
if (fork()) {
redirect(0, pd);
right();
} else {
redirect(1, pd);
for (int i = 2; i < 36; i++) {
write(1, &i, sizeof(i));
}
}
exit(0);
}
I am not quite following the logic here:
1). Why does redirect need to close pd[1]?
2). cull() is reading from the file descriptor 0, but in right() the child process will close 0. Why doesn't this cause any problem?
3). Why is it necessary to use redirect here when we only want to read from 0 and write to 1?
Update:
1). I found this implementation in the following blog post:
https://abcdlsj.github.io/post/mit-6.828-lab1xv6-and-unix-utilities/
2). I think the idea is borrowed from the inventor of this algorithm.
3). I think the headers look the way they do because this is implemented on a toy operating system for educational purposes.
The code is an adaptation, to the Xv6 operating system (a re-implementation of Version 6 Unix used for teaching), of the paper Coroutine prime number sieve by M. Douglas McIlroy. The technique dates from 1968 as an experiment in coroutines via pipes; the paper explains the algorithm and the rationale.
The program implements the sieve of Eratosthenes as a kind of production line, in which each worker culls multiples of one prime from a passing stream of integers, and new workers are recruited as primes are discovered.
When Unix came to be, my fascination with coroutines led me to badger its author, Ken Thompson, to allow writes in one process to go not only to devices but also to matching reads in another process.
...the coroutine sieve has been a standard demo for languages or systems that support interprocess communication. Implementations using Unix processes typically place the three coroutines—source, cull and sink—in distinct executable files. The fact that the whole program can be written as a single source file, in a language that supports neither concurrency nor IPC, is a tribute not only to Unix’s pipes, but also to its clean separation of program initiation into fork for duplicating address spaces and exec for initializing them.
I believe using stdin and stdout is an artifact of its origins in the early days of Unix when piping stdin and stdout between processes was first introduced. It makes a lot more sense in shell.
#!/bin/bash
source() {
seq 2 1000000
}
cull() {
while true
do
read n
(($n % $1 != 0)) && echo $n
done
}
sink() {
read p
echo $p
cull $p | sink &
}
source | sink
In C, as we'll see, it's simpler to skip the redirection and pass around pipes.
First, what's going on?
redirect is redirecting stdin and stdout to a pipe. 0 is stdin and 1 is stdout. This can be made more clear by using STDIN_FILENO and STDOUT_FILENO.
main makes a pipe.
main forks.
The child redirects stdout to the pipe.
The child streams numbers to the pipe via stdout.
The first number must be 2.
main redirects stdin to the pipe.
main calls right.
right reads the first prime, 2, from stdin which is a pipe to the number stream.
[number stream] ->(2) [right]
After the initial condition, a switcheroo happens inside right.
right makes a pipe.
right forks.
The child redirects its stdout to the pipe.
The child's stdin is still reading from the number stream.
The child calls cull.
cull reads from stdin (the number stream) and writes to stdout (right).
right redirects its stdin to the pipe, reading from cull.
right recurses.
[number stream] ->(n) [cull] ->(p) [right]
After the first call right is reading primes from cull and writing them to the real stdout. cull reads candidates from the number stream and writes primes to right.
When the number-stream loop ends, that process ends and closes its pipe to cull. Once all the numbers have been read from the pipe, cull reads EOF, ending its loop and its process and closing its pipe to right. right reads EOF and returns back to main, which exits.
To explain redirect we need to understand redirection in C.
First, a simple one-way pipe.
int pd[2];
pipe(pd);
//parent
if (fork()) {
// The parent must close its copy of the write end (pd[1]); otherwise
// reads on pd[0] would never see EOF, because the parent itself would
// still hold an open write end even after the child closes its copy.
close(pd[1]);
int p;
while( read(pd[0], &p, sizeof(p)) ) {
printf("p = %d\n", p);
}
fprintf(stderr, "parent done reading\n");
}
// child
else {
// Not strictly necessary, but the child will not be reading.
close(pd[0]);
for (int i = 2; i < 10; i++) {
write(pd[1], &i, sizeof(i));
}
// Tell the parent we're done writing to the pipe.
// The parent will get EOF on its next read. If the child
// does not close the pipe, the parent will hang waiting to read.
close(pd[1]);
fprintf(stderr, "child done writing\n");
// Pipes are closed automatically when a process exits, but
// let's simulate the child not immediately exiting to
// illustrate why it's important to explicitly close pipes.
sleep(1);
}
The parent and child share a pipe. The parent reads from one end, the child writes to the other. The child closes its write end when it is done so the parent doesn't hang trying to read. The parent closes its own copy of the write end immediately; otherwise that open copy would prevent its reads from ever seeing EOF.
Instead of passing the pipe around, redirect is redirecting the parent's half to stdin and the child's half to stdout. Let's do that in our simple example using dup2. dup2 duplicates a descriptor to another, first closing the target.
int pd[2];
pipe(pd);
if (fork()) {
// Redirect pd[0] to stdin.
dup2(pd[0], STDIN_FILENO);
// The parent still has to close its copy of the write end.
close(pd[1]);
int p;
while( read(STDIN_FILENO, &p, sizeof(p)) ) {
printf("p = %d\n", p);
}
fprintf(stderr, "parent done reading\n");
} else {
// Redirect pd[1] to stdout.
dup2(pd[1], STDOUT_FILENO);
// Close the original pd[1]; stdout now holds the child's only copy of the write end.
close(pd[1]);
for (int i = 2; i < 10; i++) {
write(STDOUT_FILENO, &i, sizeof(i));
}
// Tell the parent we're done writing.
close(STDOUT_FILENO);
fprintf(stderr, "child done writing\n");
sleep(1);
}
The final twist is dup. dup duplicates pd[k] to the lowest numbered descriptor currently not in use by the process. redirect(0, pd) closes descriptor 0 and then copies pd[0] to the lowest numbered descriptor: 0.
redirect(1, pd) closes descriptor 1 and then copies pd[1] to what it hopes is the lowest numbered descriptor: 1. If something else closed 0, redirect(1, pd) will copy pd[1] to descriptor 0 and the code will not work. This can be avoided by using dup2 which makes it explicit which file descriptor you're copying to.
// close(k) and dup(pd[k]) to k safely and explicitly.
dup2(pd[k], k);
redirect can be rewritten as:
void redirect(int k, int pd[]) {
dup2(pd[k], k);
close(pd[0]);
close(pd[1]);
}
Note that this is all for a one-way pipe. cull uses two pipes, one to read from and one to write to, but the idea is the same.
By redirecting its pipe to stdin and stdout the program can use the pipe without having to pass the pipe around. This lets right read the first prime from the number generator and then let cull read the rest. It could also be done explicitly.
With some simpler examples in place, now we can answer the questions.
1). Why does redirect need to close pd[1]?
The parent must close its copy of the pipe's write end, even after it has been duplicated or closed by the child; otherwise that write end remains open and the parent will hang trying to read from the pipe.
2). cull() is reading from the file descriptor 0, but in right() the child process will close 0. Why doesn't this cause any problem?
right closes its 0 and then copies pd[0] to 0. cull does not close 0, it closes pd[0]. cull reads from the original 0 which is the number generator.
3). Why is it necessary to use redirect here when we only want to read from 0 and write to 1?
Because we need 0 and 1 to be different things at different times. We don't really want to read from 0 and write to 1. We want to read and write from pipes which happen to be attached to 0 and 1. The program is redirecting its pipes to 0 and 1 to demonstrate how Unix pipes and redirection work internally.
It took a dabbler like me some time to figure out how this program worked; it would have been a lot easier if I'd read the original paper and seen the shell script version first.
It can be rewritten to use explicit pipes. This avoids action at a distance, is easier to understand, and still demonstrates pipes and co-routines, but it no longer illustrates redirection.
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
void cull(int p, int read_d, int write_d) {
int n;
while (read(read_d, &n, sizeof(n))) {
if (n % p != 0) {
write(write_d, &n, sizeof(n));
}
}
}
void right(int read_d) {
int p;
if (read(read_d, &p, sizeof(p))) {
printf("prime %d\n", p);
int cull_pipe[2];
pipe(cull_pipe);
if (fork()) {
// right() reads from cull().
close(cull_pipe[1]);
right(cull_pipe[0]);
} else {
// cull() reads from the number generator and writes to right().
close(cull_pipe[0]);
cull(p, read_d, cull_pipe[1]);
close(cull_pipe[1]);
}
}
}
int main(int argc, char *argv[]) {
int pd[2];
pipe(pd);
if (fork()) {
// The first call to right() reads from the number generator.
close(pd[1]);
right(pd[0]);
} else {
close(pd[0]);
for (int i = 2; i < 6; i++) {
write(pd[1], &i, sizeof(i));
}
close(pd[1]);
}
exit(0);
}
Other notes:
The headers can be replaced with standard headers.
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>

Why are my child processes being waited on by parent without any work being done?

I'm sure it's something fairly simple, but for the life of me I cannot figure out why my child processes are doing no work, getting waited on, then the last one is pausing (like I'm not closing the pipes properly). At any rate, I'll post a bunch of code but this is what it's doing:
The program parses a text document, takes all of the individual words, and sends them round-robin along pipes to a specified number of child processes. I have a one-dimensional array that holds the pipe FDs, with every even index being a read end and every odd index being a write end.
After the parsing is finished, the program closes the read/write pipe ends prior to forking children (so that the parent's copies are closed). Then, within a for loop, the specified number of child processes are spawned; each child closes the write end of its corresponding pipe and opens the read end. fgets MUST be used to take the input from the pipes (I know, annoying, but it's a requirement).
After a child is done, it gets waited on by the parent process. There are some comment and debugging lines that I've added to help me, and from them it seems like the child processes are forked and entered correctly, the write pipe is closed, and the read pipe is opened, but when I call fgets() the child immediately exits and gets waited on by the parent. Interestingly, not all children get waited on. If I want the number of children to be 3, 2 processes are waited on and the 3rd gets hung up. If I want 10 processes, 5 get waited on and the 6th gets hung up.
So I am pretty certain it has something to do with fgets(), but I cannot figure out why! I have a hunch it may be something to do with newline characters being in the wrong spot when they're sent along the pipe (fgets reads up until a newline, right?), but based on the code and some additional debugging statements, the input written to the pipe by the parent process seems to be newline-terminated properly.
At any rate, here's the code for both the parser and the part that creates the children --
Parser:
char buf[PIPE_BUF];
int wordCount;
char buffer[PIPE_BUF];
char *temp;
char word[50];
FILE* inputFile = fopen(fileName, "r"); //OPENS file
//Parsing and distributing words round robin to pipes
while(fgets(buffer, (sizeof buffer), inputFile)){
//remove all non-alpha chars in buffer and converts to lowercase
int i;
for(i = 0; i < strlen(buffer); i++){
if(isalpha(buffer[i]) == 0){ //0 means it is not a letter
buffer[i] = ' ';
}
else{
buffer[i] = tolower(buffer[i]); //turn the current word to lower case
}
}
//parse words and sends them to the sort processes in a round-robin fashion
temp = strtok(buffer, " "); //splits along spaces
if(temp != NULL){
strcpy(word, temp);
strcat(word, "\n"); //puts newline at the end
}
int j = 0;
while(temp != NULL){
FILE *input = fdopen(pipefds[(j*2)+1], "w");
//close(pipefds[j*2]); //closing read pipes in parent
fputs(word, input); //puts into write pipe
printf("fputs done successfully with pipe %d with contents: %s\n", pipefds[(j*2)+1], word);
//close(pipefds[(j*2)+1]); //closing write pipe after write is done
temp = strtok(NULL, " ");
if(temp != NULL){
strcpy(word, temp);
strcat(word, "\n");
}
if(j == (numChildren - 1)){
j = 0;
}
else{
j++;
}
}
}
//need to close all parent writes, and parent reads (it's done with everything)
for(i = 0; i < numChildren; i++){
close(pipefds[i]);
}
Parent forking and getting piped data:
//Collection of children need to be created specified by numChildren
int count;
for(count = 0; count < numChildren; count++){
printf("Count: %d\n", count);
switch((p = fork())){
case -1:
perror("Could not create child");
exit(-1);
case 0:
printf("Entering child\n");
//child case, GET INPUT FROM PARENT TO SORT!!! SEND TO SUPPRESSOR (Uses both input and output)
//count[0] = read, count[1] = write
close(pipefds[(count*2)+1]); //closing write pipes in child
printf("write pipe closed in child\n");
FILE *output = fdopen(pipefds[count*2], "r"); //opening the read pipe from the parent write pipe
printf("read pipe opened in child\n");
fgets(buf, PIPE_BUF, output); //gets data from read pipe
printf("child read pipe contents read (fgets) with buf contents: %s\n", buf);
printf("child read pipe closed (%d)\n", getpid());
//execlp("sort", "sort", sortStuff,(char*)NULL);
close(pipefds[count*2]); //closing read pipe after reading is done
count = numChildren;
break;
default:
//parent case -- p holds pid of child
printf("I am the parent, PID: %d\n", getpid());
child = wait(&status);
printf("Waited on child %d\n", child);
break;
}
}
I apologize in advance for the code, I'm not the best C programmer, so things tend to get a little messy.
The major problem is with this code:
// need to close all parent writes,
// and parent reads (it's done with everything)
for(i = 0; i < numChildren; i++){
close(pipefds[i]);
}
You do this (it appears) before you create the child processes, and by doing so, you basically remove the pipes. They're gone. They no longer exist. There's nothing for the child process to read. My guess is that this line:
FILE *output = fdopen(pipefds[count*2], "r");
is failing (output is NULL) because the file descriptor has already been closed, and thus, is an invalid descriptor as far as the system is concerned.
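A small check along those lines in the child would make the failure visible (a sketch using the question's own variables; the error message text is just illustrative):
FILE *output = fdopen(pipefds[count*2], "r");
if (output == NULL) {
    perror("fdopen on read end failed");   /* e.g. EBADF if the fd was already closed */
    _exit(1);
}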
Another issue is the order of your steps. Typically, you create a pipe, then create a child process, and only after the child process is finished do you close out the pipe. I don't think I've ever seen an implementation that writes to a pipe, then creates the child processes to read from it, since there is one large problem with this: pipes are limited in size, and the parent process can block writing to a pipe (I suspect you have a small file you are testing against and thus aren't hitting the size limit of the pipe).
The order of steps I would recommend (sketched in code after the list) is:
create the pipes
create the child processes
in the parent process
close the read end of each pipe (only the read end)
read the text file
write words to the pipes
when done reading text file, close the write end of each pipe
wait for each child
in the child process(es)
close the write end of the pipe its using
while there is input from the pipe
read the input
do whatever
close the read end of the pipe
_exit()
That way, you actually receive the benefit of multiprocessing and you won't have to worry about the parent indefinitely blocking when writing (because there's always a child process reading).
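Here is a minimal, self-contained sketch of that ordering. The numChildren value, the word list, and the printf in each child are placeholders; the pipefds layout mirrors the question's even = read end, odd = write end convention:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    int numChildren = 3;                             /* placeholder value */
    int pipefds[3 * 2];                              /* even = read end, odd = write end */

    for (int c = 0; c < numChildren; c++)
        if (pipe(&pipefds[c * 2]) == -1) { perror("pipe"); exit(1); }

    for (int c = 0; c < numChildren; c++) {
        pid_t p = fork();
        if (p == -1) { perror("fork"); exit(1); }
        if (p == 0) {                                /* child c */
            for (int k = 0; k < numChildren; k++) {
                close(pipefds[k * 2 + 1]);           /* children never write */
                if (k != c) close(pipefds[k * 2]);   /* keep only our own read end */
            }
            FILE *in = fdopen(pipefds[c * 2], "r");
            char line[128];
            while (in && fgets(line, sizeof line, in))   /* loop until the parent closes */
                printf("child %d got %s", c, line);  /* placeholder for the real sorting */
            if (in) fclose(in);
            _exit(0);
        }
    }

    for (int c = 0; c < numChildren; c++)
        close(pipefds[c * 2]);                       /* parent never reads */

    const char *words[] = { "pear\n", "apple\n", "fig\n", "kiwi\n" };
    for (int w = 0; w < 4; w++)                      /* round-robin distribution */
        write(pipefds[(w % numChildren) * 2 + 1], words[w], strlen(words[w]));

    for (int c = 0; c < numChildren; c++)
        close(pipefds[c * 2 + 1]);                   /* children then see EOF */
    for (int c = 0; c < numChildren; c++)
        wait(NULL);
    return 0;
}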
When you associate a FILE* with the descriptors using this function:
FILE *input = fdopen(pipefds[(j*2)+1], "w");
then anything you do with *input is going to be buffered in addition to whatever buffering goes on in the pipe. So it's possible that whatever you think you're writing to the pipe, is actually just sitting in the FILE*'s buffer and never actually reaching the pipe. If you used fclose(input) the buffer would be flushed through at the end, but I think you're closing the underlying file descriptor, not the FILE*, which means all the buffer management for FILE* doesn't know it should finish up.
A call to fflush(input) might help.
Separately, and deeper still, if you're writing all your data to the pipe before you even start reading it, you should be aware that there is a limit to what the pipe can buffer before it won't accept any more input (until something is read out of the other end.)
EDIT: Summary, I think your data is stuck in the FILE* buffer and never even gets to the pipe. Not flushed. Also, you should probably be calling fclose(input) someplace, which will affect your need to close() the underlying file descriptor.
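A tiny self-contained demonstration of that buffering behaviour (the names fd, in and out are just for illustration): nothing reaches the reader until the writer's FILE* is flushed or fclose()d.
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    int fd[2];
    if (pipe(fd) == -1) { perror("pipe"); return 1; }

    if (fork() == 0) {                       /* child: read lines until EOF */
        close(fd[1]);
        FILE *in = fdopen(fd[0], "r");
        char line[64];
        while (in && fgets(line, sizeof line, in))
            printf("child got: %s", line);
        if (in) fclose(in);
        _exit(0);
    }

    close(fd[0]);
    FILE *out = fdopen(fd[1], "w");
    fputs("hello\n", out);                   /* sits in stdio's buffer ... */
    fflush(out);                             /* ... until flushed through to the pipe */
    fputs("world\n", out);
    fclose(out);                             /* flushes and closes fd[1], giving EOF */
    wait(NULL);
    return 0;
}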

Problem with simple pipe communication in C

I have a problem with this exercise:
Write a program in C that creates a child process, with two-way communication between father and child through pipes. The father reads a file (whose name is given by the user) and sends its characters to the child.
The child counts the number X of words that start with 'a'. If X is greater than 5, the child creates a child of its own (the grandchild). The grandchild sends the value of X to the grandfather, by whatever means you deem suitable, and exits. [By inference: if X is less than 6, then the child simply exits, and the would-be grandparent also exits.]
Note: the grandfather is the initial process, i.e. the father of the father, and therefore the grandfather of the grandchild.
Here is what I have done till now; please help me...
#include <stdio.h>
#include<string.h>
#include <stdlib.h>
#include <fcntl.h>
int main(int argc, char *argv[])
{
int fd1[2], fd2[2], pid, status,sum=0, i;
char gram[100], x;
char buff[100];
char rev[1000];
FILE *fp;
if (pipe(fd1) == -1) { /* Create a pipe */
perror("pipe");
exit(1);
}
pid = fork();
switch (pid)
{
case -1:
perror ("Fork error\n");
exit(99); // in case of error
case 0:
close(fd1[1]);//Close the writing side
rev[1000]= read(fd1[0], buff, 1000); /* Read from the pipe */
/*for (i=0; i< 1000 ; i++)
{
rev[i] = read(fd1[0], buff, sizeof(buff));
}*/
while( rev[i] != '\0')
{
if (buff[i] == 'a' )
{ sum++;
i++;
}
if (rev[i] == "")
{
if (rev[i+1]) // check the character that follows the blank
{
sum++;
i++;
}
}
i++;
}
printf("%d", sum);
exit(0);
default:
printf("dwse arxeio\n");
scanf("%s", gram);
close(fd1[0]);//Close the reading side
fp = fopen (gram,"r");
getc(fp);
fclose(fp);
write(fd1[1], buff, sizeof(buff)+1);
close(fd1[1]);//Close the writing side
wait(&status); // waits till the child process ends
}
}
You probably want to look at popen.
It starts a child process and returns a FILE * that you can use to read from the child's stdout or write to its stdin, depending on the mode you pass.
Let's call the three processes GP (for grandparent), PP (for parent process), and GC (for grandchild). In outline, I think what you need to do is:
GP creates two pipes (4 file descriptors) designated RP (read pipe) and WP (write pipe). That describes how GP will use them; PP and GC will write on RP and read on WP.
GP forks, creating PP.
GP will close the write end of RP and the read end of WP.
GP will open the file.
GP will write whatever subset of the file is appropriate to PP via WP.
GP will close WP when there is no more data to transfer.
GP should also close the file it opened.
GP will then read from RP, probably stashing the data until it gets EOF.
If it gets any information back, GP will echo that information to standard output.
GP can then terminate.
Meanwhile, step 2 above created PP, who has to do some work:
PP needs to close the read end of RP and the write end of WP.
PP sits in a loop, reading data from WP, counting whatever is relevant.
When PP gets an EOF on WP, it can decide what it needs to do.
PP can now close the read end of WP.
If its counter X is bigger than 5, then (for reasons that only make sense in homework) it will fork to create GC; it can then exit.
If its counter X does not reach the threshold, then as far as the specification goes, it can terminate immediately. For debugging, you'll probably have it print something to stdout about what it did and why.
Now you have GP and GC around; remember that GC is an almost exact copy of PP, and (in particular) GC knows the value X just as well as PP did. So, GC does some work, too:
GC formats X as a string (probably - you could do binary data transfer if you prefer, but that passes the formatting buck to GP, that's all).
GC writes the formatted X on the write end of RP.
GC closes the write end of RP.
GC exits.
GC's step 3 ensures that GP wakes up from its step 8.
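To make the outline concrete, here is a compressed, self-contained sketch of the GP/PP/GC structure described above. The file reading and the real word counting are stubbed out (it counts 'a' bytes in a hard-coded string rather than words starting with 'a'), so it only illustrates the pipe plumbing and the closes, not the assignment's counting logic:
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    int RP[2], WP[2];                        /* RP: PP/GC -> GP,  WP: GP -> PP */
    if (pipe(RP) == -1 || pipe(WP) == -1) { perror("pipe"); return 1; }

    if (fork() == 0) {                       /* PP */
        close(RP[0]); close(WP[1]);          /* close the ends PP never uses */
        char buf[256];
        ssize_t n;
        int X = 0;
        while ((n = read(WP[0], buf, sizeof buf)) > 0)
            for (ssize_t i = 0; i < n; i++)  /* crude stand-in for the word count */
                if (buf[i] == 'a') X++;
        close(WP[0]);
        if (X > 5 && fork() == 0) {          /* GC inherits the open RP[1] */
            char msg[32];
            int len = snprintf(msg, sizeof msg, "X = %d\n", X);
            write(RP[1], msg, (size_t)len);
        }
        close(RP[1]);                        /* reached by both PP and GC */
        _exit(0);
    }

    /* GP */
    close(RP[1]); close(WP[0]);              /* close the ends GP never uses */
    const char *text = "an apple and an avocado ate apricots alone again\n";
    write(WP[1], text, strlen(text));        /* stand-in for sending the file */
    close(WP[1]);                            /* PP now sees EOF on WP */
    char reply[64];
    ssize_t n;
    while ((n = read(RP[0], reply, sizeof reply)) > 0)
        fwrite(reply, 1, (size_t)n, stdout); /* echo whatever came back up RP */
    close(RP[0]);
    while (wait(NULL) != -1)
        ;                                    /* reap PP (GC is PP's child) */
    return 0;
}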
Judged as an industrial design, there is no point to creating GC; PP could perfectly well do what GC does. As a homework exercise, it passes muster. The key insight is in the comments above:
The crux of the matter is how to communicate from grandchild to grandfather. The good news is, the grandchild inherits the pipes open in the child.
The other key steps are closing the unused ends of the pipes and closing the pipes when there's nothing more to write. Without those closes, processes are apt to get stuck in deadlock. If GP fails to close the write end of RP, for example, it will never get EOF when reading from RP, because there is a process that could still write to RP - and that process is GP!
rev[1000]= read(fd1[0], buff, 1000); /* Read from the pipe */
What are you trying to accomplish with the lvalue here?
First, rev is declared as having 1000 elements, so rev[1000] is a buffer overrun...
Second, I suggest you look at the "Return value" section of the manual page for read(). It returns the number of bytes received (which may be smaller than the third parameter you specified), or 0 on end-of-file, or negative on failure. It will fill in the contents of buff with the actual data. I am not sure from your code what you are expecting the system call to act like but it doesn't seem to me like you are using it correctly.
You want to be doing something like this:
int r;
r = read(fd1[0], buff, sizeof(buff));
if (r < 0) { /* TODO: Handle error */ }
else if (r == 0) { /* TODO: Handle EOF */ }
else { /* TODO: Handle the fact that buff now contains 'r' bytes */ }
