Strange Output Buffering Behavior in C - c

I am trying out a C shell implementation from an open course, yet there is something intriguing about the behavior of the output buffering.
The code goes like this (note the line where I use pid = waitpid(-1, &r, WNOHANG)):
int
main(void)
{
static char buf[100];
int fd, r;
pid_t pid = 0;
// Read and run input commands.
while(getcmd(buf, sizeof(buf)) >= 0){
if(buf[0] == 'c' && buf[1] == 'd' && buf[2] == ' '){
buf[strlen(buf)-1] = 0; // chop \n
if(chdir(buf+3) < 0)
fprintf(stderr, "cannot cd %s\n", buf+3);
continue;
}
if((pid = fork1()) == 0)
runcmd(parsecmd(buf));
while ((pid = waitpid(-1, &r, WNOHANG)) >= 0) {
if (errno == ECHILD) {
break;
}
}
}
exit(0);
}
The runcmd function is like this (note that in pipe handling I create 2 child processes and wait for them to terminate):
void
runcmd(struct cmd *cmd)
{
int p[2], r;
struct execcmd *ecmd;
struct pipecmd *pcmd;
struct redircmd *rcmd;
if(cmd == 0)
exit(0);
switch(cmd->type){
case ' ':
ecmd = (struct execcmd*)cmd;
if(ecmd->argv[0] == 0) {
exit(0);
}
// Your code here ...
// fprintf(stderr, "starting to run cmd: %s\n", ecmd->argv[0]);
execvp(ecmd->argv[0], ecmd->argv);
fprintf(stderr, "exec error !\n");
exit(-1);
break;
case '>':
case '<':
rcmd = (struct redircmd*)cmd;
// fprintf(stderr, "starting to run <> cmd: %s\n", rcmd->file);
// Your code here ...
mode_t mode = S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH;
if (rcmd->type == '<') {
// input
close(0);
if (open(rcmd->file, O_RDONLY, mode) != 0) {
fprintf(stderr, "Opening file error !\n");
exit(-1);
}
} else {
// output
close(1);
if (open(rcmd->file, O_WRONLY|O_CREAT|O_TRUNC, mode) != 1) {
fprintf(stderr, "Opening file error !\n");
exit(-1);
}
}
runcmd(rcmd->cmd);
break;
case '|':
pcmd = (struct pipecmd*)cmd;
// fprintf(stderr, "starting to run pcmd\n");
// Your code here ...
pipe(p);
if (fork1() == 0) {
// child for read, right side command
close(0);
if (dup(p[0]) != 0) {
fprintf(stderr, "error when dup !\n");
exit(-1);
}
close(p[0]);
close(p[1]);
runcmd(pcmd->right);
fprintf(stderr, "exec error !\n");
}
if (fork1() == 0) {
// left side command for writing
close(1);
if (dup(p[1]) != 1) {
fprintf(stderr, "dup error !\n");
exit(-1);
}
close(p[0]);
close(p[1]);
runcmd(pcmd->left);
fprintf(stderr, "exec error !\n");
}
close(p[0]);
close(p[1]);
int stat;
wait(&stat);
wait(&stat);
break;
default:
fprintf(stderr, "unknown runcmd\n");
exit(-1);
}
exit(0);
}
The weird thing is, when I execute "ls | sort" in the terminal, I consistently get the following output:
6.828$ ls | sort
6.828$ a.out
sh.c
t.sh
This indicates that the output from the child processes has not yet reached the terminal when the next command prompt "6.828$" is printed.
However, if I replace pid = waitpid(-1, &r, WNOHANG) with pid = waitpid(-1, &r, 0) (or wait()), the output is normal:
6.828$ ls | sort
a.out
sh.c
t.sh
I have been thinking about the cause of the problem for a long time but did not come up with a possible reason. Can anyone suggest some possible reason?
Thanks a lot!

This code does not have well-defined behaviour:
while ((pid = waitpid(-1, &r, WNOHANG)) >= 0) {
if (errno == ECHILD) {
break;
}
}
The while loop breaks immediately if waitpid returns -1, which is precisely what it returns in the case of an error. So if the body of the loop is entered, waitpid returned some non-negative value: either 0 -- indicating that the child is still executing -- or the pid of a child which had exited. Those are not error conditions, so the value of errno is not meaningful. It might be ECHILD, in which case the loop will incorrectly break.
You must only check the value of errno in cases where the value is meaningful. Or, to be more precise, quoting the POSIX standard:
The value of errno shall be defined only after a call to a function for which it is explicitly stated to be set and until it is changed by the next function call or if the application assigns it a value. The value of errno should only be examined when it is indicated to be valid by a function's return value.
But I'm puzzled why you feel it necessary to busy loop using WNOHANG. That's a massive waste of resources, since your parent process will repeatedly execute the system call until the child actually terminates. Since you really intend to wait until the child terminates, it would make much more sense to just call wait or to specify 0 as a flag value to waitpid.
On the other hand, you might want to repeat the wait (or waitpid) if it returns -1 with errno set to EINTR. And if it returns -1 and errno is neither EINTR nor ECHILD, then some hard error has occurred which you might want to log. But that's not related to your problem, as far as I can see.
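To make the errno rule concrete, here is a minimal sketch of a blocking wait loop. The helper name reap_all_children is hypothetical (it is not part of the course shell), and the sketch assumes you want to wait for every child of the calling process. errno is inspected only after waitpid has returned -1.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: block until every child of the caller has been reaped.
 * Returns the number of children reaped, or -1 on an unexpected error. */
int reap_all_children(void)
{
    int reaped = 0;

    for (;;) {
        int status;
        pid_t pid = waitpid(-1, &status, 0);   /* flags 0: block, no WNOHANG */

        if (pid > 0) {          /* one child terminated and was reaped */
            reaped++;
            continue;
        }
        /* waitpid returned -1, so errno is meaningful from here on */
        if (errno == EINTR)     /* interrupted by a signal: retry */
            continue;
        if (errno == ECHILD)    /* no children left: done */
            return reaped;
        return -1;              /* some hard error worth logging */
    }
}
```

In the shell's main loop, a call to this helper could stand in for the WNOHANG loop, so the prompt is printed only after the children have exited.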

Related

Piping implementation stuck on second command

int main() {
char *cmd1[2] = { "ls", NULL };
char *cmd2[3] = { "grep", "a", NULL };
char *cmd3[3] = { "wc", "-l", NULL };
char *cmd4[5] = { "cat", NULL };
char *cmd5[5] = { "cat", NULL };
int pipe_count = 2;
int pid1, pid2, pid3, pid4, pid5;
int pfd1[2];
int pfd2[2];
pipe(pfd1);
pipe(pfd2);
if ((pid1 = fork()) == 0) {
close(pfd2[0]);
close(pfd2[1]);
close(pfd1[0]);
dup2(pfd1[1], 1);
if (execvp(cmd1[0], cmd1) == -1) {
exit(-1);
}
} else if (pid1 > 0) {
waitpid(pid1, NULL, 0);
}
if ((pid2 = fork()) == 0) {
if (pipe_count >= 2) {
close(pfd1[1]);
close(pfd2[0]);
dup2(pfd1[0], 0);
dup2(pfd2[1], 1);
} else {
close(pfd1[1]);
close(pfd2[0]);
close(pfd2[1]);
dup2(pfd1[0], 0);
}
if (execvp(cmd2[0], cmd2) == -1) {
exit(-1);
}
if (pipe_count == 1) {
printf("\n");
return 0;
}
} else if (pid2 > 0) {
waitpid(pid2, NULL, 0);
}
if (pipe_count >= 2) {
if ((pid3 = fork()) == 0) {
if (pipe_count >= 3) {
close(pfd1[0]);
close(pfd2[1]);
dup2(pfd2[0], 0);
dup2(pfd1[1], 1);
} else {
close(pfd1[0]);
close(pfd1[1]);
close(pfd2[1]);
dup2(pfd2[0], 0);
}
if (execvp(cmd3[0], cmd3) == -1) {
exit(-1);
}
if (pipe_count == 2) {
printf("\n");
}
} else if (pid3 > 0) {
waitpid(pid3, NULL, 0);
}
}
if (pipe_count >= 3) {
if ((pid4 = fork()) == 0) {
close(pfd1[1]);
close(pfd2[0]);
dup2(pfd1[0], 0);
if (pipe_count == 4)
dup2(pfd2[1], 1);
else
close(pfd2[1]);
if (execvp(cmd4[0], cmd4) == -1) {
exit(-1);
}
} else if (pid4 > 0) {
waitpid(pid4, NULL, 0);
}
}
if (pipe_count == 4) {
if ((pid5 = fork()) == 0) {
close(pfd1[0]);
close(pfd2[1]);
dup2(pfd2[0],0);
close(pfd1[1]);
if (execvp(cmd5[0], cmd5) == -1) {
exit(-1);
}
} else if (pid5 > 0) {
waitpid(pid5, NULL, 0);
}
}
return 0;
}
I'm trying to build a shell with piping command. When I input ls | grep a | wc -l for example, I realize that the program is stuck on grep a when I use ps f on the terminal. The shell is not responsive.
When I kill the child process for grep a, I'm again stuck on wc -l and have to kill it on the terminal again.
After killing the processes, no output is printed (My desired output is 2).
Any help would be appreciated.
As already diagnosed in the comments, there are many problems with the original code, including:
The most likely problem is that your parent process isn't closing the pipes before waiting for the child processes to die, so the child processes don't get EOF and don't terminate. (This was one of the problems, but far from the only problem.)
If you have N processes to run, you need N-1 pipes. You have only two pipes here; you've got a lot of work to do before you can make it work with just two pipes. The case of N=2 still has special cases: the first and last processes need to be treated a bit different from the way you treat processes 2..N-1.
You also need to run the processes in a pipeline concurrently. The controlling process should not wait for any of the children until the whole pipeline has been launched. This is because, for example, process P1 may generate so much data that it fills the pipe buffer connecting it to process P2, at which point it will block waiting for P2 to read some data. But if P2 hasn't been launched yet, P1 will never be unblocked, so the pipeline will make no progress. You need to rethink the waiting code as well as the piping code. You end up closing a lot of file descriptors.
You aren't closing enough file descriptors. Rule of thumb: If you dup2() one end of a pipe to standard input or standard output, close both of the original file descriptors from pipe() as soon as possible. In particular, that means before using any of the exec*() family of functions. The rule also applies with either dup() or fcntl() with F_DUPFD.
Note that there is no need to test the return value from the exec*() family of functions. If they succeed, they do not return; if they return, they failed. I think it is good practice, in most cases, to generate an error message before exiting after an exec*() call fails.
Putting those observations together leads to code like this:
/* SO 7412-0402 */
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>
#include <stderr.h>
int main(void)
{
char *cmd1[2] = { "ls", NULL };
char *cmd2[3] = { "grep", "a", NULL };
char *cmd3[3] = { "wc", "-l", NULL };
char *cmd4[5] = { "cat", NULL };
char *cmd5[5] = { "cat", NULL };
int pid1, pid2, pid3, pid4, pid5;
int pfd1[2];
int pfd2[2];
int pfd3[2];
int pfd4[2];
err_setarg0("pipe61");
err_setlogopts(ERR_PID | ERR_MILLI);
err_remark("Parent process\n");
if (pipe(pfd1) != 0 ||
pipe(pfd2) != 0 ||
pipe(pfd3) != 0 ||
pipe(pfd4) != 0)
err_syserr("failed to create a pipe: ");
if ((pid1 = fork()) < 0)
err_syserr("failed to fork(): ");
if (pid1 == 0)
{
err_remark("Child process 1\n");
dup2(pfd1[1], 1);
close(pfd1[0]); close(pfd1[1]);
close(pfd2[0]); close(pfd2[1]);
close(pfd3[0]); close(pfd3[1]);
close(pfd4[0]); close(pfd4[1]);
execvp(cmd1[0], cmd1);
err_syserr("failed to execute '%s': ", cmd1[0]);
}
if ((pid2 = fork()) < 0)
err_syserr("failed to fork(): ");
else if (pid2 == 0)
{
err_remark("Child process 2\n");
dup2(pfd1[0], 0);
dup2(pfd2[1], 1);
close(pfd1[0]); close(pfd1[1]);
close(pfd2[0]); close(pfd2[1]);
close(pfd3[0]); close(pfd3[1]);
close(pfd4[0]); close(pfd4[1]);
execvp(cmd2[0], cmd2);
err_syserr("failed to execute '%s': ", cmd2[0]);
}
if ((pid3 = fork()) < 0)
err_syserr("failed to fork(): ");
else if (pid3 == 0)
{
err_remark("Child process 3\n");
dup2(pfd2[0], 0);
dup2(pfd3[1], 1);
close(pfd1[0]); close(pfd1[1]);
close(pfd2[0]); close(pfd2[1]);
close(pfd3[0]); close(pfd3[1]);
close(pfd4[0]); close(pfd4[1]);
execvp(cmd3[0], cmd3);
err_syserr("failed to execute '%s': ", cmd3[0]);
}
if ((pid4 = fork()) < 0)
err_syserr("failed to fork(): ");
else if (pid4 == 0)
{
err_remark("Child process 4\n");
dup2(pfd3[0], 0);
dup2(pfd4[1], 1);
close(pfd1[0]); close(pfd1[1]);
close(pfd2[0]); close(pfd2[1]);
close(pfd3[0]); close(pfd3[1]);
close(pfd4[0]); close(pfd4[1]);
execvp(cmd4[0], cmd4);
err_syserr("failed to execute '%s': ", cmd4[0]);
}
if ((pid5 = fork()) < 0)
err_syserr("failed to fork(): ");
else if (pid5 == 0)
{
err_remark("Child process 5\n");
dup2(pfd4[0], 0);
close(pfd1[0]); close(pfd1[1]);
close(pfd2[0]); close(pfd2[1]);
close(pfd3[0]); close(pfd3[1]);
close(pfd4[0]); close(pfd4[1]);
execvp(cmd5[0], cmd5);
err_syserr("failed to execute '%s': ", cmd5[0]);
}
close(pfd1[0]); close(pfd1[1]);
close(pfd2[0]); close(pfd2[1]);
close(pfd3[0]); close(pfd3[1]);
close(pfd4[0]); close(pfd4[1]);
int status;
int corpse;
while ((corpse = wait(&status)) > 0)
printf("%d: child %d exited with status 0x%.4X\n", getpid(), corpse, status);
return 0;
}
Notice that the blocks for pid2, pid3 and pid4 are almost the same; the block for pid1 only duplicates a pipe descriptor to stdout, while the block for pid5 only duplicates a pipe descriptor to stdin.
The code for the error reporting routines is available in my SOQ (Stack Overflow Questions) repository on GitHub as files stderr.c and stderr.h in the src/libsoq sub-directory. The err(3) functions on Linux and BSD have similar functionality but different function names.
Here is the output from a sample run of the program pipe61, compiled from the source code pipe61.c shown above:
pipe61: 2022-10-19 23:52:03.833 - pid=50391: Parent process
pipe61: 2022-10-19 23:52:03.834 - pid=50392: Child process 1
pipe61: 2022-10-19 23:52:03.834 - pid=50393: Child process 2
pipe61: 2022-10-19 23:52:03.834 - pid=50394: Child process 3
pipe61: 2022-10-19 23:52:03.834 - pid=50395: Child process 4
pipe61: 2022-10-19 23:52:03.834 - pid=50396: Child process 5
50391: child 50392 exited with status 0x0000
50391: child 50393 exited with status 0x0000
16
50391: child 50394 exited with status 0x0000
50391: child 50395 exited with status 0x0000
50391: child 50396 exited with status 0x0000
Clearly, this code is not easily configurable to deal with 6 or more stages in the pipeline except by cut'n'paste programming, nor is it trivial to remove any stages from the pipeline (variable renaming). For 'real' code, you'd need to use an array-driven approach to avoid unnecessary duplication of code. You'd have the pipe descriptors stored in an array; you'd have the PIDs stored in an array. You'd probably have an array of pointers to the lists of command arguments — three-star programming. And you'd probably use a function to launch the Nth child process.
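As a rough sketch of that array-driven idea (this is not the code from the answer): run_pipeline and MAX_STAGES are hypothetical names, the commands are passed as an array of argv arrays, and the PIDs are kept in an array. One deliberate difference from the description above is that each pipe is created just before the stage that writes to it, rather than all up front, so no process ever has more than a few pipe descriptors to close.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_STAGES 16   /* arbitrary limit chosen for this sketch */

/* Hypothetical helper: run cmds[0] | cmds[1] | ... | cmds[n-1].
 * Each cmds[i] is a NULL-terminated argv array.
 * Returns the exit status of the last stage, or -1 on failure. */
int run_pipeline(char **cmds[], int n)
{
    pid_t pids[MAX_STAGES];
    int launched = 0;
    int prev_read = -1;     /* read end of the pipe feeding the next stage */
    int last_status = -1;

    if (n < 1 || n > MAX_STAGES)
        return -1;

    for (int i = 0; i < n; i++) {
        int pfd[2] = { -1, -1 };

        /* Every stage except the last needs a pipe for its output. */
        if (i < n - 1 && pipe(pfd) != 0) {
            perror("pipe");
            break;
        }
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            if (pfd[0] != -1) { close(pfd[0]); close(pfd[1]); }
            break;
        }
        if (pid == 0) {
            /* Child: wire up stdin/stdout, then close the originals. */
            if (prev_read != -1) {
                if (dup2(prev_read, STDIN_FILENO) < 0) _exit(126);
                close(prev_read);
            }
            if (pfd[1] != -1) {
                if (dup2(pfd[1], STDOUT_FILENO) < 0) _exit(126);
                close(pfd[0]);
                close(pfd[1]);
            }
            execvp(cmds[i][0], cmds[i]);
            perror(cmds[i][0]);     /* only reached if execvp failed */
            _exit(127);
        }
        /* Parent: drop the ends it no longer needs, keep the new read end. */
        if (prev_read != -1) close(prev_read);
        if (pfd[1] != -1) close(pfd[1]);
        prev_read = pfd[0];
        pids[launched++] = pid;
    }
    if (prev_read != -1)    /* still open only if the loop stopped early */
        close(prev_read);

    /* Wait only after the whole pipeline has been launched. */
    for (int i = 0; i < launched; i++) {
        int status;
        if (waitpid(pids[i], &status, 0) == pids[i] &&
            i == n - 1 && WIFEXITED(status))
            last_status = WEXITSTATUS(status);
    }
    return last_status;
}
```

For the pipeline in the question you would build char **cmds[] = { cmd1, cmd2, cmd3 } and call run_pipeline(cmds, 3). Adding or removing a stage then only changes the array, not the launching code.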

Bad file descriptors when implementing pipes & execvp

I'm currently working on an assignment that teaches us how to implement pipes in a custom shell. Before I actually implement pipes in my shell and change my code, we are asked to create two children and run a command in each child, with the two connected by a pipe:
Execute "ls -l" on child 1
Execute "tail -n 2" on child 2
Currently, my code looks like this:
int main (int argc, char * argv[]){
int debugMode=0;
int p[2];
int writeDup;
int readDup;
int status;
if (strcmp(argv[1],"-d")==0)
debugMode=1;
if (pipe(p)<0)
return 0;
int child1= fork();
if (child1 == 0)
{
if (debugMode == 1)
fprintf(stderr, "Child 1 is redirecting stdout to write end of pipe.\n");
fclose(stdout);
writeDup = dup(p[1]);
close(writeDup);
char *args[] = {"ls","-l",NULL};
if (execvp(args[0],args)<0){
if (debugMode ==1)
perror("ls -l failed ");
return 0;
}
}
else
{
if (debugMode == 1)
fprintf(stderr, "Parent process is waiting to close write end of pipe.\n");
while ((child1=waitpid(-1,&status,0))!=-1);
close(p[1]);
}
int child2 = fork();
if (child2 == 0)
{
fclose(stdin);
readDup = dup(p[0]);
close(readDup);
char *args[] = {"tail","-n","2",NULL};
if (execvp(args[0],args)<0){
if (debugMode ==1)
perror("tail -n 2 failed ");
return 0;
}
}
else{
if (debugMode == 1)
fprintf(stderr, "Parent process is closing read end of pipe.\n");
while ((child2=waitpid(-1,&status,0))!=-1);
close(p[0]);
}
if (debugMode == 1 && child1 != 0 && child2 !=0)
fprintf(stderr, "Waiting for child processes to terminate.\n");
while ((child1=waitpid(-1,&status,0))!=-1 && (child2=waitpid(-1,&status,0))!=-1 );
return 0;
}
However, while executing, I receive several errors:
ls: write error : bad file descriptor
tail: cannot fstat 'standard input': Bad file descriptor
tail: -: bad file descriptor
They requested us to close the standard inputs & outputs, so by doing so I assume that the program should default into reading/writing into the pipe. I'm continuing to try to find a solution, I would appreciate any help or direction!

How to implement successive pipes in C?

I work on a program that downloads the best (marked as '(best)') video format using youtube-dl. It reads a command-line argument then it launches a child process 'youtube-dl -F [url]'. Then it passes the line with '(best)' to a routine that extracts the format and executes, again as a child, 'youtube-dl -f [best format] [url]'. The problem is it works only for the first link. Maybe a child doesn't write to a pipe properly, maybe a parent doesn't read from the pipe. I'm lost. Thanks for your help.
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>
#define LINE_LEN 255
enum { ERROR=-1, CHILD };
void error(char *msg)
{
fprintf(stderr, "%s: %s\n", msg, strerror(errno));
exit(1);
}
void dl_best(char *format, char *url)
{
char fmt[4];
int status;
pid_t pid;
for (int i = 0; *format != ' '; format++, i++)
fmt[i] = *format;
fmt[3] = '\0';
switch(pid = fork()) {
case ERROR:
error("Failed to pipe in dl_best");
break;
case CHILD:
if (execlp("youtube-dl", "youtube-dl", "-f", fmt, url, NULL) == -1)
error("Failed to execle() in dl_best");
break;
default:
if (waitpid(pid, &status, 0) == -1)
error("Waitpid failed in dl_best()");
break;
}
}
void get_format(char *url)
{
pid_t pid;
int fd[2], status;
char line[LINE_LEN];
if (pipe(fd) == -1)
error("Pipe failed");
if ((pid = fork()) == ERROR) {
error("Failed to create a child precess in get_format()");
} else if (pid == CHILD) {
if (close(fd[0]) == -1)
error("Child failed to close reading pipe");
if (dup2(fd[1],1) == -1)
error("Dup2 failed in get_format()");
if (execlp("youtube-dl", "youtube-dl", "-F", url, NULL) == -1)
error("Failed to execute get_formats");
} else { //parent
if (close(fd[1]) == -1)
error("Parent failed to close writing pipe");
if (dup2(fd[0],0) == -1)
error("Dup2 failed in get_format()");
if (waitpid(pid, &status, 0) == -1)
error("Waitpid failed in get_format()");
while (fgets(line, LINE_LEN, stdin)) {
if (strstr(line, "(best)") != NULL)
dl_best(line, url);
}
if (close(fd[0]) == -1)
error("Parent failed to close reading pipe");
}
}
int main(int argc, char *argv[])
{
//int fd[2], status, argc_cp = argc;
//dl_best("22 ", argv[--argc]);
while (--argc)
get_format(argv[argc]);
return 0;
}
if (dup2(fd[0],0) == -1)
error("Dup2 failed in get_format()");
if (waitpid(pid, &status, 0) == -1)
error("Waitpid failed in get_format()");
while (fgets(line, LINE_LEN, stdin)) {
if (strstr(line, "(best)") != NULL)
dl_best(line, url);
}
This is a common mistake when piping from child to parent in C. With this code, the child will fill up the pipe buffer and then block waiting for the parent to drain the pipe, but the parent won't ever do that because it's blocked waiting for the child to exit, so the overall program will deadlock.
(You won't hit the deadlock for invocations where the child's complete output is smaller than the size of the pipe buffer. This may be why it appears to work for the first command line argument only.) You need to read all of the data produced by the child before you wait for the child to terminate. For this program, that's as simple as moving the waitpid and its conditional below the while loop.
Your repeated replacement of file descriptor 0 may also be running foul of the C99 rule that end-of-file is a sticky condition. (This may also explain why it appears to work for the first command line argument only.) You could address that by calling clearerr after the dup2, but it would be better not to mess with stdin at all. Instead, use fdopen to convert fd[0] into a FILE.
Putting both of those fixes together, your parent-side code in get_format should look something like this:
} else { //parent
if (close(fd[1]) != 0)
error("Parent failed to close writing pipe");
FILE *fp = fdopen(fd[0], "r");
if (!fp)
error("Parent failed to allocate a FILE");
while (fgets(line, LINE_LEN, fp)) {
if (strstr(line, "(best)") != NULL)
dl_best(line, url);
}
if (fclose(fp) != 0)
error("Parent failed to close reading pipe");
if (waitpid(pid, &status, 0) != pid)
error("Waitpid failed in get_format()");
}
(Note that fd[0] is closed together with fp, by the fclose; it is not necessary (in fact, it would be wrong) to call close on it.)
I would also consider allowing the dl_best child to run asynchronously - that is, have dl_best return the child PID rather than waiting for it itself, and then get_format waits for both children after its while loop - but that's an optimization, not a bugfix.

C program iterates too much when using fork inside while

I'm reading lines of text from a file, and for each line I process it using several { fork() --> child process invokes execvp(), and parent invokes wait() } steps.
At the end of the process I write the results to a file.
The problem is that the while loop seems to iterate too many times, and so does the writing to the file.
The results.csv file contains 6 lines instead of just 2 (the while loop iterates over a text file with 2 lines, but when I use printf it also seems like the last line is read twice).
What am I missing here?
The code example is:
FILE* results = fopen("results.csv", "w");
if (results == NULL){
fclose(fp);
perror("Failed opening results file");
exit(-1);
}
fdIn = open(inputPath, O_RDONLY);
if (fdIn < 0){
perror("Failed opening input file");
exit(-1);
}
while (fgets(student, sizeof(student), fp) != NULL) {
// override end line char of unix ('\n') with '\0'
student[strlen(student)-1] ='\0';
pid = fork();
if (pid < 0){
close(fdIn);
perror("Failed creating process for executing student's program");
exit(-1);
}
if (pid == 0) {// son process code
fdOut = open("tempOutput.txt", (O_WRONLY | O_CREAT | O_TRUNC), 0666);
if (fdOut < 0){
perror("Failed opening temporary output file");
exit(-1);
}
close(1);
dup(fdOut);
close(fdOut);
close(0);
dup(fdIn);
close(fdIn);
char studProgPath[bufSize];
strcpy(studProgPath,studentsFolderPath);
strcat(studProgPath,"/");
strcat(studProgPath,student);
strcat(studProgPath,"/");
strcat(studProgPath,"a.out");
char * args[] = {"a.out", NULL};
ret_code = execvp(studProgPath,args);
if (ret_code == -1){
perror("Failed executing student program");
exit(-1);
}
}
waited = wait(&stat);
if (stat == -1){ // need to grade 0
printf("%s,0\n",student);
}else{ // open process to compare the output with the expected
pid = fork();
if (pid < 0){
perror("Failed opening process for comparing outputs");
exit(-1);
}
if(pid == 0) { // son process
char * args[] = {"comp.exe",outputPath,"tempOutput.txt",NULL};
ret_code = execvp("comp.exe",args);
exit(ret_code);
}
waited = wait(&stat);
if (stat == -1) {
perror("Failed executing comparing program");
exit(-1);
} else if (stat == 0 || stat == 1) { // if outputs are not the same
fprintf(results,"%s,0\n",student);
} else { // matching outputs grade 100
fprintf(results,"%s,100, pid: %d\n",student,getpid());
}
}
}
The file which gets triple entries gets opened here:
FILE* results = fopen("results.csv", "w");
The following lines write to this results file, slightly before the function calls fork():
} else if (stat == 0 || stat == 1) { // if outputs are not the same
fprintf(results,"%s,0\n",student);
} else { // matching outputs grade 100
fprintf(results,"%s,100, pid: %d\n",student,getpid());
}
This file should be flushed with fflush(results) before the fork, otherwise the buffer of results might be flushed three times: in the parent, and in the two copies in the children.
Also, results and fp should be closed with fclose(results) and fclose(fp) before calling execvp. If the files are not closed, then a.out might manipulate the results file. I assume that a.out is external code which you don't control.
while (fgets(student, sizeof(student), fp) != NULL) {
// override end line char of unix ('\n') with '\0'
student[strlen(student)-1] ='\0';
fflush(results); // otherwise each child may flush the same chars
pid = fork();
if (pid < 0){
fclose(results); // otherwise ./a.out might write to this file
fclose(fp); // better also close it.
close(fdIn);
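The duplication is easy to reproduce in isolation. The following is a minimal sketch, not taken from the question: bytes_after_fork is a hypothetical name, and the scratch file comes from tmpfile(). It writes one 4-byte row, forks a child that simply calls exit(), and reports how many bytes ended up in the file.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns the size of a scratch file after "row\n"
 * is written once and the process forks, or -1 on error. */
long bytes_after_fork(int flush_first)
{
    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;

    fputs("row\n", fp);         /* 4 bytes, still sitting in the stdio buffer */
    if (flush_first)
        fflush(fp);             /* buffer is empty before the fork */

    pid_t pid = fork();
    if (pid < 0) {
        fclose(fp);
        return -1;
    }
    if (pid == 0)
        exit(0);                /* exit() flushes the child's copy of the buffer */

    waitpid(pid, NULL, 0);
    fflush(fp);                 /* the parent flushes its own copy */
    fseek(fp, 0, SEEK_END);
    long size = ftell(fp);
    fclose(fp);
    return size;
}
```

On a typical POSIX system, bytes_after_fork(0) reports 8 because both processes flush the same buffered row, while bytes_after_fork(1) reports 4.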

Writing own shell - code hangs when handling certain pipes - in C

I'm currently writing my own shell as a project for a class, and have virtually everything working. My problem is with my pipes: sometimes they work, and sometimes they just hang until I interrupt them. I've done research on this, and it seems that the process whose stdin is being written to isn't receiving an EOF from the first process. As I've learned, the usual cause is that the pipe isn't being closed, but this isn't the case (to my knowledge) with my code.
All redirection works and any variation thereof:
ls -l > file1
wc < file1 > file2
The following piped commands work:
w | head -n 4
w | head -n 4 > file1
This doesn't work: ls | grep file1. It shows the correct output but never ends unless the user sends it an interrupt signal. ls | grep file1 > file2 also does not work: it hangs without showing output and creates file2, but never writes to it.
Anyway, I hope there's something I'm missing that someone else can notice; I've been at this for a while. Let me know if there's any more code I can provide. The code I've posted below is the main file, nothing removed.
/*
* This code implements a simple shell program
* At this time it supports just simple commands with
* any number of args.
*/
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <errno.h>
#include <signal.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <stdlib.h>
#include <string.h>
#include "input.h"
#include "myShell.h"
#include "BackgroundStack.h"
/*
* The main shell function
*/
main() {
char *buff[20];
char *inputString;
BackgroundStack *bgStack = malloc(sizeof(BackgroundStack));
initBgStack(bgStack);
struct sigaction new_act;
new_act.sa_handler = sigIntHandler;
sigemptyset ( &new_act.sa_mask );
new_act.sa_flags = SA_RESTART;
sigaction(SIGINT, &new_act, NULL);
// Loop forever
while(1) {
const char *chPath;
doneBgProcesses(bgStack);
// Print out the prompt and get the input
printPrompt();
inputString = get_my_args(buff);
if (buff[0] == NULL) continue;
if (buff[0][0] == '#') continue;
switch (getBuiltInCommand(buff[0])) {
case EXIT:
exit(0);
break;
case CD:
chPath = (buff[1]==NULL) ? getenv("HOME") : buff[1];
if (chdir(chPath) < 0) {
perror(": cd");
}
break;
default:
do_command(buff, bgStack);
}
//free up the malloced memory
free(inputString);
}// end of while(1)
}
static void sigIntHandler (int signum) {}
/*
* Do the command
*/
int do_command(char **args, BackgroundStack *bgStack) {
int status, statusb;
pid_t child_id, childb_id;
char **argsb;
int pipes[2];
int isBgd = isBackgrounded(args);
int hasPipe = hasAPipe(args);
if (isBgd) removeBackgroundCommand(args);
if (hasPipe) {
int cmdBi = getSecondCommandIndex(args);
args[cmdBi-1] = NULL;
argsb = &args[cmdBi];
pipe(pipes);
}
// Fork the child and check for errors in fork()
if((child_id = fork()) == -1) {
switch(errno) {
case EAGAIN:
perror("Error EAGAIN: ");
return;
case ENOMEM:
perror("Error ENOMEM: ");
return;
}
}
if (hasPipe && child_id != 0) {
childb_id = fork();
if(childb_id == -1) {
switch(errno) {
case EAGAIN:
perror("Error EAGAIN: ");
return;
case ENOMEM:
perror("Error ENOMEM: ");
return;
}
}
}
if(child_id == 0 || (childb_id == 0 && hasPipe)) {
if (child_id != 0 && hasPipe) args = argsb;
if (child_id == 0 && isBgd) {
struct sigaction new_act;
new_act.sa_handler = SIG_IGN;
sigaction(SIGINT, &new_act, 0);
}
if (child_id == 0 && hasPipe) {
if (dup2(pipes[1], 1) != 1) fatalPerror(": Pipe Redirection Output Error");
close(pipes[0]);
close(pipes[1]);
}
if (child_id != 0 && hasPipe) {
if (dup2(pipes[0], 0) != 0) fatalPerror(": Pipe Redirection Input Error");
close(pipes[0]);
close(pipes[1]);
waitpid(child_id, NULL, 0);
}
if ((child_id != 0 && hasPipe) || !hasPipe) {
if (hasAReOut(args)) {
char outFile[100];
getOutFile(args, outFile);
int reOutFile = open(outFile, O_RDWR|O_CREAT|O_TRUNC, S_IREAD|S_IWRITE);
if (reOutFile<0) fatalPerror(": Redirection Output Error");
if (dup2(reOutFile,1) != 1) fatalPerror(": Redirection Output Error");
close(reOutFile);
}
}
if ( (child_id == 0 && hasPipe) || !hasPipe) {
if (hasAReIn(args)) {
char inFle[100];
getInFile(args, inFle);
int reInFile = open(inFle, O_RDWR);
if (reInFile<0) fatalPerror(": Redirection Input Error");
if (dup2(reInFile,0) != 0) fatalPerror(": Redirection Input Error");
close(reInFile);
} else if (isBgd && !hasPipe) {
int bgReInFile = open("/dev/null", O_RDONLY);
if (bgReInFile<0) fatalPerror(": /dev/null Redirection Input Error");
if (dup2(bgReInFile,0) != 0) fatalPerror(": /dev/null Redirection Input Error");
close(bgReInFile);
}
}
// Execute the command
execvp(args[0], args);
perror(args[0]);
exit(-1);
}
// Wait for the child process to complete, if necessary
if (!isBgd) waitpid(child_id, &status, 0);
else if (!hasPipe) {
printf("Child %ld started\n", (long)child_id);
BackgroundProcess *bgPrs = malloc(sizeof(BackgroundProcess));
bgPrs->pid = child_id;
bgPrs->exitStatus = -1;
addProcessToBgStack(bgStack, bgPrs);
}
if (hasPipe) waitpid(childb_id, &statusb, 0);
if ( WIFSIGNALED(status) && !isBgd ) printf("Child %ld terminated due to signal %d\n", (long)child_id, WTERMSIG(status) );
if ( hasPipe && WIFSIGNALED(statusb) ) printf("Child %ld terminated due to signal %d\n", (long)childb_id, WTERMSIG(status) );
} // end of do_command
The second child should not wait for the first child to exit - it should just start running straight away (it will block until some output is produced on the pipe by the first child), so remove that waitpid() executed by childb. Instead, the parent process should wait for both child processes (or perhaps just the second one). (Indeed, as noted by JeremyP, this waitpid() call is failing anyway, since childb is not the parent of child).
Your problem, though, is that the parent process is maintaining open file descriptors to the pipe. Right before the comment // Wait for the child process to complete, if necessary, the parent process should close its pipe file descriptors:
close(pipes[0]);
close(pipes[1]);
The open file descriptor in the parent means that the child grep process never sees EOF, so it doesn't exit.
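The same rule can be seen in a smaller setting. The sketch below is not the shell's code: read_child_output is a hypothetical name, and here the parent itself plays the role of the reader (the position grep is in). The read loop only ends because the parent closed its own copy of the write end first.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: a child writes msg into a pipe and exits; the parent
 * reads until EOF and returns the number of bytes read, or -1 on error. */
long read_child_output(const char *msg)
{
    int p[2];
    if (pipe(p) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    if (pid == 0) {
        close(p[0]);                        /* the writer does not read */
        if (write(p[1], msg, strlen(msg)) < 0)
            _exit(1);
        close(p[1]);
        _exit(0);
    }

    /* Without this close, read() below would never return 0, because
     * the parent would still hold a write end of the pipe open. */
    close(p[1]);

    long total = 0;
    char buf[256];
    ssize_t n;
    while ((n = read(p[0], buf, sizeof buf)) > 0)
        total += n;

    close(p[0]);
    waitpid(pid, NULL, 0);
    return n < 0 ? -1 : total;
}
```

If you comment out the close(p[1]) in the parent, the loop blocks forever after the child exits, which is the hang seen with ls | grep file1.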
I don't know the answer but I have spotted one issue.
You'll agree that the condition for
if(child_id == 0 || (childb_id == 0 && hasPipe))
is true only for the two child processes, but inside the if statement block you have this:
if (child_id != 0 && hasPipe) {
if (dup2(pipes[0], 0) != 0) fatalPerror(": Pipe Redirection Input Error");
close(pipes[0]);
close(pipes[1]);
waitpid(child_id, NULL, 0);
}
The waitpid() call is incorrect because it is called from the second child to wait for the first child. It's probably failing with ECHILD because the first child is not a child of the second child.
As for your real problem, I suspect it has to do with the fact that the grep command will not terminate until its input is closed. There might be some deadlock condition going on that stops that from happening. You need to run this in a debugger or put some logging in to see where the parent process is hanging.
Edit
caf's answer tells us everything.
I was assuming that the input to grep was being closed because ls will close its output when it terminates, but of course, the parent process also has grep's input file descriptor open. The version using head works properly because head -n 4 terminates after four lines regardless of whether its input file descriptor is closed or not.

Resources