I'm working on my homework which is to replicate the unix command shell in C.
I've implemented till single command execution with background running (&).
Now I'm at the stage of implementing pipes, and I've hit a problem. For pipelines with more than one pipe, the child commands run to completion, but the final output doesn't get displayed on stdout. (The last command's stdin is replaced with the read end of the last pipe.)
dup2(pipes[lst_cmd], 0);
I tried fflush(STDIN_FILENO) at the parent too.
The exit of my program is CONTROL-D, and when I press that, the output gets displayed (and the shell also exits, since my handling of CONTROL-D is to call exit(0)).
I think the output of the pipe is in the stdout buffer but doesn't get displayed. Is there any other means than fflush to get the stuff in the buffer to stdout?
Having seen the code (unfair advantage), the primary problem was the process structure combined with not closing pipes thoroughly.
The process structure for a pipeline ps | sort was:
main shell
- coordinator sub-shell
  - ps
  - sort
The main shell was creating N pipes (N = 1 for ps | sort). The coordinator shell was then created; it would start the N+1 children. It did not, however, wait for them to terminate, nor did it close its copy of the pipes. Nor did the main shell close its copy of the pipes.
The more normal process structure would probably do without the coordinator sub-shell. There are two mechanisms for generating the children. Classically, the main shell would fork one sub-process; it would do the coordination for the first N processes in the pipeline (including creating the pipes in the first place), and then would exec the last process in the pipeline. The main shell waits for the one child to finish, and the exit status of the pipeline is the exit status of the child (aka the last process in the pipeline).
More recently, bash provides a mechanism whereby the main shell gets the status of each child in the pipeline; it does the coordination.
The primary fixes (apart from some mostly minor compilation warnings) were:
main shell closes all pipes after forking coordinator.
main shell waits for coordinator to complete.
coordinator closes all pipes after forking pipeline.
coordinator waits for all processes in pipeline to complete.
coordinator exits (instead of returning to provide duelling dual prompts).
A better fix would eliminate the coordinator sub-shell (it would behave like the classical system described).
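The classical structure described above can be sketched roughly as follows. This is a minimal illustration, not the asker's code: run_pipeline_classic() is a hypothetical name, and each cmds[i] is assumed to be a NULL-terminated argv array. The single sub-process forks the first N commands, rewires its own stdin to each new pipe as it goes, closes both pipe descriptors every time, and finally execs the last command. The main shell only ever waits for that one child.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run cmds[0] | ... | cmds[n-1] and return the exit
 * status of the last command, or -1 on error. */
int run_pipeline_classic(char **cmds[], int n)
{
    if (n < 1)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Sub-process: coordinates the first n-1 commands. */
        for (int i = 0; i < n - 1; i++) {
            int p[2];
            pid_t stage;

            if (pipe(p) < 0 || (stage = fork()) < 0)
                _exit(127);
            if (stage == 0) {
                /* Stage i: stdin is inherited (the previous pipe, if any);
                 * stdout goes to the new pipe. */
                dup2(p[1], STDOUT_FILENO);
                close(p[0]);
                close(p[1]);
                execvp(cmds[i][0], cmds[i]);
                _exit(127);            /* only reached if exec failed */
            }
            /* The next stage will read from this pipe via stdin. */
            dup2(p[0], STDIN_FILENO);
            close(p[0]);
            close(p[1]);
        }
        /* Become the last command in the pipeline. */
        execvp(cmds[n - 1][0], cmds[n - 1]);
        _exit(127);
    }

    /* Main shell: it never held a pipe descriptor, so it just waits. */
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the pipes are created in the sub-process, the main shell has nothing to close, and the status it collects is that of the last command, as described above.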
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main()
{
    int pid = fork();
    if (pid) {
        sleep(5);
        // wait(NULL); // works fine when waited for it.
    } else {
        execlp("vim", "vim", (char *)NULL);
    }
}
When I run this code, vim runs normally then crashes after the 5 seconds (i.e. when its parent exits). When I wait for it (i.e. not letting it become an orphan process), the code works totally fine.
Why does becoming an orphan process become a problem here? Is it something specific to vim?
Why is this even a thing that's visible to vim? I thought that only the parent knows when its children die. But here, I see that the child somehow notices when it gets adopted, and then something happens and it crashes. Do child processes get notified when their parent dies as well?
When I run this code, I get this output after the crash:
Vim: Error reading input, exiting...
Vim: preserving files...
Vim: Finished.
This actually happens because of the shell that is executing the binary that forks Vim!
When the shell runs a foreground command, it creates a new process group and makes it the foreground process group of the terminal attached to the shell. In bash 5.0, you can find the code that transfers this responsibility in give_terminal_to(), which uses tcsetpgrp() to set the foreground process group.
It is necessary to set the foreground process group of a terminal correctly, so that the program running in foreground can get signals from the terminal (for example, Ctrl+C sending an interrupt signal, Ctrl+Z sending a terminal stop signal to suspend the process) and also change terminal settings in ways that full-screen programs such as Vim typically do. (The subject of foreground process group is a bit out of scope for this question, just mentioning it here since it plays part in the response.)
When the process (more precisely, the pipeline) executed by the shell terminates, the shell will take back the foreground process group, using the same give_terminal_to() code by calling it with the shell's process group.
This is usually fine, because at the time the executed pipeline is finished, there's usually no process left on that process group, or if there are any, they typically don't hold on to the terminal (for example, if you're launching a background daemon from the shell, the daemon will typically close the stdin/stdout/stderr streams to relinquish access to the terminal.)
But that's not really the case with the setup you proposed, where Vim is still attached to the terminal and part of the foreground process group. When the parent process exits, the shell assumes the pipeline is finished and it will set the foreground process group back to itself, "stealing" it from the former foreground process group which is where Vim is. Consequently, the next time Vim tries to read from the terminal, the read will fail and Vim will exit with the message you reported.
One way to see for yourself that the parent process exiting does not, by itself, affect Vim is to run it through strace. For example, with the following command (assuming ./vim-launcher is your binary):
$ strace -f -o /tmp/vim-launcher.strace ./vim-launcher
Since strace is running with the -f option to follow forks, it will also start tracing Vim when it's launched. The shell will be executing strace (not vim-launcher), so its foreground pipeline will only end when strace stops running. And strace will not stop running until Vim exits. Vim will work just fine past the 5 seconds, even though it's been reparented to init.
There also used to be an fghack tool, part of daemontools, that accomplished the same task of blocking until all forked children would exit. It would accomplish that by creating a new pipe and have the pipe inherited by the process it spawned, in a way that would get automatically inherited by all other forked children. That way, it could block until all copies of that pipe file descriptor were closed, which typically only happens when all processes exit (unless a background process goes out of its way to close all inherited file descriptors, but that's essentially stating that they don't want to be tracked, and they would most probably have relinquished their access to the terminal by that point.)
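The pipe trick described for fghack can be sketched in a few lines of C. This is an illustration of the mechanism only, not fghack's actual source, and run_and_wait_for_descendants() is a hypothetical name. The launcher keeps the read end of a pipe, lets the write end leak into the child (and therefore into everything the child forks), and then blocks in read() until every copy of the write end has been closed.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv and return only after it and every
 * descendant still holding the inherited pipe descriptor has exited.
 * Returns the direct child's exit status, or -1 on error. */
int run_and_wait_for_descendants(char *const argv[])
{
    int p[2];
    if (pipe(p) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    if (pid == 0) {
        close(p[0]);               /* keep only the write end; it is inherited */
        execvp(argv[0], argv);     /* by everything this program forks */
        _exit(127);
    }

    close(p[1]);                   /* otherwise read() below never sees EOF */

    char buf[256];
    ssize_t n;
    /* Nobody is expected to write; read() returns 0 once all write ends
     * are closed, which normally means all descendants have exited. */
    while ((n = read(p[0], buf, sizeof buf)) > 0 || (n < 0 && errno == EINTR))
        ;
    close(p[0]);

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

If the shell runs a launcher built this way, its foreground pipeline does not end until the orphaned descendants are gone, so the terminal is not taken back while Vim is still using it.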
I have a collection of processes that I'm starting from a shell script as follows:
#!/bin/bash
#This is a shortcut to start multiple storage services
function finish {
    alljobs=$(jobs -p)
    if [ ! -z "$alljobs" ]; then
        kill $alljobs >/dev/null 2>&1
    else
        echo "Entire trio ceased running"
    fi
}
trap finish EXIT
./storage &
P1=$!
./storage &
P2=$!
./storage &
P3=$!
wait $P1 $P2 $P3
Currently, it executes how I want it to, in that when I send a ctrl+c signal to it, the script sends that signal to all my background processes.
However: I've now extended these programs so that, based on connections/messages they receive from clients, they may do an execv, killing themselves and starting up a new, separate program. (For the curious, they're simulating a "server dead" state by starting up an idling process, which may then receive signals to start up the original process again.)
The problem is that, after the execv, this new process no longer responds to a kill sent by the bash script.
Is there a way to allow for this original script's execution (and subsequent signalling) to also send a signal to the newly exec'd process as well?
I suggest you consider searching for child processes by the parent pid. More specifically, before killing a pid, search for the child processes of that pid using ps and kill those children first. Finally, kill the parent.
I have a feeling that there are race conditions that will cause this to fail in some cases.
Update: my issue was completely unrelated to this script; the pid of the running/rebooted processes never changed, but I was inadvertently inheriting the signals I was blocking in various threads internal to the program. Some sensible calls to pthread_sigmask solved the issue.
Thanks to @Mark for the tips about child processes though; if there were fork calls going on, that would have been a very good approach!
I have some C code called a c-shell that does the following. The parent c-shell reads in a Linux command line and forks a child process to perform the command. The child does not exec the command until it receives a signal from the parent that it is ready for it to execute. It can handle input files for giving arguments to commands, or it can just read them from the command line. It can handle sending output to output files rather than just printing the executed command's output to stdout. The way that it sends the output to the output file is by the child redirecting its stdout to a pipe, and the parent reads from this pipe once it receives the sig-child signal that the child process finished running. It can handle multiple commands (where you put a semicolon between commands). It can handle piping output from the first command to a second command in the command line.
However - and this is my question - it cannot handle a command where you pipe the output of one command to be the input of the second command, and then send the output of the second command to the output file. I'm baffled, given that all the above cases work perfectly. I can redirect output from an executed child process to the parent when it finishes so it can complete it. I can redirect the output of the first command to be the input to the second command. But I cannot do this if I try to send the output of the second command to an output file. If this question does not make sense, I will post more specifics.
For example, suppose I enter the following command line into my c-shell: ls -l | grep lsOut (meaning, I do a detailed directory listing, and within that listing there are some files whose names contain the characters "lsOut" (output files from the ls command), and the grep command should filter out all other files in the listing that do not contain those characters). That works just fine when it prints to stdout. When I run a command such as ps > psOut, the output of the ps command is written to the psOut file with no problem.
However, if I run the command ls -l | grep lsOut > lsOutFile, what happens is baffling. It prints the output of the first command, ls -l, to stdout, and although I see in print statements that the second command, grep lsOut, is being run and should be receiving the output from ls -l as its input, it appears not to have any effect. The only output is the entire ls -l directory listing with no grep filtering, and although it says it writes it to the output file, it does not get there. If you want me to post a link to the code, I can do that. Thank you very much! I spent hours trying to debug this problem.
The way that it sends the output to the output file is by the child
redirecting its stdout to a pipe, and the parent reads from this pipe
once it receives the sig-child signal that the child process finished
running.
Hold it right here. As the saying goes: Do not pass "Go". Do not collect $200.
This part is already not quite right. If the child process starts spewing a sufficient amount of output, you'll end up with both a hung parent and a hung child process here.
Pipe buffers are not unlimited in size; they have a fixed maximum capacity. My recollection is that the default pipe buffer size is 8,192 bytes. It might actually be something else (on current Linux kernels the default is 65,536 bytes), but the actual size doesn't matter. Whatever the pipe buffer size is, once the buffer fills up, the process that's writing to the pipe is put to sleep until the reading process starts emptying the pipe by reading from it. As long as the reader and the writer processes work independently, one reading and one writing, everything runs smoothly. If the writer is writing faster than the reader is reading, then once the number of unread bytes reaches the pipe's maximum size, the kernel quietly puts the writer process to sleep, inside write(), until the reader catches up.
If your parent process waits for the child process to exit before it starts reading from the stdout pipe, and the child process writes more than 8,192 (or whatever the actual size is) bytes, the child process will be paused inside its write() call until the pipe is read from. And since the parent process isn't going to read from the pipe until the child process terminates, both processes will wait for each other, forever.
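If you want to see the limit on your own system rather than rely on anyone's recollection, one portable way is to fill a pipe with non-blocking writes and count the bytes accepted before write() refuses more. The sketch below does that; pipe_capacity() is a hypothetical name, and the 512-byte chunk size is just a value small enough to be at or below PIPE_BUF on POSIX systems.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: number of bytes a fresh pipe accepts with no reader
 * draining it, or -1 on error. */
long pipe_capacity(void)
{
    int p[2];
    char chunk[512];
    long total = 0;

    if (pipe(p) < 0)
        return -1;

    /* Make the write end non-blocking so a full pipe yields EAGAIN
     * instead of putting this process to sleep inside write(). */
    int flags = fcntl(p[1], F_GETFL);
    if (flags < 0 || fcntl(p[1], F_SETFL, flags | O_NONBLOCK) < 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    memset(chunk, 'x', sizeof chunk);
    for (;;) {
        ssize_t n = write(p[1], chunk, sizeof chunk);
        if (n > 0)
            total += n;
        else if (n < 0 && errno == EINTR)
            continue;              /* interrupted, just retry */
        else
            break;                 /* EAGAIN: the pipe is full */
    }

    close(p[0]);
    close(p[1]);
    return total;
}
```

A blocking writer would go to sleep at exactly the point where this loop stops.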
So, we already know that your application isn't handling this situation correctly. Although you've described a slightly different problem with your application, given that the application is not handling inter-process pipe semantics correctly, it's fairly likely that your actual problem, if not this, is closely related.
You must completely re-engineer how your application implements inter-process piping.
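As a rough illustration of the safe ordering, the sketch below has the parent drain the pipe until EOF first and only then call waitpid(). It is not a drop-in replacement for the asker's shell; capture_output() is a hypothetical name, and the growth strategy for the buffer is arbitrary.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv, collect everything it writes to stdout
 * into a malloc'd buffer (*out, *len; the caller frees *out), and return
 * its exit status, or -1 on error. */
int capture_output(char *const argv[], char **out, size_t *len)
{
    int p[2];
    *out = NULL;
    *len = 0;
    if (pipe(p) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    if (pid == 0) {
        dup2(p[1], STDOUT_FILENO);
        close(p[0]);
        close(p[1]);
        execvp(argv[0], argv);
        _exit(127);                /* only reached if exec failed */
    }

    close(p[1]);                   /* parent keeps only the read end */

    size_t cap = 0;
    int failed = 0;
    for (;;) {                     /* read while the child is still running */
        if (*len == cap) {
            size_t ncap = cap ? cap * 2 : 4096;
            char *tmp = realloc(*out, ncap);
            if (tmp == NULL) { failed = 1; break; }
            *out = tmp;
            cap = ncap;
        }
        ssize_t n = read(p[0], *out + *len, cap - *len);
        if (n > 0)
            *len += (size_t)n;
        else if (n == 0)
            break;                 /* EOF: every write end is closed */
        else if (errno != EINTR) { failed = 1; break; }
    }
    close(p[0]);

    int status;                    /* only now is it safe to wait */
    if (waitpid(pid, &status, 0) != pid || failed)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With this ordering the child can write far more than one pipe buffer's worth of data without either process blocking forever.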
I am writing my own shell for a homework assignment, and am running into issues.
My shell program gets an input cat scores | grep 100 from the console and prints the output as expected but the grep command doesn't terminate and I can see it running infinitely from ps command.
EDIT - There was an error while closing fds. Now grep command is not executing and console output is -
grep: (standard input): Bad file descriptor
I am reading the number of commands from the console and creating necessary pipes and storing them in a two dimensional int array fd[][] before forking the first process.
fd[0][0] will contain read end of 1st pipe and fd[0][1] will contain write end of 1st pipe. fd[1][0] will contain read end of 2nd pipe and fd[1][1] will contain write end of 2nd pipe and so on.
Each new process duplicates its stdin with the read end of its pipe with the previous process and duplicates its stdout with the write end of its pipe with the next process.
Below is my function:
void run_cmds(char **args, int count, int pos)
{
    int pid, status;

    pid = fork();
    if (pid == 0)
    {
        if (pos != 0) dup2(fd[pos-1][0], 0);     // not changing stdin for 1st process
        if (pos != count) dup2(fd[pos][1], 1);   // not changing stdout for last process
        close_fds(pos);
        execvp(*args, args);
    }
    else
    {
        waitpid(pid, &status, 0);
        count--;
        pos++;
        // getting next command and storing it in args
        if (count > 0)
            run_cmds(args, count, pos);
    }
}
args will contain the arguments for the command.
count is the number of commands I need to create.
pos is the position of the command in the input
I am not able to figure out the problem. I used this same approach for hard coded values before this and it was working.
What am I missing with my understanding/implementation of dup2/fork and why is the command waiting infinitely?
Any input would be greatly helpful. I've been stuck on this for the past couple of days!
EDIT : close_fds() function is as below -
For any process , I am closing both the pipes linking the process.
void close_fds(int pos)
{
    if (pos != 0)
    {
        close(fd[pos-1][0]);
        close(fd[pos-1][1]);
    }
    if (pos != count)
    {
        close(fd[pos][0]);
        close(fd[pos][1]);
    }
}
First diagnosis
You say:
Each new process duplicates its stdin with the read end of its pipe with the previous process and duplicates its stdout with the write end of its pipe with the next process.
You don't mention the magic word close().
You need to ensure that you close both the read and the write end of each pipe when you use dup() or dup2() to connect it to standard input or standard output. That means with 2 pipes you have 4 calls to close().
If you don't close the pipes correctly, the process that is reading won't get EOF (because there's a process, possibly itself, that could write to the pipe). It is crucial to have enough (not too few, not too many) calls to close().
I am calling close_fds() after dup2 calls. The function will go through the fd[][2] array and do a close() call for each fd in the array.
OK. That is important. It means my primary diagnosis probably wasn't spot on.
Second diagnosis
Several other items:
You should have code after the execvp() that reports an error and exits if the execvp() returns (which means it fails).
You should not immediately call waitpid(). All the processes in a pipeline should be allowed to run concurrently. You need to launch all the processes, then wait for the last one to exit, cleaning up any others as they die (but not necessarily worrying about everything in the pipeline exiting before continuing).
If you do force the first command to execute in its entirety before launching the second, and if the first command generates more output than will fit into the pipe, you will have a deadlock — the first process can't exit because it is blocked writing, and the second process can't be started because the first hasn't exited. Interrupts and reboots and the end of the universe will all solve the problem somewhat crudely.
You decrement count as well as incrementing pos before you recurse. That might be bad. I think you should just increment pos.
Third diagnosis
After update showing close_fds() function.
I'm back to "there are problems with closing pipes" (though the waiting and error reporting problems are still problems). If you have 6 processes in a pipeline and all 5 connecting pipes are created before any processes are run, each process has to close all 10 pipe file descriptors.
Also, don't forget that if the pipes are created in the parent shell, rather than in a subshell that executes one of the commands in the pipeline, then the parent must close all the pipe descriptors before it waits for the commands to complete.
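To make the closing rules concrete, here is a hedged sketch of one way to structure it. It is not a corrected version of your code: run_pipeline(), close_all() and MAX_CMDS are hypothetical names, the pipe array is local rather than global, and each cmds[i] is assumed to be a NULL-terminated argv array. All pipes are created up front, every child closes every pipe descriptor after its dup2() calls, the parent closes them all too, and only then does the parent wait.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_CMDS 16                /* arbitrary limit for this sketch */

static void close_all(int fd[][2], int npipes)
{
    for (int i = 0; i < npipes; i++) {
        close(fd[i][0]);
        close(fd[i][1]);
    }
}

/* Hypothetical helper: run cmds[0] | ... | cmds[ncmds-1] concurrently and
 * return the exit status of the last command, or -1 on error. */
int run_pipeline(char **cmds[], int ncmds)
{
    int fd[MAX_CMDS - 1][2];
    pid_t pids[MAX_CMDS];
    int npipes = ncmds - 1, started = 0, last = -1;

    if (ncmds < 1 || ncmds > MAX_CMDS)
        return -1;
    for (int i = 0; i < npipes; i++) {
        if (pipe(fd[i]) < 0) {
            close_all(fd, i);
            return -1;
        }
    }

    for (int pos = 0; pos < ncmds; pos++) {
        pid_t pid = fork();
        if (pid < 0)
            break;
        if (pid == 0) {
            if (pos != 0)
                dup2(fd[pos - 1][0], STDIN_FILENO);
            if (pos != ncmds - 1)
                dup2(fd[pos][1], STDOUT_FILENO);
            close_all(fd, npipes);         /* every descriptor, not just two pipes */
            execvp(cmds[pos][0], cmds[pos]);
            perror(cmds[pos][0]);          /* exec returned, so it failed */
            _exit(127);
        }
        pids[started++] = pid;             /* no waitpid() here: run concurrently */
    }

    close_all(fd, npipes);                 /* parent must close before waiting */

    for (int i = 0; i < started; i++) {
        int status;
        if (waitpid(pids[i], &status, 0) == pids[i] && i == ncmds - 1)
            last = WIFEXITED(status) ? WEXITSTATUS(status) : 128 + WTERMSIG(status);
    }
    return last;
}
```

Note that the last command is at position ncmds - 1 here, so the stdout test compares against ncmds - 1 rather than against a count that changes between calls.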
Please manufacture an MCVE (How to create a Minimal, Complete, and Verifiable Example?) or SSCCE (Short, Self-Contained, Correct Example); these are two names for the same basic idea.
You should create a program that manufactures the data structures that you're passing to the code that invokes run_cmds(). That is, you should create whatever data structures your parsing code creates, and show the code that creates the pipe or pipes for the 'cat scores | grep 100' command.
I am no longer clear how the recursion works, or whether it is invoked in your example. In fact, I think it is unused in your example, which is probably just as well, since you would otherwise end up with the same command being executed multiple times, AFAICS.
Most probable reasons why grep doesn't terminate:
You don't call waitpid with the proper PID (even though there is such a call in your code, it may not get executed for some reason), so grep becomes a zombie process. Maybe your parent shell process is waiting for another process first (infinitely, because the other one never terminates), and it doesn't call waitpid with the PID of grep. You can find Z in the output of ps if grep is a zombie.
grep doesn't receive an EOF on its stdin (fd 0) because some process is keeping the write end of its pipe open. Have you closed all file descriptors in the fd array in the parent shell process? If they are not closed everywhere, grep will never receive an EOF, and it will never terminate, because it will be blocked (forever) waiting for more data on its stdin.
I am working on an assignment to build a simple shell, and I'm trying to add a few features that aren't required yet, but I'm running into an issue with pipes.
Once my command is parsed, I fork a process to execute them. This process is a subroutine that will execute the command, if there is only one left, otherwise it will fork. The parent will execute the first command, the child will process the rest. Pipes are set up and work correctly.
My main process then calls wait(), and then outputs the prompt. When I execute a command like ls -la | cat, the prompt is printed before the output from cat.
I tried calling wait() once for each command that should be executed, but the first call works and all successive calls return ECHILD.
How can I force my main thread to wait until all children, including children of children, exit?
You can't. Either make your child process wait for its children and don't exit until they've all been waited for or fork all the children from the same process.
See this answer for how to wait() for child processes: How to wait until all child processes called by fork() complete?
There is no way to wait for a grandchild; you need to implement the wait logic in each process. That way, each child will only exit after all its children have exited (and that will then include all grandchildren recursively).
Since you are talking about grandchildren, you are obviously spawning the children in a cascading manner. That's a possible way to implement a pipe.
But keep in mind that the returned value from your pipe (the one you get when doing echo $? in your terminal) is the one returned from the right-most command.
This means that you need to spawn children from right to left in this cascading implementation. You don't want to lose that returned value.
Now, assuming we are only talking about builtin commands for the sake of simplicity (no extra calls to fork() and execve() are made), an interesting fact is that in some shells like "zsh", the right-most command is not even forked. We can see that with a simple piped command like:
export stack=OVERFLOW | export overflow=STACK
If we then run the command env, we can see that overflow=STACK persists in the environment variables. It shows that the right-most command was not executed in a subshell, whereas export stack=OVERFLOW was.
Note: This is not the case in a shell like "sh".
Now let's use a basic piped command to give a possible logic for this cascading implementation.
cat /dev/random | head
Note: Even though cat /dev/random is supposedly a never-ending command, it will stop shortly after head has read the lines it needs from cat /dev/random and exited. This is because the read end of the pipe is closed when head is done, and cat /dev/random is then killed by SIGPIPE on its next write, because it is writing to a broken pipe.
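The "broken pipe" condition from the note above can be reproduced in isolation. In this small sketch (write_after_reader_gone() is a hypothetical name), SIGPIPE is temporarily ignored so that write() reports the error instead of the process being killed, which is what would happen to cat by default.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical helper: write to a pipe whose read end is already closed.
 * Returns the errno from write(), 0 if the write unexpectedly succeeded,
 * or -1 if the pipe could not be created. */
int write_after_reader_gone(void)
{
    int p[2];
    if (pipe(p) < 0)
        return -1;

    /* With the default disposition, SIGPIPE would terminate the writer. */
    void (*old)(int) = signal(SIGPIPE, SIG_IGN);

    close(p[0]);                           /* the reader ("head") goes away */
    ssize_t n = write(p[1], "x", 1);       /* the writer ("cat") keeps writing */
    int err = (n < 0) ? errno : 0;

    close(p[1]);
    signal(SIGPIPE, old);                  /* restore the previous disposition */
    return err;
}
```

The value returned is EPIPE, the error code behind the "Broken pipe" message.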
LOGIC:
The parent process (your shell) sees that there is a pipe to execute. It will then fork two processes. The parent stays your shell, it will wait for the child to return, and store the returned value.
In the context of the first generation child: (trying to execute the right-most command of the pipe)
It sees that the command is not the last command, so it will fork() again (what I call the "cascading implementation").
Now that the fork is done, the parent process first executes its own task (head -1), then closes its stdin and stdout, and then calls wait() for its child. It is really important to close stdin and stdout first, and only then call wait(). Closing stdout sends EOF to the parent, if it is reading on its stdin. Closing stdin makes sure the grandchild trying to write to the pipe aborts with a "broken pipe" error.
In the context of the grand-children:
It sees that it is the last command of a pipe, it will just execute the command and return its value (it closes stdin and stdout).
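The logic above assumes builtins, where each process runs its task itself. For external commands, a rough adaptation is sketched below: each level forks the rest of the pipeline to its left, forks a worker that execs its own command, closes its own stdin and stdout, and then waits for both before exiting with the worker's status. The names cascade() and run_cascade() are hypothetical, each cmds[i] is assumed to be a NULL-terminated argv array, and the extra worker process is my addition, not part of the description above.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs cmds[0] | ... | cmds[n-1] with cmds[n-1] writing to the current
 * stdout. Never returns: it always ends in _exit(). */
static void cascade(char **cmds[], int n)
{
    pid_t left = -1;

    if (n > 1) {                           /* not the left-most command yet */
        int p[2];
        if (pipe(p) < 0 || (left = fork()) < 0)
            _exit(127);
        if (left == 0) {
            dup2(p[1], STDOUT_FILENO);     /* left side writes into the pipe */
            close(p[0]);
            close(p[1]);
            cascade(cmds, n - 1);          /* never returns */
        }
        dup2(p[0], STDIN_FILENO);          /* this level reads from the pipe */
        close(p[0]);
        close(p[1]);
    }

    pid_t worker = fork();
    if (worker < 0)
        _exit(127);
    if (worker == 0) {
        execvp(cmds[n - 1][0], cmds[n - 1]);
        _exit(127);                        /* only reached if exec failed */
    }

    /* Close stdin and stdout BEFORE waiting, so that once the worker exits
     * the left side sees a broken pipe and the right side sees EOF. */
    close(STDIN_FILENO);
    close(STDOUT_FILENO);

    int status, code = 127;
    if (waitpid(worker, &status, 0) == worker && WIFEXITED(status))
        code = WEXITSTATUS(status);
    if (left > 0)
        waitpid(left, NULL, 0);            /* do not exit before the grandchildren */
    _exit(code);
}

/* The shell side: one fork, one wait. Returns the right-most command's
 * exit status, or -1 on error. */
int run_cascade(char **cmds[], int n)
{
    if (n < 1)
        return -1;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        cascade(cmds, n);
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because every level waits for its own children, the shell's single wait does not return until the whole pipeline has finished, and the status it gets is that of the right-most command.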