English is not my first language, so some sentences may be wrong :)
I am writing a minishell.
In my code, the child process calls execvp(), and the parent process sets the child's pgid to the child's pid.
However, after the child's pgid is changed, the child starts running in the background, so the minishell doesn't work properly.
If I don't change the child's pgid, it works perfectly.
I want to know how to bring a background child process to the foreground, or how to keep the child from running in the background.
int execute(){
    pid_t pid = fork();
    if (pid == 0){
        execvp(args[idx], &args[idx]);
    }
    else{
        setpgid(pid, pid);
        int status;
        waitpid(pid, &status, WUNTRACED);
        if (WIFSTOPPED(status))
            kill(pid, SIGINT);
    }
}
TL;DR: use tcsetpgrp() to change which process group is in the foreground on the relevant terminal.
At any given time, a terminal has at most one foreground process group. Processes in this group can read input from the terminal and write output to it, and they receive the signals the terminal driver generates in response to keystrokes such as Ctrl-C. Being in the foreground process group of its controlling terminal is what it means for a process to be in the foreground.
Processes in other process groups cannot read from the terminal: an attempt to read gets them SIGTTIN, which stops them by default. They can usually still write to it, unless the terminal's TOSTOP flag is set, in which case a write gets them SIGTTOU. They do not receive the keyboard-generated signals. This is what it means to be in the background.
A shell that performs job control, which includes, but is not limited to, managing foreground and background status for other processes, does so in part by assigning the processes it spawns to appropriate process groups (setpgid()) and managing which one, including its own, is the foreground process group at any given time (tcsetpgrp()). This is much more complicated than simply running all jobs in its own process group.
The Glibc manual contains quite a lot of information about this area, including example code. Consider reading the chapter Job Control, and especially its section Implementing a Job Control Shell.
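A minimal sketch of that handoff, assuming the shell's terminal is attached to standard input. run_foreground is a hypothetical helper name, not something from the question, and the error handling is deliberately thin; a real job-control shell does considerably more, as the Glibc chapter describes.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv as the foreground job.
 * Returns the raw wait status, or -1 if fork/waitpid fails. */
int run_foreground(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    int tty = STDIN_FILENO;
    int interactive = isatty(tty);   /* skip the handoff when there is no terminal */

    /* Ignore SIGTTOU so that tcsetpgrp() does not stop us if we happen
     * to be in a background group at the moment of the call. */
    void (*old_ttou)(int) = signal(SIGTTOU, SIG_IGN);

    pid_t pid = fork();
    if (pid < 0) {
        signal(SIGTTOU, old_ttou);
        return -1;
    }

    if (pid == 0) {
        setpgid(0, 0);                   /* child: lead a new process group */
        if (interactive)
            tcsetpgrp(tty, getpid());    /* make that group the foreground group */
        signal(SIGTTOU, SIG_DFL);        /* ignored signals would survive exec */
        execvp(argv[0], argv);
        perror(argv[0]);
        _exit(127);
    }

    /* Parent: repeat both calls so the result does not depend on
     * whether parent or child runs first. Failures are ignored here. */
    setpgid(pid, pid);
    if (interactive)
        tcsetpgrp(tty, pid);

    int status = -1;
    pid_t r;
    do {
        r = waitpid(pid, &status, WUNTRACED);
    } while (r < 0 && errno == EINTR);

    if (interactive)
        tcsetpgrp(tty, getpgrp());       /* take the terminal back for the shell */
    signal(SIGTTOU, old_ttou);

    return r < 0 ? -1 : status;
}
```

A caller would pass a NULL-terminated argument vector and inspect the result with WIFEXITED, WIFSIGNALED or WIFSTOPPED. The final tcsetpgrp() is the step missing from the question's code: without it the shell stays in the background after the job ends.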
In this code (run on linux):
void child_process()
{
int count=0;
for(;count<1000;count++)
{
printf("Child Process: %04d\n",count);
}
printf("Child's process id: %d\n",getpid());
}
void parent_process()
{
int count=0;
for(;count<1000;count++)
{
printf("Parent Process: %04d\n",count);
}
}
int main()
{
pid_t pid;
int status;
if((pid = fork()) < 0)
{
printf("unable to create child process\n");
exit(1);
}
if(pid == 0)
child_process();
if(pid > 0)
{
printf("Return value of wait: %d\n",wait(&status));
parent_process();
}
return 0;
}
If the wait() were not present in the code, one of the processes (child or parent) would finish its execution, control would return to the Linux terminal, and then the remaining process (child or parent) would run. The output in such a case is:
Parent Process: 0998
Parent Process: 0999
guest#debian:~/c$ Child Process: 0645 //Control given to terminal & then child process is again picked for processing
Child Process: 0646
Child Process: 0647
In case wait() is present in the code, what should be the flow of execution?
When fork() is called, a process tree containing the parent and child process is created. In the above code, when the child process ends, the parent is informed about the death of the child (now a zombie) via the wait() system call. But since parent and child are two separate processes, is it mandatory that control passes directly to the parent once the child is over (with no control given to any other process, such as the terminal)? If yes, then it is as if the child process were part of the parent process (like a function called from another function).
This comment is, at least, misleading:
//Control given to terminal & then child process is again picked for processing
The "terminal" process doesn't really enter into the equation. It's always running, assuming that you are using a terminal emulator to interact with your program. (If you're using the console, then there is no terminal process. But that's unlikely these days.)
The process in control of the user interface is whatever shell you're using. You type some command-line like
$ ./a.out
and the shell arranges for your program to run. (The shell is an ordinary user program without special privileges, by the way. You could write your own.)
Specifically, the shell:
Uses fork to create a child process.
Uses waitpid to wait for that child process to finish.
The child process sets up any necessary redirects and then uses some exec system call, typically execve, to replace itself with the ./a.out program, passing execve (or whatever) the command line arguments you specified.
That's it.
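Those steps can be sketched roughly as follows. run_command and its stdout_path parameter are hypothetical names chosen for illustration; a real shell handles many more kinds of redirection and does far more error checking.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv, optionally sending its stdout to
 * stdout_path. Returns the raw wait status, or -1 on fork/waitpid failure. */
int run_command(char *const argv[], const char *stdout_path)
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();                      /* step 1: create the child */
    if (pid < 0)
        return -1;

    if (pid == 0) {
        if (stdout_path != NULL) {           /* step 3a: set up the redirect */
            int fd = open(stdout_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
            if (fd < 0 || dup2(fd, STDOUT_FILENO) < 0) {
                perror(stdout_path);
                _exit(126);
            }
            if (fd != STDOUT_FILENO)
                close(fd);
        }
        execvp(argv[0], argv);               /* step 3b: become the program */
        perror(argv[0]);                     /* only reached if exec failed */
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)        /* step 2: wait for the child */
        return -1;
    return status;
}
```

The shell would call something like this once per command line and print its next prompt only after the call returns.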
Your program, in ./a.out, uses fork to create a child and then possibly waits for the child to finish before terminating. As soon as your parent process terminates, the shell's waitpid() can return, and as soon as it returns, the shell prints a new command prompt.
So there are at least three relevant processes: the shell, your parent process, and your child process. In the absence of synchronisation functions like waitpid(), there are no guarantees about ordering. So when your parent process calls fork(), the created child could start executing immediately. Or not. If it does start executing immediately, it does not necessarily preempt your parent process, assuming your computer is reasonably modern and has more than one core. They could both be executing at the same time. But that's not going to last very long because your parent process will either immediately call exit or immediately call wait.
When a process calls wait (or waitpid), it is suspended and becomes runnable again when the process it is waiting for terminates. But again there are no guarantees. The mere fact that a process is runnable doesn't mean that it will immediately start running. But generally, in the absence of high load, the operating system will start running it pretty soon. Again, it might be running at the same time as another process, such as your child process (if your parent didn't wait for it to finish).
In short, if you performed your experiment a million times, and your parent waits for your child, then you will see the same result a million times; the child must finish before the parent is unsuspended, and your parent must finish before the shell is unsuspended. (If your parent process printed something before waiting, you would see different results; the parent and child outputs could be in any order, or even overlapped.)
If, on the other hand, your parent does not wait for the child, then you could see any of a number of results, and in a million repetitions you're likely to see more than one of them (but not with the same probability). Since there is no synchronisation between parent and child, the outputs could appear in either order (or be interleaved). And since the child is not synchronised with the shell, its output could appear before or after the shell's prompt, or be interleaved with the shell's prompt. No guarantees, other than that the shell will not resume until your parent is done.
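The waiting case can be checked with a small experiment. The sketch below is an illustration, not code from the question: ordered_output is a hypothetical function in which child and parent write into a shared pipe instead of the terminal, so the order of the writes can be inspected afterwards.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child and parent each write one line into a pipe; the parent waits
 * for the child before writing. Copies the pipe contents into out and
 * returns the number of bytes read, or -1 on error. */
ssize_t ordered_output(char *out, size_t cap)
{
    assert(out != NULL && cap > 0);

    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        close(fds[0]);
        _exit(write(fds[1], "child\n", 6) == 6 ? 0 : 1);
    }

    /* The parent is suspended here until the child has terminated,
     * so the child's line is already in the pipe when we continue. */
    waitpid(pid, NULL, 0);

    ssize_t w = write(fds[1], "parent\n", 7);
    close(fds[1]);

    ssize_t n = (w == 7) ? read(fds[0], out, cap - 1) : -1;
    close(fds[0]);
    if (n >= 0)
        out[n] = '\0';
    return n;
}
```

With the waitpid() call in place the buffer always holds the child's line first. Removing that call makes the order depend on scheduling, which is the situation described in the last paragraph.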
Note that the terminal emulator, which is a completely independent process, is runnable the entire time. It owns a pseudo-terminal ("pty") which is how it emulates a terminal. The pseudo-terminal is a kind of pipe; at one end of the pipe is the process which thinks it's communicating with a console, and at the other end is the terminal emulator which interprets whatever is being written to the pty in order to render it in the GUI, and which sends any keystrokes it receives, suitably modified as a character stream back through the pipe. Since the terminal emulator is never suspended and its execution is therefore interleaved with whatever other processes are active on your computer, it will (more or less) immediately show you any output which is sent by your shell or the processes it starts up. (Again, assuming the machine is not overloaded.)
I have a daemon application that starts several third-party executables (all closed-source and not modifiable).
I would like all the child processes to terminate automatically when the parent exits for any reason (including crashes).
Currently, I am using prctl to achieve this (see also this question):
int ret = fork();
if (ret == 0) {
    //Setup other stuff
    prctl(PR_SET_PDEATHSIG, SIGKILL);
    /* argv and envp stand for the child's argument and environment vectors (not shown) */
    if (execve("childexecutable", argv, envp) < 0) { /*signal error*/ }
}
However, if "childexecutable" also forks and spawns "grandchildren", then the "grandchildren" are not killed when my process exits.
Maybe I could create an intermediate process that serves as a subreaper, which would kill "childexecutable" when my process dies and then wait for SIGCHLD and keep killing child processes until none is left, but that seems very brittle.
Are there better solutions?
Creating a subreaper is not useful in this case: your grandchildren would be reparented to and reaped by init anyway.
What you could do however is:
Start a parent process and fork a child immediately.
The parent will simply wait for the child.
The child will carry out all the work of your actual program, including spawning any other children via fork + execve.
Upon exit of the child for any reason (including fatal signals, e.g. a crash), the parent can issue kill(0, SIGKILL) or killpg(getpgid(0), SIGKILL) to kill all the processes in its process group. Sending SIGINT or SIGTERM before SIGKILL is probably a better idea, depending on which child processes you run, since they can handle those signals and clean up their resources (including children) gracefully before exiting.
Assuming that none of the children or grandchildren changes their process group while running, this will kill the entire tree of processes upon exit of your program. You could also keep the PR_SET_PDEATHSIG before any execve to make this more robust. Again depending on the processes you want to run a PR_SET_PDEATHSIG with SIGINT/SIGTERM could make more sense than SIGKILL.
You can issue setpgid(getpid(), 0) before doing any of the above to create a new process group for your program and avoid killing any parents when issuing kill(0, SIGKILL).
The logic of the "parent" process should be really simple, just a fork + wait in a loop + kill upon the right condition returned by wait. Of course, if this process crashes too then all bets are off, so take care in writing simple and reliable code.
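A rough sketch of that supervisor, under the assumption stated above that no descendant changes its process group. The names supervise, demo_work and demo_supervise are made up for illustration, and demo_work only stands in for the real program by leaving a grandchild behind and then dying abruptly.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the real work: leaves a grandchild
 * running and then "crashes". */
static void demo_work(void)
{
    if (fork() == 0) {
        for (;;)
            pause();             /* would otherwise outlive everyone */
    }
    raise(SIGKILL);
}

/* Supervisor: does not return, because it kills its own group too. */
void supervise(void (*work)(void))
{
    assert(work != NULL);

    setpgid(0, 0);               /* own group, so kill(0, ...) spares our parents */

    pid_t worker = fork();
    if (worker < 0)
        _exit(1);
    if (worker == 0) {
        work();
        _exit(0);
    }

    int status;
    while (waitpid(worker, &status, 0) < 0 && errno == EINTR)
        ;                        /* retry if interrupted by a signal */

    kill(0, SIGKILL);            /* worker is gone: take down the whole group */
    _exit(1);                    /* not reached */
}

/* Runs the supervisor in a child so the caller survives; returns the
 * signal that ended the supervisor, 0 if it exited, -1 on error. */
int demo_supervise(void)
{
    pid_t sup = fork();
    if (sup < 0)
        return -1;
    if (sup == 0)
        supervise(demo_work);

    int status;
    if (waitpid(sup, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Sending SIGTERM first and SIGKILL after a grace period, as suggested above, would slot in just before the final kill(); it is omitted here to keep the sketch short.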
I am writing a shell in C and I am trying to add signal handling. In the shell, fork() is called and the child process executes a shell command. The child process is put into its own process group. This way, if Ctrl-C is pressed when a child process is in the foreground, it closes all of the processes that share the same process group id. The shell executes the commands as expected.
The problem is the signals. When, for example, I execute "sleep 5", and then I press Ctrl-C for SIGINT, the "shell>" prompt comes up as expected but the process is still running in the background. If I quickly run "ps" after I press Ctrl-C, the sleep call is still there. Then after the 5 seconds are up and I run "ps" again, it's gone. The same thing happens when I press Ctrl-Z (SIGTSTP). With SIGTSTP, the process goes to the background, as expected, but it doesn't pause execution. It keeps running until it's finished.
Why are these processes being sent to the background like this and continuing to run?
Here is the gist of my code...
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>
int status;
void sig_handler_parent()
{
printf("\n");
}
void sig_handler_sigchild(int signum)
{
waitpid(-1, &status, WNOHANG);
}
int main()
{
signal(SIGCHLD, sig_handler_sigchild);
signal(SIGINT, sig_handler_parent);
signal(SIGQUIT, sig_handler_parent);
signal(SIGTERM, sig_handler_parent);
signal(SIGCONT, sig_handler_parent);
signal(SIGTSTP, sig_handler_parent);
while (1)
{
printf("shell> ");
// GET COMMAND INPUT HERE
pid = fork();
if (pid == 0)
{
setpgid(getpid(), getpid());
execvp(cmd[0], cmd);
printf("%s: unknown command\n", cmd[0]);
exit(1);
}
else
waitpid(0, &status, WUNTRACED);
}
return 0;
}
p.s. I have already tried setting all of the signal handlers to SIG_DFL before the exec command.
The code you provide does not compile, and an attempt to fix it shows
that you omitted a lot. I am only guessing.
To help you move forward, I'll point out a number of facts that
you might have misunderstood. Together with a couple of documentation
links, I hope this is helpful.
Error Handling
First: please make a habit of handling errors, especially when you
know there's something that you don't understand. For example, the
parent (your shell) waits until the child terminates,
waitpid(0, &status, WUNTRACED);
You say,
When, for example, I execute "sleep 5", and then I press Ctrl-C for
SIGINT, the "shell>" prompt comes up as expected but the process is
still running in the background.
What actually happens is that once you press Ctrl-C, the parent (not the
child; see below for why) receives SIGINT (the kernel's terminal
subsystem handles keyboard input, sees that someone holds "Ctrl" and
"C" at the same time, and concludes that all processes in that
terminal's foreground process group must be sent SIGINT).
Change the parent branch to,
pid_t error = waitpid(0, &status, WUNTRACED);
if (error == -1)
    perror("waitpid");
With this, you'd see perror() print something like:
waitpid: interrupted system call
You want SIGINT to go to the child, so something must be wrong.
Signal Handlers, fork(), and exec()
Next, what happens to your signal handlers across fork() and
exec()?
The signal overview man
page states,
A child created via fork(2) inherits a copy of its parent's signal
dispositions. During an execve(2), the dispositions of handled
signals are reset to the default; the dispositions of ignored
signals are left unchanged.
So, ideally, what this means is that:
The parent (shell) sees SIGINT, as observed above, and prints
"interrupted system call".
The child's signal handlers are reset back to their defaults. For SIGINT,
this means to terminate.
You do not fiddle with the controlling terminal, so the child
inherits the controlling terminal of the parent. This means that
SIGINT is delivered to both parent and child. Given that the
child's SIGINT behavior is to terminate, I'd bet that no process is
left running.
Except when you use setpgid() to create a new process group.
Process Groups, Sessions, and Controlling Terminal
Someone once called me a UNIX greybeard. While this is true from a
visual point of view, I must reject that compliment because I rarely
hang around in one of the darkest corners of UNIX - the terminal
subsystem. Shell writers have to understand that too though.
In this context, it's the "NOTES" section of the setpgid() man
page. I suggest
you read that, especially where it says,
At any time, one (and only one) of the process groups in the session
can be the foreground process group for the terminal; (...)
The shell (bash maybe) from which you start your shell program has
done so for the foreground invocation of your program, and marked that
as "foreground process group". Effectively this means, "Please, dear
terminal, whenever someone presses Ctrl-C, send a SIGINT to all
processes in that group. I (your parent) just sit and wait (waitpid()) until all is over, and will take control again then.".
You create a process group for the child, but don't tell the terminal
about it. What you want is to
Detach the parent from the terminal.
Set the child process group as the foreground process group of the terminal.
Wait for the child (you already do).
Regain terminal foreground.
Further down in the "NOTES" section of said man page, they give links
to how that is done. Follow those, read thoroughly, try out things,
and make sure you handle errors. In most cases, such errors are signs
of misunderstanding. And, in most cases, such errors are fixed by
re-reading the documentation.
Are you sure that your child process is actually receiving the signals from your tty? I believe you need to make a call to tcsetpgrp to actually tell the controlling terminal to send signals to the process group of your child process.
For example, after you call fork, and before exec, try this from within your child.
tcsetpgrp(STDIN_FILENO, getpid());
Here is the man page for tcsetpgrp(3)
I would like my program to behave the way bash (the terminal) does when it is killed with SIGKILL. SIGKILL cannot be handled in a program, so whenever my program is killed its children are reparented to the init process, and I have no chance to kill all my child processes before the parent dies. Yet when a terminal is killed, all the processes started from it are killed too, even if the terminal is killed with SIGKILL.
For this I did some research and found the below post:
https://unix.stackexchange.com/questions/54963/how-can-terminal-emulators-kill-their-children-after-recieving-a-sigkill
The post is a bit confusing, but what I took from it is that if the process being killed is the leader of its process group, then all its children are killed too.
So for simplicity, I implemented below program to test if it is so:
int main()
{
printf("Curent PID: %u\n", getpid());
// make a new session
pid_t pid = setsid();
printf("New session ID: %u\n", pid);
pid = fork();
switch(pid)
{
case -1:
perror("UNable to fork the process\n");
exit(EXIT_FAILURE);
case 0:
// child process
while (1)
{
sleep(1);
}
break;
}
while (1)
{
printf("Process Leader running\n");
sleep(1);
}
return 0;
}
After running the above program, when I killed the parent process the child process was not killed. I also modified the program so that it does not belong to any tty, thinking that perhaps the process leader must not be associated with a tty. I did that in the following way:
Create a normal process (Parent process)
Create a child process from within the above parent process
The process hierarchy at this stage looks like : TERMINAL -> PARENT PROCESS -> CHILD PROCESS
Terminate the parent process.
The child process now becomes orphan and is taken over by the init process.
Call setsid() function to run the process in new session and have a new group.
then the same above code repeats.
Still, when I killed the process leader, the children were still there.
Maybe I misunderstood that post on unix.stackexchange, or is this the default behaviour on Linux?
One way I could make sure all children are killed is to catch each terminating signal (SIGTERM, SIGHUP, etc.) and put logic in the signal handlers that kills the children first. But I still can't do anything about SIGKILL.
I am also interested to know: if killing a parent process doesn't affect its children even when the parent is a process leader, how does bash (the terminal) manage to kill all its child processes even when it is sent SIGKILL? Is there some extra logic for terminals in the Linux kernel?
If there is a way to kill all child processes when the parent is killed, even with SIGKILL, I would be happy to know that too.
Manual page of kill says:
Negative PID values may be used to choose whole process groups
In my understanding to kill the whole group of processes, you have to send negative PID.
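A small sketch of signalling a whole group this way. kill_group and kill_group_demo are hypothetical names; the demo builds a two-process group and uses a pipe only to observe that both members are gone afterwards.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Send sig to every process in group pgid. The guard matters:
 * kill(-1, sig) would signal almost every process we are allowed to. */
int kill_group(pid_t pgid, int sig)
{
    assert(pgid > 1);
    return kill(-pgid, sig);
}

/* Returns 1 if both the group leader and its child were killed. */
int kill_group_demo(void)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t leader = fork();
    if (leader < 0)
        return -1;

    if (leader == 0) {
        close(fds[0]);
        setpgid(0, 0);                    /* new group led by this process */
        pid_t member = fork();            /* the child inherits the group */
        if (member < 0)
            _exit(1);
        if (member > 0 && write(fds[1], "r", 1) != 1)
            _exit(1);                     /* tell the caller the group is ready */
        for (;;)
            pause();
    }

    close(fds[1]);
    char c;
    if (read(fds[0], &c, 1) != 1)         /* wait until both processes exist */
        return -1;

    if (kill_group(leader, SIGKILL) < 0)
        return -1;

    /* EOF means every process holding the write end has terminated. */
    ssize_t n = read(fds[0], &c, 1);
    close(fds[0]);

    int status;
    if (waitpid(leader, &status, 0) < 0)
        return -1;
    return n == 0 && WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL;
}
```

Note that this still needs some process to make the kill() call, so on its own it does not help when the parent itself has already been killed with SIGKILL.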
A different mechanism is what makes killing a terminal kill its child processes. Processes started from a terminal have their stdin/stdout attached to the terminal. When the terminal is killed, those connections are closed and a signal (SIGHUP) is sent to those processes. A typical program does not handle this signal and is terminated by default.
The advice from Marian is quite correct and well worth researching, but if you choose to follow that route you will likely end up with an implementation of what might be called the "hostage trick".
The hostage trick consists of your root process spawning an artificial child process which spends all its time in the stopped state. This "hostage" will be spawned immediately before the first child process which does real work in your (multi-process) program.
The hostage process is made the leader of its own process group and then enters a loop in which it stops itself with "raise(SIGSTOP)". If it is ever continued, it checks to see whether its parent has terminated (i.e. whether it has been re-parented or cannot signal its parent with the null signal (ESRCH)). If the parent has terminated, then the hostage should terminate, otherwise it should re-suspend with another "raise(SIGSTOP)".
You need to be careful about race conditions: e.g. for the re-parenting test take care to cache the parent-process-id for the hostage as the return value from "getpid()" before "fork()"-ing the hostage and also make "setpgid()" calls downstream of "fork()" in both parent and child. You then need to consider what you do if someone "kill(., SIGKILL)"s the hostage!
True, you can put a SIGCHLD handler in the parent to re-spawn it, but that requires considerable care to preserve the continuity of the identity of the hostage's process group; maybe there were other child processes at the time of the SIGKILL and the replacement hostage should go in the original process group, maybe there weren't and the original process group has evaporated.
Even if you get that right, the fact that you have put a "fork()" call in a handler for an asynchronous signal (SIGCHLD) will likely open another can of worms if your main process uses multiple threads.
Because of these difficulties I would advise against using the hostage trick unless the child processes run code over which you have no control (and to think seriously about the costs in complexity and maintainability even then). If you have control of the code of the child processes, then it is much simpler to use a "pipe()".
You create the pipe in the parent process and manage the file descriptors to ensure that the parent process is the sole writer and that each child process allocates one file descriptor to the read-side. If you do this, then the termination of the parent process (whether due to SIGKILL or any other cause) is communicated to the child processes by the EoF condition on the read side of the pipe as the last writer terminates.
If you want to treat SIGKILL specially, then you can use a protocol on the pipe whereby the parent process sends a termination message advising the children of its termination status on all normal terminations and on catchable fatal signals, and leaves the children to infer that the parent was killed by SIGKILL in the event that the read-side of the pipe delivers an EoF without a preceding termination message.
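A minimal sketch of the EOF mechanism, without the termination-message protocol. wait_for_writer_exit and pipe_death_demo are hypothetical names; in the demo, the parent closing its write end stands in for the kernel doing the same when the parent dies.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Blocks until every writer of the pipe is gone.
 * Returns 0 on EOF, -1 on a read error. Incoming data is discarded. */
int wait_for_writer_exit(int read_fd)
{
    assert(read_fd >= 0);

    char c;
    for (;;) {
        ssize_t n = read(read_fd, &c, 1);
        if (n == 0)
            return 0;                      /* EOF: the last writer is gone */
        if (n < 0 && errno != EINTR)
            return -1;
    }
}

/* Returns the child's exit status: 42 if it saw EOF on the pipe. */
int pipe_death_demo(void)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        close(fds[1]);                     /* the child must not be a writer itself */
        _exit(wait_for_writer_exit(fds[0]) == 0 ? 42 : 1);
    }

    close(fds[0]);
    close(fds[1]);                         /* simulates the parent terminating */

    int status;
    if (waitpid(child, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The detail that is easy to get wrong is the close(fds[1]) in the child: if any child keeps a copy of the write end open, nobody ever sees EOF.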
On Linux, prctl(PR_SET_PDEATHSIG... will arrange for a process to receive a signal when its parent dies. This setting is preserved over exec but not inherited by child processes.
I am writing a mini-shell(no, not for school :P; for my own enjoyment) and most of the basic functionality is now done but I am stuck when trying to handle SIGTSTP.
Supposedly, when a user presses Ctrl+Z, SIGTSTP should be sent to the Foreground process of the shell if it exists, and Shell should continue normally.
After creating each process(if it's a Foreground process), the following code waits:
if(waitpid(pid, &processReturnStatus, WUNTRACED)>0){//wait stopped too
if(WIFEXITED(processReturnStatus) || WIFSIGNALED(processReturnStatus))
removeFromJobList(pid);
}
And I am handling the signal as follows:
void sigtstpHandler(int signum)
{
signum++;//Just to remove gcc's warning
pid_t pid = findForegroundProcessID();
if(pid > -1){
kill(-pid, SIGTSTP);//Sending to the whole group
}
}
What happens is that when I press Ctrl+Z, the child process does get suspended (I checked the process state with ps -all), but my shell hangs in waitpid. It never returns, even though I passed the WUNTRACED flag, which as far as I understand is supposed to make waitpid return when the process is stopped too.
So what could I possibly have done wrong? Or did I misunderstand waitpid's behavior?
Notes:
-findForegroundProcessID() returns the right pid; I double checked that.
-I am changing each process's group right after I fork
-Handling Ctrl+C is working just fine
-If I use another terminal to send SIGCONT after my shell hangs, the child process resumes its work and the shell reaps it eventually.
-I am catching SIGTSTP which as far as I read(and tested) can be caught.
-I tried using waitid instead of waitpid just in case, problem persisted.
EDIT:
void sigchldHandler(int signum)
{
signum++;//Just to remove the warning
pid_t pid;
while((pid = waitpid(-1, &processReturnStatus, 0)) > 0){
removeFromJobList(pid);
}
if(errno != ECHILD)
unixError("kill error");
}
My SIGCHLD handler.
SIGCHLD is delivered for stopped children. The waitpid() call in the signal handler - which doesn't specify WUNTRACED - blocks forever.
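The difference the flags make can be seen in a small experiment. This is a sketch with hypothetical names (on_sigchld, stopped_child_demo), not the asker's code. The handler shows one common shape for a non-blocking reaping loop; the demo checks the two waitpid() behaviours directly without installing the handler.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* One possible SIGCHLD handler: WNOHANG keeps it from blocking and
 * WUNTRACED lets it also see stopped children. It only drains the
 * pending status changes; bookkeeping belongs outside the handler. */
void on_sigchld(int signo)
{
    (void)signo;
    int saved_errno = errno;
    int status;
    while (waitpid(-1, &status, WNOHANG | WUNTRACED) > 0)
        ;
    errno = saved_errno;
}

/* Returns 1 if a stopped child is reported with WUNTRACED and a
 * follow-up WNOHANG call returns immediately with nothing to report. */
int stopped_child_demo(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        raise(SIGSTOP);                    /* stop ourselves, like Ctrl+Z would */
        _exit(0);
    }

    int status = 0;
    if (waitpid(pid, &status, WUNTRACED) != pid)
        return -1;
    int saw_stop = WIFSTOPPED(status);

    /* The stop has been reported already, so a non-blocking call has
     * nothing new to return. With options == 0 this would hang until
     * the child terminated. */
    pid_t again = waitpid(pid, &status, WNOHANG | WUNTRACED);

    kill(pid, SIGKILL);                    /* clean up the stopped child */
    waitpid(pid, &status, 0);

    return saw_stop && again == 0;
}
```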
You should probably not have the removeFromJobList() processing in two different places. If I had to guess, it sounds like it touches global data structures, and doesn't belong in a signal handler.
waitpid is not returning because you are not setting a SIGCHLD handler (which I sent you earlier). You have child processes that are not getting reaped. Furthermore, waitpid needs to be in a while loop, not an if (I also sent you that).
The only signal you are supposed to catch is SIGCHLD. The reason is that if your processes are forked properly, the kernel will send terminal signals to the foreground process, which is then terminated, stopped, or otherwise handled according to that signal's default action.
When process groups are not set correctly, signals get sent to the wrong process. One way to test that is to run a foreground process and hit Ctrl-Z. If your entire shell stops, then the Ctrl-Z signal is being sent to the entire shell. This means you did not put the new process in a new process group and give it the terminal.
Now here's what you need to do if your Ctrl-Z signal is stopping your entire shell. Once you fork a process, in the child process:
- Set the process in its own group using setpgid.
- Give it a sane terminal by blocking SIGTTOU and then giving it the terminal using tcsetpgrp.
In the parent:
- Also put the child into its own process group using setpgid. This is because you have no idea whether the child or the parent will execute first, so this avoids a race condition. It doesn't hurt to set it twice.
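The double setpgid() call can be sketched like this. own_group_demo is a hypothetical name, and the terminal handoff with tcsetpgrp is left out so that only the race is shown.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the child is in its own process group by the time the
 * parent checks, regardless of which side ran first. */
int own_group_demo(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        setpgid(0, 0);                 /* child side */
        for (;;)
            pause();                   /* a real shell would exec here */
    }

    setpgid(pid, pid);                 /* parent side: harmless if already done */

    /* After the parent's own call, the group is guaranteed to exist. */
    int ok = (getpgid(pid) == pid);
    assert(getpgid(0) != pid);         /* the shell itself stays in its old group */

    kill(pid, SIGKILL);
    waitpid(pid, NULL, 0);
    return ok;
}
```

Without the parent-side call, the check could run before the child has been scheduled and still see the shell's own group.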