I have a collection of processes that I'm starting from a shell script as follows:
#!/bin/bash
# This is a shortcut to start multiple storage services.

function finish {
    alljobs=$(jobs -p)
    if [ -n "$alljobs" ]; then
        kill $alljobs >/dev/null 2>&1
    else
        echo "Entire trio ceased running"
    fi
}

trap finish EXIT

./storage &
P1=$!
./storage &
P2=$!
./storage &
P3=$!

wait $P1 $P2 $P3
Currently, it executes how I want it to: when I send Ctrl+C, the script forwards the signal to all my background processes.
However: I've now extended these programs so that, based on connections/messages they receive from clients, they may do an execv, killing themselves and starting up a new, separate program. (For the curious, they're simulating a "server dead" state by starting up an idling process, which may then receive signals to start up the original process again.)
The problem is that, after the execv, this new process no longer responds to a kill sent by the bash script.
Is there a way for the original script (and its subsequent signalling) to reach the newly exec'd process as well?
I suggest you consider searching for child processes by the parent pid. More specifically, before killing a pid, search for the child processes of that pid using ps and kill those children first. Finally, kill the parent.
I have a feeling that there are race conditions that will cause this to fail in some cases.
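If you prefer to do this from C rather than the shell, here is a rough sketch of that approach (the ps invocation and the SIGTERM choice are assumptions; a real version needs more error handling and is still subject to the races mentioned above):
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>

/* Sketch: kill the children of `parent` first, then the parent.
 * Uses `ps -o pid= --ppid` (GNU ps) to list direct children. */
void kill_children_first(pid_t parent)
{
    char cmd[64];
    FILE *fp;
    int child;

    snprintf(cmd, sizeof cmd, "ps -o pid= --ppid %d", (int)parent);
    fp = popen(cmd, "r");
    if (fp == NULL)
        return;
    while (fscanf(fp, "%d", &child) == 1)
        kill((pid_t)child, SIGTERM);  /* children first... */
    pclose(fp);
    kill(parent, SIGTERM);            /* ...then the parent */
}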
Update: my issue was completely unrelated to this script. The pid of the running/rebooted processes never changed; rather, the signal masks I was setting in various threads internal to the program were being inherited by the exec'd process. Some sensible calls to pthread_sigmask solved the issue.
Thanks to @Mark for the tips about child processes, though; if there had been fork calls going on, that would have been a very good approach!
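For reference, a minimal sketch of that kind of pthread_sigmask fix (the choice of signals is illustrative; compile with -pthread):
#include <pthread.h>
#include <signal.h>

/* Make sure this thread does not keep SIGINT/SIGTERM blocked.
 * A thread's signal mask is inherited across pthread_create,
 * fork, and even execve, so a stray block can outlive the
 * code that set it. */
static void unblock_termination_signals(void)
{
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGINT);
    sigaddset(&set, SIGTERM);
    pthread_sigmask(SIG_UNBLOCK, &set, NULL);
}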
Related
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>
int main(void)
{
    pid_t pid = fork();
    if (pid) {          /* parent */
        sleep(5);
        // wait(NULL);  // works fine when the child is waited for
    } else {            /* child */
        execlp("vim", "vim", (char *)NULL);
    }
}
When I run this code, vim starts normally, then crashes after the 5 seconds (i.e. when its parent exits). When I wait for it (i.e. don't let it become an orphan process), the code works totally fine.
Why does becoming an orphan process become a problem here? Is it something specific to vim?
Why is this even visible to vim? I thought that only the parent knows when its children die. But here, the child somehow notices when it gets adopted, and something happens that makes it crash. Do child processes get notified when their parent dies as well?
When I run this code, I get this output after the crash:
Vim: Error reading input, exiting...
Vim: preserving files...
Vim: Finished.
This actually happens because of the shell that is executing the binary that forks Vim!
When the shell runs a foreground command, it creates a new process group and makes it the foreground process group of the terminal attached to the shell. In bash 5.0, you can find the code that transfers this responsibility in give_terminal_to(), which uses tcsetpgrp() to set the foreground process group.
It is necessary to set the foreground process group of a terminal correctly so that the program running in the foreground can receive signals from the terminal (for example, Ctrl+C sending an interrupt signal, Ctrl+Z sending a terminal stop signal to suspend the process) and also change terminal settings in the ways that full-screen programs such as Vim typically do. (The subject of the foreground process group is a bit out of scope for this question; I mention it here because it plays a part in the answer.)
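For a feel of what that code does, here is a minimal sketch of a give_terminal_to()-style helper (not bash's actual code):
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Hand the controlling terminal (via stdin) to process group `pgid`.
 * SIGTTOU must be blocked here: calling tcsetpgrp() from a process
 * that is not in the foreground group would otherwise stop us. */
static int give_terminal_to(pid_t pgid)
{
    sigset_t block, old;
    int r;

    sigemptyset(&block);
    sigaddset(&block, SIGTTOU);
    sigprocmask(SIG_BLOCK, &block, &old);
    r = tcsetpgrp(STDIN_FILENO, pgid);
    sigprocmask(SIG_SETMASK, &old, NULL);
    return r;
}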
When the process (more precisely, the pipeline) executed by the shell terminates, the shell will take back the foreground process group, using the same give_terminal_to() code by calling it with the shell's process group.
This is usually fine, because at the time the executed pipeline is finished, there's usually no process left on that process group, or if there are any, they typically don't hold on to the terminal (for example, if you're launching a background daemon from the shell, the daemon will typically close the stdin/stdout/stderr streams to relinquish access to the terminal.)
But that's not really the case with the setup you proposed, where Vim is still attached to the terminal and part of the foreground process group. When the parent process exits, the shell assumes the pipeline is finished and it will set the foreground process group back to itself, "stealing" it from the former foreground process group which is where Vim is. Consequently, the next time Vim tries to read from the terminal, the read will fail and Vim will exit with the message you reported.
One way to see for yourself that the parent process exiting does not by itself affect Vim is to run it through strace. For example, with the following command (assuming ./vim-launcher is your binary):
$ strace -f -o /tmp/vim-launcher.strace ./vim-launcher
Since strace is running with the -f option to follow forks, it will also start tracing Vim when it's launched. The shell will be executing strace (not vim-launcher), so its foreground pipeline will only end when strace stops running. And strace will not stop running until Vim exits. Vim will work just fine past the 5 seconds, even though it's been reparented to init.
There also used to be an fghack tool, part of daemontools, that accomplished the same task of blocking until all forked children had exited. It did so by creating a new pipe and having it inherited by the process it spawned, in a way that would get automatically inherited by all other forked children. That way, it could block until all copies of that pipe file descriptor were closed, which typically only happens when all processes exit (unless a background process goes out of its way to close all inherited file descriptors, but that's essentially stating that it doesn't want to be tracked, and it has most probably relinquished access to the terminal by that point).
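A minimal sketch of that pipe trick (assuming the spawned program does not deliberately close inherited descriptors):
#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    int fds[2];
    char buf;

    if (argc < 2 || pipe(fds) == -1)
        return 1;
    if (fork() == 0) {
        close(fds[0]);              /* child keeps only the write end */
        execvp(argv[1], &argv[1]);  /* descendants inherit it across forks */
        perror("execvp");
        _exit(127);
    }
    close(fds[1]);                  /* parent keeps only the read end */
    /* read() returns 0 (EOF) once every copy of the write end is gone,
     * i.e. when the child and all of its forked descendants have exited */
    while (read(fds[0], &buf, 1) > 0)
        ;
    return 0;
}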
I am writing a basic shell program for a university assignment, and I need to test for when the user enters the string "exit". When this happens the program should quit.
I can test for this successfully, but if I have forked new processes (ones that have dealt with an strerror in my program), I have to keep entering exit once for every process that is active at that time.
Is there a way of exiting all processes associated with a program under this condition?
Cheers.
As said in the comments, you should not spawn interactive processes in the background (at the very least: how would your shell and your command both handle the single stdin?).
Also, as a shell you should keep track of all spawned (background) processes so that you are able to catch their return codes, as sh/bash (at least) do. For example, in bash:
> sleep 1 &
[1] 8215
>
(1 sec later)
[1]+ Terminated sleep 1
So if you have the list of existing children you can send SIGINT/SIGKILL to all of them.
In any case, if you really want to be sure to kill everyone, you should use process-group (PG) killing: calling the kill() function with pid 0 sends the signal to all processes in the same process group as the caller.
So you can start your shell by setting up a new process group (to be sure not to kill something else), and this PG will be inherited by your children (unless a child sets a new PG of its own, of course).
This would look like:
// At the beginning of your main code:
// try to get a new process group for me
int x = setpgid(0, 0);
if (x == -1) {
    perror("setpgid");
    exit(1);
}
(…)
// Here you're about to exit from main; just kill
// all members of your group.
kill(0, SIGINT);  // send an INT signal
kill(0, SIGKILL); // paranoid: if a child catches INT, it will get a KILL
// Now you can exit, but you're probably dead, because you
// also receive the SIGINT. If you want to survive, you have to
// catch SIGINT; but you cannot catch KILL whatever you do.
If you need to survive the kill, you may catch the signal using signal(), or better sigaction(), so that you will not be killed and will be able to perform other before-exit actions.
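A small sketch of that (SIGINT can be caught; SIGKILL cannot):
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>

static volatile sig_atomic_t got_sigint = 0;

static void on_sigint(int sig)
{
    (void)sig;
    got_sigint = 1;        /* async-signal-safe: only set a flag */
}

int main(void)
{
    struct sigaction sa = { 0 };

    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGINT, &sa, NULL) == -1) {
        perror("sigaction");
        return 1;
    }
    kill(0, SIGINT);       /* whole process group, ourselves included */
    /* We survive the INT thanks to the handler; do the before-exit
     * cleanup here, but remember kill(0, SIGKILL) would still be
     * fatal to us too. */
    printf("still alive (got_sigint=%d)\n", (int)got_sigint);
    return 0;
}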
How can I wait for multiple instances of the same program to finish? Below is my scenario; any suggestions or pointers?
I need to restart a running C process. After googling for a long time, I figured out that restarting can only be done by fork and exec (I need the new instance to have a different pid than the original one, hence using only exec won't work). Below is the sequence I used.
Shell 1:
Bash script
1. Start the first instance (./test.exe, let's say pid 100)
2. Wait for it to complete (pid 100) <<< Here I need to wait for all instances of test.exe to complete
Shell 2:
1. Send a signal to the above process (pid 100)
2. In the signal handler, fork a new process (new pid 200) that execs ./test.exe --restart, and kill the parent (pid 100)
Now my question is: how can I wait in shell 1's bash script for all instances of test.exe to complete? (Basically, it has to wait until pid 200 is completed.)
With my current approach, shell 1's bash script exits as soon as I send the signal to kill pid 100.
Update:
Actually I am looking for some bash/unix command to wait for all instances of test.exe to finish, something like 'wait $(pgrep -f test.exe)'.
Basically you are looking for inter-process synchronisation mechanisms, i.e. inter-process semaphores and inter-process mutexes.
Two approaches that come to mind are POSIX semaphores and the older System V semaphores. I would recommend the former.
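A minimal sketch with a named POSIX semaphore (the name and the "post on exit" convention here are assumptions; link with -pthread). Each test.exe instance posts when it finishes, and the waiter blocks until every expected instance has posted:
#include <fcntl.h>
#include <semaphore.h>
#include <stdio.h>

#define SEM_NAME "/test_exe_done"   /* hypothetical name */

/* Called by each test.exe instance just before it exits. */
void signal_done(void)
{
    sem_t *sem = sem_open(SEM_NAME, O_CREAT, 0600, 0);
    if (sem != SEM_FAILED) {
        sem_post(sem);
        sem_close(sem);
    }
}

/* Called by the waiting process, with `n` instances expected. */
void wait_for_all(int n)
{
    sem_t *sem = sem_open(SEM_NAME, O_CREAT, 0600, 0);
    if (sem == SEM_FAILED) {
        perror("sem_open");
        return;
    }
    while (n-- > 0)
        sem_wait(sem);   /* block until each instance has posted */
    sem_close(sem);
    sem_unlink(SEM_NAME);
}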
Also check out this SO reply.
Hope this helps :)
I am trying to figure out how to kill all processes in a session (with the same SID) using system calls in C. I am not interested in just killing everything with a specific PGID, since not all of the processes I care about share the same PGID, but they do have the same SID.
My research has only turned up this question, where Graeme gave an excellent answer for scripts:
https://unix.stackexchange.com/questions/124127/kill-all-descendant-processes
I would be pleased to get an answer for how to kill all direct descendant children, and even more pleased to learn how to kill all children within the session.
Or is what I am asking even possible? I am not interested in a solution where I simply list the PIDs of the parent's descendants.
You can always use the /proc filesystem to query processes (see proc(5) for more). In particular, you can scan the /proc/PID/ directories, where PID is a numerical name like 1234: the process with pid 1234 is described in the /proc/1234/ pseudo-directory. Hence you could readdir the /proc/ directory, find every numerical name inside it, and check which processes have the parent pid you are after by sequentially reading pseudo-files like /proc/1234/status (and its PPid: line). See also this answer and that one.
Please try this:
pkill -9 -s [session id]
As far as I understand, you can't do that safely.
The only processes you can safely kill are your direct children, because only for them can you know with certainty that the pid is accurate.*
For any other process, the pid is a moving target (though a very, very slowly moving one, unless you're on an extremely busy system where processes are spawned like crazy, making pid recycling very fast).
So you could theoretically walk the process tree, e.g.:
# This will create a nice process tree
(sleep 1000 & ( sleep 1000& sleep 1000& sleep 1000)& sleep 1000 )&

# View it with
ps T --forest

# Or recursively get all the nodes (this should match the corresponding part of the above ps command)
walk_processes() { echo $1; cat /proc/$1/task/$1/children | while read -d ' ' pid; do walk_processes $pid; done; }
walk_processes $!
But you can't use the pids obtained in the above fashion to implement a safe "kill a whole process tree" (with a session being a specific type of a process tree).
You can only kill the direct children of the session leader (or their whole process groups), but the processes killed in this fashion may not pass the signal on to their own subgroups, and that is something you can't safely/reliably do for them. The processes that remain after closing a session this way are reparented to init. If they are stopped process groups, they'd have no one to wake them up (these are called orphaned process groups), so the kernel sends them SIGHUP followed by SIGCONT. SIGHUP will normally kill them; if they have a handler for SIGHUP, they may live on as daemons.
In other words, if you want to safely kill the children of your session leader, prevent those children from creating subgroups (making your session id always match a single process group id).
*The reason is that after you successfully kill your own child process, it becomes a zombie until you wait on it, and that zombie reserves the pid slot; so until you've waited on your child, your child's pid is not a moving target (all other pids are).
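To illustrate the safe part, here is a sketch of killing a direct child's whole process group with killpg(2) (equivalent to kill(-pgid, sig)); note the pid stays reserved until the waitpid():
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t child = fork();
    if (child == 0) {
        setpgid(0, 0);        /* child leads a fresh process group */
        execlp("sleep", "sleep", "1000", (char *)NULL);
        _exit(127);
    }
    setpgid(child, child);    /* set it from the parent too, avoiding a race */
    killpg(child, SIGTERM);   /* signal every process in the group */
    waitpid(child, NULL, 0);  /* reap; until here the pid could not be reused */
    return 0;
}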
With inspiration from Basile Starynkevitch, I used this simple loop; afterwards I wait for the children.
/* Search through all directories in /proc */
while ((dent = readdir(srcdir)) != NULL) {
    /* Entries with numerical names are pids */
    if (dent->d_name[0] >= '0' && dent->d_name[0] <= '9') {
        /* Take pid and ppid from /proc/[pid]/stat; see
           http://man7.org/linux/man-pages/man5/proc.5.html */
        snprintf(path, sizeof path, "/proc/%s/stat", dent->d_name);
        stat_f = fopen(path, "r");
        if (stat_f == NULL)
            continue; /* process may already have exited */
        if (fscanf(stat_f, "%d %*s %*c %d", &pid, &ppid) == 2) {
            /* Kill if the shell is the parent of this process */
            if (shell_pid == ppid)
                kill(pid, SIGKILL);
        }
        fclose(stat_f);
    }
}
I'm working on my homework which is to replicate the unix command shell in C.
I've implemented everything up to single-command execution with background running (&).
Now I'm at the stage of implementing pipes, and I face this issue: for pipelines with more than one pipe, the piped child commands complete, but the final output doesn't get displayed on stdout (the last command's stdin is replaced with the read end of the last pipe):
dup2(pipes[lst_cmd], 0);
I tried fflush(STDIN_FILENO) in the parent too.
The exit for my program is Ctrl+D, and when I press that, the output gets displayed (and the program exits, since my action on Ctrl+D is exit(0)).
I think the output of the pipe is sitting in the stdout buffer but doesn't get displayed. Is there any means other than fflush to get the contents of the buffer out to stdout?
Having seen the code (unfair advantage), the primary problem was the process structure combined with not closing pipes thoroughly.
The process structure for a pipeline ps | sort was:
main shell
   - coordinator sub-shell
        - ps
        - sort
The main shell was creating N pipes (N = 1 for ps | sort). The coordinator shell was then created; it would start the N+1 children. It did not, however, wait for them to terminate, nor did it close its copy of the pipes. Nor did the main shell close its copy of the pipes.
The more normal process structure would probably do without the coordinator sub-shell. There are two mechanisms for generating the children. Classically, the main shell would fork one sub-process; it would do the coordination for the first N processes in the pipeline (including creating the pipes in the first place), and then would exec the last process in the pipeline. The main shell waits for the one child to finish, and the exit status of the pipeline is the exit status of the child (aka the last process in the pipeline).
More recently, bash provides a mechanism whereby the main shell gets the status of each child in the pipeline; it does the coordination.
The primary fixes (apart from some mostly minor compilation warnings) were:
main shell closes all pipes after forking coordinator.
main shell waits for coordinator to complete.
coordinator closes all pipes after forking pipeline.
coordinator waits for all processes in pipeline to complete.
coordinator exits (instead of returning and producing duelling prompts).
A better fix would eliminate the coordinator sub-shell (it would behave like the classical system described).
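To make the pipe-closing discipline concrete, here is a minimal sketch of a two-command pipeline, using the bash-like structure where the shell waits for both children (a sketch, not the poster's code):
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int fd[2];
    pid_t left, right;

    if (pipe(fd) == -1) { perror("pipe"); return 1; }

    if ((left = fork()) == 0) {        /* ps: writes into the pipe */
        dup2(fd[1], STDOUT_FILENO);
        close(fd[0]);
        close(fd[1]);
        execlp("ps", "ps", (char *)NULL);
        _exit(127);
    }
    if ((right = fork()) == 0) {       /* sort: reads from the pipe */
        dup2(fd[0], STDIN_FILENO);
        close(fd[0]);
        close(fd[1]);
        execlp("sort", "sort", (char *)NULL);
        _exit(127);
    }
    /* Crucial: the shell closes its copies, or sort never sees EOF. */
    close(fd[0]);
    close(fd[1]);
    waitpid(left, NULL, 0);
    waitpid(right, NULL, 0);
    return 0;
}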