I'm trying to understand signal handling and processes. I have a parent process that created several child processes, and the parent keeps a list of all of them. When a child terminates, I want to delete it from the list. I know that when a child terminates, SIGCHLD is sent to the parent. Now the tricky part: how can I find out whether that child terminated, was just suspended, or something else?
As you said
when a child is terminated he's sending SIGCHLD to the parent
Make the parent call wait(), either
by a blocking call to wait(), or
by calling it on a continuous basis, or
by setting up a signal handler for SIGCHLD which in turn calls wait().
If you call waitpid() with a pid of -1 and the option WUNTRACED, applying the macro WIFSTOPPED to the status value filled in by waitpid() tells you whether the process was stopped rather than terminated.
On Linux since kernel version 2.6.10, the same applies to the option WCONTINUED and the macro WIFCONTINUED.
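To make the macros concrete, here is a minimal sketch of decoding the status. The enum child_event and the two function names are made up for illustration; only waitpid() and the W* macros come from the system headers.

```c
/* Request POSIX.1-2008 declarations; must come before any #include. */
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical labels for the state changes waitpid() can report. */
enum child_event {
    CHILD_EXITED,     /* called exit() or returned from main() */
    CHILD_KILLED,     /* terminated by a signal */
    CHILD_STOPPED,    /* stopped; reported because of WUNTRACED */
    CHILD_CONTINUED,  /* resumed by SIGCONT; reported because of WCONTINUED */
    CHILD_UNKNOWN
};

/* Decode a status value that was filled in by waitpid(). */
static enum child_event classify_status(int status)
{
    if (WIFEXITED(status))    return CHILD_EXITED;
    if (WIFSIGNALED(status))  return CHILD_KILLED;
    if (WIFSTOPPED(status))   return CHILD_STOPPED;
    if (WIFCONTINUED(status)) return CHILD_CONTINUED;
    return CHILD_UNKNOWN;
}

/* Block until any child changes state. Returns its pid (or -1 on error)
 * and stores what happened in *event. */
static pid_t wait_for_child_event(enum child_event *event)
{
    int status;
    pid_t pid = waitpid(-1, &status, WUNTRACED | WCONTINUED);
    if (pid > 0)
        *event = classify_status(status);
    return pid;
}
```

With something like this, the parent would remove a child from its list only for CHILD_EXITED and CHILD_KILLED, and keep it for the other two events.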
There is a function declared in signal.h, sigaction(), which is similar to signal() but more useful. See http://man7.org/linux/man-pages/man2/sigaction.2.html
When the SA_SIGINFO flag is set, the signal handler (installed through the sa_sigaction field rather than sa_handler) has this prototype:
void handler(int signo, siginfo_t *si, void *ucontext);
It has an argument of type siginfo_t which contains all the information about the signal, including the pid of the process which sent it (for SIGCHLD, si_pid is the pid of the child that changed state).
This can also be done with the conventional signal handling mechanism by using waitid(), as mentioned in the previous answer(s), but waitid() takes an id (such as a pid) as one of its arguments.
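A minimal sketch of the sigaction() approach follows. The handler name, the two global variables and install_sigchld_handler() are made up for illustration. Storing the pid in a sig_atomic_t assumes pid_t fits in it, which holds on Linux with glibc but is not guaranteed everywhere.

```c
/* Request POSIX.1-2008 declarations; must come before any #include. */
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>

/* Written by the handler, read by the main program.
 * Assumption: pid_t fits in sig_atomic_t (true on Linux/glibc). */
static volatile sig_atomic_t last_child_pid = 0;
static volatile sig_atomic_t last_child_code = 0;

/* Three-argument handler, used because SA_SIGINFO is set below. */
static void on_sigchld(int signo, siginfo_t *si, void *ucontext)
{
    (void)signo;
    (void)ucontext;
    last_child_pid = si->si_pid;    /* which child changed state */
    last_child_code = si->si_code;  /* CLD_EXITED, CLD_KILLED, CLD_STOPPED, ... */
}

/* Install the handler for SIGCHLD; returns 0 on success, -1 on error. */
static int install_sigchld_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_sigchld;
    sa.sa_flags = SA_SIGINFO | SA_RESTART;  /* restart interrupted system calls */
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGCHLD, &sa, NULL);
}
```

Note that the handler only records what happened; the child still has to be reaped with one of the wait() functions, and because several SIGCHLDs can merge into one, the recorded pid is only the most recent notification.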
Use the options argument of the wait() family of functions (waitpid(), waitid()).
http://linux.die.net/man/2/waitid
When I create two child processes, I can't use SIGCHLD to tell whether both child processes have terminated since once a signal is delivered, future signals of the same type are discarded. When I receive a SIGCHLD signal and handle that signal, I cannot be sure whether that means both child processes have terminated and sent the SIGCHLD signal or just one of them has terminated. In other words, signals are not queued. However, with the function wait(), if I have two child processes, then I can call wait() twice to reap both child processes, and I wonder how it is implemented under the hood. It seems it is not using SIGCHLD as signals are not queued. So how is it able to handle terminated child processes one by one?
In cases like this, you'd instead call waitpid() with the WNOHANG flag in a loop until it returns 0 to indicate none of the specified processes have changed state (or -1 to indicate an error).
int status;
pid_t p;
while ((p = waitpid(-1, &status, WNOHANG)) > 0) {
    // Inspect status and do whatever
}
As for how they work under the hood? The kernel keeps exited but not yet waited for processes in its list of processes (so called zombie processes). wait() just returns the status of one of the caller's exited children, or blocks until there is one (or an error), at which point the calling process is scheduled to run again.
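The loop above can be wrapped in a small helper, sketched here. The name reap_exited_children() is made up; errno is saved and restored on the assumption that the helper may be called from a SIGCHLD handler.

```c
/* Request POSIX.1-2008 declarations; must come before any #include. */
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Reap every child that has already exited, without blocking.
 * Returns how many children were reaped by this call. */
static int reap_exited_children(void)
{
    int saved_errno = errno;  /* waitpid() may overwrite errno */
    int reaped = 0;
    int status;
    pid_t pid;

    /* One SIGCHLD may stand for several exited children, so keep going
     * until waitpid() reports 0 (none ready) or -1 (no children left). */
    while ((pid = waitpid(-1, &status, WNOHANG)) > 0)
        reaped++;

    errno = saved_errno;
    return reaped;
}
```

Because the kernel keeps each zombie until it is waited for, the number of signals delivered does not matter: the loop drains whatever has accumulated.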
There is a way to see when the status of a pid/tid changes with waitpid(), but it is a blocking function.
I want to monitor all threads of a specific pid, be notified when one of them changes, and print its tid.
For now I start as many threads as there are threads in that process; each one calls waitpid() on one tid, and when that blocking call finishes I print the tid that changed.
How can I get a signal when a tid changes, so that I can monitor all tids from one thread?
I don't want to monitor every pid in the system, only specific pids/tids.
Those tids/pids are not children of my process.
You can call
int status;
pid_t pid = waitpid(-1, &status, 0);
to wait for a state change in any child process.
So you do not have to specify in advance which pid to monitor, and you can react to any status change. This way you do not need to start one thread for each pid.
As to the signal part of your question: A SIGCHLD is sent to your process when a child process exits. This signal is ignored by default, but you can install a custom signal handler for it, of course.
If you only want to reap specific pids, Linux provides the option WNOWAIT (accepted by waitid()), which only reports the state but does not actually reap the child process. You can then check whether the pid is one of those you want to monitor and, if so, call waitpid() on that pid to reap it.
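A minimal sketch of the peek-then-reap idea, assuming waitid() with WEXITED | WNOWAIT. The helper name peek_exited_child() is made up.

```c
/* Request POSIX.1-2008 declarations; must come before any #include. */
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Block until some child has exited and return its pid WITHOUT reaping it.
 * Returns -1 on error, for example when there are no children (ECHILD). */
static pid_t peek_exited_child(void)
{
    siginfo_t info;
    memset(&info, 0, sizeof info);
    /* WNOWAIT leaves the child in a waitable state, so a later
     * waitpid()/waitid() call can still collect it. */
    if (waitid(P_ALL, 0, &info, WEXITED | WNOWAIT) == -1)
        return -1;
    return info.si_pid;
}
```

One caveat: since the child is not reaped, peeking again can return the same pid, so a child you decide not to reap may hide others that exit later.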
If the processes are not children, waitpid() cannot be used in general. One option is to attach with ptrace() to these 40 processes so that you are notified if one of them exits. This might have unwanted side effects, however.
If you're using POSIX threads, then you could use pthread_cleanup_push and pthread_cleanup_pop to call a "cleanup" function when your thread is exiting.
This "cleanup" function could then send one of the user signals (SIGUSR1 or SIGUSR2) to the process which then catches it and treats it as a signal about thread termination.
If you use sigqueue you can add the thread-id for the signal handler so it knows which thread just exited.
You can use pthread_sigmask to block the user signal in all threads, to make sure it's only delivered to the main process thread (or use pthread_sigqueue to send to the main process thread specifically).
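Here is a rough sketch of that cleanup-handler idea. All function names are made up, and the id passed to the thread is an application-level number, not a kernel tid. One deliberate swap: instead of catching SIGUSR1 in an asynchronous handler, the monitoring thread keeps it blocked and collects it synchronously with sigwaitinfo().

```c
/* Request POSIX.1-2008 declarations; must come before any #include. */
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <unistd.h>

/* Cleanup handler: runs when the thread exits or is cancelled. */
static void notify_thread_exit(void *arg)
{
    union sigval value;
    value.sival_int = *(int *)arg;  /* hypothetical application-level id */
    sigqueue(getpid(), SIGUSR1, value);
}

/* Example thread body; arg must point to an int that outlives the thread. */
static void *worker(void *arg)
{
    pthread_cleanup_push(notify_thread_exit, arg);
    /* ... the thread's real work would go here ... */
    pthread_cleanup_pop(1);  /* 1 = run the handler on normal exit too */
    return NULL;
}

/* Block SIGUSR1 in the calling thread. Threads created afterwards
 * inherit the mask, so the signal stays pending for the whole process. */
static int block_exit_signal(void)
{
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    return pthread_sigmask(SIG_BLOCK, &set, NULL);
}

/* Wait for the next notification; returns the id that was sent, or -1. */
static int wait_for_thread_exit(void)
{
    sigset_t set;
    siginfo_t info;
    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    if (sigwaitinfo(&set, &info) == -1)
        return -1;
    return info.si_value.sival_int;
}
```

The main thread would call block_exit_signal() before creating any workers and then loop on wait_for_thread_exit(). SIGUSR1 is a standard signal and is not queued, so if several threads may exit close together a real-time signal (SIGRTMIN and up) is probably the safer choice.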
For example, in the parent process, I forked a child process and wait on the child process:
int main() {
    int status;
    setSignal(SIGCHLD, sigchld_handler);
    while (1) {
        // fork some child processes
        myForkFunction();
        waitpid(-1, &status, 0);
    }
}
Moreover, I have a SIGCHLD signal handler:
void
sigchld_handler(int sig) {
    int status;
    pid_t pid;
    while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
        // Reap zombie processes
    }
}
As can be seen, waitpid() appears both in the main() function and in the sigchld_handler() function. I was wondering whether waitpid can be interrupted by SIGCHLD. If it can be interrupted by SIGCHLD, what will happen then?
Does anyone have any ideas about this?
The POSIX specification for waitpid() says in part:
If _POSIX_REALTIME_SIGNALS is defined, and the implementation queues the SIGCHLD signal, then if wait() or waitpid() returns because the status of a child process is available, any pending SIGCHLD signal associated with the process ID of the child process shall be discarded. Any other pending SIGCHLD signals shall remain pending.
Otherwise, if SIGCHLD is blocked, if wait() or waitpid() return because the status of a child process is available, any pending SIGCHLD signal shall be cleared unless the status of another child process is available.
For all other conditions, it is unspecified whether child status will be available when a SIGCHLD signal is delivered.
The third of the quoted paragraphs seems to imply that you're treading on thin ice. It doesn't mention 'implementation defined' or similar — unspecified means that the standard says nothing about what shall happen and you may or may not get any information from the implementation-specific documentation.
There is a lot of (very densely worded) information in the POSIX specification. There are also some examples, and a rationale which mentions sigwait() and sigwaitinfo(). It is worth reading the whole of the waitpid() page. You should probably also read about Signal concepts, which is more dense reading. (One of these days, I'll do it, too, when I need to know about bits of signals that I haven't covered before.)
Why are you using WUNTRACED instead of 0 or WNOHANG? WUNTRACED is a very specialized condition — POSIX says:
WUNTRACED
The status of any child processes specified by pid that are stopped, and whose status has not yet been reported since they stopped, shall also be reported to the requesting process.
Similar comments apply to WCONTINUED. Those two flags are useful when you need them, but you very seldom need them.
I suggest you should normally use either 0 or WNOHANG in the third argument to waitpid().
Yes, in the sense that only one of them can succeed for a given child process; if the signal handler interrupts the one in main, then after the signal handler returns, the child will already have been reaped and the call in main will fail.
With that said, however, it's bad practice to write code like this. There should be a single place you handle reaping of a given child process, and usually a signal handler is a very bad choice because it's global and it would have to be aware of all possible child processes your program might have finishing, and have a way to communicate those results to the proper parts of your program.
Instead, it's generally better to monitor the termination of child processes via poll on a pipe to/from the child process, and only waitpid after you know it's terminated, or to perform blocking waitpid from a thread whose only job is to wait for the child.
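A minimal sketch of the pipe-based approach, under the assumption that the child (and nothing else) holds the write end of a pipe, so the read end reports end-of-file once the child is gone. The names spawn_monitored() and wait_monitored() are made up.

```c
/* Request POSIX.1-2008 declarations; must come before any #include. */
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Start argv as a child that keeps the write end of a pipe open.
 * Returns the child's pid and stores the read end in *exit_fd. */
static pid_t spawn_monitored(char *const argv[], int *exit_fd)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;
    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);  /* child keeps only the write end */
        execvp(argv[0], argv);
        _exit(127);     /* exec failed */
    }
    close(fds[1]);      /* parent keeps only the read end */
    *exit_fd = fds[0];
    return pid;
}

/* Poll the pipe until the child has gone, then reap it.
 * Returns 1 if reaped, 0 on timeout, -1 on error. */
static int wait_monitored(pid_t pid, int exit_fd, int timeout_ms, int *status)
{
    struct pollfd pfd = { .fd = exit_fd, .events = POLLIN };
    int n;
    do {
        n = poll(&pfd, 1, timeout_ms);
    } while (n == -1 && errno == EINTR);
    if (n <= 0)
        return n;
    close(exit_fd);
    /* The write end is closed, so waitpid() should return promptly. */
    return waitpid(pid, status, 0) == pid ? 1 : -1;
}
```

In a real program the read end would go into the same poll() set as the program's other descriptors. If the child forks further processes that inherit the write end, end-of-file is delayed until they have all closed it.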
In C programming for Linux, I know the wait() function is used to wait for the child process to terminate, but are there some ways (or functions) for child processes to wait for parent process to terminate?
Linux has an extension (as in, non-POSIX functions) for this. Look up prctl ("process-related control").
With prctl, you can arrange for the child to get a signal when the parent dies. Look for the PR_SET_PDEATHSIG operation code used with prctl.
For instance, if you set it to the SIGKILL signal, it effectively gives us a way to have the children die when a parent dies. But of course, the signal can be something that the child can catch.
prctl can do all kinds of other things. It's like an ioctl whose target is the process itself: a "process ioctl".
Short answer: no.
A parent process can control the terminal or process group of its children, which is why we have the wait() and waitpid() functions. A child doesn't have that kind of control over its parent, so there's nothing built in for that.
If you really need a child to know when its parent exits, you can have the parent send a signal to the child in an atexit() handler, and have the child catch that signal.
In Linux, you can use prctl with the value PR_SET_PDEATHSIG to establish a signal that will be sent to your process when the thread that created it dies. Maybe you find it useful.
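When a parent process ends, the child process is adopted by init (or by a subreaper process on newer Linux kernels), so it is enough to check in the child process whether getppid() == 1, or, more robustly, whether getppid() differs from the original parent pid.
That means the parent process has finished.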
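A minimal polling sketch of that check. The function names and the 10 ms interval are arbitrary choices; the original parent pid is assumed to have been recorded before fork() or read with getppid() right at child start-up.

```c
/* Request POSIX.1-2008 declarations; must come before any #include. */
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* True once the child has been re-parented, i.e. the original parent is gone. */
static int parent_has_exited(pid_t original_parent)
{
    return getppid() != original_parent;
}

/* Poll until the original parent has exited. */
static void wait_for_parent_exit(pid_t original_parent)
{
    const struct timespec interval = { 0, 10 * 1000 * 1000 };  /* 10 ms */
    while (!parent_has_exited(original_parent))
        nanosleep(&interval, NULL);
}
```

Comparing against the recorded pid avoids relying on the new parent being pid 1.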
I am writing a shell, and now I have come to controlling the child process.
When I use signal(SIGTERM, SIG_DFL); in the child process,
the SIGINT generated by Ctrl + C terminates the whole shell.
How can I terminate only the process, e.g. "cat", and not the whole shell?
Should I use something like:
void sig_handler(int sig) {
    if (sig == SIGINT)
    {
        kill(pid, sig);
    }
}
Thanks a lot.
Your question is rather vague. Can you be more clear on what you want to achieve?
I think you should be using signal(SIGTERM, sig_handler) instead of SIG_DFL, which selects the default action. Since you have a signal handler, you install it instead of one of the predefined dispositions such as SIG_IGN or SIG_DFL. The code inside your function looks fine. As long as you know the pid, you can call kill(pid, sig) on it.
In the exec'd child, the SIGINT (and SIGQUIT) handlers will be SIG_DFL if they were set to a handler in the parent shell, and that's most likely correct. (You can't inherit a non-default signal handler across an exec, of course, because the function usually doesn't even exist in the exec'd process.)
Setting a handler for SIGTERM won't affect the response to SIGINT, or vice versa.
Your shell shouldn't need to deliver signals to its children.
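One common arrangement, sketched below as an assumption rather than the only design, is for the shell itself to ignore SIGINT and SIGQUIT and for each child to restore the default dispositions before exec. The terminal then delivers Ctrl + C to the foreground process group and only the command dies. All function names are made up, and job control (setpgid(), tcsetpgrp()) is left out.

```c
/* Request POSIX.1-2008 declarations; must come before any #include. */
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Set the disposition of signo to handler (SIG_IGN, SIG_DFL or a function). */
static int set_disposition(int signo, void (*handler)(int))
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Call once at shell start-up so Ctrl + C does not kill the shell itself. */
static int shell_ignore_interrupts(void)
{
    if (set_disposition(SIGINT, SIG_IGN) == -1)
        return -1;
    return set_disposition(SIGQUIT, SIG_IGN);
}

/* Run argv in the foreground; returns its wait status, or -1 on error. */
static int run_foreground(char *const argv[])
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* SIG_IGN is inherited across exec, so restore the defaults
         * for the command before replacing the process image. */
        set_disposition(SIGINT, SIG_DFL);
        set_disposition(SIGQUIT, SIG_DFL);
        execvp(argv[0], argv);
        _exit(127);  /* exec failed */
    }
    int status;
    while (waitpid(pid, &status, 0) == -1) {
        if (errno != EINTR)
            return -1;
    }
    return status;
}
```

With this in place, pressing Ctrl + C while "cat" runs ends cat (WIFSIGNALED with SIGINT in the returned status) and the shell goes back to its prompt without having to send any signal itself.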