When I want to inspect a tracee's system calls, I use PTRACE_ATTACH, then PTRACE_SYSCALL in a loop, and finally PTRACE_DETACH.
The problem is that if the tracee has registered handlers for SIGTRAP or SIGCONT, its behaviour can change while I use PTRACE_SYSCALL or PTRACE_DETACH, and I don't want that to happen.
When I attach to the tracee with PTRACE_ATTACH, the tracee gets a SIGSTOP, but it cannot register a handler for that signal or react to it, so that is fine.
How can I make sure the tracee cannot catch SIGTRAP when I use PTRACE_SYSCALL, or SIGCONT when I use PTRACE_CONT?
The tracer always gets first dibs at signal handling, and it can choose to suppress the signal so that the tracee's handler doesn't run. The only thing you have to worry about is if the tracee blocks the signal with something like sigprocmask or a blocking call to pselect and uses something like sigpending or signalfd to look for it, which you could fix by modifying or emulating the relevant syscalls.
To suppress a signal, just pass 0 for sig when resuming after signal-delivery-stop. From man 2 ptrace:
Signal injection and suppression
After signal-delivery-stop is observed by the tracer, the tracer should restart the tracee with the call
ptrace(PTRACE_restart, pid, 0, sig)
where PTRACE_restart is one of the restarting ptrace requests. If sig is 0, then a signal is not delivered. Otherwise, the signal sig is delivered. This operation is called signal injection in this manual page, to distinguish it from signal-delivery-stop.
Related
Consider a scenario like this: the parent process calls wait() to wait for the child process to exit, and a signal handler is registered for SIGCHLD. While the parent process is blocked in wait(), the child process ends, at which point the parent process receives a SIGCHLD signal (regardless of setting special fields).
After testing, I found that wait() was not interrupted by the SIGCHLD signal (it did not fail and return -1) but returned successfully after the signal handler had run. Why is that?
man wait
ERRORS
EINTR: WNOHANG was not set and an unblocked signal or a SIGCHLD was caught
Since you've established a signal handler for SIGCHLD, wait does not get interrupted.
For more info, see signal(7), especially:
Waiting for a signal to be caught
Synchronously accepting a signal
Signal mask and pending signals
A signal may be blocked, which means that it will not be
delivered until it is later unblocked. ...
Execution of signal handlers
Interruption of system calls and library functions by signal handlers
If a signal handler is invoked while a system call or library function call is blocked, then either:
the call is automatically restarted after the signal handler returns; or
the call fails with the error EINTR.
Which of these two behaviors occurs depends on the interface and whether or not the signal handler was established using the SA_RESTART flag (see sigaction(2)). The details vary across UNIX systems; below, the details for Linux.
If a blocked call to one of the following interfaces is
interrupted by a signal handler, then the call is automatically
restarted after the signal handler returns if the SA_RESTART flag was used; otherwise the call fails with the error EINTR:
wait
After testing, I found that wait() was not interrupted by the SIGCHLD signal (it did not fail and return -1) but returned successfully after the signal handler had run. Why is that?
Well, if the signal handler ran while the thread was blocked in wait() then that call was interrupted. I guess the question is why wait() then went ahead with collecting the child and returned successfully instead of failing with EINTR.
I can reproduce that behavior. The specifics of how you register the handler are unclear, but in my tests I see the handler running and wait() thereafter returning successfully even when the SA_RESTART flag is not set for SIGCHLD, which is generally a major factor in whether restartable system calls such as wait() fail with EINTR when interrupted by a signal.
I'm having trouble locating any documentation that specifically prescribes the observed combination of results for wait() + handler function + SIGCHLD, but the bottom line is that SIGCHLD is special. In particular, it has a special relationship with wait(), because the events that a system-generated SIGCHLD reports on are exactly the ones that a blocking wait() call is waiting for. Some of the manifestations of that specialness are:
The sigaction() function defines two flags modulating behavior related specifically to SIGCHLD, and none specific to any other signal.
Even though the default action for SIGCHLD is documented as "ignore", its actual default behavior (under SIG_DFL) is unique to that signal and distinct from the behavior obtained by explicitly setting the disposition to SIG_IGN.
POSIX has special provisos for the behavior of the wait-family functions, as described in the notes in the wait() manual page, about how these functions are affected by the disposition and flags associated with SIGCHLD.
I don't think either POSIX or Linux explicitly says so, but it all comes around to a pending SIGCHLD being how the wait-family functions recognize that there is a child to collect. POSIX is sufficiently unspecific that I think other POSIX systems could do it differently, but to the best of my knowledge, using SIGCHLD for this purpose is both traditional and what Linux does. Enough so that signal-handling behavior is specifically designed to accommodate the common behavior of using wait() inside a handler for SIGCHLD to provide for central processing of terminated children.
It is also notable that wait() will collect the child and clear the pending SIGCHLD even if that signal is blocked, analogously to how sigwait() will receive blocked signals. In that case, any registered handler is bypassed.
Your case of establishing a handler for SIGCHLD that does not collect the status information for the child is unusual, but consider what needs to happen here:
a SIGCHLD has been received, it is not blocked, and a signal handler has been registered for it, so the signal handler must run and the SIGCHLD must be removed from the pending list.
after your particular handler runs, the status information for the child has not yet been consumed, so it must be consumed when control returns to wait(). Otherwise, it can never be consumed and reported, for receipt of a SIGCHLD is how the system is triggered to do that, and the context in which the status information is delivered.
I anticipate that your wait() would fail with either ECHILD or EINTR if the signal handler collected the waited-for child via its own wait() call. Which one depends in part on whether the SA_RESTART flag is set for SIGCHLD. I anticipate that it would fail with EINTR if there was a running child, and the wait() was interrupted by a synthetic SIGCHLD, and the SA_RESTART flag was not set.
Is there any way in the C programming language to stop a child process and then call it again so that it starts from the beginning? I have realised that if I use SIGKILL and then call the child process again, nothing happens.
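#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

void handler(int sig) {
    printf("entered handler");
    kill(getpid(), SIGKILL);   /* the child kills itself here */
}

int main(void) {
    pid_t child;
    child = fork();
    if (child < 0) printf("error");
    else if (child == 0) {
        signal(SIGINT, handler);
        pause();
    }
    else {
        kill(child, SIGINT);
        kill(child, SIGINT);
    }
}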
This should print two times “Entered Handler” but it does not. Probably because it cannot call the child again. Could I correct this in some way?
This should print two times “Entered Handler” but it does not.
Probably because it cannot call the child again.
There are several problems here, but a general inability to deliver SIGINT twice to the same process is not one of them. The problems include:
The signal handler delivers a SIGKILL to the process in which it is running, effecting that process's immediate termination. Once terminated, the process will not respond to further signals, so there is no reason to expect that the child would ever print "entered handler" twice.
There is a race condition between the child installing a handler for SIGINT and the parent sending it that signal. If the child receives the signal before installing a handler for it, then the child will terminate without producing any output.
There is a race condition between the first signal being accepted by the child and the second being delivered to it. Normal signals do not queue, so the second will be lost if delivered while the first is still pending.
There is a race condition between the child blocking in pause() and the parent signaling. If the signal handler were not killing the child, then it would be possible for the child to receive both signals before reaching the pause() call, and therefore fail to terminate at all.
In the event that the child made it to blocking in pause() before the parent first signaled it, and if it did not commit suicide by delivering itself a SIGKILL, then the signal should cause it to unblock and return from pause(), on a path to terminating normally. Thus, there would then also be a race condition between delivery of the second signal and normal termination of the child.
The printf() function is not async-signal safe. Calling it from a signal handler produces undefined behavior.
You should always use sigaction() to install signal handlers, not signal(), because the behavior of signal() is underspecified and varies in practice. The only safe use for signal() is to reset the disposition of a signal to its default.
Could I correct this in some way?
Remove the kill() call from the signal handler.
Replace the printf() call in the signal handler with a corresponding write() call.
Use sigaction() instead of signal() to install the handler. The default flags should be appropriate for your use.
Solve the various race conditions by:
Having the parent block SIGINT (via sigprocmask()) before forking, so that it will initially be blocked in the child.
Having the child use sigsuspend(), with an appropriate signal mask, instead of pause().
Having the child send some kind of response to the parent after returning from sigsuspend() (a signal of its own, perhaps, or a write to a pipe that the parent can read), and having the parent await that response before sending the second signal.
Having the child call sigsuspend() a second time to receive the second signal.
After a process calls ptrace(PTRACE_TRACEME, ...), where is the tracee stopped?
Is the tracee stopped in the exec() system call? (It seems not.)
Is the tracee stopped in the dynamic linker's text?
...
If I compile an executable without any dynamically linked libraries or the glibc C runtime, and specify the entry point,
the tracee stops at the entry point.
But when I compile an executable with glibc (gcc hello-world.c), it stops at /lib/ld-2.20.so offset + 0xfb0 (according to cat /proc/[pid]/maps).
I would appreciate more details about this.
man ptrace does not seem to help.
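The tracee is usually stopped when it calls execve(), which causes a SIGTRAP to be sent to it once the new program image has been loaded. For a dynamically linked executable, the first user-space code to run is the dynamic linker, which is why the stop shows up inside ld.so.
Some programs also send themselves a signal (with raise() or the equivalent kill(getpid(), ...)) to make sure the tracer sees a stop before the exec, like this:
ptrace(PTRACE_TRACEME, 0, NULL, NULL);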
kill(getpid(), SIGSTOP);
return execvp(args[0], args);
Thanks. So when the tracee is stopped, which instruction does the CPU's EIP register point to?
The specification of the kill(pid_t pid, int sig) function answers this.
If the value of pid causes sig to be generated for the sending
process, and if sig is not blocked for the calling thread and if no
other thread has sig unblocked or is waiting in a sigwait() function
for sig, either sig or at least one pending unblocked signal shall be
delivered to the sending thread before kill() returns.
So, because of the kill(getpid(), SIGSTOP) above, the process is stopped before the library function kill() returns, typically at the instruction just after the system call.
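To see the sequence of stops from the tracer's side, something like the following sketch can be used. It assumes Linux and no extra ptrace options. The helper name trace_exec_stops is made up for this example. It records the signal reported at each stop and resumes with PTRACE_CONT, passing 0 so that neither the SIGSTOP nor the exec SIGTRAP is actually delivered to the tracee.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs argv[0] under ptrace, stores the signal
 * reported at each stop in stops[] (up to max_stops entries), and
 * returns the total number of stops seen, or -1 on error. */
int trace_exec_stops(char *const argv[], int stops[], int max_stops)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(126);
        raise(SIGSTOP);           /* stop #1: before the exec */
        execvp(argv[0], argv);    /* stop #2: SIGTRAP after a successful exec */
        _exit(127);               /* only reached if the exec failed */
    }

    int n = 0;
    for (;;) {
        int status;
        if (waitpid(pid, &status, 0) == -1)
            return -1;
        if (WIFEXITED(status) || WIFSIGNALED(status))
            return n;
        if (WIFSTOPPED(status)) {
            if (n < max_stops)
                stops[n] = WSTOPSIG(status);
            n++;
            /* Resume with sig = 0 so the reported signal is suppressed. */
            if (ptrace(PTRACE_CONT, pid, NULL, NULL) == -1)
                return -1;
        }
    }
}
```

For a simple program such as true, I would expect two stops: SIGSTOP from the raise() and SIGTRAP from the exec. At the second stop the registers could be read with PTRACE_GETREGS to see where the instruction pointer is, but the register layout is architecture-specific, so that part is left out here.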
If a process is currently stopped due to a SIGTRAP signal and it is sent a SIGSTOP signal via kill(), what would be the default behavior? Would the SIGSTOP be a pending signal that is delivered after the process continues again? Or will it just be discarded/ignored?
If the SIGSTOP is queued up, is there any way to remove it from the queue from outside of that process, such as in a tracing process?
From the signal(7) man page:
The signals SIGKILL and SIGSTOP cannot be caught, blocked, or ignored.
A simple test with an app stopped on a breakpoint and sending it a SIGSTOP shows gdb displaying some information when I hit 'next'. The signal was obviously delivered to the app. It cannot continue to be debugged until I send it a SIGCONT.
(gdb) next
Program received signal SIGSTOP, Stopped (signal).
fill (arr=0x7fffffffdff0, size=5) at tmp.cpp:28
(gdb) next
Program received signal SIGCONT, Continued.
fill (arr=0x7fffffffdff0, size=5) at tmp.cpp:28
(gdb) next
(gdb)
What do you mean by 'stopped due to a SIGTRAP signal'? A SIGTRAP will not stop a process; by default it terminates the process with a core dump, or you can change it to ignore the signal or call a signal handler, but in no case will the SIGTRAP stop the process by itself. You might have the process being traced by some other process (such as a debugger) with ptrace(2), in which case it will stop just before delivering the SIGTRAP, but in that case it's under the control of the tracer and won't continue until there's a PTRACE_CONT or other ptrace action to continue the process.
When I call kill(Child_PID, SIGSTOP); from the parent, I expect the child to halt execution and the parent to continue. Is that the expected behavior or do I have to explicitly declare the SIGSTOP handler in the child? I have searched everywhere and not been able to find this information.
Thanks.
Braden
POSIX says:
The system shall not allow a process to catch the signals SIGKILL and SIGSTOP.
So, the child has no option but to stop - if the signal is sent successfully. And you cannot set a SIGSTOP handler in the child (or parent, or any other) process.
This is the expected behavior.
Use strace your_program to see what's happening.
That's the expected behaviour. Quoting from the unix man page:
The signals SIGKILL and SIGSTOP cannot be caught, blocked, or ignored.
And the BSD man page mentions that:
The signal() function will fail and no action will take place
if one of the following occur:
[EINVAL] The sig argument is not a valid signal number.
[EINVAL] An attempt is made to ignore or supply a handler
for SIGKILL or SIGSTOP.
In conclusion, you're not permitted to install a handler for SIGSTOP, and the process will remain in the suspended state until it receives a SIGCONT.