How to catch SIGABRT in multithread environment? - c

I want to create a unit test framework, but to provide good reporting I need to catch SIGABRT, SIGSEGV and probably other signals to prevent my process from being killed (and so be able to continue processing the tests)...
But I don't know how to do this, so I need some information:
Is SIGABRT a thread direct signal?
What happens if I only use the main thread to catch the SIGABRT (or SIGSEGV) signal? Could the thread that called abort return from its call (I hope not)?
If you have any useful documents, links or tutorials, I'm interested. It's for C code using pthreads.
Thanks for your help

I need to catch SIGABRT, SIGSEGV and probably other signals to prevent my process from being killed
This is an exercise in futility. After SIGABRT or SIGSEGV is raised, you (in general) have no idea about the state of the process -- it may have a corrupted heap, stack, global data internal to your test framework, global data internal to the C runtime system, etc. Continuing such a process is exceedingly likely to lead to further crashes at random (correct) places in the code.
The only sane way to handle this in a test framework is to fork and have the parent process handle child error exits, report them and continue running additional tests.
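A minimal sketch of that fork-per-test approach, assuming a POSIX system. The names `test_fn`, `run_isolated` and the three sample tests are made up for illustration and are not part of any existing framework:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef void (*test_fn)(void);          /* hypothetical test signature */

/* Hypothetical test cases, for illustration only. */
static void test_ok(void)     { }
static void test_aborts(void) { abort(); }
static void test_segv(void)   { raise(SIGSEGV); }

/* Run one test in a child process.
 * Returns 0 if the test passed, the signal number if the child was
 * killed by a signal, or -1 on a non-zero exit or a fork/wait failure. */
static int run_isolated(const char *name, test_fn fn)
{
    int status;
    pid_t pid;

    fflush(NULL);                       /* do not duplicate buffered output */
    pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {                     /* child: run the test and leave */
        fn();
        _exit(0);
    }
    if (waitpid(pid, &status, 0) != pid) {
        perror("waitpid");
        return -1;
    }
    if (WIFSIGNALED(status)) {
        int sig = WTERMSIG(status);
        printf("%s: killed by signal %d (%s)\n", name, sig, strsignal(sig));
        return sig;
    }
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0) {
        printf("%s: passed\n", name);
        return 0;
    }
    printf("%s: failed\n", name);
    return -1;
}
```

Calling `run_isolated("aborts", test_aborts)` from the runner's `main` reports the crash and returns `SIGABRT`, while the runner itself is untouched and can go on to the next test.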
Is SIGABRT a thread direct signal?
There is no such thing as a "direct signal". SIGABRT may be sent to the process from outside, or it can be raised inside the process.
What happens if I only use the main thread to catch the SIGABRT (or SIGSEGV) signal?
SIGSEGV and SIGABRT (when not sent from outside) are sent to the thread that caused the invalid memory operation (or raised the signal).
In addition, there is no way to "only use main thread" -- sigaction is global across all threads (though you can set a thread-specific signal mask).
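A small sketch of that behaviour, assuming POSIX threads: the handler is installed once from the calling thread, yet it runs on the worker that raises the signal. It uses raise(SIGABRT) rather than abort() so the process survives the experiment. All function and variable names here are illustrative only:

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <string.h>

static pthread_t handler_thread;            /* thread the handler ran on */
static volatile sig_atomic_t handler_ran;

static void on_abrt(int sig)
{
    (void)sig;
    handler_thread = pthread_self();        /* pthread_self is async-signal-safe */
    handler_ran = 1;
}

static void *worker(void *arg)
{
    int *same = arg;
    raise(SIGABRT);                         /* delivered to the calling thread */
    *same = handler_ran && pthread_equal(handler_thread, pthread_self());
    return NULL;
}

/* Returns 1 if the handler ran on the raising thread, 0 if not, -1 on error. */
static int handler_runs_on_raising_thread(void)
{
    struct sigaction sa, old;
    pthread_t tid;
    int same = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_abrt;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGABRT, &sa, &old) != 0)     /* process-wide disposition */
        return -1;
    if (pthread_create(&tid, NULL, worker, &same) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;
    sigaction(SIGABRT, &old, NULL);             /* restore previous disposition */
    return same;
}
```

Note that with a real abort(), returning from the handler does not save the process: abort() never returns to its caller and terminates the process anyway.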

Related

Can I kill another process from SIGSEGV handler?

Background: I'm fuzzing a long-lived process with afl-fuzz by passing to it the filename to process from a stub that afl-fuzz runs for each sample.
When the long-lived process crashes via SIGSEGV, I want the stub to also generate a SIGSEGV, so that afl-fuzz will mark the sample as interesting.
Will calling kill(stub_pid, SIGSEGV) from the long-lived process's SIGSEGV handler work?
Will calling kill(stub_pid, SIGSEGV) from the long-lived process's SIGSEGV handler work?
If a process ends up in a SIGSEGV handler, something very bad has happened, which might include a completely destroyed stack and/or memory management.
At this point it is not a good idea to rely on anything except that the process is going down.
Trying to invoke any functionality beyond this point is likely to fail, so it is unreliable.
A much safer approach is to have the calling process monitor its child, and if the child terminates unexpectedly (typically via SIGSEGV), take the appropriate action.
Have a look at signal handling inside shell scripts (search key: "trap"), as such a script might be the parent of the process you want to monitor.
It is not recommended to do this from a SIGSEGV handler, but you can do it if you have the proper permissions.
Instead of wondering how to cause a segmentation fault in your program so that AFL would notice something odd, just call abort(). SIGABRT is caught by AFL as well and is much easier to trigger.

Which "fatal" signals should a user-level program catch?

First of all, I do know that there was a similar question here in the past.
But that question wasn't answered properly. Instead, it drifted into suggestions about how to catch signals.
So just to clarify: I've done whatever needs to be done to handle signals.
I have an application that forks a daemon that monitors the main process through a pipe.
If the main process crashes (e.g. segmentation fault), it has a signal handler that writes all the required info to the pipe and aborts.
The goal is to have as much info as possible when something bad happens to the application, w/o messing with "normal" operation, such as SIGHUP, SIGUSR1, etc.
So my question is: which signals should I catch?
I mean signals that w/o me catching them would cause application to abort anyway.
So far I've come up with the following list:
SIGINT (^C, user-initiated, but still good to know)
SIGTERM (kill <pid> from shell or, AFAIK, can be result of OutOfMemory)
SIGSEGV
SIGILL
SIGFPE
SIGBUS
SIGQUIT
Does anybody know if I'm missing something? kill -l has lots of them... :)
I'm looking at my copy of Advanced Programming in the UNIX Environment (Stevens). According to the table in the section on signals, the following POSIX signals will by default terminate the process if you don't catch them. It wouldn't be unreasonable to monitor all of these that you don't use:
SIGABRT
SIGALRM
SIGFPE
SIGHUP
SIGILL
SIGINT
SIGKILL
SIGPIPE
SIGQUIT
SIGSEGV
SIGTERM
SIGUSR1
SIGUSR2
You can catch all of these except SIGKILL, but hopefully SIGKILL won't come up very often for you.
Note that your signal man page (man 7 signal) should give you the proper list for your system - this is the POSIX list, and may differ depending on your architecture.
You should not catch the signal and write the code to a pipe. This is both unnecessary and not failsafe.
Let me quote an answer from the question you linked to, to point out why it's not failsafe: "What makes you think that a SEGV hasn't already corrupted your program memory"
Now you may be wondering how to do it better, and why I've said it is "unnecessary".
The status value that waitpid() stores through its second argument can be checked with WIFSIGNALED() to determine whether the process was terminated by a signal rather than exiting normally, and WTERMSIG() then returns the signal number.
This is failsafe and it does not require a handler or a pipe. Plus, you don't need to worry what to catch, because it will report every signal that terminates your process.
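As a rough sketch of what the monitoring side could look like (the `exit_info` struct and both function names are invented for this example), note that even SIGKILL, which the child could never catch, shows up in the status:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef struct {
    int signaled;   /* 1 if a signal terminated the child */
    int code;       /* signal number or exit code; -1 if unknown */
} exit_info;        /* hypothetical helper type */

/* Translate a waitpid() status into something easy to report. */
static exit_info decode_status(int status)
{
    exit_info info = { 0, -1 };
    if (WIFSIGNALED(status)) {
        info.signaled = 1;
        info.code = WTERMSIG(status);
    } else if (WIFEXITED(status)) {
        info.code = WEXITSTATUS(status);
    }
    return info;
}

/* Fork a child that raises `sig` (or exits with `exit_code` when sig is 0),
 * then wait for it in the parent and decode how it ended. */
static exit_info spawn_and_wait(int sig, int exit_code)
{
    exit_info failed = { 0, -1 };
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return failed;
    if (pid == 0) {
        if (sig != 0)
            raise(sig);         /* default action terminates the child */
        _exit(exit_code);
    }
    if (waitpid(pid, &status, 0) != pid)
        return failed;
    return decode_status(status);
}
```

In the setup from the question, the daemon would need to be the parent of the main process (or otherwise be the one calling waitpid on it) for this to apply.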
It depends: catch whatever signals you like, if that is useful for informing the user that something bad just happened. However, SIGKILL and SIGSTOP cannot be caught.

Signal handling - Async functions and multi threaded applications, Signal stack

Can someone explain why we should not call non-async-signal-safe functions from signal handlers? For example, the exact sequence of steps that corrupts the program when such functions are called.
Also, do signal handlers always run on a separate stack? If so, is it a separate context, or do they run in the context of the signaled thread?
Finally, in a multi-threaded system, what happens when a signal handler is executing and some other thread is signaled and calls the same signal handler?
(I am trying to develop deep understanding of signals and its applications)
1) When a process receives a signal, the handler runs in the context of the process. You should only call async-signal-safe (re-entrant) functions from inside a signal handler. For instance, you cannot call malloc() or printf() from a signal handler. The reasons are:
*) Let's assume your process was executing in malloc() when it received the signal, so the global heap data structures are in an inconsistent state. If the handler now acquires the heap lock and makes changes, it renders the heap even more inconsistent.
*) Alternatively, if the heap lock was held by your process when it received the signal and you then call malloc() from your signal handler, malloc() sees that the lock is held and waits forever to acquire it (forever, because the thread that could release the lock will not run until the signal handler returns).
2) Signals run in the context of the process. As for the signal stack you can look at this SO answer -> Do signal handers have a separate stack?
3) As for getting multiple instances of the same signal you can look at this link -> Signal Handling in UNIX where Rumple Stiltskin answers it well.
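For illustration, here is a sketch of a handler that stays within the async-signal-safe subset: it only sets a flag and calls write(2), and leaves any malloc()/printf() work to normal code that checks the flag later. The names are made up for the example:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

static volatile sig_atomic_t got_signal;    /* the only state the handler touches */

static void on_signal(int sig)
{
    static const char msg[] = "signal received\n";
    int saved_errno = errno;                /* write() may clobber errno */
    ssize_t n;

    got_signal = sig;
    /* write(2) is on the async-signal-safe list; printf() and malloc() are not. */
    n = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)n;
    errno = saved_errno;
}

/* Install the handler for SIGUSR1 and trigger it once.
 * Returns the signal number seen by the handler, or -1 on error. */
static int install_and_trigger(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;
    raise(SIGUSR1);             /* the handler runs before raise() returns */
    return got_signal;
}
```

The main program can then test `got_signal` at a safe point and do the heavy reporting there.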
I know some Solaris, so I'm using that for details. LWP is the Solaris term for a "thread" as in pthreads.
Trap signals, like SIGILL, are delivered to the thread that caused the trap. Asynchronous signals are delivered to the first active thread (LWP) or process that is not blocking that signal. A kernel module called aslwp() traverses the process-header table (which has the associated LWPs) looking for the first likely candidate to receive the asynchronous signal.
A signal stack lives in the kernel. I'm not sure what/how to answer your signal stack question.
One process may have several pending signals. Is that what you mean?
Each signal destined for a process is held there until the process switches context (or is forced) into the active state. This is in part because you generally cannot incur a trap when the process context has been swapped out and the process is doing nothing CPU-wise. You certainly can incur asynchronous signals, but the process cannot "do anything" with any signal if it cannot run. So, at this point the kernel swaps the context back to active, and the signal is delivered via aslwp().
Realtime signals behave differently, and I'm letting it stay with that.
Try reading this:
developers.sun.com/solaris/articles/signalprimer.html

is SIGSEGV delivered to each thread?

I have a multithreaded program on Linux. There are certain memory areas for which I'm interested in seeing whether they have been written within a certain time period. To do that, I give only read access to those memory pages and install a signal handler for SIGSEGV. Now my question is: will each thread call the signal handler for itself? Say Thread 1 writes to some forbidden memory area, will it be the one to execute the signal handler?
First of all:
Signal dispositions are process-wide; all threads in a process share the same disposition for each signal. If one thread uses sigaction() to establish a handler for, say, SIGINT, then that handler may be invoked from any thread to which the SIGINT is delivered.
But read on:
A signal may be directed to either the process as a whole or to a specific thread. A signal is thread-directed if it is generated as the direct result of the execution of a specific hardware instruction within the context of the thread (SIGBUS, SIGFPE, SIGILL, and SIGSEGV).
I am quoting from TLPI.
No, per the question title.
To the question body: For the particular signal that you are asking for, yes (otherwise: it depends). The thread causing a segfault will receive the signal.
See signal(7):
A signal may be generated (and thus pending) for a process as a whole (e.g.,
when sent using kill(2)) or for a specific thread (e.g., certain signals, such
as SIGSEGV and SIGFPE, generated as a consequence of executing a specific
machine-language instruction are thread directed [...].
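A rough sketch of the write-tracking idea from the question, assuming Linux with glibc. A worker writes to a read-only page, and the SA_SIGINFO handler records which thread it ran on before making the page writable so the faulting store can be retried. mprotect() is not on the POSIX async-signal-safe list; calling it from a handler is an assumption that is commonly relied on in Linux, not a guarantee. All names are illustrative:

```c
#define _DEFAULT_SOURCE             /* for MAP_ANONYMOUS on glibc */
#include <pthread.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static char *watched;               /* one page, initially read-only */
static size_t page_size;
static pthread_t faulting_thread;   /* thread on which the handler ran */
static volatile sig_atomic_t fault_count;

static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    uintptr_t addr = (uintptr_t)info->si_addr;
    uintptr_t base = (uintptr_t)watched;
    (void)ctx;

    if (addr < base || addr >= base + page_size) {
        signal(sig, SIG_DFL);       /* a genuine crash: let the retry kill us */
        return;
    }
    faulting_thread = pthread_self();
    fault_count++;
    /* Assumption: mprotect from a handler works here (not guaranteed by POSIX). */
    if (mprotect(watched, page_size, PROT_READ | PROT_WRITE) != 0)
        signal(sig, SIG_DFL);
}

static void *writer(void *arg)
{
    int *same = arg;
    ((volatile char *)watched)[0] = 42;     /* faults once, then is retried */
    *same = pthread_equal(faulting_thread, pthread_self());
    return NULL;
}

/* Returns 1 if exactly one fault was handled on the writing thread, 0 if not,
 * -1 on setup errors. */
static int track_write(void)
{
    struct sigaction sa, old;
    pthread_t tid;
    int same = 0, ok;

    page_size = (size_t)sysconf(_SC_PAGESIZE);
    watched = mmap(NULL, page_size, PROT_READ,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (watched == MAP_FAILED)
        return -1;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return -1;
    if (pthread_create(&tid, NULL, writer, &same) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;
    ok = (fault_count == 1 && same && watched[0] == 42);
    sigaction(SIGSEGV, &old, NULL);
    munmap(watched, page_size);
    return ok;
}
```

The handler checks si_addr so that a real crash elsewhere in the program is not silently swallowed.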

Is setitimer's ITIMER_PROF signal (SIGPROF) sent to every thread in a multithreaded process with NPTL on Linux (2.6.21.7)?

The manual says that the setitimer timer is shared across the whole process and that SIGPROF is sent to the process, not to a thread.
But when I create the timer in my multithreaded process, I get very serious errors in the signal handler unless I create an independent stack for every thread to handle the signal. After some debugging, I confirmed that the stack (in the single-stack case) must have been re-entered.
So now I suspect that SIGPROF may be sent to multiple threads at the same time? Thanks!
I don't follow the details of your question but the general case is:
A signal may be generated (and thus pending) for a process as a whole (e.g., when sent using kill(2)) or for a specific thread (e.g., certain signals, such as SIGSEGV and SIGFPE, generated as a consequence of executing a specific machine-language instruction are thread directed, as are signals targeted at a specific thread using pthread_kill(3)). A process-directed signal may be delivered to any one of the threads that does not currently have the signal blocked. If more than one of the threads has the signal unblocked, then the kernel chooses an arbitrary thread to which to deliver the signal.
man (7) signal
You can block the signal for specific threads with pthread_sigmask and by elimination direct it to the thread you want to handle it.
According to POSIX, the alternate signal stack established with sigaltstack is per-thread, and is not inherited by new threads. However, I believe some versions of Linux and/or userspace pthread library code (at least old kernels with LinuxThreads and maybe some versions with NPTL too?) have a bug where the alternate stack is inherited, and of course that will lead to crashing whenever you use the alternate stack. Is there a reason you need alternate stacks? Normally the only purpose is to handle stack overflows semi-gracefully (allowing yourself some stack space to catch SIGSEGV and save any unsaved data before exiting). I would just disable it.
Alternatively, use pthread_sigmask to block SIGPROF in all threads but the main one. Note that, to avoid a nasty race condition here, you need to block it in the main thread before calling pthread_create so that the new thread starts with it blocked, and unblock it after pthread_create returns.
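A sketch of that block-before-create pattern, assuming Linux/POSIX with XSI timers. The handler counts ticks seen on the main thread and on any other thread separately. Function and variable names are invented for the example, and the 1 ms interval and two workers are arbitrary choices:

```c
#define _XOPEN_SOURCE 700           /* for setitimer and SIGPROF */
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <sys/time.h>

static pthread_t main_thread;
static volatile sig_atomic_t ticks;         /* SIGPROF handled on main thread */
static volatile sig_atomic_t stray_ticks;   /* SIGPROF handled anywhere else */
static atomic_int stop;

static void on_prof(int sig)
{
    (void)sig;
    if (pthread_equal(pthread_self(), main_thread))
        ticks++;
    else
        stray_ticks++;
}

static void *spin(void *arg)                /* worker that just burns CPU */
{
    (void)arg;
    while (!atomic_load(&stop)) { }
    return NULL;
}

/* Returns the number of ticks seen on the main thread,
 * or -1 on error or if any tick landed on a worker. */
static int profile_on_main_only(void)
{
    struct sigaction sa;
    struct itimerval on = { { 0, 1000 }, { 0, 1000 } };     /* 1 ms of CPU time */
    struct itimerval off = { { 0, 0 }, { 0, 0 } };
    sigset_t prof, old;
    pthread_t t[2];
    int i;

    main_thread = pthread_self();
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_prof;
    sa.sa_flags = SA_RESTART;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGPROF, &sa, NULL) != 0)
        return -1;

    sigemptyset(&prof);
    sigaddset(&prof, SIGPROF);
    /* Block first: new threads inherit the creator's signal mask. */
    if (pthread_sigmask(SIG_BLOCK, &prof, &old) != 0)
        return -1;
    for (i = 0; i < 2; i++)
        if (pthread_create(&t[i], NULL, spin, NULL) != 0)
            return -1;
    /* Restore the previous mask in the main thread only. */
    pthread_sigmask(SIG_SETMASK, &old, NULL);

    if (setitimer(ITIMER_PROF, &on, NULL) != 0)
        return -1;
    while (ticks < 20) { }                  /* consume CPU until enough ticks */
    setitimer(ITIMER_PROF, &off, NULL);

    atomic_store(&stop, 1);
    for (i = 0; i < 2; i++)
        pthread_join(t[i], NULL);
    return stray_ticks == 0 ? ticks : -1;
}
```

Because the workers start with SIGPROF already blocked, there is no window in which a tick could be delivered to one of them.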
