Segmentation fault handling


I have an application in which I want to catch any segmentation fault or Ctrl-C.
Using the code below, I am able to catch the segmentation fault, but the handler is called again and again. How can I stop that?
For your information, I don't want to exit my application. I just want to free all the corrupted buffers.
Is it possible?
void SignalInit(void )
{
struct sigaction sigIntHandler;
sigIntHandler.sa_handler = mysighandler;
sigemptyset(&sigIntHandler.sa_mask);
sigIntHandler.sa_flags = 0;
sigaction(SIGINT, &sigIntHandler, NULL);
sigaction(SIGSEGV, &sigIntHandler, NULL);
}
and the handler goes like this:
void mysighandler(int signum)
{
    MyfreeBuffers(); /* related to my application */
}
For the segmentation fault signal, the handler is called multiple times, and obviously MyfreeBuffers() then gives me errors for freeing already-freed memory. I want to free only once, but I still don't want to exit the application.
Please help.

The default action for signals like SIGSEGV is to terminate your process, but since you've installed a handler for it, your handler is called instead, overriding the default behavior. The problem is that the faulting instruction may be retried after your handler returns, and if you haven't fixed the cause of the first fault, the retried instruction faults again, and so on.
So first find the instruction that caused the SIGSEGV and try to fix it (you can call something like backtrace() in the handler and see for yourself what went wrong).
Also, the POSIX standard says:
The behavior of a process is undefined after it returns normally from
a signal-catching function for a [XSI] SIGBUS, SIGFPE, SIGILL, or
SIGSEGV signal that was not generated by kill(), [RTS] sigqueue(),
or raise().
So the ideal thing to do is to fix your segfault in the first place. A handler for SIGSEGV is not meant to bypass the underlying error condition.
So the best suggestion would be: don't catch the SIGSEGV. Let it dump core. Analyze the core. Fix the invalid memory reference, and there you go!

I do not agree at all with the statement "Don't catch the SIGSEGV".
That's a pretty good practice for dealing with unexpected conditions. And it is much cleaner to cope with NULL pointers (as returned by failed malloc calls) using the signal mechanism together with setjmp/longjmp than to spread error-condition handling all through your code.
Note, however, that if you use sigaction on SIGSEGV, you must not forget to set SA_NODEFER in sa_flags, or find another way to deal with the fact that SIGSEGV will otherwise trigger your handler just once.
#include <setjmp.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
static void do_segv()
{
int *segv;
segv = 0; /* malloc(a_huge_amount); */
*segv = 1;
}
jmp_buf point; /* jmp_buf matches the setjmp/longjmp calls used below */
static void handler(int sig, siginfo_t *dont_care, void *dont_care_either)
{
longjmp(point, 1);
}
int main()
{
struct sigaction sa;
memset(&sa, 0, sizeof(sa));
sigemptyset(&sa.sa_mask);
sa.sa_flags = SA_NODEFER | SA_SIGINFO; /* SA_SIGINFO is required when using sa_sigaction */
sa.sa_sigaction = handler;
sigaction(SIGSEGV, &sa, NULL); /* ignore whether it works or not */
if (setjmp(point) == 0)
do_segv();
else
fprintf(stderr, "rather unexpected error\n");
return 0;
}

If the SIGSEGV fires again, the obvious conclusion is that the call to MyfreeBuffers(); has not fixed the underlying problem (and if that function really does only free() some allocated memory, I'm not sure why you would think it would).
Roughly, a SIGSEGV fires when an attempt is made to access an inaccessible memory address. If you are not going to exit the application, you need to either make that memory address accessible, or change the execution path with longjmp().

You shouldn't try to continue after SIGSEGV. It basically means that the environment of your application is corrupted in some way. It could be that you have just dereferenced a null pointer, or it could be that some bug has caused your program to corrupt its stack or the heap or some pointer variable; you just don't know. The only safe thing to do is terminate the program.
It's perfectly legitimate to handle Ctrl-C. Lots of applications do it, but you have to be really careful about exactly what you do in your signal handler. You can't call any function that's not re-entrant. So that means if your MyfreeBuffers() calls the stdlib free() function, you are probably screwed. If the user hits Ctrl-C while the program is in the middle of malloc() or free(), and thus halfway through manipulating the data structures they use to track heap allocations, you will almost certainly corrupt the heap if you call malloc() or free() in the signal handler.
About the only safe thing you can do in a signal handler is set a flag to say you caught the signal. Your app can then poll the flag at intervals to decide if it needs to perform some action.
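As a rough illustration of the flag approach, here is a minimal sketch. The names got_sigint, install_sigint_flag and run_until_interrupted are made up for this example, and raise(SIGINT) merely stands in for the user pressing Ctrl-C.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes sigaction() under strict -std modes */
#include <signal.h>
#include <string.h>

/* Written by the handler, polled by the main loop. */
static volatile sig_atomic_t got_sigint = 0;

static void on_sigint(int signo)
{
    (void)signo;
    got_sigint = 1;  /* the only work done inside the handler */
}

/* Hypothetical helper: installs the handler; returns 0 on success. */
int install_sigint_flag(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGINT, &sa, NULL);
}

/* Hypothetical work loop: returns the number of steps completed
 * before the flag was seen (or max_steps if it never was). */
int run_until_interrupted(int max_steps)
{
    int steps = 0;
    while (steps < max_steps && !got_sigint) {
        steps++;
        if (steps == 3)
            raise(SIGINT);  /* stand-in for the user pressing Ctrl-C */
    }
    /* Cleanup such as free() belongs here, outside the handler. */
    return steps;
}
```

Calling install_sigint_flag() once at startup and then polling the flag from the main loop keeps every non-reentrant call, including free(), out of signal context.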

Well, you could set a state variable and only free the memory if it's not set. The signal handler will be called every time; you can't control that, AFAIK.

I can see a case for recovering from a SIGSEGV: if you're handling events in a loop and one of these events causes a segmentation violation, then you would only want to skip over this event and continue processing the remaining events. In my eyes SIGSEGV is similar to the NullPointerException in Java. Yes, the state will be inconsistent and unknown after either of these; however, in some cases you would like to handle the situation and carry on. For instance, in algo trading you would pause the execution of an order and allow a trader to take over manually, without crashing the entire system and ruining all other orders.

It looks like, at least under Linux, the trick with the -fnon-call-exceptions option can be a solution. It makes it possible to convert the signal into a regular C++ exception and handle it in the usual way.
See the question 'linux3/gcc46: "-fnon-call-exceptions", which signals are trapping instructions?' for an example.

Related

When is it preferable to cause a segfault in a watchdog thread versus exiting normally to stop a process?

I am wondering if there is ever a good reason to exit a watchdog thread in the manner depicted, versus exiting with exit(). In the code I came across that brought this question to mind, a segfault was caused by de-referencing a null pointer with the strange line *(char **)0 = "watchdog timeout";.
Unless I'm mistaken, a thread calling exit() terminates the entire process. I interpret a segfault as an error, and not intended behavior, but perhaps there are times when it is desired.
void *watchdog_loop(void *arg) {
time_t now;
while(foo) {
sleep(1);
now = current_time();
if (watchdog_timeout && now - bar > watchdog_timeout) {
raise(SIGSEGV); //something went wrong
}
}
return NULL;
}

Is there ever a time that it would be more desirable to have a watchdog loop segfault intentionally, versus exiting nonzero?
It is never desirable to elicit undefined behavior, which is what the example code does. In particular, note well that that code is not required to cause a segfault to be delivered to the process, though it might reliably do so on certain systems.
However, one might indeed prefer to kill a process via a signal instead of by calling exit(), so as to achieve termination without executing any application or library cleanup code. This is a plausible goal for a watchdog. Even in that event, however,
Either the raise() or the abort() function would definedly cause a signal to be delivered to the process.
SIGSEGV seems an odd choice of signal. Any of SIGABRT, SIGTERM, or SIGKILL would make more sense to me. Of those,
SIGKILL is not specified by the C language spec, but rather by POSIX (and maybe others). On a POSIX system, SIGKILL cannot be blocked or caught, so it is a very good candidate for a signal to terminate the process as quickly and surely as possible.
SIGABRT is used by the abort() function, which also goes to some pains to try to overcome program resistance to being terminated that way. This is the most natural standard function to use to trigger an intentional abnormal program termination.
SIGTERM can be caught and / or blocked, but unlike SIGKILL, it is defined by the C language specification, and therefore is more portable. But I don't really see any advantage over SIGABRT, unless you intend to allow it to be handled.
Another alternative would be _exit() (POSIX) or _Exit() (C99 or later). These perform a cleaner shutdown than you can expect from termination via a signal, but without executing most cleanup code. Open files will be closed, and the parent process will observe the process to terminate normally with a failure status instead of terminating by being killed by a signal.
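To make the difference visible from the parent's side, here is a small sketch, assuming a POSIX system. The helper name child_status and the exit code 3 are invented for illustration; the child ends either via abort() or via _exit(), and the parent inspects the wait status.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that ends via abort() when
 * use_abort is nonzero, or via _exit(3) otherwise.
 * Returns the raw wait status, or -1 on error. */
int child_status(int use_abort)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (use_abort)
            abort();   /* terminated by SIGABRT, no atexit handlers run */
        _exit(3);      /* "normal" termination with a failure status */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return status;
}
```

With this sketch, WIFSIGNALED(status) should be true and WTERMSIG(status) should be SIGABRT for the abort() child, while the _exit() child should report WIFEXITED(status) with WEXITSTATUS(status) equal to 3.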

Catching Segmentation Violations and Getting on with Life

I'm writing a program that's examining its own address space.
Specifically, I care about all malloc-ed data blocks. If there is some system call to get a list of them, that would be fantastic (for my application, I cannot use LD_PRELOAD, -wrap, nor any extra command line options). If there's a way to do this, I'd love to hear it even more than an answer to my stated problem, below.
In lieu of this, my current approach is to just dereference everything and look around. Obviously, the set of all possible pointers is a minefield of segfaults waiting to happen, so I tried registering a signal handler and using setjmp/longjmp (simply ignoring the segfault by making the handler do nothing is an infinite loop because the handler will return to the faulting instruction). Some example code goes like so:
static jmp_buf buf;
void handler(int i) {
printf(" Segfaulted!\n");
longjmp(buf,-1);
}
void segfault(void) {
int* x = 0x0;
int y = *x;
}
void test_function(void) {
signal(11,handler);
while (1) {
if (setjmp(buf)==0) {
printf("Segfaulting:\n");
segfault();
}
else {
printf("Recovered and not segfaulting!\n");
}
printf("\n");
}
}
The output is:
Segfaulting:
Segfaulted!
Recovered and not segfaulting!
Segfaulting:
Segmentation fault
So, the handler didn't work the second time around. I don't know why this is, but I speculated it had something to do with not clearing the original signal. I don't know how to do that.
As an aside, I tried sigsetjmp/siglongjmp first, but they weren't defined for some reason in setjmp.h. I got vague vibes that one needed to pass some extra compile flags, but, as before, that is not allowed for this application.
The system being used is Ubuntu Linux 10.04 x86-64, and any solution does not need to be portable.
[EDIT: sigrelse in the handler clears the signal, and fixes the problem effectively. Question now concerns the other issues raised--is there a better way (i.e., get the blocks of malloc)? What's up with sigsetjmp/siglongjmp? Why do I need to reset the signal?]

signal() is a legacy interface, and may or may not re-register a signal handler after it has been invoked, depending on the OS; you may need to issue another signal() call to reset the signal handler as the last action in your handler. See man 2 signal.
sigaction() is the preferred mechanism to set signal handlers, as it has well defined and portable behavior.

When the signal handler for SIGSEGV is invoked, the SIGSEGV signal will be masked as if by sigprocmask. This is true for any signal. Normally returning from the signal handler would unmask it, but since you're not returning, that never happens. There are a couple possible solutions:
You can call sigprocmask either before or after the longjmp to unmask it yourself.
You can install the signal handler with sigaction (the preferred way to do it anyway) and use the SA_NODEFER flag to prevent it from being masked.
You can use the sigsetjmp and siglongjmp functions, which themselves take responsibility for saving and restoring the signal mask.
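The third option could look roughly like the sketch below. The function name count_recoveries is hypothetical, and raise(SIGSEGV) is used as a stand-in for a real faulting access so that the example stays well defined. The feature-test macro on the first line is one way to get sigsetjmp declared without extra compiler flags.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes sigsetjmp/siglongjmp and sigaction */
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf recover_point;

static void on_segv(int signo)
{
    (void)signo;
    /* Also restores the signal mask saved by sigsetjmp(..., 1),
     * so SIGSEGV is unblocked again after the jump. */
    siglongjmp(recover_point, 1);
}

/* Hypothetical helper: raises SIGSEGV `rounds` times and returns how
 * many times execution came back through the recovery branch. */
int count_recoveries(int rounds)
{
    struct sigaction sa, old;
    volatile int recovered = 0;  /* volatile: modified between setjmp and longjmp */
    volatile int i;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_segv;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return -1;

    for (i = 0; i < rounds; i++) {
        if (sigsetjmp(recover_point, 1) == 0)
            raise(SIGSEGV);      /* stand-in for a faulting access */
        else
            recovered++;
    }

    sigaction(SIGSEGV, &old, NULL);  /* put the previous disposition back */
    return recovered;
}
```

Because the second argument of sigsetjmp is nonzero, each jump restores the mask that was in effect before the handler ran, which is why the handler keeps working on the second and later rounds.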

Issue in sighandler

I am creating a user level thread library implementing preemptive round robin scheduler. I have the handler function like this:
void handler(int signum)
{
gtthread_yield();
}
In gtthread_yield, I do the context switch to the next thread to be executed. The logic works fine in most cases. But I get a segmentation fault when the signal is raised again before the gtthread_yield call (made from the signal handler) has finished executing. Because of this, my code accesses an invalid memory location (memory that I had already freed).
Is there any way to avoid handler being raised before the gtthread_yield function finishes execution?
Thanks

Use sigaction() and its helpers (sigemptyset(), sigfillset(), sigaddset(), etc.) to block signals while the handler is in progress. That is probably a necessary step; it may not be sufficient. If it is not sufficient, you probably need to revise the signal handling so that the handler does almost nothing except set a volatile sig_atomic_t variable before returning. Then the calling code has to look at the atomic variable and call gtthread_yield() when it is set (remembering to clear the variable after returning from gtthread_yield()).
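A related way to keep the handler from firing in the middle of a context switch is to block the signal around the critical section with sigprocmask(). The sketch below is only an illustration: SIGVTALRM is an assumption about which timer signal the scheduler uses, guarded_section and on_tick are invented names, and raise() simulates the timer firing mid-switch.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t ticks = 0;

static void on_tick(int signo)
{
    (void)signo;
    ticks++;  /* a real scheduler would request a yield here */
}

/* Hypothetical helper: runs a stand-in critical section with SIGVTALRM
 * blocked. Stores the tick count seen inside the section and the count
 * seen right after unblocking. Returns 0 on success, -1 on error. */
int guarded_section(int *inside, int *after)
{
    struct sigaction sa;
    sigset_t block, old;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_tick;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGVTALRM, &sa, NULL) != 0)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGVTALRM);
    if (sigprocmask(SIG_BLOCK, &block, &old) != 0)
        return -1;

    raise(SIGVTALRM);   /* the "timer" fires mid-switch: it stays pending */
    *inside = ticks;

    /* Restoring the old mask lets the pending signal be delivered. */
    if (sigprocmask(SIG_SETMASK, &old, NULL) != 0)
        return -1;
    *after = ticks;
    return 0;
}
```

In a program that uses real kernel threads, pthread_sigmask() would be the function to use instead of sigprocmask().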

You can block other signals for the duration of the handler; see, e.g., the entry in the glibc manual on how to do it.

Longjmp out of signal handler?

From the question:
Is it good programming practice to use setjmp and longjmp in C?
Two of the comments left said:
"You can't throw an exception in a signal handler, but you can do a
longjmp safely -- as long as you know what you are doing. – Dietrich
Epp Aug 31 at 19:57
@Dietrich: +1 to your comment. This is a little-known and
completely-under-appreciated fact. There are a number of problems that
cannot be solved (nasty race conditions) without using longjmp out of
signal handlers. Asynchronous interruption of blocking syscalls is the
classic example."
I was under the impression that signal handlers were called by the kernel when it encountered an exceptional condition (e.g. divide by 0). Also, that they're only called if you specifically register them.
This would seem to imply (to me) that they aren't called through your normal code.
Moving on with that thought... setjmp and longjmp, as I understand them, are for collapsing the stack back to a previous point and state. I don't understand how you can collapse a stack when a signal handler is called, since it's called from the kernel as a one-off circumstance rather than from your own code. What's the next thing up the stack from a signal handler!?

The way the kernel "calls" a signal handler is by interrupting the thread, saving the signal mask and processor state in a ucontext_t structure on the stack just beyond (below, on grows-down implementations) the interrupted code's stack pointer, and restarting execution at the address of the signal handler. The kernel does not need to keep track of any "this process is in a signal handler" state; that's entirely a consequence of the new call frame that was created.
If the interrupted thread was in the middle of a system call, the kernel will back out of the kernelspace code and adjust the return address to repeat the system call (if SA_RESTART is set for the signal and the system call is a restartable one) or put EINTR in the return code (if not restartable).
It should be noted that longjmp is async-signal-unsafe. This means it invokes undefined behavior if you call it from a signal handler if the signal interrupted another async-signal-unsafe function. But as long as the interrupted code is not using library functions, or only using library functions that are marked async-signal-safe, it's legal to call longjmp from a signal handler.
Finally, my answer is based on POSIX since the question is tagged unix. If the question were just about pure C, I suspect the answer is somewhat different, but signals are rather useless without POSIX anyway...

longjmp does not perform normal stack unwinding. Instead, the stack pointer is simply restored from the context saved by setjmp.
Here is an illustration on how this can bite you with non-async-safe critical parts in your code. It is advisable to e.g. mask the offending signal during critical code.

It is worth reading http://man7.org/linux/man-pages/man2/sigreturn.2.html regarding how Linux handles signal handler invocation and, in this case, how it manages signal handler exit. My reading of it suggests that executing a longjmp() from a signal handler (resulting in no call to sigreturn()) might be "undefined" at best. You also have to take into account on which thread (and thus user stack) setjmp() was called, and on which thread (and thus user stack) longjmp() is subsequently called.

This doesn't answer the question of whether or not it is "good" to do this, but
this is how to do it. In my application, I have a complicated interaction between custom hardware, huge page, shared memory, NUMA lock memory, etc, and it is possible to have memory that seems to be decently allocated but when you touch it (write in this case), it throws a BUS error or SEGV fault in the middle of the application. I wanted to come up with a way of testing memory addresses to make sure that the shared memory wasn't node locked to a node that didn't have enough memory, so that the program would fail early with graceful error messages. So these signal handlers are ONLY used for this one piece of code (a small memcpy of 5 bytes) and not used to rescue the app while it is in use. I think it is safe here.
Apologies if this is not "correct". Please comment and I'll fix it up. I cobbled it together based on hints and some sample code that didn't work.
#include <stdio.h>
#include <signal.h>
#include <setjmp.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
sigjmp_buf JumpBuffer;
void handler(int);
int count = 0;
int main(void)
{
struct sigaction sa;
memset(&sa, 0, sizeof(sa)); /* zero sa_flags, which was otherwise left uninitialized */
sa.sa_handler = handler;
sigemptyset(&(sa.sa_mask));
sigaddset(&(sa.sa_mask), SIGSEGV);
sigaction(SIGSEGV, &sa, NULL);
while (1) {
int r = sigsetjmp(JumpBuffer,1);
if (r == 0) {
printf("Ready for memcpy, count=%d\n",count);
usleep(1000000);
char buffer[10];
#if 1
char* dst = buffer; // this won't do bad
#else
char* dst = NULL; // this will cause a segfault
#endif
memcpy(dst,"12345",5); // trigger seg fault here
siglongjmp(JumpBuffer,2); // pair siglongjmp with the sigsetjmp above
}
else if (r == 1)
{
printf("SEGV. count %d\n",count);
}
else if (r == 2)
{
printf("No segv. count %d\n",count);
}
}
return 0;
}
void handler(int sig)
{
count++;
siglongjmp(JumpBuffer, 1);
}
References
https://linux.die.net/man/3/sigsetjmp
https://pubs.opengroup.org/onlinepubs/9699919799/functions/longjmp.html
http://www.csl.mtu.edu/cs4411.ck/www/NOTES/non-local-goto/sig-1.html
https://www.gnu.org/software/libc/manual/html_node/Longjmp-in-Handler.html

In most systems a signal handler has its own stack, separate from the main stack. That's why you can longjmp out of a handler. I don't think it's a wise thing to do, though.

You can't use longjmp to get out of a signal handler.
The reason for this is that setjmp only saves the resources (process registers) etc. that the calling-convention specifies that should be saved over a plain function call.
When an interrupt occurs, the function being interrupted may have a much larger state, and it will not be restored correctly by longjmp.

Providing/passing argument to signal handler

Can I provide/pass any arguments to signal handler?
/* Signal handling */
struct sigaction act;
act.sa_handler = signal_handler;
/* some more settings */
Now, handler looks like this:
void signal_handler(int signo) {
/* some code */
}
If I want to do something special i.e. delete temp files, can I provide those files as an argument to this handler?
Edit 0: Thanks for the answers. We generally avoid/discourage use of global variables. And in this case, If you have a huge program, things can go wrong at different places and you might need to do a lot of cleanup. Why was the API designed this way?

You can't have data of your own passed to the signal handler as parameters. Instead you'll have to store your parameters in global variables. (And be really, really careful if you ever need to change that data after installing the signal handler.)
Response to edit 0: Historical reasons. Signals are a really old and really low-level design. Basically you're just giving the kernel a single address of some machine code and asking it to go to this specific address if such and such happens. We're back in the "portable assembler" mindset here, where the kernel provides a no-frills baseline service, and whatever the user process can reasonably be expected to do for itself, it must do itself.
Also, the usual arguments against global variables don't really apply here. The signal handler itself is a global setting, so there is no relevant possibility of having several different sets of user-specified parameters for it around. (Well, actually it is not entirely global but only thread-global. But the threading API will include some mechanism for thread-local storage, which is just what you need in this case).

A signal handler registration is already a global state equivalent to global variables. So it's no greater offense to use global variables to pass arguments to it. However, it's a huge mistake (almost certainly undefined behavior unless you're an expert!) to do anything from a signal handler anyway. If you instead just block signals and poll for them from your main program loop, you can avoid all these issues.

This is a really old question but I think I can show you a nice trick that would have answered your problem.
No need to use sigqueue or whatever.
I also dislike the use of global variables, so I had to find a clever way, in my case, to send a void pointer (which you can later cast to whatever suits your need).
Actually you can do this :
signal(SIGWHATEVER, (void (*)(int))sighandler); // Yes it works ! Even with -Wall -Wextra -Werror using gcc
Then your sighandler would look like this :
int sighandler(const int signal, void *ptr) // Actually void can be replaced with anything you want , MAGIC !
You might ask : How to get the *ptr then ?
Here's how :
At initialization:
signal(SIGWHATEVER, (void (*)(int))sighandler);
sighandler(FAKE_SIGNAL, your_ptr);
In your sighandler function:
int sighandler(const int signal, void *ptr)
{
    static void *saved = NULL; /* remembers the pointer from the first, manual call */
    if (saved == NULL)
        saved = ptr;
    if (signal == SIGWHATEVER)
    {
        // DO YOUR STUFF OR FREE YOUR PTR
    }
    return (0);
}

Absolutely. You can pass integers and pointers to signal handlers by using sigqueue() instead of the usual kill().
http://man7.org/linux/man-pages/man2/sigqueue.2.html
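For illustration, a minimal sketch of the sigqueue() route follows, assuming a single-threaded POSIX process that signals itself. The names on_usr1, received and send_to_self, and the choice of SIGUSR1, are made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Last payload seen by the handler; -1 means "nothing received yet". */
static volatile sig_atomic_t received = -1;

/* Three-argument handler form, selected by SA_SIGINFO. */
static void on_usr1(int signo, siginfo_t *info, void *context)
{
    (void)signo;
    (void)context;
    received = info->si_value.sival_int;  /* value attached by sigqueue() */
}

/* Hypothetical helper: queues SIGUSR1 to this process with `payload`
 * attached and returns what the handler recorded, or -1 on error. */
int send_to_self(int payload)
{
    struct sigaction sa;
    union sigval value;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_usr1;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    value.sival_int = payload;
    if (sigqueue(getpid(), SIGUSR1, value) != 0)
        return -1;

    /* In a single-threaded process with SIGUSR1 unblocked, the signal
     * is delivered before sigqueue() returns. */
    return received;
}
```

Note that the value is chosen by whoever sends the signal, not by the code that installs the handler, so this does not help with signals generated by the kernel or by a plain kill().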

Store the names of the files in a global variable and then access it from the handler. The signal handler callback will only be passed one argument: the ID for the actual signal that caused the problem (eg SIGINT, SIGTSTP)
Edit 0: "There must be a rock solid reason for not allowing arguments to the handler." <-- There is an interrupt vector (basically, a set of jump addresses to routines for each possible signal). Given the way that the interrupt is triggered, based on the interrupt vector, a particular function is called. Unfortunately, it's not clear where the memory associated with the variable will be called, and depending on the interrupt that memory may actually be corrupted. There is a way to get around it, but then you can't leverage the existing int 0x80 assembly instruction (which some systems still use)

I think it's better to use SA_SIGINFO in sa_flags, so the handler will have the signature
void signal_handler(int sig, siginfo_t *info, void *secret)
and you can provide your parameters in the siginfo_t.
Ty:HAPPY code

You can use a signal handler which is a method of a class. Then that handler can access member data from that class. I'm not entirely sure what Python does under the covers here around the C signal() call, but it must be re-scoping data?
I was amazed that this works, but it does. Run this and then kill the process from another terminal.
import os, signal, time
class someclass:
    def __init__(self):
        self.myvalue = "something initialized not globally defined"
        # A bound method carries self with it, so the handler can reach instance data
        signal.signal(signal.SIGTERM, self.myHandler)
    def myHandler(self, s, f):
        # WTF u can do this?
        print("HEY I CAUGHT IT, AND CHECK THIS OUT", self.myvalue)
print("Making an object")
a = someclass()
while 1:
    print("sleeping. Kill me now.")
    time.sleep(60)
