I am trying to handle a SIGFPE signal but my program just crashes or runs forever. I HAVE to use signal() and not the other ones like sigaction().
So in my code I have:
#include <stdio.h>
#include <signal.h>
void handler(int signum)
{
// Do stuff here then return to execution below
}
int main()
{
signal(SIGFPE, handler);
int i, j;
for(i = 0; i < 10; i++)
{
// Call signal handler for SIGFPE
j = i / 0;
}
printf("After for loop");
return 0;
}
Basically, I want to go into the handler every time there is a division by 0. It should do whatever it needs to inside the handler() function then continue the next iteration of the loop.
This should also work for other signals that need to be handled. Any help would be appreciated.
If you have to use signal() to handle SIGFPE, or any other signal that your own code triggers synchronously through a faulting CPU instruction, the behavior is only defined if you either exit the program from the signal handler or use longjmp to get out of it.
Also note the exact placement of the restore functions, at the end of the computation branch but at the start of the handle branch.
Unfortunately, you can't use signal() like this at all; the second invocation causes the code to fall down. You must use sigaction if you intend to handle the signal more than once.
#include <stdio.h>
#include <signal.h>
#include <setjmp.h>
#include <string.h>
jmp_buf fpe;
void handler(int signum)
{
// Do stuff here then return to execution below
longjmp(fpe, 1);
}
int main()
{
volatile int i, j;
for(i = 0; i < 10; i++)
{
// Call signal handler for SIGFPE
struct sigaction act;
struct sigaction oldact;
memset(&act, 0, sizeof(act));
act.sa_handler = handler;
act.sa_flags = SA_NODEFER; /* SA_NOMASK is an obsolete, non-standard synonym for SA_NODEFER */
sigaction(SIGFPE, &act, &oldact);
if (0 == setjmp(fpe))
{
j = i / 0;
sigaction(SIGFPE, &oldact, &act);
} else {
sigaction(SIGFPE, &oldact, &act);
/* handle SIGFPE */
}
}
printf("After for loop");
return 0;
}
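If signal() really is a hard requirement, one possible shape is sketched below. It is an illustration under assumptions, not the code from the answer above: the helper name count_recovered_faults is hypothetical, raise(SIGFPE) stands in for the faulting division so the example does not rely on undefined behavior, and the handler is re-installed on every iteration in case signal() resets it after delivery (System V semantics). Passing a non-zero second argument to sigsetjmp saves the signal mask, so siglongjmp restores it and SIGFPE is not left blocked.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>

static sigjmp_buf fpe_env;

/* Leave the handler by jumping back to the sigsetjmp call site. */
static void on_fpe(int signum)
{
    (void)signum;
    siglongjmp(fpe_env, 1);
}

/* Hypothetical helper: triggers SIGFPE once per iteration and
 * returns how many times the recovery branch was taken. */
int count_recovered_faults(int iterations)
{
    /* volatile: both are modified between sigsetjmp and siglongjmp */
    volatile int recovered = 0;
    volatile int i;

    for (i = 0; i < iterations; i++) {
        /* Re-install each time; some signal() variants reset the handler. */
        signal(SIGFPE, on_fpe);
        if (sigsetjmp(fpe_env, 1) == 0) {
            raise(SIGFPE);      /* stand-in for j = i / 0 */
        } else {
            recovered++;        /* "handle SIGFPE" branch */
        }
    }
    signal(SIGFPE, SIG_DFL);
    return recovered;
}
```

Calling count_recovered_faults(10) should take the recovery branch on each of the ten iterations and then fall out of the loop normally.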
Caveat: Sorry to rain on the parade, but you really don't want to do this.
It is perfectly valid to trap [externally generated] signals like SIGINT, SIGTERM, SIGHUP etc. to allow graceful cleanup and termination of a program that may have files open that are partially written to.
However, internally generated signals, such as SIGILL, SIGBUS, SIGSEGV and SIGFPE are very hard to recover from meaningfully. The first three are bugs--pure and simple. And, IMO, the SIGFPE is also a hard bug as well.
After such a signal, your program is in an unsafe and indeterminate state. Even trapping the signal and doing longjmp/siglongjmp doesn't fix this.
And, there is no way to tell exactly how bad the damage is. Or, how bad the damage will become if the program tries to proceed.
If you get SIGFPE, was it for a floating point calculation [which you might be able to smooth over]? Or was it for an integer divide-by-zero? What calculation was being done? And where? You don't know.
Trying to continue can sometimes cause 10x the damage because now the program is out of control. After recovery, the program may be okay, but it may not be. So the reliability of the program after the event cannot be determined with any degree of certainty.
What were the events (i.e. the calculations) that led up to the SIGFPE? Maybe it's not merely a single divide, but the chain of calculations that led up to the value being zero. Where did these values go? Will these now suspect values be used by code after the recovery operation has taken place?
For example, the program might overwrite the wrong file because the failed calculation was somehow involved in selecting the file descriptor that a caller is going to use.
Or, you leak memory. Or, corrupt the heap. Or, was the error within the heap allocation code itself?
Consider the following function:
void
myfunc(char *file)
{
int fd;
fd = open(file,O_WRONLY);
while (1) {
// do stuff ...
// write to the file
write(fd,buf,len);
// do more stuff ...
// generate SIGFPE ...
x = y / z;
}
close(fd);
}
Even with a signal handler that does siglongjmp, the file that myfunc was writing to is now corrupted/truncated. And, the file descriptor won't be closed.
Or, what if myfunc was reading from the file and saving the data to some array. That array is only partially filled. Now, you get SIGFPE. This is intercepted by the signal handler which does siglongjmp.
One of the callers of myfunc does the sigsetjmp to "catch" this. But, what can it do? The caller has no idea how bad things are. It might assume that the buffer myfunc was reading into is fully formed and write it out to a different file. That other file has now become corrupted.
UPDATE:
Oops, forgot to mention undefined behavior ...
Normally, we associate UB, such as writing past the end of an array, with a segfault [SIGSEGV]. But, what if it causes SIGFPE instead?
It's no longer just a "bad calculation" -- we're trapping [and ignoring] UB at the earliest detection point. If we do recovery, the next usage could be worse.
Here's an example:
// assume these are ordered in memory as if they were part of the same struct:
int x[10];
int y;
int z;
void
myfunc(void)
{
// initialize
y = 23;
z = 37;
// do stuff ...
// generate UB -- we run one past the end of x and zero out y
for (int i = 0; i <= 10; ++i)
x[i] = 0;
// do more stuff ...
// generate SIGFPE ...
z /= y;
// do stuff ...
// do something _really_ bad with y that causes a segfault or _worse_
// sends a space rocket off-course ...
}
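Given how little can be trusted after the fault, the cheaper route is often to reject bad operands before dividing instead of trapping SIGFPE afterwards. A minimal sketch, assuming a hypothetical helper named checked_div; it rejects a zero divisor and INT_MIN / -1, the two integer divisions that commonly trap:

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores num / den in *quotient and returns true,
 * or returns false (leaving *quotient untouched) when the division
 * would be undefined. */
bool checked_div(int num, int den, int *quotient)
{
    if (den == 0)
        return false;               /* division by zero */
    if (num == INT_MIN && den == -1)
        return false;               /* result does not fit in an int */
    *quotient = num / den;
    return true;
}
```

The caller then decides what a failed division means at the point where it still knows which calculation was being done, which is exactly the information a signal handler lacks.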
Related
I have a simple program using signal with the user's handlers.
#include <signal.h>
#include <stdio.h>
#include <zconf.h>
int x = 0;
int i = 3;
void catcher3(int signum) {
i = 1;
}
void catcher2(int signum) {
// Stuck in infinity loop here.
// Happens even with i == 0
if (i != 0) {
x = 5;
}
}
void catcher1(int signum) {
printf("i = %d\n", i);
i--;
if (i == 0) {
signal(SIGFPE, catcher2);
signal(SIGTERM, catcher3);
}
}
int main() {
signal(SIGFPE, catcher1);
x = 10 / x;
printf("Goodbye");
}
While I expect it to print:
3
2
1
Goodbye
It actually prints:
3
2
1
# Infinity loop within catcher2
My questions are:
After a user handler like catcher1 runs, at which point does the code resume? I would expect it to continue execution, but it re-runs the signal handler.
What causes the infinite loop?
How can I fix it?
Why doesn't sending SIGTERM print "Goodbye"? (kill -s TERM <pid>)
As pointed out by AProgrammer, the program doesn't necessarily re-read x after returning from the handler, even if x is marked volatile (which it should be anyway). This is because execution resumes at the offending instruction, and the read from memory and the actual division can be separate instructions.
To get around this, you have to resume execution at a point before x was read from memory.
You can modify your program as follows:
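#include <setjmp.h>
#include <signal.h>
#include <stdio.h>
jmp_buf fpe;
volatile int x = 0; // Notice the volatile
volatile int i = 3;
void catcher2(int signum) {
    if (i != 0) {
        x = 5;
        longjmp(fpe, 1); // jump back to the setjmp call in main
    }
}
int main() {
    signal(SIGFPE, catcher1);
    setjmp(fpe); // execution resumes here, before x is read again
    x = 10 / x;
    printf("Goodbye");
}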
Rest of the functions can remain the same.
You should also not call printf from a signal handler, because it is not async-signal-safe. Use write directly to print debug messages instead (the - 1 keeps the terminating NUL byte out of the output):
write(1, "SIGNAL\n", sizeof("SIGNAL\n") - 1);
The handling of signals is complex and full of implementation-defined, unspecified and undefined behavior. If you want to be portable, there are in fact very few things that you can do: mostly reading and writing a volatile sig_atomic_t and calling _Exit. Depending on the signal number, it is often undefined if you leave the signal handler in any way other than calling _Exit.
In your case, I think FPE is one of those signals for which leaving the signal handler normally is UB. The best I can see is restarting the machine instruction which triggered the signal. Few architectures, and last I looked x86 was not one of them, provide a way to do 10/x without loading x into a register; that means that restarting the instruction will always re-trigger the signal, even if you modify x and x is a volatile sig_atomic_t.
Usually longjmp is also able to leave a signal handler. @Bodo confirmed that by using setjmp and longjmp to restart the division, you can get the behavior you want.
Note: on Unix there is another set of functions, sigaction, siglongjmp and others, which is better to use. In fact I don't recommend using anything else in any serious program.
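The portable subset described above (set a volatile sig_atomic_t, optionally call write) can be sketched as follows. This is only an illustration: the names got_term, on_term and deliver_sigterm_to_self are hypothetical, and raise(SIGTERM) is used so that the signal is delivered at a known point and returning from the handler is well defined.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <unistd.h>

/* The only kind of object a portable handler may safely write to. */
static volatile sig_atomic_t got_term = 0;

static void on_term(int signum)
{
    static const char msg[] = "SIGTERM\n";
    ssize_t written;

    (void)signum;
    got_term = 1;
    /* write() is async-signal-safe, unlike printf(). */
    written = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)written;
}

/* Hypothetical helper: sends SIGTERM to the current process and
 * reports whether the handler observed it (1) or not (0). */
int deliver_sigterm_to_self(void)
{
    got_term = 0;
    if (signal(SIGTERM, on_term) == SIG_ERR)
        return -1;
    raise(SIGTERM);             /* handler runs before raise() returns */
    signal(SIGTERM, SIG_DFL);
    return got_term;
}
```

The main program polls the flag at a convenient point and does the real work (such as printing "Goodbye") outside the handler.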
I'm new at signal handling in Unix through C and I have been looking at some tutorials on it (out of pure interest).
My question is: is it possible to continue execution of a program past the point where a signal is handled?
I understand that the signal handling function does the cleanup but in the spirit of exception handling (such as in C++), is it possible for that signal to be handled in the same fashion and for the program to continue running normally?
At the moment catch goes in an infinite loop (presumably a way to quit would be to call exit(1) ).
My intention would be for b to be assigned 1 and for the program to finish gracefully (if that is possible of course).
Here's my code:
#include <signal.h>
#include <stdio.h>
int a = 5;
int b = 0;
void catch(int sig)
{
printf("Caught the signal, will handle it now\n");
b = 1;
}
int main(void)
{
signal(SIGFPE, catch);
int c = a / b;
return 0;
}
Also, as C is procedural, how come the signal handler declared before the offending statement is actually called after the latter has executed?
And finally, in order for the handling function to do its cleanup properly, all the variables that need to be cleaned up in the event of an exception need to be declared prior to the function, right?
Thanks in advance for your answers and apologies if some of the above is very obvious.
Yes, that's what signal handlers are for. But some signals need to be handled specially in order to allow the program to continue (e.g. SIGSEGV, SIGFPE, …).
See the manpage of sigaction:
According to POSIX, the behavior of a process is undefined after it ignores a SIGFPE, SIGILL, or SIGSEGV signal that was not
generated by kill(2) or raise(3). Integer division by zero has undefined result. On some architectures it will generate a
SIGFPE signal. (Also dividing the most negative integer by -1 may generate SIGFPE.) Ignoring this signal might lead to an
endless loop.
Right now, you are effectively ignoring the signal, because you do nothing to prevent it from happening again. You need to obtain the execution context in the signal handler and fix it up manually, which involves overwriting some registers.
If SA_SIGINFO is specified in sa_flags, then sa_sigaction (instead of
sa_handler) specifies the signal-handling function for signum. This
function receives the signal number as its first argument, a pointer
to a siginfo_t as its second argument and a pointer to a ucontext_t
(cast to void *) as its third argument. (Commonly, the handler
function doesn't make any use of the third argument. See
getcontext(2) for further information about ucontext_t.)
The context allows access to the registers at the time of fault and needs to be changed to allow your program to continue. See this lkml post. As mentioned there, siglongjmp might also be an option. The post also offers a rather reusable solution for handling the error, without having to make variables global etc.:
And because you handle it youself, you have any flexibility you want
to with error handling. For example, you can make the fault handler
jump to some specified point in your function with something like
this:
__label__ error_handler;
__asm__("divl %2"
:"=a" (low), "=d" (high)
:"g" (divisor), "c" (&&error_handler))
... do normal cases ...
error_handler:
... check against zero division or overflow, so whatever you want to ..
Then, your handler for SIGFPE needs only to do something like
context.eip = context.ecx;
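For reference, installing the three-argument SA_SIGINFO form of handler described in the man page excerpt might look like the sketch below. The names on_fpe_info and probe_sigfpe_info are hypothetical, and the signal is generated with raise(SIGFPE) so that returning from the handler is well defined; for a real integer division fault on Linux, si_code would typically be FPE_INTDIV instead.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t last_signo = 0;
static volatile sig_atomic_t last_code = 0;

/* SA_SIGINFO handler: the third argument is a ucontext_t * cast to
 * void *, unused in this sketch. */
static void on_fpe_info(int signo, siginfo_t *info, void *ucontext)
{
    (void)ucontext;
    last_signo = signo;
    last_code = info->si_code;  /* why the signal was generated */
}

/* Hypothetical helper: raises SIGFPE once, stores the observed si_code
 * in *code_out and returns the signal number seen by the handler. */
int probe_sigfpe_info(int *code_out)
{
    struct sigaction sa, old;

    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    sa.sa_sigaction = on_fpe_info;   /* not sa_handler */
    sa.sa_flags = SA_SIGINFO;
    if (sigaction(SIGFPE, &sa, &old) == -1)
        return -1;

    raise(SIGFPE);

    sigaction(SIGFPE, &old, NULL);   /* restore previous disposition */
    *code_out = last_code;
    return last_signo;
}
```

Checking si_code is one way for a handler to tell a genuine arithmetic fault apart from a SIGFPE that another process sent with kill.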
If you know what you are doing, you can set the instruction pointer to point right after the offending instruction. Below is my example for x86 (32bit and 64bit). Don't try at home or in real products !!!
#define _GNU_SOURCE /* Bring REG_XXX names from /usr/include/sys/ucontext.h */
#include <stdio.h>
#include <string.h>
#include <signal.h>
#include <ucontext.h>
static void sigaction_segv(int signal, siginfo_t *si, void *arg)
{
ucontext_t *ctx = (ucontext_t *)arg;
/* We are on linux x86, the returning IP is stored in RIP (64bit) or EIP (32bit).
In this example, the length of the offending instruction is 6 bytes.
So we skip the offender ! */
#if __WORDSIZE == 64
printf("Caught SIGSEGV, addr %p, RIP 0x%lx\n", si->si_addr, ctx->uc_mcontext.gregs[REG_RIP]);
ctx->uc_mcontext.gregs[REG_RIP] += 6;
#else
printf("Caught SIGSEGV, addr %p, EIP 0x%x\n", si->si_addr, ctx->uc_mcontext.gregs[REG_EIP]);
ctx->uc_mcontext.gregs[REG_EIP] += 6;
#endif
}
int main(void)
{
struct sigaction sa;
memset(&sa, 0, sizeof(sa));
sigemptyset(&sa.sa_mask);
sa.sa_sigaction = sigaction_segv;
sa.sa_flags = SA_SIGINFO;
sigaction(SIGSEGV, &sa, NULL);
/* Generate a seg fault */
*(int *)NULL = 0;
printf("Back to normal execution.\n");
return 0;
}
In general, yes, execution continues after the handler returns. But if the signal was caused by a hardware error (such as a floating point exception or a segmentation fault), simply returning does not undo that error: the faulting instruction is typically retried and fails again, so unless the handler repairs the cause or transfers control elsewhere, your program will not make progress.
In other words, you have to distinguish between signals and things that cause signals. Signals by themselves are perfectly fine and handlable, but they don't always let you fix errors that cause signals.
(Some signals are special, such as ABRT and STOP, in the sense that even if you just raise such a signal manually with kill, you still can't "prevent its effects". And of course KILL cannot even be handled at all.)
Here is my code,
#include<signal.h>
#include<stdio.h>
int main(int argc,char ** argv)
{
char *p=NULL;
signal(SIGSEGV,SIG_IGN); //Ignoring the Signal
printf("%d",*p);
printf("Stack Overflow"); //This has to be printed. Right?
return 0;
}
While executing the code, I'm getting a segmentation fault. I ignored the signal using SIG_IGN, so I shouldn't get a segmentation fault, right? Then the printf() statement after the one printing the value of *p must be executed too, right?
Your code is ignoring SIGSEGV instead of catching it. Recall that the instruction that triggered the signal is restarted after handling the signal. In your case, handling the signal didn't change anything so the next time round the offending instruction is tried, it fails the same way.
If you intend to catch the signal change this
signal(SIGSEGV, SIG_IGN);
to this
signal(SIGSEGV, sighandler);
You should probably also use sigaction() instead of signal(). See relevant man pages.
In your case the offending instruction is the one which tries to dereference the NULL pointer.
printf("%d", *p);
What follows is entirely dependent on your platform.
You can use gdb to establish what particular assembly instruction triggers the signal. If your platform is anything like mine, you'll find the instruction is
movl (%rax), %esi
with the rax register holding the value 0, i.e. NULL. One (non-portable!) way to fix this in your signal handler is to use the third argument your signal handler gets, i.e. the user context. Here is an example:
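#define _GNU_SOURCE /* exposes the REG_* register names in <ucontext.h> */
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <ucontext.h>
int *p = NULL;
int n = 100;
/* Three-argument handler; it is only called this way with SA_SIGINFO. */
void sighandler(int signo, siginfo_t *si, void *arg)
{
    ucontext_t *context = (ucontext_t *)arg;
    printf("Handler executed for signal %d\n", signo);
    /* x86-64 Linux only: point rax at n so the retried load succeeds. */
    context->uc_mcontext.gregs[REG_RAX] = (greg_t)&n;
}
int main(int argc, char **argv)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sigemptyset(&sa.sa_mask);
    sa.sa_sigaction = sighandler;
    sa.sa_flags = SA_SIGINFO;
    sigaction(SIGSEGV, &sa, NULL);
    printf("%d\n", *p); // ... movl (%rax), %esi ...
    return 0;
}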
This program displays:
Handler executed for signal 11
100
It first causes the handler to be executed by attempting to dereference a NULL address. Then the handler fixes the issue by setting rax to the address of variable n. Once the handler returns the system retries the offending instruction and this time succeeds. printf() receives 100 as its second argument.
I strongly recommend against using such non-portable solutions in your programs, though.
You can ignore the signal, but you have to do something about it. I believe what you are doing in the code posted (ignoring SIGSEGV via SIG_IGN) won't work at all, for reasons that become obvious from the last step in the list below.
When you do something that causes the kernel to send you a SIGSEGV:
If you don't have a signal handler, the kernel kills the process and that's that.
If you do have a signal handler:
Your handler gets called.
The kernel restarts the offending operation.
So if you don't do anything about it, it will just loop continuously. If you do catch SIGSEGV and you don't exit, thereby interfering with the normal flow, you must:
fix things such that the offending operation doesn't restart, or
fix the memory layout such that what was offending will be OK on the next run.
Another option is to bracket the risky operation with setjmp/longjmp, i.e.
#include <setjmp.h>
#include <signal.h>
#include <stdio.h>
static jmp_buf jbuf;
static void catch_segv(int signum)
{
longjmp(jbuf, 1);
}
int main()
{
int *p = NULL;
signal(SIGSEGV, catch_segv);
if (setjmp(jbuf) == 0) {
printf("%d\n", *p);
} else {
printf("Ouch! I crashed!\n");
}
return 0;
}
The setjmp/longjmp pattern here is similar to a try/catch block. It's very risky though, and won't save you if your risky function overruns the stack, or allocates resources but crashes before they're freed. Better to check your pointers and not indirect through bad ones.
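The try/catch analogy can be packaged as a small wrapper. The sketch below is hypothetical (try_call, raises_segv and does_nothing are made-up names) and uses sigsetjmp/siglongjmp so the signal mask is restored on the way out; raise(SIGSEGV) stands in for a real bad access. It inherits every risk listed above: nothing the risky function allocated or half-wrote is cleaned up.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf try_env;
static volatile sig_atomic_t caught_signo;

static void bail_out(int signo)
{
    caught_signo = signo;
    siglongjmp(try_env, 1);
}

/* Hypothetical wrapper: runs risky() and returns 0 if it completed,
 * or the number of the signal (SIGSEGV or SIGFPE) that aborted it. */
int try_call(void (*risky)(void))
{
    struct sigaction sa, old_segv, old_fpe;

    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = bail_out;
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGFPE, &sa, &old_fpe);

    caught_signo = 0;
    if (sigsetjmp(try_env, 1) == 0)
        risky();                    /* the "try" body */

    /* Disestablish the handlers as soon as the risky part is over. */
    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGFPE, &old_fpe, NULL);
    return caught_signo;
}

/* Hypothetical stand-ins for a crashing and a well-behaved function. */
void raises_segv(void) { raise(SIGSEGV); }
void does_nothing(void) { }
```

With these stand-ins, try_call(raises_segv) reports SIGSEGV while try_call(does_nothing) reports 0.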
Trying to ignore or handle a SIGSEGV is the wrong approach. A SIGSEGV triggered by your program always indicates a bug. Either in your code or code you delegate to. Once you have a bug triggered, anything could happen. There is no reasonable "clean-up" or fix action the signal handler can perform, because it can not know where the signal was triggered or what action to perform. The best you can do is to let the program fail fast, so a programmer will have a chance to debug it when it is still in the immediate failure state, rather than have it (probably) fail later when the cause of the failure has been obscured. And you can cause the program to fail fast by not trying to ignore or handle the signal.
Is it possible to restore the normal execution flow of a C program, after the Segmentation Fault error?
struct A {
int x;
};
A* a = 0;
a->x = 123; // this is where segmentation violation occurs
// after handling the error I want to get back here:
printf("normal execution");
// the rest of my source code....
I want a mechanism similar to NullPointerException that is present in Java, C# etc.
Note: Please don't tell me that there is an exception handling mechanism in C++, because I know that, and don't tell me I should check every pointer before assignment etc.
What I really want to achieve is to get back to the normal execution flow as in the example above. I know some actions can be undertaken using POSIX signals. What should it look like? Other ideas?
#include <unistd.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <signal.h>
#include <stdlib.h>
#include <ucontext.h>
void safe_func(void)
{
puts("Safe now ?");
exit(0); //can't return to main, it's where the segfault occured.
}
void
handler (int cause, siginfo_t * info, void *uap)
{
/* For testing only. Never call stdio functions in a signal handler otherwise. */
printf ("SIGSEGV raised at address %p\n", info->si_addr);
ucontext_t *context = uap;
/*On my particular system, compiled with gcc -O2, the offending instruction
generated for "*f = 16;" is 6 bytes. Lets try to set the instruction
pointer to the next instruction (general register 14 is EIP, on linux x86) */
context->uc_mcontext.gregs[14] += 6;
//alternatively, try to jump to a "safe place"
//context->uc_mcontext.gregs[14] = (unsigned int)safe_func;
}
int
main (int argc, char *argv[])
{
struct sigaction sa;
sa.sa_sigaction = handler;
int *f = NULL;
sigemptyset (&sa.sa_mask);
sa.sa_flags = SA_SIGINFO;
if (sigaction (SIGSEGV, &sa, 0)) {
perror ("sigaction");
exit(1);
}
//cause a segfault
*f = 16;
puts("Still Alive");
return 0;
}
$ ./a.out
SIGSEGV raised at address (nil)
Still Alive
I would beat someone with a bat if I saw something like this in production code though; it's an ugly, for-fun hack. You'll have no idea whether the segfault has corrupted some of your data, you'll have no sane way of recovering and knowing that everything is OK now, and there's no portable way of doing this. The only mildly sane thing you could do is try to log an error (use write() directly, not any of the stdio functions, which are not signal safe) and perhaps restart the program. For those cases you're much better off writing a supervisor process that monitors a child process's exit, logs it and starts a new child process.
You can catch segmentation faults using a signal handler, and decide to continue the execution of the program (at your own risk).
The signal name is SIGSEGV.
You will have to use the sigaction() function, from the signal.h header.
Basically, it works the following way:
struct sigaction sa1;
struct sigaction sa2;
sa1.sa_handler = your_handler_func;
sa1.sa_flags = 0;
sigemptyset( &sa1.sa_mask );
sigaction( SIGSEGV, &sa1, &sa2 );
Here's the prototype of the handler function:
void your_handler_func( int id );
As you can see, the handler doesn't need to return a value. The program's execution will continue once it returns, unless you decide to stop it yourself from within the handler.
"All things are permissible, but not all are beneficial" - typically a segfault is game over for a good reason... A better idea than picking up where it was would be to keep your data persisted (database, or at least a file system) and enable it to pick up where it left off that way. This will give you much better data reliability all around.
See R.'s comment on MacMade's answer.
Expanding on what he said (after handling SIGSEGV or, for that matter, SIGFPE, the CPU+OS can return you to the offending instruction), here is a test I have for division by zero handling:
#include <stdio.h>
#include <limits.h>
#include <string.h>
#include <signal.h>
#include <setjmp.h>
static jmp_buf context;
static void sig_handler(int signo)
{
/* XXX: don't do this, not reentrant */
printf("Got SIGFPE\n");
/* avoid infinite loop */
longjmp(context, 1);
}
int main()
{
int a;
struct sigaction sa;
memset(&sa, 0, sizeof(struct sigaction));
sa.sa_handler = sig_handler;
sa.sa_flags = SA_RESTART;
sigaction(SIGFPE, &sa, NULL);
if (setjmp(context)) {
/* If this one was on setjmp's block,
* it would need to be volatile, to
* make sure the compiler reloads it.
*/
sigset_t ss;
/* Make sure to unblock SIGFPE, according to POSIX it
* gets blocked when calling its signal handler.
* sigsetjmp()/siglongjmp would make this unnecessary.
*/
sigemptyset(&ss);
sigaddset(&ss, SIGFPE);
sigprocmask(SIG_UNBLOCK, &ss, NULL);
goto skip;
}
a = 10 / 0;
skip:
printf("Exiting\n");
return 0;
}
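The remark in the comments above about SIGFPE staying blocked can be checked directly. The sketch below is an illustration with hypothetical names (escape, sigfpe_blocked_after_escape): it leaves a SIGFPE handler via siglongjmp and then asks sigprocmask whether SIGFPE is still blocked, once with the mask saved by sigsetjmp and once without. raise(SIGFPE) stands in for the division.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf escape_env;

static void escape(int signo)
{
    (void)signo;
    siglongjmp(escape_env, 1);
}

/* Hypothetical helper: returns 1 if SIGFPE is still blocked after
 * jumping out of its handler, 0 if not. savemask is passed straight
 * to sigsetjmp. */
int sigfpe_blocked_after_escape(int savemask)
{
    struct sigaction sa, old;
    sigset_t current, fpe_only;
    int blocked;

    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = escape;     /* no SA_NODEFER: SIGFPE is blocked in the handler */
    sigaction(SIGFPE, &sa, &old);

    if (sigsetjmp(escape_env, savemask) == 0)
        raise(SIGFPE);

    /* Query the current mask without changing it. */
    sigprocmask(SIG_SETMASK, NULL, &current);
    blocked = sigismember(&current, SIGFPE);

    /* Clean up: unblock SIGFPE and restore the old disposition. */
    sigemptyset(&fpe_only);
    sigaddset(&fpe_only, SIGFPE);
    sigprocmask(SIG_UNBLOCK, &fpe_only, NULL);
    sigaction(SIGFPE, &old, NULL);
    return blocked;
}
```

With savemask set to 0 the mask from inside the handler survives the jump, which is why the test program above has to unblock SIGFPE by hand; with a non-zero savemask the manual sigprocmask call becomes unnecessary.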
No, it's not possible, in any logical sense, to restore normal execution following a segmentation fault. Your program just tried to dereference a null pointer. How are you going to carry on as normal if something your program expects to be there isn't? It's a programming bug, the only safe thing to do is to exit.
Consider some of the possible causes of a segmentation fault:
you forgot to assign a legitimate value to a pointer
a pointer has been overwritten possibly because you are accessing heap memory you have freed
a bug has corrupted the heap
a bug has corrupted the stack
a malicious third party is attempting a buffer overflow exploit
malloc returned null because you have run out of memory
Only in the first case is there any kind of reasonable expectation that you might be able to carry on.
If you have a pointer that you want to dereference but it might legitimately be null, you must test it before attempting the dereference. I know you don't want me to tell you that, but it's the right answer, so tough.
Edit: here's an example to show why you definitely do not want to carry on with the next instruction after dereferencing a null pointer:
void foobarMyProcess(struct SomeStruct* structPtr)
{
char* aBuffer = structPtr->aBigBufferWithLotsOfSpace; // if structPtr is NULL, will SIGSEGV
//
// if you SIGSEGV and come back to here, at this point aBuffer contains whatever garbage was in memory at the point
// where the stack frame was created
//
strcpy(aBuffer, "Some longish string"); // You've just written the string to some random location in your address space
// good luck with that!
}
Register a handler like this, and when a segfault occurs your code will execute segv_handler and then resume where it was, which means at the faulting instruction:
void segv_handler(int signum)
{
    // Do what you want here
}
signal(SIGSEGV, segv_handler);
There is no meaningful way to recover from a SIGSEGV unless you know EXACTLY what caused it, and there's no way to do that in standard C. It may be possible (conceivably) in an instrumented environment, like a C-VM (?). The same is true for all program error signals; if you try to block/ignore them, or establish handlers that return normally, your program will probably break horribly when they happen unless perhaps they're generated by raise or kill.
Just do yourself a favour and take error cases into account.
In POSIX, your process will get sent SIGSEGV when you do that. The default handler just crashes your program. You can add your own handler using the signal() call. You can implement whatever behaviour you like by handling the signal yourself.
You can use the SetUnhandledExceptionFilter() function (in windows), but even to be able to skip the "illegal" instruction you will need to be able to decode some assembler opcodes. And, as glowcoder said, even if it would "comment out" in runtime the instructions that generates segfaults, what will be left from the original program logic (if it may be called so)?
Everything is possible, but it doesn't mean that it has to be done.
Unfortunately, you can't in this case. The buggy function has undefined behavior and could have corrupted your program's state.
What you CAN do is run the functions in a new process. If this process dies with a return code that indicates SIGSEGV, you know it has failed.
You could also rewrite the functions yourself.
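Running the suspect function in a separate process can be sketched with fork and waitpid. The names run_isolated, crash_with_segv and return_normally are hypothetical, and the crash is simulated with raise(SIGSEGV) after resetting the disposition to the default; a real deployment would also need some way to get results back from the child, which this sketch leaves out.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs fn() in a child process. Returns 0 if the
 * child exited normally, the terminating signal number if it was
 * killed by a signal, or -1 if fork/waitpid failed. */
int run_isolated(void (*fn)(void))
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {             /* child: do the risky work */
        fn();
        _exit(0);               /* skip atexit handlers and stdio flush */
    }
    while (waitpid(pid, &status, 0) == -1) {
        if (errno != EINTR)
            return -1;
    }
    if (WIFSIGNALED(status))
        return WTERMSIG(status);    /* e.g. SIGSEGV */
    return 0;
}

/* Hypothetical stand-ins for a buggy and a healthy function. */
void crash_with_segv(void)
{
    signal(SIGSEGV, SIG_DFL);
    raise(SIGSEGV);
}

void return_normally(void) { }
```

Because the crash happens in the child's address space, whatever state it corrupted disappears with the child, and the parent can log the failure and decide whether to retry.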
I can see a case for recovering from a segmentation violation: if you are handling events in a loop and one of these events causes a segmentation violation, then you would only want to skip that event and continue processing the remaining events. In my eyes segmentation violations are much the same as NullPointerExceptions in Java. Yes, the state will be inconsistent and unknown after either of these, but in some cases you would like to handle the situation and carry on. For instance, in algo trading you would pause the execution of an order and allow a trader to take over manually, without crashing the entire system and ruining all the other orders.
The best solution is to box in each unsafe access this way:
#include <iostream>
#include <signal.h>
#include <setjmp.h>
static jmp_buf buf;
int counter = 0;
void signal_handler(int)
{
longjmp(buf,0);
}
int main()
{
signal(SIGSEGV,signal_handler);
setjmp(buf);
if(counter++ == 0){ // if we didn't try before
*(int*)(0x1215) = 10; // write to an arbitrary, most likely unmapped address
}
std::cout<<"i am alive !!"<<std::endl; // we will get into here in any case
system("pause");
return 0;
}
With this, your program will not crash on almost any OS.
This glibc manual gives you a clear picture of how to write signal handlers.
A signal handler is just a function that you compile together with the rest
of the program. Instead of directly invoking the function, you use signal
or sigaction to tell the operating system to call it when a signal arrives.
This is known as establishing the handler.
In your case you will have to wait for the SIGSEGV indicating a segmentation fault. The list of other signals can be found here.
Signal handlers are broadly classified into two categories:
You can have the handler function note that the signal arrived by tweaking some global data structures, and then return normally.
You can have the handler function terminate the program or transfer control to a point where it can recover from the situation that caused the signal.
SIGSEGV comes under program error signals.
I want to write a signal handler to catch SIGSEGV.
I protect a block of memory for read or write using
char *buffer;
char *p;
char a;
int pagesize = 4096;
mprotect(buffer,pagesize,PROT_NONE);
This protects pagesize bytes of memory starting at buffer against any reads or writes.
Second, I try to read the memory:
p = buffer;
a = *p;
This will generate a SIGSEGV, and my handler will be called.
So far so good. My problem is that, once the handler is called, I want to change the access rights of the memory by doing
mprotect(buffer,pagesize,PROT_READ);
and continue normal functioning of my code. I do not want to exit the function.
On future writes to the same memory, I want to catch the signal again and modify the write rights and then record that event.
Here is the code:
#include <signal.h>
#include <stdio.h>
#include <malloc.h>
#include <stdlib.h>
#include <errno.h>
#include <sys/mman.h>
#define handle_error(msg) \
do { perror(msg); exit(EXIT_FAILURE); } while (0)
char *buffer;
int flag=0;
static void handler(int sig, siginfo_t *si, void *unused)
{
printf("Got SIGSEGV at address: 0x%lx\n",(long) si->si_addr);
printf("Implements the handler only\n");
flag=1;
//exit(EXIT_FAILURE);
}
int main(int argc, char *argv[])
{
char *p; char a;
int pagesize;
struct sigaction sa;
sa.sa_flags = SA_SIGINFO;
sigemptyset(&sa.sa_mask);
sa.sa_sigaction = handler;
if (sigaction(SIGSEGV, &sa, NULL) == -1)
handle_error("sigaction");
pagesize=4096;
/* Allocate a buffer aligned on a page boundary;
initial protection is PROT_READ | PROT_WRITE */
buffer = memalign(pagesize, 4 * pagesize);
if (buffer == NULL)
handle_error("memalign");
printf("Start of region: 0x%lx\n", (long) buffer);
printf("Start of region: 0x%lx\n", (long) buffer+pagesize);
printf("Start of region: 0x%lx\n", (long) buffer+2*pagesize);
printf("Start of region: 0x%lx\n", (long) buffer+3*pagesize);
//if (mprotect(buffer + pagesize * 0, pagesize,PROT_NONE) == -1)
if (mprotect(buffer + pagesize * 0, pagesize,PROT_NONE) == -1)
handle_error("mprotect");
//for (p = buffer ; ; )
if(flag==0)
{
p = buffer+pagesize/2;
printf("It comes here before reading memory\n");
a = *p; //trying to read the memory
printf("It comes here after reading memory\n");
}
else
{
if (mprotect(buffer + pagesize * 0, pagesize,PROT_READ) == -1)
handle_error("mprotect");
a = *p;
printf("Now i can read the memory\n");
}
/* for (p = buffer;p<=buffer+4*pagesize ;p++ )
{
//a = *(p);
*(p) = 'a';
printf("Writing at address %p\n",p);
}*/
printf("Loop completed\n"); /* Should never happen */
exit(EXIT_SUCCESS);
}
The problem is that only the signal handler runs and I can't return to the main function after catching the signal.
When your signal handler returns (assuming it doesn't call exit or longjmp or something that prevents it from actually returning), the code will continue at the point the signal occurred, reexecuting the same instruction. Since at this point, the memory protection has not been changed, it will just throw the signal again, and you'll be back in your signal handler in an infinite loop.
So to make it work, you have to call mprotect in the signal handler. Unfortunately, as Steven Schansker notes, mprotect is not async-signal-safe, so you can't safely call it from the signal handler. So, as far as POSIX is concerned, you're screwed.
Fortunately on most implementations (all modern UNIX and Linux variants as far as I know), mprotect is a system call, so is safe to call from within a signal handler, so you can do most of what you want. The problem is that if you want to change the protections back after the read, you'll have to do that in the main program after the read.
Another possibility is to do something with the third argument to the signal handler, which points at an OS and arch specific structure that contains info about where the signal occurred. On Linux, this is a ucontext structure, which contains machine-specific info about the $PC address and other register contents where the signal occurred. If you modify this, you change where the signal handler will return to, so you can change the $PC to be just after the faulting instruction so it won't re-execute after the handler returns. This is very tricky to get right (and non-portable too).
edit
The ucontext structure is defined in <ucontext.h>. Within the ucontext the field uc_mcontext contains the machine context, and within that, the array gregs contains the general register context. So in your signal handler:
ucontext_t *u = (ucontext_t *)unused;
unsigned char *pc = (unsigned char *)u->uc_mcontext.gregs[REG_RIP];
will give you the pc where the exception occurred. You can read it to figure out what instruction it was that faulted, and do something different.
As far as the portability of calling mprotect in the signal handler is concerned, any system that follows either the SVID spec or the BSD4 spec should be safe -- they allow calling any system call (anything in section 2 of the manual) in a signal handler.
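A sketch of the "call mprotect in the handler, then let the read be retried" approach follows. It assumes Linux, where an access to a PROT_NONE page raises SIGSEGV and mprotect is a plain system call; as noted above, POSIX does not promise that calling it from a handler is safe. The names guarded_page, unprotect_on_fault and read_guarded_byte are hypothetical, and mmap is used in place of memalign so that the region is known to be page-aligned.

```c
#define _DEFAULT_SOURCE             /* for MAP_ANONYMOUS on glibc */
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static char *guarded_page;
static size_t guarded_len;
static volatile sig_atomic_t fault_count;

static void unprotect_on_fault(int sig, siginfo_t *si, void *unused)
{
    uintptr_t addr = (uintptr_t)si->si_addr;
    uintptr_t base = (uintptr_t)guarded_page;

    (void)sig;
    (void)unused;
    if (addr < base || addr >= base + guarded_len)
        _exit(2);                   /* a fault we did not arrange: give up */
    fault_count++;
    /* Assumption: mprotect is usable here (true on Linux, not per POSIX). */
    if (mprotect(guarded_page, guarded_len, PROT_READ) == -1)
        _exit(3);
    /* Returning retries the faulting read, which now succeeds. */
}

/* Hypothetical helper: returns the byte read from the guarded page and
 * stores the number of faults taken in *faults_out; -1 on setup error. */
int read_guarded_byte(int *faults_out)
{
    struct sigaction sa, old;
    int value;

    guarded_len = (size_t)sysconf(_SC_PAGESIZE);
    guarded_page = mmap(NULL, guarded_len, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (guarded_page == MAP_FAILED)
        return -1;
    guarded_page[0] = 'x';
    if (mprotect(guarded_page, guarded_len, PROT_NONE) == -1)
        return -1;

    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    sa.sa_sigaction = unprotect_on_fault;
    sa.sa_flags = SA_SIGINFO;
    fault_count = 0;
    if (sigaction(SIGSEGV, &sa, &old) == -1)
        return -1;

    value = *(volatile char *)guarded_page;     /* faults once, then retried */

    sigaction(SIGSEGV, &old, NULL);
    *faults_out = fault_count;
    munmap(guarded_page, guarded_len);
    return value;
}
```

To record later writes as well, the main program would re-protect the page after the read and the handler would pick PROT_READ | PROT_WRITE on the next fault; that part is left out here.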
You've fallen into the trap that all people do when they first try to handle signals. The trap? Thinking that you can actually do anything useful with signal handlers. From a signal handler, you are only allowed to call async-signal-safe functions.
See this CERT advisory as to why and a list of the POSIX functions that are safe.
Note that printf(), which you are already calling, is not on that list.
Nor is mprotect. You're not allowed to call it from a signal handler. It might work, but I can promise you'll run into problems down the road. Be really careful with signal handlers, they're tricky to get right!
EDIT
Since I'm being a portability douchebag at the moment already, I'll point out that you also shouldn't write to shared (i.e. global) variables without taking the proper precautions.
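If the handler still needs to report something like the faulting address, one option that stays within the safe list is to format the number by hand and emit it with write(). A minimal sketch with hypothetical helper names (format_unsigned, report_fault_address); neither function calls anything from stdio:

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: writes the decimal digits of value into out
 * (no terminating NUL) and returns the digit count, or 0 if cap is
 * too small. */
size_t format_unsigned(unsigned long value, char *out, size_t cap)
{
    char tmp[3 * sizeof value];     /* enough digits for any unsigned long */
    size_t n = 0, i;

    do {                            /* digits come out least significant first */
        tmp[n++] = (char)('0' + value % 10);
        value /= 10;
    } while (value != 0);

    if (n > cap)
        return 0;
    for (i = 0; i < n; i++)
        out[i] = tmp[n - 1 - i];    /* reverse into the caller's buffer */
    return n;
}

/* Hypothetical helper meant to be called from a handler, e.g. with
 * (unsigned long)si->si_addr. */
void report_fault_address(unsigned long addr)
{
    static const char prefix[] = "fault at address ";
    char digits[3 * sizeof addr];
    size_t n = format_unsigned(addr, digits, sizeof digits);
    ssize_t r;

    r = write(STDERR_FILENO, prefix, sizeof prefix - 1);
    r = write(STDERR_FILENO, digits, n);
    r = write(STDERR_FILENO, "\n", 1);
    (void)r;
}
```

Both functions touch only local buffers and a constant string, so they do not run into the shared-variable problem mentioned above either.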
You can recover from SIGSEGV on linux. Also you can recover from segmentation faults on Windows (you'll see a structured exception instead of a signal). But the POSIX standard doesn't guarantee recovery, so your code will be very non-portable.
Take a look at libsigsegv.
You should not return from the signal handler, as then behavior is undefined. Rather, jump out of it with longjmp.
This is only okay if the signal is generated in an async-signal-safe function. Otherwise, behavior is undefined if the program ever calls another async-signal-unsafe function. Hence, the signal handler should only be established immediately before it is necessary, and disestablished as soon as possible.
In fact, I know of very few uses of a SIGSEGV handler:
use an async-signal-safe backtrace library to log a backtrace, then die.
in a VM such as the JVM or CLR: check if the SIGSEGV occurred in JIT-compiled code. If not, die; if so, then throw a language-specific exception (not a C++ exception), which works because the JIT compiler knew that the trap could happen and generated appropriate frame unwind data.
clone() and exec() a debugger (do not use fork() – that calls callbacks registered by pthread_atfork()).
Finally, note that any action that triggers SIGSEGV is probably UB, as this is accessing invalid memory. However, this would not be the case if the signal was, say, SIGFPE.
There is a compilation problem using ucontext_t or struct ucontext (present in /usr/include/sys/ucontext.h)
http://www.mail-archive.com/arch-general@archlinux.org/msg13853.html