snprintf in signal handler creates segmentation fault if started with valgrind - c

This very simple C program gives me a segmentation fault when running it with valgrind.
It runs fine when started normally.
It crashes when you send the USR1 signal to the process.
The problem seems to be the way snprintf handles the formatting of the float value, because it works fine if you use a string (%s) or int (%d) format parameter.
P.S. I know you shouldn't call any printf-family function inside a signal handler, but still, why does it only crash with valgrind?
#include <stdio.h>
#include <signal.h>
void sig_usr1(int sig) {
char buf[128];
snprintf(buf, sizeof(buf), "%f", 1.0);
}
int main(int argc, char **argv) {
(void) signal(SIGUSR1, sig_usr1);
while(1);
}

As cnicutar notes, valgrind may have an effect on anything timing related and signal handlers would certainly qualify.
I don't think snprintf is safe to use in a signal handler, so it might be working in the non-valgrind case solely by accident; then valgrind comes in, changes the timing, and you get the flaming death that you were risking without valgrind.
I found a list of functions that are safe in signal handlers (according to POSIX.1-2003) here:
http://linux.die.net/man/2/signal
Yes, the linux.die.net man pages are a bit out of date but the list here (thanks to RedX for finding this one):
https://www.securecoding.cert.org/confluence/display/seccode/SIG30-C.+Call+only+asynchronous-safe+functions+within+signal+handlers
doesn't mention snprintf either, except in the context of OpenBSD, where it says:
... asynchronous-safe in OpenBSD but "probably not on other systems," including snprintf(), ...
so the implication is that snprintf is not, in general, safe in a signal handler.
And, thanks to Nemo, we have an authoritative list of functions that are safe for use in signal handlers:
http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04_03
Start at that link and search down for _Exit and you'll see the list; then you'll see that snprintf is not on the list.
Also, I remember using write() in a signal handler because fprintf wasn't safe for a signal handler but that was a long time ago.
I don't have a copy of the relevant standard so I can't back this up with anything really authoritative but I thought I'd mention it anyway.
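To make the write() suggestion concrete, here is a minimal sketch of a handler that avoids stdio entirely. The helper format_ulong() is a hypothetical name introduced for illustration (it is not a library function), and it only handles integers; formatting a double without snprintf is more work, so for the %f case it is usually simpler to set a flag in the handler and do the formatting in main.

```c
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: format v as decimal into buf without stdio or malloc.
   Returns the number of characters written (no NUL terminator is added). */
static size_t format_ulong(unsigned long v, char *buf, size_t cap)
{
    char tmp[3 * sizeof v];   /* more than enough digits for any unsigned long */
    size_t n = 0, i = 0;

    do {
        tmp[n++] = (char)('0' + v % 10);
        v /= 10;
    } while (v != 0);

    while (n > 0 && i < cap)
        buf[i++] = tmp[--n];  /* digits were generated in reverse order */
    return i;
}

static void sig_usr1(int sig)
{
    char buf[32] = "signal ";
    size_t len = 7;           /* length of "signal " */
    ssize_t r;

    len += format_ulong((unsigned long)sig, buf + len, sizeof buf - len - 1);
    buf[len++] = '\n';
    r = write(STDERR_FILENO, buf, len);   /* write() is on the POSIX list */
    (void)r;
}
```

Everything the handler touches lives on its own stack, so it cannot collide with whatever the interrupted code was doing inside libc.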

From manual: http://www.network-theory.co.uk/docs/valgrind/valgrind_27.html and http://www.network-theory.co.uk/docs/valgrind/valgrind_24.html
Valgrind's signal simulation is not as robust as it could be. Basic POSIX-compliant sigaction and sigprocmask functionality is supplied, but it's conceivable that things could go badly awry if you do weird things with signals. Workaround: don't. Programs that do non-POSIX signal tricks are in any case inherently unportable, so should be avoided if possible.
So calling snprintf in a signal handler is not a POSIX-allowed signal trick, and valgrind has every right to break your program.
Why is snprintf not signal-safe?
The glibc manual says: http://www.gnu.org/software/hello/manual/libc/Nonreentrancy.html
If a function uses and modifies an object that you supply, then it is potentially non-reentrant; two calls can interfere if they use the same object.
This case arises when you do I/O using streams. Suppose that the signal handler prints a message with fprintf. Suppose that the program was in the middle of an fprintf call using the same stream when the signal was delivered. Both the signal handler's message and the program's data could be corrupted, because both calls operate on the same data structure—the stream itself.
However, if you know that the stream that the handler uses cannot possibly be used by the program at a time when signals can arrive, then you are safe. It is no problem if the program uses some other stream.
You could object that the s*printf* functions operate on strings, not streams. But internally, glibc's snprintf does work on a special stream:
ftp://sources.redhat.com/pub/glibc/snapshots/glibc-latest.tar.bz2/glibc-20090518/libio/vsnprintf.c
int
_IO_vsnprintf (string, maxlen, format, args)
{
_IO_strnfile sf; // <<-- FILE*-like descriptor
The %f output code in glibc also has a malloc call inside it:
ftp://sources.redhat.com/pub/glibc/snapshots/glibc-latest.tar.bz2/glibc-20090518/stdio-common/printf_fp.c
/* Allocate buffer for output. We need two more because while rounding
it is possible that we need two more characters in front of all the
other output. If the amount of memory we have to allocate is too
large use `malloc' instead of `alloca'. */
size_t wbuffer_to_alloc = (2 + (size_t) chars_needed) * sizeof (wchar_t);
buffer_malloced = ! __libc_use_alloca (chars_needed * 2 * sizeof (wchar_t));
if (__builtin_expect (buffer_malloced, 0))
{
wbuffer = (wchar_t *) malloc (wbuffer_to_alloc);
if (wbuffer == NULL)
/* Signal an error to the caller. */
return -1;
}
else
wbuffer = (wchar_t *) alloca (wbuffer_to_alloc);

Valgrind slightly changes the timings in your program.
Have a look at the FAQ.
My program crashes normally, but doesn't under Valgrind, or vice
versa. What's happening?
When a program runs under Valgrind, its environment is slightly
different to when it runs natively.
Most of the time this doesn't make any difference, but it can,
particularly if your program is buggy.

This is a valgrind bug. It calls your signal handler with a stack that is not 16-byte aligned, as the x86_64 ABI requires. On x86_64, floating-point arguments are passed in XMM registers, and the compiler-generated code that saves them to the stack uses aligned stores, which fault if the address is not 16-byte aligned. You can work around the problem by compiling for 32-bit (gcc -m32).

Related

Why does alarm() cause fgets() to stop waiting?

I am playing around with signals in C. My main function basically asks for some input using fgets(name, 30, stdin), and then sits there and waits. I set an alarm with alarm(3), and I reassigned SIGALRM to call a function myalarm that simply calls system("say PAY ATTENTION"). But after the alarm goes off, fgets() stops waiting for input and my main fn continues on. This happens even if I change myalarm to just set some variable and do nothing with it.
void myalarm(int sig) {
//system("say PAY ATTENTION");
int x = 0;
}
int catch_signal(int sig, void (*handler)(int)) { // when a signal comes in, "catch" it and "handle it" in the way you want
struct sigaction action; // create a new sigaction
action.sa_handler = handler; // set its sa_handler field to the function passed in
sigemptyset(&action.sa_mask); // set sa_mask to contain no signals, i.e. no extra signals are blocked while the handler runs
action.sa_flags = 0; // not sure, looks like we aren't using any of the available flags, whatever they may be
return sigaction(sig, &action, NULL); // here is where you actually reassign- now when sig is received, it'll do what action tells it
}
int main() {
if(catch_signal(SIGINT, diediedie)== -1) {
fprintf(stderr, "Can't map the SIGINT handler\n");
exit(2);
}
if(catch_signal(SIGALRM, myalarm) == -1) {
fprintf(stderr, "Can't map the SIGALRM handler\n");
exit(2);
}
alarm(3);
char name[30];
printf("Enter your name: ");
fgets(name, 30, stdin);
printf("Hello, %s\n", name);
return 0;
}
Why does alarm() make fgets() stop waiting for input?
Edit: Added code for my catch_signal function, and, as per one of the comments, used sigaction instead of signal, but the issue persisted.
The answer is most likely going to be OS/system specific.
(As stated by Retr0spectrum) The fgets() function often makes system calls, such as read(). System calls can terminate if a signal is detected. In the case of this question, the fgets() function has made a system call (likely the read() system call) to read a character from stdin. The SIGALRM causes the system call to terminate, and set errno to EINTR. This also causes the fgets() function to terminate, without reading any characters.
This is not unusual. It's just how the OS implements signals.
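To avoid this problem, I will often wrap the fgets() call in a loop like this:
char *line;
do {
errno = 0;
clearerr(stdin); /* drop the error indicator a previous interrupted read may have set */
line = fgets(name, 30, stdin);
} while (line == NULL && errno == EINTR);
It will require that you: #include <errno.h>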
(As suggested by TonyB).
As to the question of why the alarm signal interrupts the read, there are two reasons:
It's the way Unix used to do it, and it was because it was much easier to implement in the OS. (On the one hand this sounds kind of lame, but the "don't sweat the hard stuff" attitude was responsible in part for the success of Unix in the first place. This is the topic of Richard P. Gabriel's epic Worse Is Better essay.)
It makes it easy to implement a read that times out and gives up, if that's what you want. (See my answer to this other question.)
But, as other comments and answers discuss, the interrupting behavior is somewhat obsolete; most modern systems (Unix and Linux, at least) now automatically restart an interrupted system call such as read, more or less as you were wishing. (Also as pointed out elsewhere, if you know what you're doing you may be able to select between the two behaviors.)
In the end, though, it's a grey area; I'm pretty sure the C standard leaves it unspecified or implementation-defined or undefined what happens if you interrupt a system call with an alarm or other signal.

Signal Handler not invoked when sigaction is used

I am trying to implement a user level thread library and need to schedule threads in a round robin fashion. I am currently trying to make switching work for 2 threads that I have created using makecontext, getcontext and swapcontext. setitimer with ITIMER_PROF value is used and sigaction is assigned a handler to schedule a new thread whenever the SIGPROF signal is generated.
However, the signal handler is not invoked and the threads therefore never get scheduled. What could be the reason? Here are some snippets of the code:
void userthread_init(long period){
/*long time_period = period;
//Includes all the code like initializing the timer and attaching the signal
// handler function "schedule()" to the signal SIGPROF.
// create a linked list of threads - each thread's context gets added to the list/updated in the list
// in userthread_create*/
struct itimerval it;
struct sigaction act;
act.sa_flags = SA_SIGINFO;
act.sa_sigaction = &schedule;
sigemptyset(&act.sa_mask);
sigaction(SIGPROF,&act,NULL);
time_period = period;
it.it_interval.tv_sec = 4;
it.it_interval.tv_usec = period;
it.it_value.tv_sec = 1;
it.it_value.tv_usec = 100000;
setitimer(ITIMER_PROF, &it,NULL);
//for(;;);
}
The above code initializes a timer and attaches the handler schedule() to SIGPROF. I am assuming SIGPROF will be delivered to the process, which will then invoke schedule(). The schedule() function is given below:
void schedule(int sig, siginfo_t *siginf, ucontext_t* context1){
printf("\nIn schedule");
ucontext_t *ucp = NULL;
ucp = malloc(sizeof(ucontext_t));
getcontext(ucp);
//ucp = &sched->context;
sched->context = *context1;
if(sched->next != NULL){
sched = sched->next;
}
else{
sched = first;
}
setcontext(&sched->context);
}
I have a queue of ready threads in which their respective contexts are stored. Each thread should get scheduled whenever the setcontext call is executed. However, schedule() is not invoked! Can anyone please point out my mistake?
Completely revising this answer after looking at the code. There are a few issues:
There are several compiler warnings
You are never initializing your thread IDs, either outside or inside your thread creation method, so I'm surprised the code even works!
You are reading from uninitialized memory in your gtthread_create() function, I tested on both OSX & Linux, on OSX it crashes, on Linux by some miracle it's initialized.
In some places you call malloc(), and overwrite it with a pointer to something else - leaking memory
Your threads don't remove themselves from the linked list after they've finished, so weird things are happening after the routines finish.
When I add in the while(1) loop, I do see schedule() being called and output from thread 2, but thread 1 vanishes into thin air (probably because of the uninitialized thread ID). I think you need to do a big code cleanup.
Here's what I'd suggest:
Fix ALL of your compiler warnings — even if you think they don't matter, the noise may lead to you missing things (such as incompatible pointer types, etc). You're compiling with -Wall & -pedantic; that's a good thing - so now take the next step & fix them.
Put \n at the END of your printf statements, not the start — The two threads ARE outputting to stdout, but it's not getting flushed so you can't see it. Change your printf("\nMessage"); calls to printf("Message\n");
Use Valgrind to detect memory issues — valgrind is the single most amazing tool you will ever use for C/C++ development. It's available through apt-get & yum. Instead of running ./test1, run valgrind ./test1 and it will highlight memory corruption, memory leaks, uninitialized reads, etc. I can't stress this enough; Valgrind is amazing.
If a system call returns a value, check it — in your code, check the return values to all of getcontext, swapcontext, sigaction, setitimer
Only call async-signal-safe methods from your scheduler (or any signal handler) — so far you've fixed malloc() and printf() from inside your scheduler. Check out the signal(7) man page - see "Async-signal-safe functions"
Modularize your code — your linked list implementation could be tidier, and if it was separated out, then 1) your scheduler would have less code & be simpler, and 2) you can isolate issues in your linked list without having to debug scheduler code.
You're almost there, so keep at it - but keep in mind these three simple rules:
Clean as you go
Keep the compiler warnings fixed
When weird things are happening, use valgrind
Good luck!
Old answer:
You should check the return value of any system call. Whether or not it helps you find the answer, you should do it anyway :)
Check the return value of sigaction(), if it's -1, check errno. sigaction() can fail for a few reasons. If your signal handler isn't getting fired, it's possible it hasn't been set up.
Edit: and make sure you check the return of setitimer() too!
Edit 2: Just a thought, can you try getting rid of the malloc()? malloc is not signal safe. eg: like this:
void schedule(int sig, siginfo_t *siginf, ucontext_t* context1){
printf("In schedule\n");
getcontext(&sched->context);
if(sched->next != NULL){
sched = sched->next;
}
else{
sched = first;
}
setcontext(&sched->context);
}
Edit 3: According to this discussion, you can't use printf() inside a signal handler. You can try replacing it with a call to write(), which is async-signal safe:
// printf("In schedule\n");
const char message[] = "In schedule\n";
write( 1, message, sizeof( message ) - 1 ); /* - 1: don't write the terminating NUL */

signal always ends program

Doing homework with signals and fork and have a problem with the signal.
I've created the function:
void trata_sinal_int() {
char op[2];
printf("\nTerminate? (y/n)\n");
scanf("%s", op);
if (op[0] == 'y') {
printf("Bye Bye\n");
exit(0);
}
}
And in main I have:
signal(SIGINT, trata_sinal_int);
When I run this and press Ctrl+C, trata_sinal_int() is called and I get the message.
If I press y the program ends as expected, but if I press n the program still ends.
It is not returning to where it was before I pressed Ctrl+C.
Is this supposed to happen?
It depends on which standard you are adhering to, but Standard C doesn't allow you to do much more than modify a variable of type volatile sig_atomic_t or call _Exit (or abort() or signal()) from a signal handler. POSIX is a lot more lenient. Your code in your signal handler, replete with user interaction, is pushing beyond the limits of what even POSIX allows. Normally, you want your signal handler function to be small and svelte.
Note that the signal handler function should be:
void trata_sinal_int(int signum)
{
This allows you to compile without casts or compiler warnings about type mismatches.
The signal() function may reset the signal handler back to default behaviour when it is invoked; classically, it is necessary to reinstate the signal handler inside the signal handler:
signal(signum, trata_sinal_int);
So far, that's all pretty generic and semi-trivial.
When you type the Control-C, the system does go back to roughly where it was when the signal was originally received. However, what happens next depends on where it was (one of the reasons you have to be so very careful inside the handler). For example, if it was in the middle of manipulating the free list pointers inside malloc(), it would return there, but if you'd reinvoked malloc() inside the handler, all hell might be breaking loose. If you were inside a system call, then your call may be interrupted (return with an error indication and errno == EINTR), or it may resume where it left off. Otherwise, it should go back to where the calculation was running.
Here's (a fixed up version of) your code built into a test rig. The pause() function waits for a signal before returning.
#include <stdio.h>
#include <signal.h>
#include <stdlib.h>
#include <unistd.h>
static void trata_sinal_int(int signum)
{
char op[2];
signal(signum, trata_sinal_int);
printf("\nTerminate? (y/n)\n");
scanf("%s", op);
if (op[0] == 'y')
{
printf("Bye Bye\n");
exit(0);
}
}
int main(void)
{
signal(SIGINT, trata_sinal_int);
for (int i = 0; i < 3; i++)
{
printf("Pausing\n");
pause();
printf("Continuing\n");
}
printf("Exiting\n");
return(0);
}
I should really point out that the scanf() is not very safe at all; a buffer of size 2 is an open invitation to buffer overflow. I'm also not error checking system calls.
I tested on Mac OS X 10.7.5, a BSD derivative. The chances are good that the resetting of signal() would be unnecessary on this platform, because BSD introduced 'reliable signals' a long time ago (pre-POSIX).
ISO/IEC 9899:2011 §7.14.1.1 The signal function
¶5 If the signal occurs other than as the result of calling the abort or raise function, the
behavior is undefined if the signal handler refers to any object with static or thread
storage duration that is not a lock-free atomic object other than by assigning a value to an
object declared as volatile sig_atomic_t, or the signal handler calls any function
in the standard library other than the abort function, the _Exit function, the
quick_exit function, or the signal function with the first argument equal to the
signal number corresponding to the signal that caused the invocation of the handler.
Furthermore, if such a call to the signal function results in a SIG_ERR return, the
value of errno is indeterminate.252)
252) If any signal is generated by an asynchronous signal handler, the behavior is undefined.
The references to quick_exit() are new in C2011; they were not present in C1999.
POSIX 2008
The section on Signal Concepts goes through what is and is not allowed inside a signal handler under POSIX in considerable detail.
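A small-and-svelte handler in the sense described above could look like the following sketch: the handler only sets a volatile sig_atomic_t flag, and the prompt (printf/scanf) moves into the main loop. The names got_sigint, on_sigint and consume_sigint are made up for illustration.

```c
#include <signal.h>

/* Set by the handler, read and cleared by the main loop. */
static volatile sig_atomic_t got_sigint = 0;

static void on_sigint(int signum)
{
    (void)signum;
    got_sigint = 1;     /* the only work done in signal context */
}

/* Hypothetical helper for the main loop: returns 1 once per noticed SIGINT
   (several signals arriving before a check collapse into one). */
static int consume_sigint(void)
{
    if (!got_sigint)
        return 0;
    got_sigint = 0;
    return 1;
}
```

In the test rig, the loop body would become: call pause(), and if consume_sigint() returns 1, ask "Terminate? (y/n)" from main, where stdio is allowed.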
First, your signal handler is not exactly async-signal-safe. In practice this is probably not a problem in your case, since I assume main() is basically doing nothing while it is waiting for the signal. But it is definitely not correct anyway.
As for why the program exits, not counting segfaults in the signal handler due to invalid use of FILE* functions such as printf, scanf etc.: when the signal is received, any system calls you are doing (or, well, most) will be interrupted with EINTR.
If you are using something like sleep() in main to wait for the signal to occur it will be interrupted. You are expected to restart it manually.
To avoid this you probably want to use the significantly more portable sigaction interface instead of signal. If nothing else this allows you to indicate that you want system calls to be restarted.
The reason that FILE * functions (and most other functions that use global state, such as malloc and free) are not allowed in signal handlers is that you might be in the middle of another operation on the same state when the signal arrives.
This can cause segfaults or other undefined operations.
The normal 'trick' to implement this is to have a self-pipe: The signal handler will write a byte to the pipe, and your main loop will see this (usually by waiting in poll or something similar) and then act on it.
If you absolutely want to do user interaction in the signal handler you have to use write() and read(), not the FILE* functions.

Coming back to life after Segmentation Violation

Is it possible to restore the normal execution flow of a C program, after the Segmentation Fault error?
struct A {
int x;
};
A* a = 0;
a->x = 123; // this is where segmentation violation occurs
// after handling the error I want to get back here:
printf("normal execution");
// the rest of my source code....
I want a mechanism similar to NullPointerException that is present in Java, C# etc.
Note: Please don't tell me that there is an exception handling mechanism in C++, because I know that, and don't tell me I should check every pointer before assignment etc.
What I really want to achieve is to get back to normal execution flow as in the example above. I know some actions can be undertaken using POSIX signals. What should it look like? Other ideas?
#include <unistd.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <signal.h>
#include <stdlib.h>
#include <ucontext.h>
void safe_func(void)
{
puts("Safe now ?");
exit(0); //can't return to main, it's where the segfault occured.
}
void
handler (int cause, siginfo_t * info, void *uap)
{
/* For test only. Never call stdio functions in a signal handler otherwise. */
printf ("SIGSEGV raised at address %p\n", info->si_addr);
ucontext_t *context = uap;
/*On my particular system, compiled with gcc -O2, the offending instruction
generated for "*f = 16;" is 6 bytes. Let's try to set the instruction
pointer to the next instruction (general register 14 is EIP, on linux x86) */
context->uc_mcontext.gregs[14] += 6;
//alternativly, try to jump to a "safe place"
//context->uc_mcontext.gregs[14] = (unsigned int)safe_func;
}
int
main (int argc, char *argv[])
{
struct sigaction sa;
sa.sa_sigaction = handler;
int *f = NULL;
sigemptyset (&sa.sa_mask);
sa.sa_flags = SA_SIGINFO;
if (sigaction (SIGSEGV, &sa, 0)) {
perror ("sigaction");
exit(1);
}
//cause a segfault
*f = 16;
puts("Still Alive");
return 0;
}
$ ./a.out
SIGSEGV raised at address (nil)
Still Alive
I would beat someone with a bat if I saw something like this in production code, though; it's an ugly, for-fun hack. You'll have no idea whether the segfault has corrupted some of your data, you'll have no sane way of recovering and knowing that everything is OK now, and there's no portable way of doing this. The only mildly sane thing you could do is try to log an error (use write() directly, not any of the stdio functions, which are not signal-safe) and perhaps restart the program. For those cases you're much better off writing a supervisor process that monitors a child process's exit, logs it and starts a new child process.
You can catch segmentation faults using a signal handler, and decide to continue the execution of the program (at your own risk).
The signal name is SIGSEGV.
You will have to use the sigaction() function, from the signal.h header.
Basically, it works the following way:
struct sigaction sa1;
struct sigaction sa2;
sa1.sa_handler = your_handler_func;
sa1.sa_flags = 0;
sigemptyset( &sa1.sa_mask );
sigaction( SIGSEGV, &sa1, &sa2 );
Here's the prototype of the handler function:
void your_handler_func( int id );
As you can see, you don't need to return. The program's execution will continue, unless you decide to stop it by yourself from the handler.
"All things are permissible, but not all are beneficial" - typically a segfault is game over for a good reason... A better idea than picking up where it was would be to keep your data persisted (database, or at least a file system) and enable it to pick up where it left off that way. This will give you much better data reliability all around.
See R.'s comment on MacMade's answer.
Expanding on what he said (after handling SIGSEGV or, for that matter, SIGFPE, the CPU+OS can return you to the offending insn), here is a test I have for division by zero handling:
#include <stdio.h>
#include <limits.h>
#include <string.h>
#include <signal.h>
#include <setjmp.h>
static jmp_buf context;
static void sig_handler(int signo)
{
/* XXX: don't do this, not reentrant */
printf("Got SIGFPE\n");
/* avoid infinite loop */
longjmp(context, 1);
}
int main()
{
int a;
struct sigaction sa;
memset(&sa, 0, sizeof(struct sigaction));
sa.sa_handler = sig_handler;
sa.sa_flags = SA_RESTART;
sigaction(SIGFPE, &sa, NULL);
if (setjmp(context)) {
/* If this one was on setjmp's block,
* it would need to be volatile, to
* make sure the compiler reloads it.
*/
sigset_t ss;
/* Make sure to unblock SIGFPE, according to POSIX it
* gets blocked when calling its signal handler.
* sigsetjmp()/siglongjmp would make this unnecessary.
*/
sigemptyset(&ss);
sigaddset(&ss, SIGFPE);
sigprocmask(SIG_UNBLOCK, &ss, NULL);
goto skip;
}
a = 10 / 0;
skip:
printf("Exiting\n");
return 0;
}
No, it's not possible, in any logical sense, to restore normal execution following a segmentation fault. Your program just tried to dereference a null pointer. How are you going to carry on as normal if something your program expects to be there isn't? It's a programming bug, the only safe thing to do is to exit.
Consider some of the possible causes of a segmentation fault:
you forgot to assign a legitimate value to a pointer
a pointer has been overwritten possibly because you are accessing heap memory you have freed
a bug has corrupted the heap
a bug has corrupted the stack
a malicious third party is attempting a buffer overflow exploit
malloc returned null because you have run out of memory
Only in the first case is there any kind of reasonable expectation that you might be able to carry on.
If you have a pointer that you want to dereference but it might legitimately be null, you must test it before attempting the dereference. I know you don't want me to tell you that, but it's the right answer, so tough.
Edit: here's an example to show why you definitely do not want to carry on with the next instruction after dereferencing a null pointer:
void foobarMyProcess(struct SomeStruct* structPtr)
{
char* aBuffer = structPtr->aBigBufferWithLotsOfSpace; // if structPtr is NULL, will SIGSEGV
//
// if you SIGSEGV and come back to here, at this point aBuffer contains whatever garbage was in memory at the point
// where the stack frame was created
//
strcpy(aBuffer, "Some longish string"); // You've just written the string to some random location in your address space
// good luck with that!
}
Call this, and when a segfault occurs, your code will execute segv_handler and then continue back to where it was.
void segv_handler(int sig)
{
// Do what you want here
}
signal(SIGSEGV, segv_handler);
There is no meaningful way to recover from a SIGSEGV unless you know EXACTLY what caused it, and there's no way to do that in standard C. It may be possible (conceivably) in an instrumented environment, like a C-VM (?). The same is true for all program error signals; if you try to block/ignore them, or establish handlers that return normally, your program will probably break horribly when they happen unless perhaps they're generated by raise or kill.
Just do yourself a favour and take error cases into account.
In POSIX, your process will get sent SIGSEGV when you do that. The default handler just crashes your program. You can add your own handler using the signal() call. You can implement whatever behaviour you like by handling the signal yourself.
You can use the SetUnhandledExceptionFilter() function (on Windows), but even to be able to skip the "illegal" instruction you will need to be able to decode some assembler opcodes. And, as glowcoder said, even if it would "comment out" at runtime the instructions that generate segfaults, what will be left of the original program logic (if it may be called so)?
Everything is possible, but it doesn't mean that it has to be done.
Unfortunately, you can't in this case. The buggy function has undefined behavior and could have corrupted your program's state.
What you CAN do is run the functions in a new process. If this process dies with a return code that indicates SIGSEGV, you know it has failed.
You could also rewrite the functions yourself.
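Running the risky code in a separate process can be sketched with fork() and waitpid(). The names run_isolated, well_behaved and crashes are made up for illustration, and raise(SIGSEGV) stands in for a real invalid memory access.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run fn() in a child process.
   Returns 0 if the child exited normally with status 0, the signal number
   if a signal killed it (e.g. SIGSEGV), and -1 for any other failure. */
static int run_isolated(void (*fn)(void))
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {          /* child: do the risky work, then leave */
        fn();
        _exit(0);
    }
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (WIFSIGNALED(status))
        return WTERMSIG(status);
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
        return 0;
    return -1;
}

/* Stand-ins for real work, for illustration only. */
static void well_behaved(void) { }
static void crashes(void) { raise(SIGSEGV); }
```

The parent's memory is untouched whatever the child does, which is the whole point. The cost is that any result has to travel back explicitly, for example through a pipe or shared memory.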
I can see a case for recovering from a segmentation violation: if you're handling events in a loop and one of these events causes a segmentation violation, you may want to skip only that event and continue processing the remaining events. In my eyes, segmentation violations are much the same as NullPointerExceptions in Java. Yes, the state will be inconsistent and unknown after either of these; however, in some cases you would like to handle the situation and carry on. For instance, in algo trading you would pause the execution of an order and allow a trader to take over manually, without crashing the entire system and ruining all other orders.
The best solution is to wrap each unsafe access this way:
#include <iostream>
#include <signal.h>
#include <setjmp.h>
static jmp_buf buf;
int counter = 0;
void signal_handler(int)
{
longjmp(buf,0);
}
int main()
{
signal(SIGSEGV,signal_handler);
setjmp(buf);
if(counter++ == 0){ // if we didn't try before
*(int*)(0x1215) = 10; // write to an address that is almost certainly not mapped
}
std::cout<<"i am alive !!"<<std::endl; // we will get into here in any case
system("pause");
return 0;
}
Your program will not crash on almost any OS.
The glibc manual gives you a clear picture of how to write signal handlers.
A signal handler is just a function that you compile together with the rest
of the program. Instead of directly invoking the function, you use signal
or sigaction to tell the operating system to call it when a signal arrives.
This is known as establishing the handler.
In your case you will have to wait for the SIGSEGV indicating a segmentation fault. The list of other signals can be found here.
Signal handlers are broadly classified into two categories:
You can have the handler function note that the signal arrived by tweaking some
global data structures, and then return normally.
You can have the handler function terminate the program or transfer
control to a point where it can recover from the situation that caused the signal.
SIGSEGV comes under program error signals.

How to write a signal handler to catch SIGSEGV?

I want to write a signal handler to catch SIGSEGV.
I protect a block of memory for read or write using
char *buffer;
char *p;
char a;
int pagesize = 4096;
mprotect(buffer,pagesize,PROT_NONE)
This protects pagesize bytes of memory starting at buffer against any reads or writes.
Second, I try to read the memory:
p = buffer;
a = *p
This will generate a SIGSEGV, and my handler will be called.
So far so good. My problem is that, once the handler is called, I want to change the access rights of the memory by doing
mprotect(buffer,pagesize,PROT_READ);
and continue normal functioning of my code. I do not want to exit the function.
On future writes to the same memory, I want to catch the signal again, modify the access rights, and then record that event.
Here is the code:
#include <signal.h>
#include <stdio.h>
#include <malloc.h>
#include <stdlib.h>
#include <errno.h>
#include <sys/mman.h>
#define handle_error(msg) \
do { perror(msg); exit(EXIT_FAILURE); } while (0)
char *buffer;
int flag=0;
static void handler(int sig, siginfo_t *si, void *unused)
{
printf("Got SIGSEGV at address: 0x%lx\n",(long) si->si_addr);
printf("Implements the handler only\n");
flag=1;
//exit(EXIT_FAILURE);
}
int main(int argc, char *argv[])
{
char *p; char a;
int pagesize;
struct sigaction sa;
sa.sa_flags = SA_SIGINFO;
sigemptyset(&sa.sa_mask);
sa.sa_sigaction = handler;
if (sigaction(SIGSEGV, &sa, NULL) == -1)
handle_error("sigaction");
pagesize=4096;
/* Allocate a buffer aligned on a page boundary;
initial protection is PROT_READ | PROT_WRITE */
buffer = memalign(pagesize, 4 * pagesize);
if (buffer == NULL)
handle_error("memalign");
printf("Start of region: 0x%lx\n", (long) buffer);
printf("Start of region: 0x%lx\n", (long) buffer+pagesize);
printf("Start of region: 0x%lx\n", (long) buffer+2*pagesize);
printf("Start of region: 0x%lx\n", (long) buffer+3*pagesize);
//if (mprotect(buffer + pagesize * 0, pagesize,PROT_NONE) == -1)
if (mprotect(buffer + pagesize * 0, pagesize,PROT_NONE) == -1)
handle_error("mprotect");
//for (p = buffer ; ; )
if(flag==0)
{
p = buffer+pagesize/2;
printf("It comes here before reading memory\n");
a = *p; //trying to read the memory
printf("It comes here after reading memory\n");
}
else
{
if (mprotect(buffer + pagesize * 0, pagesize,PROT_READ) == -1)
handle_error("mprotect");
a = *p;
printf("Now i can read the memory\n");
}
/* for (p = buffer;p<=buffer+4*pagesize ;p++ )
{
//a = *(p);
*(p) = 'a';
printf("Writing at address %p\n",p);
}*/
printf("Loop completed\n"); /* Should never happen */
exit(EXIT_SUCCESS);
}
The problem is that only the signal handler runs and I can't return to the main function after catching the signal.
When your signal handler returns (assuming it doesn't call exit or longjmp or something that prevents it from actually returning), the code will continue at the point the signal occurred, reexecuting the same instruction. Since at this point, the memory protection has not been changed, it will just throw the signal again, and you'll be back in your signal handler in an infinite loop.
So to make it work, you have to call mprotect in the signal handler. Unfortunately, as Steven Schansker notes, mprotect is not async-safe, so you can't safely call it from the signal handler. So, as far as POSIX is concerned, you're screwed.
Fortunately, on most implementations (all modern UNIX and Linux variants as far as I know), mprotect is a system call, so it is safe to call from within a signal handler, and you can do most of what you want. The problem is that if you want to change the protections back after the read, you'll have to do that in the main program after the read.
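A rough sketch of that approach, assuming Linux, where mprotect is a thin wrapper around a system call. All the names here (unprotect_on_fault, install_unprotect_handler, map_one_page, faults_seen) are hypothetical. It only deals with read faults: the page is reopened as PROT_READ, so a later write to the same page would fault again and spin forever in this simplified handler.

```c
#define _DEFAULT_SOURCE          /* glibc: exposes MAP_ANONYMOUS */
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static volatile sig_atomic_t fault_count = 0;
static long page_size;           /* set before the handler is installed */

/* SA_SIGINFO handler: make the faulting page readable, then return so the
   interrupted load is re-executed and succeeds this time. */
static void unprotect_on_fault(int sig, siginfo_t *si, void *ctx)
{
    /* Round the faulting address down to the start of its page. */
    uintptr_t page = (uintptr_t)si->si_addr & ~((uintptr_t)page_size - 1);

    (void)sig;
    (void)ctx;
    fault_count++;
    /* mprotect is not on the POSIX async-signal-safe list; this relies on
       it being a plain system call on the target system. */
    if (mprotect((void *)page, (size_t)page_size, PROT_READ) == -1)
        _exit(1);                /* e.g. an unmapped address: give up */
}

/* Installs the handler for SIGSEGV. Returns 0 on success, -1 on error. */
int install_unprotect_handler(void)
{
    struct sigaction sa;

    page_size = sysconf(_SC_PAGESIZE);
    memset(&sa, 0, sizeof sa);
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sa.sa_sigaction = unprotect_on_fault;
    return sigaction(SIGSEGV, &sa, NULL);
}

/* Returns one fresh, page-aligned, read/write page, or NULL on failure. */
char *map_one_page(void)
{
    void *p = mmap(NULL, (size_t)sysconf(_SC_PAGESIZE),
                   PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    return p == MAP_FAILED ? NULL : (char *)p;
}

/* Number of faults the handler has seen so far. */
int faults_seen(void)
{
    return fault_count;
}
```

With the handler installed, protecting a page from map_one_page() with PROT_NONE and then reading from it should bump faults_seen() once and let the read complete in the main program.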
Another possibility is to do something with the third argument to the signal handler, which points at an OS and arch specific structure that contains info about where the signal occurred. On Linux, this is a ucontext structure, which contains machine-specific info about the $PC address and other register contents where the signal occurred. If you modify this, you change where the signal handler will return to, so you can change the $PC to be just after the faulting instruction so it won't re-execute after the handler returns. This is very tricky to get right (and non-portable too).
edit
The ucontext structure is defined in <ucontext.h>. Within the ucontext the field uc_mcontext contains the machine context, and within that, the array gregs contains the general register context. So in your signal handler:
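/* REG_RIP is x86-64 only; on glibc it needs _GNU_SOURCE defined before <ucontext.h> is included */
ucontext_t *u = (ucontext_t *)unused;
unsigned char *pc = (unsigned char *)u->uc_mcontext.gregs[REG_RIP];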
will give you the pc where the exception occurred. You can read it to figure out what instruction it
was that faulted, and do something different.
As far as the portability of calling mprotect in the signal handler is concerned, any system that follows either the SVID spec or the BSD4 spec should be safe -- they allow calling any system call (anything in section 2 of the manual) in a signal handler.
You've fallen into the trap that everyone falls into when they first try to handle signals. The trap? Thinking that you can actually do anything useful with signal handlers. From a signal handler, you are only allowed to call async-signal-safe (reentrant) library functions.
See this CERT advisory as to why and a list of the POSIX functions that are safe.
Note that printf(), which you are already calling, is not on that list.
Nor is mprotect. You're not allowed to call it from a signal handler. It might work, but I can promise you'll run into problems down the road. Be really careful with signal handlers, they're tricky to get right!
EDIT
Since I'm being a portability douchebag at the moment already, I'll point out that you also shouldn't write to shared (i.e. global) variables without taking the proper precautions.
You can recover from SIGSEGV on Linux. You can also recover from segmentation faults on Windows (you'll see a structured exception instead of a signal). But the POSIX standard doesn't guarantee recovery, so your code will be very non-portable.
Take a look at libsigsegv.
You should not return from the signal handler, as then behavior is undefined. Rather, jump out of it with longjmp.
This is only okay if the signal is generated in an async-signal-safe function. Otherwise, behavior is undefined if the program ever calls another async-signal-unsafe function. Hence, the signal handler should only be established immediately before it is necessary, and disestablished as soon as possible.
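A sketch of jumping out of the handler, assuming Linux and a fault that is delivered as SIGSEGV. It uses sigsetjmp/siglongjmp rather than plain setjmp/longjmp so that the signal mask is restored as well (otherwise SIGSEGV can stay blocked after the jump). try_read_byte and the other names are hypothetical. The handler is established only around the risky access and removed right afterwards.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf recover_point;

/* Handler: do not return to the faulting instruction, jump out instead. */
static void jump_out(int sig)
{
    (void)sig;
    siglongjmp(recover_point, 1);
}

/* Tries to read one byte at addr.
   Returns 0 and stores the byte in *out on success,
   -1 if the read raised SIGSEGV or the handler could not be installed. */
int try_read_byte(const volatile char *addr, char *out)
{
    struct sigaction sa, old;
    int result;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = jump_out;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) == -1)
        return -1;

    /* Second argument 1: save the signal mask so siglongjmp restores it. */
    if (sigsetjmp(recover_point, 1) == 0) {
        *out = *addr;            /* may fault */
        result = 0;
    } else {
        result = -1;             /* reached via siglongjmp from the handler */
    }

    sigaction(SIGSEGV, &old, NULL);   /* disestablish as soon as possible */
    return result;
}
```

The access between sigsetjmp and the fault is a single load, which keeps clear of the async-signal-unsafe functions the caveat above is about.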
In fact, I know of very few uses of a SIGSEGV handler:
use an async-signal-safe backtrace library to log a backtrace, then die.
in a VM such as the JVM or CLR: check if the SIGSEGV occurred in JIT-compiled code. If not, die; if so, then throw a language-specific exception (not a C++ exception), which works because the JIT compiler knew that the trap could happen and generated appropriate frame unwind data.
clone() and exec() a debugger (do not use fork() – that calls callbacks registered by pthread_atfork()).
Finally, note that any action that triggers SIGSEGV is probably UB, as this is accessing invalid memory. However, this would not be the case if the signal was, say, SIGFPE.
There is a compilation problem using ucontext_t or struct ucontext (present in /usr/include/sys/ucontext.h)
http://www.mail-archive.com/arch-general@archlinux.org/msg13853.html

Resources