POSIX partial write() and Signal Interrupts - c

From the man page of write()
Note that a successful write() may transfer fewer than count bytes.
Such partial writes can occur for various reasons; for example,
because there was insufficient space on the disk device to write all
of the requested bytes, or because a blocked write() to a socket,
pipe, or similar was interrupted by a signal handler after it had
transferred some, but before it had transferred all of the requested
bytes. In the event of a partial write, the caller can make another
write() call to transfer the remaining bytes. The subsequent call
will either transfer further bytes or may result in an error (e.g.,
if the disk is now full).
I have the following questions
1) In the case of write() being interrupted by a signal handler after a partial transfer, will write() set errno to EINTR?
2) If errno is not set, is there a way to identify such an event without extra code (like installing a signal disposition and setting a flag to true)?
Note: further write() calls succeed in transferring the remaining bytes after the signal interrupt.

To answer your individual numbered questions:
1) errno is only meaningful after one of the standard functions returns a value indicating an error (for write, -1) and before any other standard function or application code that might clobber it is called. So no, if write returns a short write, errno will not be set to anything meaningful. If it's equal to EINTR, it just happens to be; this is not something meaningful you can interpret.
2) The way you identify such an event is by the return value being strictly less than the nbytes argument. This doesn't actually tell you the cause of the short write, so it could be something else like running out of space. If you need to know, you need to arrange for the signal handler to inform you. But in almost all cases you don't actually need to know.
Regarding the note: if write returns the full nbytes after a signal arrives, the signal handler was non-interrupting. This is the default on Linux with any modern libc (glibc, musl, anything but libc5 basically), and it's almost always the right thing. If you actually want interrupting signals, you have to install the signal handler with sigaction and the SA_RESTART flag clear. (Conversely, if you're installing signal handlers that should have the normal, reasonable, non-interrupting behavior, for portability you should use sigaction and set the SA_RESTART flag rather than using the legacy function signal.)
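For what it's worth, the usual way to cope with short writes is a small retry loop. Here is a minimal sketch; write_all and write_all_roundtrip are hypothetical names of my own, not standard functions. It keeps calling write() until everything is sent, and treats EINTR (interrupted before any byte was transferred) as "try again":

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of POSIX): keep calling write() until
 * all `count` bytes are transferred or a real error occurs.
 * Returns 0 on success, -1 on error with errno set by write(). */
int write_all(int fd, const void *data, size_t count)
{
    const char *p = data;

    while (count > 0) {
        ssize_t n = write(fd, p, count);
        if (n == -1) {
            if (errno == EINTR)
                continue;       /* interrupted before any byte was written */
            return -1;          /* genuine error, e.g. ENOSPC or EPIPE */
        }
        p += n;                 /* short write: skip what was already sent */
        count -= (size_t)n;
    }
    return 0;
}

/* Self-check: send a short message through a pipe and read it back.
 * Returns 1 if the bytes arrive intact, 0 otherwise. */
int write_all_roundtrip(void)
{
    int fds[2];
    char in[5] = {0};
    int ok;

    if (pipe(fds) == -1)
        return 0;
    ok = write_all(fds[1], "hello", 5) == 0
      && read(fds[0], in, sizeof in) == 5
      && memcmp(in, "hello", 5) == 0;
    close(fds[0]);
    close(fds[1]);
    return ok;
}
```

Note that the loop never needs to know why a write was short; it only looks at the return value, which matches the point made above.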

Let's try it and see:
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <signal.h>
static void handle_sigalrm(int sig) {
}
int main(void) {
struct sigaction act;
memset(&act, 0, sizeof act);
act.sa_handler = handle_sigalrm;
sigaction(SIGALRM, &act, NULL);
int fds[2];
pipe(fds);
int bufsize = fcntl(fds[1], F_GETPIPE_SZ) + 10;
char *buf = calloc(bufsize, 1);
ssize_t written;
printf("will attempt to write %d bytes and EINTR is %d\n", bufsize, EINTR);
alarm(1);
errno = 0;
written = write(fds[1], buf, bufsize);
printf("write returned %zd and errno is %d\n", written, errno);
return 0;
}
That program makes a pipe that nothing will ever read from, does a write to it that's bigger than the kernel's buffer, and arranges for a signal handler to run while the write is blocking. On my system, it prints this:
will attempt to write 65546 bytes and EINTR is 4
write returned 65536 and errno is 0
Thus, the answer to "In the case of write() being interrupted by signal handler after a partial transfer, will write() set the errno to EINTR?" is "no, it won't".

Related

pselect() on Linux does not deliver signals if events are pending

I'm trying to add a signal handler for proper cleanup to my event-driven application.
My signal handler for SIGINT only changes the value of a global flag variable, which is then checked in the main loop. To avoid races, the signal is blocked at all times, except during the pselect() call. This should cause pending signals to be delivered only during the pselect() call, which should be interrupted and fail with EINTR.
This usually works fine, except if there are already events pending on the monitored file descriptors (e.g. under heavy load, when there's always activity on the file descriptors).
This sample program reproduces the problem:
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/select.h>
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>
volatile sig_atomic_t stop_requested = 0;
void handle_signal(int sig)
{
// Use write() and strlen() instead of printf(), which is not async-signal-safe
const char * out = "Caught stop signal. Exiting.\n";
size_t len = strlen (out);
ssize_t writelen = write(STDOUT_FILENO, out, len);
assert(writelen == (ssize_t) len);
stop_requested = 1;
}
int main(void)
{
int ret;
// Install signal handler
{
struct sigaction sa;
memset(&sa, 0, sizeof(sa));
sa.sa_handler = handle_signal;
ret = sigaction(SIGINT, &sa, NULL);
assert(ret == 0);
}
// Block SIGINT
sigset_t old_sigmask;
{
sigset_t blocked;
sigemptyset(&blocked);
sigaddset(&blocked, SIGINT);
ret = sigprocmask(SIG_BLOCK, &blocked, &old_sigmask);
assert(ret == 0);
}
ret = raise(SIGINT);
assert(ret == 0);
// Create pipe and write data to it
int pipefd[2];
ret = pipe(pipefd);
assert(ret == 0);
ssize_t writelen = write(pipefd[1], "foo", 3);
assert(writelen == 3);
while (stop_requested == 0)
{
printf("Calling pselect().\n");
fd_set fds;
FD_ZERO(&fds);
FD_SET(pipefd[0], &fds);
struct timespec * timeout = NULL;
int ret = pselect(pipefd[0] + 1, &fds, NULL, NULL, timeout, &old_sigmask);
assert(ret >= 0 || errno == EINTR);
printf("pselect() returned %d.\n", ret);
if (FD_ISSET(pipefd[0], &fds))
printf("pipe is readable.\n");
sleep(1);
}
printf("Event loop terminated.\n");
}
This program installs a handler for SIGINT, then blocks SIGINT, sends SIGINT to itself (which will not be delivered yet because SIGINT is blocked), creates a pipe and writes some data into the pipe, and then monitors the read end of the pipe for readability.
This readability monitoring is done using pselect(), which is supposed to unblock SIGINT, which should then interrupt the pselect() and call the signal handler.
However, on Linux (I tested on 5.6 and 4.19), the pselect() call returns 1 instead and indicates readability of the pipe, without calling the signal handler. Since this test program does not read the data that was written to the pipe, the file descriptor will never cease to be readable, and the signal handler is never called. In real programs, a similar situation might arise under heavy load, where a lot of data might be available for reading on different file descriptors (e.g. sockets).
On the other hand, on FreeBSD (I tested on 12.1), the signal handler is called, and then pselect() returns -1 and sets errno to EINTR. This is what I expected to happen on Linux as well.
Am I misunderstanding something, or am I using these interfaces incorrectly? Or should I just fall back to the old self-pipe trick, which (I believe) would handle this case better?
This is a type of resource starvation caused by always checking for active resources in the same order. When resources are always checked in the same order, if the resources checked first are busy enough the resources checked later may never get any attention.
See What is starvation?.
The Linux implementation of pselect() apparently checks file descriptors before checking for signals. The BSD implementation does the opposite.
For what it's worth, the POSIX documentation for pselect() states:
If none of the selected descriptors are ready for the requested operation, the pselect() or select() function shall block until at least one of the requested operations becomes ready, until the timeout occurs, or until interrupted by a signal.
A strict reading of that description requires checking the descriptors first. If any descriptor is active, pselect() will return that instead of failing with errno set to EINTR.
In that case, if the descriptors are so busy that one is always active, the signal processing gets starved.
The BSD implementation likely starves active descriptors if signals come in too fast.
One common solution is to always process all active resources every time a select() call or similar returns. But you can't do that with your current design that mixes signals with descriptors because pselect() doesn't even get to checking for a pending signal if there are active descriptors. As @Shawn mentioned in the comments, you can map signals to file descriptors using signalfd(). Then add the descriptor from signalfd() to the file descriptor set passed to pselect().

how to clear stdout after CTRL - C in linux c

We don't want anything to be printed after the user interrupts with Ctrl-C. We have tried adding __fpurge as well as fflush inside the SIGINT signal handler, but it is not working.
How can I clear buffered stdout contents immediately? I have come across a few similar threads, but nowhere could I find a working solution.
Some additional information:
Inside the SIGINT signal handler, even after adding exit(0), the buffer contents are still printed, although the process is killed.
I added exit(0) only to narrow down the issue; I don't want to kill the process.
I know the above is expected behavior; I'm not sure how to avoid it.
Consider this edited example; this version does not exit the process:
#define _POSIX_C_SOURCE 200809L /* For nanosleep() */
#include <unistd.h>
#include <stdlib.h>
#include <termios.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <errno.h>
#include <time.h>
#include <stdio.h>
static void exit_handler(int signum)
{
int fd, result;
/* If the standard streams are connected to a tty,
* tell the kernel to discard already buffered data.
* (That is, in kernel buffers. Not C library buffers.)
*/
if (isatty(STDIN_FILENO))
tcflush(STDIN_FILENO, TCIOFLUSH);
if (isatty(STDOUT_FILENO))
tcflush(STDOUT_FILENO, TCIOFLUSH);
if (isatty(STDERR_FILENO))
tcflush(STDERR_FILENO, TCIOFLUSH);
/* Redirect standard streams to /dev/null,
* so that nothing further is output.
* This is a nasty thing to do, and a code analysis program
* may complain about this; it is suspicious behaviour.
*/
do {
fd = open("/dev/null", O_RDWR);
} while (fd == -1 && errno == EINTR);
if (fd != -1) {
if (fd != STDIN_FILENO)
do {
result = dup2(fd, STDIN_FILENO);
} while (result == -1 && (errno == EINTR || errno == EBUSY));
if (fd != STDOUT_FILENO)
do {
result = dup2(fd, STDOUT_FILENO);
} while (result == -1 && (errno == EINTR || errno == EBUSY));
if (fd != STDERR_FILENO)
do {
result = dup2(fd, STDERR_FILENO);
} while (result == -1 && (errno == EINTR || errno == EBUSY));
if (fd != STDIN_FILENO && fd != STDOUT_FILENO && fd != STDERR_FILENO)
close(fd);
}
}
static int install_exit_handler(const int signum)
{
struct sigaction act;
memset(&act, 0, sizeof act);
sigemptyset(&act.sa_mask);
act.sa_handler = exit_handler;
act.sa_flags = 0;
if (sigaction(signum, &act, NULL) == -1)
return errno;
return 0;
}
int main(void)
{
if (install_exit_handler(SIGINT)) {
fprintf(stderr, "Cannot install signal handler: %s.\n", strerror(errno));
return EXIT_FAILURE;
}
while (1) {
struct timespec t = { .tv_sec = 0, .tv_nsec = 200000000L };
printf("Output\n");
fflush(stdout);
nanosleep(&t, NULL);
}
/* Never reached. */
return EXIT_SUCCESS;
}
When the process receives a SIGINT signal, it will first flush whatever is in kernel terminal buffer, then redirect the standard streams to /dev/null (i.e., nowhere).
Note that you'll need to kill the process by sending it the TERM or KILL signal (i.e. killall ./yourprogname in another terminal).
When you are running the verbose process over a remote connection, quite a lot of information may be in flight at all times. Both the local machine and the remote machine running the process will have their socket buffers nearly full, so the latency may be much larger than ordinarily -- I've seen several second latencies in this case even on fast (GbE) local networks.
This means that propagating the signal from the local machine to the remote machine will take a measurable time; in worst cases on the order of seconds. Only then will the remote process stop outputting data. All pending data will still have to be transmitted from the remote machine to the local machine, and that may take quite a long time. (Typically, the bottleneck is the terminal itself; in most cases it is faster to minimize the terminal, so that it does not try to render any of the text it receives, only buffers it internally.)
This is why Ctrl+C does not, and cannot, stop remote output instantaneously.
In most cases, you'll be using an SSH connection to the remote machine. The protocol does not have a "purge" feature, either, that might help here. Many, myself included, have thought about it -- at least my sausage fingers have accidentally tabbed to the executable file instead of the similarly named output file, and not only gotten the terminal full of garbage, but the special characters in binary files sometimes set the terminal state (see e.g. xterm control sequences, ANSI escape codes) to something unrecoverable (i.e., Ctrl+Z followed by reset Enter does not reset the terminal back to a working state; if it did, kill -KILL %- ; fg would stop the errant command in Bash, and get you your terminal back), and you need to break the connection, which will also terminate all processes started from the same terminal running remotely in the background.
The solution here is to use a terminal multiplexer, like GNU screen, which allows you to connect to and disconnect from the remote machine, without interrupting an existing terminal connection. (To put it simply, screen is your terminal avatar on the remote machine.)
First up, a quote from the C11 standard, emphasis mine:
7.14.1.1 The signal function
5 If the signal occurs other than as the result of calling the abort or raise function, the behaviour is undefined if [...] the signal handler calls any function in the standard library other than the abort function, the _Exit function, the quick_exit function, or the signal function with the first argument equal to the signal number corresponding to the signal that caused the invocation of the handler.
This means calling fflush is undefined behaviour.
Looking at the functions you may call, abort and _Exit both leave the flushing of buffers implementation-defined, and quick_exit calls _Exit, so you are out of luck as far as the standard is concerned, since I could not find the implementation's definition of their behaviour for Linux. (Surprise. Not.)
The only other "terminating" function, exit, does flush the buffers, and you may not call it from the handler in the first place.
So you have to look at Linux-specific functionality. The man page to _exit makes no statement on buffers. The close man page warns against closing file descriptors that may be in use by system calls from other threads, and states that "it is not common for a filesystem to flush the buffers when the stream is closed", meaning that it could happen (i.e. close not guaranteeing that unwritten buffer contents are actually discarded).
At this point, if I were you, I would ask myself "is this such a good idea after all"...
The problem is that neither POSIX nor the Linux library documentation declares fpurge or __fpurge to be safe in a signal handler. As explained by DevSolar, the C language itself declares very few standard library functions safe (_Exit among them), but POSIX explicitly allows close and write. So you can always close the underlying file descriptor, which should be 1:
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
void handler(int sig) {
static char msg[] = "Interrupted";
(void)sig;
write(2, msg, sizeof(msg) - 1); // carefully use stderr here
close(1); // close stdout's descriptor so the buffered "bar" can no longer reach the terminal
_Exit(1);
}
int main(void) {
signal(SIGINT, handler);
printf("bar");
sleep(15);
return 0;
}
When I type Ctrl-C during the sleep it gives as expected:
$ ./foo
^CInterrupted with 2
$
The close system call should be enough, because it closes the underlying file descriptor. So even if there are later attempts to flush the stdout buffer, they will write to a closed file descriptor and as such have no effect at all. The downside is that if stdout has been redirected, the program should store the new value of the underlying file descriptor in a global variable.
If you call kill(getpid(), SIGKILL); within the signal handler (kill is async-signal-safe), the OS kills the process immediately (as you wanted to exit(0) anyway). No further output is to be expected.
The only problem: you won't be able to clean up properly or exit gracefully afterwards in the main thread. If you can afford that...

Why isn't write(2) returning EINTR?

I've been reading about EINTR on write(2) etc, and trying to determine whether I need to check for it in my program. As a sanity check, I tried to write a program that would run into it. The program loops forever, writing repeatedly to a file.
Then, in a separate shell, I run:
while true; do pkill -HUP test; done
However, the only output I see from test.c is the dots printed by the signal handler. Why isn't the SIGHUP causing write(2) to fail?
test.c:
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <errno.h>
#include <sys/types.h>
void hup_handler(int sig)
{
printf(".");
fflush(stdout);
}
int main()
{
struct sigaction act;
act.sa_handler = hup_handler;
act.sa_flags = 0;
sigemptyset(&act.sa_mask);
sigaction(SIGHUP, &act, NULL);
int fd = open("testfile", O_WRONLY);
char* buf = malloc(1024*1024*128);
for (;;)
{
if (lseek(fd, 0, SEEK_SET) == -1)
{
printf("lseek failed: %s\n", strerror(errno));
}
if (write(fd, buf, sizeof(buf)) != sizeof(buf))
{
printf("write failed: %s\n", strerror(errno));
}
}
}
Linux tends to avoid EINTR on writes to/reads from files; see discussion here. While a process is blocking on a disk write, it may be placed in an uninterruptible sleep state (process code D) which indicates that it cannot be interrupted at that time. This depends on the device driver; the online copy of Linux Device Drivers, 3rd Edition is a good reference for how this appears from the kernel side.
You still need to handle EINTR for other platforms which may not behave the same, or for pipes and sockets where EINTR definitely can occur.
Note that you're only writing sizeof(void *) bytes at a time:
char* buf = malloc(1024*1024*128);
if (write(fd, buf, sizeof(buf)) != sizeof(buf))
This should be
const size_t BUF_SIZE = 1024*1024*128;
char* buf = malloc(BUF_SIZE);
if (write(fd, buf, BUF_SIZE) != BUF_SIZE)
There are 2 possibilities:
You're writing very few bytes, since you're misusing the sizeof operator. Thus the write happens instantaneously and it never gets interrupted - you're only writing 4 or 8 bytes at a time
Somehow the syscall gets restarted, as if you applied SA_RESTART to sigaction
In your code, since buf is a pointer, sizeof(buf) yields the size of the pointer on your machine, not the (much bigger) allocated space
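To make the sizeof point concrete, here is a tiny sketch (the function names are mine) contrasting sizeof applied to a pointer with sizeof applied to an array:

```c
#include <stddef.h>
#include <stdlib.h>

/* sizeof applied to a pointer measures the pointer, not the allocation. */
size_t size_via_pointer(void)
{
    char *buf = malloc(1024);
    size_t n = sizeof(buf);     /* size of a char *, typically 4 or 8 */
    free(buf);
    return n;
}

/* sizeof applied to an array measures the whole array. */
size_t size_via_array(void)
{
    char buf[1024];
    return sizeof(buf);         /* 1024 */
}
```

So for a malloc'ed buffer the length has to be tracked separately, for example in a constant as suggested above.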
If you check the write(2) manual page, the entry for EINTR says:
The call was interrupted by a signal before any data was written
Also from the signal(7) manual page:
read(2), readv(2), write(2), writev(2), and ioctl(2) calls on "slow" devices. A "slow" device is one where the I/O call may block for an indefinite time, for example, a terminal, pipe, or socket. (A disk is not a slow device according to this definition.) If an I/O call on a slow device has already transferred some data by the time it is interrupted by a signal handler, then the call will return a success status (normally, the number of bytes transferred).
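Taking these two together: once write has started to transfer data (even if only a single byte has been written), an interrupted call returns success with the number of bytes written. EINTR is reported only when the signal arrives before anything was written, and a write to a file on a disk is not considered a slow operation in the first place.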

catching signals while reading from pipe with select()

using select() with pipe - this is what I am doing and now I need to catch SIGTERM on that. how can I do it? Do I have to do it when select() returns error ( < 0 ) ?
First, SIGTERM will kill your process if not caught, and select() will not return. Thus, you must install a signal handler for SIGTERM. Do that using sigaction().
However, the SIGTERM signal can arrive at a moment where your thread is not blocked at select(). It would be a rare condition, if your process is mostly sleeping on the file descriptors, but it can otherwise happen. This means that either your signal handler must do something to inform the main routine of the interruption, namely, setting some flag variable (of type sig_atomic_t), or you must guarantee that SIGTERM is only delivered when the process is sleeping on select().
I'll go with the latter approach, since it's simpler, albeit less flexible (see end of the post).
So, you unblock SIGTERM just before calling select(), and re-block it right away after the function returns, so that your process only receives the signal while sleeping inside select(). But note that this actually creates a race condition. If the signal arrives just after the unblock, but just before select() is called, the system call will not have been called yet and thus it will not return -1. If the signal arrives just after select() returns successfully, but just before the re-block, you have also lost the signal.
Thus, you must use pselect() for that. It does the blocking/unblocking around select() atomically.
First, block SIGTERM using sigprocmask() before entering the pselect() loop. After that, just call pselect() with the original mask returned by sigprocmask(). This way you guarantee your process will only be interrupted while sleeping on select().
In summary:
Install a handler for SIGTERM (that does nothing);
Before entering the pselect() loop, block SIGTERM using sigprocmask();
Call pselect() with the old signal mask returned by sigprocmask();
Inside the pselect() loop, now you can check safely whether pselect() returned -1 and errno is EINTR.
Please note that if, after pselect() returns successfully, you do a lot of work, you may experience bigger latency when responding to SIGTERM (since the process must do all processing and return to pselect() before actually processing the signal). If this is a problem, you must use a flag variable inside the signal handler, so that you can check for this variable in a number of specific points in your code. Using a flag variable does not eliminate the race condition and does not eliminate the need for pselect(), though.
Remember: whenever you need to wait on some file descriptors or for the delivery of a signal, you must use pselect() (or ppoll(), for the systems that support it).
Edit: nothing better than a code example to illustrate the usage.
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/select.h>
#include <unistd.h>
// Signal handler to catch SIGTERM.
void sigterm(int signo) {
(void)signo;
}
int main(void) {
// Install the signal handler for SIGTERM.
struct sigaction s;
s.sa_handler = sigterm;
sigemptyset(&s.sa_mask);
s.sa_flags = 0;
sigaction(SIGTERM, &s, NULL);
// Block SIGTERM.
sigset_t sigset, oldset;
sigemptyset(&sigset);
sigaddset(&sigset, SIGTERM);
sigprocmask(SIG_BLOCK, &sigset, &oldset);
// Enter the pselect() loop, using the original mask as argument.
fd_set set;
FD_ZERO(&set);
FD_SET(0, &set);
while (pselect(1, &set, NULL, NULL, NULL, &oldset) >= 0) {
// Do some processing. Note that the process will not be
// interrupted while inside this loop.
sleep(5);
}
// See why pselect() has failed.
if (errno == EINTR)
puts("Interrupted by SIGTERM.");
else
perror("pselect()");
return EXIT_SUCCESS;
}
The answer is partly in one of the comments in the Q&A you point to:
> Interrupt will cause select() to return a -1 with errno set to EINTR
That is, for any signal that is caught, select will return and errno will be set to EINTR.
Now if you specifically want to catch SIGTERM, then you need to set that up with a call to signal, like this:
signal(SIGTERM,yourcatchfunction);
where your catch function should be defined something like
void yourcatchfunction(int signalNumber) { .... }
So in summary, you have set up a signal handler yourcatchfunction and your program is currently in a select() call waiting for I/O. When a signal arrives, yourcatchfunction will be called, and when it returns, the select call will return -1 with errno set to EINTR.
However, be aware that SIGTERM can occur at any time, so you may not be in the select call when it occurs, in which case you will never see the EINTR but only a regular call of yourcatchfunction.
Hence select() returning an error with errno EINTR is just so you can take non-blocking action; it is not what catches the signal.
You can call select() in a loop. This is known as restarting the system call. Here is some pseudo-C.
int retval = -1;
int select_errno = 0;
do {
retval = select(...);
if (retval < 0)
{
/* Cache the value of errno in case a system call is later
* added prior to the loop guard (i.e., the while expression). */
select_errno = errno;
}
/* Other system calls might be added here. These could change the
* value of errno, losing track of the error during the select(),
* again this is the reason we cached the value. (E.g, you might call
* a log method which calls gettimeofday().) */
/* Automatically restart the system call if it was interrupted by
* a signal -- with a while loop. */
} while ((retval < 0) && (select_errno == EINTR));
if (retval < 0) {
/* Handle other errors here. See select man page. */
} else {
/* Successful invocation of select(). */
}
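A runnable variant of the pseudo-C above, as a sketch for the single-descriptor case. wait_readable and wait_readable_demo are hypothetical names. The descriptor set is rebuilt on every attempt because select() modifies it:

```c
#include <errno.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: wait until fd is readable, retrying on EINTR.
 * Returns select()'s result: 1 when readable, -1 on any other error. */
int wait_readable(int fd)
{
    int retval;

    do {
        fd_set fds;
        /* select() overwrites the set, so rebuild it on every attempt. */
        FD_ZERO(&fds);
        FD_SET(fd, &fds);
        retval = select(fd + 1, &fds, NULL, NULL, NULL);
    } while (retval == -1 && errno == EINTR);

    return retval;
}

/* Self-check: a pipe that already holds one byte is reported readable. */
int wait_readable_demo(void)
{
    int fds[2], r;

    if (pipe(fds) == -1)
        return -1;
    if (write(fds[1], "x", 1) != 1)
        return -1;
    r = wait_readable(fds[0]);
    close(fds[0]);
    close(fds[1]);
    return r;
}
```

Since nothing is called between select() and the loop condition here, errno can be tested directly; if you add logging or other calls in between, cache errno first as the pseudo-C does.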

How to write a signal handler to catch SIGSEGV?

I want to write a signal handler to catch SIGSEGV.
I protect a block of memory for read or write using
char *buffer;
char *p;
char a;
int pagesize = 4096;
mprotect(buffer,pagesize,PROT_NONE);
This protects pagesize bytes of memory starting at buffer against any reads or writes.
Second, I try to read the memory:
p = buffer;
a = *p;
This will generate a SIGSEGV, and my handler will be called.
So far so good. My problem is that, once the handler is called, I want to change the access rights of the memory by doing
mprotect(buffer,pagesize,PROT_READ);
and continue normal functioning of my code. I do not want to exit the function.
On future writes to the same memory, I want to catch the signal again, modify the write permission, and then record that event.
Here is the code:
#include <signal.h>
#include <stdio.h>
#include <malloc.h>
#include <stdlib.h>
#include <errno.h>
#include <sys/mman.h>
#define handle_error(msg) \
do { perror(msg); exit(EXIT_FAILURE); } while (0)
char *buffer;
int flag=0;
static void handler(int sig, siginfo_t *si, void *unused)
{
printf("Got SIGSEGV at address: 0x%lx\n",(long) si->si_addr);
printf("Implements the handler only\n");
flag=1;
//exit(EXIT_FAILURE);
}
int main(int argc, char *argv[])
{
char *p; char a;
int pagesize;
struct sigaction sa;
sa.sa_flags = SA_SIGINFO;
sigemptyset(&sa.sa_mask);
sa.sa_sigaction = handler;
if (sigaction(SIGSEGV, &sa, NULL) == -1)
handle_error("sigaction");
pagesize=4096;
/* Allocate a buffer aligned on a page boundary;
initial protection is PROT_READ | PROT_WRITE */
buffer = memalign(pagesize, 4 * pagesize);
if (buffer == NULL)
handle_error("memalign");
printf("Start of region: 0x%lx\n", (long) buffer);
printf("Start of region: 0x%lx\n", (long) buffer+pagesize);
printf("Start of region: 0x%lx\n", (long) buffer+2*pagesize);
printf("Start of region: 0x%lx\n", (long) buffer+3*pagesize);
//if (mprotect(buffer + pagesize * 0, pagesize,PROT_NONE) == -1)
if (mprotect(buffer + pagesize * 0, pagesize,PROT_NONE) == -1)
handle_error("mprotect");
//for (p = buffer ; ; )
if(flag==0)
{
p = buffer+pagesize/2;
printf("It comes here before reading memory\n");
a = *p; //trying to read the memory
printf("It comes here after reading memory\n");
}
else
{
if (mprotect(buffer + pagesize * 0, pagesize,PROT_READ) == -1)
handle_error("mprotect");
a = *p;
printf("Now i can read the memory\n");
}
/* for (p = buffer;p<=buffer+4*pagesize ;p++ )
{
//a = *(p);
*(p) = 'a';
printf("Writing at address %p\n",p);
}*/
printf("Loop completed\n"); /* Should never happen */
exit(EXIT_SUCCESS);
}
The problem is that only the signal handler runs and I can't return to the main function after catching the signal.
When your signal handler returns (assuming it doesn't call exit or longjmp or something that prevents it from actually returning), the code will continue at the point the signal occurred, reexecuting the same instruction. Since at this point, the memory protection has not been changed, it will just throw the signal again, and you'll be back in your signal handler in an infinite loop.
So to make it work, you have to call mprotect in the signal handler. Unfortunately, as Steven Schansker notes, mprotect is not async-signal-safe, so you can't safely call it from the signal handler. So, as far as POSIX is concerned, you're screwed.
Fortunately on most implementations (all modern UNIX and Linux variants as far as I know), mprotect is a system call, so is safe to call from within a signal handler, so you can do most of what you want. The problem is that if you want to change the protections back after the read, you'll have to do that in the main program after the read.
Another possibility is to do something with the third argument to the signal handler, which points at an OS and arch specific structure that contains info about where the signal occurred. On Linux, this is a ucontext structure, which contains machine-specific info about the $PC address and other register contents where the signal occurred. If you modify this, you change where the signal handler will return to, so you can change the $PC to be just after the faulting instruction so it won't re-execute after the handler returns. This is very tricky to get right (and non-portable too).
edit
The ucontext structure is defined in <ucontext.h>. Within the ucontext the field uc_mcontext contains the machine context, and within that, the array gregs contains the general register context. So in your signal handler:
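ucontext_t *u = (ucontext_t *)unused;
unsigned char *pc = (unsigned char *)u->uc_mcontext.gregs[REG_RIP];
will give you the pc where the exception occurred (REG_RIP is specific to x86-64 Linux and, with glibc, requires _GNU_SOURCE to be defined before including <ucontext.h>). You can read it to figure out what instruction it was that faulted, and do something different.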
As far as the portability of calling mprotect in the signal handler is concerned, any system that follows either the SVID spec or the BSD4 spec should be safe -- they allow calling any system call (anything in section 2 of the manual) in a signal handler.
You've fallen into the trap that all people do when they first try to handle signals. The trap? Thinking that you can actually do anything useful with signal handlers. From a signal handler, you are only allowed to call async-signal-safe (reentrant) library functions.
See this CERT advisory as to why and a list of the POSIX functions that are safe.
Note that printf(), which you are already calling, is not on that list.
Nor is mprotect. You're not allowed to call it from a signal handler. It might work, but I can promise you'll run into problems down the road. Be really careful with signal handlers, they're tricky to get right!
EDIT
Since I'm being a portability douchebag at the moment already, I'll point out that you also shouldn't write to shared (i.e. global) variables without taking the proper precautions.
You can recover from SIGSEGV on linux. Also you can recover from segmentation faults on Windows (you'll see a structured exception instead of a signal). But the POSIX standard doesn't guarantee recovery, so your code will be very non-portable.
Take a look at libsigsegv.
You should not return from the signal handler, as then behavior is undefined. Rather, jump out of it with longjmp.
This is only okay if the signal is generated in an async-signal-safe function. Otherwise, behavior is undefined if the program ever calls another async-signal-unsafe function. Hence, the signal handler should only be established immediately before it is necessary, and disestablished as soon as possible.
In fact, I know of very few uses of a SIGSEGV handler:
use an async-signal-safe backtrace library to log a backtrace, then die.
in a VM such as the JVM or CLR: check if the SIGSEGV occurred in JIT-compiled code. If not, die; if so, then throw a language-specific exception (not a C++ exception), which works because the JIT compiler knew that the trap could happen and generated appropriate frame unwind data.
clone() and exec() a debugger (do not use fork() – that calls callbacks registered by pthread_atfork()).
Finally, note that any action that triggers SIGSEGV is probably UB, as this is accessing invalid memory. However, this would not be the case if the signal was, say, SIGFPE.
There is a compilation problem using ucontext_t or struct ucontext (present in /usr/include/sys/ucontext.h)
http://www.mail-archive.com/arch-general@archlinux.org/msg13853.html
