I am going through the book "UNIX Systems Programming", and came across the below point.
It is good practice to check for the EINTR error code whenever a C library function is called (say close()), because library functions can fail if the process receives a signal. If an EINTR error has occurred, the corresponding C library call should be restarted.
while ((close(fd) == -1) && errno == EINTR); // close is restarted if it fails with EINTR error.
Question: Why should the library function fail if it gets a signal?
When a signal is received, the corresponding handler is called. After the handler completes, can't the library function continue from the point where it stopped?
Why should the library function fail if it gets a signal?
Because that's how it's designed, and the goal is that if a signal arrives
while you are stuck in a blocking system call, the system call returns, and you have
a chance to act on the signal.
However, this has traditionally been implemented differently on different platforms.
After the handler completes, can't the library function continue from the point where it stopped?
Absolutely. If you want this behavior, you can set the SA_RESTART flag when you install a signal handler with sigaction().
Note that even with the SA_RESTART flag, there are still some system calls that are not automatically restarted. For Linux, you can see a list of these calls in the "Interruption of system calls and library functions by signal handlers" section of the signal(7) man page. (If anyone knows of a similar list defined by POSIX, I'd be grateful.)
If you install a signal handler using signal() instead of sigaction(), whether system calls are automatically restarted varies among UNIX variants. SysV-derived platforms typically do not restart system calls, while BSD-derived platforms do.
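To make the sigaction() route concrete, here is a minimal sketch of installing a handler with or without SA_RESTART. The names on_signal and install_handler are made up for illustration, and the handler deliberately does nothing.

```c
#define _POSIX_C_SOURCE 200809L  /* expose sigaction() under strict -std= modes */
#include <signal.h>
#include <string.h>

/* Hypothetical handler used only for illustration; it does nothing. */
static void on_signal(int signo)
{
    (void)signo;
}

/* Install on_signal for signo. If restart is nonzero, SA_RESTART asks the
 * system to restart interrupted calls where it can, instead of failing
 * them with EINTR. Returns 0 on success, -1 on error (like sigaction). */
static int install_handler(int signo, int restart)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = restart ? SA_RESTART : 0;
    return sigaction(signo, &sa, NULL);
}
```

Calling install_handler(SIGUSR1, 1) gives the restarting behavior regardless of what signal() would default to on a given platform; passing 0 gives the EINTR behavior.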
while ((close(fd) == -1) && errno == EINTR); // close is restarted if it fails with EINTR error.
This is actually quite dangerous. If close() fails with EINTR, the state of the file descriptor is unknown. If the file descriptor really was closed, retrying risks a race condition that closes another, unrelated file descriptor. This is considered a bug in the POSIX specification.
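One way to avoid the retry loop is to call close() exactly once and never repeat it on EINTR. The sketch below assumes a system such as Linux, where the descriptor is already released when close() reports EINTR; on a system that leaves the descriptor open in that case, this would leak it instead. The name close_once is hypothetical.

```c
#include <errno.h>
#include <unistd.h>

/* Close fd exactly once. EINTR is treated as "already closed", which is an
 * assumption that holds on Linux but is not guaranteed by POSIX.
 * Returns 0 on success (or EINTR), -1 on any other error with errno set. */
static int close_once(int fd)
{
    if (close(fd) == -1 && errno != EINTR)
        return -1;
    return 0;
}
```

The trade-off is a possible descriptor leak on some platforms versus closing an unrelated descriptor that another thread has just opened; the leak is usually the less harmful failure.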
I've found this piece of code used several times (and a similar one that uses open() instead of write()).
int c = write(fd, &v, sizeof(v));
if (c == -1 && errno != EINTR) {
    perror("Write to output file");
    exit(EXIT_FAILURE);
}
Why is && errno != EINTR checked here?
Looking up errno in the man pages, I found the following text about EINTR, but even after reading man 7 signal I am none the wiser.
EINTR Interrupted function call (POSIX.1); see signal(7).
Many system calls will report the EINTR error code if a signal occurred while the system call was in progress. No error actually occurred, it's just reported that way because the system isn't able to resume the system call automatically. This coding pattern simply retries the system call when this happens, to ignore the interrupt.
For instance, this might happen if the program makes use of alarm() to run some code asynchronously when a timer runs out. If the timeout occurs while the program is calling write(), we just want to retry the system call (the same applies to read() and similar calls).
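The snippet in the question only skips the error exit on EINTR; it does not actually retry. A minimal sketch of a retrying write might look like the following. The helper name write_all is hypothetical, and it also continues after short writes, which is a separate concern from EINTR.

```c
#include <errno.h>
#include <unistd.h>

/* Write all len bytes from buf to fd. Retries when write() is interrupted
 * by a signal (EINTR) and continues after short writes.
 * Returns len on success, -1 on any other error with errno set. */
static ssize_t write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t left = len;

    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n == -1) {
            if (errno == EINTR)
                continue;       /* no error actually occurred: try again */
            return -1;          /* a real error: report it to the caller */
        }
        p += n;
        left -= (size_t)n;
    }
    return (ssize_t)len;
}
```

With this helper, the caller only sees -1 for genuine failures, so the perror()/exit() branch no longer needs an EINTR special case.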
The answers here are really good; I want to add some internal details.
System calls that are interrupted by signals can either abort and return EINTR, or restart automatically if and only if SA_RESTART was specified in sigaction(2).
In the Linux kernel, the structure responsible for this is restart_block, which is used to track the information and arguments needed to restart a system call.
From the man page on write:
The call was interrupted by a signal before any data was written
Is it necessary to check for errno == EINTR if you read massive amounts of data? I use the pread() function to read. In all my time I have never seen EINTR returned, but I have seen some code online that explicitly checks for it.
So is it really necessary to check for EINTR and maybe repeat the call?
EINTR is returned when a system call is interrupted as a result of your process receiving a signal. If your process was blocked in the kernel, waiting for the read to complete, and a signal is caught, the process may be woken up; whether that happens depends on whether the operation is interruptible. The sleeping I/O routine is woken and is expected to return EINTR to user space.
Just before the kernel returns to user space, it checks for pending signals. If a signal is pending, it takes the action associated with that signal. Possible actions include dispatching the signal to a signal handler, killing your process, or ignoring the signal. Assuming this does not kill your process and your signal handler returns normally, the system call will return EINTR.
If you were not expecting this, you typically want to try the action again, but this can also be used as a way to gracefully abort an I/O operation. For example, alarm(2) can be used to implement a timeout, where SIGALRM is delivered if the I/O does not complete in a timely manner. In your signal handler, you could set a flag indicating a timeout and when your read operation returns EINTR, you can check for your timeout flag.
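The timeout idea can be sketched as follows. This is an illustration under stated assumptions, not a robust implementation: the names timed_out, on_alarm and read_with_timeout are hypothetical, reporting the timeout as ETIMEDOUT is a choice made here, and there is a known race if SIGALRM arrives before read() has been entered.

```c
#define _POSIX_C_SOURCE 200809L  /* expose sigaction() under strict -std= modes */
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Set by the handler so the caller can tell a timeout from other signals. */
static volatile sig_atomic_t timed_out;

static void on_alarm(int signo)
{
    (void)signo;
    timed_out = 1;
}

/* read() that gives up after `seconds`. Returns the byte count, or -1 with
 * errno set; errno is ETIMEDOUT when the alarm fired. */
static ssize_t read_with_timeout(int fd, void *buf, size_t len, unsigned seconds)
{
    struct sigaction sa, old;
    ssize_t n;
    int saved;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;            /* no SA_RESTART: we want read() to fail with EINTR */
    if (sigaction(SIGALRM, &sa, &old) == -1)
        return -1;

    timed_out = 0;
    alarm(seconds);
    do {
        n = read(fd, buf, len); /* retry if some other signal interrupted us */
    } while (n == -1 && errno == EINTR && !timed_out);
    saved = errno;
    alarm(0);                   /* cancel a still-pending alarm */
    sigaction(SIGALRM, &old, NULL);

    if (n == -1 && saved == EINTR && timed_out)
        saved = ETIMEDOUT;
    errno = saved;
    return n;
}
```

The flag is what distinguishes "interrupted by our timer" from "interrupted by something unrelated", which is why the loop retries only while the flag is still clear.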
The reason is that on a busy system, for example, a read can be interrupted by a signal.
So, on your desktop you may never see it. On an overloaded server, you can.
See Chapter 5 of Advanced Programming in the UNIX Environment by Stevens and Rago. There is a complete explanation.
Some system calls can be restarted transparently by the kernel if the SA_RESTART flag is used when installing the signal handler, according to man signal(7):
If a blocked call to one of the following interfaces is interrupted by a signal handler, then the call will be automatically restarted after the signal handler returns if the SA_RESTART flag was used; otherwise the call will fail with the error EINTR:
It then lists some system calls that can (and cannot) be restarted, but does not mention close() in either list. How would I know whether close(), or any other function, is restartable? Does POSIX specify it, or is it Linux-specific behaviour? Where can I find more info?
close is a rather special case. Not only is it not restartable on Linux; when close returns with EINTR on Linux, it has actually already succeeded, and making another call to close will fail with EBADF in single-threaded processes and cause extremely dangerous file-descriptor races in multi-threaded processes.
As of the published POSIX 2008, this behavior is permitted:
If close() is interrupted by a signal that is to be caught, it shall return -1 with errno set to [EINTR] and the state of fildes is unspecified.
This issue was raised with the Austin Group (as Issue #529) and it was resolved to revise the specification such that returning with EINTR means the file descriptor is still open; this is contrary to the current Linux behavior. If the file descriptor has already been closed at the time the signal is handled, the close function is now required to return with EINPROGRESS instead of EINTR. This can be fixed in userspace on Linux, and there is an open glibc bug report for it (#14627), but as of this writing it has not received any response.
This issue also has serious implications for POSIX thread cancellation, the side effects of which are specified in terms of the side effects upon returning with EINTR. There is a related issue on the Austin Group tracker, Issue #614.
As per POSIX.1-2008, the SA_RESTART flag applies to all interruptible functions (all functions which are documented to fail with EINTR):
SA_RESTART
This flag affects the behavior of interruptible functions; that is, those specified to fail with errno set to [EINTR]. If set, and a function specified as interruptible is interrupted by this signal, the function shall restart and shall not fail with [EINTR] unless otherwise specified. If an interruptible function which uses a timeout is restarted, the duration of the timeout following the restart is set to an unspecified value that does not exceed the original timeout value. If the flag is not set, interruptible functions interrupted by this signal shall fail with errno set to [EINTR].
That is, the list of functions which are not restarted is Linux-specific (and probably counts as a bug).
I saw that semaphores in my application were not always working as expected. Then I was told that this unexpected behavior can be caused when a signal interrupts the sem_wait call.
So, my question is what care the programmer needs to take in the presence of signals. For sem_wait, we can check the return value, but is this the same for all non-async-safe functions? And what else should we keep in mind when expecting signals to interrupt our code?
UNIX signals are a can of worms, just to have said that.
There are two camps regarding syscalls and signals.
SysV/POSIX semantics: syscalls are interrupted by a signal; they return an error and set errno to EINTR.
BSD semantics: syscalls are automatically restarted if a signal occurs (well, most of them are anyway; some are not, e.g. select/poll/sleep).
When using signal(), the default is one of the two above, with BSD systems and Linux defaulting to BSD semantics, and everyone else [citation needed] having SysV semantics.
(On Linux, this depends on many things, e.g. compiling with -std=c99 gives SysV semantics, while -std=gnu99 gives BSD semantics. See e.g. http://www.gnu.org/s/hello/manual/libc/Interrupted-Primitives.html)
When you install a signal handler with sigaction(), you get to choose which semantics you want with the SA_RESTART flag.
Basically:
Don't use signals if you can help it.
Use the BSD semantics if you can.
In code that needs to be portable and handles signals, you need to wrap each and every system call in a loop that checks the call for failure, inspects errno for EINTR, and performs the syscall again (or does something based on the caught signal).
Library calls can use signals, even if your code doesn't.
Syscalls in general, with SysV/POSIX semantics, will return -1 and set errno to EINTR. But read the documentation to learn what the error condition is.
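As an illustration of the wrap-it-in-a-loop advice, here is a rough sketch for one of the calls that is not restarted automatically: a sleep that resumes with the remaining time when a signal interrupts it. The helper name sleep_fully is hypothetical. The same loop shape, without the remaining-time bookkeeping, applies to sem_wait() from the question.

```c
#define _POSIX_C_SOURCE 200809L  /* expose nanosleep() under strict -std= modes */
#include <errno.h>
#include <time.h>

/* Sleep for the whole duration even if signals interrupt the call.
 * Returns 0 on success, -1 on an error other than EINTR with errno set. */
static int sleep_fully(long milliseconds)
{
    struct timespec req, rem;

    req.tv_sec = milliseconds / 1000;
    req.tv_nsec = (milliseconds % 1000) * 1000000L;
    while (nanosleep(&req, &rem) == -1) {
        if (errno != EINTR)
            return -1;          /* a real error, e.g. EINVAL */
        req = rem;              /* interrupted: continue with the time left */
    }
    return 0;
}
```

Whether to loop like this or to return early and act on the signal is a design decision; the loop is only right when the signal should not shorten the wait.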
EDIT: edited, as I mixed up BSD vs Sysv semantics.
While writing a simple client-server application, this question came to mind. When someone tries to write to a broken pipe, a SIGPIPE is generated. Let's say I handle the signal in my code.
Now, which error does the write call return: EPIPE or EINTR (as it was interrupted by a signal)? I tried with a sample program and I always seem to get EPIPE. Is this guaranteed behavior, or could it be either of the two error values?
POSIX says that EPIPE should be returned and SIGPIPE sent:
For write()s or pwrite()s to pipes or FIFOs not open for reading by any process, or with only one end open.
For write()s to sockets that are no longer connected or shut down for writing.
You can have a look at the POSIX standard here
The write(2) call returns -1 on error, so I guess you are asking about the value of errno(3).
You'll get EPIPE if you handle, block, or ignore the signal. Otherwise the process is terminated by default, see signal(7).
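A small experiment along these lines, assuming SIGPIPE is ignored rather than handled, might look like this. The function name errno_for_broken_pipe is made up, and note that it changes the SIGPIPE disposition for the whole process.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Write to a pipe whose read end is closed, with SIGPIPE ignored, and
 * return the errno that write() reported (0 if the write unexpectedly
 * succeeded, -1 if the setup itself failed). */
static int errno_for_broken_pipe(void)
{
    int fds[2];
    ssize_t n;
    int err;

    if (signal(SIGPIPE, SIG_IGN) == SIG_ERR)
        return -1;
    if (pipe(fds) == -1)
        return -1;
    close(fds[0]);                  /* no reader is left on the pipe */
    n = write(fds[1], "x", 1);      /* fails instead of killing the process */
    err = (n == -1) ? errno : 0;
    close(fds[1]);
    return err;
}
```

Ignoring the signal is the common choice in servers, since the EPIPE return value already carries the information.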
In general, "interrupted by a signal" (EINTR) refers to the utterly ridiculous Unix System V signal handling, whereby ANY system call could fail if your process received (and handled) a signal during the system call. This required wrapping every system call with do ... while (ret==-1 && errno==EINTR); or similar. While POSIX still allows either this or the good ("BSD") behavior, sane systems like GNU/Linux have the BSD behavior by default. You can always obtain the BSD behavior by calling sigaction with the right arguments, or even make a wrapper function to do so for you.
As such, EINTR is unrelated to the SIGPIPE caused by write errors.