I find the poll() function quite useful for multiplexing pipes and sockets, but I want to go further and poll my own media: that is, implement my own pipe and have it work with poll() for POLLIN and POLLOUT events. How would I do that?
int self = GenerateMyPipe();
int sock = socket(...);
struct pollfd fd[2];
//Init Pollfd and Stuff...
poll(fd, 2, -1);
...
Thanks for reading...
There's no standard POSIX method for this, but on Linux you can use eventfd(). From the man page:
eventfd() creates an "eventfd object" that can be used as an event
wait/notify mechanism by user-space applications, and by the kernel
to notify user-space applications of events. The object contains an
unsigned 64-bit integer (uint64_t) counter that is maintained by the
kernel. This counter is initialized with the value specified in the
argument initval.
...
The returned file descriptor supports poll(2) (and analogously
epoll(7)) and select(2), as follows:
The file descriptor is readable (the select(2) readfds
argument; the poll(2) POLLIN flag) if the counter has a
value greater than 0.
The file descriptor is writable (the select(2) writefds
argument; the poll(2) POLLOUT flag) if it is possible to
write a value of at least "1" without blocking.
You change the counter by writing to the descriptor.
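A minimal sketch of how this could back a user-defined "pipe" that works with poll(), assuming Linux. The helper names my_pipe_signal(), my_pipe_drain() and poll_once() are made up for illustration; only eventfd(), poll(), read() and write() are real APIs.

```c
#include <assert.h>
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Poll one descriptor without blocking and return its revents. */
short poll_once(int fd)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN | POLLOUT };
    int ret = poll(&pfd, 1, 0);
    assert(ret >= 0);
    return pfd.revents;
}

/* Hypothetical helper: tell pollers that the custom pipe has data.
 * Writing adds 1 to the eventfd counter, which makes the fd readable. */
void my_pipe_signal(int efd)
{
    uint64_t one = 1;
    ssize_t n = write(efd, &one, sizeof one);
    assert(n == (ssize_t) sizeof one);
}

/* Hypothetical helper: consume the pending notifications.
 * Without EFD_SEMAPHORE, read() returns the counter and resets it to 0. */
uint64_t my_pipe_drain(int efd)
{
    uint64_t count = 0;
    ssize_t n = read(efd, &count, sizeof count);
    assert(n == (ssize_t) sizeof count);
    return count;
}
```

The descriptor returned by eventfd(0, 0) would go into the pollfd array next to the socket. The actual payload of the custom pipe has to live elsewhere (for example in a queue protected by a mutex); the eventfd only carries the readiness state.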
I'm trying to add a signal handler for proper cleanup to my event-driven application.
My signal handler for SIGINT only changes the value of a global flag variable, which is then checked in the main loop. To avoid races, the signal is blocked at all times, except during the pselect() call. This should cause pending signals to be delivered only during the pselect() call, which should be interrupted and fail with EINTR.
This usually works fine, except if there are already events pending on the monitored file descriptors (e.g. under heavy load, when there's always activity on the file descriptors).
This sample program reproduces the problem:
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/select.h>
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>
volatile sig_atomic_t stop_requested = 0;
void handle_signal(int sig)
{
    // Use write() and strlen() instead of printf(), which is not async-signal-safe
    const char * out = "Caught stop signal. Exiting.\n";
    size_t len = strlen (out);
    ssize_t writelen = write(STDOUT_FILENO, out, len);
    assert(writelen == (ssize_t) len);
    stop_requested = 1;
}
int main(void)
{
    int ret;
    // Install signal handler
    {
        struct sigaction sa;
        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = handle_signal;
        ret = sigaction(SIGINT, &sa, NULL);
        assert(ret == 0);
    }
    // Block SIGINT
    sigset_t old_sigmask;
    {
        sigset_t blocked;
        sigemptyset(&blocked);
        sigaddset(&blocked, SIGINT);
        ret = sigprocmask(SIG_BLOCK, &blocked, &old_sigmask);
        assert(ret == 0);
    }
    ret = raise(SIGINT);
    assert(ret == 0);
    // Create pipe and write data to it
    int pipefd[2];
    ret = pipe(pipefd);
    assert(ret == 0);
    ssize_t writelen = write(pipefd[1], "foo", 3);
    assert(writelen == 3);
    while (stop_requested == 0)
    {
        printf("Calling pselect().\n");
        fd_set fds;
        FD_ZERO(&fds);
        FD_SET(pipefd[0], &fds);
        struct timespec * timeout = NULL;
        int ret = pselect(pipefd[0] + 1, &fds, NULL, NULL, timeout, &old_sigmask);
        assert(ret >= 0 || errno == EINTR);
        printf("pselect() returned %d.\n", ret);
        if (FD_ISSET(pipefd[0], &fds))
            printf("pipe is readable.\n");
        sleep(1);
    }
    printf("Event loop terminated.\n");
}
This program installs a handler for SIGINT, then blocks SIGINT, sends SIGINT to itself (which will not be delivered yet because SIGINT is blocked), creates a pipe and writes some data into the pipe, and then monitors the read end of the pipe for readability.
This readability monitoring is done using pselect(), which is supposed to unblock SIGINT, which should then interrupt the pselect() and call the signal handler.
However, on Linux (I tested on 5.6 and 4.19), the pselect() call returns 1 instead and indicates readability of the pipe, without calling the signal handler. Since this test program does not read the data that was written to the pipe, the file descriptor will never cease to be readable, and the signal handler is never called. In real programs, a similar situation might arise under heavy load, where a lot of data might be available for reading on different file descriptors (e.g. sockets).
On the other hand, on FreeBSD (I tested on 12.1), the signal handler is called, and then pselect() returns -1 and sets errno to EINTR. This is what I expected to happen on Linux as well.
Am I misunderstanding something, or am I using these interfaces incorrectly? Or should I just fall back to the old self-pipe trick, which (I believe) would handle this case better?
This is a type of resource starvation caused by always checking for active resources in the same order. When resources are always checked in the same order, if the resources checked first are busy enough the resources checked later may never get any attention.
See "What is starvation?".
The Linux implementation of pselect() apparently checks file descriptors before checking for signals. The BSD implementation does the opposite.
For what it's worth, the POSIX documentation for pselect() states:
If none of the selected descriptors are ready for the requested operation, the pselect() or select() function shall block until at least one of the requested operations becomes ready, until the timeout occurs, or until interrupted by a signal.
A strict reading of that description requires checking the descriptors first. If any descriptor is active, pselect() will return that instead of failing with errno set to EINTR.
In that case, if the descriptors are so busy that one is always active, the signal processing gets starved.
The BSD implementation likely starves active descriptors if signals come in too fast.
One common solution is to always process all active resources every time a select() call or similar returns. But you can't do that with your current design, which mixes signals with descriptors, because pselect() doesn't even get to checking for a pending signal if there are active descriptors. As @Shawn mentioned in the comments, you can map signals to file descriptors using signalfd(). Then add the descriptor from signalfd() to the file descriptor set passed to pselect().
So, according to the manual, pselect() can take a timeout parameter, and it will wait if no file descriptors are changing. It can also be interrupted by a signal:
sigemptyset(&emptyset); /* Signal mask to use during pselect() */
res = pselect(0, NULL, NULL, NULL, NULL, &emptyset);
if (errno == EINTR) printf("Interrupted by signal\n");
However, it is not obvious from the manual which signals are able to interrupt pselect().
If I have threads (producers and consumers), and each (consumer)thread is using pselect, is there a way to interrupt only one (consumer)thread from another(producer) thread?
I think the issue is analyzed in https://lwn.net/Articles/176911/:
For this reason, the POSIX.1g committee devised an enhanced version of
select(), called pselect(). The major difference between select() and
pselect() is that the latter call has a signal mask (sigset_t) as an
additional argument:
int pselect(int n, fd_set *readfds, fd_set *writefds, fd_set *exceptfds,
const struct timespec *timeout, const sigset_t *sigmask);
pselect() uses the sigmask argument to configure which signals can interrupt it.
The collection of signals that are currently blocked is called the
signal mask. Each process has its own signal mask. When you create a
new process (see Creating a Process), it inherits its parent’s mask.
You can block or unblock signals with total flexibility by modifying
the signal mask.
source : https://www.gnu.org/software/libc/manual/html_node/Process-Signal-Mask.html
https://linux.die.net/man/2/pselect
https://www.linuxprogrammingblog.com/code-examples/using-pselect-to-avoid-a-signal-race
As for your second question, there are many algorithms for process synchronization; see e.g. https://www.geeksforgeeks.org/introduction-of-process-synchronization/ and the links at the bottom of that page, or https://en.wikipedia.org/wiki/Sleeping_barber_problem and its associated pages. Signals are only one of several IPC mechanisms on Linux; cf. "IPC using Signals on linux".
(Ignoring the signal-related part of the question, and only answering
"If I have threads (producers and consumers), and each (consumer)thread
is using pselect, is there a way to interrupt only one
(consumer)thread from another(producer) thread?"
since the title does not imply the use of signals.)
The easiest way I know is for each thread to expose a file descriptor that is always included in the descriptors monitored by select()/pselect(), so the call always monitors at least one. When another thread writes to that descriptor, the select()/pselect() call returns:
struct thread {
    pthread_t tid;
    int wake;
    ...
};

void *thread_cb(void *t) {
    struct thread *me = t;
    // Use the typed pointer: t is a void * and cannot be dereferenced.
    me->wake = eventfd(0, 0);
    ...
    fd_set readfds;
    FD_ZERO(&readfds);
    // Populate readfds;
    FD_SET(me->wake, &readfds);
    select(...);
}

void interrupt_thread(struct thread *t) {
    eventfd_write(t->wake, 1);
}
If no eventfd is available, you can replace it with a classic (and more verbose) pipe, or other similar communication mechanism.
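A sketch of the pipe variant, under the assumption that each thread owns one such object. The struct and function names (waker, waker_init, waker_wake, waker_wait) are invented for this example. Both ends are non-blocking so that waking never stalls on a full pipe and draining never stalls on an empty one.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/select.h>
#include <unistd.h>

struct waker {
    int rfd;   /* read end: goes into the select() set of the sleeping thread */
    int wfd;   /* write end: used by other threads to wake it */
};

int waker_init(struct waker *w)
{
    int p[2];
    if (pipe(p) != 0)
        return -1;
    int ret = fcntl(p[0], F_SETFL, O_NONBLOCK);
    assert(ret == 0);
    ret = fcntl(p[1], F_SETFL, O_NONBLOCK);
    assert(ret == 0);
    w->rfd = p[0];
    w->wfd = p[1];
    return 0;
}

/* Called from any other thread. A failed write with EAGAIN means the pipe
 * is full, in which case a wake-up is already pending anyway. */
void waker_wake(struct waker *w)
{
    char byte = 1;
    ssize_t n = write(w->wfd, &byte, 1);
    (void) n;
}

/* Called by the owning thread. A real loop would add its other descriptors
 * to readfds as well. Returns 1 if the thread was woken through the pipe. */
int waker_wait(struct waker *w)
{
    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(w->rfd, &readfds);
    int ret = select(w->rfd + 1, &readfds, NULL, NULL, NULL);
    if (ret <= 0 || !FD_ISSET(w->rfd, &readfds))
        return 0;
    char buf[64];
    while (read(w->rfd, buf, sizeof buf) > 0)
        ;   /* drain every queued wake-up byte */
    return 1;
}
```

Compared with eventfd this costs two descriptors per thread instead of one, but it only relies on POSIX calls.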
I am trying to understand how epoll() is different from select() and poll(). select() and poll() are pretty similar. select() allows you to monitor multiple file descriptors and it checks if any of those file descriptors are available for an operation (e.g. read, write) without blocking. When the timeout expires, select() returns the file descriptors that are ready and the program can perform the operations on those file descriptors without blocking.
...
FD_ZERO(&rfds);
FD_SET(0, &rfds);
/* Wait up to five seconds. */
tv.tv_sec = 5;
tv.tv_usec = 0;
retval = select(1, &rfds, NULL, NULL, &tv);
/* Don’t rely on the value of tv now! */
if (retval == -1)
    perror("select()");
else if (retval)
    printf("Data is available now.\n");
    /* FD_ISSET(0, &rfds) will be true. */
else
    printf("No data within five seconds.\n");
...
poll() is a little more flexible in that it relies not on a bitmap but on an array of file descriptors. Also, since poll() uses separate fields for the requested events (events) and the results (revents), you don't have to worry about refilling sets that were overwritten by the kernel.
...
struct pollfd fds[2];
fds[0].fd = open("/dev/dev0", ...);
fds[1].fd = open("/dev/dev1", ...);
fds[0].events = POLLOUT | POLLWRBAND;
fds[1].events = POLLOUT | POLLWRBAND;
ret = poll(fds, 2, timeout_msecs);
if (ret > 0) {
    for (i = 0; i < 2; i++) {
        if (fds[i].revents & POLLWRBAND) {
            ...
However, I read that there is an issue with poll() too since both select() and poll() are stateless; the kernel does not internally maintain the requested sets. I read this:
Suppose that there are 10,000 concurrent connections. Typically, only
a small number of file descriptors among them, say 10, are ready to
read. The rest 9,990 file descriptors are copied and scanned for no
reason, for every select()/poll() call. As mentioned earlier, this
problem comes from the fact that those select()/poll() interfaces are
stateless.
I don't understand what is meant by the file descriptors being "copied" and "scanned". Copied where? And I don't know what is meant by "stateless". Thanks for any clarification.
"Stateless" means "Does not retain anything between two calls". So kernel need to rebuild many things for mainly nothing in the mentioned example.
When I create a socket on Linux, it's possible to set the close-on-exec flag at creation time by passing SOCK_CLOEXEC:
auto fd = socket(AF_INET, SOCK_STREAM|SOCK_CLOEXEC, 0);
So there is no way that some other thread can do fork() + exec() with this socket remaining open.
But on Mac, according to the manual, I can't do the same trick:
The socket has the indicated type, which specifies the semantics of communication.
Currently defined types are:
SOCK_STREAM
SOCK_DGRAM
SOCK_RAW
SOCK_SEQPACKET
SOCK_RDM
There, the close-on-exec flag (FD_CLOEXEC) can only be set afterwards with a call to fcntl(). That is not atomic: some other thread may do fork() + exec() between the calls to socket() and fcntl().
How can I resolve this issue?
Assuming that your code uses pthreads, consider using pthread_atfork(), which lets you register callbacks that run around fork(). This extends the idea from Alk's post, but implements the wait in the fork callbacks.
From the question, it looks like you control the code that creates the sockets (on Linux it would use SOCK_CLOEXEC), but cannot control or modify the code that forks.
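#include <fcntl.h>
#include <pthread.h>
#include <sys/socket.h>

static pthread_mutex_t socket_lock = PTHREAD_MUTEX_INITIALIZER;

void set_socket_lock(void)
{
    pthread_mutex_lock(&socket_lock);
}

void release_socket_lock(void)
{
    pthread_mutex_unlock(&socket_lock);
}

int main(void)
{
    ...
    // Establish the atfork handlers before starting any thread:
    // fork() takes the lock first and releases it in both parent and child.
    pthread_atfork(set_socket_lock, release_socket_lock, release_socket_lock);
    pthread_create(...);
    ... rest of the code ...
}

int socket_wrapper(...)
{
    set_socket_lock();
    int sd = socket(...);
    // The descriptor flag is FD_CLOEXEC and is set with F_SETFD.
    if (sd >= 0)
        fcntl(sd, F_SETFD, FD_CLOEXEC);
    release_socket_lock();
    return sd;
}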
As an added bonus, the mutex above also covers the case where fork() is called and, while the fork is executing, one of the threads in the parent/child attempts to create a socket. I am not sure this is even possible, as the kernel might suspend all threads before starting any of the fork activity.
Disclaimer: I did not try to run the code, as I do not have access to Mac machines.
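For reference, a compilable variant of the same idea with the elided parts filled in by assumption. The names install_fork_guard() and socket_cloexec() are hypothetical. It relies on fork() running the pthread_atfork() handlers, so it would not protect against code that creates processes through vfork(), which does not run them.

```c
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <sys/socket.h>
#include <unistd.h>

static pthread_mutex_t socket_lock = PTHREAD_MUTEX_INITIALIZER;

static void lock_sockets(void)
{
    int ret = pthread_mutex_lock(&socket_lock);
    assert(ret == 0);
    (void) ret;
}

static void unlock_sockets(void)
{
    int ret = pthread_mutex_unlock(&socket_lock);
    assert(ret == 0);
    (void) ret;
}

/* Call once, before any thread is started. fork() then waits until no
 * socket is between socket() and fcntl(), and releases the lock again in
 * both the parent and the child. */
int install_fork_guard(void)
{
    return pthread_atfork(lock_sockets, unlock_sockets, unlock_sockets);
}

/* Create a socket that already has FD_CLOEXEC set by the time any fork()
 * in this process can proceed. */
int socket_cloexec(int domain, int type, int protocol)
{
    lock_sockets();
    int sd = socket(domain, type, protocol);
    if (sd >= 0 && fcntl(sd, F_SETFD, FD_CLOEXEC) == -1) {
        close(sd);
        sd = -1;
    }
    unlock_sockets();
    return sd;
}
```

The same wrapper pattern would be needed for accept() and any other call that returns a new descriptor without an atomic close-on-exec option.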
struct siginfo {
int si_signo; /* signal number */
int si_errno; /* if nonzero, errno value from <errno.h> */
int si_code; /* additional info (depends on signal) */
pid_t si_pid; /* sending process ID */
uid_t si_uid; /* sending process real user ID */
void *si_addr; /* address that caused the fault */
int si_status; /* exit value or signal number */
long si_band; /* band number for SIGPOLL */
/* possibly other fields also */
};
I do not understand si_band.
If you look at the Linux manpage for sigaction, you'll see that:
SIGPOLL/SIGIO fills in si_band and si_fd. The si_band event is a bit mask
containing the same values as are filled in the revents field by poll(2).
The si_fd field indicates the file descriptor for which the I/O event
occurred.
The explanation for what that bitmask means can be found in the linked man page - essentially, it tells the signal handler what type of event triggered the signal (and in Linux at least, you also get the corresponding file descriptor.)
I'm not sure how portable this is. si_band seems to be in POSIX, but not si_fd. Reference: POSIX <signal.h>, POSIX poll(2)
A process can ask for SIGPOLL signals in order to implement asynchronous I/O. From the man page of sigaction:
SIGPOLL/SIGIO fills in si_band and si_fd. The si_band event is a
bit mask containing the same values as are filled in the revents field
by poll(2).
revents describes the types of events that happened and led to SIGPOLL being sent. The man page of poll() describes it in detail:
The field revents is an output parameter, filled by the kernel with
the events that actually occurred. The bits returned in revents can
include:
POLLIN There is data to read.
POLLPRI
There is urgent data to read (e.g., out-of-band data on TCP
socket; pseudoterminal master in packet mode has seen state
change in slave).
POLLOUT
Writing now will not block.
POLLRDHUP (since Linux 2.6.17)
Stream socket peer closed connection, or shut down writing half
of connection. The _GNU_SOURCE feature test macro must be
defined (before including any header files) in order to obtain
this definition.
POLLERR
Error condition (output only).
POLLHUP
Hang up (output only).
POLLNVAL
Invalid request: fd not open (output only).
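A Linux-only sketch of how si_band can be observed in practice. It assumes the descriptor supports O_ASYNC (sockets, pipes, terminals) and that the chosen signal is blocked by the caller so it can be collected synchronously with sigtimedwait(). The helper names are made up; F_SETOWN, F_SETSIG and O_ASYNC are the real fcntl() interfaces, and F_SETSIG requires _GNU_SOURCE.

```c
#define _GNU_SOURCE            /* for F_SETSIG */
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <time.h>
#include <unistd.h>

/* Ask the kernel to send `sig` to this process when I/O becomes possible
 * on fd. Selecting a signal with F_SETSIG is what makes the kernel fill in
 * si_band and si_fd. */
int request_io_signal(int fd, int sig)
{
    if (fcntl(fd, F_SETOWN, getpid()) == -1)
        return -1;
    if (fcntl(fd, F_SETSIG, sig) == -1)
        return -1;
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_ASYNC);
}

/* Wait up to one second for `sig`, which the caller must have blocked.
 * Returns 0 and fills *info on success, -1 otherwise. */
int wait_io_signal(int sig, siginfo_t *info)
{
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, sig);
    struct timespec timeout = { .tv_sec = 1, .tv_nsec = 0 };
    return sigtimedwait(&set, info, &timeout) == sig ? 0 : -1;
}

/* si_band uses the same bits as revents, so it is tested the same way. */
int is_readable_event(const siginfo_t *info)
{
    return (info->si_band & POLLIN) != 0;
}
```

A handler installed with SA_SIGINFO would receive the same siginfo_t as its second argument and could inspect si_band and si_fd there instead.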