So, according to manual, pselect can have a timeout parameter and it will wait if no file-descriptors are changing. Also, it has an option to be interrupted by a signal:
sigemptyset(&emptyset); /* Signal mask to use during pselect() */
res = pselect(0, NULL, NULL, NULL, NULL, &emptyset);
if (errno == EINTR) printf("Interrupted by signal\n");
It is however not obvious from the manual which signals are able to interrupt pselect?
If I have threads (producers and consumers), and each (consumer)thread is using pselect, is there a way to interrupt only one (consumer)thread from another(producer) thread?
i think the issue is analyzed in https://lwn.net/Articles/176911/
For this reason, the POSIX.1g committee devised an enhanced version of
select(), called pselect(). The major difference between select() and
pselect() is that the latter call has a signal mask (sigset_t) as an
additional argument:
int pselect(int n, fd_set *readfds, fd_set *writefds, fd_set *exceptfds,
const struct timespec *timeout, const sigset_t *sigmask);
pselect uses the sigmask argument to configure which signals can interrupt it
The collection of signals that are currently blocked is called the
signal mask. Each process has its own signal mask. When you create a
new process (see Creating a Process), it inherits its parent’s mask.
You can block or unblock signals with total flexibility by modifying
the signal mask.
source : https://www.gnu.org/software/libc/manual/html_node/Process-Signal-Mask.html
https://linux.die.net/man/2/pselect
https://www.linuxprogrammingblog.com/code-examples/using-pselect-to-avoid-a-signal-race
Because of your second questions there are multiple algorithms for process synchronization see i.e. https://www.geeksforgeeks.org/introduction-of-process-synchronization/ and the links down on this page or https://en.wikipedia.org/wiki/Sleeping_barber_problem and associated pages. So basically signals are only one path for IPC in linux, cf IPC using Signals on linux
(Ignoring all the signal's part of the question, and only answering to
If I have threads (producers and consumers), and each (consumer)thread
is using pselect, is there a way to interrupt only one
(consumer)thread from another(producer) thread?"
, since the title does not imply the use of signals).
The easiest way I know is for the thread to expose a file descriptor that will always included in the p/select monitored descriptors, so it always monitor at least one. If other thread writes to that, the p/select call will return:
struct thread {
pthread_t tid;
int wake;
...
}
void *thread_cb(void *t) {
struct thread *me = t;
t->wake = eventfd(0, 0);
...
fd_set readfds;
// Populate readfds;
FD_SET(t->wake, &readfds);
select(...);
}
void interrupt_thread(struct thread *t) {
eventfd_write(t->wake, 1);
}
If no eventfd is available, you can replace it with a classic (and more verbose) pipe, or other similar communication mechanism.
Related
I'm trying to add a signal handler for proper cleanup to my event-driven application.
My signal handler for SIGINT only changes the value of a global flag variable, which is then checked in the main loop. To avoid races, the signal is blocked at all times, except during the pselect() call. This should cause pending signals to be delivered only during the pselect() call, which should be interrupted and fail with EINTR.
This usually works fine, except if there are already events pending on the monitored file descriptors (e.g. under heavy load, when there's always activity on the file descriptors).
This sample program reproduces the problem:
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/select.h>
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>
volatile sig_atomic_t stop_requested = 0;
void handle_signal(int sig)
{
// Use write() and strlen() instead of printf(), which is not async-signal-safe
const char * out = "Caught stop signal. Exiting.\n";
size_t len = strlen (out);
ssize_t writelen = write(STDOUT_FILENO, out, len);
assert(writelen == (ssize_t) len);
stop_requested = 1;
}
int main(void)
{
int ret;
// Install signal handler
{
struct sigaction sa;
memset(&sa, 0, sizeof(sa));
sa.sa_handler = handle_signal;
ret = sigaction(SIGINT, &sa, NULL);
assert(ret == 0);
}
// Block SIGINT
sigset_t old_sigmask;
{
sigset_t blocked;
sigemptyset(&blocked);
sigaddset(&blocked, SIGINT);
ret = sigprocmask(SIG_BLOCK, &blocked, &old_sigmask);
assert(ret == 0);
}
ret = raise(SIGINT);
assert(ret == 0);
// Create pipe and write data to it
int pipefd[2];
ret = pipe(pipefd);
assert(ret == 0);
ssize_t writelen = write(pipefd[1], "foo", 3);
assert(writelen == 3);
while (stop_requested == 0)
{
printf("Calling pselect().\n");
fd_set fds;
FD_ZERO(&fds);
FD_SET(pipefd[0], &fds);
struct timespec * timeout = NULL;
int ret = pselect(pipefd[0] + 1, &fds, NULL, NULL, timeout, &old_sigmask);
assert(ret >= 0 || errno == EINTR);
printf("pselect() returned %d.\n", ret);
if (FD_ISSET(pipefd[0], &fds))
printf("pipe is readable.\n");
sleep(1);
}
printf("Event loop terminated.\n");
}
This program installs a handler for SIGINT, then blocks SIGINT, sends SIGINT to itself (which will not be delivered yet because SIGINT is blocked), creates a pipe and writes some data into the pipe, and then monitors the read end of the pipe for readability.
This readability monitoring is done using pselect(), which is supposed to unblock SIGINT, which should then interrupt the pselect() and call the signal handler.
However, on Linux (I tested on 5.6 and 4.19), the pselect() call returns 1 instead and indicates readability of the pipe, without calling the signal handler. Since this test program does not read the data that was written to the pipe, the file descriptor will never cease to be readable, and the signal handler is never called. In real programs, a similar situation might arise under heavy load, where a lot of data might be available for reading on different file descriptors (e.g. sockets).
On the other hand, on FreeBSD (I tested on 12.1), the signal handler is called, and then pselect() returns -1 and sets errno to EINTR. This is what I expected to happen on Linux as well.
Am I misunderstanding something, or am I using these interfaces incorrectly? Or should I just fall back to the old self-pipe trick, which (I believe) would handle this case better?
This is a type of resource starvation caused by always checking for active resources in the same order. When resources are always checked in the same order, if the resources checked first are busy enough the resources checked later may never get any attention.
See What is starvation?.
The Linux implementation of pselect() apparently checks file descriptors before checking for signals. The BSD implementation does the opposite.
For what it's worth, the POSIX documentation for pselect() states:
If none of the selected descriptors are ready for the requested operation, the pselect() or select() function shall block until at least one of the requested operations becomes ready, until the timeout occurs, or until interrupted by a signal.
A strict reading of that description requires checking the descriptors first. If any descriptor is active, pselect() will return that instead of failing with errno set to EINTR.
In that case, if the descriptors are so busy that one is always active, the signal processing gets starved.
The BSD implementation likely starves active descriptors if signals come in too fast.
One common solution is to always process all active resources every time a select() call or similar returns. But you can't do that with your current design that mixes signals with descriptors because pselect() doesn't even get to checking for a pending signal if there are active descriptors. As #Shawn mentioned in the comments, you can map signals to file descriptors using signalfd(). Then add the descriptor from signalfd() to the file descriptor set passed to pselect().
I'm writing my own echo server using sockets and syscalls. I am using epoll to work with many different clients at the same time and all the operations done with clients are nonblocking. When the server is on and doing nothing, it is in epoll_wait. Now I want to add the possibility to shut the server down using signals. For example, I start the server in bash terminal, then I press ctrl-c and the server somehow handles SIGINT. My plan is to use signalfd. I create new signalfd and add it to epoll instance with the following code:
sigset_t mask;
sigemptyset(&mask);
sigaddset(&mask, SIGTERM);
sigaddset(&mask, SIGINT);
signal_fd = signalfd(-1, &mask, 0);
epoll_event event;
event.data.fd = signal_fd;
event.events = EPOLLIN;
epoll_ctl(fd, EPOLL_CTL_ADD, signal_fd, &event);
Then I expect, that when epoll is waiting and I press ctrl-c, event on epoll happens, it wakes up and then I handle the signal with the following code:
if (events[i].data.fd == signal_fd)
{
//do something
exit(0);
}
Though in reality the server just stops without handling the signal. What am I doing wrong, what is the correct way to solve my problem? And if I'm not understanding signals correctly, what is the place, where the one should use signalfd?
epoll_wait returns -1 and errno == EINTR when it is interrupted by a signal. In this case you need to read from signal_fd.
Set the signal handler for your signals to SIG_IGN, otherwise signals may terminate your application.
See man signal 7:
The following interfaces are never restarted after being interrupted by
a signal handler, regardless of the use of SA_RESTART; they always fail
with the error EINTR when interrupted by a signal handler:
File descriptor multiplexing interfaces: epoll_wait(2),
epoll_pwait(2), poll(2), ppoll(2), select(2), and pselect(2).
Though in reality the server just stops without handling the signal. What am I doing wrong, what is the correct way to solve my problem? And if I'm not understanding signals correctly, what is the place, where one should use signalfd?
Signal handlers are per process. You left the signal handler at the default, which is to terminate the processes.
So you need to add something like this,
struct sigaction action;
std::memset(&action, 0, sizeof(struct sigaction));
action.sa_handler = your_handler;
sigaction(signum, &action, NULL);
for each signum that you want your application to receive interrupts for. Also handle the return value of sigaction. My experience is that if you use SIG_IGN as handler than you still interrupt a system call like epoll_pwait from the "outside", but it won't work when you try to wake up the thread from the program itself by sending the signal directly to that thread using pthread_kill.
Next you need to mask all signals from every thread, so that by default no thread will receive it (otherwise a random thread is woken up to handle the signal). The easiest way to do that is by doing it in main before creating any thread.
For example,
sigset_t all_signals;
sigemptyset(&all_signals);
sigaddset(&all_signals, signum); // Repeat for each signum that you use.
sigprocmask(SIG_BLOCK, &all_signals, NULL);
And then unblock the signals per thread when you want that thread to receive the signal.
If you use signalfd, then you do not want to unblock them - that system call unblocks the signals itself, just pass the appropriate mask (set bits for signalfd (it uses the passed mask to unblock). See also the man page of signalfd).
epoll_pwait works differently; like pselect you unblock the signal that you are interested in. You set a handler for that signal (see above) that sets a flag. Then just before calling epoll_pwait you block the signal, then test the flag and handle it, and then call epoll_pwait without first unblocking the signal. After epoll_wait returns you can unblock the signal again so that your handler can be called again.
You have to block all the signals you want to handle with your signal-FD before you create that signal-FD. Otherwise, those signals still interrupt blocked system calls such as epoll_wait() - as you observed.
See also the signalfd(2) man page:
Normally, the set of signals to be received via the file descriptor
should be blocked using sigprocmask(2), to prevent the signals being
handled according to their default dispositions.
Thus, you have to change your example like this:
sigset_t mask;
sigemptyset(&mask);
sigaddset(&mask, SIGTERM);
sigaddset(&mask, SIGINT);
int r = sigprocmask(SIG_BLOCK, &mask, 0);
if (r == -1) {
// XXX handle errors
}
signal_fd = signalfd(-1, &mask, 0);
if (signal_fd == -1) {
// XXX handle errors
}
epoll_event event;
event.data.fd = signal_fd;
event.events = EPOLLIN;
r = epoll_ctl(fd, EPOLL_CTL_ADD, signal_fd, &event);
if (r == -1) {
// XXX handle errors
}
How can I register a signal handler for ALL signal, available on the running OS, using signal(3)?
My code looks like this:
void sig_handler(int signum)
{
printf("Received signal %d\n", signum);
}
int main()
{
signal(ALL_SIGNALS_??, sig_handler);
while (1) {
sleep(1);
};
return 0;
}
Most systems have a macro NSIG or _NSIG (the former would not be available in standards-conformance mode since it violates the namespace) defined in signal.h such that a loop for (i=1; i<_NSIG; i++) will walk all signals. Also, on POSIX systems that have signal masks, CHAR_BIT*sizeof(sigset_t) is an upper bound on the number of signals which you could use as a fallback if neither NSIG nor _NSIG is defined.
Signal handlers have to deal with reentrancy concerns and other problems. In practice, it's often more convenient to mask signals and then retrieve them from time to time. You can mask all signals (except SIGSTOP and SIGKILL, which you can't handle anyway) with this:
sigset_t all_signals;
sigfillset(&all_signals);
sigprocmask(SIG_BLOCK, &all_signals, NULL);
The code is slightly different if you're using pthreads. Call this in every thread, or (preferably) in the main thread before you create any others:
sigset_t all_signals;
sigfillset(&all_signals);
pthread_sigmask(SIG_BLOCK, &all_signals, NULL);
Once you've done that, you should periodically call sigtimedwait(2) like this:
struct timespec no_time = {0, 0};
siginfo_t result;
int rc = sigtimedwait(&all_signals, &result, &no_time);
If there is a signal pending, information about it will be placed in result and rc will be the signal number; if not, rc will be -1 and errno will be EAGAIN. If you're already calling select(2)/poll(2) (e.g. as part of some event-driven system), you may want to create a signalfd(2) instead and attach it to your event loop. In this case, you still need to mask the signals as shown above.
So this is the problem im having right now, I've created 2 different programs (1 will be managing the other, while the other will be executed multiple times). The programs will be communicating back and forth via signals. My question is, is it possible (and how) to get the process id of the program sending the signal.
My programs use signal() to catch signals and kill() to send them.
Although signal() is in standard C library, this function is not portable, its behavior depending on the system. Better use sigaction() which is POSIX.1.
Here is an example of how to use sigaction with a handler void h(int sig) :
int mysignal (int sig, void (*h)(int), int options)
{
int r;
struct sigaction s;
s.sa_handler = h;
sigemptyset (&s.sa_mask);
s.sa_flags = options;
r = sigaction (sig, &s, NULL);
if (r < 0) perror (__func__);
return r;
}
options are described in man sigaction. A good choice is options=SA_RESTART.
To know the PID of the process which sent a signal, set options=SA_SIGINFO, and use a sa_sigaction callback instead of sa_handler; it will receive a siginfo_t struct, having a si_pid field. You can associate a data to the signal using sigqueue.
Generally speaking, using signals is a bad idea to communicate in a safe manner (when n signals are sent, only the first will have a chance to be delivered; there is no hook to associate other datas; the available user signals are only two). Better use pipes, named pipes or sockets.
Don't use signal(), it's obsolete. If you have it, use sigaction() instead, it provides an interface to get the sender's process ID.
To get the process id of the process that is sending the signal you need to include in your options SA_SIGINFO. If you do so the interface to the sigaction is slightly different. Here is an example of the proper handler to use and how to set itup. (I include SA_RESTART as an option just because it is generally a good idea, but it is not necessary)
// example of a handler which checks the signalling pid
void handler(int sig, siginfo_t* info, void* vp) {
if (info->si_pid != getpid()) {
// not from me (or my call to alarm)
return;
}
// from me. let me know alarm when off
alarmWentOff = 1;
}
Here is my general code for setting up a handler:
typedef void InfoHandler(int, siginfo_t *, void *);
InfoHandler*
SignalWithInfo(int signum, InfoHandler* handler)
{
struct sigaction action, old_action;
memset(&action, 0, sizeof(struct sigaction));
action.sa_sigaction = handler;
sigemptyset(&action.sa_mask); /* block sigs of type being handled */
action.sa_flags = SA_RESTART|SA_SIGINFO; /* restart syscalls if possible */
if (sigaction(signum, &action, &old_action) < 0)
perror("Signal error");
return (old_action.sa_sigaction);
}
and finally, for this particular case:
SignalWithInfo(SIGALRM, handler);
It's changed now. The signal information shows not 1 pid, but easy digital number. That accords to the killall process. May be, it's possible, to enumerate the /proc/ directories, when you find the /proc/DIGIT, open the /proc/DIGIT/comm, read and close it. May be, acueried name will be "shutdown" or "reboot"
I discovered an issue with thread implementation, that is strange to me. Maybe some of you can explain it to me, would be great.
I am working on something like a proxy, a program (running on different machines) that receives packets over eth0 and sends it through ath0 (wireless) to another machine which is doing the exactly same thing. Actually I am not at all sure what is causing my problem, that's because I am new to everything, linux and c programming.
I start two threads,
one is listening (socket) on eth0 for incoming packets and sends it out through ath0 (also socket)
and the other thread is listening on ath0 and sends through eth0.
If I use threads, I get an error like that:
sh-2.05b# ./socketex
Failed to send network header packet.
: Interrupted system call
If I use fork(), the program works as expected.
Can someone explain that behaviour to me?
Just to show the sender implementation here comes its code snippet:
while(keep_going) {
memset(&buffer[0], '\0', sizeof(buffer));
recvlen = recvfrom(sockfd_in, buffer, BUFLEN, 0, (struct sockaddr *) &incoming, &ilen);
if(recvlen < 0) {
perror("something went wrong / incoming\n");
exit(-1);
}
strcpy(msg, buffer);
buflen = strlen(msg);
sentlen = ath_sendto(sfd, &btpinfo, &addrnwh, &nwh, buflen, msg, &selpv2, &depv);
if(sentlen == E_ERR) {
perror("Failed to send network header packet.\n");
exit(-1);
}
}
UPDATE: my main file, starting either threads or processes (fork)
int main(void) {
port_config pConfig;
memset(&pConfig, 0, sizeof(pConfig));
pConfig.inPort = 2002;
pConfig.outPort = 2003;
pid_t retval = fork();
if(retval == 0) {
// child process
pc2wsuThread((void *) &pConfig);
} else if (retval < 0) {
perror("fork not successful\n");
} else {
// parent process
wsu2pcThread((void *) &pConfig);
}
/*
wint8 rc1, rc2 = 0;
pthread_t pc2wsu;
pthread_t wsu2pc;
rc1 = pthread_create(&pc2wsu, NULL, pc2wsuThread, (void *) &pConfig);
rc2 = pthread_create(&wsu2pc, NULL, wsu2pcThread, (void *) &pConfig);
if(rc1) {
printf("error: pthread_create() is %d\n", rc1);
return(-1);
}
if(rc2) {
printf("error: pthread_create() is %d\n", rc2);
return(-1);
}
pthread_join(pc2wsu, NULL);
pthread_join(wsu2pc, NULL);
*/
return 0;
}
Does it help?
update 05/30/2011
-sh-2.05b# ./wsuproxy 192.168.1.100
mgmtsrvc
mgmtsrvc
Failed to send network header packet.
: Interrupted system call
13.254158,75.165482,DATAAAAAAmgmtsrvc
mgmtsrvc
mgmtsrvc
Still get the interrupted system call, as you can see above.
I blocked all signals as followed:
sigset_t signal_mask;
sigfillset(&signal_mask);
sigprocmask(SIG_BLOCK, &signal_mask, NULL);
The two threads are working on the same interfaces, but on different ports. The problem seems to appear still in the same place (please find it in the first code snippet). I can't go further and have not enough knowledge of how to solve that problem. Maybe some of you can help me here again.
Thanks in advance.
EINTR does not itself indicate an error. It means that your process received a signal while it was in the sendto syscall, and that syscall hadn't sent any data yet (that's important).
You could retry the send in this case, but a good thing would be to figure out what signal caused the interruption. If this is reproducible, try using strace.
If you're the one sending the signal, well, you know what to do :-)
Note that on linux, you can receive EINTR on sendto (and some other functions) even if you haven't installed a handler yourself. This can happen if:
the process is stopped (via SIGSTOP for example) and restarted (with SIGCONT)
you have set a send timeout on the socket (via SO_SNDTIMEO)
See the signal(7) man page (at the very bottom) for more details.
So if you're "suspending" your service (or something else is), that EINTR is expected and you should restart the call.
Keep in mind if you are using threads with signals that a given signal, when delivered to the process, could be delivered to any thread whose signal mask is not blocking the signal. That means if you have blocked incoming signals in one thread, and not in another, the non-blocking thread will receive the signal, and if there is no signal handler setup for the signal, you will end-up with the default behavior of that signal for the entire process (i.e., all the threads, both signal-blocking threads and non-signal-blocking threads). For instance, if the default behavior of a signal was to terminate a process, one thread catching that signal and executing it's default behavior will terminate the entire process, for all the threads, even though some threads may have been masking the signal. Also if you have two threads that are not blocking a signal, it is not deterministic which thread will handle the signal. Therefore it's typically the case that mixing signals and threads is not a good idea, but there are exceptions to the rule.
One thing you can try, is since the signal mask for a spawned thread is inherited from the generating thread, is to create a daemon thread for handling signals, where at the start of your program, you block all incoming signals (or at least all non-important signals), and then spawn your threads. Now those spawned threads will ignore any incoming signals in the parent-thread's blocked signal mask. If you need to handle some specific signals, you can still make those signals part of the blocked signal mask for the main process, and then spawn your threads. But when you're spawning the threads, leave one thread (could even be the main process thread after it's spawned all the worker threads) as a "daemon" thread waiting for those specific incoming (and now blocked) signals using sigwait(). That thread will then dispatch whatever functions are necessary when a given signal is received by the process. This will avoid signals from interrupting system calls in your other worker-threads, yet still allow you to handle signals.
The reason your forked version may not be having issues is because if a signal arrives at one parent process, it is not propagated to any child processes. So I would try, if you can, to see what signal it is that is terminating your system call, and in your threaded version, block that signal, and if you need to handle it, create a daemon-thread that will handle that signal's arrival, with the rest of the threads blocking that signal.
Finally, if you don't have access to any external libraries or debuggers, etc. to see what signals are arriving, you can setup a simple procedure for seeing what signals might be arriving. You can try this code:
#include <signal.h>
#include <stdio.h>
int main()
{
//block all incoming signals
sigset_t signal_mask;
sigfillset(&signal_mask);
sigprocmask(SIG_BLOCK, &signal_mask, NULL);
//... spawn your threads here ...
//... now wait for signals to arrive and see what comes in ...
int arrived_signal;
while(1) //you can change this condition to whatever to exit the loop
{
sigwait(&signal_mask, &arrived_signal);
switch(arrived_signal)
{
case SIGABRT: fprintf(stderr, "SIGABRT signal arrived\n"); break;
case SIGALRM: fprintf(stderr, "SIGALRM signal arrived\n"); break;
//continue for the rest of the signals defined in signal.h ...
default: fprintf(stderr, "Unrecognized signal arrived\n");
}
}
//clean-up your threads and anything else needing clean-up
return 0;
}