Unix programming - signal handler - c

I've run into some problems while trying to write a small shell in C.
The problem is the following: assume I have written a signal handler that, in this case, catches SIGCHLD. How can I notify my program that a signal has been caught?
The problem is easy if I were to use a global variable, but that's not really the way I want to go about it. So any suggestions/hints would be much appreciated!
This is how I solve it right now.
    volatile sig_atomic_t exit_status; /* <-- global variable */
    void sigchld_handler(int signal) {
        switch (signal) {
        case SIGCHLD:
            exit_status = 1; /* SIGCHLD was caught, notify program.. */
            break;
        default:
            fprintf(stderr, "Some signal caught\n"); /* not a signal of interest */
            break;
        }
    }
//Thanks

A standard solution is to use the unix self-pipe trick. The benefit is that the read end of the pipe can be used with select() or epoll() thus integrating nicely with event loops without having to periodically poll the value of an atomic variable.
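A minimal sketch of the self-pipe trick, assuming a single-threaded program that cares only about SIGCHLD. The names sig_pipe, selfpipe_init and selfpipe_wait are made up for this illustration; in a real shell the read end would simply be one more descriptor in the main select() set.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/select.h>
#include <unistd.h>

static int sig_pipe[2] = { -1, -1 };  /* [0] = read end, [1] = write end */

/* Handler: a single async-signal-safe write(), with errno preserved. */
static void on_sigchld(int signo)
{
    int saved_errno = errno;
    unsigned char byte = (unsigned char)signo;
    ssize_t n = write(sig_pipe[1], &byte, 1);
    (void)n;  /* EAGAIN on a full pipe is harmless: a wake-up is already queued */
    errno = saved_errno;
}

/* Create the non-blocking pipe and install the handler.
 * Returns 0 on success, -1 on error. */
int selfpipe_init(void)
{
    struct sigaction sa;
    int i;

    if (pipe(sig_pipe) == -1)
        return -1;
    for (i = 0; i < 2; i++) {
        int fl = fcntl(sig_pipe[i], F_GETFL);
        if (fl == -1 || fcntl(sig_pipe[i], F_SETFL, fl | O_NONBLOCK) == -1)
            return -1;
    }
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(SIGCHLD, &sa, NULL);
}

/* Wait up to timeout_ms for a notification.
 * Returns 1 if SIGCHLD was seen, 0 on timeout, -1 on error. */
int selfpipe_wait(int timeout_ms)
{
    char buf[64];
    int rc;

    do {
        fd_set rfds;
        struct timeval tv;

        FD_ZERO(&rfds);
        FD_SET(sig_pipe[0], &rfds);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        rc = select(sig_pipe[0] + 1, &rfds, NULL, NULL, &tv);
    } while (rc == -1 && errno == EINTR);  /* the byte is in the pipe by now */
    if (rc <= 0)
        return rc;
    while (read(sig_pipe[0], buf, sizeof buf) > 0)
        ;  /* drain: several signals may have coalesced */
    return 1;
}
```

After selfpipe_wait() returns 1, the main loop would typically call waitpid() with WNOHANG in a loop to reap every child that has exited.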

signal(7) (signal-safety(7) on newer Linux systems) contains a list of functions that are safe to execute in signal handlers; fprintf(3) isn't one of them. What happens if a child dies while your shell is printing the prompt or status messages? Corrupted data structures are the usual result. (This is fine for toys -- but I wouldn't want this in a shell.)
Setting global variables is quite typical for signal handlers. It's an easy way to signal the process's main event loop or main processing loop that something else needs to be done.

To be honest, this looks to me like a perfect case for a global variable. But if you don't want to do that, there are lots of alternatives: http://beej.us/guide/bgipc/output/html/multipage/index.html Pick the one from the list that best suits your architecture.

Related

SigHandler causing program to not terminate

Currently I am trying to create a signal handler that closes open network sockets and file descriptors when it receives a SIGTERM signal.
Here is my SigHandler function:
    static void SigHandler(int signo){
        if(signo == SIGTERM){
            log_trace("SIGTERM received - handling signal");
            CloseSockets();
            log_trace("SIGTERM received - All sockets closed");
            if (closeFile() == -1)
                log_trace("SIGTERM received - No File associated with XXX open - continuing with shutdown");
            else
                log_trace("SIGTERM received - Closed File Descriptor for XXX - continuing with shutdown");
            log_trace("Gracefully shutting down XXX Service");
        } else {
            log_trace("%d received - incompatible signal", signo);
            return;
        }
        exit(0);
    }
The code below sits in main:
    if (sigemptyset(&set) == SIGEMPTYSET_ERROR){
        log_error("Signal handling initialization failed");
    }
    else {
        if(sigaddset(&set, SIGTERM) == SIGADDSET_ERROR) {
            log_error("Signal SIGTERM not valid");
        }
        action.sa_flags = 0;
        action.sa_mask = set;
        action.sa_handler = &SigHandler;
        if (sigaction(SIGTERM, &action, NULL) == SIGACTION_ERROR) {
            log_error("SIGTERM handler initialization error");
        }
    }
When I send kill -15 PID, nothing happens. The process doesn't terminate, nor does it become a zombie process (not that it should anyway). I do see the traces printing within the SigHandler function however, so I know it is reaching that point in the code. It just seems that when it comes to exit(0), that doesn't work.
When I send SIGKILL (kill -9 PID) it kills the process just fine.
Apologies if this is vague, I'm still quite new to C and UNIX etc so I'm quite unfamiliar with most of how this works at a low level.
Your signal handler is conceptually wrong: it calls functions that are not async-signal-safe. Read signal(7) and signal-safety(7) carefully to understand why. Such a handler may appear to work most of the time and still be undefined behavior.
The usual trick is to set (in your signal handler) some volatile sig_atomic_t variable and test that variable outside of the signal handler.
Another possible trick is the pipe(7) to self trick (the Qt documentation explains it well), with your signal handler just doing a write(2) (which is async-signal-safe) to some global file descriptor obtained by e.g. pipe(2) (or perhaps the Linux specific eventfd(2)...) at program initialization before installing that signal handler.
A Linux specific way is to use signalfd(2) for SIGTERM and handle that in your own event loop (based upon poll(2)). That trick is conceptually a variant of the pipe to self one. But signalfd has some shortcomings, that a web search will find you easily.
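A rough, Linux-only sketch of the signalfd(2) variant, assuming a single-threaded program (in a threaded one the signal has to be blocked in every thread). The helper names open_sigterm_fd and wait_for_signal are invented for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <poll.h>
#include <signal.h>
#include <sys/signalfd.h>
#include <unistd.h>

/* Block SIGTERM and return a descriptor that becomes readable when it
 * arrives. Returns -1 on error. */
int open_sigterm_fd(void)
{
    sigset_t mask;

    sigemptyset(&mask);
    sigaddset(&mask, SIGTERM);
    /* The signal must be blocked, otherwise it is delivered the normal way. */
    if (sigprocmask(SIG_BLOCK, &mask, NULL) == -1)
        return -1;
    return signalfd(-1, &mask, SFD_CLOEXEC);
}

/* Wait up to timeout_ms on the descriptor.
 * Returns the signal number, 0 on timeout, -1 on error. */
int wait_for_signal(int sfd, int timeout_ms)
{
    struct pollfd pfd = { .fd = sfd, .events = POLLIN };
    struct signalfd_siginfo info;
    int rc = poll(&pfd, 1, timeout_ms);

    if (rc <= 0)
        return rc;
    if (read(sfd, &info, sizeof info) != (ssize_t)sizeof info)
        return -1;
    return (int)info.ssi_signo;
}
```

Since no handler runs at all, the sockets can be closed and the log messages written in ordinary code once wait_for_signal() reports SIGTERM.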
Signals are conceptually hard to use (some view them as a design mistake in Unix), especially in multi-threaded programs.
You might want to read the old ALP book. It has some good explanations related to your issue.
PS. If your system is QNX you should read its documentation.
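You should call _exit from the signal handler instead of exit. _exit is async-signal-safe and still closes all open file descriptors, although it does not flush stdio buffers or run atexit handlers.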
Also read (very carefully) Basile's answer and take a long hard look at the list of async safe functions which you are allowed to use in signal handlers.
His advice about just setting a flag and testing it in your code is the best approach if you need to do something that isn't allowed in a signal handler. Note that blocking POSIX calls can be interrupted by signals (they fail with errno set to EINTR, provided the handler was installed without SA_RESTART), so testing your atomic variable whenever a blocking call such as read returns an error is a reliable way to find out whether you have received a signal.
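Something along these lines, as a sketch only: the handler sets a flag, and a read wrapper checks it whenever the call fails with EINTR. The names got_sigterm, install_sigterm_flag and read_or_shutdown are hypothetical, and the -2 return code is just a convention picked for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <unistd.h>

static volatile sig_atomic_t got_sigterm = 0;

/* The handler does nothing except record that the signal arrived. */
static void on_sigterm(int signo)
{
    (void)signo;
    got_sigterm = 1;
}

/* Install the handler without SA_RESTART so that blocking calls
 * fail with EINTR instead of being restarted. */
int install_sigterm_flag(void)
{
    struct sigaction sa;

    sa.sa_handler = on_sigterm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGTERM, &sa, NULL);
}

/* Returns the number of bytes read (0 = EOF), -1 on a real error,
 * or -2 once SIGTERM has been seen. */
ssize_t read_or_shutdown(int fd, void *buf, size_t len)
{
    for (;;) {
        ssize_t n;

        if (got_sigterm)
            return -2;
        n = read(fd, buf, len);
        if (n >= 0)
            return n;
        if (errno != EINTR)
            return -1;
        /* EINTR: go round again; the flag test decides whether to retry */
    }
}
```

When read_or_shutdown() returns -2, the main code can close its sockets, log, and call exit() normally. A signal that lands between the flag test and the read() is only noticed when the read returns, so this narrows the window rather than closing it.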

Interrupt a function inside a main loop after a certain amount of time

I have a main loop, and I have a function that displays a menu with some choices and waits for user's input.
The point is that I also have to check whether a new event occurs (in my specific case, a new message is received, but this isn't relevant at all), and I can't wait for the user to take an action: I have to implement a timeout for that function.
Here is a simple example of what I'm talking about:
    int choice;
    for(;;){
        /* a new message could have arrived and we should read it now ... */
        choice = menu_function();
        /*
           ...but the user still hasn't taken an action,
           so menu_function() hasn't returned yet.
        */
        switch(choice){
        case 1:
            break;
        case 2:
            break;
        default:
            break;
        }
    }
So far I thought of using fork() before the menu_function() and kill() this process after it received a SIGALRM signal through alarm(), but I don't think this is the proper solution since it's inside a loop.
What kind of solution should I adopt?
P.S. I don't think this is a duplicate since, as I already said, the function interrupting request is inside a loop. Or at least for me, I think it's a different thing.
Since you mention fork I assume you're on a POSIX system (like Linux or macOS)?
That means you can install a signal handler for SIGALRM in the process doing the waiting (use sigaction() without SA_RESTART, otherwise the call may be restarted instead of interrupted). The reception of the signal should then interrupt the blocking operation (with errno == EINTR), which you can check for and have menu_function return a value meaning "exit". The code in your loop could then check for this value and break out of the loop.
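A sketch of the SIGALRM approach, assuming the menu input is read with read(2) and that menu choices are positive numbers. menu_read_choice, MENU_TIMEOUT and MENU_ERROR are illustrative names, not anything from the question's code.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stdlib.h>
#include <unistd.h>

#define MENU_TIMEOUT (-1)
#define MENU_ERROR   (-2)

static volatile sig_atomic_t alarm_fired = 0;

static void on_alarm(int signo)
{
    (void)signo;
    alarm_fired = 1;
}

/* Read one menu choice from fd, giving up after `seconds`.
 * Returns the number typed, MENU_TIMEOUT, or MENU_ERROR. */
int menu_read_choice(int fd, unsigned seconds)
{
    struct sigaction sa, old;
    char buf[32];
    ssize_t n;
    int saved_errno;

    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;                 /* no SA_RESTART: read() must fail with EINTR */
    if (sigaction(SIGALRM, &sa, &old) == -1)
        return MENU_ERROR;

    alarm_fired = 0;
    alarm(seconds);
    do {
        n = read(fd, buf, sizeof buf - 1);
    } while (n == -1 && errno == EINTR && !alarm_fired);  /* retry on other signals */
    saved_errno = errno;
    alarm(0);                        /* cancel a timer that has not fired yet */
    sigaction(SIGALRM, &old, NULL);  /* put the previous disposition back */

    if (n == -1)
        return (saved_errno == EINTR) ? MENU_TIMEOUT : MENU_ERROR;
    buf[n] = '\0';
    return atoi(buf);
}
```

One caveat: if the alarm happens to fire before read() has started (for example on a heavily loaded machine), the read blocks with no timer pending. The select-based approach does not have that weakness.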
Another alternative is to not use the standard C input functions unless there's actually something to read. You can do this by calling select with the desired timeout and watching STDIN_FILENO. If select returns with a timeout, again let menu_function return a special value meaning "exit"; otherwise use the standard C functions to read and parse the input.
No need to fork new processes for either the input-handler or the timer.
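For the select variant, a minimal sketch could look like this. read_choice_timed, CHOICE_TIMEOUT and CHOICE_ERROR are names invented here; the sketch reads with read(2) rather than stdio so that input already buffered by stdio cannot hide from select.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/select.h>
#include <unistd.h>

#define CHOICE_TIMEOUT (-1)
#define CHOICE_ERROR   (-2)

/* Wait up to timeout_ms for input on fd, then parse it as a number.
 * Returns the number typed, CHOICE_TIMEOUT, or CHOICE_ERROR. */
int read_choice_timed(int fd, int timeout_ms)
{
    fd_set rfds;
    struct timeval tv;
    char buf[32];
    ssize_t n;
    int rc;

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    rc = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (rc == 0)
        return CHOICE_TIMEOUT;       /* nothing typed yet */
    if (rc == -1)
        return CHOICE_ERROR;         /* includes EINTR; a caller may simply retry */

    n = read(fd, buf, sizeof buf - 1);   /* will not block: select said readable */
    if (n < 0)
        return CHOICE_ERROR;
    buf[n] = '\0';
    return atoi(buf);
}
```

In the loop from the question, the call would be something like read_choice_timed(STDIN_FILENO, 500): on CHOICE_TIMEOUT the loop checks for new messages and goes round again, otherwise it switches on the returned choice.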

Should a function that uses a global variable but exits still be avoided in signal handlers?

While studying Unix programming in C, I've learned that functions that are not reentrant should be avoided inside a signal handler, but what if I have something like:
    int main(int argc, char** argv){
        ...
        fileFd=open(...);
        signal(SIGUSR1, signalHandler);
        ...
    }
    void signalHandler(int signo){
        switch(signo){
        case SIGUSR1:
            myExit(EXIT_FAILURE);
            break;
        default:
            break;
        }
    }
Where myExit is
    void myExit(int ret){
        ...DO STUFF...
        close(fileFd);
        exit(ret);
    }
and fileFd is a global variable. If I remember correctly, that makes myExit non-reentrant... but is it still a problem to use it in a signal handler even though it causes the program to exit?
Thanks, any help is appreciated and sorry if it's a dumb question.
Strictly speaking, the only thing ISO C lets you do safely in a signal handler is set a volatile sig_atomic_t variable. Please do all your handling in the main loop of your program by checking (outside of the signal handler) whether a signal has been received. If you do have to do more in the handler, at least consider using _Exit() or _exit() rather than exit(). POSIX and individual C libraries guarantee that certain functions are async-signal-safe, but that guarantee does not carry over to every system.
It might become a problem depending on "...DO STUFF...". If the stuff done there must not be done twice (like freeing the same pointer), it could crash.
In your particular case, however, this will probably not be a problem if close(fileFd) is the only thing affecting global state and your file access API tolerates double-closing of files.
Still not safe if you use any async-unsafe functions. An asynchronous signal like USR1 can occur between any two instructions in your program, including in the middle of a critical section (locked section) in some library code.
So, for example, if the interruption happens in the middle of malloc, calling free in your signal handler (e.g. to clean up) will deadlock.
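If the program really must exit from inside the handler, a sketch restricted to async-signal-safe calls (write, close, _exit) might look like the following. file_fd stands in for the question's fileFd and install_exit_handler is a hypothetical helper; the descriptor is kept in a volatile sig_atomic_t, which assumes that type can hold a descriptor number (it is an int on common platforms).

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>
#include <unistd.h>

/* Stand-in for the question's global fileFd; -1 means "nothing open". */
static volatile sig_atomic_t file_fd = -1;

static void on_sigusr1(int signo)
{
    static const char msg[] = "SIGUSR1 received, exiting\n";
    ssize_t n;

    (void)signo;
    n = write(STDERR_FILENO, msg, sizeof msg - 1);  /* write() is safe, fprintf() is not */
    (void)n;
    if (file_fd != -1)
        close((int)file_fd);  /* safe, though _exit() closes descriptors anyway */
    _exit(EXIT_FAILURE);      /* unlike exit(): no atexit handlers, no stdio flush */
}

/* Remember the descriptor and install the handler.
 * Returns 0 on success, -1 on error. */
int install_exit_handler(int fd)
{
    struct sigaction sa;

    file_fd = fd;
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGUSR1, &sa, NULL);
}
```

Anything in "...DO STUFF..." that needs malloc, free, stdio or locks does not fit this pattern and belongs in the main loop, triggered by a flag.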

Linux select() vs ppoll() vs pselect()

In my application, there is an I/O thread that is dedicated to:
1. Wrapping data received from the application in a custom protocol
2. Sending the data + custom protocol packet over TCP/IP
3. Receiving the data + custom protocol packet over TCP/IP
4. Unwrapping the custom protocol and handing the data to the application.
The application processes the data on a different thread. Additionally, the requirements dictate that the unacknowledged window size should be 1, i.e. there should be only one pending unacknowledged message at any time. This implies that if the I/O thread has dispatched a message over the socket, it will not send any more messages until it hears an ack from the receiver.
The application's processing thread communicates with the I/O thread via a pipe. The application needs to shut down gracefully if someone types ctrl+C at the Linux CLI.
Thus, given these requirements, I have the following options:
1. Use ppoll() on the socket and pipe descriptors
2. Use select()
3. Use pselect()
I have the following questions:
1. The decision between select() and poll(). My application only deals with fewer than 50 file descriptors. Is it okay to assume there would be no difference whether I choose select or poll?
2. The decision between select() and pselect(). I read the Linux documentation, and it mentions a race condition between signals and select(). I don't have experience with signals, so can someone explain the race condition and select() more clearly? Does it have something to do with someone pressing ctrl+C at the CLI and the application not stopping?
3. The decision between pselect() and ppoll(). Any thoughts on one vs the other?
I'd suggest starting the comparison with select() vs poll(). Linux also provides both pselect() and ppoll(); and the extra const sigset_t * argument to pselect() and ppoll() (vs select() and poll()) has the same effect on each "p-variant", as it were. If you are not using signals, you have no race to protect against, so the base question is really about efficiency and ease of programming.
Meanwhile there's already a stackoverflow.com answer here: what are the differences between poll and select.
As for the race: once you start using signals (for whatever reason), you will learn that in general, a signal handler should just set a variable of type volatile sig_atomic_t to indicate that the signal has been detected. The fundamental reason for this is that many library calls are not re-entrant, and a signal can be delivered while you're "in the middle of" such a routine. For instance, simply printing a message to a stream-style data structure such as stdout (C) or cout (C++) can lead to re-entrancy issues.
Suppose you have code that uses a volatile sig_atomic_t flag variable, perhaps to catch SIGINT, something like this (see also http://pubs.opengroup.org/onlinepubs/007904975/functions/sigaction.html):
    volatile sig_atomic_t got_interrupted = 0;
    void caught_signal(int unused) {
        got_interrupted = 1;
    }
    ...
        struct sigaction sa;
        sa.sa_handler = caught_signal;
        sigemptyset(&sa.sa_mask);
        sa.sa_flags = SA_RESTART;
        if (sigaction(SIGINT, &sa, NULL) == -1) ... handle error ...
    ...
Now, in the main body of your code, you might want to "run until interrupted":
    while (!got_interrupted) {
        ... do some work ...
    }
This is fine up until you start needing to make calls that wait for some input/output, such as select or poll. The "wait" action needs to wait for that I/O—but it also needs to wait for a SIGINT interrupt. If you just write:
    while (!got_interrupted) {
        ... do some work ...
        result = select(...); /* or result = poll(...) */
    }
then it's possible that the interrupt will happen just before you call select() or poll(), rather than afterward. In this case, you did get interrupted—and the variable got_interrupted gets set—but after that, you start waiting. You should have checked the got_interrupted variable before you started waiting, not after.
You can try writing:
    while (!got_interrupted) {
        ... do some work ...
        if (!got_interrupted)
            result = select(...); /* or result = poll(...) */
    }
This shrinks the "race window", because now you'll detect the interrupt if it happens while you're in the "do some work" code; but there is still a race, because the interrupt can happen right after you test the variable, but right before the select-or-poll.
The solution is to make the "test, then wait" sequence "atomic", using the signal-blocking properties of sigprocmask (or, in POSIX threaded code, pthread_sigmask):
    sigset_t mask, omask;
    ...
    while (!got_interrupted) {
        ... do some work ...
        /* begin critical section, test got_interrupted atomically */
        sigemptyset(&mask);
        sigaddset(&mask, SIGINT);
        if (sigprocmask(SIG_BLOCK, &mask, &omask))
            ... handle error ...
        if (got_interrupted) {
            sigprocmask(SIG_SETMASK, &omask, NULL); /* restore old signal mask */
            break;
        }
        result = pselect(..., &omask); /* or ppoll() etc */
        sigprocmask(SIG_SETMASK, &omask, NULL);
        /* end critical section */
    }
(The above code is actually not that great; it's structured for illustration rather than efficiency. It's more efficient to do the signal mask manipulation slightly differently and to place the "got interrupted" tests differently.)
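One possible restructuring, offered as a sketch rather than as the layout the answer had in mind: block SIGINT once at startup and let pselect() be the only place where it is unblocked. interrupt_setup, run_until_interrupted and drain_fd are names made up for this example, and the loop watches a single descriptor for brevity.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/select.h>
#include <unistd.h>

static volatile sig_atomic_t got_interrupted = 0;

static void caught_signal(int unused)
{
    (void)unused;
    got_interrupted = 1;
}

/* Install the handler and block SIGINT from here on. *waitmask receives
 * the mask to hand to pselect(): the previous mask with SIGINT removed. */
int interrupt_setup(sigset_t *waitmask)
{
    struct sigaction sa;
    sigset_t block;

    sa.sa_handler = caught_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGINT, &sa, NULL) == -1)
        return -1;
    sigemptyset(&block);
    sigaddset(&block, SIGINT);
    if (sigprocmask(SIG_BLOCK, &block, waitmask) == -1)
        return -1;
    return sigdelset(waitmask, SIGINT);
}

/* Example callback: read and discard whatever is available. */
void drain_fd(int fd)
{
    char buf[256];
    ssize_t n = read(fd, buf, sizeof buf);
    (void)n;
}

/* Returns 0 once SIGINT has been seen, -1 on error. */
int run_until_interrupted(int fd, const sigset_t *waitmask,
                          void (*on_readable)(int))
{
    while (!got_interrupted) {   /* race-free: SIGINT is blocked right here */
        fd_set rfds;
        int rc;

        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        /* SIGINT is unblocked only for the duration of this call. */
        rc = pselect(fd + 1, &rfds, NULL, NULL, NULL, waitmask);
        if (rc == -1) {
            if (errno == EINTR)
                continue;        /* the handler ran; the loop test sees the flag */
            return -1;
        }
        if (rc > 0 && FD_ISSET(fd, &rfds))
            on_readable(fd);
    }
    return 0;
}
```

The trade-off is that SIGINT is held pending while the "do some work" part runs, so the reaction to ctrl+C is delayed until the next wait. In exchange there is no sigprocmask() call per iteration.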
Until you actually start needing to catch SIGINT, though, you need only compare select() and poll() (and if you start needing large numbers of descriptors, some of the event-based stuff like epoll() is more efficient than either one).
The difference between (p)select and (p)poll is rather subtle:
1. For select, you have to initialize and populate the ugly fd_set bitmaps every time before you call select, because select modifies them in place in a "destructive" fashion. (poll distinguishes between the .events and .revents members in struct pollfd.)
2. After selecting, the entire bitmap is often scanned (by people/code) for events even if most of the fds are not even watched.
3. The bitmap can only deal with fds whose number is less than a certain limit (contemporary implementations: somewhere between 1024..4096), which rules it out in programs where high fds can easily be attained (notwithstanding that such programs are likely to already use epoll instead).
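To illustrate the .events/.revents point, here is a small sketch in which the pollfd array is filled in once by the caller and reused unchanged across calls. poll_readable is a made-up helper, and the bitmask return value is only a convenience for the example (it limits the array to fewer than 31 entries).

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <poll.h>

/* Poll an array the caller set up once (fd plus events = POLLIN).
 * Returns a bitmask with bit i set when fds[i] is readable or hung up,
 * 0 on timeout, -1 on error. nfds must be below 31. */
int poll_readable(struct pollfd *fds, nfds_t nfds, int timeout_ms)
{
    int rc, mask = 0;
    nfds_t i;

    do {
        rc = poll(fds, nfds, timeout_ms);
    } while (rc == -1 && errno == EINTR);
    if (rc <= 0)
        return rc;

    /* Only .revents was written; .fd and .events are untouched, so the
     * same array can go straight into the next call. */
    for (i = 0; i < nfds; i++)
        if (fds[i].revents & (POLLIN | POLLHUP))
            mask |= 1 << i;
    return mask;
}
```

With select, the equivalent helper would have to rebuild its fd_set from a separate list of descriptors before every call.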
The accepted answer is not correct regarding the difference between select and pselect. It describes well how a race condition between a signal handler and select can arise, but it is incorrect in how it uses pselect to solve the problem. It misses the main point about pselect, which is that it waits for EITHER the file descriptor or the signal to become ready. pselect returns when either of these is ready. select ONLY waits on the file descriptor. select ignores signals. See this blog post for a good working example:
https://www.linuxprogrammingblog.com/code-examples/using-pselect-to-avoid-a-signal-race
To make the picture presented by the accepted answer complete, the following basic fact should be mentioned: both select() and pselect() may return EINTR, as stated in their man pages:
EINTR A signal was caught; see signal(7).
This "caught" means that the signal should be recognized as "occurred during the system call execution":
1. If a non-masked signal occurs during select/pselect execution, then select/pselect will exit.
2. If a non-masked signal occurs before select/pselect has been called, this will not have any effect and select/pselect will continue waiting, potentially forever.
So if a signal occurs during select/pselect execution we are OK: the execution of select/pselect will be interrupted, and then we can test the reason for the exit, discover that it was EINTR, and exit the loop.
The real threat that we face is the possibility of a signal occurring outside of select/pselect execution; then we may hang in the system call forever. Any attempt to discover this "outsider" signal by naive means:
    if (was_a_signal) {
        ...
    }
will fail, since no matter how close this test is to the call of select/pselect, there is always a possibility that the signal will occur just after the test and before the call to select/pselect.
Then, if the only place to catch the signal is during select/pselect execution, we should invent some kind of "wine funnel" so that all "wine splashes" (signals), even those outside of the "bottle neck" (the select/pselect execution period), will eventually come to the "bottle neck".
But how can you deceive the system call and make it "think" that the signal has occurred during its execution when in reality it has occurred before?
Easy. Here is our "wine funnel": you just block the signal of interest and thereby cause it (if it has occurred at all) to wait outside of the process "for the door to be opened", and you "open the door" (unmask the signal) only when you're prepared "to welcome the guest" (select/pselect is running). Then the "arrived" signal will be recognized as "just occurred" and will interrupt the execution of the system call.
Of course, "opening the door" is the most critical part of the plan: it cannot be done by the usual means (first unmask, then call select/pselect). The only possibility is to do both actions (unmask and system call) at once (atomically). This is what pselect() is capable of but select() is not.
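The "door" behaviour can be checked with a small experiment, sketched here under the assumption of a single-threaded process. The function name pending_signal_interrupts_pselect and the choice of SIGUSR1 are purely illustrative: the signal is raised while blocked, and pselect() is then asked to wait up to five seconds with the signal unmasked.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/select.h>
#include <time.h>

static volatile sig_atomic_t usr1_seen = 0;

static void on_usr1(int signo)
{
    (void)signo;
    usr1_seen = 1;
}

/* Returns 1 if pselect() was interrupted by the signal that was already
 * pending, 0 if it was not, -1 on a setup error. */
int pending_signal_interrupts_pselect(void)
{
    struct sigaction sa;
    sigset_t block, orig, waitmask;
    struct timespec timeout = { 5, 0 };
    int rc, saved_errno;

    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGUSR1, &sa, NULL) == -1)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block, &orig) == -1)   /* close the door */
        return -1;
    waitmask = orig;
    sigdelset(&waitmask, SIGUSR1);

    raise(SIGUSR1);   /* the guest arrives early and has to wait outside */

    /* Open the door and start waiting in one atomic step. */
    rc = pselect(0, NULL, NULL, NULL, &timeout, &waitmask);
    saved_errno = errno;
    sigprocmask(SIG_SETMASK, &orig, NULL);

    return (rc == -1 && saved_errno == EINTR && usr1_seen) ? 1 : 0;
}
```

The call should come back immediately with EINTR instead of sleeping for the full timeout, because the pending signal is delivered the moment pselect() installs the wait mask.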

pthreads: How to handle signals in a main thread that creates other threads? (specific code shown)

I have a main thread, which stays in the main function, i.e. I do not create it specifically with pthread_create, because it's not necessary. This thread opens a file, then creates other threads, waits for them to finish their work (i.e., does the join), and cleans up everything (pointers, semaphores, condition variables and so on...).
Now, I have to apply this code to block SIGINT:
    sigset_t set;
    int sig;
    sigemptyset(&set);
    sigaddset(&set, SIGINT);
    pthread_sigmask(SIG_BLOCK, &set, NULL);
    while (1) {
        sigwait(&set, &sig);
        switch (sig) {
        case SIGINT:
            /* handle interrupts */
            break;
        default:
            /* unexpected signal */
            pthread_exit((void *)-1);
        }
    }
and it says: "You must use the main() function to launch the N+1 threads and wait for their completion. If a SIGINT signal arrives at the program it should be handled by the main thread in order to shut down the program and its threads in a clean way."
My doubt is where I should put this code. Is it wrong to put it in a background thread created in main()? I already have a loop, with an exit flag, that creates and joins all the other threads, so I don't understand whether this code goes directly in the main function, where everything is done/called to start the program. If I put it in a thread, with this code and the cleanup handler, is this considered busy waiting?
"It says"? What says? The homework assignment?
The first thing you should understand about programming with threads and signals is that you have very little control over which thread a signal is delivered to. If your main thread wants to be the one to get the signal, it should block the signal before creating any new threads and possibly unblock it after it finishes creating them, to ensure that the signal is not delivered to them.
However, if you're following best practices for signal handlers, it probably doesn't matter which thread handles the signal. All the signal handler should do is set a global flag or write a byte to a pipe (whichever works best to get the main thread to notice that the signal happened). (Note that you cannot use condition variables or any locking primitives from signal handlers!) As in the code fragment in your question, blocking the signal and using sigwait is also possible (be aware, again, that it needs to be blocked in all threads), but most programs can't afford to stop and wait just for signals; they need to wait for condition variables and/or input from files as well. One way to solve this issue is to make a dedicated thread to call sigwait, but that's rather wasteful. A better solution, if you're already using select, would be to switch to pselect, which can wait for signals as well as file descriptor events (at the same time).
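As a sketch of the "block first, then create threads" ordering with sigwait in the main thread: the worker body, N_WORKERS and the stop_flag mechanism below are placeholders invented for the example, since the real program is not shown. Note that sigwait sleeps until a signal is pending, so it is not busy waiting.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <time.h>

#define N_WORKERS 3              /* hypothetical thread count */

static atomic_int stop_flag;     /* set by main, polled by the workers */

static void *worker(void *arg)
{
    (void)arg;
    while (!atomic_load(&stop_flag)) {
        struct timespec ts = { 0, 10 * 1000 * 1000 };
        nanosleep(&ts, NULL);    /* stand-in for real work */
    }
    return NULL;
}

/* Returns the signal that ended the run (SIGINT), or -1 on error. */
int run_workers_until_sigint(void)
{
    pthread_t tid[N_WORKERS];
    sigset_t set, old;
    int i, started = 0, sig = -1;

    /* Block SIGINT BEFORE creating threads: they inherit the mask, so
     * the signal can only ever be collected by sigwait() below. */
    sigemptyset(&set);
    sigaddset(&set, SIGINT);
    if (pthread_sigmask(SIG_BLOCK, &set, &old) != 0)
        return -1;

    atomic_store(&stop_flag, 0);
    for (i = 0; i < N_WORKERS; i++) {
        if (pthread_create(&tid[i], NULL, worker, NULL) != 0)
            break;
        started++;
    }

    /* Sleep (no busy waiting) until SIGINT is pending. */
    if (started == N_WORKERS && sigwait(&set, &sig) != 0)
        sig = -1;

    atomic_store(&stop_flag, 1);          /* ask the workers to finish */
    for (i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    pthread_sigmask(SIG_SETMASK, &old, NULL);
    return sig;
}
```

On most toolchains this needs to be compiled with -pthread. Cleanup of files, semaphores and so on would go after the joins, in ordinary main-thread code.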
Rather than asking us for the answers (which would be hard to give anyway without seeing the full program you're trying to make this work with), you'd be much better off trying to really understand the intricacies of signals with threads.

Resources