OSX "Illegal instruction 4" with nested function - c

The goal here was to catch SIGINT to close the server socket on a little socket server. I tried to use a nested function to keep the code clean. But...
When I press Ctrl-C (SIGINT, right?), I get Illegal instruction: 4. After reading this post, I tried adding -mmacosx-version-min=10.8 to the compile flags since I'm on 10.8. I get the same error on Ctrl-C.
Two questions here: Why do I get "Illegal instruction: 4"? How can I close the server socket without using a global variable?
My software:
Mac OSX 10.8.4
GCC 4.2.1
Here's how I'm compiling:
gcc -fnested-functions main.c
Here's the code:
#include <sys/socket.h>
#include <unistd.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
void register_sigint_handler(int *serverSocket)
{
void sigint_handler(int signal) {
printf("Shutting down...\n");
printf("Server socket was %d\n", *serverSocket);
close(*serverSocket);
exit(0);
}
signal(SIGINT, &sigint_handler);
}
int main(void) {
int serverSocket = 0, guestSocket = 0;
register_sigint_handler(&serverSocket);
serverSocket = socket(PF_INET, SOCK_STREAM, 0);
while (1) {}
close(serverSocket);
return 0;
}

While I can't tell you specifically what happens, the gcc docs have a generalization:
If you try to call the nested function through its address after the
containing function exits, all hell breaks loose.
Passing the function pointer to signal() will do exactly that: the local function gets called after the containing function has exited. So you shouldn't pass a nested function pointer to signal().
You should probably just use a normal function for the handler, which sets a flag.
static volatile sig_atomic_t do_exit; /* sig_atomic_t can be written safely from a signal handler */
void sigint_handler(int sig)
{
do_exit = 1;
}
A server usually has a main loop of some sort, built around e.g. select or poll. Your main loop, the empty while loop, can now become
while (!do_exit) {
pause();
}
(Note that sockets are automatically closed by the operating system when the process exits.)

"Don't do that, then."
GCC's nested-functions-in-C extension does not provide true closures. When you take the address of sigint_handler, a "trampoline" (a small piece of self-modifying code) is written to the stack; as soon as register_sigint_handler exits, the trampoline is destroyed; a subsequent attempt to invoke the trampoline (by the kernel, in order to dispatch the signal) causes undefined behavior.
Signal handlers are process-global by definition. Therefore, it is incorrect in principle to attempt to avoid using global variables in signal handlers. Imagine what you'd do to make this code cope with two server sockets: you can only have one SIGINT handler registered, so somehow it has to close both sockets.
All open file descriptors are automatically closed when your process terminates. Therefore, it is not necessary to close them by hand first. Furthermore, it is a breach of convention to exit successfully on ^C. If this program were being driven by a supervisory process, that process would want to know (via waitpid's status code) that it exited because of a SIGINT. Putting those two things together, you should not have this signal handler at all.
(That stops being true if you need to do something to your active sockets on exit. For instance, if you wanted to write something to every active client connection on exit, you'd want a signal handler. But at that point you want to have the signal handler alert the main event loop, and do the work there.)
(Consider using libevent for this sort of thing rather than doing all the low-level goo yourself.)

Related

C: Terminate a called function that is stuck in infinite loop from main

Let's assume I've got the following main in a c file
int f();
int main(){
//terminate f() if in infinite loop
return f();
}
and then a separate c file that could potentially hold the following:
int f() {
for(;;) {}
return 0;
}
Is there any way to detect that the function f() is in an infinite loop and terminate its execution from within the main function?
EDIT:
I need this functionality as I am writing a testbench where the function called could potentially have an infinite loop - that's what I am checking for in the end. Therefore, I cannot modify f() in any way. I'm also in a Linux environment.
No, there is no way to definitively determine if a function contains an infinite loop.
However, we can make a few assumptions to detect a potential infinite loop and exit a program gracefully within the program (e.g. we don't have to press Ctrl+C). This method is common in several testing frameworks used in JS. Basically, we set some arbitrary time limit for a function to complete in. If the function does not complete within that time limit, we assume it will not complete and we throw an error.
In C/C++ you could implement this with pthreads if you're on a Unix system. In Windows, you would use windows.h. I only have experience with pthreads, so I'll show a simple example of how you might get this working using pthreads.
#include <pthread.h> // Load pthread
#include <signal.h> // If f() does not exit, we will need this library to send it a signal to kill itself.
#include <stdbool.h> // You could use an int or char.
#include <stddef.h> // Defines NULL
#include <unistd.h> // Defines sleep()
bool testComplete; // Has the test completed?
/**
* The function being tested.
*/
void f() {
while(true);
}
/**
* This method handles executing the test. This is the function pthread will
* use as its start routine. It takes no arguments and returns no results.
* The signature is required for pthread_create().
*/
void *runTest(void *ptr) {
(void)ptr; // Unused.
testComplete = false;
f();
testComplete = true;
return NULL; // A pthread start routine must return a void *.
}
int main() {
pthread_t testThread;
pthread_create(&testThread, NULL, runTest, NULL); // Create and start the new thread. It will begin executing runTest() eventually.
sleep(5); // Give it 5 seconds to complete (this should be adjusted or could even be made dynamic).
if(testComplete) {
// Test completed successfully.
pthread_join(testThread, NULL);
} else {
// The test did not finish within the time limit. Kill it; you'll probably want to provide some feedback here.
pthread_kill(testThread, SIGPIPE); // Note: SIGPIPE can be caught or ignored (only SIGKILL and SIGSTOP cannot), and its default action terminates the whole process, not just this thread.
}
}
To compile this, you would need to execute gcc your_filename.c -o output_filename -lpthread.
If you expect the program to run on both Unix and Windows systems, you may want to consider making some unified interface for accessing threads and then adapting the OS-specific interfaces to your interface. It will make things a little simpler, especially when expanding this library.
You could call f() in a different thread and have main time-out f() when it reaches a certain limit. However, I don't think this is practical and you should really work on solving the infinite loop first.
On a POSIX system (Linux, macOS) you can schedule an alarm in the future with setitimer() before calling the function. SIGALRM will be delivered to the process after the specified delay. Make sure your program has a handler for it: register the handler with sigaction() before starting the timer.
Once the signal handler takes control, you can get out of the offending loop with setjmp() and longjmp().
If you call f() the way you showed (from main) then at that point the main context is in f, not main and therefore you cannot "check f from main".
What you can try is calling f() from a separate thread and checking whether that thread has finished within a specified time limit. However, I'm not sure how practical this is. I don't know what you really plan to do in that function, but in some cases you may stop it at a point where it has done something that requires cleaning up. One example that comes to mind is the function calling malloc but not being able to call free because you interrupted it first.
Honestly, if there's a certain requirement about the time in which given function has to finish, just put that check within the function itself and return false to indicate it didn't finish successfully.

When I catch a signal with signal(SIGINT,f), is f executed in parallel?

I have C code like this
#include <stdio.h>
#include <unistd.h>
#include <signal.h>
void f(int);
int i=0;
int j=0;
int main() {
signal(SIGINT,f);
while(1) {
/* do something in variable `i` */
}
}
void f(int signum) {
/* do something else on variable `i` */
}
Can it produce a data race? That is, is f executed in parallel with main (even on a multithreaded machine)? Or is main stopped until f finishes its execution?
First of all, according to the man page of signal(), you should not use signal() but sigaction():
The behavior of signal() varies across UNIX versions, and has also varied historically across different versions of Linux. Avoid its use: use sigaction(2) instead. See Portability below.
But one might hope that signal() behaves sanely. However, there might be a data race because main might be interrupted before a store e.g. in a situation like this
if ( i > 10 ) {
i += j;
}
void f(int signum) {
i = 0;
}
If main is already past the compare when the signal arrives (or has already loaded i into a register), main will still do i += j on a stale value, which is a data race.
So where does this leave us? - Don't ever modify globals that get modified elsewhere in signal handlers if you cannot guarantee that the signal handler cannot interrupt this operation (e.g. disable signal handler for certain operations).
Unless you use the raise() from Standard C or kill() with the value from getpid() as the PID argument, signal events are asynchronous.
In single-threaded code on a multi-core machine, it means that you cannot tell what is happening in the 'do something to variable i' code. For example, that code might have just fetched the value from i and have incremented it, but not yet saved the incremented value. If the signal handler function f() reads i, modifies it in a different way, saves the result and returns, the original code may now write the incremented value of i instead of using the value modified by f().
This is what leads to the many constraints on what you can do in a signal handler. For example, it is not safe to call printf() in a signal handler because it might need to do memory allocation (malloc()) and yet the signal might have arrived while malloc() was modifying its linked lists of available memory. The second call to malloc() might get thoroughly confused.
So, even in a single-threaded program, you have to be aware and very careful about how you modify global variables.
However, in a single-threaded program, there will be no activity from the main loop while the signal is being handled. Indeed, even in a multi-threaded program, the thread that receives (handles) the signal is suspended while the signal handler is running, but other threads are not suspended, so there could be concurrent activity from other threads. If it matters, make sure the access to the variables is properly serialized.
See also:
What is the difference between sigaction() and signal()?
Signal concepts.

Why we call Signal Handler twice?

I am a novice at signal handling in C. I am analyzing the signal handling code below, which I took from another resource.
Here is the code:
#include <stdio.h>
#include <signal.h>
void intproc();
void quitproc();
main()
{
int i;
signal(SIGINT,intproc);
signal(SIGQUIT,quitproc);
printf("Ctrl+c is disabled. Use ctrl+\\ to quit\n");
for (i=0;;i++) {
printf("In an infinite loop...\n");
sleep(200);
}
}
void intproc()
{
signal(SIGINT,intproc);
printf("You have pressed ctrl+c.\n");
}
void quitproc()
{ signal(SIGQUIT,intproc);
printf("You have pressed ctrl+\\. Now the program quits.\n");
exit(0);
}
What I want to know is why we call signal(SIGINT,intproc) again inside the intproc() function.
I tried running this code without that signal() call inside the function, and it also works.
This is very old code. In the old days (perhaps SunOS3, 1990s) a signal handler was automatically uninstalled when executed. See signal(2) (difference between SysV and BSD behavior) and avoid using signal.
Carefully read signal(7) then use sigaction(2). Don't use signal(2). Care about async signal safe functions (the only ones you can call from a signal handler; you should not use printf inside a signal handler!). Consider simply setting some volatile sig_atomic_t global (or static) variable inside your signal handler (and test it outside).
Read Advanced Linux Programming which explains these things in detail.
After the function intproc has completed, the program carries on, but the signal action is restored to the default. When it receives a second SIGINT signal, the program takes the default action, which is to terminate the program.
If you want to retain the signal handler, you would need to re-establish it by calling signal again.
This is the reason you should always prefer the more robust sigaction over the signal function.

IPC using Signals on linux

Is it possible to do IPC (inter-process communication) using signal catch and signal raise?
I made two programs. In the first program I handle signals, and in the other program I just raise the signal that I want to handle in the first program. It's working fine for me, but I want to communicate between these two programs using signals and also send some bytes of data with the raised signal. How can I do this?
I want to pass messages with this signal as well. Can I do it? Is it possible?
And also, what are the disadvantages and advantages of IPC mechanisms using signals?
The following is the working code of my two programs. Using this, I am able to raise and catch signals, but I want to pass data from one program to another.
In the second program, I hardcoded the first program's process ID. How can I make it dynamic?
first program :
/* Example of using sigaction() to setup a signal handler with 3 arguments
* including siginfo_t.
*/
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
#include <string.h>
static void hdl (int sig, siginfo_t *siginfo, void *context)
{
printf("sig no = %d \n", sig);
if(sig == SIGINT)
exit(0);
printf ("Sending PID: %ld, UID: %ld\n",
(long)siginfo->si_pid, (long)siginfo->si_uid);
}
int main (int argc, char *argv[])
{
struct sigaction act;
sigemptyset(&act.sa_mask);
act.sa_sigaction = &hdl;
act.sa_flags = SA_SIGINFO;
if (sigaction(SIGUSR1, &act, NULL) < 0) {
perror ("sigaction SIGUSR1");
return 1;
}
if (sigaction(SIGINT, &act, NULL) < 0) {
perror ("sigaction SIGINT");
return 1;
}
while (1)
{
sleep(1);
}
return 0;
}
second program
#include <stdio.h>
#include <signal.h>
#include <unistd.h>
int main(void)
{
while (1)
{
sleep(1);
kill(11558, SIGUSR1);
}
}
Signals are intended to provide a rudimentary form of control over a process, not as an IPC mechanism. Signals have several issues when used as anything else:
A lot of system calls will be interrupted by a signal and need special handling.
Accordingly, a lot of code in the wild is not signal-safe.
Signals do not have any kind of data content, except for themselves. This makes them mostly useless as a message passing method.
There is only so much you can do in a signal handler.
Most importantly, subsequent signals of the same type are not queued - they are merged into one instance.
Even more important, there is no guarantee that signals are delivered in the same order as they were generated. From the manual page:
By contrast, if multiple standard signals are pending for a process, the order in which
they are delivered is unspecified.
You might theoretically be able to set up some kind of channel using several signals going back and forth, with some acting like some sort of acknowledgement, but no sane person would want to attempt something like that. You might as well use smoke signals instead...
No, don't try and use signals for this. You cannot attach extra data with signals other than the siginfo struct. The main problem with using signals though is that so little is signal safe. You have to avoid just about all the C runtime routines, and make sure the receiving program does EINTR checks on all its kernel calls. The only thing you can say about when a signal occurs is that it won't be when you expect it (a bit like the Spanish Inquisition).
I suggest you look into the other IPC mechanisms, such as shared memory, message queues, fifos (named pipes), and sockets.
Is it possible to do IPC (inter-process communication) using signal catch and signal raise?
Yes and no. Considering signals only, you can send a signal to another process, but you can't send anything other than just a signal.
I want to pass messages with this signal as well. Can I do it? Is it possible?
No, not the way you're trying to. You can use sockets, files, pipes, or named pipes to do this. If you want to learn more about UNIX IPC, read Advanced Programming in the UNIX Environment.
Except in one specific case that I've encountered, signals aren't generally useful as an IPC mechanism.
The only time I've used signals as part of an IPC mechanism was when I needed to interrupt the normal flow of operation of the signalled process to handle something, for example a timer interrupt. I have used signals together with Boost shared memory to implement interprocess event management. The shared memory contains a list of events that need processing, and the signal is used to get the process to process these events. These events are out-of-band and unpredictable, so using a signal was ideal. I performed considerable testing to verify the implementation (and it was hard to get it all stable).
This used sigqueue together with signal SIGRTMIN+1 in a Linux environment using glibc. Using SA_RESTART on the sigaction avoids the need to directly handle EINTR; see glibc: Primitives Interrupted by Signals. BSD has a similar scheme, so EINTR handling wasn't required in my system. All of the points made by the other answers were considered and handled (and tested).
However, if you just want to pass values back and forth within the normal operation of the process, then another IPC mechanism such as sockets, files, pipes or named pipes is better. If you can use ZeroMQ then even better, as that does a lot of the hard work for you in a very elegant way.
I'm currently reading man 7 signal:
Real-time signals are distinguished by the following:
If the signal is sent using sigqueue(3), an accompanying value (either an integer or a pointer) can be sent with the signal. ...
Note: Real-time signals start from SIGRTMIN to SIGRTMAX.

Longjmp out of signal handler?

From the question:
Is it good programming practice to use setjmp and longjmp in C?
Two of the comments left said:
"You can't throw an exception in a signal handler, but you can do a
longjmp safely -- as long as you know what you are doing. – Dietrich
Epp Aug 31 at 19:57
@Dietrich: +1 to your comment. This is a little-known and
completely-under-appreciated fact. There are a number of problems that
cannot be solved (nasty race conditions) without using longjmp out of
signal handlers. Asynchronous interruption of blocking syscalls is the
classic example."
I was under the impression that signal handlers were called by the kernel when it encountered an exceptional condition (e.g. divide by 0). Also, that they're only called if you specifically register them.
This would seem to imply (to me) that they aren't called through your normal code.
Moving on with that thought... setjmp and longjmp, as I understand them, are for collapsing up the stack to a previous point and state. I don't understand how you can collapse up a stack when a signal handler is called, since it's called from the kernel as a one-off circumstance rather than from your own code. What's the next thing up the stack from a signal handler!?
The way the kernel "calls" a signal handler is by interrupting the thread, saving the signal mask and processor state in a ucontext_t structure on the stack just beyond (below, on grows-down implementations) the interrupted code's stack pointer, and restarting execution at the address of the signal handler. The kernel does not need to keep track of any "this process is in a signal handler" state; that's entirely a consequence of the new call frame that was created.
If the interrupted thread was in the middle of a system call, the kernel will back out of the kernelspace code and adjust the return address to repeat the system call (if SA_RESTART is set for the signal and the system call is a restartable one) or put EINTR in the return code (if not restartable).
It should be noted that longjmp is async-signal-unsafe. This means it invokes undefined behavior if you call it from a signal handler if the signal interrupted another async-signal-unsafe function. But as long as the interrupted code is not using library functions, or only using library functions that are marked async-signal-safe, it's legal to call longjmp from a signal handler.
Finally, my answer is based on POSIX since the question is tagged unix. If the question were just about pure C, I suspect the answer is somewhat different, but signals are rather useless without POSIX anyway...
longjmp does not perform normal stack unwinding. Instead, the stack pointer is simply restored from the context saved by setjmp.
Here is an illustration on how this can bite you with non-async-safe critical parts in your code. It is advisable to e.g. mask the offending signal during critical code.
It is worth reading http://man7.org/linux/man-pages/man2/sigreturn.2.html regarding how Linux handles signal handler invocation, and in this case how it manages signal handler exit. My reading of this suggests that executing a longjmp() from a signal handler (resulting in no call of sigreturn()) might be at best "undefined". You also have to take into account on which thread (and thus user stack) setjmp() was called, and on which thread (and thus user stack) longjmp() is subsequently called.
This doesn't answer the question of whether or not it is "good" to do this, but this is how to do it. In my application, I have a complicated interaction between custom hardware, huge pages, shared memory, NUMA-locked memory, etc., and it is possible to have memory that seems to be properly allocated but, when you touch it (write, in this case), throws a bus error or segfault in the middle of the application. I wanted to come up with a way of testing memory addresses to make sure that the shared memory wasn't node-locked to a node that didn't have enough memory, so that the program would fail early with graceful error messages. So these signal handlers are ONLY used for this one piece of code (a small memcpy of 5 bytes) and not used to rescue the app while it is in use. I think it is safe here.
Apologies if this is not "correct". Please comment and I'll fix it up. I cobbled it together based on hints and some sample code that didn't work.
#include <stdio.h>
#include <signal.h>
#include <setjmp.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
sigjmp_buf JumpBuffer;
void handler(int);
int count = 0;
int main(void)
{
struct sigaction sa;
memset(&sa, 0, sizeof(sa)); // zero sa_flags and the remaining fields before use
sa.sa_handler = handler;
sigemptyset(&(sa.sa_mask));
sigaddset(&(sa.sa_mask), SIGSEGV);
sigaction(SIGSEGV, &sa, NULL);
while (1) {
int r = sigsetjmp(JumpBuffer,1);
if (r == 0) {
printf("Ready for memcpy, count=%d\n",count);
usleep(1000000);
char buffer[10];
#if 1
char* dst = buffer; // valid destination: no fault
#else
char* dst = NULL; // null destination: this will cause a segfault
#endif
memcpy(dst,"12345",5); // trigger seg fault here
siglongjmp(JumpBuffer,2); // JumpBuffer is a sigjmp_buf, so pair it with siglongjmp
}
else if (r == 1)
{
printf("SEGV. count %d\n",count);
}
else if (r == 2)
{
printf("No segv. count %d\n",count);
}
}
return 0;
}
void handler(int sig)
{
count++;
siglongjmp(JumpBuffer, 1);
}
References
https://linux.die.net/man/3/sigsetjmp
https://pubs.opengroup.org/onlinepubs/9699919799/functions/longjmp.html
http://www.csl.mtu.edu/cs4411.ck/www/NOTES/non-local-goto/sig-1.html
https://www.gnu.org/software/libc/manual/html_node/Longjmp-in-Handler.html
In most systems a signal handler has its own stack, separate from the main stack. That's why you could longjmp out of a handler. I think it's not a wise thing to do though.
You can't use longjmp to get out of a signal handler.
The reason for this is that setjmp only saves the resources (process registers) etc. that the calling-convention specifies that should be saved over a plain function call.
When an interrupt occurs, the function being interrupted may have a much larger state, and it will not be restored correctly by longjmp.
