I'm a Java programmer writing a JNI application which calls native C code. What I want is to detect native program crash from Java code that calls the native code.
I know that there is no exception handling in C but just curious whether I can write dangerous codes "safely" inside try-catch block in Java (e.g.code that may contain bad pointer in runtime or file not found cases). In that case the code will crash, but I was looking for a safe way to report it.
I believe you're taking the wrong approach. As a language C is not a place where "it's better to ask for forgiveness than permission." You must ask for permission in C, though in the event of a disaster, you can certainly do some cleanup:
You should look into signal handling.
When you do something like
int *a = NULL;
int b = *a; //segfault
Your program will receive SIGSEGV, which will force your program to quit, unless you have installed a signal handler.
Most unhandled signals will cause program termination, and most can be caught (SIGKILL for example cannot be caught).
This allows you to do some cleanup.
#include <signal.h>
#include <stdio.h>
//typedef void (*sighandler_t)(int);
void myhandler(int signal){
printf("oh noes...\n");
}
int main(void)
{
signal(SIGSEGV, myhandler);
int *a = NULL;
int b = *a; //segfault
return 0;
}
None of this is recommended though. You MUST know the contents of your pointers at all times. Otherwise, giants will raze your village.
EDIT: the signal() function is suggested as deprecated. New code should use [sigaction()][2] instead. This example uses signal() for its simplicity.
No you can't reasonably detect a native crash from Java. If such a crash happens, your program is likely to be killed before you have a chance to catch it.
What you can do is to check for return codes. Any decently written C library will return you an error code if it fails (note that's different from crash). You can use those codes and translate them in Java exceptions if you wish to.
For crashes, you could use the "signal" APIs, though it's not as straight forward as it looks: if you actually caught a crash, then there's not much you could be doing as the entire program memory may have been corrupted. I'd recommend against it if you are a beginner.
C doesn't really have exceptions. It has two things which play a similar role, but which are vastly different from one another: signals and error codes.
Let's consider error codes first. When a function call in C fails, it will typically signal this failure to the caller by returning an error code, or by returning a placeholder value (e.g, 0, -1, or NULL) and setting an error in errno, or storing it in a location that can be retrieved by calling another method. No special effort is required to handle these exceptions. Simply check the return values of functions, e.g.
FILE *fh = fopen("example", "r");
if (fh == NULL) {
fprintf(stderr, "Couldn't open file: %s\n", strerror(errno));
return -1;
}
// do things with fh...
Signals, on the other hand, are used in C to denote really unusual situations, such as an attempt to execute invalid code or dereference a bad pointer, or other external conditions like the user pressing control-C to terminate your program. You can attempt to handle some such signals using the signal() function, but the ones you've described, like a bad pointer, typically indicate that the process is screwed up in a rather permanent way, so allowing them to terminate the process is usually the correct solution.
If you are indeed running code that is prone to triggering segmentation faults, it would be wise to run it in a separate process, rather than allowing it to potentially corrupt the state of a Java runtime.
Related
Some C++ libraries call abort() function in the case of error (for example, SDL). No helpful debug information is provided in this case. It is not possible to catch abort call and to write some diagnostics log output. I would like to override this behaviour globally without rewriting/rebuilding these libraries. I would like to throw exception and handle it. Is it possible?
Note that abort raises the SIGABRT signal, as if it called raise(SIGABRT). You can install a signal handler that gets called in this situation, like so:
#include <signal.h>
extern "C" void my_function_to_handle_aborts(int signal_number)
{
/*Your code goes here. You can output debugging info.
If you return from this function, and it was called
because abort() was called, your program will exit or crash anyway
(with a dialog box on Windows).
*/
}
/*Do this early in your program's initialization */
signal(SIGABRT, &my_function_to_handle_aborts);
If you can't prevent the abort calls (say, they're due to bugs that creep in despite your best intentions), this might allow you to collect some more debugging information. This is portable ANSI C, so it works on Unix and Windows, and other platforms too, though what you do in the abort handler will often not be portable. Note that this handler is also called when an assert fails, or even by other runtime functions - say, if malloc detects heap corruption. So your program might be in a crazy state during that handler. You shouldn't allocate memory - use static buffers if possible. Just do the bare minimum to collect the information you need, get an error message to the user, and quit.
Certain platforms may allow their abort functions to be customized further. For example, on Windows, Visual C++ has a function _set_abort_behavior that lets you choose whether or not a message is displayed to the user, and whether crash dumps are collected.
According to the man page on Linux, abort() generates a SIGABRT to the process that can be caught by a signal handler. EDIT: Ben's confirmed this is possible on Windows too - see his comment below.
You could try writing your own and get the linker to call yours in place of std::abort. I'm not sure if it is possible however.
There is question about using exit in C++. The answer discusses that it is not good idea mainly because of RAII, e.g., if exit is called somewhere in code, destructors of objects will not be called, hence, if for example a destructor was meant to write data to file, this will not happen, because the destructor was not called.
I was interested how is this situation in C. Are similar issues applicable also in C? I thought since in C we don't use constructors/destructors, situation might be different in C. So is it ok to use exit in C? For example I have seen following functions sometimes used in C:
void die(const char *message)
{
if(errno) {
perror(message);
} else {
printf("ERROR: %s\n", message);
}
exit(1);
}
Rather than abort(), the exit() function in C is considered to be a "graceful" exit.
From C11 (N1570) 7.22.4.4/p2 The exit function (emphasis mine):
The exit function causes normal program termination to occur.
The Standard also says in 7.22.4.4/p4 that:
Next, all open streams with unwritten buffered data are flushed, all
open streams are closed, and all files created by the tmpfile function
are removed.
It is also worth looking at 7.21.3/p5 Files:
If the main function returns to its original caller, or if the exit
function is called, all open files are closed (hence all output
streams are flushed) before program termination. Other paths to
program termination, such as calling the abort function, need not
close all files properly.
However, as mentioned in comments below you can't assume that it will cover every other resource, so you may need to resort to atexit() and define callbacks for their release individually. In fact it is exactly what atexit() is intended to do, as it says in 7.22.4.2/p2 The atexit function:
The atexit function registers the function pointed to by func, to be
called without arguments at normal program termination.
Notably, the C standard does not say precisely what should happen to objects of allocated storage duration (i.e. malloc()), thus requiring you be aware of how it is done on particular implementation. For modern, host-oriented OS it is likely that the system will take care of it, but still you might want to handle this by yourself in order to silence memory debuggers such as Valgrind.
Yes, it is ok to use exit in C.
To ensure all buffers and graceful orderly shutdown, it would be recommended to use this function atexit, more information on this here
An example code would be like this:
void cleanup(void){
/* example of closing file pointer and free up memory */
if (fp) fclose(fp);
if (ptr) free(ptr);
}
int main(int argc, char **argv){
/* ... */
atexit(cleanup);
/* ... */
return 0;
}
Now, whenever exit is called, the function cleanup will get executed, which can house graceful shutdown, clean up of buffers, memory etc.
You don't have constructors and destructors but you could have resources (e.g. files, streams, sockets) and it is important to close them correctly. A buffer could not be written synchronously, so exiting from the program without correctly closing the resource first, could lead to corruption.
Using exit() is OK
Two major aspects of code design that have not yet been mentioned are 'threading' and 'libraries'.
In a single-threaded program, in the code you're writing to implement that program, using exit() is fine. My programs use it routinely when something has gone wrong and the code isn't going to recover.
But…
However, calling exit() is a unilateral action that can't be undone. That's why both 'threading' and 'libraries' require careful thought.
Threaded programs
If a program is multi-threaded, then using exit() is a dramatic action which terminates all the threads. It will probably be inappropriate to exit the entire program. It may be appropriate to exit the thread, reporting an error. If you're cognizant of the design of the program, then maybe that unilateral exit is permissible, but in general, it will not be acceptable.
Library code
And that 'cognizant of the design of the program' clause applies to code in libraries, too. It is very seldom correct for a general purpose library function to call exit(). You'd be justifiably upset if one of the standard C library functions failed to return just because of an error. (Obviously, functions like exit(), _Exit(), quick_exit(), abort() are intended not to return; that's different.) The functions in the C library therefore either "can't fail" or return an error indication somehow. If you're writing code to go into a general purpose library, you need to consider the error handling strategy for your code carefully. It should fit in with the error handling strategies of the programs with which it is intended to be used, or the error handling may be made configurable.
I have a series of library functions (in a package with header "stderr.h", a name which treads on thin ice) that are intended to exit as they're used for error reporting. Those functions exit by design. There are a related series of functions in the same package that report errors and do not exit. The exiting functions are implemented in terms of the non-exiting functions, of course, but that's an internal implementation detail.
I have many other library functions, and a good many of them rely on the "stderr.h" code for error reporting. That's a design decision I made and is one that I'm OK with. But when the errors are reported with the functions that exit, it limits the general usefulness the library code. If the code calls the error reporting functions that do not exit, then the main code paths in the function have to deal with error returns sanely — detect them and relay an error indication to the calling code.
The code for my error reporting package is available in my SOQ (Stack Overflow Questions) repository on GitHub as files stderr.c and stderr.h in the src/libsoq sub-directory.
One reason to avoid exit in functions other than main() is the possibility that your code might be taken out of context. Remember, exit is a type of non local control flow. Like uncatchable exceptions.
For example, you might write some storage management functions that exit on a critical disk error. Then someone decides to move them into a library. Exiting from a library is something that will cause the calling program to exit in an inconsitent state which it may not be prepared for.
Or you might run it on an embedded system. There is nowhere to exit to, the whole thing runs in a while(1) loop in main(). It might not even be defined in the standard library.
Depending on what you are doing, exit may be the most logical way out of a program in C. I know it's very useful for checking to make sure chains of callbacks work correctly. Take this example callback I used recently:
unsigned char cbShowDataThenExit( unsigned char *data, unsigned short dataSz,unsigned char status)
{
printf("cbShowDataThenExit with status %X (dataSz %d)\n", status, dataSz);
printf("status:%d\n",status);
printArray(data,dataSz);
cleanUp();
exit(0);
}
In the main loop, I set everything up for this system and then wait in a while(1) loop. It is possible to make a global flag to exit the while loop instead, but this is simple and does what it needs to do. If you are dealing with any open buffers like files and devices you should clean them up before close for consistency.
It is terrible in a big project when any code can exit except for coredump. Trace is very import to maintain a online server.
Obviously, it's good practice. That goes without saying. I see it every time in example code (like socket(), fork(), or malloc(), to name a few). I know to do it, I just don't understand the why of it so much. Are they prone to failing often? Is it because system calls are made in kernel mode? What's the reasoning behind it?
I presume you are asking why code that calls these routines checks the results to determine whether an error occurred.
Each of the routines you cite, socket, fork, and malloc, requires resources. Those resources may be unavailable either because the calling process has exceeded limits set by the system administrator or the user or because the system has exhausted the resources it has and cannot provide any more to processes. Therefore, it is possible, even if not frequent, that a call to one of these routines will return failure. So a calling process should check for failure.
Additionally, in some implementations, some system routines (such as read and write) can be interrupted if a signal is delivered to the process before the operation completed. (When a signal arrives, it is considered important, and it is desirable to deliver it to the process immediately rather than wait for a potentially long operation to complete. So the operation is interrupted, the signal is delivered, the process may handle the signal and return from the signal handler. Then control is returned to the code that called the original routine, and that code must be informed that the operation was interrupted.) This interruption results in returning failure with an error status indicating the operation was interrupted.
Always, if only..
Way back when as a C function could only return an integer, and exceptions were science fiction, they came up with the idea of returning either success or a code that provided a clue as to what had gone wrong. It became a convention.
Depends on what you call a failure.
Something like opening a file (given the developer can be bothered) are relatively easy to deal with, File not found for instance. Malloc, is a bit more difficult to take some remedial action.
The key point though is to check as near to the error as possible. If you don't, you find that the file you wanted to open and append to didn't exist 10,000 lines of code later, when you try and write the results of your extensive computation to it and get say an access violation.
Basically this stuff is the reason exceptions were invented. Checking the return value is "optional", swallowing an exception is explicit.
example:
FILE *fp;
fp = fopen("c:\\removedDirectory\nonexistingFile.txt", "r")//returns NULL
if(fp != NULL)
{
//stuff here will fail if fp == NULL
}
If you do not check output of fopen, (replace with any function that returns an error) and fp is NULL, the subsequent functions depending on a real file stream will not work.
I'm working with an embedded system where the exit() call doesn't seem to exist.
I have a function that calls malloc and rather than let the program crash when it fails I'd rather exit a bit more gracefully.
My initial idea was to use goto however the labels seem to have a very limited scope (I'm not sure, I've never used them before "NEVER USE GOTO!!1!!").
I was wondering if it is possible to goto a section of another function or if there are any other creative ways of exiting a C program from an arbitrary function.
void main() {
//stuff
a();
exit:
return;
}
void a() {
//stuff
//if malloc failed
goto exit;
}
Thanks for any help.
Options:
since your system is non-standard (or perhaps is standard but non-hosted), check its documentation for how to exit.
try abort() (warning: this will not call atexit handlers).
check whether your system allows you to send a signal to yourself that will kill yourself.
return a value from a() indicating error, and propagate that via error returns all the way back to main.
check whether your system has setjmp/longjmp. These are difficult to use correctly but they do provide what you asked for: the ability to transfer execution from anywhere in your program (not necessarily including a signal/interrupt handler, but then you probably wouldn't be calling malloc in either of those anyway) to a specific point in your main function.
if your embedded system is such that your program is the only code that runs on it, then instead of exiting you could call some code that goes into an error state: perhaps an infinite loop, that perhaps flashes an LED or otherwise indicates that badness has happened. Maybe you can provoke a reboot.
Why dont you use return values
if malloc failed
return 1;
else
return 0;
...........
if(!a())
return;
goto cannot possibly jump to another function.
Normally, you are advised please don't use goto! In this case what you are asking is not possible.
How to deal with this? There are few solutions.
Check return code or value of problematic functions and act accordingly.
Use setjmp/longjmp. This advice should be considered even more evil than using goto itself, but it does support jumping from one function to another.
Embedded systems rarely have any variation of exit(), as that function doesn't necessarily make any sense in the given context. Where does the controller of an elevator or a toaster exit to?
In multitasking embedded systems there could be a system call to exit or terminate a process, leaving only an idle process alive that does simply a busy loop: while (1); or in some cases call a privileged instruction to go to power saving mode: while (1) { asm("halt") };
In embedded systems one possible method to "recover" from error is to asm("trap #0"); or any equivalent of calling an interrupt vector, that implements graceful system shutdown with dumping core to flash drive or outputting an error code to UART.
I have an application which I use to catch any segmentation fault or ctrl-c.
Using the below code, I am able to catch the segmentation fault but the handler is being called again and again. How can I stop them.
For your information, I don't want to exit my application. I just can take care to free all the corrupted buffers.
Is it possible?
void SignalInit(void )
{
struct sigaction sigIntHandler;
sigIntHandler.sa_handler = mysighandler;
sigemptyset(&sigIntHandler.sa_mask);
sigIntHandler.sa_flags = 0;
sigaction(SIGINT, &sigIntHandler, NULL);
sigaction(SIGSEGV, &sigIntHandler, NULL);
}
and handler goes like this.
void mysighandler()
{
MyfreeBuffers(); /*related to my applciation*/
}
Here for Segmentation fault signal, handler is being called multiple times and as obvious MyfreeBuffers() gives me errors for freeing already freed memory. I just want to free only once but still dont want to exit application.
Please help.
The default action for things like SIGSEGV is to terminate your process but as you've installed a handler for it, it'll call your handler overriding the default behavior. But the problem is segfaulting instruction may be retried after your handler finishes and if you haven't taken measures to fix the first seg fault, the retried instruction will again fault and it goes on and on.
So first spot the instruction that resulted in SIGSEGV and try to fix it (you can call something like backtrace() in the handler and see for yourself what went wrong)
Also, the POSIX standard says that,
The behavior of a process is undefined after it returns normally from
a signal-catching function for a [XSI] SIGBUS, SIGFPE, SIGILL, or
SIGSEGV signal that was not generated by kill(), [RTS] sigqueue(),
or raise().
So, the ideal thing to do is to fix your segfault in the first place. Handler for segfault is not meant to bypass the underlying error condition
So the best suggestion would be- Don't catch the SIGSEGV. Let it dump core. Analyze the core. Fix the invalid memory reference and there you go!
I do not agree at all with the statement "Don't catch the SIGSEGV".
That's a pretty good pratice to deal with unexpected conditions. And that's much cleaner to cope with NULL pointers (as given by malloc failures) with signal mechanism associated to setjmp/longjmp, than to distribute error condition management all along your code.
Note however that if you use ''sigaction'' on SEGV, you must not forget to say SA_NODEFER in sa_flags - or find another way to deal with the fact SEGV will trigger your handler just once.
#include <setjmp.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
static void do_segv()
{
int *segv;
segv = 0; /* malloc(a_huge_amount); */
*segv = 1;
}
sigjmp_buf point;
static void handler(int sig, siginfo_t *dont_care, void *dont_care_either)
{
longjmp(point, 1);
}
int main()
{
struct sigaction sa;
memset(&sa, 0, sizeof(sigaction));
sigemptyset(&sa.sa_mask);
sa.sa_flags = SA_NODEFER;
sa.sa_sigaction = handler;
sigaction(SIGSEGV, &sa, NULL); /* ignore whether it works or not */
if (setjmp(point) == 0)
do_segv();
else
fprintf(stderr, "rather unexpected error\n");
return 0;
}
If the SIGSEGV fires again, the obvious conclusion is that the call to MyfreeBuffers(); has not fixed the underlying problem (and if that function really does only free() some allocated memory, I'm not sure why you would think it would).
Roughly, a SIGSEGV fires when an attempt is made to access an inaccessible memory address. If you are not going to exit the application, you need to either make that memory address accessible, or change the execution path with longjmp().
You shouldn't try to continue after SIG_SEGV. It basically means that the environment of your application is corrupted in some way. It could be that you have just dereferenced a null pointer, or it could be that some bug has caused your program to corrupt its stack or the heap or some pointer variable, you just don't know. The only safe thing to do is terminate the program.
It's perfectly legitimate to handle control-C. Lots of applications do it, but you have to be really careful exactly what you do in your signal handler. You can't call any function that's not re-entrant. So that means if your MyFreeBuffers() calls the stdlib free() function, you are probably screwed. If the user hits control-C while the program is in the middle of malloc() or free() and thus half way through manipulating the data structures they use to track heap allocations, you will almost certainly corrupt the heap if you call malloc() or free() in the signal handler.
About the only safe thing you can do in a signal handler is set a flag to say you caught the signal. Your app can then poll the flag at intervals to decide if it needs to perform some action.
Well you could set a state variable and only free memory if its not set. The signal handler will be called everytime, you can't control that AFAIK.
I can see at case for recovering from a SIG_SEGV, if your handling events in a loop and one of these events causes a Segmentation Violation then you would only want to skip over this event, continue processing the remaining events. In my eyes SIG_SEGV is similar to the NullPointerException in Java. Yes the state will be inconsistent and unknown after either of these, however in some cases you would like to handle the situation and carry on. For instance in Algo trading you would pause the execution of an order and allow a trader to manually take over, with out crashing the entire system and ruining all other orders.
Looks like at least under Linux using the trick with -fnon-call-exceptions option can be the solution. It will give an ability to convert the signal to general C++ exception and handle it by general way.
Look the linux3/gcc46: "-fnon-call-exceptions", which signals are trapping instructions? for example.