Question:
What happens if I exit program without closing files?
Are there some bad things happening (e.g. some OS level file descriptor array is not freed up..?)
And is the answer the same in both cases:
programmed exiting
unexpected crash
Code examples:
With programmed exiting I mean something like this:
#include <stdio.h>
#include <stdlib.h>
int main(){
    fopen("foo.txt","r"); // opened but never closed
    exit(1);
}
With unexpected crash I mean something like this:
#include <stdio.h>
int main(){
    int * ptr=NULL;
    fopen("foo.txt","r"); // opened but never closed
    ptr[0]=0; // causes segmentation fault to occur
}
P.S.
If the answer is programming language dependent then I would like to know about C and C++.
If the answer is OS dependent then I am interested in Linux and Windows behaviour.
It depends on how you exit. Under controlled circumstances (via exit() or a return from main()), the data in the (output) buffers will be flushed and the files closed in an orderly manner. Other resources that the process had will also be released.
If your program crashes out of control, or if it calls one of the alternative _exit() or _Exit() functions, then the system will still clean up (close) open file descriptors and release other resources, but the buffers won't be flushed, etc.
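The difference between the two cases can be observed directly. The following is a minimal sketch, assuming a POSIX system (it uses fork() and waitpid()); the function name size_after_child_exit and the file path passed to it are made up for illustration. A child process writes one line through stdio and terminates without calling fclose(), either with exit() or with _Exit(), and the parent then measures how many bytes actually reached the file.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that writes one buffered line to `path`
 * and then terminates without fclose(): via exit() if `orderly` is nonzero,
 * via _Exit() otherwise. Returns the resulting file size, or -1 on error. */
long size_after_child_exit(const char *path, int orderly)
{
    fflush(NULL);                         /* avoid duplicating the parent's pending output */
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {                       /* child */
        FILE *fp = fopen(path, "w");
        if (fp == NULL)
            _Exit(2);
        fprintf(fp, "Hello world\n");     /* stays in the stdio buffer for now */
        if (orderly)
            exit(0);                      /* flushes and closes all open streams */
        _Exit(0);                         /* the kernel closes the fd; the buffer is discarded */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    FILE *fp = fopen(path, "r");          /* parent: see what reached the file */
    if (fp == NULL)
        return -1;
    long size = -1;
    if (fseek(fp, 0, SEEK_END) == 0)
        size = ftell(fp);
    fclose(fp);
    return size;
}
```

On Linux with glibc I would expect the call with orderly set to 1 to report 12 bytes (the length of the line) and the call with orderly set to 0 to report an empty file, although the file itself exists in both cases.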
The OS tidies up for you. It is like going around to a friend's house: it is polite to shut the bathroom door yourself rather than have them do it for you.
All handles that belong to your process will be cleaned up. However any "named" kernel objects like named pipes and others will stick around.
Related
According to the exit(3) man page, the exit function is not thread-safe (MT-Unsafe race:exit). This is because it tries to clean up resources (flush data to disk, close file descriptors, etc.) by calling the callbacks registered with on_exit and atexit. And I want my program to do that! (One of my threads keeps an fd open during the program's whole lifespan, so _exit is not an option for me: I want all the data to be written to the output file.)
My question is the following: if I'm careful and don't share any sensitive data (like an fd) between my threads, is it "acceptable" to call exit in a multi-threaded program? Note that I only call exit if an unrecoverable error occurs. Yet I can't afford a segfault while the program tries to exit. The thing is, an unrecoverable error can happen in any thread...
I was thinking about using setjmp/longjmp to kill my threads "nicely" but this would be quite complex to do and would require many changes everywhere in my code.
Any suggestions would be greatly appreciated. Thanks! :)
EDIT: Thanks to @Ctx's enlightenment, I came up with the following idea:
#define EXIT(status) do { pthread_mutex_lock(&exit_mutex); exit(status); } while(0)
Of course the exit_mutex must be global (extern).
The manpage states that
The exit() function uses a global variable that is not protected, so it is not thread-safe.
so being careful about what you share will not help by itself.
But the problem documented is a race condition: MT-Unsafe race:exit
So if you make sure that exit() can never be called concurrently from two threads, you should be on the safe side. You can ensure this with a mutex, for example.
A modern cross-platform C++ solution could be:
#include <cstdlib>
#include <mutex>
std::mutex exit_mutex;
[[noreturn]] void exit_thread_safe(const int status)
{
    exit_mutex.lock(); // never unlocked: any later caller blocks here until the process ends
    std::exit(status);
}
The mutex ensures that exit is never called concurrently by two (or more) threads.
However, I still question the reason behind even caring about this. How likely is a multi-threaded call to exit() and which bad things can even realistically happen?
EDIT:
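Using std::quick_exit avoids the clang diagnostic warning. Note, however, that std::quick_exit only runs handlers registered with std::at_quick_exit; it does not run atexit handlers and is not required to flush open streams.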
It can't be done: even if no data is shared between threads at first, data must be shared between a thread and its cleanup function. The function should run only after the thread has stopped or reached a safe point.
I have a long-running C program which opens a file at the beginning, writes out "interesting" stuff during execution, and closes the file just before it finishes. The code, compiled with gcc -o test test.c (gcc version 5.3.1), looks as follows:
//contents of test.c
#include<stdio.h>
FILE * filept;
int main() {
filept = fopen("test.txt","w");
unsigned long i;
for (i = 0; i < 1152921504606846976; ++i) {
if (i == 0) {//This case is interesting!
fprintf(filept, "Hello world\n");
}
}
fclose(filept);
return 0;
}
The problem is that since this is a scientific computation (think of searching for primes, or whatever is your favourite hard-to-crack stuff) it could really run for a very long time. Since I determined that I am not patient enough, I would like to abort the current computation, but I would like to do this in an intelligent way by somehow forcing the program by external means to flush out all the data that is currently in the OS buffer/disk cache, wherever.
Here is what I have tried (for this bogus program above, and of course not for the real deal which is currently still running):
pressing ctrl+C; or
sending kill -6 <PID> (and also kill -3 <PID>) -- as suggested by @BartekBanachewicz,
but after either of these approaches the file test.txt, created at the very beginning of the program, remains empty. This means that the output of fprintf() was left in some intermediate buffer during the computation, waiting for some OS/hardware/software flush signal, and since no such signal was received, the contents disappeared. This also means that the comment made by @EJP
Your question is based on a fallacy. 'Stuff that is in the OS
buffer/disk cache' won't be lost.
does not seem to apply here. Experience shows that stuff does indeed get lost.
I am using Ubuntu 16.04 and I am willing to attach a debugger to this process if that is possible and if it is safe to retrieve the data this way. Since I have never done such a thing before, I would appreciate a detailed answer on how to get the contents flushed to disk safely and surely. I am open to other methods as well. There is no room for error here, as I am not going to rerun the program.
Note: Sure I could have opened and closed a file inside the if branch, but that is extremely inefficient once you have many things to be written. Recompiling the program is not possible, as it is still in the middle of some computation.
Note 2: the original version of this question asked the same thing in a slightly more abstract way related to C++, and was tagged as such (that is why people in the comments are suggesting std::flush(), which wouldn't help even if this were a C++ question). Well, I guess I made a major edit then.
Somewhat related: Will data written via write() be flushed to disk if a process is killed?
Can I just add some clarity? Obviously months have passed, and I imagine your program isn't running any more ... but there's some confusion here about buffering which still isn't clear.
As soon as you use the stdio library and FILE *, you will by default have a fairly small (implementation dependent, but typically some KB) buffer inside your program which is accumulating what you write, and flushing it to the OS when it's full, (or on file close). When you kill your process, it is this buffer that gets lost.
If the data has been flushed to the OS, then it is kept in a Unix file buffer until the OS decides to persist it to disk (usually fairly soon), or someone runs the sync command. If you kill the power on your computer, then this buffer gets lost as well. You probably don't care about this scenario, because you probably aren't planning to yank the power! But this is what @EJP was talking about (re: 'Stuff that is in the OS buffer/disk cache' won't be lost): your problem is the stdio buffer, not the OS.
In an ideal world, you'd write your app so it fflushed (or std::flush()) at key points. In your example, you'd say:
if (i == 0) {//This case is interesting!
fprintf(filept, "Hello world\n");
fflush(filept);
}
which would cause the stdio buffer to flush to the OS. I imagine your real writer is more complex, and in that situation I would try to make the fflush happen "often but not too often". Too rare, and you lose data when you kill the process, too often and you lose the performance benefits of buffering if you are writing a lot.
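One way to get "often but not too often" is to flush after a fixed number of records. This is only a sketch of that idea: the struct flush_policy type, the write_record function and the chosen interval are all hypothetical, and a time-based policy would work just as well.

```c
#include <stdio.h>

/* Hypothetical helper type: counts records and flushes every `interval` of them. */
struct flush_policy {
    FILE *fp;
    unsigned interval;   /* flush after this many records (must be > 0) */
    unsigned pending;    /* records written since the last flush */
};

/* Write one record; returns 0 on success, -1 on a write or flush error. */
int write_record(struct flush_policy *p, const char *text)
{
    if (fputs(text, p->fp) == EOF)
        return -1;
    if (++p->pending >= p->interval) {
        if (fflush(p->fp) == EOF)   /* push the stdio buffer to the OS */
            return -1;
        p->pending = 0;
    }
    return 0;
}
```

With an interval of, say, 100, killing the process loses at most the last 99 records, while most writes still benefit from buffering.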
In your described situation, where the program is already running and can't be stopped and rewritten, then your only hope, as you say, is to stop it in a debugger. The details of what you need to do depend on the implementation of the standard library, but you can usually look inside the FILE *filept object and start following pointers, messy though that is. @ivan_pozdeev's comment about executing std::flush() or fflush() within the debugger is helpful.
By default, the response to the signal SIGTERM is to shut down the application immediately. However, you can add your own custom signal handler to override this behaviour, like this:
#include <unistd.h>
#include <signal.h>
#include <atomic>
...
std::atomic_bool shouldStop;
...
void signalHandler(int sig)
{
//code for clean shutdown goes here: MUST be async-signal safe, such as:
shouldStop = true;
}
...
int main()
{
...
signal(SIGTERM, signalHandler); //this tells the OS to use your signal handler instead of default
signal(SIGINT, signalHandler); //can do it for other signals too
...
//main work logic, which could be of form:
while(!shouldStop) {
...
if(someTerminatingCondition) break;
...
}
//cleanup including flushing
...
}
Be aware that if you take this approach, you must make sure that your program does actually terminate after your custom handler has run (it is under no obligation to do so immediately, and can run clean-up logic as it sees fit). If it doesn't shut down, Linux will not shut it down either, so the SIGTERM will be 'ignored' from an outside perspective.
Note that by default the Linux kill command sends a SIGTERM, invoking the behaviour above. If your program is running in the foreground and Ctrl-C is pressed, a SIGINT is sent instead, which is why you might want to handle that as well, as shown above.
Note also, the implementation suggested above takes care to be safe, in that no async logic is performed in the signal handler other than setting an atomic flag. This is important, as pointed out in the comments below. See the Async-signal safe section of this page for details of what is and isn't allowed.
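For a plain C program the same pattern can be written with volatile sig_atomic_t and sigaction(). This is a minimal sketch, assuming a POSIX system; the names should_stop, install_stop_handlers and stop_requested are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Set by the handler, polled by the main loop. */
static volatile sig_atomic_t should_stop = 0;

static void on_stop_signal(int sig)
{
    (void)sig;
    should_stop = 1;   /* the only work done in signal context */
}

/* Install the handler for SIGTERM and SIGINT; returns 0 on success. */
int install_stop_handlers(void)
{
    struct sigaction sa;
    sa.sa_handler = on_stop_signal;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGTERM, &sa, NULL) != 0)
        return -1;
    return sigaction(SIGINT, &sa, NULL);
}

/* Nonzero once either signal has been received. */
int stop_requested(void)
{
    return should_stop != 0;
}
```

The main loop would then run while stop_requested() returns 0 and, once it returns nonzero, fall through to the normal cleanup path (fflush(), fclose(), return from main), so the stdio buffers are written out as in any orderly exit.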
There is a question about using exit in C++. The answer explains that it is not a good idea, mainly because of RAII: if exit is called somewhere in the code, destructors of objects with automatic storage duration will not be called, so if, for example, a destructor was meant to write data to a file, that will not happen.
I was interested in how the situation looks in C. Do similar issues apply there as well? Since C has no constructors or destructors, I thought the situation might be different. So is it OK to use exit in C? For example, I have sometimes seen functions like the following used in C:
void die(const char *message)
{
if(errno) {
perror(message);
} else {
printf("ERROR: %s\n", message);
}
exit(1);
}
Unlike abort(), the exit() function in C is considered to be a "graceful" exit.
From C11 (N1570) 7.22.4.4/p2 The exit function (emphasis mine):
The exit function causes normal program termination to occur.
The Standard also says in 7.22.4.4/p4 that:
Next, all open streams with unwritten buffered data are flushed, all
open streams are closed, and all files created by the tmpfile function
are removed.
It is also worth looking at 7.21.3/p5 Files:
If the main function returns to its original caller, or if the exit
function is called, all open files are closed (hence all output
streams are flushed) before program termination. Other paths to
program termination, such as calling the abort function, need not
close all files properly.
However, as mentioned in comments below you can't assume that it will cover every other resource, so you may need to resort to atexit() and define callbacks for their release individually. In fact it is exactly what atexit() is intended to do, as it says in 7.22.4.2/p2 The atexit function:
The atexit function registers the function pointed to by func, to be
called without arguments at normal program termination.
Notably, the C standard does not say precisely what should happen to objects of allocated storage duration (i.e. malloc()), thus requiring you be aware of how it is done on particular implementation. For modern, host-oriented OS it is likely that the system will take care of it, but still you might want to handle this by yourself in order to silence memory debuggers such as Valgrind.
Yes, it is ok to use exit in C.
To ensure that buffers are flushed and the shutdown is graceful and orderly, it is recommended to register a cleanup function with atexit; more information on this here
An example code would be like this:
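#include <stdio.h>
#include <stdlib.h>

static FILE *fp;   /* opened elsewhere in the program */
static void *ptr;  /* allocated elsewhere in the program */

void cleanup(void){
    /* example of closing a file pointer and freeing memory */
    if (fp) fclose(fp);
    if (ptr) free(ptr);
}
int main(int argc, char **argv){
    /* ... */
    atexit(cleanup);
    /* ... */
    return 0;
}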
Now, whenever exit is called or main returns, the function cleanup will be executed; it can house graceful shutdown, cleanup of buffers, memory, etc.
You don't have constructors and destructors, but you may have resources (e.g. files, streams, sockets), and it is important to close them correctly. Buffered data may not have been written out yet, so exiting the program without correctly closing the resource first could lead to corruption.
Using exit() is OK
Two major aspects of code design that have not yet been mentioned are 'threading' and 'libraries'.
In a single-threaded program, in the code you're writing to implement that program, using exit() is fine. My programs use it routinely when something has gone wrong and the code isn't going to recover.
But…
However, calling exit() is a unilateral action that can't be undone. That's why both 'threading' and 'libraries' require careful thought.
Threaded programs
If a program is multi-threaded, then using exit() is a dramatic action which terminates all the threads. It will probably be inappropriate to exit the entire program. It may be appropriate to exit the thread, reporting an error. If you're cognizant of the design of the program, then maybe that unilateral exit is permissible, but in general, it will not be acceptable.
Library code
And that 'cognizant of the design of the program' clause applies to code in libraries, too. It is very seldom correct for a general purpose library function to call exit(). You'd be justifiably upset if one of the standard C library functions failed to return just because of an error. (Obviously, functions like exit(), _Exit(), quick_exit(), abort() are intended not to return; that's different.) The functions in the C library therefore either "can't fail" or return an error indication somehow. If you're writing code to go into a general purpose library, you need to consider the error handling strategy for your code carefully. It should fit in with the error handling strategies of the programs with which it is intended to be used, or the error handling may be made configurable.
I have a series of library functions (in a package with header "stderr.h", a name which treads on thin ice) that are intended to exit as they're used for error reporting. Those functions exit by design. There are a related series of functions in the same package that report errors and do not exit. The exiting functions are implemented in terms of the non-exiting functions, of course, but that's an internal implementation detail.
I have many other library functions, and a good many of them rely on the "stderr.h" code for error reporting. That's a design decision I made and is one that I'm OK with. But when the errors are reported with the functions that exit, it limits the general usefulness of the library code. If the code calls the error reporting functions that do not exit, then the main code paths in the function have to deal with error returns sanely: detect them and relay an error indication to the calling code.
The code for my error reporting package is available in my SOQ (Stack Overflow Questions) repository on GitHub as files stderr.c and stderr.h in the src/libsoq sub-directory.
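The split between exiting and non-exiting reporting functions can be sketched as follows. This is not the code from that repository: err_report, err_exit and parse_percent are hypothetical names chosen only to illustrate the layering and the "return an error indication" style for library code.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical names; not the API of the stderr.h package mentioned above. */

/* Non-exiting variant: report and return, so the caller decides what happens. */
int err_report(const char *fmt, ...)
{
    va_list args;
    va_start(args, fmt);
    int n = vfprintf(stderr, fmt, args);
    va_end(args);
    fputc('\n', stderr);
    return n;
}

/* Exiting variant, implemented in terms of the non-exiting one. */
void err_exit(int status, const char *message)
{
    err_report("%s", message);
    exit(status);
}

/* Library-style function: returns -1 on bad input instead of exiting. */
int parse_percent(const char *text, int *out)
{
    char *end;
    long value = strtol(text, &end, 10);
    if (end == text || *end != '\0' || value < 0 || value > 100) {
        err_report("invalid percentage: '%s'", text);
        return -1;
    }
    *out = (int)value;
    return 0;
}
```

A program's main() is free to call err_exit() when parse_percent() fails, while another caller of the same library function can recover and carry on.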
One reason to avoid exit in functions other than main() is the possibility that your code might be taken out of context. Remember, exit is a type of non local control flow. Like uncatchable exceptions.
For example, you might write some storage management functions that exit on a critical disk error. Then someone decides to move them into a library. Exiting from a library will cause the calling program to exit in an inconsistent state which it may not be prepared for.
Or you might run it on an embedded system. There is nowhere to exit to, the whole thing runs in a while(1) loop in main(). It might not even be defined in the standard library.
Depending on what you are doing, exit may be the most logical way out of a program in C. I know it's very useful for checking to make sure chains of callbacks work correctly. Take this example callback I used recently:
unsigned char cbShowDataThenExit( unsigned char *data, unsigned short dataSz,unsigned char status)
{
printf("cbShowDataThenExit with status %X (dataSz %d)\n", status, dataSz);
printf("status:%d\n",status);
printArray(data,dataSz);
cleanUp();
exit(0);
}
In the main loop, I set everything up for this system and then wait in a while(1) loop. It is possible to use a global flag to exit the while loop instead, but this is simple and does what it needs to do. If you are dealing with any open resources such as files and devices, you should clean them up before exiting, for consistency.
In a big project it is terrible when any code can exit the process without leaving a core dump. A trace is very important for maintaining an online server.
I am writing a C program for an embedded Linux (debian-arm) device. In some cases, e.g. if a fatal error occurs in the system/program, I want the program to reboot the system with system("reboot"); after logging the error(s) via syslog(). My program uses multiple threads, UDP sockets, several fwrite()/fopen() and malloc() calls, ...
I would like to ask a few questions about what the program should do just before rebooting the system, apart from the syslog call. I would appreciate knowing how experienced programmers handle this.
Is it necessary to close the open (UDP) sockets and end the threads just before rebooting? If so, is there a function/system call that closes all open sockets and threads? If the threads need to be ended and there is no such global function/call, how am I supposed to execute pthread_exit(NULL); for each specific thread? Do I need to use something like goto to end each thread?
How should the program close the files that fopen and fwrite use? Is there a global call to close the files in use, or do I need to find the files in use manually and then call fclose for each file? In some examples on the forums I see fflush(), flush(), sync(), ... being used; which one(s) would you recommend? In the generic case, would it cause any problem if all of these functions were used (even if unnecessarily)?
It is not necessary to free the memory that malloc allocated, is it?
Do you suggest any other tasks to be performed?
The system automatically issues SIGTERM signals to all processes as one of the steps in rebooting. As long as you correctly handle SIGTERM, you need not do anything special after invoking the reboot command. The normal idiom for "correctly handling SIGTERM" is:
Create a pipe to yourself.
The signal handler for SIGTERM writes one byte (any value will do) to that pipe.
Your main select loop includes the read end of that pipe in the set of file descriptors of interest. If that pipe ever becomes readable, it's time to exit.
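The three steps above can be sketched as follows. This is a minimal illustration assuming a POSIX system; the names wake_pipe, self_pipe_init and wait_for_stop are hypothetical, and a real program would add its own sockets to the same select() call.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

static int wake_pipe[2] = { -1, -1 };   /* [0] read end, [1] write end */

static void on_sigterm(int sig)
{
    int saved_errno = errno;            /* write() may change errno */
    unsigned char byte = 1;
    ssize_t n = write(wake_pipe[1], &byte, 1);   /* write() is async-signal-safe */
    (void)n;
    (void)sig;
    errno = saved_errno;
}

/* Step 1 and 2: create the pipe and install the handler; returns 0 on success. */
int self_pipe_init(void)
{
    struct sigaction sa;
    if (pipe(wake_pipe) != 0)
        return -1;
    /* Non-blocking write end, so the handler can never block on a full pipe. */
    int flags = fcntl(wake_pipe[1], F_GETFL);
    if (flags == -1 || fcntl(wake_pipe[1], F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;
    sa.sa_handler = on_sigterm;
    sa.sa_flags = SA_RESTART;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGTERM, &sa, NULL);
}

/* Step 3: wait up to timeout_ms for a termination request.
 * Returns 1 if SIGTERM was received, 0 on timeout, -1 on error.
 * The byte is not drained, because the caller is expected to shut down. */
int wait_for_stop(int timeout_ms)
{
    for (;;) {
        fd_set readfds;
        struct timeval tv;
        FD_ZERO(&readfds);
        FD_SET(wake_pipe[0], &readfds);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        int rc = select(wake_pipe[0] + 1, &readfds, NULL, NULL, &tv);
        if (rc < 0 && errno == EINTR)
            continue;                   /* interrupted by the signal itself; retry */
        if (rc < 0)
            return -1;
        return rc > 0 && FD_ISSET(wake_pipe[0], &readfds);
    }
}
```

When wait_for_stop() returns 1, the main loop breaks out and the program leaves through its normal exit path, so open stdio streams are flushed and closed as usual.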
Furthermore, when a process exits, the kernel automatically closes all its open file descriptors, terminates all of its threads, and deallocates all of its memory. And if you exit cleanly, i.e. by returning from main or calling exit, all stdio FILEs that are still open are automatically flushed and closed. Therefore, you probably don't have to do very much cleanup on the way out -- the most important thing is to make sure you finish generating any output files and remove any temporary files.
You may find the concept of crash-only software useful in figuring out what does and does not need cleaning up.
The only cleanup you need to do is anything your program needs to start up in a consistent state. For example, if you collect some data internally then write it to a file, you will need to ensure this is done before exiting. Other than that, you do not need to close sockets, close files, or free all memory. The operating system is designed to release these resources on process exit.
Firstly, I'm aware that opening a file with fopen() and not closing it is horribly irresponsible, and bad form. This is just sheer curiosity, so please humour me :)
I know that if a C program opens a bunch of files and never closes any of them, eventually fopen() will start failing. Are there any other side effects that could cause problems outside the code itself? For instance, if I have a program that opens one file, and then exits without closing it, could that cause a problem for the person running the program? Would such a program leak anything (memory, file handles)? Could there be problems accessing that file again once the program had finished? What would happen if the program was run many times in succession?
As long as your program is running, if you keep opening files without closing them, the most likely result is that you will run out of file descriptors/handles available for your process, and attempting to open more files will fail eventually. On Windows, this can also prevent other processes from opening or deleting the files you have open, since by default, files are opened in an exclusive sharing mode that prevents other processes from opening them.
Once your program exits, the operating system will clean up after you. It will close any files you left open when it terminates your process, and perform any other cleanup that is necessary (e.g. if a file was marked delete-on-close, it will delete the file then; note that that sort of thing is platform-specific).
However, another issue to be careful of is buffered data. Most file streams buffer data in memory before writing it out to disk. If you're using FILE* streams from the stdio library, then there are two possibilities:
Your program exited normally, either by calling the exit(3) function, or by returning from main (which implicitly calls exit(3)).
Your program exited abnormally; this can be via calling abort(3) or _Exit(3), dying from a signal/exception, etc.
If your program exited normally, the C runtime will take care of flushing any buffered streams that were open. So, if you had buffered data written to a FILE* that wasn't flushed, it will be flushed on normal exit.
Conversely, if your program exited abnormally, any buffered data will not be flushed. The OS just says "oh dear me, you left a file descriptor open, I better close that for you" when the process terminates; it has no idea there's some random data lying somewhere in memory that the program intended to write to disk but did not. So be careful about that.
The C standard says that calling exit (or, equivalently, returning from main) causes all open FILE objects to be closed as-if by fclose. So this is perfectly fine, except that you forfeit the opportunity to detect write errors.
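To keep that opportunity, close the stream yourself and check the result. The following is a small sketch; save_text is a hypothetical helper name.

```c
#include <stdio.h>

/* Write `text` to `path`; returns 0 only if every step, including the final
 * flush performed by fclose(), succeeded. */
int save_text(const char *path, const char *text)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    int failed = (fputs(text, fp) == EOF);
    /* fclose() flushes the buffer, so a full disk often shows up only here. */
    if (fclose(fp) == EOF)
        failed = 1;
    return failed ? -1 : 0;
}
```

If the stream is instead left for exit() to close, a failed final write goes unreported. On Linux, pointing the function at /dev/full should be a convenient way to provoke such a failure, since the error is expected to surface at fclose() rather than at fputs().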
EDIT: There is no such guarantee for abnormal termination (abort, a failed assert, receipt of a signal whose default behavior is to abnormally terminate the program -- note that there aren't necessarily any such signals -- and other implementation-defined means). As others have said, modern operating systems will clean up all externally visible resources, such as open OS-level file handles, regardless; however, FILEs are likely not to be flushed in that case.
There certainly have been OSes that did not clean up externally visible resources on abnormal termination; it tends to go along with not enforcing hard privilege boundaries between "kernel" and "user" code and/or between distinct user space "processes", simply because if you don't have those boundaries it may not be possible to do so safely in all cases. (Consider, for instance, what happens if you write garbage over the open-file table in MS-DOS, as you are perfectly able to do.)
Assuming you exit under control, using the exit() function or returning from main(), then the open file streams are closed after flushing. The C Standard (and POSIX) mandate this.
If you exit out of control (core dump, SIGKILL) etc, or if you use _exit() or _Exit(), then the open file streams are not flushed (but the file descriptors end up closed, assuming a POSIX-like system with file descriptors - Standard C does not mandate file descriptors). Note that _Exit() is mandated by the C99 standard, but _exit() is mandated by POSIX (but they behave the same on POSIX systems). Note that file descriptors are separate from file streams. See the discussion of 'Consequences of Program Termination' on the POSIX page for _exit() to see what happens when a program terminates under Unix.
When the process dies, most modern operating systems (the kernel specifically) will free all of your handles and allocated memory.