What's the point of fclose()? [duplicate] - c

This question already has answers here:
What happens if I don't call fclose() in a C program?
(4 answers)
Closed 7 years ago.
From what I read, fclose() is basically like the free() when memories been allocated but I also read that the operating system will close that file for you and flush away any streams that were open right after it terminates. I've even tested a few programs without fclose() and they all seem to work fine.

A long-running process (ex. a database or web browser) may need to open many files during its lifetime; keeping unused files open wastes resources, and potentially locks other processes out of using the files.
Additionally, fclose flushes the user-space buffer that is frequently used when writing to files to improve performance; if the process exits without flushing that buffer (with fflush/fclose), the data still in the buffer will be lost.

Most modern OSes will also reclaim the memory you malloc()'ed, but using free() when appropriate is still good practice. The point is that once you no longer need a resource you should relinquish it, so the system can repurpose whatever backing resources were reclaimed for use by other applications (typically, memory). Also there are limits on the number of file descriptors you can keep open at the same time.
Apart from that there are further considerations in the case of open() and friends, specifically by default open file descriptors are inherited accross thread and fork()'ed process boundaries. This means that if you fail to close() file descriptors, you may find that a child process can access files opened by the parent process. This is typically undesirable, it's a trivial security hole if you want a privileged parent process to spawn a slave process with lesser privileges.
Additionally, the semantics of unlink() and friends are that the file contents are only 'deleted' once the last open file descriptor to the file is close()'d so again: if you keep files open for longer than strictly necessary you cause suboptimal behaviour in the overall system.
Finally, in the case of sockets a close() also corresponds to disconnecting from the remote peer.

Related

Why should I close all file descriptors after calling fork() and prior to calling exec...()? And how would I do it?

I've seen a lot of C code that tries to close all file descriptors between calling fork() and calling exec...(). Why is this commonly done and what is the best way to do it in my own code, as I've seen so many different implementations already?
When calling fork(), your operation system creates a new process by simply cloning your existing process. The new process will be pretty much identical to the process it was cloned from, except for its process ID and any properties that are documented to be replaced or reset by the fork() call.
When calling any form of exec...(), the process image of the calling process is replaced by a new process image but other than that the process state is preserved. One consequence is that open file descriptors in the process file descriptor table prior to calling exec...() are still present in that table after calling it, so the new process code inherits access to them. I guess this has probably been done so that STDIN, STDOUT, and STDERR are automatically inherited by child processes.
However, keep in mind that in POSIX C file descriptors are not only used to access actual files, they are also used for all kind of system and network sockets, pipes, shared memory identifiers, and so on. If you don't close these prior to calling exec...(), your new child process will get access to all of them, even to those resources it could not gain access on its own as it doesn't even have the required access rights. Think about a root process creating a non-root child process, yet this child would have access to all open file descriptors of the root parent process, including open files that should only be writable by root or protected server sockets below port 1024.
So unless you want a child process to inherit access to currently open file descriptors, as may explicitly be desired e.g. to capture STDOUT of a process or feed data via STDIN to that process, you are required to close them prior to calling exec...(). Not only because of security (which sometimes may play no role at all) but also because otherwise the child process will have less free file descriptors available (and think of a long chain of processes, each opening files and then spawning a sub-process... there will be less and less free file descriptors available).
One way to do that is to always open files using the flag O_CLOEXEC, which ensures that this file descriptor is automatically closed when exec...() is ever called. One problem with that solution is that you cannot control how external libraries may open files, so you cannot rely that all code will always set this flag.
Another problem is that this solution only works for file descriptors created with open(). You cannot pass that flag when creating sockets, pipes, etc. This is a known problem and some systems are working around that by offering the non-standard acccept4(), pipe2(), dup3(), and the SOCK_CLOEXEC flag for sockets, however these are not yet POSIX standard and it's unknown if they will become standard (this is planned but until a new standard has been released we cannot know for sure, also it will take years until all systems have adopted them).
What you can do is to later on set the flag FD_CLOEXEC using fcntl() on the file descriptor, however, note that this isn't safe in a multi-thread environment. Just consider the following code:
int so = socket(...);
fcntl(so, F_SETFD, FD_CLOEXEC);
If another thread calls fork() in between the first and the second line, which is of course possible, the flag has not yet been set yet and thus this file descriptor won't get closed.
So the only way that is really safe is to explicitly close them and this is not as easy as it may seem!
I've seen a lot of code that does stupid things like this:
for (int i = STDERR_FILENO + 1; i < 256; i++) close(i);
But just because some POSIX systems have a default limit of 256 doesn't mean that this limit cannot be raised. Also on some system the default limit is always higher to begin with.
Using FD_SETSIZE instead of 256 is equally wrong as just because the select() API has a hard limit by default on most systems doesn't mean that a process cannot have more open file descriptors than this limit (after all you don't have to use select() with them, you can use poll() API as a replacement and poll() has no upper limit on file descriptor numbers).
Always correct is to use OPEN_MAX instead of 256 as that is really the absolute maximum of file descriptors a process can have. The downside is that OPEN_MAX can theoretically be huge and doesn't reflect the real current runtime limit of a process.
To avoid having to close too many non-existing file descriptors, you can use this code instead:
int fdlimit = (int)sysconf(_SC_OPEN_MAX);
for (int i = STDERR_FILENO + 1; i < fdlimit; i++) close(i);
sysconf(_SC_OPEN_MAX) is documented to update correctly if the open file limit (RLIMIT_NOFILE) has been raised using setrlimit(). The resource limits (rlimits) are the effective limits for a running process and for files they will always have to be between _POSIX_OPEN_MAX (documented as the minimum number of file descriptors a process is always allowed to open, must be at least 20) and OPEN_MAX (must be at least _POSIX_OPEN_MAX and sets the upper limit).
While closing all possible descriptors in a loop is technically correct and will work as desired, it may try to close several thousand file descriptors, most of them will often not exist. Even if the close() call for a non-existing file descriptor is fast (which is not guaranteed by any standard), it may take a while on weaker systems (think of embedded devices, think of small single-board computers), which may be a problem.
So several systems have developed more efficient ways to solve this issue. Famous examples are closefrom() and fdwalk() which BSD and Solaris systems support. Unfortunately The Open Group voted against adding closefrom() to the standard (quote): "it is not possible to standardize an interface that closes arbitrary file descriptors above a certain value while still guaranteeing a conforming environment." (Source) This is of course nonsense, as they make the rules themselves and if they define that certain file descriptors can always be silently omitted from closing if the environment or system requires or the code itself requests that, then this would break no existing implementation of that function and still offer the desired functionality for the rest of us. Without these functions people will use a loop and do exactly what The Open Group tries to avoid here, so not adding it only makes the situation even worse.
On some platforms you are basically out of luck, e.g. macOS, which is fully POSIX conform. If you don't want to close all file descriptors in a loop on macOS, your only option is to not use fork()/exec...() but instead posix_spawn(). posix_spawn() is a newer API for platforms that don't support process forking, it can be implemented purely in user space on top of fork()/exec...() for those platforms that do support forking and can otherwise use some other API a platform offers for starting child processes. On macOS there exists a non-standard flag POSIX_SPAWN_CLOEXEC_DEFAULT, which will tread all file descriptors as if the CLOEXEC flag has been set on them, except for those for that you explicitly specified file actions.
On Linux you can get a list of file descriptors by looking at the path /proc/{PID}/fd/ with {PID} being the process ID of your process (getpid()), that is, if the proc file system has been mounted at all and it has been mounted to /proc (but a lot of Linux tools rely on that, not doing so would break many other things as well). Basically you can limit yourself to close all descriptors listed under this path.
True story: Once upon a time I wrote a simple little C program that opened a file, and I noticed that the file descriptor returned by open was 4. "That's funny," I thought. "Standard input, output, and error are always file descriptors 0, 1, and 2, so the first file descriptor you open is usually 3."
So I wrote another little C program that started reading from file descriptor 3 (without opening it, that is, but rather, assuming that 3 was a pre-opened fd, just like 0, 1, and 2). It quickly became apparent that, on the Unix system I was using, file descriptor 3 was pre-opened on the system password file. This was evidently a bug in the login program, which was exec'ing my login shell with fd 3 still open on the password file, and the stray fd was in turn being inherited by programs I ran from my shell.
Naturally the next thing I tried was a simple little C program to write to the pre-opened file descriptor 3, to see if I could modify the password file and give myself root access. This, however, didn't work; the stray fd 3 was opened on the password file in read-only mode.
But at any rate, this helps to explain why you shouldn't leave file descriptors open when you exec a child process.
[Footnote: I said "true story", and it mostly is, but for the sake of the narrative I did change one detail. In fact, the buggy version of /bin/login was leaving fd 3 opened on the groups file, /etc/group, not the password file.]

Why do we need to close a file in C? [duplicate]

This question already has answers here:
What happens if I don't call fclose() in a C program?
(4 answers)
Closed 8 years ago.
Suppose that we have opened a file using fopen() in C and we unintentionally forget to close it using fclose() then what could be the consequences of it? Also what are the solutions to it if we are not provided with the source code but only executable?
The consequences are that a file descriptor is "leaked". The operating system uses some descriptor, and has some resources associated with that open file. If you fopen and don't close, then that descriptor won't be cleaned up, and will persist until the program closes.
This problem is compounded if the file can potentially be opened multiple times. As the program runs more and more descriptors will be leaked, until eventually the operating system either refuses or is unable to create another descriptor, in which case the call to fopen fails.
If you are only provided with the executable, not the source code, your options are very limited. At that point you'd have to either try decompiling or rewriting the assembly by hand, neither of which are attractive options.
The correct thing to do is file a bug report, and then get an updated/fixed version.
If there are a lot of files open but not closed properly, the program will eventually run out of file handles and/or memory space and crash.
Suggest you engage your developer to update their code.
The consequences is implementation dependent based on the fclose / fopen and associated functions -- they are buffered input/output functions. So things write are written to a "file" is in fact first written to an internal buffer -- the buffer is only flushed to output when the code "feels like it" -- that could be every line, every write of every full block depending on the smartness of the implementation.
The fopen will most likely use open to get an actual file descriptor to the operating system -- on most systems (Linux, Windows etc) the os file descriptor will be closed by the OS when the process terminates -- however if the program does not terminates, the os file descriptor will leak and you will eventually run out of file descriptors and die.
Some standard may mandate a specific behavior when the program terminates either cleanly or through a crash, but the fact is that you cannot reply in this as not all implementations may follow this.
So your risk is that you will loose some of the data which you program believed that it had written -- that would be the data which was sitting in the internal buffer but never flushed -- or you may run out of file descriptors and die.
So, fix the code.

how to reboot a Linux system when a fatal error occurs (C programming)

I am writing a C program for an embedded Linux (debian-arm) device. In some cases, e.g. if a fatal error occurs on the system/program, I want the program to reboot the system by system("reboot");after logging the error(s) via syslog(). My program includes multithreads, UDP sockets, severalfwrite()/fopen(), malloc() calls, ..
I would like to ask a few question what (how) the program should perform processes just before rebooting the system apart from the syslog. I would appreciate to know how these things are done by the experienced programmers.
Is it necessary to close the open sockets (UDP) and threads just before rebooting? If it is the case, is there a function/system call that closes the all open sockets and threads? If the threads needs to be closed and there is no such global function/call to end them, how I suppose to execute pthread_exit(NULL); for each specific threads? Do I need go use something like goto to end the each threads?
How should the program closes files that fopen and fwrite uses? Is there a global call to close the files in use or do I need to find out the files in use manually then use fclose for the each file? I see see some examples on the forums fflush(), flush(), sync(),.. are used, which one(s) would you recommend to use? In a generic case, would it cause any problem if all of these functions are used (although these could be used unnecessary)?
It is not necessary to free the variables that malloc allocated space, is it?
Do you suggest any other tasks to be performed?
The system automatically issues SIGTERM signals to all processes as one of the steps in rebooting. As long as you correctly handle SIGTERM, you need not do anything special after invoking the reboot command. The normal idiom for "correctly handling SIGTERM" is:
Create a pipe to yourself.
The signal handler for SIGTERM writes one byte (any value will do) to that pipe.
Your main select loop includes the read end of that pipe in the set of file descriptors of interest. If that pipe ever becomes readable, it's time to exit.
Furthermore, when a process exits, the kernel automatically closes all its open file descriptors, terminates all of its threads, and deallocates all of its memory. And if you exit cleanly, i.e. by returning from main or calling exit, all stdio FILEs that are still open are automatically flushed and closed. Therefore, you probably don't have to do very much cleanup on the way out -- the most important thing is to make sure you finish generating any output files and remove any temporary files.
You may find the concept of crash-only software useful in figuring out what does and does not need cleaning up.
The only cleanup you need to do is anything your program needs to start up in a consistent state. For example, if you collect some data internally then write it to a file, you will need to ensure this is done before exiting. Other than that, you do not need to close sockets, close files, or free all memory. The operating system is designed to release these resources on process exit.

Preventing reuse of file descriptors

Is there anyway in Linux (or more generally in a POSIX OS) to guarantee that during the execution of a program, no file descriptors will be reused, even if a file is closed and another opened? My understanding is that this situation would usually lead to the file descriptor for the closed file being reassigned to the newly opened file.
I'm working on an I/O tracing project and it would make life simpler if I could assume that after an open()/fopen() call, all subsequent I/O to that file descriptor is to the same file.
I'll take either a compile-time or run-time solution.
If it is not possible, I could do my own accounting when I process the trace file (noting the location of all open and close calls), but I'd prefer to squash the problem during execution of the traced program.
Note that POSIX requires:
The open() function shall return a file descriptor for the named file
that is the lowest file descriptor not currently open for that
process.
So in the strictest sense, your request will change the program's environment to be no longer POSIX compliant.
That said, I think your best bet is to use the LD_PRELOAD trick to intercept calls to close and ignore them.
You'd have to write a SO that contains a close(2) that opens /dev/null on old FDs, and then use $LD_PRELOAD to load it into process space before starting the application.
You must already be ptraceing the application to intercept its file opening and closing operations.
It would appear trivial to prevent FD re-use by "injecting" dup2(X, Y); close(X); calls into the application, and adjusting Y to be anything you want.
However, the application itself could be using dup2 to force a re-use of previously closed FD, and may not work if you prevent that, so I think you'll just have to deal with this in post-processing step.
Also, it's quite easy to write an app that will run out of FDs if you disallow re-use.

What happens if I don't call fclose() in a C program?

Firstly, I'm aware that opening a file with fopen() and not closing it is horribly irresponsible, and bad form. This is just sheer curiosity, so please humour me :)
I know that if a C program opens a bunch of files and never closes any of them, eventually fopen() will start failing. Are there any other side effects that could cause problems outside the code itself? For instance, if I have a program that opens one file, and then exits without closing it, could that cause a problem for the person running the program? Would such a program leak anything (memory, file handles)? Could there be problems accessing that file again once the program had finished? What would happen if the program was run many times in succession?
As long as your program is running, if you keep opening files without closing them, the most likely result is that you will run out of file descriptors/handles available for your process, and attempting to open more files will fail eventually. On Windows, this can also prevent other processes from opening or deleting the files you have open, since by default, files are opened in an exclusive sharing mode that prevents other processes from opening them.
Once your program exits, the operating system will clean up after you. It will close any files you left open when it terminates your process, and perform any other cleanup that is necessary (e.g. if a file was marked delete-on-close, it will delete the file then; note that that sort of thing is platform-specific).
However, another issue to be careful of is buffered data. Most file streams buffer data in memory before writing it out to disk. If you're using FILE* streams from the stdio library, then there are two possibilities:
Your program exited normally, either by calling the exit(3) function, or by returning from main (which implicitly calls exit(3)).
Your program exited abnormally; this can be via calling abort(3) or _Exit(3), dying from a signal/exception, etc.
If your program exited normally, the C runtime will take care of flushing any buffered streams that were open. So, if you had buffered data written to a FILE* that wasn't flushed, it will be flushed on normal exit.
Conversely, if your program exited abnormally, any buffered data will not be flushed. The OS just says "oh dear me, you left a file descriptor open, I better close that for you" when the process terminates; it has no idea there's some random data lying somewhere in memory that the program intended to write to disk but did not. So be careful about that.
The C standard says that calling exit (or, equivalently, returning from main) causes all open FILE objects to be closed as-if by fclose. So this is perfectly fine, except that you forfeit the opportunity to detect write errors.
EDIT: There is no such guarantee for abnormal termination (abort, a failed assert, receipt of a signal whose default behavior is to abnormally terminate the program -- note that there aren't necessarily any such signals -- and other implementation-defined means). As others have said, modern operating systems will clean up all externally visible resources, such as open OS-level file handles, regardless; however, FILEs are likely not to be flushed in that case.
There certainly have been OSes that did not clean up externally visible resources on abnormal termination; it tends to go along with not enforcing hard privilege boundaries between "kernel" and "user" code and/or between distinct user space "processes", simply because if you don't have those boundaries it may not be possible to do so safely in all cases. (Consider, for instance, what happens if you write garbage over the open-file table in MS-DOS, as you are perfectly able to do.)
Assuming you exit under control, using the exit() system call or returning from main(), then the open file streams are closed after flushing. The C Standard (and POSIX) mandate this.
If you exit out of control (core dump, SIGKILL) etc, or if you use _exit() or _Exit(), then the open file streams are not flushed (but the file descriptors end up closed, assuming a POSIX-like system with file descriptors - Standard C does not mandate file descriptors). Note that _Exit() is mandated by the C99 standard, but _exit() is mandated by POSIX (but they behave the same on POSIX systems). Note that file descriptors are separate from file streams. See the discussion of 'Consequences of Program Termination' on the POSIX page for _exit() to see what happens when a program terminates under Unix.
When the process dies, most modern operating systems (the kernel specifically) will free all of your handles and allocated memory.

Resources