C file pointer changing after fork and (failed) exec - c

I wrote a program that forks, and I thought the child could not affect the parent. But the parent's file pointer changes even though I did not make any changes in the parent.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
int main(void) {
    FILE *fp = fopen("sm.c", "r");
    char buf[1000];
    char *args[] = {"invailid_command", NULL};
    fgets(buf, sizeof(buf), fp);
    printf("I'm one %d %ld\n", getpid(), ftell(fp));
    if (fork() == 0) {
        execvp(args[0], args);
        exit(EXIT_FAILURE);
    }
    wait(NULL);
    printf("I'm two %d %ld\n", getpid(), ftell(fp));
}
This outputs
I'm one 21500 20
I'm two 21500 -1
I want the file position not to change between the two printf calls. Why does the file pointer change, and can I keep it from changing even though execvp fails?

Credit to Jonathan Leffler for pointing us in the right direction.
Although your program does not produce the same unexpected behavior for me on CentOS 7 / GCC 4.8.5 / GLIBC 2.17, it is entirely plausible that you observe the behavior you describe. Your program's behavior is in fact undefined according to POSIX (on which you rely for fork). Here are some excerpts from the relevant section (emphasis added):
An open file description may be accessed through a file descriptor,
which is created using functions such as open() or pipe(), or through
a stream, which is created using functions such as fopen() or popen().
Either a file descriptor or a stream is called a "handle" on the open
file description to which it refers; an open file description may have
several handles.
[...]
The result of function calls involving any one handle (the "active
handle") is defined elsewhere in this volume of POSIX.1-2017, but if
two or more handles are used, and any one of them is a stream, the
application shall ensure that their actions are coordinated as
described below. If this is not done, the result is undefined.
[...]
For a handle to become the active handle, the application shall ensure
that the actions below are performed between the last use of the
handle (the current active handle) and the first use of the second
handle (the future active handle). The second handle then becomes the
active handle. [...]
The handles need not be in the same process for these rules to apply.
Note that after a fork(), two handles exist where one existed before.
The application shall ensure that, if both handles can ever be
accessed, they are both in a state where the other could become the
active handle first. [Where subject to the preceding qualification, the] application shall prepare for a fork()
exactly as if it were a change of active handle. (If the only action
performed by one of the processes is one of the exec functions or
_exit() (not exit()), the handle is never accessed in that process.)
For the first handle, the first applicable condition below applies.
[An impressively long list of alternatives that do not apply to the OP's situation ...]
If the stream is open with a mode that allows reading and the underlying open file description refers to a device that is capable of
seeking, the application shall either perform an fflush(), or the
stream shall be closed.
For the second handle:
If any previous active handle has been used by a function that explicitly changed the file offset, except as required above for the
first handle, the application shall perform an lseek() or fseek() (as
appropriate to the type of handle) to an appropriate location.
Thus, for the OP's program to access the same stream in both parent and child, POSIX demands that the parent fflush() the stream before forking, and that the child fseek() it after starting. Then, after waiting for the child to terminate, the parent must fseek() the stream. Given that we know the child's exec will fail, however, the requirement for all the flushing and seeking can be avoided by having the child use _exit() (which does not access the stream) instead of exit().
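For illustration, here is a minimal sketch of that coordination protocol, assuming the child really does need to use the stream (error checking omitted):

fflush(fp);                     /* first handle: flush before the fork */
long pos = ftell(fp);
if (fork() == 0) {
    fseek(fp, pos, SEEK_SET);   /* second handle: seek before first use */
    /* ... the child may now use fp ... */
    fflush(fp);                 /* first-handle rule again before handing back */
    _exit(EXIT_SUCCESS);
}
wait(NULL);
fseek(fp, pos, SEEK_SET);       /* parent: seek again before reusing fp */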
Complying with POSIX's provisions yields the following:
When these rules are followed, regardless of the sequence of handles
used, implementations shall ensure that an application, even one
consisting of several processes, shall yield correct results: no data
shall be lost or duplicated when writing, and all data shall be
written in order, except as requested by seeks.
It is worth noting, however, that
It is
implementation-defined whether, and under what conditions, all input
is seen exactly once.
I appreciate that it may be somewhat unsatisfying to hear merely that your expectations for program behavior are not justified by the relevant standards, but that's really all there is. The parent and child processes do have some relevant shared data in the form of a common open file description (with which they have separate handles associated), and that seems likely to be the vehicle for the unexpected (and undefined) behavior, but there's no basis for predicting the specific behavior you see, nor the different behavior I see for the same program.

I was able to reproduce this on Ubuntu 16.04 with gcc 5.4.0. The culprit here is exit in conjunction with the way the child process is being created.
The man page for exit states the following:
The exit() function causes normal process termination and the value
of status & 0377 is returned to the parent (see wait(2)).
All functions registered with atexit(3) and on_exit(3) are
called, in the reverse order of their registration. (It is possible
for one of these functions to use atexit(3) or on_exit(3) to
register an additional function to be executed during exit processing;
the new registration is added to the front of the list of
functions that remain to be called.) If one of these functions does
not return (e.g., it calls _exit(2), or kills itself with a
signal), then none of the remaining functions is called, and further
exit processing (in particular, flushing of stdio(3) streams)
is abandoned. If a function has been registered multiple times
using atexit(3) or on_exit(3), then it is called as many times as
it was registered.
All open stdio(3) streams are flushed and closed. Files created by
tmpfile(3) are removed.
The C standard specifies two constants, EXIT_SUCCESS and
EXIT_FAILURE, that may be passed to exit() to indicate successful or
unsuccessful termination, respectively.
So when you call exit in the child, it closes the FILE represented by fp.
Normally when a child process is created, it gets copies of the parent's file descriptors, and after fork the two processes do not share memory. What they do share is the open file description behind fp, including its file offset. When the child's exit closes the FILE, the stdio library may seek or otherwise adjust the underlying descriptor (for example, to account for buffered but unread data), and those side effects reach the parent through the shared open file description.
If you change the child to instead call _exit, the kernel still closes the child's file descriptors, but the FILE object is never touched, and the second call to ftell in the parent will succeed. It's good practice to use _exit in a non-exec'ed child anyway, because it prevents the parent's atexit handlers from being run in the child.
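For concreteness, here is the child branch from the question with that one-line change (a sketch; everything else stays the same):

if (fork() == 0) {
    execvp(args[0], args);
    /* exec failed: leave without flushing or closing stdio streams */
    _exit(EXIT_FAILURE);
}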

Related

What is unlocked_stdio in C?

So, I was looking through random Linux manual pages when I encountered this weird one. You can see it by executing "man unlocked_stdio", or you can view it in your browser by going to this page.
So, what is this for? It has odd functions like getc_unlocked, getchar_unlocked, putc_unlocked, putchar_unlocked, and so on. All these functions have one thing in common: they take a FILE stream parameter. I know they are the normal I/O functions with "_unlocked" appended, but what does that mean?
It has to do with thread safety.
From your link
Each of these functions has the same behavior as its counterpart without the "_unlocked" suffix, except that they do not use locking (they do not set locks themselves, and do not test for the presence of locks set by others) and hence are thread-unsafe. See flockfile(3).
And from flockfile:
The stdio functions are thread-safe. This is achieved by assigning to each FILE object a lockcount and (if the lockcount is nonzero) an owning thread. For each library call, these functions wait until the FILE object is no longer locked by a different thread, then lock it, do the requested I/O, and unlock the object again.
Here is some pseudocode that shows how it works. This is not necessarily exactly how it is implemented in reality, but it demonstrates the idea and clearly shows the difference from the unlocked version. In terms of functionality, the locked version is essentially a wrapper around the unlocked version.
int getchar(void) {
    // Wait until stdinlock is unlocked, then lock it.
    // This is an atomic operation.
    wait_until_unlocked_and_then_lock(stdinlock);
    // Get the character from stdin.
    int ret = getchar_unlocked();
    // Release the lock to make the input stream available to other threads.
    unlock(stdinlock);
    // And return the value.
    return ret;
}
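The practical payoff of the _unlocked family is batching: take the stream lock once around a whole sequence of operations instead of once per character. A minimal sketch using the real flockfile()/funlockfile() API (the function name read_line_locked is made up for this example):

#include <stdio.h>

static void read_line_locked(FILE *fp, char *buf, size_t n) {
    flockfile(fp);                  /* take the stream lock once */
    size_t i = 0;
    int c;
    while (i + 1 < n && (c = getc_unlocked(fp)) != EOF && c != '\n')
        buf[i++] = (char)c;
    buf[i] = '\0';
    funlockfile(fp);                /* release it after the whole line */
}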

Why should I close all file descriptors after calling fork() and prior to calling exec...()? And how would I do it?

I've seen a lot of C code that tries to close all file descriptors between calling fork() and calling exec...(). Why is this commonly done and what is the best way to do it in my own code, as I've seen so many different implementations already?
When calling fork(), your operating system creates a new process by simply cloning your existing process. The new process will be pretty much identical to the process it was cloned from, except for its process ID and any properties that are documented to be replaced or reset by the fork() call.
When calling any form of exec...(), the process image of the calling process is replaced by a new process image but other than that the process state is preserved. One consequence is that open file descriptors in the process file descriptor table prior to calling exec...() are still present in that table after calling it, so the new process code inherits access to them. I guess this has probably been done so that STDIN, STDOUT, and STDERR are automatically inherited by child processes.
However, keep in mind that in POSIX C, file descriptors are not only used to access actual files; they are also used for all kinds of system and network sockets, pipes, shared memory identifiers, and so on. If you don't close these prior to calling exec...(), your new child process will get access to all of them, even to resources it could not have gained access to on its own, as it doesn't have the required access rights. Think about a root process creating a non-root child process: this child would have access to all open file descriptors of the root parent process, including open files that should only be writable by root or protected server sockets below port 1024.
So unless you want a child process to inherit access to currently open file descriptors, as may explicitly be desired, e.g. to capture STDOUT of a process or feed data via STDIN to that process, you are required to close them prior to calling exec...(). Not only because of security (which sometimes may play no role at all) but also because otherwise the child process will have fewer free file descriptors available (and think of a long chain of processes, each opening files and then spawning a sub-process... there will be fewer and fewer free file descriptors available).
One way to do that is to always open files using the flag O_CLOEXEC, which ensures that this file descriptor is automatically closed when exec...() is ever called. One problem with that solution is that you cannot control how external libraries may open files, so you cannot rely that all code will always set this flag.
Another problem is that this solution only works for file descriptors created with open(). You cannot pass that flag when creating sockets, pipes, etc. This is a known problem, and some systems work around it by offering the non-standard accept4(), pipe2(), and dup3() calls and the SOCK_CLOEXEC flag for sockets; however, these are not yet POSIX standard and it's unknown whether they will become standard (this is planned, but until a new standard has been released we cannot know for sure, and it will take years until all systems have adopted them).
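For illustration, this is what the atomic variants look like in use; a sketch assuming Linux, where these extensions exist (the wrapper name and file name are made up, and error checking is omitted):

#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* All three descriptors are marked close-on-exec at creation time, so
 * there is no window in which another thread's fork()+exec() could
 * inherit them. */
static int open_all_cloexec(int *fd, int *so, int pfd[2]) {
    *fd = open("data.txt", O_RDONLY | O_CLOEXEC);          /* standard POSIX */
    *so = socket(AF_INET, SOCK_STREAM | SOCK_CLOEXEC, 0);  /* Linux extension */
    return pipe2(pfd, O_CLOEXEC);                          /* Linux extension */
}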
What you can do is to later on set the flag FD_CLOEXEC using fcntl() on the file descriptor, however, note that this isn't safe in a multi-thread environment. Just consider the following code:
int so = socket(...);
fcntl(so, F_SETFD, FD_CLOEXEC);
If another thread calls fork() in between the first and the second line, which is of course possible, the flag has not been set yet and thus this file descriptor won't get closed.
So the only way that is really safe is to explicitly close them and this is not as easy as it may seem!
I've seen a lot of code that does stupid things like this:
for (int i = STDERR_FILENO + 1; i < 256; i++) close(i);
But just because some POSIX systems have a default limit of 256 doesn't mean that this limit cannot be raised. Also, on some systems the default limit is higher to begin with.
Using FD_SETSIZE instead of 256 is equally wrong: just because the select() API has a hard limit by default on most systems doesn't mean that a process cannot have more open file descriptors than this limit (after all, you don't have to use select() with them; you can use the poll() API as a replacement, and poll() has no upper limit on file descriptor numbers).
Using OPEN_MAX instead of 256 is always correct, as that really is the absolute maximum number of file descriptors a process can have. The downside is that OPEN_MAX can theoretically be huge and doesn't reflect the real current runtime limit of a process.
To avoid having to close too many non-existing file descriptors, you can use this code instead:
int fdlimit = (int)sysconf(_SC_OPEN_MAX);
for (int i = STDERR_FILENO + 1; i < fdlimit; i++) close(i);
sysconf(_SC_OPEN_MAX) is documented to update correctly if the open file limit (RLIMIT_NOFILE) has been raised using setrlimit(). The resource limits (rlimits) are the effective limits for a running process and for files they will always have to be between _POSIX_OPEN_MAX (documented as the minimum number of file descriptors a process is always allowed to open, must be at least 20) and OPEN_MAX (must be at least _POSIX_OPEN_MAX and sets the upper limit).
While closing all possible descriptors in a loop is technically correct and will work as desired, it may try to close several thousand file descriptors, most of them will often not exist. Even if the close() call for a non-existing file descriptor is fast (which is not guaranteed by any standard), it may take a while on weaker systems (think of embedded devices, think of small single-board computers), which may be a problem.
So several systems have developed more efficient ways to solve this issue. Famous examples are closefrom() and fdwalk() which BSD and Solaris systems support. Unfortunately The Open Group voted against adding closefrom() to the standard (quote): "it is not possible to standardize an interface that closes arbitrary file descriptors above a certain value while still guaranteeing a conforming environment." (Source) This is of course nonsense, as they make the rules themselves and if they define that certain file descriptors can always be silently omitted from closing if the environment or system requires or the code itself requests that, then this would break no existing implementation of that function and still offer the desired functionality for the rest of us. Without these functions people will use a loop and do exactly what The Open Group tries to avoid here, so not adding it only makes the situation even worse.
On some platforms you are basically out of luck, e.g. macOS, which is fully POSIX conformant. If you don't want to close all file descriptors in a loop on macOS, your only option is to not use fork()/exec...() but instead posix_spawn(). posix_spawn() is a newer API for platforms that don't support process forking; it can be implemented purely in user space on top of fork()/exec...() for those platforms that do support forking, and can otherwise use some other API a platform offers for starting child processes. On macOS there exists a non-standard flag POSIX_SPAWN_CLOEXEC_DEFAULT, which will treat all file descriptors as if the CLOEXEC flag had been set on them, except for those for which you explicitly specify file actions.
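A sketch of that macOS approach, assuming Apple's <spawn.h> with the non-standard POSIX_SPAWN_CLOEXEC_DEFAULT flag (the wrapper name spawn_clean is made up):

#include <spawn.h>
#include <unistd.h>

extern char **environ;

static int spawn_clean(pid_t *pid, char *const argv[]) {
    posix_spawnattr_t attr;
    posix_spawnattr_init(&attr);
    /* Treat every inherited descriptor as close-on-exec unless a file
     * action (open/dup2) explicitly provides it to the child. */
    posix_spawnattr_setflags(&attr, POSIX_SPAWN_CLOEXEC_DEFAULT);
    int rc = posix_spawn(pid, argv[0], NULL, &attr, argv, environ);
    posix_spawnattr_destroy(&attr);
    return rc;
}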
On Linux you can get a list of open file descriptors by looking at the path /proc/{PID}/fd/, with {PID} being the process ID of your process (getpid()); that is, if the proc file system has been mounted at all and has been mounted to /proc (but a lot of Linux tools rely on that, and not mounting it there would break many other things as well). Basically, you can limit yourself to closing all descriptors listed under this path.
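A minimal sketch of that /proc-based approach (Linux-specific; error handling kept to a minimum):

#include <dirent.h>
#include <stdlib.h>
#include <unistd.h>

static void close_inherited_fds(void) {
    DIR *d = opendir("/proc/self/fd");
    if (d == NULL)
        return;                       /* /proc not available: fall back to a loop */
    int dfd = dirfd(d);               /* don't close the directory's own descriptor */
    struct dirent *e;
    while ((e = readdir(d)) != NULL) {
        int fd = atoi(e->d_name);     /* "." and ".." yield 0, skipped below */
        if (fd > STDERR_FILENO && fd != dfd)
            close(fd);
    }
    closedir(d);
}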
True story: Once upon a time I wrote a simple little C program that opened a file, and I noticed that the file descriptor returned by open was 4. "That's funny," I thought. "Standard input, output, and error are always file descriptors 0, 1, and 2, so the first file descriptor you open is usually 3."
So I wrote another little C program that started reading from file descriptor 3 (without opening it, that is, but rather, assuming that 3 was a pre-opened fd, just like 0, 1, and 2). It quickly became apparent that, on the Unix system I was using, file descriptor 3 was pre-opened on the system password file. This was evidently a bug in the login program, which was exec'ing my login shell with fd 3 still open on the password file, and the stray fd was in turn being inherited by programs I ran from my shell.
Naturally the next thing I tried was a simple little C program to write to the pre-opened file descriptor 3, to see if I could modify the password file and give myself root access. This, however, didn't work; the stray fd 3 was opened on the password file in read-only mode.
But at any rate, this helps to explain why you shouldn't leave file descriptors open when you exec a child process.
[Footnote: I said "true story", and it mostly is, but for the sake of the narrative I did change one detail. In fact, the buggy version of /bin/login was leaving fd 3 opened on the groups file, /etc/group, not the password file.]

Restriction of C standard I/O and why we can't use C standard I/O with sockets

I have been reading CSAPP recently. In Section 10.9, it says that standard I/O should not be used with sockets, for the following reasons:
(1) The restrictions of standard I/O
Restriction 1: Input functions following output functions. An input
function cannot follow an output function without an intervening call
to fflush, fseek, fsetpos, or rewind. The fflush function empties the
buffer associated with a stream. The latter three functions use the
Unix I/O lseek function to reset the current file position.
Restriction 2: Output functions following input functions. An output
function cannot follow an input function without an intervening call
to fseek, fsetpos, or rewind, unless the input function encounters an
end-of-file.
(2) It is illegal to use the lseek function on a socket.
Question 1: What would happen if I violate the restriction? I wrote a code snippet and it works fine.
Question 2: To work around restriction 2, one approach is as follows:
FILE *fpin, *fpout;
fpin = fdopen(sockfd, "r");
fpout = fdopen(sockfd, "w");
/* Some Work Here */
fclose(fpin);
fclose(fpout);
In the text book, it said,
Closing an already closed descriptor in a threaded program is a
recipe for disaster.
Why?
Your workaround does not work as written, due to the double-close bug you cited. Double-close is harmless in single-threaded programs as long as there are no intervening operations that could open new file descriptors (the second close will just fail harmlessly with EBADF), but it is a critical bug in multi-threaded programs. Consider this scenario:
Thread A calls close(n).
Thread B calls open and it returns n which gets stored as int fd1.
Thread A calls close(n) again.
Thread B calls open again and it returns n again, which gets stored as fd2.
Thread B now attempts to write to fd1 and actually writes into the file opened by the second call to open instead of the one first opened.
This can lead to massive file corruption, information leak (imagine writing a password to a socket instead of a local file), etc.
However, the problem is easy to fix. Instead of calling fdopen twice with the same file descriptor, simply use dup to copy it and pass the copy to fdopen. With this simple fix, stdio is perfectly usable with sockets. It's still not suitable for asynchronous event-loop usage, but if you're using threads for I/O, it works great.
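A sketch of that dup() fix, assuming sockfd is a connected socket descriptor (error checking omitted):

FILE *fpin  = fdopen(sockfd, "r");
FILE *fpout = fdopen(dup(sockfd), "w");
/* ... some work here ... */
fclose(fpin);    /* closes sockfd itself */
fclose(fpout);   /* closes the duplicated descriptor; no double-close */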
Edit: I think I skipped answering your question 1. What happens if you violate the rules about how to switch between input and output on a stdio stream is undefined behavior. This means testing it and seeing that it "works" is not meaningful; it could mean either:
The C implementation you're using provides a definition (as part of its documentation) for what happens in this case, and it matches the behavior you wanted. In this case, you can use it, but your code will not be portable to other implementations. Doing so is considered very bad practice for this reason. Or,
You just got the result you expected by chance, usually as a side effect of how the relevant functionality is implemented internally on the implementation you're using. In this case, there's no guarantee that it doesn't have corner cases that fail to behave as you expected, or that it will continue to work the same way in future releases, etc.

Standard streams and vfork

I am playing a bit with the fork/vfork functions, and there is something that puzzles me. In Stevens' book it is written:
Note in Figure 8.3 that we call _exit instead of exit.
As we described in Section 7.3, _exit does not perform any flushing of standard I/O buffers. If we call exit instead, the results are indeterminate.
Depending on the implementation of the standard I/O library, we might see no difference in the output, or we might find that the output from the parent's printf has disappeared.
If the child calls exit, the implementation flushes the standard I/O streams.
If this is the only action taken by the library, then we will see no difference with the output generated if the child called _exit.
If the implementation also closes the standard I/O streams, however, the memory representing the FILE object for the standard output will be cleared out.
Because the child is borrowing the parent's address space, when the parent resumes and calls printf, no output will appear and printf will return -1.
Note that the parent's STDOUT_FILENO is still valid, as the child gets a copy of the parent's file descriptor array (refer back to Figure 8.2).
Most modern implementations of exit will not bother to close the streams.
Because the process is about to exit, the kernel will close all the file descriptors open in the process.
Closing them in the library simply adds overhead without any benefit.
So I tried to test whether I could get a printf error. The vfork manual page on my system says:
All open stdio(3) streams are flushed and closed. Files created by tmpfile(3) are removed.
but when I compile and execute this program:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main()
{
    int s;
    pid_t ret;

    if (vfork() == 0)
    {
        //abort();
        exit(6);
    }
    else
    {
        ret = wait(&s);
        printf("termination status is %d\n", s);
        if (WIFEXITED(s))
            printf("exited normally, status is %d\n", WEXITSTATUS(s));
    }
    return 0;
}
everything works fine and I don't get any printf errors. Why is that?
The end of the paragraph you quoted says:
Most modern implementations of exit will not bother to close the streams. Because the process is about to exit, the kernel will close all the file descriptors open in the process. Closing them in the library simply adds overhead without any benefit.
This is most likely what's happening. Your OS doesn't actually close the stream (but it does probably flush it).
The important thing isn't what exit does here, it's the underlying concept. The child is sharing the parent's memory and stack frame. That means the child can very easily change something the parent did not expect, which could easily cause the parent to crash or misbehave when it starts running again. The man page for vfork says the only things the child may safely do are call _exit() or one of the exec functions. In fact, the child should not even allocate memory or modify any variables.
To see the impact of this, try putting the vfork call inside of a function and let the child return or modify some variables there and see what happens.
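A minimal sketch of that experiment (deliberately broken code: the child returning through the shared stack frame is undefined behavior, and a crash in the parent is the typical symptom):

#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

static void broken(void) {
    if (vfork() == 0)
        return;          /* child returns: clobbers the frame the parent still needs */
    wait(NULL);
}

int main(void) {
    broken();            /* parent resumes on a corrupted stack */
    printf("if you see this, you got lucky\n");
    return 0;
}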

What is the order in which a POSIX system clears the file locks that were not unlocked cleanly?

The POSIX specification for fcntl() states:
All locks associated with a file for a given process shall be removed when a file descriptor for that file is closed by that process or the process holding that file descriptor terminates.
Is this operation of unlocking the file segment locks that were held by a terminated process atomic per-file? In other words, if a process had locked byte segments B1..B2 and B3..B4 of a file but did not unlock the segments before terminating, when the system gets around to unlocking them, are segments B1..B2 and B3..B4 both unlocked before another fcntl() operation to lock a segment of the file can succeed? If not atomic per-file, does the order in which these file segments are unlocked by the system depend on the order in which the file segments were originally acquired?
The specification for fcntl() does not say, but perhaps there is a general provision in the POSIX specification that mandates a deterministic order on operations to clean up after a process that exits uncleanly or crashes.
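For concreteness, the segment locks described above would be taken roughly like this (a sketch; B1..B4 are the placeholder offsets from the question):

#include <fcntl.h>
#include <unistd.h>

static int lock_segment(int fd, off_t start, off_t len) {
    struct flock fl = {
        .l_type   = F_WRLCK,    /* exclusive write lock */
        .l_whence = SEEK_SET,
        .l_start  = start,
        .l_len    = len,
    };
    return fcntl(fd, F_SETLK, &fl);   /* non-blocking request */
}

/* lock_segment(fd, B1, B2 - B1 + 1);
 * lock_segment(fd, B3, B4 - B3 + 1);
 * ...process terminates without unlocking; the kernel releases both. */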
There's a partial answer in section 2.9.7, Thread Interactions with Regular File Operations, of the POSIX specification:
All of the functions chmod(), close(), fchmod(), fcntl(), fstat(), ftruncate(), lseek(), open(), read(), readlink(), stat(), symlink(), and write() shall be atomic with respect to each other in the effects specified in IEEE Std 1003.1-2001 when they operate on regular files. If two threads each call one of these functions, each call shall either see all of the specified effects of the other call, or none of them.
So, for a regular file, if a thread of a process holds locks on segments of a file and calls close() on the last file descriptor associated with the file, then the effects of close() (including removing all outstanding locks on the file that are held by the process) are atomic with respect to the effects of a call to fcntl() by a thread of another process to lock a segment of the file.
The specification for exit() states:
These functions shall terminate the calling process with the following consequences:
All of the file descriptors, directory streams[, conversion descriptors, and message catalog descriptors] open in the calling process shall be closed.
...
Presumably, open file descriptors are closed as if by appropriate calls to close(), but unfortunately the specification does not say how open file descriptors are "closed".
The 2004 specification seems even more vague when it comes to the steps of abnormal process termination. The only thing I could find is the documentation for abort(). At least with the 2008 specification, there is a section titled Consequences of Process Termination on the page for _Exit(). The wording, though, is still:
All of the file descriptors, directory streams, conversion descriptors, and message catalog descriptors open in the calling process shall be closed.
UPDATE: I just opened issue 0000498 in the Austin Group Defect Tracker.
I don't think the POSIX specification stipulates whether the releasing of locks is atomic or not, so you should assume that it behaves as inconveniently as possible for you. If you need them to be atomic, they aren't; if you need them to be handled separately, they're atomic; if you don't care, some machines will do it one way and other machines the other way. So, write your code so that it doesn't matter.
I'm not sure how you'd write code to detect the problem.
In practice, I expect that the locks would be released atomically, but the standard doesn't say, so you should not assume.
