When sockets are created or files are opened/created in C, is the file descriptor that's assigned to the socket/file guaranteed to be the lowest-valued descriptor available? What does the C spec say about file descriptor assignment in this regard, if anything?
It's not guaranteed to be the lowest, and is implementation dependent (1). In general, however, the routine that assigns open file descriptors uses a method that gives you the first open one. It could be, though, that several lower ones are freed immediately afterwards, leaving you with a higher descriptor than you might expect.
The only reason I can think of to know this, though, is for the select function, which is sped up if you pass it the highest file descriptor you need to check for.
(1) Note that those implementations that follow the IEEE standard do guarantee the lowest unused descriptor for files, but this may not apply to sockets. Not every implementation follows the IEEE standard for open(), so if you're writing portable software it is best not to depend on it.
I don't think you'll find it in the C spec, more likely the spec for your OS. My experience in Linux has been that it's always the lowest.
I'll counter this with another question - why does this matter? You shouldn't be comparing the file descriptor with anything (unless checking for stdin/stdout/stderr) or doing math with it. As long as it fits in an int (and it's guaranteed to) that's all you really need to know.
Steve M is right; C has no notion of sockets, and its file I/O functions use a [pointer to a] FILE object, not a descriptor.
@aib the open(), close(), lseek(), read(), write() all make use of file descriptors. I hardly ever use streams for I/O.
@Kyle it matters because of calls like select(). Knowing the highest descriptor can improve performance.
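To make the select() point concrete, here is a minimal sketch of a helper that waits for one descriptor to become readable. The name wait_readable is made up for this example. The first argument to select() is the highest descriptor number in any of the sets plus one, which is why the numeric value of a descriptor matters here at all.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if fd becomes readable within timeout_ms,
 * 0 on timeout, -1 on error (including an fd that does not fit in an fd_set). */
int wait_readable(int fd, int timeout_ms)
{
    if (fd < 0 || fd >= FD_SETSIZE)
        return -1;                      /* FD_SET() would be out of bounds */

    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    struct timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* nfds is the highest descriptor being watched, plus one */
    int n = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (n <= 0)
        return n;                       /* 0 = timeout, -1 = error */
    return FD_ISSET(fd, &readfds) ? 1 : 0;
}
```

With several descriptors you would track the maximum yourself and pass max + 1, rather than assuming anything about the order in which descriptors were handed out.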
The C spec says that it's implementation dependent. If you're looking at a Unix implementation, the man page for open(2) says "The file descriptor returned by a successful call will be the lowest-numbered file descriptor not currently open for the process."
This helps if you're trying to attach a specific file to a specific descriptor. Say you want to redirect stderr to /dev/null. Something like
close(2); open("/dev/null", O_WRONLY);
ought to do it. You should, of course, capture the fd returned by open and ensure that it's 2.
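A minimal sketch of that check, assuming descriptors 0 and 1 are already open (otherwise open() would return one of those instead). The helper name redirect_stderr_to_devnull is invented for this example.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if descriptor 2 now refers to /dev/null,
 * -1 if open() failed or handed back some other descriptor. */
int redirect_stderr_to_devnull(void)
{
    close(STDERR_FILENO);

    int fd = open("/dev/null", O_WRONLY);
    if (fd == -1)
        return -1;
    if (fd != STDERR_FILENO) {          /* a lower descriptor was free */
        close(fd);
        return -1;
    }
    return 0;
}
```

If you don't want to depend on the lowest-available rule at all, opening /dev/null first and then calling dup2(fd, STDERR_FILENO) gets the same result without the check.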
I've seen a lot of C code that tries to close all file descriptors between calling fork() and calling exec...(). Why is this commonly done and what is the best way to do it in my own code, as I've seen so many different implementations already?
When calling fork(), your operating system creates a new process by simply cloning your existing process. The new process will be pretty much identical to the process it was cloned from, except for its process ID and any properties that are documented to be replaced or reset by the fork() call.
When calling any form of exec...(), the process image of the calling process is replaced by a new process image but other than that the process state is preserved. One consequence is that open file descriptors in the process file descriptor table prior to calling exec...() are still present in that table after calling it, so the new process code inherits access to them. I guess this has probably been done so that STDIN, STDOUT, and STDERR are automatically inherited by child processes.
However, keep in mind that in POSIX C file descriptors are not only used to access actual files, they are also used for all kind of system and network sockets, pipes, shared memory identifiers, and so on. If you don't close these prior to calling exec...(), your new child process will get access to all of them, even to those resources it could not gain access on its own as it doesn't even have the required access rights. Think about a root process creating a non-root child process, yet this child would have access to all open file descriptors of the root parent process, including open files that should only be writable by root or protected server sockets below port 1024.
So unless you want a child process to inherit access to currently open file descriptors, as may explicitly be desired e.g. to capture STDOUT of a process or feed data via STDIN to that process, you are required to close them prior to calling exec...(). Not only because of security (which sometimes may play no role at all) but also because otherwise the child process will have fewer free file descriptors available (and think of a long chain of processes, each opening files and then spawning a sub-process... there will be fewer and fewer free file descriptors available).
One way to do that is to always open files using the flag O_CLOEXEC, which ensures that this file descriptor is automatically closed when exec...() is ever called. One problem with that solution is that you cannot control how external libraries may open files, so you cannot rely that all code will always set this flag.
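As a small illustration, this sketch opens a file with O_CLOEXEC and reads the descriptor flags back to confirm the flag took effect. Both helper names are made up for this example, and the feature-test macro is only there so that O_CLOEXEC is visible under strict compiler modes.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>

/* Hypothetical helper: open read-only with close-on-exec set atomically,
 * i.e. there is no window in which the descriptor exists without the flag. */
int open_cloexec(const char *path)
{
    return open(path, O_RDONLY | O_CLOEXEC);
}

/* Returns 1 if close-on-exec is set on fd, 0 if not, -1 if fd is invalid. */
int has_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;
    return (flags & FD_CLOEXEC) ? 1 : 0;
}
```

has_cloexec() is also a handy way to audit descriptors that were opened by code you don't control.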
Another problem is that this solution only works for file descriptors created with open(). You cannot pass that flag when creating sockets, pipes, etc. This is a known problem and some systems are working around that by offering the non-standard accept4(), pipe2(), dup3(), and the SOCK_CLOEXEC flag for sockets, however these are not yet POSIX standard and it's unknown if they will become standard (this is planned but until a new standard has been released we cannot know for sure, also it will take years until all systems have adopted them).
What you can do is to later on set the flag FD_CLOEXEC using fcntl() on the file descriptor, however, note that this isn't safe in a multi-thread environment. Just consider the following code:
int so = socket(...);
fcntl(so, F_SETFD, FD_CLOEXEC);
If another thread calls fork() in between the first and the second line, which is of course possible, the flag has not been set yet and thus this file descriptor won't get closed.
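For completeness, here is a sketch of the two-step variant with error checking. It reads the current descriptor flags first so that nothing else is clobbered. The name set_cloexec is invented for this example, and the helper has exactly the race described above: it is only safe when no other thread can fork() between creating the descriptor and calling it.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: add FD_CLOEXEC to an existing descriptor.
 * Returns 0 on success, -1 on error. NOT atomic with descriptor creation. */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);     /* current descriptor flags */
    if (flags == -1)
        return -1;
    if (fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1)
        return -1;
    return 0;
}
```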
So the only way that is really safe is to explicitly close them and this is not as easy as it may seem!
I've seen a lot of code that does stupid things like this:
for (int i = STDERR_FILENO + 1; i < 256; i++) close(i);
But just because some POSIX systems have a default limit of 256 doesn't mean that this limit cannot be raised. Also, on some systems the default limit is higher to begin with.
Using FD_SETSIZE instead of 256 is equally wrong as just because the select() API has a hard limit by default on most systems doesn't mean that a process cannot have more open file descriptors than this limit (after all you don't have to use select() with them, you can use poll() API as a replacement and poll() has no upper limit on file descriptor numbers).
Using OPEN_MAX instead of 256 is always correct, as that is really the absolute maximum number of file descriptors a process can have. The downside is that OPEN_MAX can theoretically be huge and doesn't reflect the real current runtime limit of a process.
To avoid having to close too many non-existing file descriptors, you can use this code instead:
int fdlimit = (int)sysconf(_SC_OPEN_MAX);
for (int i = STDERR_FILENO + 1; i < fdlimit; i++) close(i);
sysconf(_SC_OPEN_MAX) is documented to update correctly if the open file limit (RLIMIT_NOFILE) has been raised using setrlimit(). The resource limits (rlimits) are the effective limits for a running process and for files they will always have to be between _POSIX_OPEN_MAX (documented as the minimum number of file descriptors a process is always allowed to open, must be at least 20) and OPEN_MAX (must be at least _POSIX_OPEN_MAX and sets the upper limit).
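Wrapped up as functions, the loop could look roughly like this. The names close_fds_in_range and close_from are made up for this sketch. sysconf() may return -1 when the limit is indeterminate, and what to do then is left to the caller here, which is an assumption of this example rather than anything a standard prescribes.

```c
#include <unistd.h>

/* Close every descriptor in [lowfd, limit). Failures (typically EBADF for
 * slots that are not open) are deliberately ignored. */
void close_fds_in_range(int lowfd, long limit)
{
    for (long fd = lowfd; fd < limit; fd++)
        close((int)fd);
}

/* Hypothetical wrapper: close everything from lowfd up to the current
 * runtime limit. Returns 0 on success, -1 if the limit is indeterminate. */
int close_from(int lowfd)
{
    long limit = sysconf(_SC_OPEN_MAX);
    if (limit < 0)
        return -1;                      /* caller has to pick a fallback */
    close_fds_in_range(lowfd, limit);
    return 0;
}
```

A child would typically call close_from(STDERR_FILENO + 1) between fork() and exec...().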
While closing all possible descriptors in a loop is technically correct and will work as desired, it may try to close several thousand file descriptors, most of which will often not exist. Even if the close() call for a non-existing file descriptor is fast (which is not guaranteed by any standard), it may take a while on weaker systems (think of embedded devices, think of small single-board computers), which may be a problem.
So several systems have developed more efficient ways to solve this issue. Famous examples are closefrom() and fdwalk() which BSD and Solaris systems support. Unfortunately The Open Group voted against adding closefrom() to the standard (quote): "it is not possible to standardize an interface that closes arbitrary file descriptors above a certain value while still guaranteeing a conforming environment." (Source) This is of course nonsense, as they make the rules themselves and if they define that certain file descriptors can always be silently omitted from closing if the environment or system requires or the code itself requests that, then this would break no existing implementation of that function and still offer the desired functionality for the rest of us. Without these functions people will use a loop and do exactly what The Open Group tries to avoid here, so not adding it only makes the situation even worse.
On some platforms you are basically out of luck, e.g. macOS, which is fully POSIX conformant. If you don't want to close all file descriptors in a loop on macOS, your only option is to not use fork()/exec...() but instead posix_spawn(). posix_spawn() is a newer API for platforms that don't support process forking; it can be implemented purely in user space on top of fork()/exec...() on those platforms that do support forking, and can otherwise use some other API a platform offers for starting child processes. On macOS there exists a non-standard flag POSIX_SPAWN_CLOEXEC_DEFAULT, which will treat all file descriptors as if the CLOEXEC flag had been set on them, except for those for which you explicitly specified file actions.
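A rough sketch of the posix_spawn() route. The function name spawn_and_wait is invented for this example. The macOS-specific part is guarded by #ifdef so the same code still builds elsewhere, where it simply spawns without the extra flag. posix_spawn_file_actions_addinherit_np() is the macOS call that marks a descriptor as one the child should keep.

```c
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: run argv[0] (looked up in PATH), wait for it, and
 * return its exit status, or -1 on any failure. */
int spawn_and_wait(char *const argv[])
{
    posix_spawnattr_t attr;
    posix_spawn_file_actions_t actions;
    pid_t pid;
    int status;

    if (posix_spawnattr_init(&attr) != 0)
        return -1;
    if (posix_spawn_file_actions_init(&actions) != 0) {
        posix_spawnattr_destroy(&attr);
        return -1;
    }

#ifdef POSIX_SPAWN_CLOEXEC_DEFAULT
    /* macOS only: close everything in the child except what we list here */
    posix_spawnattr_setflags(&attr, POSIX_SPAWN_CLOEXEC_DEFAULT);
    posix_spawn_file_actions_addinherit_np(&actions, STDIN_FILENO);
    posix_spawn_file_actions_addinherit_np(&actions, STDOUT_FILENO);
    posix_spawn_file_actions_addinherit_np(&actions, STDERR_FILENO);
#endif

    int rc = posix_spawnp(&pid, argv[0], &actions, &attr, argv, environ);
    posix_spawn_file_actions_destroy(&actions);
    posix_spawnattr_destroy(&attr);
    if (rc != 0)
        return -1;

    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

On platforms without the flag, this sketch leaves descriptor inheritance exactly as it would be with fork()/exec...(), so one of the other techniques is still needed there.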
On Linux you can get a list of file descriptors by looking at the path /proc/{PID}/fd/ with {PID} being the process ID of your process (getpid()), that is, if the proc file system has been mounted at all and it has been mounted to /proc (but a lot of Linux tools rely on that, not doing so would break many other things as well). Basically you can limit yourself to close all descriptors listed under this path.
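A sketch of that approach, using /proc/self/fd, which on Linux refers to the calling process's own entry so getpid() isn't needed. The function name close_listed_fds is made up for this example. The directory stream has a descriptor of its own, which has to be skipped while iterating. If /proc isn't mounted the function just reports failure and the caller would fall back to a loop.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper (Linux): close every open descriptor >= lowfd that is
 * listed in /proc/self/fd. Returns the number of descriptors closed, or -1
 * if the directory could not be opened. */
int close_listed_fds(int lowfd)
{
    DIR *dir = opendir("/proc/self/fd");
    if (dir == NULL)
        return -1;

    int self = dirfd(dir);              /* the descriptor used for listing */
    int closed = 0;
    struct dirent *ent;

    while ((ent = readdir(dir)) != NULL) {
        char *end;
        long fd = strtol(ent->d_name, &end, 10);
        if (end == ent->d_name || *end != '\0')
            continue;                   /* skips "." and ".." */
        if (fd < lowfd || fd == self)
            continue;
        if (close((int)fd) == 0)
            closed++;
    }

    closedir(dir);                      /* also releases 'self' */
    return closed;
}
```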
True story: Once upon a time I wrote a simple little C program that opened a file, and I noticed that the file descriptor returned by open was 4. "That's funny," I thought. "Standard input, output, and error are always file descriptors 0, 1, and 2, so the first file descriptor you open is usually 3."
So I wrote another little C program that started reading from file descriptor 3 (without opening it, that is, but rather, assuming that 3 was a pre-opened fd, just like 0, 1, and 2). It quickly became apparent that, on the Unix system I was using, file descriptor 3 was pre-opened on the system password file. This was evidently a bug in the login program, which was exec'ing my login shell with fd 3 still open on the password file, and the stray fd was in turn being inherited by programs I ran from my shell.
Naturally the next thing I tried was a simple little C program to write to the pre-opened file descriptor 3, to see if I could modify the password file and give myself root access. This, however, didn't work; the stray fd 3 was opened on the password file in read-only mode.
But at any rate, this helps to explain why you shouldn't leave file descriptors open when you exec a child process.
[Footnote: I said "true story", and it mostly is, but for the sake of the narrative I did change one detail. In fact, the buggy version of /bin/login was leaving fd 3 opened on the groups file, /etc/group, not the password file.]
When using the open() function in C, I get a fd (file descriptor). I was wondering if it's the same thing as its process id, because as I know, fd is an integer.
No it is not.
A PID is a process identifier, and a file descriptor is a file handle identifier.
Specifically from Wikipedia about File Descriptors:
(...) file descriptor (FD) is an abstract indicator for accessing a file. The term is generally used in POSIX operating systems.
In POSIX, a file descriptor is an integer, specifically of the C type int. (...)
And for PID:
[PID] is a number used by most operating system kernels, — such as that of UNIX, Mac OS X or Microsoft Windows — to temporarily uniquely identify a process (...)
No, file descriptors are indices into the file table of your own process. They are always small integers (that is, up to the max-open-files limit for the process) because, among other things, the bitmap interface to select() wouldn't work if they were arbitrary numbers. On the other hand, PIDs typically grow to at least 32767 before they wrap around.
An open file in general doesn't have a process ID of its own. And even in the case where one might arguably expect it to be connected to a particular process -- namely when the file handle comes from popen() -- there's no such direct connection and what goes on inside popen() is more complex than "treat this process as if it was a file".
No...
A file descriptor is an opaque handle that is used in the interface between user space and the kernel to identify file/socket resources. Therefore when you use open() or socket() (system calls to interface to the kernel) you are returned a file descriptor, which is an integer (it is actually an index into the process's u structure - but that is not important). Therefore if you want to interface directly with the kernel, using system calls to read(), write(), close() etc. the handle you use is a file descriptor.
A PID (i.e., process identification number) is an identification number that is automatically assigned to each process when it is created on a Unix-like operating system. A process is an executing (i.e., running) instance of a program. Each process is guaranteed a unique PID, which is always a non-negative integer.
One of the first things a UNIX programmer learns is that every running program starts with three files already opened:
Descriptive Name    fd number    Description
Standard In         0            Input from the keyboard
Standard Out        1            Output to the console
Standard Error      2            Error output to the console
If you open any new file descriptor, you will usually get the value 3, because 3 is the lowest non-negative integer still available: STDIN, STDOUT, and STDERR already occupy 0, 1, and 2 respectively. That is why the fd returned is described as the smallest available non-negative integer.
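You can watch this happen with a small probe. The function name next_fd is made up for this sketch. It opens /dev/null, notes the number it was given, and closes it again, so the slot is free for the next caller. In a process started from a shell with only 0, 1, and 2 open it will typically report 3, and it has nothing to do with getpid().

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical probe: returns the descriptor number open() hands out right
 * now (the lowest free one on POSIX systems), or -1 if open() fails. */
int next_fd(void)
{
    int fd = open("/dev/null", O_RDONLY);
    if (fd != -1)
        close(fd);                      /* free the slot again */
    return fd;
}
```

Calling it twice in a row in a single-threaded program gives the same number both times, which also shows that descriptor numbers are reused as soon as they are closed.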
I would like to create blocking and non-blocking file in Unix's C. First, blocking:
fd = open("file.txt", O_CREAT | O_WRONLY | O_EXCL);
Is that right? Shouldn't I add some mode options, like 0666 for example?
How about non-blocking file? I have no idea for this.
I would like to achieve something like:
when I open it to write in it, and it's opened for writing, it's ok; if not it blocks.
when I open it to read from it, and it's opened for reading, it's ok; if not it blocks.
File descriptors are blocking or non-blocking; files are not. Add O_NONBLOCK to the options in the open() call if you want a non-blocking file descriptor.
Note that opening a FIFO for reading or writing will block unless there's a process with the FIFO open for the other operation, or you specify O_NONBLOCK. If you open it for read and write, the open() is non-blocking (will return promptly); I/O operations are still controlled by whether you set O_NONBLOCK or not.
The updated question is not clear. However, if you're looking for 'exclusive access to the file' (so that no-one else has it open), then neither O_EXCL nor O_NONBLOCK is the answer. O_EXCL affects what happens when you create the file; the create will fail if the file already exists. O_NONBLOCK affects whether a read() operation will block when there's no data available to read. If you read the POSIX open() description, there is nothing there that allows you to request 'exclusive access' to a file.
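To show what non-blocking means at the descriptor level, here is a sketch that turns the flag on for an already open descriptor, using the flag as POSIX spells it in <fcntl.h>, O_NONBLOCK. Both helper names are invented for this example. On a pipe or FIFO with no data, a read() on a non-blocking descriptor fails with EAGAIN instead of waiting.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: add O_NONBLOCK to the file status flags of fd.
 * Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Tries to read one byte (and consumes it if there is one). Returns 1 if
 * the read would have blocked, 0 if it returned data or end-of-file,
 * -1 on any other error. */
int read_would_block(int fd)
{
    char c;
    ssize_t n = read(fd, &c, 1);
    if (n >= 0)
        return 0;
    return (errno == EAGAIN || errno == EWOULDBLOCK) ? 1 : -1;
}
```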
To answer the question about file mode: if you include O_CREAT, you need the third argument to open(). If you omit O_CREAT, you don't need the third argument to open(). It is a varargs function:
int open(const char *filename, int options, ...);
I don't know what you are calling a blocking file (blocking IO in Unix means that the IO operations wait for the data to be available or for a sure failure, they are opposed to non-blocking IO which returns immediately if there is no available data).
You always need to specify a mode when opening with O_CREAT.
The open you show will fail if the file already exists (once fixed for the above point).
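Putting the two points together, a corrected version of the call might look like the sketch below. The wrapper name create_exclusive and its return convention are made up for this example. The 0666 mode is filtered through the process umask, so the resulting permissions are usually tighter.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical wrapper: create path for writing, failing if it exists.
 * Returns the new descriptor, -2 if the file already exists,
 * or -1 on any other error. */
int create_exclusive(const char *path)
{
    /* O_CREAT makes the third (mode) argument mandatory */
    int fd = open(path, O_CREAT | O_WRONLY | O_EXCL, 0666);
    if (fd == -1)
        return errno == EEXIST ? -2 : -1;
    return fd;
}
```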
Unix has no standard way to lock a file for exclusive access except that. There are advisory locks (but all programs must respect the protocol). Some systems have a mandatory locking extension. The received wisdom is not to rely on either kind of locking when accessing a network file system.
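As an illustration of the advisory kind, this is a minimal sketch using fcntl() record locks. The function name try_write_lock is invented here. It only keeps out other processes that also ask for the lock; a program that never calls it can still open and write the file.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: try to take an advisory write lock on the whole
 * file without waiting. fd must be open for writing. Returns 0 if the
 * lock was acquired, -1 if another process holds a conflicting lock or
 * on error. */
int try_write_lock(int fd)
{
    struct flock fl = {0};
    fl.l_type = F_WRLCK;                /* exclusive (write) lock */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;                       /* 0 means "to end of file" */

    return fcntl(fd, F_SETLK, &fl) == -1 ? -1 : 0;
}
```

The lock is released when the process closes any descriptor for that file or exits, which is one of the reasons these locks need care in larger programs.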
Shouldn't I add some mode options?
You should, if the file is write-only and to be created if nonexistent. In this case, open() expects a third argument as well, so omitting it results in undefined behavior.
Edit:
The updated question is even more confusing...
when I open it to write in it, and it's opened for writing, it's ok; if not it blocks.
Why would you need that? See, if you try to write to a file/file descriptor not opened for writing, write() will return -1 and you can check the error code stored in errno. Tell us what you're trying to achieve by this bizarre thing you want instead of overcomplicating and messing up your code.
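A small sketch of that behavior, for reference. The function name write_errno_on_readonly is made up for this example. It opens a file read-only, attempts a write, and reports the errno it got, which for a descriptor not open for writing is EBADF.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical demo: open path read-only and try to write one byte to it.
 * Returns the errno from write() (0 if the write unexpectedly succeeded),
 * or -1 if the file could not be opened at all. */
int write_errno_on_readonly(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    ssize_t n = write(fd, "x", 1);
    int err = (n == -1) ? errno : 0;    /* save errno before close() */

    close(fd);
    return err;
}
```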
(Remarks in parentheses:
I would like to create blocking and non-blocking file
What's that?
in unix's C
Again, there's no such thing. There is the C language, which is platform-independent.)
Is there any way in Linux (or more generally in a POSIX OS) to guarantee that during the execution of a program, no file descriptors will be reused, even if a file is closed and another opened? My understanding is that this situation would usually lead to the file descriptor for the closed file being reassigned to the newly opened file.
I'm working on an I/O tracing project and it would make life simpler if I could assume that after an open()/fopen() call, all subsequent I/O to that file descriptor is to the same file.
I'll take either a compile-time or run-time solution.
If it is not possible, I could do my own accounting when I process the trace file (noting the location of all open and close calls), but I'd prefer to squash the problem during execution of the traced program.
Note that POSIX requires:
The open() function shall return a file descriptor for the named file
that is the lowest file descriptor not currently open for that
process.
So in the strictest sense, your request will change the program's environment to be no longer POSIX compliant.
That said, I think your best bet is to use the LD_PRELOAD trick to intercept calls to close and ignore them.
You'd have to write a SO that contains a close(2) that opens /dev/null on old FDs, and then use $LD_PRELOAD to load it into process space before starting the application.
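Here is a sketch of just the core operation such a replacement close() could perform. The name park_fd is made up, and everything around it is left out: exporting it under the name close from a shared object, the LD_PRELOAD setup, and thread safety. Instead of releasing the descriptor number, it points the number at /dev/null with dup2(), so later open() calls can't be handed that number again.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical core of a close() replacement: keep the descriptor number
 * occupied by redirecting it to /dev/null. Returns 0 on success, -1 on
 * error. Not thread-safe (unsynchronized static). */
int park_fd(int fd)
{
    static int nullfd = -1;             /* one shared /dev/null descriptor */

    if (nullfd == -1) {
        nullfd = open("/dev/null", O_RDWR);
        if (nullfd == -1)
            return -1;
    }
    if (fd == nullfd)
        return 0;                       /* never park the parking spot */
    if (fcntl(fd, F_GETFD) == -1)
        return -1;                      /* not open: fail like close() would */

    /* dup2() atomically drops the old file and reuses the same number */
    return dup2(nullfd, fd) == -1 ? -1 : 0;
}
```

Note that the underlying file really is released by dup2(), so the traced program's resources are freed; only the number stays reserved, which is also why a long-running program can run out of descriptors this way.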
You must already be ptracing the application to intercept its file opening and closing operations.
It would appear trivial to prevent FD re-use by "injecting" dup2(X, Y); close(X); calls into the application, and adjusting Y to be anything you want.
However, the application itself could be using dup2 to force a re-use of previously closed FD, and may not work if you prevent that, so I think you'll just have to deal with this in post-processing step.
Also, it's quite easy to write an app that will run out of FDs if you disallow re-use.
What is the behavior of the select(2) function when a file descriptor it is watching for reading is closed by another thread?
From some cursory testing, it does return right away. I suspect the outcome is either that (a) it still continues to wait for data, but if you actually tried to read from it you'd get EBADF (possibly -- there's a potential race) or (b) that it pretends as though the file descriptor were never passed in. If the latter case is true, passing in a single fd with no timeout would cause a deadlock if it were closed.
From some additional investigation, it appears that both dwc and bothie are right.
bothie's answer to the question boils down to: it's undefined behavior. That doesn't mean that it's unpredictable necessarily, but that different OSes do it differently. It would appear that systems like Solaris and HP-UX return from select(2) in this case, but Linux does not based on this post to the linux-kernel mailing list from 2001.
The argument on the linux-kernel mailing list is essentially that it is undefined (and broken) behavior to rely upon. In Linux's case, calling close(2) on the file descriptor effectively decrements a reference count on it. Since there is a select(2) call also with a reference to it, the fd will remain open and waiting for input until the select(2) returns. This is basically dwc's answer. You will get an event on the file descriptor and then it'll be closed. Trying to read from it will result in an EBADF, assuming the fd hasn't been recycled. (A concern that MarkR raised in his answer, although I think it's probably avoidable in most cases with proper synchronization.)
So thank you all for the help.
I would expect that it would behave as if the end-of-file had been reached, that's to say, it would return with the file descriptor shown as ready but any attempt to read it subsequently would return "bad file descriptor".
Having said that, doing that is very bad practice anyway, as you'd always have potential race conditions: another file descriptor with the same number could be opened by yet another thread immediately after the second thread closed it, and then the selecting thread would end up waiting on the wrong one.
As soon as you close a file, its number becomes available for reuse, and may get reused by the next call to open(), socket() etc, even if by another thread. Therefore you really, really need to avoid this kind of thing.
The select system call is a way to wait for file descriptors to change state while the program doesn't have anything else to do. The main use is for server applications, which open a bunch of file descriptors and then wait for anything to do on them (accept new connections, read requests or send the responses). Those file descriptors will be opened in non-blocking I/O mode so that the server process won't hang in a syscall at any time.
This additionally means there is no need for separate threads, because all the work that could be done in the thread can be done prior to the select call as well. And if the work takes long, then it can be interrupted, select called with timeout={0,0}, the file descriptors handled, and the work resumed afterwards.
Now, you close a file descriptor in another thread. Why do you have that extra thread at all, and why shall it close the file descriptor?
The POSIX standard doesn't provide any hints about what happens in this case, so what you're doing is UNDEFINED BEHAVIOR. Expect that the result will be very different between different operating systems and even between versions of the same OS.
Regards, Bodo
It's a little confusing what you're asking...
Select() should return upon an "interesting" change. If the close() merely decremented the reference count and the file was still open for writing somewhere then there's no reason for select() to wake up.
If the other thread did close() on the only open descriptor then it gets more interesting, but I'd need to see a simple version of the code to see if something's really wrong.