File descriptor and process relation - c

Are file descriptors specific to a process or global to the operating system? What I basically want to know is this: if I open a file in a C program and it is assigned a file descriptor value of, let's say, 103, then when I use file descriptor 103 in some other C program, would I be referring to the same file or to some other file?

Each process has its own file descriptor table. Descriptors are process-specific: if you change an fd, the change is valid only in that process and does not affect other processes in the system. Once the process terminates, its fds are discarded.
What if I fork a new process from the process in which I opened that file?
The current file descriptor table, i.e. the table as it stands before the fork system call, is inherited by the child process.

File descriptors are process-specific and are created through open(). The same file can be opened more than once, by other processes, with open(); each process then has its own file descriptor for the same file. File descriptors, along with other resources, pass through fork to the child process. That means the child process does not need to reopen a file that the parent already has open.
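To make the "process-specific" point concrete, here is a minimal sketch rather than anything authoritative. The function name and the two path parameters are made up for illustration. After a fork, parent and child each open a different file, and each is handed the same descriptor number, because each picks the lowest free slot in its own table.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if parent and child, each opening a
 * different file after the fork, got the same descriptor number,
 * 0 if the numbers differ, -1 on error. */
int same_number_different_files(const char *path_a, const char *path_b)
{
    assert(path_a != NULL && path_b != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: open path_b and report the number through the exit status
         * (255 is used here as an error marker). */
        int fd = open(path_b, O_RDONLY);
        _exit(fd >= 0 && fd < 255 ? fd : 255);
    }

    /* Parent: open path_a in its own descriptor table. */
    int fd = open(path_a, O_RDONLY);
    int status;
    if (waitpid(pid, &status, 0) < 0 || fd < 0 || !WIFEXITED(status)) {
        if (fd >= 0)
            close(fd);
        return -1;
    }
    int child_fd = WEXITSTATUS(status);
    close(fd);
    if (child_fd == 255)
        return -1;

    /* Same number, yet each refers to a different file. */
    return child_fd == fd;
}
```

Calling it with any two readable files, for example same_number_different_files("/dev/null", "/dev/zero"), should return 1 on a typical Linux system.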

Related

Processes and a shared file descriptor

I have an application that creates multiple instances (processes) of itself and these processes have a shared data structure. In that struct there is a file descriptor used for logging data to file. There is a check in the logging function that checks to see if the file descriptor is -1 and if it is then it opens the file and sets the value of the shared file descriptor.
Other processes / threads do the same check, but by then the fd is != -1, so the file does not get opened. They then go on to write to the file. The write fails most of the time and returns -1. When the write did not fail, I checked the file path of the fd using readlink. The path was some file other than the log file.
I am assuming that this is because even though the file descriptor value was always 11, even in subsequent runs, that value refers to a different file for each process. So it is the eleventh file that process has open? So the log file is not even regarded as open for these processes and even if they do open the file the fd would be different.
So my first question is: is this correct? My second question is how I should re-implement this method, given that multiple processes need to write to this log file. Would each process need to open that file, or is there another way that is more efficient? Do I need to close the file so that other processes can open and write to it?
EDIT:
The software is an open source software called filebench.
The file can be seen here.
The log method is filebench_log. Line 204 is the first check I mentioned, where the file is opened. The write happens at line 293. The fd value is the same in all processes: 11. It is actually shared among all processes and set up mostly here. The file is only opened once (verified via print statements).
The shared data struct that has the fd is called
filebench_shm
and the fd is
filebench_shm->shm_log_fd
EDIT 2:
The error message that I get is Bad file descriptor. Errno is 9.
EDIT 3:
So it seems that each process has a different index table for the fds. Wiki:
On Linux, the set of file descriptors open in a process can be accessed under the path /proc/PID/fd/, where PID is the process identifier.
So the issue that I am having is that for two processes with process IDs 101, 102 the file descriptor 11 is not the same for the two processes:
/proc/101/fd/11
/proc/102/fd/11
I have a shared data structure between these processes.. is there another way I can share an open file between them other than an fd since that doesn't work?
It seems that it would be simplest to open the file before spawning the new processes. This avoids all the coordination complexity regarding opening the file by centralizing it to one time and place.
I originally wrote this as a solution:
Create a shared memory segment.
Put the file descriptor variable in the segment.
Put a mutex semaphore in the segment.
Each process accesses the file descriptor in the segment. If it is not open, lock the semaphore, check if it is open, and if not open the file. Release the mutex.
That way all processes share the same file descriptor.
But this assumes that the underlying file descriptor object is also in the shared memory, which I think it is not.
Instead, use the open then fork method mentioned in the other answer, or have each process open the file and use flock to serialize access when needed.
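A rough sketch of the open-then-fork approach might look like the following. The function name, the log path parameter and the message text are hypothetical and are not taken from filebench. O_APPEND is an addition of this sketch, not something the answers above mention: it makes every write land at the current end of the file. If records must never interleave, flock could still be wrapped around each write.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: opens the log once, forks nworkers children that
 * each append one line through the inherited descriptor, and returns the
 * number of lines found in the file afterwards (-1 on error). */
int log_from_children(const char *log_path, int nworkers)
{
    assert(log_path != NULL && nworkers >= 0);

    /* Open once, before any fork. O_APPEND is an assumption of this sketch. */
    int fd = open(log_path, O_RDWR | O_CREAT | O_TRUNC | O_APPEND, 0644);
    if (fd < 0)
        return -1;

    int started = 0, failed = 0;
    for (int i = 0; i < nworkers; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            failed = 1;
            break;
        }
        if (pid == 0) {
            /* Child: the inherited fd is valid here without reopening. */
            const char msg[] = "worker line\n";
            ssize_t n = write(fd, msg, sizeof msg - 1);
            _exit(n == (ssize_t)(sizeof msg - 1) ? 0 : 1);
        }
        started++;
    }

    /* Reap every child that was actually started. */
    for (int i = 0; i < started; i++) {
        int status;
        if (wait(&status) < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            failed = 1;
    }

    /* Read the file back from the start and count the lines. */
    int lines = 0;
    char buf[256];
    ssize_t n;
    if (lseek(fd, 0, SEEK_SET) < 0)
        failed = 1;
    while (!failed && (n = read(fd, buf, sizeof buf)) > 0)
        for (ssize_t j = 0; j < n; j++)
            if (buf[j] == '\n')
                lines++;
    close(fd);
    return failed ? -1 : lines;
}
```

The design point is that only the process that calls open() ever needs the path; the children never touch a shared fd variable, they simply use the descriptor they inherited.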

File descriptors before fork()

I know that if I call the open function before the fork(), the IO pointer is shared between the processes.
If one of these processes closes the file by calling close(fd), will the other processes still be able to read from and write to the file, or will the file be closed for everyone?
Yes, the other processes can still use the file. Each process has its own copy of the file descriptor (among other things), so one process closing its copy won't affect the copy of the fd in another process.
From fork() manual:
The child inherits copies of the parent's set of open file
descriptors. Each file descriptor in the child refers to the same
open file description (see open(2)) as the corresponding file
descriptor in the parent. This means that the two descriptors
share open file status flags, current file offset, and signal-
driven I/O attributes (see the description of F_SETOWN and
F_SETSIG in fcntl(2)).
From close() manual:
If fd is the last file descriptor referring to the underlying open
file description (see open(2)), the resources associated with the
open file description are freed; if the descriptor was the last
reference to a file which has been removed using unlink(2), the file
is deleted.
So if you do close(fd); it closes only the reference in that process, and another process holding a reference to the same open file description can continue to operate on it.
Whenever a child process is created, it gets a copy of the file descriptor table from the parent process. There is a reference count for each open file description: the number of file descriptors, across all processes, that currently refer to it. So if a file is open in the parent process and a child process is created, the reference count increments, since the file is now open in the child as well; when the descriptor is closed in either process, the count decrements. The file is finally closed when the reference count reaches zero.
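The reference-count behaviour can be checked with a small sketch like this one. The function name and the path parameter are invented for the example. The child closes its copy of the descriptor and exits; the parent then writes through its own copy, which should still work.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the parent can still write after the
 * child has closed its inherited copy of the descriptor, 0 if the write
 * fails, -1 on setup errors. */
int parent_survives_child_close(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: drop only this process's reference. */
        _exit(close(fd) == 0 ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) {
        close(fd);
        return -1;
    }

    /* The child has closed its copy and exited; the parent's copy still
     * refers to the open file description. */
    ssize_t n = write(fd, "ok\n", 3);
    close(fd);
    return n == 3;
}
```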

Do not want a parent and its child process to share the same file descriptor table

I open a file in program A. Its file descriptor is 3. Using fork followed by an execve, I execute another program B, where I immediately open another file. This file's descriptor is 4. If A and B were not sharing the file descriptor table, then the file descriptor of the file opened in B should have been 3. I need to create child processes that do not share the parent's address space, including open files.
Thanks a lot
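The child doesn't share the same FD table; it gets its own copy of it. Descriptor 3 is still open in that copy after the exec, which is why the next open in B returns 4. You simply forgot to close the inherited descriptors in the child or mark them close-on-exec.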
Close the file before execing the new process. Do that in the code between the fork() and exec().
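Both options can be sketched as follows; treat this as an illustration under assumptions, not a drop-in fix. The helper names are made up. The first function marks a descriptor close-on-exec with fcntl. The second shows that once the child closes the inherited descriptor, its next open() gets the same number back; to keep the example self-contained the child exits at the point where a real program would call execve().

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: marks fd close-on-exec so it is closed
 * automatically when the process calls one of the exec functions.
 * Returns 0 on success, -1 on error. */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0 ? -1 : 0;
}

/* Hypothetical helper: returns 1 if the child, after closing the
 * inherited descriptor, is given the same number by its next open(),
 * 0 if not, -1 on error. */
int child_reuses_slot(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* This is the code "between the fork() and exec()". */
        close(fd);
        int next = open(path, O_RDONLY);
        /* A real program would call execve() here instead of exiting. */
        _exit(next == fd ? 0 : 1);
    }

    int status;
    int rc = waitpid(pid, &status, 0);
    close(fd);
    if (rc < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0;
}
```

Opening the file with the O_CLOEXEC flag in the first place has the same effect as calling set_cloexec right after open(), without the short window between the two calls.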

Are file descriptors shared when fork()ing?

Let's say I open a file with open(). Then I fork() my program.
Will parent and child now share the same offset for the file descriptor?
I mean, if I do a write in the parent, will the offset change in the child too?
Or will the offsets be independent after the fork()?
From fork(2):
* The child inherits copies of the parent’s set of open file descrip-
tors. Each file descriptor in the child refers to the same open
file description (see open(2)) as the corresponding file descriptor
in the parent. This means that the two descriptors share open file
status flags, current file offset, and signal-driven I/O attributes
(see the description of F_SETOWN and F_SETSIG in fcntl(2)).
They do share the same offset.
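A quick way to convince yourself is a sketch along these lines. The function name and path parameter are hypothetical. The child writes five bytes through the inherited descriptor, and the parent, which has written nothing, then asks for its current offset.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the parent's current file offset after the
 * child has written 5 bytes through the inherited descriptor, or -1 on
 * error. */
off_t offset_after_child_write(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: advance the offset by writing 5 bytes. */
        _exit(write(fd, "hello", 5) == 5 ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status) ||
        WEXITSTATUS(status) != 0) {
        close(fd);
        return -1;
    }

    /* The parent never wrote, but it queries the shared offset. */
    off_t pos = lseek(fd, 0, SEEK_CUR);
    close(fd);
    return pos;
}
```

The returned offset should be 5, because both descriptors refer to one open file description and the offset lives there, not in the per-process table.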

Behavior of a pipe after a fork()

When reading about pipes in Advanced Programming in the UNIX Environment, I noticed that after a fork the parent can close() the read end of a pipe and it doesn't close the read end for the child. When a process forks, do its file descriptors get retained?
What I mean by this is that before the fork the pipe read file descriptor had a retain count of 1, and after the fork 2. When the parent closed its read side the fd went to 1 and is kept open for the child. Is this essentially what is happening? Does this behavior also occur for regular file descriptors?
As one can read on the man page about fork():
The child process shall have its own copy of the parent's file
descriptors. Each of the child's file descriptors shall refer to the
same open file description with the corresponding file descriptor of
the parent.
So yes, the child has an exact copy of the parent's file descriptors, and that applies to all of them, including open files.
The answer is yes, and yes (the same applies to all file descriptors, including things like sockets).
In a fork() call, the child gets its own separate copy of each file descriptor, and each copy acts as if it had been created by dup(). A close() only closes the specific file descriptor that was passed. For example, if you do n2 = dup(n); close(n);, the file (pipe, socket, device...) that n was referring to remains open. The same applies to file descriptors duplicated by a fork().
Yes, a fork duplicates all open file descriptors.
So for a typical pipe, a 2 slot array (int fd[2]), fd[0] is the same for the parent and child, and so is fd[1].
You can create a pipe without forking at all, and read/write to yourself by using fd[0] and fd[1] in one process.
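The pipe case from the question can be sketched like this. The function name and the msg parameter are invented for the example, and the message is assumed to be short. The parent closes its read end right after the fork, then writes; the child still reads the data through its own copy of fd[0].

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the child received msg through the
 * pipe even though the parent closed its read end, 0 if the child saw
 * something else, -1 on error. msg is assumed to be short (under 64
 * bytes). */
int child_reads_after_parent_close(const char *msg)
{
    assert(msg != NULL);
    size_t len = strlen(msg);
    assert(len > 0 && len < 64);

    int fd[2];
    if (pipe(fd) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: it only reads, so it closes its copy of the write end. */
        close(fd[1]);
        char buf[64];
        ssize_t n = read(fd[0], buf, sizeof buf);
        _exit(n == (ssize_t)len && memcmp(buf, msg, len) == 0 ? 0 : 1);
    }

    /* Parent: closing fd[0] here does not close the child's copy. */
    close(fd[0]);
    ssize_t n = write(fd[1], msg, len);
    close(fd[1]);

    int status;
    if (waitpid(pid, &status, 0) < 0 || n != (ssize_t)len || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0;
}
```

Each side closing the end it does not use is also what lets the reader see end-of-file once every copy of the write end has been closed.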

Resources