Memory Mapping a file in POSIX - c

When memory mapping files on a POSIX system, do we need to keep the file descriptor open until we're done with the mmapped memory block (and close it after munmap), or can (should?) we close the file descriptor once mmap succeeds? Both seem to work on my Linux system.

From the Open Group standard:
The mmap() function shall add an extra reference to the file associated with the file descriptor fildes which is not removed by a subsequent close() on that file descriptor. This reference shall be removed when there are no more mappings to the file.
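As a rough illustration of the quoted rule, the sketch below maps a file and closes the descriptor straight away; the mapping remains usable until munmap(). The helper name map_file_and_close is hypothetical, and mapping the whole file read-only with MAP_PRIVATE is an assumption made for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map a whole file read-only and close the descriptor
 * immediately. Returns the mapping, or MAP_FAILED on error (including an
 * empty file). On success the mapping length is stored in *len_out. */
void *map_file_and_close(const char *path, size_t *len_out)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return MAP_FAILED;

    struct stat st;
    if (fstat(fd, &st) == -1 || st.st_size == 0) {
        close(fd);
        return MAP_FAILED;
    }

    void *addr = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);

    /* The mapping holds its own reference to the file, so the descriptor
     * is no longer needed whether or not mmap() succeeded. */
    close(fd);

    if (addr != MAP_FAILED)
        *len_out = (size_t)st.st_size;
    return addr;
}
```

The caller reads through the returned pointer as usual and releases it with munmap(addr, len) when done.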

Related

open(2) a file from existing descriptor

Background
I have multiple threads in the same process that are all installing fcntl(2) locks on a given file. These locks must block, thus to achieve intra-process blocking I must use Open file description locks (or OFD locks, see fcntl(2)). And it is documented that:
Open file description locks placed via the same open file
description (i.e., via the same file descriptor, or via a
duplicate of the file descriptor created by fork(2), dup(2),
fcntl() F_DUPFD, and so on) are always compatible: if a new lock
is placed on an already locked region, then the existing lock is
converted to the new lock type. (Such conversions may result in
splitting, shrinking, or coalescing with an existing lock as
discussed above.)
On the other hand, open file description locks may conflict with
each other when they are acquired via different open file
descriptions. Thus, the threads in a multithreaded program can
use open file description locks to synchronize access to a file
region by having each thread perform its own open(2) on the file
and applying locks via the resulting file descriptor.
Thus, when a thread is booting up, it must open its own descriptor via open(2). It should be noted that the "main thread" already has the file open, and threads come and go throughout the process's lifetime.
Question
So I was thinking, is there any way I can re-use an existing file descriptor to open a separate descriptor to the same file without dup(2)?
In other words, if I have file descriptor A but do not know the filename, can I open a descriptor B pointing to the same file as A?
My first instinct is as follows, where fd is the original file descriptor and fd2 is the "deep cloned" descriptor.
/* Linux-specific: reopen the file through its /proc entry. */
char buff[500];
snprintf(buff, sizeof buff, "/proc/%d/fd/%d", (int)getpid(), fd);
fd2 = open(buff, O_RDWR);
However, it feels dirty. I was hoping there is a system call to do this.

Is it possible to check whether a file descriptor refers to a shared memory object?

Is it possible to check whether a file descriptor originated from a call to shm_open()? We already have isatty() that checks whether a file descriptor refers to a terminal. Is there something similar to know whether a file descriptor refers to a shared memory object?
Well, isatty() is not a system call; it is a library wrapper around an ioctl(2) request that only a tty device supports.
However, you can call fstat(2) on the file descriptor to get the inode information. The st_mode field of the resulting structure has bits that tell you whether it is a regular file, directory, socket, FIFO, block device, character device, etc.
In your case, if you know the descriptor was created by a call to shm_open(), you have already answered your question. But try fstat(2) and see what it returns in those bits of your struct stat buffer.
Beware that what isatty(3) does is different from what you want to know. For isatty(3) to return true, the descriptor must refer not just to a character device but to one that implements the ioctl call provided by the tty driver; it returns false, for example, for a magnetic tape descriptor.

What is file-descriptor?

While trying to learn socket programming, I saw the following code:
int sock;
sock = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP);
I browsed through the man page and found that socket returns a file descriptor. I have tried searching the internet and other similar questions here but I couldn't understand what file descriptor really is. That would be great if someone could explain file descriptor in easy language.
There are two related objects: file descriptor and file description. People often confuse these two and think they are the same.
A file descriptor is an integer in your application that refers to a file description in the kernel.
A file description is the structure in the kernel that maintains the state of an open file (its current position, blocking/non-blocking, etc.). In Linux, the file description is struct file.
POSIX open():
The open() function shall establish the connection between a file and a file descriptor. It shall create an open file description that refers to a file and a file descriptor that refers to that open file description. The file descriptor is used by other I/O functions to refer to that file. The path argument points to a pathname naming the file.
The open() function shall return a file descriptor for the named file that is the lowest file descriptor not currently open for that process. The open file description is new, and therefore the file descriptor shall not share it with any other process in the system.
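One way to see the descriptor/description distinction in practice is to look at the file offset, which lives in the open file description. In the sketch below (the helper name shares_offset is hypothetical), two descriptors produced by dup() are expected to share an offset, while two separate open() calls on the same path are not. It assumes both descriptors refer to seekable files.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if moving the offset through fd_a is
 * visible through fd_b (same open file description), 0 if it is not,
 * and -1 on error. Both descriptors must be seekable. */
int shares_offset(int fd_a, int fd_b)
{
    off_t old_a = lseek(fd_a, 0, SEEK_CUR);
    off_t old_b = lseek(fd_b, 0, SEEK_CUR);
    if (old_a == -1 || old_b == -1)
        return -1;

    /* Move fd_a to an offset that differs from fd_b's current one. */
    off_t probe = old_b + 1;
    if (lseek(fd_a, probe, SEEK_SET) == -1)
        return -1;

    int shared = (lseek(fd_b, 0, SEEK_CUR) == probe);

    /* Restore fd_a; if the description is shared this restores fd_b too. */
    lseek(fd_a, old_a, SEEK_SET);
    return shared;
}
```

With a = open(path, O_RDWR), b = dup(a) and c = open(path, O_RDWR), shares_offset(a, b) should report a shared description and shares_offset(a, c) should not.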
In Unix/Linux operating systems, a file descriptor is an abstract indicator (handle) used to access a file or other I/O (input/output) resource, such as a pipe or network socket.
Normally a file descriptor is an index into a per-process file descriptor table maintained by the kernel, which in turn indexes
into a system-wide table of files opened by all processes, called the file table.
This table records the "mode" with which the file or other resource has been opened,
for operations such as:
reading
writing
appending
and possibly other modes.
It also indexes into a third table called the inode table that describes the actual underlying files.
File Descriptors are nothing but mappings to a file. You can also say these are pointers to a file that the process is using.
FDs are just integer values which act as pointers to process resources.
Whenever a process starts, an entry for the running process appears under the /proc/<pid> directory. This is where the data related to the process is exposed. Also, a process normally starts with 3 file descriptors already open (inherited from its parent) for communication with the 3 data streams referred to as stdin, stdout and stderr.
The Linux kernel always gives a new FD the lowest available integer value, and these data streams conventionally occupy the numbers 0, 1 and 2.
Let's say in your code you opened a file to read from or to write to. This means the process needs access to a resource and it has to create a mapping/pointer for this new resource.
To do this, the kernel automatically creates an FD as soon as the file is opened by your code.
If you run ls -l /proc/<pid>/fd/ you will see an additional FD created there, with id 3 if nothing else is open (it can be some other number if the program has used other resources).
I think of file descriptors as (indirect, higher-level) pointers to opaque file objects maintained by the kernel.
Normally, when you deal with objects maintained by a library, you pass to the library pointers to objects that you're not supposed to dereference and manipulate yourself.
For kernel objects, it's not just that you're not supposed to manipulate them yourself -- you literally can't because they live in a different address space that's not at all accessible to you. And because they live in a different address space, pointers wouldn't be a meaningful way of referring to them.
You need a token or handle which the kernel would internally resolve to a pointer that's meaningful in the kernel address space. File descriptors are such tokens in integer form.
For the kernel:
your_process_id + your_file_descriptor => kernels_file_object_pointer
(or an EBADF error if the given file descriptor cannot be resolved to a file object pointer for the given process)
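The EBADF case can be observed from user space. The following sketch (fd_is_open is a hypothetical helper name) asks the kernel to resolve a descriptor with fcntl(fd, F_GETFD) and treats EBADF as "no such descriptor in this process".

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>

/* Hypothetical helper: returns 1 if fd currently resolves to an open file
 * in this process, 0 if the kernel rejects it with EBADF. */
int fd_is_open(int fd)
{
    /* F_GETFD only reads the descriptor flags, so it has no side effects. */
    return fcntl(fd, F_GETFD) != -1 || errno != EBADF;
}
```

The same integer can be valid in one process and invalid in another, which is why the lookup is keyed on the process as well as the descriptor.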

Difference between fclose and close

If I fopen a file, what's the difference between calling fclose or close and which one should I use?
If forked children have access to the file as well, what should they do when they are finished with the file?
fclose() is a function for file streams. When you open a file with fopen() and assign the stream to a FILE *ptr, you use fclose() to close it.
close() is a function for file descriptors. When you open a file with open() and assign the descriptor to an int fd, you use close() to close it.
Functions like fopen() and fclose() are standard C functions, while open(), close() and the like are POSIX-specific. This means that code written with open(), close() etc. is not standard C and is portable only to POSIX systems, whereas code written with fopen(), fclose() etc. is standard and can be ported to any system with a conforming C library.
which one should I use?
It depends on how you opened the file. If you opened it with fopen(), you should use fclose(), and if you opened it with open(), you should use close().
If forked children have access to the file as well, what should they do when they are finished with the file?
This is also dependent on where you made the fork() call: before opening the file or after opening it.
See: Are file descriptors shared when fork()ing?
See: man fclose and man close
open() and close() are UNIX syscalls which return and take file descriptors, for use with other UNIX syscalls such as write(). fopen() and fclose() are standard C library functions which operate on FILE*s, for use with things like fwrite and fprintf. The latter are almost always what you should be using: They're simpler and more cross-platform.
As for your second question, forked children have the same numeric file descriptor as the parent, but it's a copy; they can close it, and it will still be open for the parent and other children. (Though personally, I don't like to have files open when I fork()... I like to make that sort of shared resource usage explicit. Pipes, of course, are an exception.)
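The point about forked children can be checked with a small experiment. In this sketch (the helper name parent_fd_survives_child_close is hypothetical), a child closes its copy of the descriptor and exits, and the parent then verifies that its own descriptor is still valid.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that closes its copy of fd, then check
 * in the parent that fd is still open. Returns 1 if it is, 0 if it is not,
 * and -1 if fork() or waitpid() fails. */
int parent_fd_survives_child_close(int fd)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child: closing here only drops the child's own descriptor. */
        close(fd);
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;

    /* Parent: F_GETFD fails with EBADF if the descriptor were closed. */
    return fcntl(fd, F_GETFD) != -1;
}
```

Keep in mind that although each process has its own descriptor, both refer to the same open file description, so the file offset is shared between parent and child.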
which one should I use?
If you open a file with fopen, close it with fclose. Using close in this case may leak the memory allocated for the stream by fopen, and any buffered output that has not been flushed is lost.

Close a FILE pointer without closing the underlying file descriptor

By using fdopen(), fileno() it's possible to open streams with existing file descriptors. However the proper way to close a file, once you've opened it with a stream is to fclose() the FILE pointer. How can one close the stream, but retain the open file descriptor?
This behaviour is akin to calling fflush() and then fileno(), and then never using the FILE pointer again, except in closing. An additional concern is that if you then fdopen() again, there are now multiple FILE pointers, and you can only close one of them.
If you're on a POSIXy system (which I assume you are, since you have fileno()), you can use dup() to clone the file descriptor:
int newfd = dup(fileno(stream));
fclose(stream);
Or you can hand fdopen() a duplicate file descriptor:
FILE *stream = fdopen(dup(fd), "r");
Either way, the other copy of the fd won't close with the FILE *. However, keep in mind the location pointer is shared, so be careful if you are using both at the same time. Also, any fcntl() locks held on the original fd will be released when you close the copy.
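Putting the first variant together with error checks gives something like the sketch below. The helper name close_stream_keep_fd is hypothetical, and the returned descriptor is a duplicate, so its number generally differs from fileno(stream).

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: close a stdio stream but keep the underlying file
 * open. Returns a new descriptor for the same open file description,
 * or -1 on failure. If dup() fails, the stream is left open. */
int close_stream_keep_fd(FILE *stream)
{
    int newfd = dup(fileno(stream));
    if (newfd == -1)
        return -1;

    /* fclose() flushes buffered output and closes only the original fd. */
    if (fclose(stream) == EOF) {
        close(newfd);
        return -1;
    }
    return newfd;
}
```

Because the duplicate shares the file offset with the stream's descriptor, reads or writes through the returned descriptor continue where the flushed stream left off.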
If everything else fails, dup(2) could help.

Resources