How does fchdir work? - c

In it's main page it says:
fchdir() is identical to chdir(); the only difference is that the
directory is given as an open file descriptor.
And the prototype is given as following:
int chdir(const char *path);
int fchdir(int fd);
My question is how can a directory be passed as a file descriptor? Do directories also have a corresponding descriptor like files?

Do directories also have a corresponding descriptor like files?
Yes. The Unix philosophy(and Linux) is to treat everything as a stream of bytes. So yes, you can do open(2) on a directory and get its file descriptor.
Not only directories but sockets, pipes and devices can also be opened using open(2) system call and do operations on it as though it's a file.

As you can read on the Opengroup documentation for fchdir,
A conforming application can obtain a file descriptor for a file of type directory using open(), provided that the file status flags and access modes do not contain O_WRONLY or O_RDWR.
So by calling open on a directory, one obtains a file descriptor to a directory. So in a sense, yes, all directories have file descriptors.
The underlying OS takes care to map the file descriptors to the proper filesystem object, thus abstracting away whatever this object may be at the lowest level.

Related

What is the difference between inode number and file descriptor?

I understand file descriptors are kernel handle to identify the file , while inode number of a file is pointer to a structure which has other details about file(Correct me if I am wrong). But I am unable to get the difference between them.
An inode is an artifact of a particular file-system and how it manages indirection. A "traditional *ix" file-system uses this to link together files into directories, and even multiple parts of a file together. That is, an inode represents a physical manifestation of the file-system implementation.
On the other hand, a file descriptor is an opaque identifier to an open file by the Kernel. As long as the file remains open that identifier can be used to perform operations such as reading and writing. The usage of "file" here is not to be confused with a general "file on a disk" - rather a file in this context represents a stream and operations which can be performed upon it, regardless of the source.
A file descriptor is not related to an inode, except as such may be used internally by particular [file-system] driver.
The difference is not substantial, both are related to the abstract term called "file". An inode is a filesystem structure that represents files. Whereas, a file descriptor is an integer returned by open syscall. By definition:
Files are represented by inodes. The inode of a file is a structure kept by the filesystem which holds information about a file, like its type, owner, permissions, inode links count and so on.
On other the hand, a file descriptor
File Descriptors:
The value returned by an open call is termed a file descriptor and is essentially an index into an array of open files kept by the kernel.
The kernel doesn't represent open files by their names, instead it uses an array of entries for open files for every process, so a file descriptor in effect is an index into an array of open files. For example, let's assume you're doing the following operation in a process:
read(0, 10)
0 denotes the file descriptor number, and 10 to read 10 bytes. In this case, the process requests 10 bytes from the file/stream in index 0, this is stdin. The kernel automatically grants each process three open streams:
Descriptor No.
0 ---> stdin
1 ---> stdout
2 ---> stderr
These descriptors are given to you for free by the kernel.
Now, when you open a file, in the process via open("/home/myname/file.txt") syscall, you'll have index 3 for the newly opened file, you open another file, you get index 4 and so forth. These are the descriptors of the opened files in the process:
Descriptor No.
0 ---> stdin
1 ---> stdout
2 ---> stderr
3 ---> /home/user100/out.txt
4 ---> /home/user100/file.txt
See OPEN(2) it explains what goes underneath the surface when you call open.
The fundamental difference is that an inode represents a file while a file descriptor (fd) represents a ticket to access the file, with limited permission and time window. You can think an inode as kind of complex ID of the file. Each file object has a unique inode. On the other hand, a file descriptor is an "opened" file by a particular user. The user program is not aware of the file's inode. It uses the fd to access the file. Depending on the user's permissions and the mode the user program choses to open the file (read-only for example) a fd is allowed a certain set of operations on the file. Once the fd is "closed" the user program can't access the file unless it opens another fd. At any given time, there can be multiple fds accessing a file in the same or different user programs.

What are the advantages of using fstat() vs stat()?

If I have an open file with a known file descriptor, what are the advantages of using fstat(), versus stat()? Why isn't there only one function?
int fstat(int fildes, struct stat *buf)
int stat(const char *path, struct stat *buf)
As noted, stat() works on filenames, while fstat() works on file descriptors.
Why have two functions for that?
One factor is likely to be convenience. It's just nice to be able to fstat() a file descriptor that you obtained from other parts of your code, without having to pass the filename too.
The major reason is security, though. If you first stat() the file and then open() it, there is a small window of time in between where the file could have been modified (or had its permissions changed, etc) or replaced with a symlink.
fstat() avoids that problem. You first open() the file, then the file can't be swapped out beneath your feet any more. Then you fstat() and you can be sure that you have the right file.
fstat is to be used with a file descriptor obtained via the open call. Its main feature is to get information on already opened file descriptors instead of reopening.
You can also use fstat with FILE handlers like so (error handling omitted):
FILE *fp = fopen("/path/to/file", "r");
struct stat st;
fstat(fileno(fp), &st);
If you have a file descriptor, you do not necessarily know the path (e.g. when the file was opened by some other part of your application).
If you know the path, you do not need to call open to get a file descriptor just for the purpose of calling fstat.
If you look at man fstat, you will see the following:
fstat() is identical to stat(), except that the file to be stat-ed is
specified by the file descriptor fd.
To expand a little bit, you would use fstat if you happened to have a file descriptor instead of a file path.
With respect to information provided by the function, they are literally identical, as you can see from the above quote.
If you only have a file descriptor to a file (but you may not know its path), then you could use fstat(); if you only have a path to a file, then you could use stat() directly, no need to open it first.

Is the file position per inode?

I am confused with the concept of file position as used in lseek. Is this file position maintained at inode level or a simple variable which could have different values for different process working on the same file?
Per the lseek docs, the file position is associated with the open file pointed to by a file descriptor, i.e. the thing that is handed to your by open. Because of functions like dup and fork, multiple descriptors can point to a single description, but it's the description that holds the location cursor.
Think about it: if it were associated with the inode, then you would not be able to have multiple processes accessing a file in a sensible manner, since all accesses to that file by one process would affect other processes.
Thus, a single process could have track many different file positions as it has file descriptors for a given file.
It's not an 'inode', but FILEHANDLE inside kernel.
Inode is a part of file description of the *nix specific file system on the disk. FAT32, for example, has no inodes, but supported by Linux.
For know the relation between file descriptors and open files, we need to examine three data structures.
the per-process file descriptor table
the system wide table of open file descriptors
the file system i-node table.
For each process kernel maintains a table of open file descriptors. Each entry in this table records information about a single file descriptor including
a set of flags controlling the operation of the file descriptor.
a reference to the open file description
The kernel maintains a system wide table of all open file descriptors. An open file description stores all information related to an open file including:
the current file offset(as updated by read() and write(),or explicitly modified using lseek())
status flags specified when opening the file.
the file access mode (read-only,write only,or read-write,as specified in open())
setting relating to signal-driven I/O and
a reference to i-node object for this file.
Reference-Page 94,The Linux Programming Interface by Michael Kerrisk

How to obtain a file name from the standard FILE structure?

What I want:
void printFname(FILE * f)
{
char buf[255];
MagicFunction(f,buf);
printf("File name: %s",buf);
}
So, all I need is "MagicFunction", but unfortunatelly I haven't found such ...
Is there any way to implement using an OS library? (windows.h , cocoa.h, posix.h etc.)
There is no such function. There may be no filename, or more than one filename that correspond with the FILE *. On Unix, a program can continue to have a reference to a file after it has been renamed or deleted, which could mean that you have a FILE * with no name. Or more hard links may be made to the file, which means a file can have multiple names; which one would you choose? To further confuse things, a file can be temporarily hidden, by mounting a filesystem over a directory containing that file. The file will still be on disk, at its original pathname, but the file will be inaccessible at that path because the mount is obscuring it.
It's also possible that the FILE * never corresponded to a file on the filesystem at all; while they usually do, you can create one from any file descriptor using fdopen(), and that file descriptor may be a pipe, socket, or other file-like object that has never had a path on the disk. In some versions of the C library, you can open a string stream (for instance, fmemopen() in glibc), so the FILE * actually just corresponds to a memory buffer.
If you care about the name, it's best to just keep track of what it was named when you opened the file.
There are some hacky ways to approximate getting the filename; if you're just using this for debugging or informational purposes, then they may be sufficient. Most of these will require operating on the file descriptor rather than the FILE *, as the file descriptor is the lower level way of referring to a file. To get the file descriptor, run fileno() on the FILE *, and remember to check for errors in case there is no file descriptor associated with that FILE *.
On Linux, you can do readlink() on "/proc/self/fd/fileno" where fileno is the file descriptor. That will show you what filename the file had when the file was opened, or a string indicating what other kind of file descriptor it is, like a socket or inotify handle. FreeBSD and NetBSD have Linux emulation layers, which include emulation of Linux-style procfs; you may be able to do this on those if you mount a Linux-compatible procfs, though I don't have them available for testing.
On Mac OS X, you don't have /proc/self/fd. If you don't care about finding the original filename, but some other filename that refers to the file would work (such that you could pass it to another program), you can construct one: /.vol/deviceid/inode. For example, /.vol/234881030/281363. To get those values, run fstat() on the file descriptor, and use st_dev and st_ino on the resulting struct stat.
On Windows, files and the filesystem work quite differently than Unix. Apparently it's possible to map a file back to its name on Windows. As of Windows Vista, you can simply call GetFinalPathNameByHandle(). This takes a HANDLE; to get the HANDLE from the file descriptor, call _get_osfhandle(). Prior to Windows Vista, you need to do a little more work, as described in this article. Note that on Windows fileno() is named _fileno(), though the former may work with a warning.
Going even further into hacky territory, there are a few more techniques that you could use. You could shell out to lsof, or you could extract the code it uses to resolve pathnames. lsof actually looks directly in kernel memory, extracting information from the kernel's name cache. This has several limitations, outlined in the lsof FAQ. And of course, you need root or equivalent privileges to do this, either directly or with an suid/sgid binary.
And finally, for a portable but slow solution for finding one or more filenames matching an open file, you could find the device and inode number using fstat() on the file descriptor, and then recursively traverse the filesystem stat()ing every file, until you find a file with matching device and inode number. Remember the caveats I mention above; you may find no matching files, more than one matching file, and even if you don't find any matching files, the file might still be there, but hidden by a mount point. And of course, there may be race conditions; something may rename the file in such a way that you never see it while traversing the hierarchy.
There is no such standard function.
Do you fopen() yourself? If then, maintain FILE * to filename hash table yourself.
Otherwise, it's not possible in general.
I don't think that there is such function even at windows.h,coca.h or unistd.h.
Most probably you write it yourself. Just make a
struct myFile {
FILE *fh;
char *filename;
}
and hold such structures into array of struct myFile and in MagicFunction(f,b) walk on the array looking for the address equal to f.

How to check if an opened file has been moved or removed by another process

I have a process using C on Linux OS that writes data to a file. It uses open()/write() functions and I've been wondering if another process rm'd or mv'd the file. How can my process find out and recreate the file?
You can use fstat() to get the information about the open file. If the st_nlink field is zero, the file has been removed from the file system (possibly by being moved to a different file system, but there's no real way for you to determine that). There's a decent chance you have the only remaining reference to that file - though there might be other processes also holding it open. The disk space won't be released until the last process with an open file descriptor for the file finally closes the file.
If the st_nlink field is still positive, then your file still has a name somewhere out in the file system. You then need to use stat() to determine whether the st_dev and st_ino fields for the given file name match the same fields from the file descriptor. If the name still exists and has the same device and inode number, then it is 'the same' file (though the contents may have changed). If there's a difference, then the open file is different from the file specified by name.
Note that if you want to be sure that the given name is not a symbolic link to a moved copy of the file, then you would have to use lstat() on the file when you open it (to ensure it isn't a symlink at that point), and again when you check the file (instead of using stat()).
You can use the stat call to do this.
struct stat st;
if(stat("/tmp",&st) == 0)
printf(" /tmp is present\n");
else
/* Write code to create the file */

Resources