Can hardlinks be overwritten without using a temporary file? - c

I have a hardlink that must always exist on the filesystem. The inode the hardlink points to is not constant. I want to update the hardlink without adding a temporary entry to the directory.
(Creating a file without a directory entry can be done using open(2) with the O_TMPFILE flag.)
The issue I'm facing is with replacing/updating the hardlink. From the documentation on the relevant system calls, it seems that I have only two options, and neither avoids a temporary file:
Using renameat, it is possible to ensure that the hardlink always exists. However, it consumes a hardlink and hence necessitates a temporary file (not to mention its inability to dereference symbolic links).
Using linkat, it is possible to produce a hardlink without sacrificing another file, but it cannot overwrite existing files, which requires deleting the original hardlink first.
Is it at all possible to create a link to an inode that replaces an older link with the same name?

You need to have another file to which to switch the link. However, rename and renameat do not need the inode to be linked in the same directory; they just require the inode to exist on the same filesystem, or more precisely under the same mount point; otherwise Linux rename fails with EXDEV:
EXDEV
oldpath and newpath are not on the same mounted filesystem. (Linux permits a filesystem to be mounted at multiple points, but rename() does not work across different mount points, even if the same filesystem is mounted on both.)
Since Linux 3.11 there is a way to make a new file without linking it to the filesystem: open(2) has a new flag O_TMPFILE:
O_TMPFILE (since Linux 3.11)
Create an unnamed temporary file. The pathname argument specifies a directory; an unnamed inode will be created in that directory's filesystem. Anything written to the resulting file will be lost when the last file descriptor is closed, unless the file is given a name.
O_TMPFILE must be specified with one of O_RDWR or O_WRONLY and, optionally, O_EXCL. If O_EXCL is not specified, then linkat(2) can be used to link the temporary file into the filesystem, making it permanent, using code like the following:
char path[PATH_MAX];
fd = open("/path/to/dir", O_TMPFILE | O_RDWR, S_IRUSR | S_IWUSR);
/* File I/O on 'fd'... */
snprintf(path, PATH_MAX, "/proc/self/fd/%d", fd);
linkat(AT_FDCWD, path, AT_FDCWD, "/path/for/file", AT_SYMLINK_FOLLOW);
In this case, the open() mode argument determines the file permission mode, as with O_CREAT.
The manual says that one of the two common use cases for O_TMPFILE is:
Creating a file that is initially invisible, which is then populated with data and adjusted to have appropriate filesystem attributes (chown(2), chmod(2), fsetxattr(2), etc.) before being atomically linked into the filesystem in a fully formed state (using linkat(2) as described above).
There are several downsides to this, apart from it being quite new: the filesystem must also support O_TMPFILE (ext2/ext3/ext4 support it, XFS does since 3.15, and btrfs since 3.16). Furthermore, it might still not be a match for your case: the linkat trick requires AT_SYMLINK_FOLLOW, and renameat has no equivalent flag; and if the target name already exists, linkat does not replace the target.
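In practice the two calls are usually combined: O_TMPFILE plus linkat(2) to give the new inode a temporary name, then rename(2) to atomically replace the target. A minimal sketch, assuming the target is /path/for/file and using /path/for/.file.tmp as the (still unavoidable) short-lived temporary entry; both paths are placeholders and error handling is abbreviated:

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/stat.h>

/* Sketch: replace /path/for/file with a freshly written inode.
   The target name never disappears, but a short-lived temporary
   entry is still needed because linkat() refuses to overwrite an
   existing name. */
int replace_link(void)
{
    char procpath[64];

    int fd = open("/path/for", O_TMPFILE | O_RDWR, S_IRUSR | S_IWUSR);
    if (fd == -1)
        return -1;

    /* ... write the new contents to fd ... */

    snprintf(procpath, sizeof procpath, "/proc/self/fd/%d", fd);
    if (linkat(AT_FDCWD, procpath, AT_FDCWD, "/path/for/.file.tmp",
               AT_SYMLINK_FOLLOW) == -1) {
        close(fd);
        return -1;
    }
    /* rename() atomically replaces the old link with the new one,
       so /path/for/file always exists. */
    if (rename("/path/for/.file.tmp", "/path/for/file") == -1) {
        unlink("/path/for/.file.tmp");
        close(fd);
        return -1;
    }
    close(fd);
    return 0;
}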

Related

Get path from file descriptor when path is longer than PATH_MAX

I receive filesystem events from fanotify. Sometimes I want to get an absolute path to a file that's being accessed.
Usually, it's not a problem - fanotify_event_metadata contains a file descriptor fd, so I can call readlink on /proc/self/fd/<fd> and get my path.
However, if a path exceeds PATH_MAX readlink can no longer be used - it fails with ENAMETOOLONG. I'm wondering if there's a way to get a file path in this case.
Obviously, I can fstat the descriptor I get from a fanotify and traverse the entire filesystem looking for files with identical device ID and inode number. But this approach is not feasible for me performance-wise (even if I optimize it to ignore paths shorter than PATH_MAX).
I've tried getting a parent directory by reopening fd with O_PATH and calling openat(fd, "..", ...). Obviously, that failed because fd doesn't refer to a directory. I've also tried examining contents of a buffer after a failed readlink call (hoping it contains partial path). That didn't work either.
So far I've managed to get long paths for files inside the working directory of a process that opened them (fanotify events contain a pid of a target process, so I can read /proc/<pid>/cwd and get the path to the root from there). But that is a partial solution.
Is there a way to get an absolute path from a file descriptor without traversing the whole filesystem? Preferably the one that will work with kernel 2.6.32/glibc 2.11.
Update: For the curious. I've figured out why calling readlink("/proc/self/fd/<fd>", ... with a buffer large enough to store the entire path doesn't work.
Look at the implementation of do_proc_readlink. Notice that it doesn't use the provided buffer directly. Instead, it allocates a single page and uses it as a temporary buffer when it calls d_path. In other words, no matter how large the buffer is, d_path will always be limited to the size of a page, which is 4096 bytes on amd64. Same as PATH_MAX! The -ENAMETOOLONG itself is returned by prepend when it runs out of the aforementioned page.
readlink can be used with a link target that's longer than PATH_MAX. There are two restrictions: the name of the link itself must be shorter than PATH_MAX (check, "/proc/self/fd/<fd>" is about 20 characters) and the provided output buffer must be large enough. You might want to call lstat first to figure out how big the output buffer should be, or just call readlink repeatedly with growing buffers.
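A minimal sketch of the growing-buffer variant, assuming nothing beyond readlink(2) itself (the starting size is an arbitrary choice); note that for the /proc/self/fd case the question's update still applies, since the kernel truncates that particular link target to one page:

#include <stdlib.h>
#include <unistd.h>

/* Sketch: read a symlink target of arbitrary length by retrying with a
   doubled buffer whenever readlink() fills it completely. Returns a
   malloc()ed, NUL-terminated string, or NULL on error. */
char *readlink_alloc(const char *link)
{
    size_t size = 256;               /* arbitrary starting size */

    for (;;) {
        char *buf = malloc(size);
        if (buf == NULL)
            return NULL;

        ssize_t n = readlink(link, buf, size);
        if (n < 0) {
            free(buf);
            return NULL;
        }
        if ((size_t)n < size) {      /* the target fit: terminate and return */
            buf[n] = '\0';
            return buf;
        }

        free(buf);                   /* possibly truncated: try a bigger buffer */
        size *= 2;
    }
}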
The PATH_MAX limitation arises from the fact that Unix (or Linux, from now on) needs to bound the size of the parameters passed to the kernel. There's no limit on how deep a file hierarchy can grow, and it is always possible to access all files, no matter how deep they sit in the filesystem hierarchy. What is actually limited is the length of the string you can pass to or receive from the kernel to represent a file name. This means you cannot create a symlink with a target path longer than this length (because you have to pass the target path), but you can easily have paths far longer than this limit.
When you pass a filename to the kernel, you do so to name a file (or device, or socket, or fifo, or whatever), to open it, etc. The filename first goes to a routine that converts the path into an inode (which is what the kernel actually manages). That routine begins scanning from one of two possible points in the filesystem hierarchy: the root inode or the inode of the process's current working directory. Which inode is used as the starting point depends on whether there is a leading / character at the beginning of the path. From that point, up to PATH_MAX characters are processed per call, but repeated steps can take us deep enough that we can no longer reach the root in a single step...
Suppose you use a path to change your current directory and do chdir A/B/C/D/E/.../Z. Once there, you create new directories and do the same thing, chdir AA/AB/AC/AD/AE/.../AZ, then chdir BA/BB/BC/BD/..., and so on. There is nothing in the system that forbids you from getting that deep in the filesystem (you can try it yourself; I have done and tested it before). You can grow a tree that is far deeper than PATH_MAX. This only means that you cannot get there directly from the filesystem root; you can get there in steps, as far as the system allows, depending on where you fix your root directory (by means of the chroot(2) syscall) or your current directory (by means of the chdir(2) syscall).
You have probably noticed (or not) that there's no system call to get your current working directory's path from the root... There are several reasons for this:
The root inode and the current working inode are two local-to-process concepts. Two processes on the same system can have different working directories, and also different root directories, to the point that they share nothing in common and there is no way to reach one's directory from the other's.
An inode's path can be ambiguous. Well, this is not true for a directory, as two hard links are not allowed to point to the same directory inode (this was possible in older Unixes, where directories had to be created with the mknod(2) system call; if you have access to some HP-UX v6 or old Unix SysV R4, you can create directories with a ... entry pointing to the grandparent of a directory, or similar things, just by being root and knowing how to use the mknod(2) syscall). The point is that when two links point to the same inode, which of them leads back to the root, and which is the right path from the root inode to the current dir?
The current inode and the root can be separated by a path too long to fit within the PATH_MAX limit.
There can be several different filesystems (and filesystem types) involved in getting to the root, so this is not something that can be determined from the data stored on disk alone; you must also know the mount table.
For these reasons, there is no direct support in the kernel for obtaining the root path to a file. And there is no other way to get the path (this is what the pwd(1) command does) than to follow the .. entry to the parent directory, search there for a link that refers to the inode number of the current dir, and repeat this until the parent inode is the same as the last inode visited. Only then are you in the root directory (your root directory, which in general differs from other processes' root directories).
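A minimal sketch of that walk, along the lines of a classic pwd(1) implementation; it changes the process's working directory as it climbs and prints the components in reverse order, so it is only an illustration of the idea:

#include <dirent.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Sketch: remember the device/inode of ".", chdir(".."), scan the
   parent for the entry with that device/inode, and stop when "." and
   ".." are the same inode (the root). A real implementation would
   collect the components, reverse them, and restore the original
   working directory afterwards. */
void print_path_components(void)
{
    for (;;) {
        struct stat cur, parent;

        if (stat(".", &cur) == -1 || stat("..", &parent) == -1)
            return;
        if (cur.st_dev == parent.st_dev && cur.st_ino == parent.st_ino)
            break;                    /* "." == "..": we reached the root */
        if (chdir("..") == -1)
            return;

        DIR *d = opendir(".");
        if (d == NULL)
            return;

        struct dirent *e;
        while ((e = readdir(d)) != NULL) {
            struct stat st;
            if (lstat(e->d_name, &st) == 0 &&
                st.st_dev == cur.st_dev && st.st_ino == cur.st_ino) {
                printf("%s\n", e->d_name);   /* one component (reverse order) */
                break;
            }
        }
        closedir(d);
    }
}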
Just try this exercise:
i=0
while [ "$i" -lt 10000 ]
do
    mkdir dir-$i
    cd dir-$i
    i=$(expr "$i" + 1)
done
and see how far you can go from the root directory in your hierarchy.
NOTE 1
Another reason it is impossible to get the path to a file from an open descriptor is that you only have access to the inode (the path you used to open(2) it can have no relationship to an actual path from the root, as you can use symlinks and paths relative to the working directory, or the root dir may have been changed between the open call and the time you want the path; the path can even no longer exist, as you may have unlink(2)d it). The inode holds no reference to any path leading to it, as there can be multiple (even millions of) paths to a file. In the inode you only have a reference count, which is the number of paths that currently end at that inode.

changing file permissions of default mkstemp

I call the following code in C:
fileCreatefd = mkstemp(fileName);
I see that the file is created with permissions 600 (-rw-------). I want to create this temp file as -rw-rw-rw-.
I tried playing around with umask, but that only applies a mask over the file permissions -- at least that's my understanding. So how can I create a file with permissions 666?
Thanks
You cannot create it 0666 with mkstemp. You can change the permissions afterwards, if that is sufficient for your application, with fchmod.
fileCreatefd = mkstemp(fileName);
fchmod(fileCreatefd, 0666);
The mkstemp() function generates a unique temporary filename from template, creates and opens the file, and returns an open file descriptor for the file.
The last six characters of template must be "XXXXXX" and these are replaced with a string that makes the filename unique. Since it will be modified, template must not be a string constant, but should be declared as a character array.
The file is created with permissions 0600, that is, read plus write for owner only. (In glibc versions 2.06 and earlier, the file is created with permissions 0666, that is, read and write for all users.) The returned file descriptor provides both read and write access to the file. The file is opened with the open(2) O_EXCL flag, guaranteeing that the caller is the process that creates the file.
More generally, the POSIX specification of mkstemp() does not say anything about file modes, so the application should make sure its file mode creation mask (umask(2)) is set appropriately before calling mkstemp() (and mkostemp()).
So after creating the file, use fchmod to change the file permissions.
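A minimal, self-contained sketch of the whole pattern with error checking; the template path is just an example:

#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    char fileName[] = "/tmp/exampleXXXXXX";   /* example template */

    int fileCreatefd = mkstemp(fileName);
    if (fileCreatefd == -1) {
        perror("mkstemp");
        return EXIT_FAILURE;
    }

    /* Widen the 0600 permissions mkstemp used to 0666. The umask does
       not apply to fchmod(), so the file really ends up as 0666. */
    if (fchmod(fileCreatefd, 0666) == -1) {
        perror("fchmod");
        close(fileCreatefd);
        return EXIT_FAILURE;
    }

    printf("created %s\n", fileName);
    close(fileCreatefd);
    return EXIT_SUCCESS;
}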

A file opened for read and write can be unlinked

In my program (on Mac OS X), I opened the file using the following code:
int fd;
fd = open(filename, O_RDWR);
The code to delete the file is as follows:
unlink(filename);
In my case, the same file is opened and then deleted. I observed the following:
After opening the file, I can delete it with this program and even with the rm command.
After deleting the file, read and write operations on the file still work without any problem.
I would like to know the reason behind this. How can I prevent the rm command or the unlink(2) system call from deleting a file that is currently open?
You can't stop unlink(2) from unlinking a file which it has permission to unlink (i.e. it has write access to the directory).
unlink is not called unlink because nobody could think of a better name. It's called that because that is what it does; it unlinks the file from the directory. (A directory is just a collection of links; i.e. it associates names with the location of the corresponding data.) It does not delete the file; the file is garbage collected by the filesystem when there are no longer any links to it.
Open file descriptors are not the only way to keep links to files. Another, quite common, way is to use the ln(1) command without the -s option. This creates "hard" links. If a file has several hard links, then removing one of the links (with unlink(2)) does just that -- it removes one of the links.
The rm command has a possibly more confusing name, but it, too, only removes the name, not the file. The file exists as long as someone has a link to it, including a running process.
First, the rm command calls unlink(2).
Then, unlinking an opened file is a normal thing to do on Linux or other Unixes (e.g. MacOSX). It is the canonical way to get temporary files (it is probably what tmpfile(3) does).
You should understand what inodes are, and realize that a file is not its name or file path, but essentially an inode. A file can have zero, one, or several file paths or names (you can add more with the link(2) syscall, provided all the names sit in the same filesystem). Directory entries associate names with inodes.
So there is no (POSIX-ly portable) way to prohibit I/O on open(2)-ed files without any names.
For an opened file, the kernel keeps a reference count on its inode, and keeps that inode until all processes that have open(2)-ed it have close(2)d it or terminated.
See also inode(7) and credentials(7).
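A small sketch of that canonical pattern, showing that I/O keeps working after the only name is gone; the path is an arbitrary example:

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    const char *filename = "/tmp/unlink-demo.txt";   /* example path */
    const char *msg = "still here";
    char buf[32];

    int fd = open(filename, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd == -1) {
        perror("open");
        return 1;
    }

    /* Remove the only name; the inode stays alive because fd is open. */
    if (unlink(filename) == -1)
        perror("unlink");

    /* Reads and writes keep working on the now nameless file. */
    if (write(fd, msg, strlen(msg)) == -1)
        perror("write");
    lseek(fd, 0, SEEK_SET);
    ssize_t n = read(fd, buf, sizeof buf - 1);
    if (n >= 0) {
        buf[n] = '\0';
        printf("read back after unlink: %s\n", buf);
    }

    close(fd);   /* last reference gone: the data is reclaimed now */
    return 0;
}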
It's a normal situation on Unix systems. When you rm or unlink an opened file, the system just removes the directory entry; the file itself is not really deleted until it has been closed by every process holding it open, and only then is it actually removed from the filesystem.
This protects programs (daemons, for instance) that are still working with the file.
A link is a name associated with some file (a file is basically unnamed). Note that a file can have several different names (try ln).
unlink() removes one of these associations to a file. If you remove the last link to a file, that just makes you unable to access the file by a name. But this doesn't mean that the file is unusable, as the file could have been opened and be currently read/written by some application.
A file is removed if and only if:
- there is no link to it, and
- it is not currently opened by any application.

How to obtain a file name from the standard FILE structure?

What I want:
void printFname(FILE * f)
{
    char buf[255];
    MagicFunction(f, buf);
    printf("File name: %s", buf);
}
So, all I need is "MagicFunction", but unfortunately I haven't found such a function...
Is there any way to implement it using an OS library? (windows.h, cocoa.h, posix.h, etc.)
There is no such function. There may be no filename, or more than one filename that correspond with the FILE *. On Unix, a program can continue to have a reference to a file after it has been renamed or deleted, which could mean that you have a FILE * with no name. Or more hard links may be made to the file, which means a file can have multiple names; which one would you choose? To further confuse things, a file can be temporarily hidden, by mounting a filesystem over a directory containing that file. The file will still be on disk, at its original pathname, but the file will be inaccessible at that path because the mount is obscuring it.
It's also possible that the FILE * never corresponded to a file on the filesystem at all; while they usually do, you can create one from any file descriptor using fdopen(), and that file descriptor may be a pipe, socket, or other file-like object that has never had a path on the disk. In some versions of the C library, you can open a string stream (for instance, fmemopen() in glibc), so the FILE * actually just corresponds to a memory buffer.
If you care about the name, it's best to just keep track of what it was named when you opened the file.
There are some hacky ways to approximate getting the filename; if you're just using this for debugging or informational purposes, then they may be sufficient. Most of these will require operating on the file descriptor rather than the FILE *, as the file descriptor is the lower level way of referring to a file. To get the file descriptor, run fileno() on the FILE *, and remember to check for errors in case there is no file descriptor associated with that FILE *.
On Linux, you can do readlink() on "/proc/self/fd/fileno" where fileno is the file descriptor. That will show you what filename the file had when the file was opened, or a string indicating what other kind of file descriptor it is, like a socket or inotify handle. FreeBSD and NetBSD have Linux emulation layers, which include emulation of Linux-style procfs; you may be able to do this on those if you mount a Linux-compatible procfs, though I don't have them available for testing.
On Mac OS X, you don't have /proc/self/fd. If you don't care about finding the original filename, but some other filename that refers to the file would work (such that you could pass it to another program), you can construct one: /.vol/deviceid/inode. For example, /.vol/234881030/281363. To get those values, run fstat() on the file descriptor, and use st_dev and st_ino on the resulting struct stat.
On Windows, files and the filesystem work quite differently than Unix. Apparently it's possible to map a file back to its name on Windows. As of Windows Vista, you can simply call GetFinalPathNameByHandle(). This takes a HANDLE; to get the HANDLE from the file descriptor, call _get_osfhandle(). Prior to Windows Vista, you need to do a little more work, as described in this article. Note that on Windows fileno() is named _fileno(), though the former may work with a warning.
Going even further into hacky territory, there are a few more techniques that you could use. You could shell out to lsof, or you could extract the code it uses to resolve pathnames. lsof actually looks directly in kernel memory, extracting information from the kernel's name cache. This has several limitations, outlined in the lsof FAQ. And of course, you need root or equivalent privileges to do this, either directly or with an suid/sgid binary.
And finally, for a portable but slow solution for finding one or more filenames matching an open file, you could find the device and inode number using fstat() on the file descriptor, and then recursively traverse the filesystem stat()ing every file, until you find a file with matching device and inode number. Remember the caveats I mention above; you may find no matching files, more than one matching file, and even if you don't find any matching files, the file might still be there, but hidden by a mount point. And of course, there may be race conditions; something may rename the file in such a way that you never see it while traversing the hierarchy.
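For the Linux /proc approach mentioned above, here is a best-effort sketch built around the question's printFname(); the name it reports is whatever the descriptor pointed at when it was opened, with all the caveats described above:

#include <stdio.h>
#include <unistd.h>

/* Best-effort sketch for Linux: resolve /proc/self/fd/<fd> for the
   descriptor behind the FILE *. Returns 0 and fills buf on success. */
int printFname(FILE *f, char *buf, size_t bufsize)
{
    char link[64];

    int fd = fileno(f);
    if (fd == -1 || bufsize == 0)
        return -1;

    snprintf(link, sizeof link, "/proc/self/fd/%d", fd);
    ssize_t n = readlink(link, buf, bufsize - 1);
    if (n == -1)
        return -1;

    buf[n] = '\0';
    printf("File name: %s\n", buf);
    return 0;
}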
There is no such standard function.
Do you fopen() yourself? If so, maintain a FILE *-to-filename hash table yourself.
Otherwise, it's not possible in general.
I don't think there is such a function even in windows.h, cocoa.h, or unistd.h.
Most probably you have to write it yourself. Just make a
struct myFile {
    FILE *fh;
    char *filename;
};
and hold such structures in an array of struct myFile; in MagicFunction(f, b), walk the array looking for the element whose fh equals f.

How to check if an opened file has been moved or removed by another process

I have a process written in C on Linux that writes data to a file using the open()/write() functions. I've been wondering how to tell whether another process has rm'd or mv'd the file. How can my process find out and recreate the file?
You can use fstat() to get the information about the open file. If the st_nlink field is zero, the file has been removed from the file system (possibly by being moved to a different file system, but there's no real way for you to determine that). There's a decent chance you have the only remaining reference to that file - though there might be other processes also holding it open. The disk space won't be released until the last process with an open file descriptor for the file finally closes the file.
If the st_nlink field is still positive, then your file still has a name somewhere out in the file system. You then need to use stat() to determine whether the st_dev and st_ino fields for the given file name match the same fields from the file descriptor. If the name still exists and has the same device and inode number, then it is 'the same' file (though the contents may have changed). If there's a difference, then the open file is different from the file specified by name.
Note that if you want to be sure that the given name is not a symbolic link to a moved copy of the file, then you would have to use lstat() on the file when you open it (to ensure it isn't a symlink at that point), and again when you check the file (instead of using stat()).
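A sketch combining both checks, using lstat() on the name as suggested so that a symlink left in the file's place is not mistaken for the original; the function name is just illustrative:

#include <stdbool.h>
#include <sys/stat.h>

/* Sketch: given the descriptor we write to and the name we opened it
   under, decide whether the name no longer refers to the same inode. */
bool file_was_removed_or_replaced(int fd, const char *path)
{
    struct stat fdst, pathst;

    if (fstat(fd, &fdst) == -1)
        return true;                  /* treat errors as "recreate it" */
    if (fdst.st_nlink == 0)
        return true;                  /* every name is gone: removed */
    if (lstat(path, &pathst) == -1)
        return true;                  /* our name is gone: moved or removed */

    /* The name exists, but is it still the file we have open? */
    return fdst.st_dev != pathst.st_dev || fdst.st_ino != pathst.st_ino;
}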
You can use the stat call to do this.
struct stat st;

if (stat("/tmp", &st) == 0) {
    printf("/tmp is present\n");
} else {
    /* Write code to create the file */
}
