A file opened for read and write can be unlinked - c

In my program (on Mac OS X), I opened the file using the following code:
int fd;
fd = open(filename, O_RDWR);
Program to delete the file is as follows:
unlink(filename);
In my case, the same file is both opened and deleted. I observed the following:
After opening the file, I can delete it using this program and even with the rm command.
After deleting the file, read and write operations on the open file still work without any problem.
I would like to know the reason behind this. How can I prevent the rm command or the unlink(2) system call from deleting a file that is currently open?

You can't stop unlink(2) from unlinking a file which it has permission to unlink (i.e. it has write access to the directory).
unlink is not called unlink because nobody could think of a better name. It's called that because that is what it does; it unlinks the file from the directory. (A directory is just a collection of links; i.e. it associates names with the location of the corresponding data.) It does not delete the file; the file is garbage collected by the filesystem when there are no longer any links to it.
Open file descriptors are not the only way to keep links to files. Another, quite common, way is to use the ln(1) command without the -s option. This creates "hard" links. If a file has several hard links, then removing one of the links (with unlink(2)) does just that -- it removes one of the links.
The rm command has a possibly more confusing name, but it, too, only removes the name, not the file. The file exists as long as someone has a link to it, including a running process.
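To see this from the program's side, here is a minimal sketch (the function name io_after_unlink and the path you pass in are made up for illustration). It removes the name while the descriptor is still open and then writes and reads through that descriptor:

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Creates `path`, unlinks it while it is still open, then writes and reads
 * back through the descriptor. Returns 0 if the data round-trips, -1 if
 * any step fails. */
int io_after_unlink(const char *path)
{
    const char msg[] = "still here";
    char buf[sizeof msg] = {0};
    int ok = 0;

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    if (unlink(path) == 0 &&                 /* removes the name only */
        write(fd, msg, sizeof msg) == (ssize_t)sizeof msg &&
        lseek(fd, 0, SEEK_SET) == 0 &&
        read(fd, buf, sizeof buf) == (ssize_t)sizeof buf &&
        memcmp(buf, msg, sizeof msg) == 0)
        ok = 1;

    close(fd);  /* last reference dropped: the data can now be reclaimed */
    return ok ? 0 : -1;
}
```

Calling io_after_unlink("/tmp/demo.tmp") should return 0 on an ordinary local filesystem, and the name is gone from the directory as soon as unlink returns.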

First, the rm command calls unlink(2).
Then, unlinking an open file is a normal thing to do on Linux or other Unixes (e.g. Mac OS X). It is the canonical way to get temporary files (like tmpfile(3) probably does).
You should understand what inodes are, and realize that a file is not its name or file path, but essentially an inode. A file can have zero, one, or several file paths or names (one can add more with the link(2) syscall, provided all the names sit in the same filesystem). Directory entries associate names to inodes.
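A small sketch of the "several names, one inode" idea, assuming two paths on the same filesystem that do not exist yet (link_count and hard_link_demo are names invented for this example). It records st_nlink after each step:

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Link count of `path` as reported by stat(), or -1 on error. */
long link_count(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 ? (long)st.st_nlink : -1;
}

/* Creates file `a`, gives it a second name `b`, then removes the name `a`.
 * Records the link count after each step in counts[0..2].
 * Returns 0 on success, -1 on error. */
int hard_link_demo(const char *a, const char *b, long counts[3])
{
    int fd = open(a, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return -1;
    close(fd);
    counts[0] = link_count(a);      /* one name */

    if (link(a, b) != 0) {
        unlink(a);
        return -1;
    }
    counts[1] = link_count(a);      /* two names, one inode */

    if (unlink(a) != 0)
        return -1;
    counts[2] = link_count(b);      /* the file survives under `b` */
    return 0;
}
```

After the call, the file is still reachable as b even though the name it was created under no longer exists.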
So there is no (POSIX-ly portable) way to prohibit I/O on open files that no longer have any name.
For an open file, the kernel keeps reference counters on its inode, and keeps that inode until every process that open(2)-ed it has close(2)-d it or has terminated.
See also inode(7) and credentials(7).

It's a normal situation on UNIX systems. When you rm or unlink an open file, the system only removes the name; it does not really delete the file until the last file descriptor for it is closed. Only then is the file really deleted from the file system.
This protects running programs such as daemons, which can keep working with the file.

A link is a name associated with some file (a file itself is basically unnamed). Note that a file can have several names (try ln).
unlink() removes one of these associations. If you remove the last link to a file, this just makes you unable to access the file by name. But this doesn't mean that the file is unusable, as the file could have been opened and be currently read or written by some application.
A file is removed if and only if:
- there is no link to it
- it is not currently opened by any application

Related

How to programatically move a Linux symlink to a different file system (copy) in C?

rename() C function does not work across file systems. So I can move files via a copy by opening them, reading them and writing them to a new copy and then unlinking. But I have a hard time getting this to work with symlinks. (The idea is to move a folder with a bunch of other files/folders/symlinks etc inside of it). Basically implementing a mv command in C.
source_descriptor = open(file, O_RDONLY);
while ((c = read(source_descriptor, buf, SIZE)) > 0) {
    write(d, buf, c);
}
unlink(file);
This works well for normal files (and I have another function handling directories without issues). But whenever it hits a symlink, perror prints "No such file or directory".
I can detect whether it's a symlink via d_type, but I am not sure how to read/copy it once I have one, since the normal file copy doesn't seem to work with symlinks because open() refuses to open them.
Once you have determined that you're dealing with a symlink (which can be done e.g. by using lstat()), you can read its contents with readlink() and recreate it at the target location by calling symlink().
See also man 7 symlink.
When you open a symlink without the O_NOFOLLOW flag, open will dereference the symlink (or symlink chain, if it's a symlink to a symlink). If the destination does not exist, open will fail. The O_NOFOLLOW flag makes sure that if you attempt to open a symlink you will reliably get an error.
To "copy" a symlink, you'll have to read it with readlink and create a new symlink at the destination. However you may have to adjust the path it points to.
However if a program of yours has the need to copy directory trees on a *nix system, the correct way to implement this is not to reinvent the wheel, but to follow the Unix way and just execute the cp program with the right arguments.

How to check if an opened file has been moved or removed by another process

I have a process using C on Linux OS that writes data to a file. It uses open()/write() functions and I've been wondering if another process rm'd or mv'd the file. How can my process find out and recreate the file?
You can use fstat() to get the information about the open file. If the st_nlink field is zero, the file has been removed from the file system (possibly by being moved to a different file system, but there's no real way for you to determine that). There's a decent chance you have the only remaining reference to that file - though there might be other processes also holding it open. The disk space won't be released until the last process with an open file descriptor for the file finally closes the file.
If the st_nlink field is still positive, then your file still has a name somewhere out in the file system. You then need to use stat() to determine whether the st_dev and st_ino fields for the given file name match the same fields from the file descriptor. If the name still exists and has the same device and inode number, then it is 'the same' file (though the contents may have changed). If there's a difference, then the open file is different from the file specified by name.
Note that if you want to be sure that the given name is not a symbolic link to a moved copy of the file, then you would have to use lstat() on the file when you open it (to ensure it isn't a symlink at that point), and again when you check the file (instead of using stat()).
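The checks described above could be sketched roughly like this. The enum, check_open_file and ensure_current are names invented for the example, and the open flags in ensure_current are just one plausible choice for a writer that appends:

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

enum file_state { FILE_SAME, FILE_NAME_CHANGED, FILE_UNLINKED, FILE_ERROR };

/* Compares the file behind `fd` with whatever `path` names right now. */
enum file_state check_open_file(int fd, const char *path)
{
    struct stat open_st, name_st;

    if (fstat(fd, &open_st) != 0)
        return FILE_ERROR;
    if (open_st.st_nlink == 0)
        return FILE_UNLINKED;           /* no name left anywhere */

    /* lstat() so that a symlink left at `path` is not followed. */
    if (lstat(path, &name_st) != 0)
        return FILE_NAME_CHANGED;       /* our file lives on under another name */

    if (name_st.st_dev == open_st.st_dev && name_st.st_ino == open_st.st_ino)
        return FILE_SAME;
    return FILE_NAME_CHANGED;           /* `path` now names a different file */
}

/* Reopens (creating if necessary) `path` when `*fd` no longer refers to it.
 * Returns 1 if reopened, 0 if nothing had to be done, -1 on error. */
int ensure_current(int *fd, const char *path)
{
    if (check_open_file(*fd, path) == FILE_SAME)
        return 0;
    int nfd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (nfd < 0)
        return -1;
    close(*fd);
    *fd = nfd;
    return 1;
}
```

The writer would call ensure_current(&fd, path) before each write, or every so often. There is still a window between the check and the write, so this narrows the problem rather than eliminating it.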
You can use the stat call to do this.
struct stat st;
if(stat("/tmp",&st) == 0)
printf(" /tmp is present\n");
else
/* Write code to create the file */

Detecting file deletion after fopen

I'm working on code that detects changes in a file (a log file) and then processes the changes with the help of fseek and ftell. But if the file gets deleted and replaced (by logrotate), the program stalls without dying, because it no longer detects changes (even after the file is recreated). Neither fseek nor ftell reports an error.
How can I detect the file deletion? Maybe by reopening the file with another FILE * variable and comparing file descriptors, but how can I do that?
When a file gets deleted, it is not necessarily erased from your disk. In your case the program still has a handle to the old file. The old file handle will not get you any information about its deletion or replacement with another file.
An easy way to detect file deletion and recreation is using stat(2) and fstat(2). They give you a struct stat which contains the inode for the file. When a file is recreated (and still open) the files (old open and recreated) are different and thus the inodes are different. The inode field is st_ino. Yes, you need to poll this unless you wish to use Linux-features like inotify.
You can periodically close the file and open it again; that way you will open the newly created one. A file actually gets deleted only when there is no handle left to it (an open file descriptor is a handle), and you are still holding the old file.
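With stdio streams, the inode comparison could look something like the sketch below. log_was_rotated and follow_rotation are hypothetical helper names; fileno() is the POSIX call that gives you the descriptor behind a FILE *:

```c
#include <stdio.h>
#include <sys/stat.h>

/* Returns 1 if `path` no longer names the file behind `fp` (deleted or
 * replaced), 0 if it is still the same file, -1 if fstat() fails. */
int log_was_rotated(FILE *fp, const char *path)
{
    struct stat open_st, name_st;

    if (fstat(fileno(fp), &open_st) != 0)
        return -1;
    if (stat(path, &name_st) != 0)
        return 1;                       /* name is gone for now */
    return name_st.st_dev != open_st.st_dev ||
           name_st.st_ino != open_st.st_ino;
}

/* Poll helper: swaps `*fp` for a fresh stream on `path` after a rotation.
 * Returns 1 if reopened, 0 if nothing changed, -1 if the new file is not
 * there yet or on error (the old stream is kept in that case). */
int follow_rotation(FILE **fp, const char *path)
{
    int r = log_was_rotated(*fp, path);
    if (r <= 0)
        return r;
    FILE *nfp = fopen(path, "r");
    if (nfp == NULL)
        return -1;
    fclose(*fp);
    *fp = nfp;
    return 1;
}
```

You would call follow_rotation(&fp, path) from the polling loop, ideally after draining whatever is left in the old stream so the tail of the rotated file is not lost.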
On windows, you could set callbacks on the modifications of the FS. Here are details: http://msdn.microsoft.com/en-us/library/aa365261(VS.85).aspx

How to check whether two file names point to the same physical file

I have a program that accepts two file names as arguments: it reads the first file in order to create the second file. How can I ensure that the program won't overwrite the first file?
Restrictions:
The method must keep working when the file system supports (soft or hard) links
File permissions are fixed and it is only required that the first file is readable and the second file writeable
It should preferably be platform-neutral (although Linux is the primary target)
On Linux, open both files and use fstat to check whether st_ino and st_dev are the same. open will follow symbolic links. Don't use stat directly, to prevent race conditions.
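A rough sketch of that approach (same_file and open_distinct are invented names). One assumption worth calling out: the output is opened without O_TRUNC and only truncated after the comparison, otherwise the input would already be destroyed by the time you find out both names are the same file:

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if both descriptors refer to the same file, 0 if they do not,
 * -1 if fstat() fails. */
int same_file(int fd1, int fd2)
{
    struct stat a, b;

    if (fstat(fd1, &a) != 0 || fstat(fd2, &b) != 0)
        return -1;
    return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}

/* Opens `in` for reading and `out` for writing, refusing when both names
 * lead to the same file (via hard link or symlink). On success stores the
 * descriptors and returns 0; otherwise returns -1. */
int open_distinct(const char *in, const char *out, int *in_fd, int *out_fd)
{
    int i = open(in, O_RDONLY);
    if (i < 0)
        return -1;

    int o = open(out, O_WRONLY | O_CREAT, 0644);    /* no O_TRUNC yet */
    if (o < 0) {
        close(i);
        return -1;
    }

    /* Truncate only once we know the output is a different file. */
    if (same_file(i, o) != 0 || ftruncate(o, 0) != 0) {
        close(i);
        close(o);
        return -1;
    }

    *in_fd = i;
    *out_fd = o;
    return 0;
}
```

Because the comparison is done on descriptors that are already open, nobody can swap the files between the check and the use.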
The best bet is not to use filenames as identities. Instead, when you open the file for reading, lock it, using whatever mechanism your OS supports. When you then also open the file for writing, also lock it - if the lock fails, report an error.
If possible, open the first file read-only (O_RDONLY). Be aware, though, that on Linux this does not by itself make a later open of the same file for writing fail, so it is not a sufficient check.
You can use stat to get the file status, and check if the inode numbers are the same.
Maybe you could use the system() function in order to invoke some shell commands?
In bash, you would simply call:
stat -c %i filename
This displays the inode number of a file. You can compare two files this way: if they are on the same file system and their inode numbers are identical, they are hard links to the same file. The following call:
stat -c %N filename
will display the file's name and, if it's a symbolic link, the file name it links to as well. It prints out only one name, even if the file it points to has hard links, so checking a symbolic link would require comparing the inode numbers of the second file and of the file the symbolic link points to in order to make sure.
You could redirect stat output to a text file and then parse the file in your program.
If you mean the same inode, in bash, you could do
[ FILE1 -ef FILE2 ] && echo equal || echo difference
Combined with realpath/readlink, that should handle the soft-links as well.

How can I tell if a file is open elsewhere in C on Linux?

How can I tell if a file is open in C? I think the more technical question would be: how can I retrieve the number of references to an existing file and determine from that whether it is safe to open?
The idea I am implementing is a file queue. You dump some files, my code processes the files. I don't want to start processing until the producer closes the file descriptor.
Everything is being done on Linux.
Thanks,
Chenz
Digging out that info is a lot of work (you'd have to search through /proc/*/fd).
You'd be better off with any of:
Save to temp then rename. Write your files to a temporary filename or directory; when you're done writing, rename the file into the directory where your app reads them. Renaming is atomic, so when the file is present you know it's safe to read.
Maybe a variant of the above: when you're done writing the file foo, you create an empty file named foo.finished. You look for the presence of *.finished when processing files.
Lock the files while writing; that way reading the file will just block until the writer unlocks it. See the flock/lockf functions. They're advisory locks though, so both the reader and the writer have to take and honor the locks.
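The first option could be sketched like this on the producer side. publish_file and the two path parameters are made up for the example, and both paths are assumed to be on the same filesystem, since rename() does not work across filesystems:

```c
#include <stdio.h>

/* Writes `data` to `tmp_path`, then renames it to `final_path` so that the
 * consumer only ever sees a complete file. Returns 0 on success, -1 on
 * error (the temporary file is removed in that case). */
int publish_file(const char *tmp_path, const char *final_path,
                 const char *data)
{
    FILE *fp = fopen(tmp_path, "w");
    if (fp == NULL)
        return -1;

    int ok = fputs(data, fp) != EOF;
    if (fclose(fp) != 0)        /* fclose flushes, so check it as well */
        ok = 0;

    if (ok && rename(tmp_path, final_path) == 0)
        return 0;

    remove(tmp_path);
    return -1;
}
```

The consumer then just scans for final names and can ignore anything that still carries the temporary name or lives in the temporary directory.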
I don't think there is any way to do this in pure C (it wouldn't be cross platform).
If you know what files you are using ahead of time, you can use inotify to be notified when they open.
Use the lsof command. (List Open Files).
C has facilities for handling files, but not much for getting information on them. In portable C, about the only thing you can do is try to open the file in the desired way and see if it works.
Generally you can't do that, for various reasons (e.g. you cannot tell whether the file is opened by another user).
If you can control the processes that open the file, you can try to avoid collisions by locking the file (there are many libraries on Linux for doing that).
If you are in control of both producer and consumer, you could use lockf() or flock() to lock the file.
There is the lsof command on most distros, which shows all currently open files. You can of course grep its output if your files are in the same directory or have some recognizable name pattern.
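For the flock() variant, a minimal sketch might look like this (open_locked is a hypothetical helper, and using a non-blocking exclusive lock is one possible choice). Remember these locks are advisory, so it only helps if the producer takes the same lock while writing:

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Opens `path` and takes an exclusive advisory lock without blocking.
 * Returns the descriptor, or -1 with errno set (EWOULDBLOCK when another
 * open file description currently holds the lock). */
int open_locked(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    if (flock(fd, LOCK_EX | LOCK_NB) != 0) {
        int saved = errno;      /* close() may overwrite errno */
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;                  /* closing fd releases the lock */
}
```

The consumer can treat a -1 with EWOULDBLOCK as "still being written, try again later". Dropping LOCK_NB would make it wait for the writer instead.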

Resources