auto delete file on linux - c

I am trying to do a file be deleted when a program ends. I remember that before I could put the unlink() before the first close() and I don't need reopen the file.
What I expect: The file is erased after the program ends.
What is happening: The file is erased when the call to unlink happens the file is erased.
My sample program:
int main()
{
int fd = open(argv[1], O_CREAT);
int x = 1;
write(fd, "1234\n", 5);
close(fd);
fd = open(argv[1], 0);
unlink(argv[1]);
while (x <= 3)
{
int k;
scanf(" %d", &k);
x++;
}
close(fd);
return 0;
}
Has a way that I can open() the file, interact with it and on close() delete the file from harddisk? I'm using fedora linux 18.
I need know the name of the file that I did open in this way because it will be used by another application.

Unlinking a file simply detaches the file name from the underlying inode, making it impossible to open the file using that file name afterwards.
If any process has the file still open, they can happily read and write it, as those operations operate on the inode and not the file name. Also, if there are hardlinks (other file names referring to the same inode) left, those other file names can be used to open the file just fine. See e.g. the Wikipedia article on inodes for further details.
Edited to add:
In Linux, you can leverage the /proc pseudofilesystem. If your application (with process ID PID) has file descriptor FD open, with the file name already unlinked, it can still let another application work on it by telling the other application to work on /proc/PID/fd/FD. It is a pseudo-file, meaning it looks like a (non-functioning!) symlink, but it is not -- it's just useful Linux kernel magic: as long as the other application just opens it normally (open()/fopen() etc., no lstat()/readlink() stuff), they will get access as if they were opening a normal file.
As a real-world example, open two terminals, and in one write
bash -c 'exec 3<>foobar ; echo $$ ; rm foobar ; echo "Initial contents" >&3 ; cat >&3'
The first line it outputs is the PID, and FD is 3 here. Anything you type (after pressing Enter) will be appended to a file that was briefly named foobar, but no longer exists. (You can easily verify that.)
In a second terminal, type
cat /proc/PID/fd/3
to see what that file contains.

It sounds like what you really want is tmpfile():
The tmpfile() function opens a unique temporary file in binary
read/write (w+b) mode. The file will be automatically deleted when it
is closed or the program terminates.

The File is unlinked, so it won't show up from ls... but the file still exists there is an inode and you could actually re-link it... the file won't be removed from the disk until all file descriptors pointing to it are closed...
you could still read and write to the fd while it is open after it is unlinked...

Related

Can a file descriptor be duplicated multiple times?

I've been looking for quite a while and cannot find the answer to my question.
I'm trying to reproduce a shell in C, with full redirections. In order to do this, I wanted to open the file before executing my command.
For example, in ls > file1 > file2, I use dup2(file1_fd, 1) and dup2(file2_fd, 1) and then I execute ls to fill the files, but it seems a standard output can only be open once so only file2 will be filled, because it was the last one to be duplicated.
Is there a way to redirect standard output to multiple file?
Is there something I am missing? Thanks!
Is there a way to redirect standard output to multiple files?
Many file descriptors cannot be made one file descriptor. You need to write into each file descriptor separately. This is what tee utility does for you.
What you are asking for is the exact reason why the tee command exists (you can take a look at its source code here).
You cannot duplicate a file descriptor using dup2() multiple times. As you already saw, the last one overwrites any previous duplication. Therefore you cannot redirect the output of a program to multiple files directly using dup2().
In order to do this, you really need multiple descriptors, and therefore you would have to open both files, launch the command using popen() and then read from the pipe and write to both files.
Here is a very simple example of how you could do it:
#include <stdio.h>
#include <stdlib.h>
#define N 4096
int main(int argc, const char *argv[]) {
FILE *fp1, *fp2, *pipe;
fp1 = fopen("out1.txt", "w");
if (fp1 == NULL) {
perror("fopen out1 failed");
return 1;
}
fp2 = fopen("out2.txt", "w");
if (fp2 == NULL) {
perror("fopen out2 failed");
return 1;
}
// Run `ls -l` just as an example.
pipe = popen("ls -l", "r");
if (pipe == NULL) {
perror("popen failed");
return 1;
}
size_t nread, nwrote;
char buf[N];
while ((nread = fread(buf, 1, N, pipe))) {
nwrote = 0;
while (nwrote < nread)
nwrote += fwrite(buf + nwrote, 1, nread - nwrote, fp1);
nwrote = 0;
while (nwrote < nread)
nwrote += fwrite(buf + nwrote, 1, nread - nwrote, fp2);
}
pclose(pipe);
fclose(fp2);
fclose(fp1);
return 0;
}
The above code is only to give a rough estimate on how the whole thing works, it doesn't check for some errors on fread, fwrite, etc: you should of course check for errors in your final program.
It's also easy to see how this could be extended to support an arbitrary number of output files (just using an array of FILE *).
Standard output is not different from any other open file, the only special characteristic is for it to be file descriptor 1 (so only one file descriptor with index 1 can be in your process) You can dup(2) file descriptor 1 to get, let´s say file descriptor 6. That's the mission of dup() just to get another file descriptor (with a different number) than the one you use as source, but for the same source. Dupped descriptors allow you to use any of the dupped descriptors indifferently to output, or to change open flags like close on exec flag or non block or append flag (not all are shared, I'm not sure which ones can be changed without affecting the others in a dup). They share the file pointer, so every write() you attempt to any of the file descriptors will be updated in the others.
But the idea of redirection is not that. A convention in unix says that every program will receive three descriptors already open from its parent process. So to use forking, first you need to consider how to write notation to express that a program will receive (already opened) more than one output stream (so you can redirect any of them properly, before calling the program) The same also applies for joining streams. Here, the problem is more complex, as you'll need to express how the data flows might be merged into one, and this makes the merging problem, problem dependant.
File dup()ping is not a way to make a file descriptor to write in two files... but the reverse, it is a way to make two different file descriptors to reference the same file.
The only way to do what you want is to duplicate write(2) calls on every file descriptor you are going to use.
As some answer has commented, tee(1) command allows you to fork the flow of data in a pipe, but not with file descriptors, tee(1) just opens a file, and write(2)s there all the input, in addition to write(2)`ing it to stdout also.
There's no provision to fork data flows in the shell, as there's no provision to join (in paralell) dataflows on input. I think this is some abandoned idea in the shell design by Steve Bourne, and you'll probably get to the same point.
BTW, just study the possibility of using the general dup2() operator, which is <n>&m>, but again, consider that, for the redirecting program, 2>&3 2>&4 2>&5 2>&6 mean that you have pre-opened 7 file descriptors, 0...6 in which stderr is an alias of descpritors 3 to 6 (so any data written to any of those descriptors will appear into what was stderr) or you can use 2<file_a 3<file_b 4<file_c meaning your program will be executed with file descriptor 2 (stderr) redirected from file_a, and file descriptors 3 and 4 already open from files file_b and file_c. Probably, some notation should be designed (and it doesn't come easily to my mind now, how to devise it) to allow for piping (with the pipe(2) system call) between different processes that have been launched to do some task, but you need to build a general graph to allow for generality.

How to reserve a file descriptor?

I'm writing a curses-based program. In order to make it simpler for me to find errors in this program, I would like to produce debug output. Due to the program already displaying a user interface on the terminal, I cannot put debugging output there.
Instead, I plan to write debugging output to file descriptor 3 unconditionally. You can invoke the program as program 3>/dev/ttyX with /dev/ttyX being a different teletype to see the debugging output. When file descriptor 3 is not opened, write calls fail with EBADF, which I ignore like all errors when writing debugging output.
A problem occurs when I open another file and no debugging output has been requested (i.e. file descriptor 3 has not been opened). In this case, the newly opened file might receive file descriptor 3, causing debugging output to randomly corrupt a file I just opened. This is a bad thing. How can I avoid this? Is there a portable way to mark a file descriptor as “reserved” or such?
Here are a couple of ideas I had and their problems:
I could open /dev/null or a temporary file to file descriptor 3 (e.g. by means of dup2()) before opening any other file. This works but I'm not sure if I can assume this to always succeed as opening /dev/null may not succeed.
I could test if file descriptor 3 is open and not write debugging output if it isn't. This is problematic when I'm attempting to restart the program by calling exec as a different file descriptor might have been opened (and not closed) prior to the exec call. I could intentionally close file descriptor 3 before calling exec when it has not been opened for debugging, but this feels really uggly.
Why use fd 3? Why not use fd 2 (stderr)? It already has a well-defined "I am logging of some sorts" meaning, is always (not true, but sufficiently true...) and you can redirect it before starting your binary, to get the logs where you want.
Another option would be to log messages to syslog, using the LOG_DEBUG level. This entails calling syslog() instead of a normal write function, but that's simply making the logging more explicit.
A simple way of checking if stderr has been redirected or is still pointing at the terminal is by using the isatty function (example code below):
#include <stdio.h>
#include <unistd.h>
int main(void) {
if (isatty(2)) {
printf("stderr is not redirected.\n");
} else {
printf("stderr seems to be redirected.\n");
}
}
In the very beginning of your program, open /dev/null and then assign it to file descriptor 3:
int fd = open ("/dev/null", O_WRONLY);
dup2(fd, 3);
This way, file descriptor 3 won't be taken.
Then, if needed, reuse dup2() to assign file descriptor 3 to your debugging output.
You claim you can't guarantee you can open /dev/null successfully, which is a little strange, but let's run with it. You should be able to use socketpair() to get a pair of FDs. You can then set the write end of the pair non-blocking, and dup2 it. You claim you are already ignoring errors on writes to this FD, so the data going in the bit-bucket won't bother you. You can of course close the other end of the socketpair.
Don't focus on a specific file descriptor value - you can't control it in a portable manner anyway. If you can control it at all. But you can use an environment variable to control debug output to a file:
int debugFD = getDebugFD();
...
int getDebugFD()
{
const char *debugFile = getenv( "DEBUG_FILE" );
if ( NULL == debugFile )
{
return( -1 );
}
int fd = open( debugFile, O_CREAT | O_APPEND | O_WRONLY, 0644 );
// error checking can be here
return( fd );
}
Now you can write your debug output to debugFD. I assume you know enough to make sure debugFD is visible where you need it, and also how to make sure it's initialized before trying to use it.
If you don't pass a DEBUG_FILE envval, you get an invalid file descriptor and your debug calls fail - presumably silently.

How to check if a file still exists using a file descriptor

I have a file descriptor that is set to a positive value with the result of a open() function so this fd is indicating a file. When i delete the actual file fd is still a positive integer. I want to know that if i delete a file for some reason, how can i know that this file descriptor is not valid anymore. In short, how can i know that the file that fd is indicating, still there or not. I am trying to do this in C on FreeBSD.
Unix systems let you delete open files (or rather, delete all references to the file from the filesystem). But the file descriptor is still valid. Any read and write calls will be successful, as they would with the filename still there.
In other words, you cannot fully delete a file until the file descriptor is closed. Once closed, the file will then be removed automatically.
With a valid file descriptor, you can check if the filename still exists, e.g.
printf("%d\n", buf.st_nlink); // 0 means no filenames
Where buf is a struct stat initialised with fstat.
Before writing to the file you could check if it is still there using access()
if (access("/yourfile",W_OK)!=-1) {
//Write on the file
}
You can also do fstat on the descriptor:
struct stat statbuf;
fstat(fd,&statbuf);
if (statbuf.st_nlink > 0) {
//File still exists
}
But it will slow your software down a lot, and also some program could link the file somewhere else and unlink the original name, so that the file would still be existing but under a different name/location, and this method would not detect that.
A much better alternative would be to use inotify on GNU/Linux, or kqueue on bsd, but I've never used the 2nd one.
You can use these API to watch changes in directories and get notifications from the kernel and get an event when your file is being deleted by some other process, and do something about it.
Keep in mind that this events are not in real time, so you could still use the file for a couple of milliseconds before getting the event.

Reopen a file descriptor with another access?

Assume the OS is linux. Suppose I opened a file for write and get a file descriptor fdw. Is it possible to get another file descriptor fdr, with read-only access to the file without calling open again? The reason I don't want to call open is the underlying file may have been moved or even unlinked in the file system by other processes, so re-use the same file name is not reliable against such actions. So my question is: is there anyway to open a file descriptor with different access right if given only a file descriptor? dup or dup2 doesn't change the access right, I think.
Yes! The trick is to access the deleted file via /proc/self/fd/n. It’s a linux-only trick, as far as I know.
Run this program:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main() {
FILE* out_file;
FILE* in_file;
char* dev_fd_path;
char buffer[128];
/* Write “hi!” to test.txt */
out_file = fopen("test.txt", "w");
fputs("hi!\n", out_file);
fflush(out_file);
/* Delete the file */
unlink("test.txt");
/* Verify that the file is gone */
system("ls test.txt");
/* Reopen the filehandle in read-mode from /proc */
asprintf(&dev_fd_path, "/proc/self/fd/%d", fileno(out_file));
in_file = fopen(dev_fd_path, "r");
if (!in_file) {
perror("in_file is NULL");
exit(1);
}
printf("%s", fgets(buffer, sizeof(buffer), in_file));
return 0;
}
It writes some text to a file, deletes it, but keeps the file descriptor open and and then reopens it via a different route. Files aren’t actually deleted until the last process holding the last file descriptor closes it, and until then, you can get at the file contents via /proc.
Thanks to my old boss Anatoly for teaching me this trick when I deleted some important files that were fortunately still being appended to by another process!
No, the fcntl call will not let you set the read/write bits on an open file descriptor and the only way to get a new file descriptor from an existing one is by using the duplicate functionality. The calls to dup/dup2/dup3 (and fcntl) do not allow you to change the file access mode.
NOTE: this is true for Linux, but not true for other Unixes in general. In HP-UX, for example, [see (1) and (2)] you are able to change the read/write bits with fcntl using F_SETFL on an open file descriptor. Since file descriptors created by dup share the same status flags, however, changing the access mode for one will necessarily change it for the other.

How to check if an opened file has been moved or removed by another process

I have a process using C on Linux OS that writes data to a file. It uses open()/write() functions and I've been wondering if another process rm'd or mv'd the file. How can my process find out and recreate the file?
You can use fstat() to get the information about the open file. If the st_nlink field is zero, the file has been removed from the file system (possibly by being moved to a different file system, but there's no real way for you to determine that). There's a decent chance you have the only remaining reference to that file - though there might be other processes also holding it open. The disk space won't be released until the last process with an open file descriptor for the file finally closes the file.
If the st_nlink field is still positive, then your file still has a name somewhere out in the file system. You then need to use stat() to determine whether the st_dev and st_ino fields for the given file name match the same fields from the file descriptor. If the name still exists and has the same device and inode number, then it is 'the same' file (though the contents may have changed). If there's a difference, then the open file is different from the file specified by name.
Note that if you want to be sure that the given name is not a symbolic link to a moved copy of the file, then you would have to use lstat() on the file when you open it (to ensure it isn't a symlink at that point), and again when you check the file (instead of using stat()).
You can use the stat call to do this.
struct stat st;
if(stat("/tmp",&st) == 0)
printf(" /tmp is present\n");
else
/* Write code to create the file */

Resources