Can an already opened FILE handle reflect changes to the underlying file without re-opening it? - c

Assuming a plain text file, foo.txt, and two processes:
Process A, a shell script, overwrites the file in regular intervals
$ echo "example" > foo.txt
Process B, a C program, reads from the file in regular intervals
fp = fopen("foo.txt", "r"); getline(&buf, &len, fp); fclose(fp);
In the C program, keeping the FILE* fp open after the initial fopen(), doing a rewind() and reading again does not seem to reflect the changes that have happened to the file in the meantime. Is the only way to see the updated contents by doing an fclose() and fopen() cycle, or is there a way to re-use the already opened FILE handle, yet reading the most recently written data?
For context, I'm simply trying to find the most efficient way of doing this.

On Unix/Linux, when you create a file with a name which already existed, the old file is not deleted or altered in any way. A new file is created and the directory is updated to point at the new file instead of the old one.
The old file will continue to exist as long as some directory entry points at it (Unix file systems allow the same file to be referenced by multiple directory entries, known as hard links) or some program has an open file handle to the file, which is more relevant to your question.
As long as you don't close fp, it continues to refer to the original file, even if that file is no longer referenced by the filesystem. When you close fp, the file will get garbage collected automatically, and the next time you open foo.txt, you'll get a file descriptor for whatever file happens to have that name at that point in time.
In short, with the shell script you indicate, your C program must close and reopen the file in order to see the new contents.
Theoretically, it would be possible for the shell script to overwrite the same file without deleting it, but (a) that's tricky to get right; (b) it's prone to race conditions; and (c) closing and reopening the file is not that time-consuming. But if you did that, you would see the changes. [Note 1]
In particular, it's common (and easy) to append to an existing file, and if you have a shell script which does that, you can keep the file descriptor open and see the changes. However, in that case you would normally have already read to the end of the file before the new data was appended, and the standard C library treats the feof() indicator as sticky; once it gets set, you will continue to get an EOF indication from new reads. If you suspect that some process will be writing more data to the file, you should reset the EOF indication with fseek(fp, 0, SEEK_CUR); before retrying the read.
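The append case can be sketched as follows. This is a minimal illustration, not a definitive implementation: the helper name drain_new_lines and the 4096-byte line buffer are assumptions made for the example, and a line longer than the buffer would be counted more than once.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: copy every line that has appeared in `fp` since the
 * last call to `out`. Returns the number of fgets() chunks copied, or -1 if
 * the seek fails. Lines longer than the buffer are counted once per chunk. */
int drain_new_lines(FILE *fp, FILE *out)
{
    char line[4096];
    int count = 0;

    assert(fp != NULL && out != NULL);

    /* A no-op seek clears the sticky end-of-file indicator, so the next
     * read goes back to the underlying file instead of failing at once. */
    if (fseek(fp, 0, SEEK_CUR) != 0)
        return -1;

    while (fgets(line, sizeof line, fp) != NULL) {
        fputs(line, out);
        count++;
    }
    return count;
}
```

A polling reader would call drain_new_lines(fp, stdout) in a loop with a sleep in between. Calling clearerr(fp) instead of the no-op fseek() also resets the end-of-file indicator.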
Notes
As @amadan points out in a comment, there are race conditions with echo text > foo.txt as well, although the window is a bit shorter. But you can definitely avoid race conditions by using the idiom echo text > temporary_file; mv -f temporary_file foo.txt, because the rename operation is atomic. Of course, that would definitely require you to close and reopen the file. But it's a good idea, particularly if the contents being written are long or critical, or if new files are created frequently.
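If the writer were a C program rather than a shell script, the same write-then-rename idiom might look like the sketch below. The helper name replace_file and its parameters are hypothetical. It assumes a POSIX system, where rename() atomically replaces an existing target, and it assumes the temporary file lives on the same filesystem as the target.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write `text` to `tmp_path`, then rename it over
 * `path`. Readers that open `path` see either the old contents or the new
 * contents, never a partially written file. Returns 0 on success, -1 on
 * failure. */
int replace_file(const char *path, const char *tmp_path, const char *text)
{
    FILE *tmp;

    assert(path != NULL && tmp_path != NULL && text != NULL);

    tmp = fopen(tmp_path, "w");
    if (tmp == NULL)
        return -1;

    if (fputs(text, tmp) == EOF) {
        fclose(tmp);
        remove(tmp_path);
        return -1;
    }

    /* fclose() flushes the buffer; a failure here means the data may not
     * have been written, so the rename must not happen. */
    if (fclose(tmp) != 0) {
        remove(tmp_path);
        return -1;
    }

    /* Assumption: POSIX semantics, where rename() replaces an existing
     * target in a single atomic step. */
    if (rename(tmp_path, path) != 0) {
        remove(tmp_path);
        return -1;
    }
    return 0;
}
```

A reader that already has the old file open keeps reading the old contents after the rename, which is why it still has to close and reopen to pick up the new data.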

Related

Reading a string from a file with C. Fopen with w+ mode is not working

I made a C program that reads a string from a .txt file, then it encrypts the string, and finally it writes the string in the same file.
The thing is that if I use fopen("D:\\Prueba.txt","w+"), the program doesn't work, it prints garbage like this )PHI N.
I've debugged and I know the error is there in that line, because if I use fopen("D:\\Prueba.txt","r+"), the program works, and it writes what it should.
But I want to use w+ because it will rewrite what the .txt file had. Why is w+ not working?
If you're opening with w+ to first read the content, that's not going to work. From C11:
w+: truncate to zero length or create text file for update.
What's probably happening is that you read data from the now empty file but don't correctly check that it worked. That would explain the weird "content" you see of )PHI N.
One solution is to open the file with r, open another file with w, and transfer the contents, encrypting them as part of that process. Then close both, delete the original, and rename the new one to the original name. This will allow you to process arbitrarily-sized files since you process them a bit at a time.
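The temporary-file approach could be sketched like this. The function names transform_file and rot13 are hypothetical, and ROT13 is only a stand-in for whatever encryption the program really uses. The files are opened in binary mode (rb/wb), an assumption made here so that the bytes pass through unchanged on every platform.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for the real cipher (an assumption for illustration): ROT13. */
static int rot13(int c)
{
    if (c >= 'a' && c <= 'z')
        return 'a' + (c - 'a' + 13) % 26;
    if (c >= 'A' && c <= 'Z')
        return 'A' + (c - 'A' + 13) % 26;
    return c;
}

/* Hypothetical helper: transform `path` through `tmp_path` one chunk at a
 * time, then delete the original and rename the new file into its place.
 * Returns 0 on success, -1 on failure. */
int transform_file(const char *path, const char *tmp_path)
{
    unsigned char buf[4096];
    size_t n, i;
    int ok = 1;
    FILE *in, *out;

    assert(path != NULL && tmp_path != NULL);

    in = fopen(path, "rb");
    if (in == NULL)
        return -1;
    out = fopen(tmp_path, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        for (i = 0; i < n; i++)
            buf[i] = (unsigned char)rot13(buf[i]);
        if (fwrite(buf, 1, n, out) != n) {
            ok = 0;
            break;
        }
    }
    if (ferror(in))
        ok = 0;
    fclose(in);
    if (fclose(out) != 0)
        ok = 0;

    if (!ok) {
        remove(tmp_path);   /* leave the original untouched */
        return -1;
    }

    /* If the rename fails after the delete, the transformed data is still
     * available in tmp_path. */
    if (remove(path) != 0 || rename(tmp_path, path) != 0)
        return -1;
    return 0;
}
```

Only one 4 KiB buffer is held in memory at a time, so the file size does not matter. On POSIX systems the remove() call could be dropped, since rename() replaces an existing target there.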
If you don't want to use a temporary file, and you're sure you can store the entire content in memory, you could open it with r+, read the content, then reopen it with a new mode, such as with:
FILE *readFh = fopen("myfile.txt", "r+");
// Read in content, massage as needed.
FILE *writeFh = freopen(NULL, "w+", readFh);
// Provided that worked, you should now have an empty file to write to.
// Write back your massaged data.

Any reason to reopen as "write-append" after "read-only"?

I have a save file containing a stream of program events. The program may read the file and execute the events to restore a previous state (say between program invocations). After that any new events are appended to this file.
I could open the file once as read-write (fopen r+), not exposing the usage pattern.
But I wonder if there are any benefits of opening it as read-only at first (fopen r) and later re-opening it as append (freopen a). Would there be any apparent difference?
In your case there may not be any specific benefits, but the primary use of freopen is to change the file associated with a standard text stream (stdin, stdout, stderr). It may affect the readability of your code if you use it on normal files. In your case you first open in read-only mode, but if you are opening the stream as output there are a few things about freopen that we need to keep in mind.
On Linux, freopen may also fail and set errno to EBUSY when the kernel structure for the old file descriptor was not initialized completely before freopen was called.
freopen should not be used on output streams because it ignores errors while closing the old file descriptor.
Read about freopen and possible error conditions with fclose in GNU manual: https://www.gnu.org/software/libc/manual/html_node/Opening-Streams.html#Opening-Streams
No, there are no specific benefits to opening the file read-only and then reopening it in append mode. If you need to change a file during program execution, it is better to open it in the appropriate mode from the start.

Using rename to safely overwrite a shared file in Linux

Here is the setup: I have a shared file (let's call it status.csv) that is read by many processes (let's call them consumers) in a read-only fashion. I have one producer that periodically updates status.csv by creating a temp file, writing data to it and using the C function discussed here:
http://www.gnu.org/software/libc/manual/html_node/Renaming-Files.html
to rename the temp file (effectively overwrite) to status.csv so that the consumers can process the new data. I want to try and guarantee (as much as possible in the Linux world) that the consumers won't get a malformed/corrupted/half-old/half-new status.csv file (I want them to get either all of the old data or all of the new). I can't seem to guarantee this by reading the description of rename: it seems to guarantee that the rename action itself is atomic, but I want to know whether a consumer that already has the status.csv file open will continue to read the same file as it was when it was opened, even if the file is renamed/overwritten by the producer in the middle of this reading operation.
I attempted to prototype this, thinking that the consumers would get some type of error or a half-old/half-new file, but the file always seems to be in the state it was in when the consumer opened it, even if it is renamed/overwritten multiple times.
BTW, these processes are running on the same machine (RHEL 6).
Thanks!
In Linux and similar systems, if a process has a file open and the file is deleted, the file itself remains undeleted until all processes close it. All that happens immediately is that the directory entry is deleted so that it cannot be opened again.
The same thing happens if rename is used to replace an open file. The old file descriptor still keeps the old file open. However, new opens will see the new file.
Therefore, for your consumers to see the new file, they must close and reopen the file.
Note: your consumers can discover if the file has been replaced by using the stat(2) call. If either the st_dev or st_ino entries (or both) have changed, then the file has been replaced and must be closed and reopened. This is how tail -F works.
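A consumer could run that check with a helper along these lines. This is a sketch assuming a POSIX system; the function name file_was_replaced is hypothetical. It compares the file behind the open stream (via fstat) with whatever the path currently names (via stat).

```c
#define _POSIX_C_SOURCE 200809L  /* for fileno() under strict -std modes */

#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper. Returns 1 if `path` now names a different file than
 * the one `fp` has open, 0 if it is still the same file, and -1 if either
 * stat call fails (for example because `path` no longer exists). */
int file_was_replaced(FILE *fp, const char *path)
{
    struct stat open_st, path_st;

    assert(fp != NULL && path != NULL);

    if (fstat(fileno(fp), &open_st) != 0)   /* the file we hold open */
        return -1;
    if (stat(path, &path_st) != 0)          /* the file the name points at */
        return -1;

    return open_st.st_dev != path_st.st_dev ||
           open_st.st_ino != path_st.st_ino;
}
```

When the helper returns 1, the consumer would fclose() its stream and fopen() the path again. A return of -1 for a missing path can be treated as "try again later", since the producer may be about to recreate the file.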

C Programming fopen() while opening a file

I've been wondering about this one. Most books I've read show that when you open a file and find that it does not exist, you should report an error saying there's no such file and then exit the program...
FILE *stream = NULL;
stream = fopen("student.txt", "rt");
if (stream == NULL) {
    printf("Cannot open input file\n");
    exit(1);
} else {
    printf("\nReading the student list directory. Wait a moment please...");
}
But I thought, instead of doing that, why not automatically create a new file when the one you are opening does not exist, even if the program will not write to it this time (but will use it next time)? I'm not sure if this is efficient or not. I'm new here and have no programming experience whatsoever, so I'm asking for your opinion: what are the advantages and disadvantages of creating a file when trying to open it, instead of exiting as the examples in the books usually do?
FILE *stream = NULL;
stream = fopen("student.txt", "rt");
if (stream == NULL) {
    stream = fopen("student.txt", "wt");
} else {
    printf("\nReading the student list directory. Wait a moment please...");
}
Your opinion will be highly appreciated. Thank you.
From your example, it seems to be an input file; if it doesn't exist, there is no point in creating it.
For example, if the program is supposed to open a file and then count how many vowels are in it, I don't see much sense in creating the file if it doesn't exist.
my $0.02 worth.
Argument mode:
``r'' Open text file for reading.
``r+'' Open for reading and writing.
``w'' Truncate file to zero length or create text file for writing.
``w+'' Open for reading and writing. The file is created if it does not
exist, otherwise it is truncated.
``a'' Open for writing. The file is created if it does not exist.
``a+'' Open for reading and writing. The file is created if it does not
exist.
Your question is a simple case. Read the description above: when you call fopen(), you should decide which mode to use. Consider why a file is not created for "r" and "r+", and why a file is truncated for "w" and "w+", and so on. All of these are reasonable designs.
If your program expects a file to exist and it doesn't, then creating one yourself doesn't make much sense, since it's going to be empty.
If OTOH, your program is OK with a file not existing and knows how to populate one from scratch, then it's perfectly fine to do so.
Either is fine as long as it makes sense for your program. Don't worry about efficiency here -- it's negligible. Worry about correctness first.
You may not have permission to create/write to a file in the directory that the user chooses. You will have to handle that error condition.
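For a program that is fine with a missing file, the points above can be combined into something like the following sketch. The helper name open_or_create and its created out-parameter are hypothetical. It relies on fopen() setting errno, which POSIX guarantees but ISO C does not, and on the C11 exclusive mode "w+x", which fails instead of truncating if the file appears between the two calls.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical helper: open `path` for reading; if, and only if, it does
 * not exist, create it empty. `*created` is set to 1 when a new file was
 * made. Returns NULL on any other failure, such as missing permissions. */
FILE *open_or_create(const char *path, int *created)
{
    FILE *fp;

    assert(path != NULL && created != NULL);
    *created = 0;

    fp = fopen(path, "r");
    if (fp != NULL)
        return fp;

    /* Assumption: POSIX behavior, where fopen() sets errno on failure.
     * Anything other than "no such file" (for example EACCES) is a real
     * error that creating a file will not fix. */
    if (errno != ENOENT)
        return NULL;

    /* "x" (C11) makes the open fail if the file exists by now, so an
     * existing file is never truncated by accident. */
    fp = fopen(path, "w+x");
    if (fp != NULL)
        *created = 1;
    return fp;
}
```

The caller still has to handle a NULL return, for instance by printing the message from strerror(errno) and exiting, just as in the textbook example.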

Detecting file deletion after fopen

I'm working on code that detects changes in a file (a log file) and then processes the changes with the help of fseek and ftell. But if the file gets deleted and recreated (by logrotate), the program stalls without dying, because it no longer detects changes (even after the file is recreated). Neither fseek nor ftell reports an error.
How can I detect the file deletion? Maybe by reopening the file with another FILE * variable and comparing file descriptors, but how can I do that?
When a file gets deleted, it is not necessarily erased from your disk. In your case the program still has a handle to the old file. The old file handle will not get you any information about its deletion or replacement with another file.
An easy way to detect file deletion and recreation is using stat(2) and fstat(2). They give you a struct stat which contains the inode for the file. When a file is recreated (and still open) the files (old open and recreated) are different and thus the inodes are different. The inode field is st_ino. Yes, you need to poll this unless you wish to use Linux-features like inotify.
You can periodically close the file and open it again; that way you will open the newly created one. A file is actually deleted only once no directory entry and no open handle refers to it (an open file descriptor is a handle), and you are still holding the old file.
On Windows, you could set callbacks on modifications of the file system. Here are details: http://msdn.microsoft.com/en-us/library/aa365261(VS.85).aspx
