Platform is Ubuntu Linux on ARM.
I want to write a string to a file, but I want every time to truncate the file and then write the string, i.e. no append.
I have this code:
f=fopen("/home/user1/refresh.txt","w");
fputs("some string",f);
fflush(f);
ftruncate(fileno(f),(off_t)0);
fclose(f);
If I run it and then check the file, it will be of zero length and when opened, there will be nothing in it.
If I remove the fflush call, it will NOT be 0 (will be 11) and when I open it there will be "some string" in it.
Is this the normal behavior?
I do not have a problem calling fflush, but I want to do this in a loop and calling fflush may increase the execution time considerably.
You should not really mix file handle and file descriptor calls like that.
What's almost certainly happening without the fflush is that "some string" is still waiting in the file handle's buffer for delivery to the file descriptor. You then truncate the file through the descriptor and fclose the file handle, which flushes the string, so it shows up in the file.
With the fflush, some string is sent to the file descriptor and then you truncate it. With no further flushing, the file stays truncated.
If you want to literally "truncate the file then write", then it's sufficient to:
f=fopen("/home/user1/refresh.txt","w");
fputs("some string",f);
fclose(f);
Opening the file in the mode w will truncate it (as opposed to mode a which is for appending to the end).
Also calling fclose will flush the output buffer so no data gets lost.
POSIX requires you to take specific actions (which ensure that no ugly side effects of buffering make your program go haywire) when switching between using a FILE stream and a file descriptor to access the same open file. This is described in XSH 2.5.1 Interaction of File Descriptors and Standard I/O Streams.
In your case, I believe it should suffice to just call fflush before ftruncate, like you're doing. Omitting this step, per the rules of 2.5.1, results in undefined behavior.
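For the loop case, one possible shape is to keep a single stream open and, on each pass, rewind, write, flush, and then cut the file at the current offset. This is only a sketch under POSIX assumptions; rewrite_contents is a hypothetical helper name, not something from the question.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: replace the whole content of an already-open
 * stream with `text`. Returns 0 on success, -1 on failure. */
int rewrite_contents(FILE *f, const char *text)
{
    off_t end;

    rewind(f);                          /* start writing at offset 0 */
    if (fputs(text, f) == EOF)
        return -1;
    if (fflush(f) == EOF)               /* hand buffered bytes to the fd first */
        return -1;
    end = ftello(f);                    /* offset just past the new text */
    if (end == (off_t)-1)
        return -1;
    return ftruncate(fileno(f), end);   /* cut off anything left from before */
}
```

The fflush is still there, but it comes before the ftruncate and the truncation happens after the write, so a shorter string never leaves the tail of a longer one behind.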
{
FILE* f1 = fopen("C:\\num1.bin", "wb+");//it will create a new file
int A[] = { 1,3,6,28 }; //int arr
fwrite(A, sizeof(A), 1, f1); //should insert the A array to the file
}
I do see the file but even after the fwrite, the file remains empty (0 bytes), does anyone know why?
You need to close the file with fclose
Otherwise the write buffer will not (necessarily) force the file contents to be written to disk
A couple of things:
As @Grantly correctly noted above, you are missing a call to fclose or fflush after writing to the file. Without one of these, any cached or pending writes will not necessarily be written to the open file.
You do not check the return value of fopen. If fopen fails for any reason it will return a NULL pointer and not a valid file pointer. Since you're writing directly to the root of the drive C:\ on a Windows platform, that's something you definitely do want to be checking for (not that you shouldn't in other cases too, but run under a regular user account that location is often write protected).
The result of fwrite is not required to appear in the file immediately after it returns. File operations are usually buffered: writes are cached and flushed later to improve performance.
The content of the file will be updated after you call fclose:
fclose()
(...) Any unwritten buffered data are flushed to the OS. Any unread buffered
data are discarded.
You may also explicitly flush the internal buffer without closing the file using fflush:
fflush()
For output streams (and for update streams on which the last operation
was output), writes any unwritten data from the stream's buffer to the
associated output device.
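Putting the points from these answers together, a checked version of the write could look roughly like the sketch below. write_ints is a hypothetical helper name, and the path is left to the caller.

```c
#include <stdio.h>

/* Hypothetical helper: write `count` ints to `path`, checking every step.
 * Returns 0 on success, -1 on failure. */
int write_ints(const char *path, const int *values, size_t count)
{
    FILE *f = fopen(path, "wb");
    if (f == NULL) {                    /* e.g. the location is write protected */
        perror(path);
        return -1;
    }
    if (fwrite(values, sizeof values[0], count, f) != count) {
        perror("fwrite");
        fclose(f);
        return -1;
    }
    /* fclose flushes the buffer; a failure here means data may be missing. */
    if (fclose(f) == EOF) {
        perror("fclose");
        return -1;
    }
    return 0;
}
```

It would be called as write_ints(path, A, sizeof A / sizeof A[0]) with the array A from the question.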
The scenario:
process(A) opens file with r+ mode.
process(B) opens the same file with r+ mode.
process(A) writes some data into it, and fflush().
process(A) notifies process(B) to read the data.
process(B) reads the data. <---- here is the problem.
Got some unexpected bytes( 0000 0000 ...) at the head part, and the left bytes are correct.
PS: The data size is about 16k, and I write/read it with one fwrite()/fread() call.
I also did a test, that is, if process(B) calls fflush() before reading the data, the result is correct.
My question is,
What is the correct way to make sure process(B) always be able to
get the updated data?
As process(A) has called fflush() already, why process(B) also needs
to fflush() before reading the data?
Each stream must be flushed to ensure the stream is ready for I/O. When you open your stream in process(A), you will use something similar to:
FILE *fpA = fopen (filename, "r+");
and then in process(B) you do something similar:
FILE *fpB = fopen (filename, "r+");
Both fpA and fpB are separate data streams for filename, each with its own buffer. Flushing only process(A)'s stream has no effect on what is in process(B)'s stream and vice versa. So the correct way is to ensure each stream is flushed and ready for additional I/O.
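To see what the writer's fflush() does on its own, the sketch below compares the size the OS reports before and after the flush. It assumes POSIX stat(); size_on_disk and flush_demo are hypothetical names, and size_on_disk stands in for "what another process would see". It says nothing about stale data already sitting in the reader's own buffer, which is the part the reader has to deal with.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/stat.h>

/* Size of `path` as the operating system currently sees it, or -1. */
long size_on_disk(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 ? (long)st.st_size : -1L;
}

/* Writes `len` bytes through a fully buffered stream and records the
 * size the OS reports before and after fflush(). Returns 0 on success. */
int flush_demo(const char *path, const char *data, size_t len,
               long *before, long *after)
{
    FILE *fp = fopen(path, "w+");
    if (fp == NULL)
        return -1;
    setvbuf(fp, NULL, _IOFBF, BUFSIZ);  /* ask for full buffering */
    if (fwrite(data, 1, len, fp) != len) {
        fclose(fp);
        return -1;
    }
    *before = size_on_disk(path);       /* small writes are typically still buffered */
    if (fflush(fp) == EOF) {
        fclose(fp);
        return -1;
    }
    *after = size_on_disk(path);        /* the data has now reached the OS */
    return fclose(fp) == EOF ? -1 : 0;
}
```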
I know ANSI C defines fopen, fwrite, fread, and fclose to modify a file's content. However, when it comes to truncating a file, we have to turn to OS-specific functions, e.g. truncate() on Linux or _chsize_s() on Windows. But before we can call those OS-specific functions, we have to obtain the file handle from the FILE pointer by calling fileno, which is also a non-ANSI-C function.
My question is: Is it reliable to continue using FILE* after truncating the file? I mean, the ANSI C FILE layer has its own buffer and does not know the file has been truncated from beneath it. If the buffered bytes lie beyond the truncation point, will the buffered content be flushed to the file when doing fclose()?
If there is no guarantee, what is the best practice for using file I/O functions together with a truncate operation when writing a Windows-Linux portable program?
Similar question: When querying file size from a file-handle returned by fileno , is it the accurate size when I later call fclose() -- without further fwrite()?
[EDIT 2012-12-11]
Following Joshua's suggestion, I conclude that the best current practice is: set the stream to unbuffered mode by calling setbuf(stream, NULL); then truncate() or _chsize_s() can work peacefully with the stream.
Anyhow, no official document seems to explicitly confirm this behavior, whether Microsoft CRT or GNU glibc.
The POSIX way....
ftruncate() is what you're looking for, and it's been in POSIX base specifications since 2001, so it should be in every modern POSIX-compatible system by now.
Note that ftruncate() operates on a POSIX file descriptor (despite its potentially misleading name), not a STDIO stream FILE handle. Note also that mixing operations on the STDIO stream and on the underlying OS calls which operate on the file descriptor for the open stream can confuse the internal runtime state of the STDIO library.
So, to use ftruncate() safely with STDIO it may be necessary to first flush any STDIO buffers (with fflush()) if your program may have already written to the stream in question. This will avoid STDIO trying to flush the otherwise unwritten buffer to the file after the truncation has been done.
You can then use fileno() on the STDIO stream's FILE handle to find the underlying file descriptor for the open STDIO stream, and you would then use that file descriptor with ftruncate(). You might consider putting the call to fileno() right in the parameter list for the ftruncate() call so that you don't keep the file descriptor around and accidentally use it yet other ways which might further confuse the internal state of STDIO. Perhaps like this (say to truncate a file to the current STDIO stream offset):
/*
* NOTE: fflush() is not needed here if there have been no calls to fseek() since
* the last fwrite(), assuming it extended the length of the stream --
* ftello() will account for any unwritten buffers
*/
if (ftruncate(fileno(stdout), ftello(stdout)) == -1) {
    fprintf(stderr, "%s: ftruncate(stdout) failed: %s\n", argv[0], strerror(errno));
    exit(1);
}
/* fseek() is not necessary here since we truncated at the current offset */
Note also that the POSIX definition of ftruncate() says "The value of the seek pointer shall not be modified by a call to ftruncate()", so this means you may also need to use fseek() to set the STDIO layer (and thus indirectly the file descriptor) either to the new end of the file, or perhaps back to the beginning of the file, or somewhere still within the boundaries of the file, as desired. (Note that the fseek() should not be necessary if the truncation point is found using ftello().)
You should not have to make the STDIO stream unbuffered if you follow the procedure above, though of course doing so could be an alternative to using fflush() (but not fseek()).
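For truncating to an arbitrary length rather than the current offset, the whole procedure (flush, truncate through the descriptor, reposition the stream) might be wrapped up as in the sketch below. truncate_stream is a hypothetical helper name, and positioning at the new end of the file is just one of the choices mentioned above.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: truncate the file behind `fp` to `length` bytes and
 * leave the stream positioned at the new end. Returns 0 or -1. */
int truncate_stream(FILE *fp, off_t length)
{
    if (fflush(fp) == EOF)                    /* empty the stdio buffer first */
        return -1;
    if (ftruncate(fileno(fp), length) == -1)  /* does not move the offset */
        return -1;
    return fseeko(fp, 0, SEEK_END);           /* resync the stream with the new end */
}
```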
Without POSIX....
If you need to stick to strict ISO Standard C, say C99, then you have no portable way to truncate a file to a given length other than zero (0) length. The latest draft of C11 that I have says this in Section 7.21.3 (paragraph 2):
Binary files are not truncated, except as defined in 7.21.5.3. Whether a write on a text stream causes the associated file to be truncated beyond that point is implementation-defined.
(and 7.21.5.3 describes the flags to fopen() which allow a file to be truncated to a length of zero)
The caveat about text files is there because on silly systems that have both text and binary files (as opposed to just plain POSIX-style content agnostic files) then it is often possible to write a value to the file which will be stored in the file at the position written and which will be treated as an EOF indicator when the file is next read.
Other types of systems may have different underlying file I/O interfaces that are not compatible with POSIX while still providing a compatible ISO C STDIO library. In theory if such a system offers something similar to fileno() and ftruncate() then a similar procedure could be used with them as well, provided that one took the same care to avoid confusing the internal runtime state of the STDIO library.
With regard to querying file size....
You also asked whether the file size found by querying the file descriptor returned by fileno() would be an accurate representation of the file size after a successful call to fclose(), even without any further calls to fwrite().
The answer is: Don't do that!
As I mentioned above, the POSIX file descriptor for a file opened as a STDIO stream must be used very carefully if you don't want to confuse the internal runtime state of the STDIO library. We can add here that it is important not to confuse yourself with it either.
The most correct way to find the current size of a file opened as a STDIO stream is to seek to the end of it and then ask where the stream pointer is by using only STDIO functions.
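A minimal sketch of that seek-and-ask approach follows. stream_size is a hypothetical helper name. It assumes a system where ftell() on the stream counts bytes (true on POSIX; ISO C alone does not promise it for text streams, nor a meaningful SEEK_END for binary streams).

```c
#include <stdio.h>

/* Hypothetical helper: size in bytes of the file behind `fp`, found with
 * stdio calls only. Restores the original position. Returns -1 on error. */
long stream_size(FILE *fp)
{
    long saved = ftell(fp);
    long size;

    if (saved == -1L)
        return -1L;
    if (fseek(fp, 0L, SEEK_END) != 0)
        return -1L;
    size = ftell(fp);                   /* accounts for data still in the buffer */
    if (fseek(fp, saved, SEEK_SET) != 0)
        return -1L;
    return size;
}
```

Because everything goes through the stream, unwritten buffered data is included in the answer, which is exactly what a query on the raw file descriptor could miss.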
Isn't an unbuffered write of zero bytes supposed to truncate the file at that point?
See this question for how to set unbuffered: Unbuffered I/O in ANSI C
In C, the rewind() call starts the next write at the front of the file.
As I understand it, when I call fprintf(), it will write to the end of the string I am trying to write and no further. If the existing file has data past the end of the string I am writing, this is not overwritten.
Is there a way to change this behavior so that rewind() can be used to effectively perform a quick overwrite of the entire file?
If you want to truncate the file to zero length, reopen it in write mode. If you want to truncate it while it's open, ftruncate should do it in POSIX systems. There are, I believe, _chsize and _ftruncate in Windows, which are similar.
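The "reopen it in write mode" route can stay within standard C by using freopen(). A rough sketch, with overwrite_by_reopen as a hypothetical helper name and the path passed in by the caller:

```c
#include <stdio.h>

/* Hypothetical helper: discard the old content of `path` and write `text`
 * by reopening the stream in "w" mode. Returns the reopened stream, or
 * NULL on failure, in which case the stream has been closed. */
FILE *overwrite_by_reopen(FILE *fp, const char *path, const char *text)
{
    fp = freopen(path, "w", fp);        /* "w" truncates to zero length */
    if (fp == NULL)
        return NULL;
    if (fputs(text, fp) == EOF || fflush(fp) == EOF) {
        fclose(fp);
        return NULL;
    }
    return fp;
}
```

This costs a close and an open per overwrite, so whether it counts as "quick" depends on how often it runs.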
I'm doing a small project in C after quite a long time away from it. It happens to include some file handling. I noticed in various documentation that there are functions which return FILE * handles and others which return (small integer) descriptors. Both sets of functions offer the same basic services I need, so it really does not matter which I use.
But I'm curious about the collective wisdom: is it better to use fopen() and friends, or open() and friends?
Edit Since someone mentioned buffered vs unbuffered and accessing devices, I should add that one part of this small project will be writing a userspace filesystem driver under FUSE. So the file level access could as easily be on a device (e.g. a CDROM or a SCSI drive) as on a "file" (i.e. an image).
It is better to use open() if you are sticking to unix-like systems and you might like to:
Have more fine-grained control over unix permission bits on file creation.
Use the lower-level functions such as read/write/mmap as opposed to the C buffered stream I/O functions.
Use file descriptor (fd) based IO scheduling (poll, select, etc.) You can of course obtain an fd from a FILE * using fileno(), but care must be taken not to mix FILE * based stream functions with fd based functions.
Open any special device (not a regular file)
It is better to use fopen/fread/fwrite for maximum portability, as these are standard C functions, the functions I've mentioned above aren't.
The objection that "fopen" is portable and "open" isn't is bogus.
fopen is part of libc, open is a POSIX system call.
Each is as portable as the place they come from.
I/O to fopen'ed files is buffered by libc (you must assume it may be, and for practical purposes it is). File descriptors that are open()'ed are not buffered by libc (they may well be, and usually are, buffered in the filesystem, but not everything you open() is a file on a filesystem).
What's the point of fopen'ing, for example, a device node like /dev/sg0, say, or /dev/tty0... What are you going to do? You're going to do an ioctl on a FILE *? Good luck with that.
Maybe you want to open with some flags like O_DIRECT -- makes no sense with fopen().
fopen works at a higher level than open ....
fopen returns you a pointer to FILE stream which is similar to the stream abstraction that you read in C++
open returns you a file descriptor for the file opened ... It does not provide you a stream abstraction and you are responsible for handling the bits and bytes yourself ... This is at a lower level as compared to fopen
Stdio streams are buffered, while open() file descriptors are not. Depends on what you need. You can also create one from the other:
int fileno (FILE * stream) returns the file descriptor for a FILE *, FILE * fdopen(int fildes, const char * mode) creates a FILE * from a file descriptor.
Be careful when intermixing buffered and non-buffered IO, since you'll lose what's in your buffer when you don't flush it with fflush().
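As an illustration of creating one from the other, the sketch below uses open() to get control over the permission bits and then fdopen() to get the stdio conveniences on top. It assumes a POSIX system; open_private_stream is a hypothetical helper name.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create or truncate `path` with owner-only
 * permissions, then wrap the descriptor in a stdio stream. */
FILE *open_private_stream(const char *path)
{
    FILE *fp;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);

    if (fd == -1)
        return NULL;
    fp = fdopen(fd, "w");               /* mode must be compatible with the open() flags */
    if (fp == NULL)
        close(fd);                      /* fdopen failed, so the fd is still ours */
    return fp;                          /* on success, fclose(fp) also closes fd */
}
```

After the fdopen() call, all further I/O should go through the stream, so the buffering caveat above does not come into play.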
Yes. When you need a low-level handle.
On UNIX operating systems, you can generally exchange file handles and sockets.
Also, low-level handles make for better ABI compatibility than FILE pointers.
read() & write() use unbuffered I/O. (fd: integer file descriptor)
fread() & fwrite() use buffered I/O. (FILE* structure pointer)
Binary data written to a pipe with write() may not be readable with fread(), because of byte alignment, variable sizes, etc. It's a crap-shoot.
Most low-level device driver code uses unbuffered I/O calls.
Most application level I/O uses buffered.
Use of the FILE* and its associated functions
is OK on a machine-by-machine basis: but portability is lost
on other architectures in the reading and writing of binary data.
fwrite() is buffered I/O and can lead to unreliable results if
written for a 64 bit architecture and run on a 32bit; or (Windows/Linux).
Most OSs have compatibility macros within their own code to prevent this.
For low-level binary I/O portability read() and write() guarantee
the same binary reads and writes when compiled on differing architectures.
The basic thing is to pick one way or the other and be consistent about it,
throughout the binary suite.
<stdio.h> // mostly FILE* some fd input/output parameters for compatibility
// gives you a lot of helper functions -->
List of Functions
Function Description
───────────────────────────────────────────────────────────────────
clearerr check and reset stream status
fclose close a stream
fdopen stream open functions // (fd argument, returns FILE*)
feof check and reset stream status
ferror check and reset stream status
fflush flush a stream
fgetc get next character or word from input stream
fgetpos reposition a stream
fgets get a line from a stream
fileno get file descriptor // (FILE* argument, returns fd)
fopen stream open functions
fprintf formatted output conversion
fpurge flush a stream
fputc output a character or word to a stream
fputs output a line to a stream
fread binary stream input/output
freopen stream open functions
fscanf input format conversion
fseek reposition a stream
fsetpos reposition a stream
ftell reposition a stream
fwrite binary stream input/output
getc get next character or word from input stream
getchar get next character or word from input stream
gets get a line from a stream
getw get next character or word from input stream
mktemp make temporary filename (unique)
perror system error messages
printf formatted output conversion
putc output a character or word to a stream
putchar output a character or word to a stream
puts output a line to a stream
putw output a character or word to a stream
remove remove directory entry
rewind reposition a stream
scanf input format conversion
setbuf stream buffering operations
setbuffer stream buffering operations
setlinebuf stream buffering operations
setvbuf stream buffering operations
sprintf formatted output conversion
sscanf input format conversion
strerror system error messages
sys_errlist system error messages
sys_nerr system error messages
tempnam temporary file routines
tmpfile temporary file routines
tmpnam temporary file routines
ungetc un-get character from input stream
vfprintf formatted output conversion
vfscanf input format conversion
vprintf formatted output conversion
vscanf input format conversion
vsprintf formatted output conversion
vsscanf input format conversion
So for basic use I would personally use the above without mixing idioms too much.
By contrast,
<unistd.h>      write()
                lseek()
                close()
                pipe()
<sys/types.h>
<sys/stat.h>
<fcntl.h>       open()
                creat()
                fcntl()
all use file descriptors.
These provide fine-grained control over reading and writing bytes
(recommended for special devices and fifos (pipes) ).
So again, use what you need, but keep consistent in your idioms and interfaces.
If most of your code base uses one mode , use that too, unless there is
a real reason not to. Both sets of I/O library functions are extremely reliable
and used millions of times a day.
note-- If you are interfacing C I/O with another language,
(perl, python, java, c#, lua ...) check out what the developers of those languages
recommend before you write your C code and save yourself some trouble.
Usually, you should favor using the standard library (fopen). However, there are occasions where you will need to use open directly.
One example that comes to mind is working around a bug in an older version of Solaris which made fopen fail after 256 files were open. This was because they erroneously used an unsigned char for the fd field in their struct FILE implementation instead of an int. But this was a very specific case.
fopen and its cousins are buffered. open, read, and write are not buffered. Your application may or may not care.
fprintf and scanf have a richer API that allows you to read and write formatted text files. read and write use fundamental arrays of bytes. Conversions and formatting must be hand crafted.
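To make the "hand crafted" part concrete, here is roughly what one line of fprintf(fp, "%s=%d\n", name, value) turns into when only a file descriptor is available. This is a sketch assuming POSIX write(); write_pair_fd is a hypothetical helper name, and it does not retry on EINTR.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write "name=value\n" to a file descriptor.
 * Formatting is done by hand with snprintf(); short writes are retried.
 * Returns 0 on success, -1 on failure. */
int write_pair_fd(int fd, const char *name, int value)
{
    char buf[128];
    int len = snprintf(buf, sizeof buf, "%s=%d\n", name, value);
    size_t done = 0;

    if (len < 0 || (size_t)len >= sizeof buf)
        return -1;                      /* formatting error or output too long */
    while (done < (size_t)len) {
        ssize_t n = write(fd, buf + done, (size_t)len - done);
        if (n < 0)
            return -1;
        done += (size_t)n;
    }
    return 0;
}
```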
The difference between file descriptors and (FILE *) is really inconsequential.
Randy