I'm trying to parse some code which works with O_DIRECT files.
ssize_t written = write(fd, buf, size);
What is confusing is that size can be lower than the sector size of the disk, thus does write(fd,buf,size) write the entirety of buf to fd or only the first size bytes of buf to disk?
Without O_DIRECT this is simply the second case, but I can't find any documentation covering the O_DIRECT case. From what I've read it will still send buf to the disk, so the only thing I can think of is that it also tells the disk to write only size bytes...
[...] does write(fd,buf,size) write the entirety of buf to fd or only the first size bytes of buf to disk?
If the write() call succeeds and returns size, it means all of the requested data has been written, but the question becomes: written to where? You have to remember that opening a file with O_DIRECT is more of a hint that you want to bypass OS caches than an order. The filesystem could choose to simply write your I/O through the page cache, either because that's what it always does or because you broke the rules regarding alignment and using the page cache is a way of quietly fixing up your mistake. The only way to know would be to investigate the data path taken when the I/O was issued.
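To make the alignment rules mentioned above concrete, here is a minimal sketch for Linux of padding a short payload up to an aligned size before writing it with O_DIRECT. The helper names round_up and direct_write_padded are hypothetical, and the alignment value is an assumption the caller has to supply (commonly 512 or 4096, ideally queried from the device rather than hard-coded).

```c
#define _GNU_SOURCE          /* exposes O_DIRECT on Linux/glibc */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Round n up to the next multiple of align (align must be non-zero). */
size_t round_up(size_t n, size_t align)
{
    assert(align != 0);
    return (n + align - 1) / align * align;
}

/* Hypothetical helper: write `size` bytes of `data` at offset 0 of `path`
 * with O_DIRECT, zero-padded to a multiple of `align`.
 * `align` is assumed to be a power of two such as 512 or 4096.
 * Returns the number of bytes submitted (padded), or -1 on error. */
ssize_t direct_write_padded(const char *path, const void *data,
                            size_t size, size_t align)
{
    size_t padded = round_up(size, align);
    void *buf = NULL;

    /* O_DIRECT generally needs the buffer address aligned as well. */
    if (posix_memalign(&buf, align, padded) != 0)
        return -1;
    memset(buf, 0, padded);
    memcpy(buf, data, size);

    int fd = open(path, O_WRONLY | O_CREAT | O_DIRECT, 0644);
    if (fd < 0) {            /* e.g. the filesystem rejects O_DIRECT */
        free(buf);
        return -1;
    }
    ssize_t written = write(fd, buf, padded);
    close(fd);
    free(buf);
    return written;
}
```

With this approach the file ends up padded to the aligned length; if the exact length matters, it could be trimmed afterwards with ftruncate().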
Related
When a FILE is opened with fopen(), a buffer is associated with it for reading from and writing to the file. This is done to avoid accessing the disk directly, because that is costly.
Some online tutorials say that when a file is loaded into main memory (RAM), four things are created: stdin, stdout, stderr, and a buffer, and that this buffer is used to read from and write to the file. I am curious how much memory is allocated for this buffer. Does it depend on the OS architecture? Is there any way to find out its size?
The default buffer size is the macro constant BUFSIZ, defined in stdio.h. The value is implementation dependent. You may use setvbuf() to change the buffering mode (full/line/no buffering) and the buffer size.
Reference:
http://en.cppreference.com/w/c/io
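As a small illustration of the answer above, the following sketch opens a file and replaces the default BUFSIZ-sized buffer with a caller-supplied one. The function name open_with_buffer is hypothetical; setvbuf() and _IOFBF are standard C.

```c
#include <assert.h>
#include <stdio.h>

/* Open `path` for writing with a caller-chosen, fully buffered stdio buffer.
 * `buf` must stay valid until the stream is closed.
 * Returns NULL on failure. */
FILE *open_with_buffer(const char *path, char *buf, size_t buf_size)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return NULL;
    /* setvbuf must be called before any other operation on the stream. */
    if (setvbuf(fp, buf, _IOFBF, buf_size) != 0) {
        fclose(fp);
        return NULL;
    }
    return fp;
}
```

Passing _IOLBF or _IONBF instead of _IOFBF selects line buffering or no buffering. Printing BUFSIZ with printf("%d\n", BUFSIZ) shows the default size on a given implementation.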
I keep on reading that fread() and fwrite() are buffered library calls. In case of fwrite(), I understood that once we write to the file, it won't be written to the hard disk, it will fill the internal buffer and once the buffer is full, it will call write() system call to write the data actually to the file.
But I am not able to understand how this buffering works in case of fread(). Does buffered in case of fread() mean, once we call fread(), it will read more data than we originally asked and that extra data will be stored in buffer (so that when 2nd fread() occurs, it can directly give it from buffer instead of going to hard disk)?
I also have the following questions.
If fread() works as I mention above, then will first fread() call read the data that is equal to the size of the internal buffer? If that is the case, if my fread() call ask for more bytes than internal buffer size, what will happen?
If fread() works as I mention above, that means at least one read() system call to the kernel will happen for sure in the case of fread(). But in the case of fwrite(), if we only call fwrite() once during the program's execution, we can't say for sure that the write() system call will be called. Is my understanding correct?
Will the internal buffer be maintained by OS?
Does fclose() flush the internal buffer?
There is buffering or caching at many different levels in a modern system. This might be typical:
C standard library
OS kernel
disk controller (esp. if using hardware RAID)
disk drive
When you use fread(), it may request 8 KB or so from the kernel even if you asked for less. The extra data is stored in user space, so the next sequential read needs no system call or context switch.
The kernel may read ahead also; there are library functions to give it hints on how to do this for your particular application. The OS cache could be gigabytes in size since it uses main memory.
The disk controller may read ahead too, and could have a cache size up to hundreds of megabytes on smallish systems. It can't do as much in terms of read-ahead, because it doesn't know where the next logical block is for the current file (indeed it doesn't even know what file it is reading).
Finally, the disk drive itself has a cache, perhaps 16 MB or so. Like the controller, it doesn't know what file it is reading. For many years one disk sector was 512 bytes, but multi-terabyte disks now commonly use 4 KB sectors.
When you call fclose(), it will probably deallocate the user-space buffer, but not the others.
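One of the hinting functions mentioned above is posix_fadvise(). The sketch below is a minimal example, assuming a POSIX system; the wrapper name open_sequential is hypothetical, and the hint is purely advisory, so the kernel is free to ignore it.

```c
#define _POSIX_C_SOURCE 200112L   /* for posix_fadvise */
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Open `path` read-only and tell the kernel that the file will be read
 * sequentially, so it may read ahead more aggressively.
 * A failure of posix_fadvise is ignored because the hint is optional.
 * Returns the descriptor, or -1 if open fails. */
int open_sequential(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    /* offset 0 with length 0 means "from the start to the end of the file" */
    (void)posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
    return fd;
}
```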
Your understanding is correct. And any buffered fwrite data will be flushed when the FILE* is closed. The buffered I/O is mostly transparent for I/O on regular files.
But for terminals and other character devices you may care. Another instance where buffered I/O may be an issue is if you read from the file that one process is writing to from another process -- a common example is if a program writes text to a log file during operation, and the user runs a command like tail -f program.log to watch the content of the log file live. If the writing process has buffering enabled and it doesn't explicitly flush the log file, it will make it difficult to monitor the log file.
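For the log file case described above, a common remedy is to flush after each line. This is a minimal sketch; the function name log_line is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Append one line to a log stream and flush it, so that readers in other
 * processes (for example `tail -f`) can see it right away.
 * Returns 0 on success, -1 on failure. */
int log_line(FILE *log, const char *msg)
{
    if (fprintf(log, "%s\n", msg) < 0)
        return -1;
    /* Push the stdio buffer to the kernel with write(). */
    return fflush(log) == 0 ? 0 : -1;
}
```

An alternative with a similar effect for text logs is to switch the stream to line buffering once with setvbuf(log, NULL, _IOLBF, 0) right after opening it. Note that fflush() only hands the data to the kernel; it does not force it onto the disk.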
Consider the following pseudo-code snippet to read a file from its end
while (1) {
    seek(fd, offset, SEEK_END)
    read(fd, buf, n)
    // process the buffer, break on EOF...
    offset -= n
}
Now n can vary between 1 byte and let's say 1kB.
How big would the impact on the file system be for very small values of n? Is this compensated by file system buffering for the most part, or should I always read larger chunks at once?
The answer depends on your operating system. Most modern OSes use a multiple of the system page size for file buffers. As such, 4 KB (the most common page size) is likely to be the minimum unit the disk cache holds. The bigger problem is that your code is making a lot of redundant system calls, which are expensive. If you are concerned about performance, consider either buffering the data you think you need in big chunks and then referencing that data directly from your buffer, or calling mmap() if your system supports it and accessing the mapped file directly.
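The "big chunks" suggestion above could look roughly like the following sketch, which scans a file backwards using one pread() per 64 KB chunk instead of one seek/read pair per byte. The function name find_last_byte and the CHUNK value are assumptions chosen for illustration.

```c
#define _XOPEN_SOURCE 700   /* for pread */
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define CHUNK 65536  /* assumed chunk size; a multiple of the page size */

/* Scan `path` backwards in CHUNK-sized reads and return the offset of the
 * last occurrence of byte `c`, or -1 if it is absent or an error occurs.
 * Not reentrant, because it uses a static buffer. */
off_t find_last_byte(const char *path, unsigned char c)
{
    static unsigned char buf[CHUNK];
    off_t result = -1;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    off_t end = lseek(fd, 0, SEEK_END);      /* file size, or -1 on error */
    while (end > 0 && result < 0) {
        size_t want = end < CHUNK ? (size_t)end : CHUNK;
        off_t start = end - (off_t)want;
        if (pread(fd, buf, want, start) != (ssize_t)want)
            break;                           /* short read treated as error */
        for (size_t i = want; i-- > 0; ) {   /* walk the chunk back to front */
            if (buf[i] == c) {
                result = start + (off_t)i;
                break;
            }
        }
        end = start;
    }
    close(fd);
    return result;
}
```

For example, find_last_byte("program.log", '\n') would locate the final newline of a log file with only a handful of system calls, however many bytes have to be inspected.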
I am using fopen/fread/fwrite/fseek on Linux with gcc. Is it necessary to allocate a memory buffer and use fread to read data sequentially into the buffer before using the data?
When you use fread or the other file I/O functions in the C standard library, data is buffered in several places.
Your application allocates a buffer which gets passed to fread. fread copies data into your buffer, and then you can do what you want with it. You are responsible for allocation/deallocation of this buffer.
The C library will usually create a buffer for every FILE* you have open. Data is read into this buffer in large chunks. This allows fread to satisfy many small requests without having to make a large number of system calls, which are expensive. This is what people mean when they say fread is buffered.
The kernel will also buffer files that are being read in the disk cache. This reduces the time needed for the read system call, since if data is already in memory, your program won't have to wait while the kernel fetches it from the disk. The kernel will hold on to recently read files, and it may read ahead for files which are being accessed sequentially.
The C library buffer is allocated automatically when you open a file and freed when you close the file. You don't have to manage it at all.
The kernel disk cache is stored in physical memory that isn't being used for anything else. Again, you don't have to manage this. The memory will be freed as soon as it's needed for something else.
You must pass a buffer (created by your code, malloced or local) to fread so that it can pass the data it reads back to you. I don't know exactly what you mean by saying "fread is buffered". Most C library calls operate in this fashion: they will not return their internal storage (buffer or otherwise) to you, and if they do, they provide corresponding free/release functions.
Refer to http://pubs.opengroup.org/onlinepubs/000095399/functions/fread.html, which also has a very basic example.
With fread, yes, you have to allocate memory in your process, and the call will copy the data into your buffer.
In some specialised cases, you can handle data without copying it into userspace. See the sendfile system call, which copies data from one file descriptor to another directly. This can be used to transfer data from a file to a network socket without excessive copying.
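As a rough, Linux-specific sketch of the sendfile() idea above, the following copies one regular file to another without passing the data through a user-space buffer. The function name copy_with_sendfile is hypothetical; the same call pattern applies when the output descriptor is a socket.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <unistd.h>

/* Copy all of `src` to `dst` inside the kernel.
 * Returns 0 on success, -1 on error. */
int copy_with_sendfile(const char *src, const char *dst)
{
    int in = open(src, O_RDONLY);
    if (in < 0)
        return -1;

    struct stat st;
    int out = -1, rc = -1;
    if (fstat(in, &st) == 0 &&
        (out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644)) >= 0) {
        off_t left = st.st_size;
        rc = 0;
        while (left > 0) {
            /* NULL offset: sendfile advances the file position of `in` */
            ssize_t n = sendfile(out, in, NULL, (size_t)left);
            if (n <= 0) {       /* error, or the source shrank meanwhile */
                rc = -1;
                break;
            }
            left -= n;
        }
        close(out);
    }
    close(in);
    return rc;
}
```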
My application (C program) opens two file handles to the same file (one in write and one in read mode). Two separate threads in the app read from and write to the file. This works fine.
Since my app runs on an embedded device with a limited RAM disk size, I would like the write FileHandle to wrap to the beginning of the file on reaching the maximum size, and the read FileHandle to follow, like a circular buffer. I understand from answers to this question that this should work. However, as soon as I fseek the write FileHandle to the beginning of the file, fread returns an error. Will the EOF get reset on doing fseek to the beginning of the file? If so, which function should be used to set the write file position to 0 without causing EOF to be reset?
EDIT/UPDATE:
I tried couple of things:
Based on #neodelphi's suggestion I used pipes, and this works. However, my use case requires that I write to a file: I receive multiple channels of live video surveillance streams that need to be stored to hard disk and also read back, decoded, and displayed on a monitor.
Thanks to #Clement's suggestion to use ftell, I fixed a couple of bugs in my code and the wrap now works for the reader. However, the data read appears to be stale: writes are still buffered, so the reader reads old content from the hard disk. I can't avoid buffering due to performance considerations (I get 32 Mbps of live data that needs to be written to hard disk). I have tried things like flushing writes only in the interval between the write wrapping and the read wrapping, and truncating the file (ftruncate) after the read wraps, but this doesn't solve the stale data problem.
I am trying to use two files in ping-pong fashion to see if this solves the issue, but I want to know if there is a better solution.
You should have something like this:
// Write
if (ftell(WriteHandle) > BUFFER_MAX) rewind(WriteHandle);
fwrite(/* ... */, WriteHandle); // the stream is the last argument of fwrite
// Read (assuming binary)
readSize = fread(buffer, 1, READ_CHUNK_SIZE, ReadHandle);
if (readSize != READ_CHUNK_SIZE) {
    // Short read: wrap to the start of the file and read the remainder
    rewind(ReadHandle);
    if (fread(buffer + readSize, 1, READ_CHUNK_SIZE - readSize, ReadHandle) != READ_CHUNK_SIZE - readSize)
        ; // ERROR !
}
Not tested, but it gives an idea. The write should also handle the case where BUFFER_MAX is not a multiple of WRITE_CHUNK_SIZE.
Also, you should read only when you are sure that the data has already been written. But I guess you already do that.
You could mmap the file into your virtual memory and then just create a normal circular buffer with the pointer returned.
int fd = open(path, O_RDWR);
volatile void * mem = mmap(NULL, max_size, PROT_WRITE, MAP_SHARED, fd, 0);
// mmap returns MAP_FAILED (not NULL) on error; check it before using mem
volatile char * c_mem = (volatile char *)mem;
c_mem[index % max_size] = 'a'; // This line will now write to the offset index % max_size in the file
// Now doing
The permissions can probably also be stricter, depending on the exact case.
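A slightly fuller sketch of the mmap approach, with error handling, might look like this. The names ring_map and ring_put are hypothetical, and the choices to create and size the file with ftruncate() and to map it with PROT_READ | PROT_WRITE are assumptions made so that the same mapping can serve both the reader and the writer.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map `size` bytes of `path` read/write, creating and sizing the file
 * if needed. Returns NULL on failure. */
unsigned char *ring_map(const char *path, size_t size)
{
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return NULL;
    void *mem = MAP_FAILED;
    /* The file must be at least `size` bytes long before it is touched. */
    if (ftruncate(fd, (off_t)size) == 0)
        mem = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping stays valid after close */
    return mem == MAP_FAILED ? NULL : mem;
}

/* Store one byte at logical position `index`, wrapping at `size`. */
void ring_put(unsigned char *ring, size_t size, size_t index, unsigned char b)
{
    ring[index % size] = b;
}
```

When the buffer is no longer needed, munmap(ring, size) releases it. If reader and writer run in different threads, they still need their own synchronisation around the read and write indices.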