Why use mmap over fread?

Why/when is it better to use mmap(), as opposed to fread()'ing from a filestream in chunks into a byte array?
uint8_t my_buffer[MY_BUFFER_SIZE];
size_t bytes_read;

bytes_read = fread(my_buffer, 1, sizeof(my_buffer), input_file);
if (MY_BUFFER_SIZE != bytes_read) {
    fprintf(stderr, "File read failed: %s\n", filepath);
    exit(1);
}

There are advantages to mapping a file instead of reading it as a stream:
If you intend to perform random access to different widely-spaced areas of the file, mapping might mean that only the pages you access need to be actually read, while keeping your code simple.
If multiple applications are going to be accessing the same file, mapping it means that it will only be read into memory once, as opposed to the situation where each application loads [part of] the file into its own private buffers.
If the file doesn't fit in memory or would take a large chunk of memory, mapping it can supply the illusion that it fits and simplify your program logic, while letting the operating system decide how to manage rotating bits of the file in and out of physical memory.
If the file contents change, you MAY get to see the new contents automatically. (This can be a dubious advantage.)
There are disadvantages to mapping the file:
If you only need sequential access to the file, or it is small, or you only need a small portion of it, the overhead of setting up a memory mapping and then incurring page faults to actually read the contents can make mapping less efficient than simply reading the file.
If there is an I/O error reading the file, your application will most likely be killed on the spot instead of receiving a system call error to which it can react gracefully. (Technically you can catch the SIGBUS raised in the mmap case, but recovering properly from that kind of thing is not easy.)
If you are not using a 64-bit architecture and the file is very large, there might not be enough address space to map it.
mmap() is less portable than read() (or fread(), as you suggest).
mmap() will only work on regular files (on some filesystems) and some block devices.
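For comparison, here is a minimal sketch (POSIX; assumes a non-empty regular file, error handling abbreviated) of accessing a file through a read-only mapping rather than fread():
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char **argv) {
    if (argc != 2) {
        fprintf(stderr, "usage: %s FILE\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) != 0) { perror("fstat"); return 1; }
    if (st.st_size == 0) { fprintf(stderr, "empty file\n"); return 1; }

    // Map the whole file; pages are read lazily as they are touched.
    const uint8_t *data = mmap(NULL, (size_t)st.st_size, PROT_READ,
                               MAP_PRIVATE, fd, 0);
    if (data == MAP_FAILED) { perror("mmap"); return 1; }

    // data[0] .. data[st.st_size - 1] now behaves like a byte array.
    printf("first byte: 0x%02x\n", data[0]);

    munmap((void *)data, (size_t)st.st_size);
    close(fd);
    return 0;
}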

Related

Reading a file of arbitrary length in C

What's the most idiomatic/efficient way to read a file of arbitrary length in C?
Get the filesize of the file in bytes and issue a single fread()
Keep fread()ing a constant size buffer until getting EOF
Anything else?
Avoid using any technique which requires knowing the size of the file in advance. That leaves exactly one technique: read the file a bit at a time, in blocks of a convenient size.
Here's why you don't want to try to find the filesize in advance:
If it is not a regular file, there may not be any way to tell. For example, you might be reading directly from a console, or taking piped input from a previous data generator. If your program requires the filesize to be knowable, these useful input mechanisms will not be available to your users, who will complain or choose a different tool.
Even if you can figure out the filesize, you have no way of preventing it from changing while you are reading the file. If you are not careful about how you read the file, you might open a vulnerability which could be exploited by adversarial programs.
For example, if you allocate a buffer of the "correct" size and then read until you get an end-of-file condition, you may end up overwriting random memory. (Multiple reads may be necessary if you use an interface like read() which might read less data than requested.) Or you might find that the file has been truncated; if you don't check the amount of data read, you might end up processing uninitialised memory leading to information leakage.
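A sketch of that one technique, reading fixed-size blocks into a geometrically growing buffer (the read_all name and the 64 KB block size are illustrative choices):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Read all of `fp` into a malloc'd buffer without asking for the file
// size in advance. Stores the number of bytes read in *out_len and
// returns the buffer, or NULL on error.
static unsigned char *read_all(FILE *fp, size_t *out_len) {
    unsigned char block[65536];   // read in 64 KB blocks
    unsigned char *buf = NULL;
    size_t cap = 0, len = 0;
    size_t n;

    while ((n = fread(block, 1, sizeof(block), fp)) > 0) {
        if (len + n > cap) {
            // Grow the buffer geometrically to keep realloc calls rare.
            size_t newcap = cap ? cap * 2 : sizeof(block);
            while (newcap < len + n)
                newcap *= 2;
            unsigned char *p = realloc(buf, newcap);
            if (p == NULL) { free(buf); return NULL; }
            buf = p;
            cap = newcap;
        }
        memcpy(buf + len, block, n);
        len += n;
    }
    if (ferror(fp)) { free(buf); return NULL; }   // I/O error, not EOF

    *out_len = len;
    return buf;
}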
In practice, you usually don't need to keep the entire file content in memory. You'll often parse the file (notably if it is textual), or at least read the file in smaller pieces, and for that you don't need it entirely in memory. For a textual file, reading it line-by-line (perhaps with some state inside your parser) is often enough (using fgets or getline).
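For instance, a minimal sketch of the line-by-line approach with POSIX getline, here simply echoing standard input:
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    char *line = NULL;    // getline allocates and grows this buffer
    size_t cap = 0;
    ssize_t len;

    // Only one line is in memory at a time, however big the input is.
    while ((len = getline(&line, &cap, stdin)) != -1) {
        printf("read %zd bytes: %s", len, line);
    }

    free(line);
    return 0;
}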
Files exist (notably on disks or SSDs) because they can usually be much bigger than your computer's RAM; indeed, files were invented (more than 50 years ago) to deal with data larger than memory. Distributed file systems can also be very big (and accessed remotely, even from a laptop, e.g. by NFS, CIFS, etc.).
Some file systems are capable of storing petabytes of data (on supercomputers), with individual files of many terabytes (much larger than available RAM).
You're also likely to use databases at some point. These routinely hold terabytes of data. See also this answer (about the realistic size of sqlite databases).
If you really want to read a file entirely in memory using stdio (but you should avoid doing that, because you generally want your program to be able to handle a lot of data on files; so reading the entire file in memory is generally a design error), you indeed could loop on fread (or fscanf, or even fgetc) till end-of-file. Notice that feof is useful only after some input operation.
On current laptop or desktop computers, you could prefer (for efficiency) to use buffers of a few megabytes, and you certainly can deal with big files of several hundreds of gigabytes (much larger than your RAM).
On POSIX file systems, you might do memory mapped IO with e.g. mmap(2) - but that might not be faster than read(2) with large buffers (of a few megabytes). You could use readahead(2) (Linux specific) and posix_fadvise(2) (or madvise(2) if using mmap) to tune performance by hinting your OS kernel.
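As a rough sketch of such hinting, you might advise the kernel of a sequential scan before a plain read loop (error handling abbreviated; the 4 MB buffer size is just one reasonable choice):
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv) {
    if (argc != 2) { fprintf(stderr, "usage: %s FILE\n", argv[0]); return 1; }

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    // Hint that the whole file (offset 0, length 0 = to the end) will be
    // read sequentially, so the kernel can prefetch more aggressively.
    posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);

    size_t bufsz = 4 * 1024 * 1024;   // a few megabytes, as suggested above
    char *buf = malloc(bufsz);
    if (buf == NULL) { perror("malloc"); return 1; }

    ssize_t n;
    long long total = 0;
    while ((n = read(fd, buf, bufsz)) > 0)
        total += n;   // ... process buf[0..n) here ...
    if (n < 0) perror("read");

    printf("read %lld bytes\n", total);
    free(buf);
    close(fd);
    return 0;
}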
If you have to code for Microsoft Windows, you could study its WinAPI and find some way to do memory mapped IO.
In practice, file data (notably if it was accessed recently) often stays in the page cache, which is of paramount importance for performance. When that is not the case, your hardware (disk, controller, ...) becomes the bottleneck and your program becomes I/O bound (and in that case, no software trick could significantly improve the performance).

Is it possible to read a file without loading it into memory?

I want to read a file but it is too big to load it completely into memory.
Is there a way to read it without loading it into memory? Or there is a better solution?
I want to read a file but it is too big to load it completely into memory.
Be aware that, in practice, files are an abstraction (so somehow an illusion) provided by your operating system through file systems. Read Operating Systems: Three Easy Pieces (freely downloadable) to learn more about OSes. Files can be quite big (even if most of them are small), e.g. many dozens of gigabytes on current laptops or desktops (and many terabytes on servers, perhaps more).
You don't define what is memory, and the C11 standard n1570 uses that word in a different way, speaking of memory locations in §3.14, and of memory management functions in §7.22.3...
In practice, a process has its virtual address space, related to virtual memory.
On many operating systems -notably Linux and POSIX- you can change the virtual address space with mmap(2) and related system calls, and you could use memory-mapped files.
Is there a way to read it without loading it into memory?
Of course, you can read and write partial chunks of a file (e.g. using fread, fwrite, fseek, or the lower-level system calls read(2), write(2), lseek(2), ...). For performance reasons, it is better to use large buffers (several kilobytes at least). In practice, most checksums (and cryptographic hash functions) can be computed chunkwise, on a very long stream of data.
Many libraries are built above such primitives (doing direct I/O by chunks). For example, the sqlite database library is able to handle database files of many terabytes (more than the available RAM). And you could use an RDBMS (they are software coded in C or C++).
So of course you can deal with files larger than available RAM and read or write them by chunks (or "records"), and this has been true since at least the 1960s. I would even say that intuitively, files can (usually) be much larger than RAM, but smaller than a single disk (however, even this is not always true; some file systems are able to span several physical disks, e.g. using LVM techniques).
(On my Linux desktop with 32 Gbytes of RAM, the largest file is 69 Gbytes, on an ext4 filesystem with 669 G available out of 780 G total space, and I have had files above 100 Gbytes in the past.)
You might find it worthwhile to use a database like sqlite (or be a client of some RDBMS like PostgreSQL, etc.), or you could be interested in libraries for indexed files like gdbm. Of course you can also do direct I/O operations (e.g. fseek then fread or fwrite, or lseek then read or write, or pread(2) or pwrite ...).
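For instance, a sketch of fetching one fixed-size record from the middle of a file with pread(2); the record layout and file name here are purely illustrative:
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

// A hypothetical fixed-size record; the layout is purely illustrative.
struct record {
    char name[56];   // assumed NUL-terminated
    long value;
};

int main(void) {
    int fd = open("records.dat", O_RDONLY);   // hypothetical file name
    if (fd < 0) { perror("open"); return 1; }

    // Fetch record #1000 directly, without reading the rest of the file
    // and without moving the file offset (unlike lseek followed by read).
    struct record r;
    off_t where = 1000 * (off_t)sizeof r;
    if (pread(fd, &r, sizeof r, where) != (ssize_t)sizeof r) {
        perror("pread");
        return 1;
    }

    printf("%s = %ld\n", r.name, r.value);
    close(fd);
    return 0;
}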
I need the content to do a checksum, so I need the complete message
Many checksum libraries support incremental updates to the checksum. For example, GLib has g_checksum_update(). So you can read the file a block at a time with fread and update the checksum as you read.
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <stdlib.h>
#include <glib.h>

int main(void) {
    char filename[] = "test.txt";

    // Create a SHA256 checksum
    GChecksum *sum = g_checksum_new(G_CHECKSUM_SHA256);
    if (sum == NULL) {
        fprintf(stderr, "Could not create checksum.\n");
        exit(1);
    }

    // Open the file we'll be checksumming.
    FILE *fp = fopen(filename, "rb");
    if (fp == NULL) {
        fprintf(stderr, "Could not open %s: %s.\n", filename, strerror(errno));
        exit(1);
    }

    // Read one buffer full at a time (BUFSIZ is from stdio.h)
    // and update the checksum.
    unsigned char buf[BUFSIZ];
    size_t size_read;
    while ((size_read = fread(buf, 1, sizeof(buf), fp)) != 0) {
        g_checksum_update(sum, buf, (gssize)size_read);
    }
    if (ferror(fp)) {
        fprintf(stderr, "Error reading %s: %s.\n", filename, strerror(errno));
        exit(1);
    }

    // Print the checksum and clean up.
    printf("%s %s\n", g_checksum_get_string(sum), filename);
    g_checksum_free(sum);
    fclose(fp);

    return 0;
}
And we can check it works by comparing the result with sha256sum.
$ ./test
0c46af5bce717d706cc44e8c60dde57dbc13ad8106a8e056122a39175e2caef8 test.txt
$ sha256sum test.txt
0c46af5bce717d706cc44e8c60dde57dbc13ad8106a8e056122a39175e2caef8 test.txt
One way to do this, if the problem is RAM, not virtual address space, is memory mapping the file, either via mmap on POSIX systems, or CreateFileMapping/MapViewOfFile on Windows.
That can get you what looks like a raw array of the file bytes, but with the OS responsible for paging the contents in (and writing them back to disk if you alter them) as you go. When mapped read-only, it's quite similar to just malloc-ing a block of memory and fread-ing to populate it, but:
It's lazy: For a 1 GB file, you're not waiting the 5-30 seconds for the whole thing to be read in before you can work with any part of it, instead, you just pay for each page on access (and sometimes, the OS will pre-read in the background, so you don't even have to wait on the per-page load in)
It responds better under memory pressure; if you run out of memory, the OS can just drop clean pages from memory without writing them to swap, knowing it can page them back in from the golden copy in the file whenever they're needed; with malloc-ed memory, it has to write it out to swap, increasing disk traffic at a time when you're likely oversubscribed on the disk already
Performance-wise, this can be slightly slower under default settings: without memory pressure, reading the whole file in mostly guarantees it will be in memory when asked for, while random access to a memory-mapped file is likely to trigger on-demand page faults to populate each page on first access. However, you can use posix_madvise with POSIX_MADV_WILLNEED (POSIX systems) or PrefetchVirtualMemory (Windows 8 and higher) to provide a hint that the entire file will be needed, causing the system to (usually) page it in in the background, even as you're accessing it. On POSIX systems, other advice values can be used for more granular hinting when paging the whole file in at once isn't necessary (or possible); e.g. POSIX_MADV_SEQUENTIAL, if you're reading the file data in order from beginning to end, usually triggers more aggressive prefetch of subsequent pages, increasing the odds that they're in memory by the time you get to them. By doing so, you get the best of both worlds: you can begin accessing the data almost immediately, with a delay on accessing pages not paged in yet, but the OS will be pre-loading the pages for you in the background, so you eventually run at full speed (while still being more resilient to memory pressure, since the OS can simply drop clean pages rather than writing them to swap first).
The main limitation here is virtual address space. If you're on a 32 bit system, you're likely limited to (depending on how fragmented the existing address space is) 1-3 GB of contiguous address space, which means you'd have to map the file in chunks, and can't have on-demand random access to any point in the file at any time without additional system calls. Thankfully, on 64 bit systems, this limitation rarely comes up; even the most limiting 64 bit systems (Windows 7) provide 8 TB of user virtual address space per process, far larger than the vast, vast majority of files you're likely to encounter (and later versions increase the cap to 128 TB).
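A minimal sketch of that WILLNEED hint on a POSIX system (error handling abbreviated):
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char **argv) {
    if (argc != 2) { fprintf(stderr, "usage: %s FILE\n", argv[0]); return 1; }

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) != 0) { perror("fstat"); return 1; }
    if (st.st_size == 0) { fprintf(stderr, "empty file\n"); return 1; }

    void *base = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (base == MAP_FAILED) { perror("mmap"); return 1; }

    // Hint that the whole mapping will be needed soon: the kernel can
    // start paging it in asynchronously while we begin working on it.
    posix_madvise(base, (size_t)st.st_size, POSIX_MADV_WILLNEED);

    // ... start accessing the bytes at `base` immediately; later pages
    // are likely to be resident by the time we reach them ...

    munmap(base, (size_t)st.st_size);
    close(fd);
    return 0;
}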

If mmap is faster than legacy file accessing, where we see the time saving?

I understand the usage of mmap. A simple read or write operation on a file involves opening the file, allocating a buffer, and reading into it (which requires a context switch); the data is then available to the user in the buffer, and changes in the buffer are not reflected in the file unless it is written back explicitly.
Instead, if we use mmap, writing directly to the buffer is nothing but writing into the file.
The question:
The file is on the hard disk and mmapped into the process. Each time I write into the mmapped memory, is it written directly to the file? In that case, does it require no context switch, because the changes are made directly in the file itself? If mmap is faster than legacy file access, where do we see the time saving?
Kindly explain; correct me if I am wrong.
Updates to the file are not immediately visible on the disk, but become visible after an unmap or after an msync call. Hence there is no system call during the updates, and the kernel is not involved. However, since the file is read lazily, page by page, as needed, the OS may need to read in portions of the file as you cross page boundaries. The most obvious advantage of memory mapping is that it eliminates kernel-space to user-space data copies. There is also no need for system calls to seek to a specific position in a file.
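A sketch of the behaviour described above: map a file read/write with MAP_SHARED, update it with plain memory stores, then flush explicitly with msync (the file name is hypothetical; assumes an existing, non-empty file):
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void) {
    int fd = open("data.bin", O_RDWR);   // hypothetical existing file
    struct stat st;
    if (fd < 0 || fstat(fd, &st) != 0 || st.st_size == 0) {
        perror("open/fstat");
        return 1;
    }

    char *p = mmap(NULL, (size_t)st.st_size, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);   // MAP_SHARED: writes go to the file
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    // Plain stores into the mapping: no system call, no kernel involvement.
    // The pages are merely marked dirty.
    memcpy(p, "hello", 5);

    // Force the dirty pages out to the file now, instead of waiting for
    // munmap or the kernel's own writeback.
    if (msync(p, (size_t)st.st_size, MS_SYNC) != 0) { perror("msync"); return 1; }

    munmap(p, (size_t)st.st_size);
    close(fd);
    return 0;
}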

What's the proper buffer size for the 'write' function?

I am using the low-level I/O function 'write' to write some data to disk in my code (C language on Linux). First, I accumulate the data in a memory buffer, and then I use 'write' to write the data to disk when the buffer is full. So what is the best buffer size for 'write'? According to my tests, bigger is not always faster, so I am here to look for the answer.
There is probably some advantage in doing writes which are multiples of the filesystem block size, especially if you are updating a file in place. If you write less than a partial block to a file, the OS has to read the old block, combine in the new contents and then write it out. This doesn't necessarily happen if you rapidly write small pieces in sequence because the updates will be done on buffers in memory which are flushed later. Still, once in a while you could be triggering some inefficiency if you are not filling a block (and a properly aligned one: multiple of block size at an offset which is a multiple of the block size) with each write operation.
This issue of transfer size does not necessarily go away with mmap. If you map a file, and then memcpy some data into the map, you are making a page dirty. That page has to be flushed at some later time: it is indeterminate when. If you make another memcpy which touches the same page, that page could be clean now and you're making it dirty again. So it gets written twice. Page-aligned copies in multiples of the page size are the way to go.
You'll want it to be a multiple of the CPU page size, in order to use memory as efficiently as possible.
But ideally you want to use mmap instead, so that you never have to deal with buffers yourself.
You could use BUFSIZ defined in <stdio.h>
Otherwise, use a small multiple of the page size sysconf(_SC_PAGESIZE) (e.g. twice that value). Most Linux systems have 4Kbytes pages (which is often the same as or a small multiple of the filesystem block size).
As others replied, using the mmap(2) system call could help. GNU systems (e.g. Linux) have an extension: the mode string (the second argument) of fopen may contain the letter m, and when that happens, the GNU libc tries to mmap.
If you deal with data nearly as large as your RAM (or half of it), you might want to also use madvise(2) to fine-tune performance of mmap.
See also this answer to a question quite similar to yours. (You could use 64Kbytes as a reasonable buffer size).
The "best" size depends a great deal on the underlying file system.
The stat and fstat calls fill in a data structure, struct stat, that includes the following field:
blksize_t st_blksize; /* blocksize for file system I/O */
The OS is responsible for filling this field with a "good size" for write() blocks. However, it's also important to call write() with memory that is "well aligned" (e.g., the result of malloc calls). The easiest way to get this to happen is to use the provided <stdio.h> stream interface (with FILE * objects).
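For example, a small helper along these lines (the io_size_for name and the 64 KB fallback are arbitrary choices):
#include <stdio.h>
#include <sys/stat.h>

// Pick an I/O size for `fd`: the filesystem's preferred block size from
// fstat, or a 64 KB fallback if it reports nothing useful.
static size_t io_size_for(int fd) {
    struct stat st;
    if (fstat(fd, &st) == 0 && st.st_blksize > 0)
        return (size_t)st.st_blksize;
    return 64 * 1024;
}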
Using mmap, as in other answers here, can also be very fast for many cases. Note that it's not well suited to some kinds of streams (e.g., sockets and pipes) though.
It depends on the amount of RAM, VM, etc. as well as the amount of data being written. The more general answer is to benchmark what buffer works best for the load you're dealing with, and use what works the best.

When should I use mmap for file access?

POSIX environments provide at least two ways of accessing files. There's the standard system calls open(), read(), write(), and friends, but there's also the option of using mmap() to map the file into virtual memory.
When is it preferable to use one over the other? What're their individual advantages that merit including two interfaces?
mmap is great if you have multiple processes accessing data in a read only fashion from the same file, which is common in the kind of server systems I write. mmap allows all those processes to share the same physical memory pages, saving a lot of memory.
mmap also allows the operating system to optimize paging operations. For example, consider two programs: program A, which reads a 1 MB file into a buffer created with malloc, and program B, which mmaps the 1 MB file into memory. If the operating system has to swap part of A's memory out, it must write the contents of the buffer to swap before it can reuse the memory. In B's case, any unmodified mmap'd pages can be reused immediately because the OS knows how to restore them from the existing file they were mmap'd from. (The OS can detect which pages are unmodified by initially marking writable mmap'd pages as read only and catching seg faults, similar to a copy-on-write strategy.)
mmap is also useful for inter process communication. You can mmap a file as read / write in the processes that need to communicate and then use synchronization primitives in the mmap'd region (this is what the MAP_HASSEMAPHORE flag is for).
One place mmap can be awkward is if you need to work with very large files on a 32 bit machine. This is because mmap has to find a contiguous block of addresses in your process's address space that is large enough to fit the entire range of the file being mapped. This can become a problem if your address space becomes fragmented, where you might have 2 GB of address space free, but no individual range of it can fit a 1 GB file mapping. In this case you may have to map the file in smaller chunks than you would like to make it fit.
Another potential awkwardness with mmap as a replacement for read / write is that you have to start your mapping on offsets of the page size. If you just want to get some data at offset X you will need to fixup that offset so it's compatible with mmap.
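A sketch of that fixup, rounding the requested offset down to a page boundary and compensating in the returned pointer (the map_at helper name is illustrative; assumes the page size is a power of two, error handling abbreviated):
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

// Map `length` bytes starting at an arbitrary file offset `offset`,
// which mmap itself would reject unless page-aligned. Returns a pointer
// to the requested bytes; *map_base and *map_len are what munmap needs.
static void *map_at(int fd, off_t offset, size_t length,
                    void **map_base, size_t *map_len) {
    long page = sysconf(_SC_PAGESIZE);
    off_t aligned = offset & ~((off_t)page - 1);  // round down to a page boundary
    size_t delta = (size_t)(offset - aligned);

    void *base = mmap(NULL, length + delta, PROT_READ, MAP_PRIVATE, fd, aligned);
    if (base == MAP_FAILED)
        return NULL;

    *map_base = base;
    *map_len = length + delta;
    return (char *)base + delta;   // the bytes the caller actually asked for
}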
And finally, read / write are the only way you can work with some types of files. mmap can't be used on things like pipes and ttys.
One area where I found mmap() to not be an advantage was when reading small files (under 16K). The overhead of page faulting to read the whole file was very high compared with just doing a single read() system call. This is because the kernel can sometimes satisfy a read entirely in your time slice, meaning your code doesn't switch away. With a page fault, it seemed more likely that another program would be scheduled, making the file operation have a higher latency.
mmap has the advantage when you have random access on big files. Another advantage is that you access it with memory operations (memcpy, pointer arithmetic), without bothering with the buffering. Normal I/O can sometimes be quite difficult when using buffers when you have structures bigger than your buffer. The code to handle that is often difficult to get right, mmap is generally easier. This said, there are certain traps when working with mmap.
As people have already mentioned, mmap is quite costly to set up, so it is worth using only for a given size (varying from machine to machine).
For pure sequential accesses to the file, it is also not always the better solution, though an appropriate call to madvise can mitigate the problem.
You have to be careful with the alignment restrictions of your architecture (SPARC, Itanium); with read/write I/O the buffers are often properly aligned and do not trap when dereferencing a casted pointer.
You also have to be careful not to access outside of the map. That can easily happen if you use string functions on your map and your file does not contain a \0 at the end. It will work most of the time, when your file size is not a multiple of the page size, because the last page is filled with zeros (the mapped area is always a multiple of your page size in length).
In addition to other nice answers, a quote from Linux System Programming, written by Google's expert Robert Love:
Advantages of mmap()
Manipulating files via mmap() has a handful of advantages over the standard read() and write() system calls. Among them are:
Reading from and writing to a memory-mapped file avoids the extraneous copy that occurs when using the read() or write() system calls, where the data must be copied to and from a user-space buffer.
Aside from any potential page faults, reading from and writing to a memory-mapped file does not incur any system call or context switch overhead. It is as simple as accessing memory.
When multiple processes map the same object into memory, the data is shared among all the processes. Read-only and shared writable mappings are shared in their entirety; private writable mappings have their not-yet-COW (copy-on-write) pages shared.
Seeking around the mapping involves trivial pointer manipulations. There is no need for the lseek() system call.
For these reasons, mmap() is a smart choice for many applications.
Disadvantages of mmap()
There are a few points to keep in mind when using mmap():
Memory mappings are always an integer number of pages in size. Thus, the difference between the size of the backing file and an integer number of pages is "wasted" as slack space. For small files, a significant percentage of the mapping may be wasted. For example, with 4 KB pages, a 7 byte mapping wastes 4,089 bytes.
The memory mappings must fit into the process' address space. With a 32-bit address space, a very large number of various-sized mappings can result in fragmentation of the address space, making it hard to find large free contiguous regions. This problem, of course, is much less apparent with a 64-bit address space.
There is overhead in creating and maintaining the memory mappings and associated data structures inside the kernel. This overhead is generally obviated by the elimination of the double copy mentioned in the previous section, particularly for larger and frequently accessed files.
For these reasons, the benefits of mmap() are most greatly realized when the mapped file is large (and thus any wasted space is a small percentage of the total mapping), or when the total size of the mapped file is evenly divisible by the page size (and thus there is no wasted space).
Memory mapping has the potential for a huge speed advantage compared to traditional I/O. It lets the operating system read the data from the source file as the pages in the memory-mapped file are touched. This works through faulting pages, which the OS detects, upon which it loads the corresponding data from the file automatically.
This works the same way as the paging mechanism and is usually optimized for high-speed I/O by reading data on system page boundaries and in page sizes (usually 4K), a size for which most file system caches are optimized.
An advantage that isn't listed yet is the ability of mmap() to keep a read-only mapping as clean pages. If one allocates a buffer in the process's address space, then uses read() to fill the buffer from a file, the memory pages corresponding to that buffer are now dirty since they have been written to.
Dirty pages cannot be dropped from RAM by the kernel. If there is swap space, then they can be paged out to swap. But this is costly, and on some systems, such as small embedded devices with only flash memory, there is no swap at all. In that case, the buffer will be stuck in RAM until the process exits, or perhaps gives it back with madvise().
Pages of an mmap() mapping that have not been written to are clean. If the kernel needs RAM, it can simply drop them and reuse the RAM the pages were in. If the process that had the mapping accesses it again, that causes a page fault and the kernel re-loads the pages from the file they came from originally, the same way they were populated in the first place.
This doesn't require more than one process using the mapped file to be an advantage.
