java.nio memory-mapped file in Java for reading a huge file

Can anybody explain the internal working of the code below?
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MemoryMappedFileInJava {
    private static int count = 10485760; // 10 MB

    public static void main(String[] args) throws Exception {
        RandomAccessFile memoryMappedFile = new RandomAccessFile("largeFile.txt", "rw");
        // Mapping a file into memory
        MappedByteBuffer out = memoryMappedFile.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, count);
        // Writing into the memory-mapped file
        for (int i = 0; i < count; i++) {
            out.put((byte) 'A');
        }
        System.out.println("Writing to Memory Mapped File is completed");
        // Reading from the memory-mapped file
        for (int i = 0; i < 10; i++) {
            System.out.print((char) out.get(i));
        }
        System.out.println("Reading from Memory Mapped File is completed");
    }
}
I have not understood a few things:
1) What does MappedByteBuffer mean, and how does it work internally?
2) What is a FileChannel? Is it a handle to the file on which the operations (read or write) are performed?
3) What does the map() method actually map?
4) How is this approach faster than using java.io reads and writes?
5) Is this approach only useful when, say, I have a heap size of 400 MB and need to read a file of 8 GB, or can I use it any time?
6) In the above code, reading and writing happen byte by byte; how can that be fast?

What does MappedByteBuffer mean, and how does it work internally?
It is a Java wrapper around a memory area allocated by the operating system which is mapped to the file, usually on a paging basis. So the memory area appears to contain the file's contents, and changes to it are written back to the file.
What is a FileChannel? Is it a handle to the file on which the operations (read or write) are performed?
Yes.
What does the map() method actually map?
The file to memory, or memory to the file, whichever way you prefer to think of it.
How is this approach faster than using java.io reads and writes?
It bypasses a lot of the OS file-system machinery: once the mapping exists, data moves between the page cache and your address space without a read()/write() system call and an extra buffer copy on every access.
Is this approach only useful when, say, I have a heap size of 400 MB and need to read a file of 8 GB, or can I use it any time?
You can use it any time, within limits, but it's really only of benefit on large files. The performance gain was only 20% or so last time I measured it.
In the above code, reading and writing happen byte by byte; how can that be fast?
Because of paging: data moves between disk and memory a page at a time, and once a page is resident, each get() or put() is just a memory access rather than a separate I/O operation.

Related

Is it possible to read a file without loading it into memory?

I want to read a file but it is too big to load it completely into memory.
Is there a way to read it without loading it into memory? Or is there a better solution?
I want to read a file but it is too big to load it completely into memory.
Be aware that, in practice, files are an abstraction (in some sense an illusion) provided by your operating system through file systems. Read Operating Systems: Three Easy Pieces (freely downloadable) to learn more about OSes. Files can be quite big (even if most of them are small), e.g. many dozens of gigabytes on current laptops or desktops (and many terabytes on servers, and perhaps more).
You don't define what "memory" is, and the C11 standard n1570 uses that word in a different way, speaking of memory locations in §3.14 and of memory management functions in §7.22.3...
In practice, a process has its virtual address space, related to virtual memory.
On many operating systems -notably Linux and POSIX- you can change the virtual address space with mmap(2) and related system calls, and you could use memory-mapped files.
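As a small illustration (POSIX only, minimal error handling, with a placeholder file name), mapping a file read-only and scanning it without ever reading it into a heap buffer could look like this sketch:

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void) {
    const char *path = "bigfile.dat";   /* placeholder name */
    int fd = open(path, O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

    /* Map the whole file; pages are faulted in on demand. */
    unsigned char *data = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (data == MAP_FAILED) { perror("mmap"); return 1; }

    /* Example access: count zero bytes without an explicit read loop. */
    long zeros = 0;
    for (off_t i = 0; i < st.st_size; i++)
        if (data[i] == 0)
            zeros++;
    printf("%ld zero bytes\n", zeros);

    munmap(data, st.st_size);
    close(fd);
    return 0;
}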
Is there a way to read it without loading it into memory?
Of course, you can read and write partial chunks of some file (e.g. using fread, fwrite, fseek, or the lower-level system calls read(2), write(2), lseek(2), ...). For performance reasons, it is better to use large buffers (several kilobytes at least). In practice, most checksums (or cryptographic hash functions) can be computed chunkwise, on a very long stream of data.
Many libraries are built above such primitives (doing direct I/O by chunks). For example, the sqlite database library is able to handle database files of many terabytes (more than the available RAM), and you could use an RDBMS (these are themselves programs written in C or C++).
So of course you can deal with files larger than available RAM and read or write them by chunks (or "records"), and this has been true since at least the 1960s. I would even say that intuitively, files can (usually) be much larger than RAM, but smaller than a single disk (however, even this is not always true; some file systems are able to span several physical disks, e.g. using LVM techniques).
(On my Linux desktop with 32 GB of RAM, the largest file is 69 GB, on an ext4 filesystem with 669 GB available out of 780 GB total, and I have had files above 100 GB in the past.)
You might find it worthwhile to use some database like sqlite (or be a client of some RDBMS like PostgreSQL, etc.), or you could be interested in libraries for indexed files like gdbm. Of course you can also do direct I/O operations (e.g. fseek then fread or fwrite, or lseek then read or write, or pread(2) or pwrite ...).
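For the lower-level calls, here is a tiny sketch (POSIX, placeholder file name and offset, no real error handling; on 32-bit systems compile with -D_FILE_OFFSET_BITS=64) of reading one chunk at an arbitrary offset with pread(2):

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void) {
    int fd = open("huge.bin", O_RDONLY);          /* placeholder file name */
    if (fd < 0) { perror("open"); return 1; }

    enum { CHUNK = 64 * 1024 };
    unsigned char *buf = malloc(CHUNK);
    off_t offset = (off_t)5 * 1024 * 1024 * 1024; /* read 64 KiB at the 5 GiB mark */

    /* pread does not move the file offset, so several threads could
       read different parts of the same fd concurrently. */
    ssize_t n = pread(fd, buf, CHUNK, offset);
    if (n < 0) perror("pread");
    else printf("read %zd bytes at offset %lld\n", n, (long long)offset);

    free(buf);
    close(fd);
    return 0;
}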
I need the content to do a checksum, so I need the complete message
Many checksum libraries support incremental updates to the checksum. For example, GLib has g_checksum_update(). So you can read the file a block at a time with fread and update the checksum as you read.
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <stdlib.h>
#include <glib.h>

int main(void) {
    char filename[] = "test.txt";

    // Create a SHA256 checksum
    GChecksum *sum = g_checksum_new(G_CHECKSUM_SHA256);
    if( sum == NULL ) {
        fprintf(stderr, "Could not create checksum.\n");
        exit(1);
    }

    // Open the file we'll be checksumming.
    FILE *fp = fopen( filename, "rb" );
    if( fp == NULL ) {
        fprintf(stderr, "Could not open %s: %s.\n", filename, strerror(errno));
        exit(1);
    }

    // Read one buffer full at a time (BUFSIZ is from stdio.h)
    // and update the checksum.
    unsigned char buf[BUFSIZ];
    size_t size_read = 0;
    while( (size_read = fread(buf, 1, sizeof(buf), fp)) != 0 ) {
        // Update the checksum
        g_checksum_update(sum, buf, (gssize)size_read);
    }

    // Print the checksum.
    printf("%s %s\n", g_checksum_get_string(sum), filename);
}
And we can check it works by comparing the result with sha256sum.
$ ./test
0c46af5bce717d706cc44e8c60dde57dbc13ad8106a8e056122a39175e2caef8 test.txt
$ sha256sum test.txt
0c46af5bce717d706cc44e8c60dde57dbc13ad8106a8e056122a39175e2caef8 test.txt
One way to do this, if the problem is RAM, not virtual address space, is memory mapping the file, either via mmap on POSIX systems, or CreateFileMapping/MapViewOfFile on Windows.
That can get you what looks like a raw array of the file bytes, but with the OS responsible for paging the contents in (and writing them back to disk if you alter them) as you go. When mapped read-only, it's quite similar to just malloc-ing a block of memory and fread-ing to populate it, but:
It's lazy: for a 1 GB file, you're not waiting the 5-30 seconds for the whole thing to be read in before you can work with any part of it; instead, you just pay for each page on access (and sometimes the OS will pre-read in the background, so you don't even have to wait on the per-page load).
It responds better under memory pressure; if you run out of memory, the OS can just drop clean pages from memory without writing them to swap, knowing it can page them back in from the golden copy in the file whenever they're needed; with malloc-ed memory, it has to write it out to swap, increasing disk traffic at a time when you're likely oversubscribed on the disk already
Performance-wise, this can be slightly slower under default settings (since, without memory pressure, reading the whole file in mostly guarantees it will be in memory when asked for, while random access to a memory-mapped file is likely to trigger on-demand page faults to populate each page on first access), though you can use posix_madvise with POSIX_MADV_WILLNEED (POSIX systems) or PrefetchVirtualMemory (Windows 8 and higher) to provide a hint that the entire file will be needed, causing the system to (usually) page it in in the background, even as you're accessing it. On POSIX systems, other advice hints can be used for more granular hinting when paging the whole file in at once isn't necessary (or possible); e.g. using POSIX_MADV_SEQUENTIAL if you're reading the file data in order from beginning to end usually triggers more aggressive prefetch of subsequent pages, increasing the odds that they're in memory by the time you get to them.
By doing so, you get the best of both worlds: you can begin accessing the data almost immediately, with a delay on accessing pages not paged in yet, but the OS will be pre-loading the pages for you in the background, so you eventually run at full speed (while still being more resilient to memory pressure, since the OS can just drop clean pages rather than writing them to swap first).
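A rough sketch of that hinting (POSIX only; the file name is a placeholder and the access pattern is just for illustration, not a tuned implementation):

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void) {
    int fd = open("input.dat", O_RDONLY);      /* placeholder file name */
    struct stat st;
    if (fd < 0 || fstat(fd, &st) < 0) { perror("open/fstat"); return 1; }

    const unsigned char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    /* Hint: we will read the mapping front to back, so prefetch aggressively.
       POSIX_MADV_WILLNEED could be used instead to ask for the whole range. */
    posix_madvise((void *)p, st.st_size, POSIX_MADV_SEQUENTIAL);

    unsigned long long sum = 0;
    for (off_t i = 0; i < st.st_size; i++)
        sum += p[i];                           /* page faults overlap with prefetch */
    printf("byte sum: %llu\n", sum);

    munmap((void *)p, st.st_size);
    close(fd);
    return 0;
}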
The main limitation here is virtual address space. If you're on a 32 bit system, you're likely limited to (depending on how fragmented the existing address space is) 1-3 GB of contiguous address space, which means you'd have to map the file in chunks, and can't have on-demand random access to any point in the file at any time without additional system calls. Thankfully, on 64 bit systems, this limitation rarely comes up; even the most limiting 64 bit systems (Windows 7) provide 8 TB of user virtual address space per process, far larger than the vast, vast majority of files you're likely to encounter (and later versions increase the cap to 128 TB).

Why use mmap over fread?

Why/when is it better to use mmap(), as opposed to fread()'ing from a filestream in chunks into a byte array?
uint8_t my_buffer[MY_BUFFER_SIZE];
size_t bytes_read;

bytes_read = fread(my_buffer, 1, sizeof(my_buffer), input_file);
if (MY_BUFFER_SIZE != bytes_read) {
    fprintf(stderr, "File read failed: %s\n", filepath);
    exit(1);
}
There are advantages to mapping a file instead of reading it as a stream:
If you intend to perform random access to different widely-spaced areas of the file, mapping might mean that only the pages you access need to be actually read, while keeping your code simple.
If multiple applications are going to be accessing the same file, mapping it means that it will only be read into memory once, as opposed to the situation where each application loads [part of] the file into its own private buffers.
If the file doesn't fit in memory or would take a large chunk of memory, mapping it can supply the illusion that it fits and simplify your program logic, while letting the operating system decide how to manage rotating bits of the file in and out of physical memory.
If the file contents change, you MAY get to see the new contents automatically. (This can be a dubious advantage.)
There are disadvantages to mapping the file:
If you only need sequential access to the file or it is small or you only need access to a small portion of it, the overhead of setting up a memory mapping and then incurring page faults to actually cause the contents to be read can be less efficient than just reading the file.
If there is an I/O error reading the file, your application will most likely be killed on the spot instead of receiving a system call error to which your application can react gracefully. (Technically you can catch the SIGBUS in the former case but recovering properly from that kind of thing is not easy.)
If you are not using a 64-bit architecture and the file is very large, there might not be enough address space to map it.
mmap() is less portable than read() (or fread(), as you suggest).
mmap() will only work on regular files (on some filesystems) and some block devices.
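For comparison with the fread() snippet in the question, here is a minimal read-only mapping sketch (POSIX only, simplified error handling; input_file is assumed to be the same stream as in the question, and the caller would eventually munmap() the result):

#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>

/* Sketch: map the file behind the stdio stream instead of fread()'ing it. */
static const uint8_t *map_whole_file(FILE *input_file, size_t *out_len) {
    struct stat st;
    int fd = fileno(input_file);
    if (fstat(fd, &st) != 0 || st.st_size == 0)
        return NULL;

    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED)
        return NULL;

    *out_len = (size_t)st.st_size;
    return p;   /* pages are read from disk lazily, as the caller touches them */
}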

Multiple threads writing on same file

I would like to know if we can use multiple threads to write binary data on the same file.
FILE *fd = fopen("test", "wb");
int SIZE = 1000000000;
int *table = malloc(sizeof(int) * SIZE);
// ... filling the table
fwrite(table, sizeof(*table), SIZE, fd);
So I wonder if I can use threads, with each thread calling fseek() to seek to a different location and writing to the same file.
Any ideas?
fwrite should be thread-safe, but you'll need a mutex anyway, because you need the seek and the write to be atomic. Depending on your platform, you might have a write function that takes an offset, or you might be able to open the file in each thread. A better option, if you have everything in memory anyway as your code suggests, would be for each thread to fill its part of a single large array and then write the whole thing out once everything is done.
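If your platform has pwrite(2), which is exactly a "write function that takes an offset", a sketch of that variant (made-up sizes and file name, error handling omitted, compile with -pthread) could look like this:

#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

enum { NTHREADS = 4, CHUNK = 1000000 };   /* made-up sizes */

struct job { int fd; int idx; int *data; };

static void *writer(void *arg) {
    struct job *j = arg;
    /* pwrite takes an explicit offset and does not use the shared file
       position, so no mutex is needed as long as the ranges don't overlap. */
    off_t off = (off_t)j->idx * CHUNK * sizeof(int);
    pwrite(j->fd, j->data + (size_t)j->idx * CHUNK, CHUNK * sizeof(int), off);
    return NULL;
}

int main(void) {
    int *table = malloc(sizeof(int) * NTHREADS * CHUNK);
    /* ... fill the table ... */

    int fd = open("test.bin", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    pthread_t tid[NTHREADS];
    struct job jobs[NTHREADS];

    for (int i = 0; i < NTHREADS; i++) {
        jobs[i] = (struct job){ .fd = fd, .idx = i, .data = table };
        pthread_create(&tid[i], NULL, writer, &jobs[i]);
    }
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(tid[i], NULL);

    close(fd);
    free(table);
    return 0;
}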
While fread() and fwrite() are thread safe, the stream buffer represented by the FILE* is not. So you can have multiple threads accessing the same file, but not via the same FILE* - each thread must have its own, and the file to which they refer must be shareable - which is OS dependent.
An alternative and possibly simpler approach is to use a memory mapped file, so that each thread treats the file as shared memory, and you let the OS deal with the file I/O. This has a significant advantage over normal file I/O as it is truly random access, so you don't need to worry about fseek() and sequential read/writes etc.
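A minimal sketch of that memory-mapped variant (POSIX; the file name and sizes are made up, and the two memcpy calls stand in for what separate threads would do in their own disjoint regions):

#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    const size_t filesize = 4u * 1024 * 1024;
    int fd = open("shared.bin", O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0 || ftruncate(fd, filesize) != 0)   /* file must already have its final size */
        return 1;

    char *base = mmap(NULL, filesize, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (base == MAP_FAILED) return 1;

    /* Each thread would write into its own quarter of the mapping: */
    memcpy(base + 0 * (filesize / 4), "thread 0 data", 13);
    memcpy(base + 1 * (filesize / 4), "thread 1 data", 13);

    msync(base, filesize, MS_SYNC);   /* optional: force dirty pages to disk now */
    munmap(base, filesize);
    close(fd);
    return 0;
}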
fseek and fwrite are thread-safe, so you can use them without additional synchronization.
Let each thread open the file and make sure they write to different positions; finally, let each thread close the file and you're done.
Update:
This works on IX'ish systems, at least.

Creation of maximum size file in C: drive (OS drive) fails

I have developed disk and file wiping software (using the WIN32 API) which also has an option to wipe a drive's free space. I do this by creating a file the size of the drive's available free space and then writing random bytes over that file (applying various standard wiping schemes).
My problem is that it works well on every other drive except the drive that has the Windows operating system installed (in my case, C:). It gives a "Not enough disk space" error although the drive has lots of free space available. My program runs with administrative privileges. Is it some kind of privileges issue? Do I need to give my program more privileges even after running it as administrator? I would want to do it programmatically using the WinAPI.
I am testing mostly on the NTFS file system. I am creating the file using the CreateFile WinAPI call, and to make the file exactly the size of the available free space I am using the defragmentation API to get the free space and then the SetEndOfFile WinAPI method to extend the size of the file.
Any help would be appreciated.
Windows doesn't handle running out of disk space well, so it wouldn't surprise me if the system reserves some disk space that ordinary users, even administrators, can't use. However, regardless of whether that is true or not, your scheme is flawed. It's easily possible for the amount of space allocated to change between the time you find out how much space is free and the time you try to allocate it all. And even if it works, things might start breaking because the disk is full, even if only for a short while.
Since you're already using the defragmentation API, I'd use it to wipe all the clusters on the disk without trying to allocate them all at once. First create a file that fills up most of the disk space, but leaves plenty of room for other processes to allocate files. Then use FSCTL_GET_VOLUME_BITMAP to get the bitmap of unallocated sectors, and FSCTL_MOVE_FILE to move clusters from the file you created into the free clusters found in the bitmap. You'll need to be ready for FSCTL_MOVE_FILE to fail because something allocated one of the clusters marked as free. In that case I would keep halving the number of clusters you were moving at once until it worked. If it fails with only one cluster, then you know that it's that cluster (or one of the clusters) that's been allocated.
Something like this pseudocode:
// unalloc_start and unalloc_len describe an unallocated region in the free space bitmap
wipe_unallocated_clusters(hwipefile, unalloc_start, unalloc_len, max_chunk_len) {
    unalloc_vcn = unalloc_start
    unalloc_end = unalloc_start + unalloc_len
    max_chunk_len = min(max_chunk_len, unalloc_len)   // never move more than the region holds
    clen = max_chunk_len
    while (unalloc_vcn < unalloc_end) {
        clen = min(clen, unalloc_end - unalloc_vcn)   // don't run past the end of the region
        while (fsctl_move_file(hwipefile, 0, unalloc_vcn, clen) == FAILED) {
            if (clen == 1) {
                unalloc_vcn++                         // skip over the allocated cluster
                continue
            }
            clen /= 2                                 // try again with half as many clusters
        }
        unalloc_vcn += clen
        clen = min(clen * 2, max_chunk_len)           // double the chunk again if it worked
    }
}
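For reference, the fsctl_move_file() call in the pseudocode corresponds roughly to a DeviceIoControl() call on the volume handle. The following is an untested sketch of that wrapper (hVolume would be an open handle to something like \\.\C:, and all error handling is left out), not a drop-in implementation:

#include <windows.h>
#include <winioctl.h>

/* Sketch: move 'clusterCount' clusters of hWipeFile, starting at file VCN
   'fileVcn', onto the volume clusters starting at LCN 'targetLcn'. */
static BOOL move_clusters(HANDLE hVolume, HANDLE hWipeFile,
                          LONGLONG fileVcn, LONGLONG targetLcn,
                          DWORD clusterCount) {
    MOVE_FILE_DATA mfd;
    DWORD bytesReturned = 0;

    mfd.FileHandle = hWipeFile;
    mfd.StartingVcn.QuadPart = fileVcn;    /* which clusters of the file to move */
    mfd.StartingLcn.QuadPart = targetLcn;  /* where on the volume to put them */
    mfd.ClusterCount = clusterCount;

    return DeviceIoControl(hVolume, FSCTL_MOVE_FILE,
                           &mfd, sizeof(mfd), NULL, 0,
                           &bytesReturned, NULL);
}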
Seeing some code and specific values that fail would be nice, but I'll try to use my psychic powers. I have two theories.
1) Your file size exceeds the maximum file size for the filesystem on C:. FAT16 supports files up to 2 GB, FAT32 up to 4 GB.
2) You have too many files in your root directory. There are some reports that the Windows FAT32 implementation supports only about 1000 file entries in the root directory.

Secure and efficient way to modify multiple files on POSIX systems?

I have been following the discussion on the "bug" in EXT4 that causes files to be zeroed after a crash if one uses the "create temp file, write temp file, rename temp to target file" process. POSIX says that unless fsync() is called, you cannot be sure the data has been flushed to the hard disk.
Obviously doing:
0) get the file contents (read it or make it somehow)
1) open original file and truncate it
2) write new contents
3) close file
is not good even with fsync(), as the computer can crash during 2) or fsync() and you end up with a partially written file.
Usually it has been thought that this is pretty safe:
0) get the file contents (read it or make it somehow)
1) open temp file
2) write contents to temp file
3) close temp file
4) rename temp file to original file
Unfortunately it isn't. To make it safe on EXT4 you would need to do:
0) get the file contents (read it or make it somehow)
1) open temp file
2) write contents to temp file
3) fsync()
4) close temp file
5) rename temp file to original file
This should be safe: after a crash you should have either the new file contents or the old ones, never zeroed or partial contents. But if the application uses lots of files, an fsync() after every write would be slow.
So my question is: how do I modify multiple files efficiently on a system where fsync() is required to be sure that changes have been saved to disk? And I really mean modifying many files, as in thousands of files. Modifying two files and doing fsync() after each wouldn't be too bad, but fsync() does slow things down when modifying many files.
EDIT: changed the fsync()/close temp file steps to the correct order, added emphasis on writing many, many files.
The short answer is: Solving this in the app layer is the wrong place. EXT4 must make sure that after I close the file, the data is written in a timely manner. As it is now, EXT4 "optimizes" this writing to be able to collect more write requests and burst them out in one go.
The problem is obvious: no matter what you do, you can't be sure that your data ends up on the disk. Calling fsync() manually only makes things worse: you basically get in the way of EXT4's optimization, slowing the whole system down.
OTOH, EXT4 has all the information necessary to make an educated guess when it is necessary to write data out to the disk. In this case, I rename the temp file to the name of an existing file. For EXT4, this means that it must either postpone the rename (so the data of the original file stays intact after a crash) or it must flush at once. Since it can't postpone the rename (the next process might want to see the new data), renaming implicitly means to flush and that flush must happen on the FS layer, not the app layer.
EXT4 might create a virtual copy of the filesystem which contains the changes while the disk is not modified (yet). But this doesn't affect the ultimate goal: an app can't know what optimizations the FS is going to make, and therefore the FS must make sure that it does its job.
This is a case where ruthless optimizations have gone too far and ruined the results. Golden rule: Optimization must never change the end result. If you can't maintain this, you must not optimize.
As long as Tso believes that it is more important to have a fast FS than one which behaves correctly, I suggest not upgrading to EXT4 and closing all bug reports about this as "works as designed by Tso".
[EDIT] Some more thoughts on this. You could use a database instead of the file. Let's ignore the resource waste for a moment. Can anyone guarantee that the files, which the database uses, won't become corrupted by a crash? Probably. The database can write the data and call fsync() every minute or so. But then, you could do the same:
while true; do sync; sleep 60; done
Again, the bug in the FS prevents this from working in every case. Otherwise, people wouldn't be so bothered by this bug.
You could use a background config daemon like the Windows registry. The daemon would write all configs in one big file. It could call fsync() after writing everything out. Problem solved ... for your configs. Now you need to do the same for everything else your apps write: Text documents, images, whatever. I mean almost any Unix process creates a file. This is the freaking basis of the whole Unix idea!
Clearly, this is not a viable path. So the answer remains: There is no solution on your side. Keep bothering Tso and the other FS developers until they fix their bugs.
My own answer would be to keep the modifications in temp files and, after finishing writing them all, do one fsync() pass and then rename them all.
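One way to read that suggestion (a sketch only, POSIX, with made-up file names and trivial error handling): write all the temp files first, fsync them in a second pass so the kernel can overlap the flushes, and only rename once the data is durable:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define NFILES 3   /* stand-in for "thousands of files" */

int main(void) {
    const char *tmp[NFILES]   = { "a.tmp", "b.tmp", "c.tmp" };   /* made-up names */
    const char *final[NFILES] = { "a.cfg", "b.cfg", "c.cfg" };
    int fds[NFILES];

    /* Pass 1: write all the temp files, but don't force anything to disk yet. */
    for (int i = 0; i < NFILES; i++) {
        fds[i] = open(tmp[i], O_WRONLY | O_CREAT | O_TRUNC, 0644);
        dprintf(fds[i], "new contents for %s\n", final[i]);
    }

    /* Pass 2: fsync everything; the kernel gets a chance to batch the flushes. */
    for (int i = 0; i < NFILES; i++) {
        fsync(fds[i]);
        close(fds[i]);
    }

    /* Pass 3: only after the data is durable, expose it under the real names. */
    for (int i = 0; i < NFILES; i++)
        rename(tmp[i], final[i]);

    return 0;
}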
You need to swap 3 & 4 in your last listing, since fsync(fd) uses the file descriptor. And I don't see why that would be particularly costly: you want the data written to disk by the close() anyway, so the cost should be about the same between what you want to happen and what will happen with fsync().
If the cost is too much (and you have it), fdatasync(2) avoids syncing the metadata, so it should be cheaper.
EDIT:
So I wrote some extremely hacky test code:
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/time.h>
#include <time.h>
#include <stdio.h>
#include <string.h>

static void testBasic()
{
    int fd;
    const char* text = "This is some text";

    fd = open("temp.tmp", O_WRONLY | O_CREAT, 0644);
    write(fd, text, strlen(text));
    close(fd);
    rename("temp.tmp", "temp");
}

static void testFsync()
{
    int fd;
    const char* text = "This is some text";

    fd = open("temp.tmp", O_WRONLY | O_CREAT, 0644);
    write(fd, text, strlen(text));
    fsync(fd);
    close(fd);
    rename("temp.tmp", "temp");
}

static void testFdatasync()
{
    int fd;
    const char* text = "This is some text";

    fd = open("temp.tmp", O_WRONLY | O_CREAT, 0644);
    write(fd, text, strlen(text));
    fdatasync(fd);
    close(fd);
    rename("temp.tmp", "temp");
}

#define ITERATIONS 10000

static void testLoop(int type)
{
    struct timeval before;
    struct timeval after;
    long seconds;
    long usec;
    int i;

    gettimeofday(&before, NULL);
    if (type == 1)
    {
        for (i = 0; i < ITERATIONS; i++)
        {
            testBasic();
        }
    }
    if (type == 2)
    {
        for (i = 0; i < ITERATIONS; i++)
        {
            testFsync();
        }
    }
    if (type == 3)
    {
        for (i = 0; i < ITERATIONS; i++)
        {
            testFdatasync();
        }
    }
    gettimeofday(&after, NULL);
    seconds = (long)(after.tv_sec - before.tv_sec);
    usec = (long)(after.tv_usec - before.tv_usec);
    if (usec < 0)
    {
        seconds--;
        usec += 1000000;
    }
    printf("%ld.%06ld\n", seconds, usec);
}

int main()
{
    testLoop(1);
    testLoop(2);
    testLoop(3);
    return 0;
}
On my laptop that produces:
0.595782
6.338329
6.116894
Which suggests doing the fsync() is ~10 times more expensive, and fdatasync() is slightly cheaper.
I guess the problem I see is that every application is going to think its data is important enough to fsync(), so the performance advantage of merging writes over a minute will be eliminated.
The issue you refer to is well researched; you should definitely read this:
https://www.academia.edu/9846821/Towards_Efficient_Portable_Application-Level_Consistency
Fsync can be skipped under safe rename behavior and directory fsync can be skipped under safe new file behavior. Both are implementation specific and not guaranteed by POSIX.
