How many minor faults is my process *really* taking? - c

I have the following simple program, which basically just mmaps a file and sums every byte in it:
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
volatile uint64_t sink;
int main(int argc, char** argv) {
    if (argc < 3) {
        puts("Usage: mmap_test FILE populate|nopopulate");
        return EXIT_FAILURE;
    }
    const char *filename = argv[1];
    int populate = !strcmp(argv[2], "populate");
    uint8_t *memblock;
    int fd;
    struct stat sb;
    fd = open(filename, O_RDONLY);
    fstat(fd, &sb);
    uint64_t size = sb.st_size;
    memblock = mmap(NULL, size, PROT_READ, MAP_SHARED | (populate ? MAP_POPULATE : 0), fd, 0);
    if (memblock == MAP_FAILED) {
        perror("mmap failed");
        return EXIT_FAILURE;
    }
    //printf("Opened %s of size %lu bytes\n", filename, size);
    uint64_t i;
    uint8_t result = 0;
    for (i = 0; i < size; i++) {
        result += memblock[i];
    }
    sink = result;
    puts("Press enter to exit...");
    getchar();
    return EXIT_SUCCESS;
}
I make it like this:
gcc -O2 -std=gnu99 mmap_test.c -o mmap_test
You pass it a file name and either populate or nopopulate1, which controls whether MAP_POPULATE is passed to mmap or not. It waits for you to type enter before exiting (giving you time to check out stuff in /proc/<pid> or whatever).
I use a 1GB test file of random data, but you can really use anything:
dd bs=1MB count=1000 if=/dev/urandom of=/dev/shm/rand1g
When MAP_POPULATE is used, I expect zero major faults and a small number of page faults for a file in the page cache. With perf stat I get the expected result:
perf stat -e major-faults,minor-faults ./mmap_test /dev/shm/rand1g populate
Press enter to exit...
Performance counter stats for './mmap_test /dev/shm/rand1g populate':
0 major-faults
45 minor-faults
1.323418217 seconds time elapsed
The 45 faults just come from the runtime and process overhead (and don't depend on the size of the file mapped).
However, /usr/bin/time reports ~15,300 minor faults:
/usr/bin/time ./mmap_test /dev/shm/rand1g populate
Press enter to exit...
0.05user 0.05system 0:00.54elapsed 20%CPU (0avgtext+0avgdata 977744maxresident)k
0inputs+0outputs (0major+15318minor)pagefaults 0swaps
The same ~15,300 minor faults are reported by top and by examining /proc/<pid>/stat.
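For reference, those counters come from fields 10 and 12 (minflt and majflt) of /proc/<pid>/stat, as documented in proc(5). A minimal sketch of reading them for the current process, with most error handling omitted:
#include <stdio.h>
#include <string.h>

int main(void) {
    // Field 2 (comm) can contain spaces, so skip past its closing ')'
    // before parsing the remaining space-separated fields.
    FILE *f = fopen("/proc/self/stat", "r");
    if (!f) { perror("fopen"); return 1; }
    char buf[4096];
    if (!fgets(buf, sizeof buf, f)) { fclose(f); return 1; }
    fclose(f);
    char *p = strrchr(buf, ')');
    if (!p) return 1;
    unsigned long minflt = 0, majflt = 0;
    // Fields after comm: state(3) ppid(4) pgrp(5) session(6) tty_nr(7)
    //                    tpgid(8) flags(9) minflt(10) cminflt(11) majflt(12) ...
    sscanf(p + 1, "%*s %*s %*s %*s %*s %*s %*s %lu %*s %lu", &minflt, &majflt);
    printf("minflt=%lu majflt=%lu\n", minflt, majflt);
    return 0;
}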
Now if you don't use MAP_POPULATE, all the methods, including perf stat, agree there are ~15,300 page faults. For what it's worth, this number comes from 1,000,000,000 / 4096 / 16 = ~15,250 - that is, 1GB divided into 4K pages, with an additional factor-of-16 reduction coming from a kernel feature ("fault-around") which, when a fault is taken, also maps in up to 15 nearby pages that are already present in the page cache.
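Spelled out as code, assuming 4K pages and the default fault-around window of 16 pages, that arithmetic gives:
#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint64_t size = 1000000000ULL;       // the 1GB test file
    uint64_t pages = size / 4096;        // ~244,140 4K pages
    uint64_t faults = (pages + 15) / 16; // one fault can map up to 16 pages
    // prints: pages=244140 expected_faults=15259
    printf("pages=%llu expected_faults=%llu\n",
           (unsigned long long)pages, (unsigned long long)faults);
    return 0;
}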
Who is right here? Based on the documented behavior of MAP_POPULATE, the figure returned by perf stat is the correct one - the single mmap call has already populated the page tables for the entire mapping, so there should be no more minor faults when touching it.
1Actually, any string other than populate works as nopopulate.

Related

Using MAP_POPULATE for writing to file

My application dumps huge files to disk. For various reasons it is more convenient to use mmap and memory writes than the fwrite interface.
The slow part of writing to a file this way is the page faults. Using mmap with MAP_POPULATE is supposed to help; from the man page:
MAP_POPULATE (since Linux 2.5.46)
Populate (prefault) page tables for a mapping. For a file mapping, this causes read-ahead on the file. This will help to
reduce blocking on page faults later. MAP_POPULATE is supported for private mappings only since Linux 2.6.23.
(To answer the obvious question: I've tested this on relatively recent kernels, on 4.15 and 5.1).
However, this does not seem to reduce pagefaults while writing to the mapped file.
Minimal example code: test.c:
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>
static int exit_with_error(const char *msg) {
    perror(msg);
    exit(EXIT_FAILURE);
}

int main() {
    const size_t len = 1UL << 30;
    const char *fname = "/tmp/foobar-huge.txt";
    int f = open(fname, O_RDWR|O_CREAT|O_EXCL, 0644);
    if(f == -1) {
        exit_with_error("open");
    }
    int ret = ftruncate(f, len);
    if(ret == -1) {
        exit_with_error("ftruncate");
    }
    void *mem = mmap(NULL, len, PROT_WRITE|PROT_READ, MAP_SHARED|MAP_POPULATE, f, 0);
    if(mem == MAP_FAILED) {
        exit_with_error("mmap");
    }
    ret = close(f);
    if(ret == -1) {
        exit_with_error("close");
    }
    memset(mem, 'f', len);
}
When running this under a profiler or using perf stat it's clearly visible that the memset at the end triggers (many) pagefaults.
In fact, this program is slower when MAP_POPULATE is passed: on my machine ~1.8s vs ~1.6s without MAP_POPULATE. The difference simply seems to be the time it takes to do the populate; the number of page faults that perf stat reports is identical.
A last observation is that this behaves as expected when I read from the file, instead of writing -- in this case the MAP_POPULATE reduces the number of pagefaults to almost zero and helps to improve performance drastically.
Is this the expected behavior for MAP_POPULATE? Am I doing something wrong?
Because although the pages have been prefaulted by MAP_POPULATE, they are populated clean and write-protected so that the kernel can track dirtying. The first write to each page therefore still takes a minor fault, in which the kernel marks the page table entry writable and the page dirty (for a MAP_SHARED file mapping this is dirty tracking rather than copy-on-write).
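One way to see this (a sketch, not a fix; it reuses the mem and len variables from test.c above) is to dirty each page once in a separate pass before the bulk memset. Under perf stat the minor faults then show up in this pre-dirty loop, one per page, and the memset itself faults very little:
#include <stddef.h>   /* size_t */
#include <unistd.h>   /* sysconf */

/* Touch one byte per page with a write. The first write to each clean,
   write-protected page takes a minor fault, in which the kernel marks the
   PTE writable and the page dirty. */
static void predirty(volatile char *mem, size_t len) {
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    for (size_t off = 0; off < len; off += page)
        mem[off] = 0;
}

/* Usage in test.c, between close(f) and the memset:
       predirty(mem, len);
       memset(mem, 'f', len);
*/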

using mmap() to search large file (~1TB)

I'm working on a project that is trying to search for specific bytes (e.g. 0xAB) in a filesystem (e.g. ext2). I was able to find what I needed using malloc(), realloc(), and memchr(), but it seemed slow, so I was looking into using mmap(). What I am trying to do is find specific bytes, then copy them into a struct, so I have two questions: (1) is using mmap() the best strategy, and (2) why isn't the following code working (I get an EINVAL error)?
UPDATE: The following program compiles and runs but I still have a couple issues:
1) it won't display correct file size on large files (displayed correct size for 1GB flash drive, but not for 32GB)*.
2) it's not searching the mapping correctly**.
*Is THIS a possible solution to getting the correct size using stat64()? If so, is it something I add in my Makefile? I haven't worked with makefiles much so I don't know how to add something like that.
**Is this even the proper way to search?
#define _LARGEFILE64_SOURCE
#include <stdio.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <errno.h>

#define handle_error(msg) \
    do { perror(msg); exit(EXIT_FAILURE); } while (0)

int main(int argc, char **argv) {
    int fd = open("/dev/sdb1", O_RDONLY);
    if(fd < 0) {
        printf("Error %s\n", strerror(errno));
        return -1;
    }

    const char * map;
    off64_t size, i;
    off64_t sb_pos[6];
    int j = 0;

    size = lseek64(fd, 0, SEEK_END);
    printf("file size: %llu\n", (unsigned long long)size);
    lseek64(fd, 0, SEEK_SET);

    map = mmap(0, size, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) { handle_error("mmap error"); }

    printf("Searching for magic numbers...\n");
    for (i=0; i < size; i++) {
        if(map[i] == 0X53 && map[i + 1] == 0XEF) {
            if ((map[i-32] == 0X00 && map[i-31] == 0X00) ||
                (map[i-32] == 0X01 && map[i-31] == 0X00) ||
                (map[i-32] == 0X02 && map[i-31] == 0X00)) {
                if(j <= 5) {
                    sb_pos[j] = i;
                    printf("superblock %d found\n", j);
                    ++j;
                } else break;
            }
        }
    }

    int q;
    for(q=0; q<j; q++) {
        printf("SUPERBLOCK[%d]: %lld\n", q+1, (long long)sb_pos[q]);
    }

    close(fd);
    munmap((void *)map, size);
    return 0;
}
Thanks for your help.
mmap is a very efficient way to handle searching a large file, especially in cases where there's an internal structure you can use (e.g. using mmap on a large file with fixed-size records that are sorted would permit you to do a binary search, and only the pages corresponding to records read would be touched).
In your case you need to compile for 64 bits and enable large file support (and use open(2)).
If your /dev/sdb1 is a device and not a file, I don't think stat(2) will show an actual size. stat returns a size of 0 for these devices on my boxes. I think you'll need to get the size another way.
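For a Linux block device, two ways that do work are seeking to the end (which the question's code already does) and the BLKGETSIZE64 ioctl, which returns the size in bytes. A sketch:
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>   /* BLKGETSIZE64 */

int main(void) {
    int fd = open("/dev/sdb1", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    /* Option 1: seek to the end; works for regular files and block devices. */
    off_t end = lseek(fd, 0, SEEK_END);
    lseek(fd, 0, SEEK_SET);

    /* Option 2: ask the block layer directly. */
    uint64_t bytes = 0;
    if (ioctl(fd, BLKGETSIZE64, &bytes) != 0)
        perror("ioctl(BLKGETSIZE64)");

    printf("lseek: %lld bytes, BLKGETSIZE64: %llu bytes\n",
           (long long)end, (unsigned long long)bytes);
    close(fd);
    return 0;
}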
Regarding address space: x86-64 uses 2^48 bytes of virtual address space, which is 256 TiB. You can't use all of that, but there's easily ~127 TiB of contiguous address space in most processes.
I just noticed that I was using fopen(), should I be using open() instead?
Yes, you should use open() instead of fopen(). And that is the reason why you got the EINVAL error.
fopen("/dev/sdb1", O_RDONLY);
This code is incorrect: O_RDONLY is a flag meant for the open() syscall, not for the fopen() libc function.
You should also note that mmapping large files is only possible on a platform with a large virtual address space. That should be obvious: you need enough virtual address space to map your whole file. On Intel, that means x86_64, not 32-bit x86.
I haven't tried to do this with really large files (> 4 GB). Maybe some additional flags need to be passed into the open() syscall.
I'm working on a project that is trying to search for specific bytes (e.g. 0xAB) in a filesystem (e.g. ext2)
mmap()ing a whole large file into memory is the wrong approach in your case. You just need to process the file step by step, in fixed-size chunks (something around 1 MB). You can mmap() each chunk or just read() it into an internal buffer; that doesn't matter. But mapping the whole file at once is overkill if you only want to process it sequentially.
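A minimal sketch of that chunked approach, searching for the same 0x53 0xEF magic as the question. The 1 MiB chunk size is an arbitrary choice, and a one-byte overlap is kept so a two-byte pattern straddling a chunk boundary is not missed:
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define CHUNK (1 << 20)   /* 1 MiB per read; an arbitrary choice */

int main(int argc, char **argv) {
    if (argc < 2) { fprintf(stderr, "usage: %s device-or-file\n", argv[0]); return 1; }
    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    /* buf[0] carries the last byte of the previous chunk (the overlap);
       the freshly read chunk lives in buf[1..n]. */
    unsigned char *buf = malloc(CHUNK + 1);
    if (!buf) { perror("malloc"); return 1; }

    off_t base = 0;      /* file offset of buf[1] */
    int have_prev = 0;   /* does buf[0] hold a valid byte yet? */
    ssize_t n;
    while ((n = read(fd, buf + 1, CHUNK)) > 0) {
        for (ssize_t i = have_prev ? 0 : 1; i < n; i++) {
            if (buf[i] == 0x53 && buf[i + 1] == 0xEF)
                printf("magic at offset %lld\n", (long long)(base + i - 1));
        }
        buf[0] = buf[n];  /* keep the chunk's last byte for the next pass */
        have_prev = 1;
        base += n;
    }

    free(buf);
    close(fd);
    return 0;
}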

Why can't my program save a large amount (>2GB) to a file?

I am having trouble trying to figure out why my program cannot save more than 2GB of data to a file. I cannot tell if this is a programming or environment (OS) problem. Here is my source code:
#define _LARGEFILE_SOURCE
#define _LARGEFILE64_SOURCE
#define _FILE_OFFSET_BITS 64
#include <math.h>
#include <time.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/*-------------------------------------*/
//for file mapping in Linux
#include<fcntl.h>
#include<unistd.h>
#include<sys/stat.h>
#include<sys/time.h>
#include<sys/mman.h>
#include<sys/types.h>
/*-------------------------------------*/
#define PERMS 0600
#define NEW(type) (type *) malloc(sizeof(type))
#define FILE_MODE (S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH)
void write_result(char *filename, char *data, long long length){
    int fd, fq;
    fd = open(filename, O_RDWR|O_CREAT|O_LARGEFILE, 0644);
    if (fd < 0) {
        perror(filename);
        return -1;
    }
    if (ftruncate(fd, length) < 0)
    {
        printf("[%d]-ftruncate64 error: %s/n", errno, strerror(errno));
        close(fd);
        return 0;
    }
    fq = write (fd, data,length);
    close(fd);
    return;
}

main()
{
    long long offset = 3000000000; // 3GB
    char * ttt;
    ttt = (char *)malloc(sizeof(char) *offset);
    printf("length->%lld\n",strlen(ttt)); // length=0
    memset (ttt,1,offset);
    printf("length->%lld\n",strlen(ttt)); // length=3GB
    write_result("test.big",ttt,offset);
    return 1;
}
According to my test, the program can generate a file larger than 2GB and can allocate that much memory as well.
The weird thing happens when I try to write the data into the file: when I check the file it is empty, even though it is supposed to be filled with 1s.
Can any one be kind and help me with this?
You need to read a little more about C strings and what malloc and calloc do.
In your original main ttt pointed to whatever garbage was in memory when malloc was called. This means a nul terminator (the end marker of a C String, which is binary 0) could be anywhere in the garbage returned by malloc.
Also, since malloc does not touch every byte of the allocated memory (and you're asking for a lot) you could get sparse memory which means the memory is not actually physically available until it is read or written.
calloc allocates memory and guarantees it reads as 0. Depending on the implementation this may mean touching every byte of the allocation (so if the OS left the allocation sparse, it will not be sparse after calloc fills it), although for large allocations many allocators just return pages the OS has already zeroed.
Here's your code with fixes for the above issues.
You should also always check the return value from write and react accordingly. I'll leave that to you...
main()
{
    long long offset = 3000000000; // 3GB
    char * ttt;
    //ttt = (char *)malloc(sizeof(char) *offset);
    ttt = (char *)calloc( sizeof( char ), offset ); // instead of malloc( ... )
    if( !ttt )
    {
        puts( "calloc failed, bye bye now!" );
        exit( 87 );
    }
    printf("length->%lld\n",strlen(ttt)); // length=0 (This now works as expected if calloc does not fail)
    memset( ttt, 1, offset );
    ttt[offset - 1] = 0; // Now it's nul terminated and the printf below will work
    printf("length->%lld\n",strlen(ttt)); // length=3GB
    write_result("test.big",ttt,offset);
    return 1;
}
Note to Linux gurus... I know sparse may not be the correct term. Please correct me if I'm wrong as it's been a while since I've been buried in Linux minutiae. :)
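On the "always check the return value from write" point: a single write() call may transfer less than you asked for (on Linux it writes at most about 2GB per call, per the write(2) man page), so a robust version loops until the whole buffer is written. A sketch that could replace the bare fq = write(fd, data, length) in write_result():
#include <errno.h>
#include <unistd.h>

/* Write the whole buffer, retrying on short writes and EINTR.
   Returns 0 on success, -1 on error. */
static int write_all(int fd, const char *data, long long length) {
    long long done = 0;
    while (done < length) {
        ssize_t n = write(fd, data + done, (size_t)(length - done));
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted before writing anything; retry */
            return -1;      /* real error */
        }
        done += n;          /* write() may transfer less than requested */
    }
    return 0;
}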
Looks like you're hitting the internal file system's limitation for the iDevice: ios - Enterprise app with more than resource files of size 2GB
2Gb+ files are simply not possible. If you need to store such amount of data you should consider using some other tools or write the file chunk manager.
I'm going to go out on a limb here and say that your problem may lie in memset().
The best thing to do here is, I think, after memset()ing it,
for (unsigned long i = 0; i < 3000000000; i++) {
    if (ttt[i] != 1) { printf("error in data at location %lu\n", i); break; }
}
Once you've validated that the data you're trying to write is correct, then you should look into writing a smaller file such as 1GB and see if you have the same problems. Eliminate each and every possible variable and you will find the answer.

Problem in code with File Descriptors. C (Linux)

I've written code that should ideally take in data from one document, encrypt it and save it in another document.
But when I try executing the code it does not put the encrypted data in the new file. It just leaves it blank. Someone please spot what's missing in the code. I tried but I couldn't figure it out.
I think there is something wrong with the read/write function, or maybe I'm implementing the do-while loop incorrectly.
#include <stdio.h>
#include <stdlib.h>
#include <termios.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
int main (int argc, char* argv[])
{
    int fdin,fdout,n,i,fd;
    char* buf;
    struct stat fs;
    if(argc<3)
        printf("USAGE: %s source-file target-file.\n",argv[0]);
    fdin=open(argv[1], O_RDONLY);
    if(fdin==-1)
        printf("ERROR: Cannot open %s.\n",argv[1]);
    fdout=open(argv[2], O_WRONLY | O_CREAT | O_EXCL, 0644);
    if(fdout==-1)
        printf("ERROR: %s already exists.\n",argv[2]);
    fstat(fd, &fs);
    n= fs.st_size;
    buf=malloc(n);
    do
    {
        n=read(fd, buf, 10);
        for(i=0;i<n;i++)
            buf[i] ^= '#';
        write(fd, buf, n);
    } while(n==10);
    close(fdin);
    close(fdout);
}
You are using fd instead of fdin/fdout in the fstat, read and write system calls. fd is an uninitialized variable.
// Here...
fstat(fd, &fs);
// And here...
n=read(fd, buf, 10);
for(i=0;i<n;i++)
buf[i] ^= '#';
write(fd, buf, n);
You're reading and writing to fd instead of fdin and fdout. Make sure you enable all warnings your compiler will emit (e.g. use gcc -Wall -Wextra -pedantic). It will warn you about the use of an uninitialized variable if you let it.
Also, if you checked the return codes of fstat(), read(), or write(), you'd likely have gotten errors from using an invalid file descriptor. They are most likely erroring out with EINVAL (invalid argument) errors.
fstat(fd, &fs);
n= fs.st_size;
buf=malloc(n);
And since we're here: allocating enough memory to hold the entire file is unnecessary. You're only reading 10 bytes at a time in your loop, so you really only need a 10-byte buffer. You could skip the fstat() entirely.
// Just allocate 10 bytes.
buf = malloc(10);
// Or heck, skip the malloc() too! Change "char *buf" to:
char buf[10];
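Putting those fixes together (consistent use of fdin/fdout, return values checked, fixed-size buffer), a corrected loop might look like the sketch below. This is one possible version, not the original poster's code:
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    if (argc < 3) {
        fprintf(stderr, "USAGE: %s source-file target-file\n", argv[0]);
        return 1;
    }
    int fdin = open(argv[1], O_RDONLY);
    if (fdin == -1) { perror(argv[1]); return 1; }
    int fdout = open(argv[2], O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fdout == -1) { perror(argv[2]); return 1; }

    char buf[10];
    ssize_t n;
    while ((n = read(fdin, buf, sizeof buf)) > 0) {   /* read from fdin... */
        for (ssize_t i = 0; i < n; i++)
            buf[i] ^= '#';                            /* ...apply the XOR "encryption"... */
        if (write(fdout, buf, (size_t)n) != n) {      /* ...and write to fdout */
            perror("write");
            return 1;
        }
    }
    if (n == -1) { perror("read"); return 1; }

    close(fdin);
    close(fdout);
    return 0;
}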
All said is true; one more tip.
You should use a larger buffer that matches the disk/filesystem block size, typically 8192 bytes.
This will speed up your program significantly, as you will hit the disk (and make system calls) roughly 800 times less often. As you know, accessing the disk is very expensive in terms of time.
Another option is to use the stdio functions fread, fwrite, etc., which already take care of buffering, though you will still have the function call overhead.
Roni
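And a variant using the stdio functions mentioned above, which do their own buffering behind the scenes; a sketch, keeping the XOR transformation from the question:
#include <stdio.h>

int main(int argc, char *argv[])
{
    if (argc < 3) {
        fprintf(stderr, "USAGE: %s source-file target-file\n", argv[0]);
        return 1;
    }
    FILE *in = fopen(argv[1], "rb");
    FILE *out = fopen(argv[2], "wb");
    if (!in || !out) { perror("fopen"); return 1; }

    int c;
    while ((c = fgetc(in)) != EOF)   /* stdio reads the file in buffered blocks... */
        fputc(c ^ '#', out);         /* ...and buffers the writes as well */

    fclose(in);
    fclose(out);
    return 0;
}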

how to retrieve the large file

I am working on an application wherein I need to compare 10^8 alphanumeric entries. Retrieving the entries from the file (file size is 1.5 GB) and then comparing them needs to take less than 5 minutes, but retrieval alone is already exceeding 5 minutes. What would be an effective way to do this? I need to work on the file only; please suggest a way out.
I am working on Windows with 3 GB RAM and a 100 GB hard disk.
Read a part of the file, sort it, write it to a temporary file.
Merge-sort the resulting files.
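A sketch of the first (chunk-sorting) phase, assuming fixed-size records of REC_LEN bytes; REC_LEN, RECS_PER_CHUNK and the run-file naming are made up for illustration, and the final k-way merge of the run files is not shown:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define REC_LEN 16                 /* hypothetical fixed record size in bytes */
#define RECS_PER_CHUNK (4 << 20)   /* about 64 MB of records per in-memory chunk */

static int cmp_rec(const void *a, const void *b) {
    return memcmp(a, b, REC_LEN);
}

int main(int argc, char **argv) {
    if (argc < 2) { fprintf(stderr, "usage: %s input-file\n", argv[0]); return 1; }
    FILE *in = fopen(argv[1], "rb");
    if (!in) { perror("fopen"); return 1; }

    char *chunk = malloc((size_t)RECS_PER_CHUNK * REC_LEN);
    if (!chunk) { perror("malloc"); return 1; }

    int run = 0;
    size_t nrec;
    /* Read as many whole records as fit, sort them, write a sorted "run". */
    while ((nrec = fread(chunk, REC_LEN, RECS_PER_CHUNK, in)) > 0) {
        qsort(chunk, nrec, REC_LEN, cmp_rec);
        char name[64];
        snprintf(name, sizeof name, "run_%04d.tmp", run++);
        FILE *out = fopen(name, "wb");
        if (!out) { perror("fopen run"); return 1; }
        fwrite(chunk, REC_LEN, nrec, out);
        fclose(out);
    }

    free(chunk);
    fclose(in);
    /* The run_*.tmp files are now each sorted; a standard k-way merge
       (e.g. keeping one record per run in a small heap) produces the
       final sorted output. */
    return 0;
}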
Error handling is minimal, and you need to provide DataType and cmpfunc; samples are provided. You should be able to deduce the core workings from this snippet:
#define _GNU_SOURCE       /* for O_LARGEFILE */
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

typedef char DataType; // is this alphanumeric?

int cmpfunc(const void *left, const void *right)
{
    return *(const DataType *)right - *(const DataType *)left;
}

int main(int argc, char **argv)
{
    int fd = open(argv[1], O_RDWR|O_LARGEFILE);
    if (fd == -1)
        return 1;
    struct stat st;
    if (fstat(fd, &st) != 0)
        return 1;
    DataType *data = mmap(NULL, st.st_size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
    if (data == MAP_FAILED)
        return 1;
    /* Sort the mapped records in place; the kernel pages the file in and out as needed. */
    qsort(data, st.st_size / sizeof(*data), sizeof(*data), cmpfunc);
    if (0 != msync(data, st.st_size, MS_SYNC))
        return 1;
    if (-1 == munmap(data, st.st_size))
        return 1;
    if (0 != close(fd))
        return 1;
    return 0;
}
I can't imagine you can get much faster than this. Be sure you have enough virtual address space (1.5GB is pushing it but will probably just work on 32-bit Linux; you'll be able to manage this on any 64-bit OS). Note that this code is "limited" to working on a POSIX compliant system.
In terms of C and efficiency, this approach puts the entire operation in the hands of the OS, and the excellent qsort algorithm.
If retrieving time is exceeding 5 min it seems that you need to look at how you are reading this file. One thing that has caused bad performance for me is that a C implementation sometimes uses thread-safe I/O operations by default, and you can gain some speed by using thread-unsafe I/O.
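For example, on POSIX systems the unlocked stdio variants skip the per-call stream locking (getc_unlocked, flockfile and funlockfile are POSIX; whether this helps on your platform is something to measure). A sketch of the difference:
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>

/* Count newline bytes two ways; the unlocked variant avoids taking the
   stream lock on every character, which can matter in tight per-byte loops. */
static long count_newlines_locked(FILE *f) {
    long lines = 0;
    int c;
    while ((c = getc(f)) != EOF)   /* thread-safe: locks the FILE on each call */
        if (c == '\n') lines++;
    return lines;
}

static long count_newlines_unlocked(FILE *f) {
    long lines = 0;
    int c;
    flockfile(f);                           /* take the lock once... */
    while ((c = getc_unlocked(f)) != EOF)   /* ...then use unlocked reads */
        if (c == '\n') lines++;
    funlockfile(f);
    return lines;
}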
What kind of computer will this be run on? Many computers nowadays have several gigabytes of memory, so perhaps it will work to just read it all into memory and then sort it there (with, for example, qsort)?
