I need to get the memory usage of the current process in C. Can someone offer a code sample of how to do this on a Linux platform?
I'm aware of the cat /proc/<your pid>/status method of getting memory usage, but I have no idea how to capture that in C.
BTW, it's for a PHP extension I'm modifying (granted, I'm a C newbie). If there are shortcuts available within the PHP extension API, that would be even more helpful.
The getrusage library function returns a structure containing a whole lot of data about the current process, including these:
long ru_ixrss; /* integral shared memory size */
long ru_idrss; /* integral unshared data size */
long ru_isrss; /* integral unshared stack size */
However, the most up-to-date Linux documentation says this about these 3 fields:
(unmaintained) This field is currently unused on Linux
which the manual then defines as:
Not all fields are completed; unmaintained fields are set to zero by the kernel. (The unmaintained fields are provided for compatibility with other systems, and because they may one day be supported on Linux.)
See getrusage(2)
You can always just open the 'files' in the /proc system as you would a regular file (using the 'self' symlink so you don't have to look up your own pid):
FILE* status = fopen( "/proc/self/status", "r" );
Of course, you now have to parse the file to pick out the information you need.
This is a terribly ugly and non-portable way of getting the memory usage, but since getrusage()'s memory tracking is essentially useless on Linux, reading /proc/<pid>/statm is the only way I know of to get the information on Linux.
If anyone know of cleaner, or preferably more cross-Unix ways of tracking memory usage, I would be very interested in learning how.
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    unsigned long size, resident, share, text, lib, data, dt;
} statm_t;

/* Reads /proc/self/statm into *result; all values are in pages. */
void read_off_memory_status(statm_t *result)
{
    const char *statm_path = "/proc/self/statm";
    FILE *f = fopen(statm_path, "r");
    if (!f) {
        perror(statm_path);
        abort();
    }
    if (7 != fscanf(f, "%lu %lu %lu %lu %lu %lu %lu",
                    &result->size, &result->resident, &result->share,
                    &result->text, &result->lib, &result->data, &result->dt)) {
        perror(statm_path);
        abort();
    }
    fclose(f);
}
From the proc(5) man-page:
/proc/[pid]/statm
Provides information about memory usage, measured in pages.
The columns are:
size total program size
(same as VmSize in /proc/[pid]/status)
resident resident set size
(same as VmRSS in /proc/[pid]/status)
share shared pages (from shared mappings)
text text (code)
lib library (unused in Linux 2.6)
data data + stack
dt dirty pages (unused in Linux 2.6)
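If you want bytes rather than pages, a small usage sketch (my addition; it assumes the statm_t struct and read_off_memory_status() shown above) is to multiply by the page size from sysconf():
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    statm_t st;
    read_off_memory_status(&st);               /* values are in pages */

    long page_size = sysconf(_SC_PAGESIZE);    /* bytes per page */
    printf("virtual size: %lu bytes\n", st.size * (unsigned long)page_size);
    printf("resident set: %lu bytes\n", st.resident * (unsigned long)page_size);
    return 0;
}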
I came across this post: http://appcrawler.com/wordpress/2013/05/13/simple-example-of-tracking-memory-using-getrusage/
Simplified version:
#include <sys/resource.h>
#include <stdio.h>
int main() {
struct rusage r_usage;
getrusage(RUSAGE_SELF,&r_usage);
// Print the maximum resident set size used (in kilobytes).
printf("Memory usage: %ld kilobytes\n",r_usage.ru_maxrss);
return 0;
}
(tested in Linux 3.13)
#include <sys/resource.h>
#include <stdio.h>
#include <errno.h>

struct rusage memory;
if (getrusage(RUSAGE_SELF, &memory) == -1) {
    /* errno is only meaningful when the call actually fails */
    if (errno == EFAULT)
        printf("Error: EFAULT\n");
    else if (errno == EINVAL)
        printf("Error: EINVAL\n");
}
printf("Usage: %ld\n", memory.ru_ixrss);
printf("Usage: %ld\n", memory.ru_isrss);
printf("Usage: %ld\n", memory.ru_idrss);
printf("Max: %ld\n", memory.ru_maxrss);
I used this code but for some reason I get 0 all the time for all 4 printf()
I'm late to the party, but this might be helpful for anyone else looking for the resident and virtual (and their peak values so far) memories on linux.
It's probably pretty terrible, but it gets the job done.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/*
* Measures the current (and peak) resident and virtual memories
* usage of your linux C process, in kB
*/
void getMemory(
    int* currRealMem, int* peakRealMem,
    int* currVirtMem, int* peakVirtMem) {

    // stores each word in status file
    char buffer[1024] = "";

    // linux file contains this-process info
    FILE* file = fopen("/proc/self/status", "r");
    if (file == NULL)
        return;   // could not open /proc/self/status

    // read the entire file, word by word
    while (fscanf(file, " %1023s", buffer) == 1) {
        if (strcmp(buffer, "VmRSS:") == 0) {
            fscanf(file, " %d", currRealMem);
        }
        if (strcmp(buffer, "VmHWM:") == 0) {
            fscanf(file, " %d", peakRealMem);
        }
        if (strcmp(buffer, "VmSize:") == 0) {
            fscanf(file, " %d", currVirtMem);
        }
        if (strcmp(buffer, "VmPeak:") == 0) {
            fscanf(file, " %d", peakVirtMem);
        }
    }
    fclose(file);
}
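A minimal usage sketch (my addition, not part of the original answer):
#include <stdio.h>

int main(void)
{
    int currRealMem = 0, peakRealMem = 0, currVirtMem = 0, peakVirtMem = 0;

    getMemory(&currRealMem, &peakRealMem, &currVirtMem, &peakVirtMem);
    printf("RSS: %d kB (peak %d kB), virtual: %d kB (peak %d kB)\n",
           currRealMem, peakRealMem, currVirtMem, peakVirtMem);
    return 0;
}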
The above struct was taken from 4.3BSD Reno. Not all fields are meaningful under Linux. In Linux 2.4 only the fields ru_utime, ru_stime, ru_minflt, and ru_majflt are maintained. Since Linux 2.6, ru_nvcsw and ru_nivcsw are also maintained.
http://www.atarininja.org/index.py/tags/code
Suppose I'm running a piece of code on a 32 bit CPU with plenty of memory. The process uses mmap to map a total of 2.8GB worth of file into its address space, then tries to allocate 500MB of memory using malloc. The allocation is bound to fail and return NULL because there is not enough address space left, even though the system may have enough allocate-able memory.
The code looks something like this:
int main()
{
int fd = open("some_2.8GB file",...);
void* file_ptr = mmap(..., fd, ...);
void* ptr = malloc(500*1024*1024);
// malloc will fail because on 32bit Linux, a process can only have 3GB of address space
assert(ptr == NULL);
if(out_of_address_space())
printf("You ran out of address space, but system still have free memory\n");
else
printf("Out of memory\n");
}
How could I detect the failure is caused by out of address space instead of allocate-able memory? Is out_of_address_space possible to implement?
How could I detect the failure is caused by out of address space instead of allocate-able memory?
You could calculate the amount of maximum virtual memory like bash does in ulimit -v - by querying getrlimit().
You can calculate the amount of "allocated" virtual memory by summing the differences between second and first column in /proc/pid/maps file.
Then the difference will give you the amount of "free" virtual space. You can compare that with the size you want to allocate and know if there is enough free virtual space.
Example: Let's compile a small program:
$ gcc -xc - <<EOF
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
int main() {
void *p = malloc(1024 * 1024);
printf("%ld %p\\n", (long)getpid(), p);
sleep(100);
}
EOF
The program will allocate 1MB, print its pid and the returned address, and sleep so we have time to do something. On my system, if I limit virtual memory to 2.5M, the allocation fails:
$ ( ulimit -v 2500; ./a.out; )
94895 (nil)
If I then sum the maps file:
$ sudo cat /proc/94895/maps | awk -F'[- ]' --non-decimal-data '{a=sprintf("%d", "0x"$1); b=sprintf("%d", "0x"$2); sum += b-a; } END{print sum/1024 " Kb"}'
2320 Kb
Knowing that the limit was set to 2500 Kb and the process is using 2320 Kb, there is only space to allocate 180 Kb, not more.
Possible C implementation for fun:
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <stdbool.h>
size_t address_space_max(void) {
struct rlimit l;
if (getrlimit(RLIMIT_AS, &l) < 0) return -1;
return l.rlim_cur;
}
size_t address_space_used(void) {
const unsigned long long pid = getpid();
const char fmt[] = "/proc/%llu/maps";
const int flen = snprintf(NULL, 0, fmt, pid);
char * const fname = malloc(flen + 1);
if (fname == NULL) return -1;
sprintf(fname, fmt, pid);
FILE *f = fopen(fname, "r");
free(fname);
if (f == NULL) return -1;
unsigned long long a, b;
unsigned long long sum = 0;
while (fscanf(f, "%llx-%llx%*[^\n]", &a, &b) == 2) {
sum += b - a;
}
fclose(f);
return sum;
}
size_t address_space_free(void) {
const size_t max = address_space_max();
if (max == (size_t)-1) return -1;
const size_t used = address_space_used();
if (used == (size_t)-1) return -1;
return max - used;
}
/**
* Compares if there is enough address space for size
*/
bool out_of_address_space(size_t size) {
return address_space_free() < size;
}
int main() {
printf("%zu Kb\n", address_space_free()/1024);
// ie. use:
// if (out_of_address_space(500 * 1024 * 1024))
}
And a process uses mmap to map a total of 2.8GB worth of file into it's address space. Then the process tries to allocate 500MB of memory using malloc.
Don't mmap(2) the entire file at once !
Do mmap no more than one gigabyte (and no more than 2.5 gigabytes in total on a 32 bits Linux, including malloc-related mmap or sbrk). Then use mremap(2) and/or munmap(2). See also madvise(2). Be aware of the m modifier to fopen(3) mode string. In some cases, stdio(3) functions (using fseek and fread) might be enough and you could replace your mmap with them. Be aware of memory overcommitment and of the page cache. Both could be tunable thru /sys/ or /proc/ (see sysconf(3), sysfs(5), proc(5)...) and might be monitorable thru inotify(7) or userfaultfd(2) and/or signal(7).
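As a rough illustration of mapping a big file window by window instead of all at once (my sketch, not from the answer; the 64 MiB window size and the function name are arbitrary assumptions):
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define WINDOW (64UL * 1024 * 1024)   /* must be a multiple of the page size */

int process_in_windows(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) { perror(path); return -1; }

    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); close(fd); return -1; }

    for (off_t off = 0; off < st.st_size; off += WINDOW) {
        size_t len = (st.st_size - off < (off_t)WINDOW)
                     ? (size_t)(st.st_size - off) : WINDOW;
        void *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, off);
        if (p == MAP_FAILED) { perror("mmap"); close(fd); return -1; }

        /* ... scan or copy the bytes in p[0] .. p[len-1] here ... */

        munmap(p, len);
    }
    close(fd);
    return 0;
}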
Notice that malloc(3), dlopen(3) and shared libraries are also mmap-ing (thru ld.so(8)...) - and some malloc implementations are sometimes using sbrk(2) to manage small memory chunks. As an optimization, free(3) does not always munmap. Check with strace(1) and pmap(1) (or programmatically thru /proc/self/maps or /proc/self/status or /proc/self/statm, see proc(5)).
Some 32 bits Linux kernels could be specially configured (at their compile time) to accept slightly more than 3GBytes of virtual address space. I forgot the details. Ask on https://kernelnewbies.org/
Study the source code of your C standard library (e.g. GNU glibc). Most of them are open-source so you can improve them, e.g. musl-libc. You could use others (e.g. dietlibc), and you usually can redefine malloc. Budget a few months of efforts.
Read also Advanced Linux Programming (also here), Modern C, syscalls(2), the documentation of your C standard library, and a good operating system textbook.
The program works correctly in Linux, but I get extra characters after the end of file when running in Windows or through Wine. Not garbage but repeated text that was already written. The issue persists whether I write to stdout or a file, but doesn't occur with small files, a few hundred KB is needed.
I nailed down the issue to this function:
static unsigned long read_file(const char *filename, const char **output)
{
struct stat file_stats;
int fdescriptor;
unsigned long file_sz;
static char *file;
fdescriptor = open(filename, O_RDONLY);
if (fdescriptor < 0 || (fstat(fdescriptor ,&file_stats) < 0))
{ printf("Error opening file: %s \n", filename);
return (0);
}
if (file_stats.st_size < 0)
{ printf("file %s reports an Incorrect size", filename);
return (0);
}
file_sz = (unsigned long)file_stats.st_size;
file = malloc((file_sz) * sizeof(*file));
if (!file)
{ printf("Error allocating memory for file %s of size %lu\n", filename, file_sz);
return (0);
}
read(fdescriptor, file, file_sz);
*output = file;
write(STDOUT_FILENO, file, file_sz), exit(1); //this statement added for debugging.
return (file_sz);
}
I can't debug through Wine, much less in windows, but by using printf statements I can tell the file size is correct. The issue is either in the reading or the writing and without a debugger I can't look at the contents of the buffer in memory.
The program was compiled with x86_64-w64-mingw32-gcc, version 8.3. which is the same version of gcc in my system.
At this point I'm just perplexed; I would love to hear any ideas you may have.
Thank you.
Edit: The issue was that fewer bytes were being read than the reported file size and I was writing more than necessary. Thanks to Matt for telling me where to look.
Read can return a number of bytes different from the size reported by fstat. I was writing the reported file size instead of the actual number of bytes read, which caused the issue. When writing, use the byte count returned by read to avoid this.
It is always best to check the return values of read/write for failure and to make sure all bytes have been read, since read can return fewer bytes than requested when reading from a pipe or when interrupted by a signal, in which case multiple calls are necessary.
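For illustration, a short read loop along those lines (my sketch; the helper name read_full is made up):
#include <errno.h>
#include <unistd.h>

/* Keep calling read() until "want" bytes arrive, EOF, or a real error. */
ssize_t read_full(int fd, char *buf, size_t want)
{
    size_t got = 0;
    while (got < want) {
        ssize_t n = read(fd, buf + got, want - got);
        if (n == 0)              /* end of file */
            break;
        if (n < 0) {
            if (errno == EINTR)  /* interrupted by a signal: retry */
                continue;
            return -1;           /* real error */
        }
        got += (size_t)n;
    }
    return (ssize_t)got;         /* may be less than want only at EOF */
}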
Thanks to Mat and Felix for the answer.
I'm working on a project that is trying to search for specific bytes (e.g. 0xAB) in a filesystem (e.g. ext2). I was able to find what I needed using malloc(), realloc(), and memchr(), but it seemed slow so I was looking into using mmap(). What I am trying to do is find a specific bytes, then copy them into a struct, so I have two questions: (1) is using mmap() the best strategy, and (2) why isn't the following code working (I get EINVAL error)?
UPDATE: The following program compiles and runs but I still have a couple issues:
1) it won't display correct file size on large files (displayed correct size for 1GB flash drive, but not for 32GB)*.
2) it's not searching the mapping correctly**.
*Is THIS a possible solution to getting the correct size using stat64()? If so, is it something I add in my Makefile? I haven't worked with makefiles much so I don't know how to add something like that.
**Is this even the proper way to search?
#define _LARGEFILE64_SOURCE
#include <stdio.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <errno.h>

#define handle_error(msg) \
    do { perror(msg); exit(EXIT_FAILURE); } while (0)

int main(int argc, char **argv) {
    int fd = open("/dev/sdb1", O_RDONLY);
    if (fd < 0) {
        printf("Error %s\n", strerror(errno));
        return -1;
    }

    const char *map;
    off64_t size, i;
    off64_t sb_pos[6];
    int j = 0;

    size = lseek64(fd, 0, SEEK_END);
    printf("file size: %llu\n", (unsigned long long)size);
    lseek64(fd, 0, SEEK_SET);

    map = mmap(0, size, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) { handle_error("mmap error"); }

    printf("Searching for magic numbers...\n");
    /* start at 32 so map[i - 32] is valid; stop before the last byte so map[i + 1] is valid */
    for (i = 32; i + 1 < size; i++) {
        if (map[i] == 0X53 && map[i + 1] == 0XEF) {
            if ((map[i-32] == 0X00 && map[i-31] == 0X00) ||
                (map[i-32] == 0X01 && map[i-31] == 0X00) ||
                (map[i-32] == 0X02 && map[i-31] == 0X00)) {
                if (j <= 5) {
                    sb_pos[j] = i;
                    printf("superblock %d found\n", j);
                    ++j;
                } else
                    break;
            }
        }
    }

    int q;
    for (q = 0; q < j; q++) {
        printf("SUPERBLOCK[%d]: %lld\n", q + 1, (long long)sb_pos[q]);
    }

    munmap((void *)map, size);
    close(fd);
    return 0;
}
Thanks for your help.
mmap is a very efficient way to handle searching a large file, especially in cases where there's an internal structure you can use (e.g. using mmap on a large file with fixed-size records that are sorted would permit you to do a binary search, and only the pages corresponding to records read would be touched).
In your case you need to compile for 64 bits and enable large file support (and use open(2)).
If your /dev/sdb1 is a device and not a file, I don't think stat(2) will show an actual size. stat returns a size of 0 for these devices on my boxes. I think you'll need to get the size another way.
Regarding address space: x86-64 uses 2^48 bytes of virtual address space, which is 256 TiB. You can't use all of that, but there's easily ~127 TiB of contiguous address space in most processes.
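One Linux-specific way to get a block device's size (my addition, not part of the original answer) is the BLKGETSIZE64 ioctl, since fstat() reports 0 for such devices:
#include <fcntl.h>
#include <linux/fs.h>      /* BLKGETSIZE64 */
#include <stdint.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/dev/sdb1", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    uint64_t bytes = 0;
    if (ioctl(fd, BLKGETSIZE64, &bytes) < 0) { perror("ioctl"); return 1; }

    printf("device size: %llu bytes\n", (unsigned long long)bytes);
    close(fd);
    return 0;
}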
I just noticed that I was using fopen(), should I be using open() instead?
Yes, you should use open() instead of fopen(). And that's the reason why you got EINVAL error.
fopen("/dev/sdb1", O_RDONLY);
This code is totally incorrect. O_RDONLY is a flag that should be passed to the open() syscall, not to the fopen() libc function.
You should also note that mmaping of large files is available only if you are running on a platform with large virtual address space. It's obvious: you should have enough virtual memory to address your file. Speaking about Intel, it should be only x86_64, not x86_32.
I haven't tried to do this with really large files ( >4G). May be some additional flags are required to be passed into open() syscall.
I'm working on a project that is trying to search for specific bytes (e.g. 0xAB) in a filesystem (e.g. ext2)
To mmap() a large file into memory is the wrong approach in your case. You just need to process your file step by step in chunks of a fixed size (around 1MB). You can use mmap() or just read() it into your internal buffer - that doesn't matter. But putting a whole file into memory is total overkill if you just want to process it sequentially.
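A rough sketch of that chunked approach (my addition; the 1 MB chunk size, the device path and the 0x53 0xEF magic check are taken from the question or assumed):
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define CHUNK (1024 * 1024)

int main(void)
{
    static unsigned char buf[CHUNK + 1];   /* +1 for one byte of overlap */
    int fd = open("/dev/sdb1", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    off_t chunk_start = 0;   /* file offset corresponding to buf[0] */
    size_t carry = 0;        /* bytes kept from the previous chunk */
    for (;;) {
        ssize_t n = read(fd, buf + carry, CHUNK);
        if (n <= 0)
            break;
        size_t len = carry + (size_t)n;

        /* Scan the chunk for the two-byte magic number. */
        for (size_t i = 0; i + 1 < len; i++)
            if (buf[i] == 0x53 && buf[i + 1] == 0xEF)
                printf("magic at offset %lld\n", (long long)(chunk_start + (off_t)i));

        /* Keep the last byte so a match spanning the boundary is not missed. */
        buf[0] = buf[len - 1];
        chunk_start += (off_t)(len - 1);
        carry = 1;
    }
    close(fd);
    return 0;
}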
OK I know questions like this have been asked in various forms before and I have read them all and tried everything that has been suggested but I still cannot create a file that is more than 2GB on a 64bit system using malloc, open, lseek, blah blah every trick under the sun.
Clearly I'm writing c here. I'm running Fedora 20, I'm actually trying to mmap the file but that is not where it fails, my original method was to use open(), then lseek to the position where the file should end which in this case is at 3GB, edit: and then write a byte at the file end position to actually create the file of that size, and then mmap the file. I cannot lseek to past 2GB. I cannot malloc more than 2GB either. ulimit -a etc all show unlimited, /etc/security/limits.conf shows nothing, ....
When I try to lseek past 2GB I get EINVAL for errno and the return value of lseek is -1. Edit: The offset parameter to lseek is of type off_t, which is defined as a long int (64-bit signed), not size_t as I said previously.
edit:
I've already tried defining _LARGEFILE64_SOURCE & _FILE_OFFSET_BITS 64 and it made no difference.
I'm also compiling specifically for 64bit i.e. -m64
I'm lost. I have no idea why I cant do this.
Any help would be greatly appreciated.
Thanks.
edit: I've removed a lot of completely incorrect babbling on my part and some other unimportant ramblings that have been dealt with later on.
My 2GB problem was in the horribly sloppy interchanging of multiple different types, mixing of signed and unsigned being the problem. Essentially the 3GB position I was passing to lseek was being interpreted as a position of -1GB, and clearly lseek didn't like that. So my bad. Totally stupid.
I am going to change to using posix_fallocate() as p_l suggested. While it only removes one function call (posix_fallocate instead of an lseek and then a write), that isn't what matters; what matters is that posix_fallocate does exactly what I want directly, which the lseek method doesn't. So thanks in particular to p_l for suggesting that, and a special thanks to NominalAnimal, whose persistence that he knew better indirectly led me to the realisation that I can't count, which in turn led me to accept that posix_fallocate would work and so change to using it.
Regardless of the end method I used, the 2GB problem was entirely my own crap coding, and thanks again to EOF, chux, p_l and Jonathan Leffler, who all contributed information and suggestions that led me to the problem I had created for myself.
I've included a shorter version of this in an answer.
My 2GB problem was in the horribly sloppy interchanging of multiple different types, mixing of signed and unsigned being the problem. Essentially the 3GB position I was passing to lseek was being interpreted as a position of -1GB, and clearly lseek didn't like that. So my bad. Totally stupid crap coding.
Thanks again to EOF, chux, p_l and Jonathan Leffler, who all contributed information and suggestions that led me to the problem I'd created and its solution.
Thanks again to p_l for suggesting posix_fallocate(), and a special thanks to NominalAnimal, whose persistence that he knew better indirectly led me to the realisation that I can't count, which in turn led me to accept that posix_fallocate would work and so change to using it.
@p_l: although the solution to my actual problem wasn't in your answer, I'd still upvote your answer that suggested using posix_fallocate, but I don't have enough points to do that.
First of all, try:
//Before any includes:
#define _LARGEFILE64_SOURCE
#define _FILE_OFFSET_BITS 64
If that doesn't work, change lseek to lseek64 like this
lseek64(fd, 3221225472, SEEK_SET);
A better option than lseek might be posix_fallocate():
posix_fallocate(fd, 0, 3221225472);
before the call to mmap();
I recommend keeping the defines, though :)
This is a test program I created (a2b.c):
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <inttypes.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>
static void err_exit(const char *fmt, ...);
int main(void)
{
char const filename[] = "big.file";
int fd = open(filename, O_RDONLY);
if (fd < 0)
err_exit("Failed to open file %s for reading", filename);
struct stat sb;
fstat(fd, &sb);
uint64_t size = sb.st_size;
printf("File: %s; size %" PRIu64 "\n", filename, size);
assert(size > UINT64_C(3) * 1024 * 1024 * 1024);
off_t offset = UINT64_C(3) * 1024 * 1024 * 1024;
if (lseek(fd, offset, SEEK_SET) < 0)
err_exit("lseek failed");
close(fd);
_Static_assert(sizeof(size_t) > 4, "sizeof(size_t) is too small");
size = UINT64_C(3) * 1024 * 1024 * 1024;
void *space = malloc(size);
if (space == 0)
err_exit("failed to malloc %zu bytes", size);
*((char *)space + size - 1) = '\xFF';
printf("All OK\n");
return 0;
}
static void err_exit(const char *fmt, ...)
{
int errnum = errno;
va_list args;
va_start(args, fmt);
vfprintf(stderr, fmt, args);
va_end(args);
if (errnum != 0)
fprintf(stderr, ": (%d) %s", errnum, strerror(errnum));
putc('\n', stderr);
exit(1);
}
When compiled and run on a Mac (Mac OS X 10.9.2 Mavericks, GCC 4.8.2, 16 GiB physical RAM), with command line:
gcc -O3 -g -std=c11 -Wall -Wextra -Wmissing-prototypes -Wstrict-prototypes \
-Wold-style-definition -Werror a2b.c -o a2b
and having created big.file with:
dd if=/dev/zero of=big.file bs=1048576 count=5000
I got the reassuring output:
File: big.file; size 5242880000
All OK
I had to use _Static_assert rather than static_assert because the Mac <assert.h> header doesn't define static_assert. When I compiled with -m32, the static assert triggered.
When I ran it on an Ubuntu 13.10 64-bit VM with 1 GiB virtual physical memory (or is that tautological?), I not very surprisingly got the output:
File: big.file; size 5242880000
failed to malloc 3221225472 bytes: (12) Cannot allocate memory
I used exactly the same command line to compile the code; it compiled OK on Linux with static_assert in place of _Static_assert. The output of ulimit -a indicated that the maximum memory size was unlimited, but that means 'no limit smaller than that imposed by the amount of virtual memory on the machine' rather than anything bigger.
Note that my compilations did not explicitly include -m64 but they were automatically 64-bit compilations.
What do you get? Can dd create the big file? Does the code compile? (If you don't have C11 support in your compiler, you'll need to replace the static assert with a normal 'dynamic' assert, removing the error message.) Does the code run? What result do you get?
Here is an example program, example.c:
/* Not required on 64-bit architectures; recommended anyway. */
#define _FILE_OFFSET_BITS 64
/* Tell the compiler we do need POSIX.1-2001 features. */
#define _POSIX_C_SOURCE 200112L
/* Needed to get MAP_NORESERVE. */
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#ifndef FILE_NAME
#define FILE_NAME "data.map"
#endif
#ifndef FILE_SIZE
#define FILE_SIZE 3221225472UL
#endif
int main(void)
{
const size_t size = FILE_SIZE;
const char *const file = FILE_NAME;
size_t page;
unsigned char *data;
int descriptor;
int result;
/* First, obtain the normal page size. */
page = (size_t)sysconf(_SC_PAGESIZE);
if (page < 1) {
fprintf(stderr, "BUG: sysconf(_SC_PAGESIZE) returned an invalid value!\n");
return EXIT_FAILURE;
}
/* Verify the map size is a multiple of page size. */
if (size % page) {
fprintf(stderr, "Map size (%lu) is not a multiple of page size (%lu)!\n",
(unsigned long)size, (unsigned long)page);
return EXIT_FAILURE;
}
/* Create backing file. */
do {
descriptor = open(file, O_RDWR | O_CREAT | O_EXCL, 0600);
} while (descriptor == -1 && errno == EINTR);
if (descriptor == -1) {
fprintf(stderr, "Cannot create backing file '%s': %s.\n", file, strerror(errno));
return EXIT_FAILURE;
}
#ifdef FILE_ALLOCATE
/* Allocate disk space for backing file. */
do {
result = posix_fallocate(descriptor, (off_t)0, (off_t)size);
} while (result == -1 && errno == EINTR);
if (result == -1) {
fprintf(stderr, "Cannot resize and allocate %lu bytes for backing file '%s': %s.\n",
(unsigned long)size, file, strerror(errno));
unlink(file);
return EXIT_FAILURE;
}
#else
/* Backing file is sparse; disk space is not allocated. */
do {
result = ftruncate(descriptor, (off_t)size);
} while (result == -1 && errno == EINTR);
if (result == -1) {
fprintf(stderr, "Cannot resize backing file '%s' to %lu bytes: %s.\n",
file, (unsigned long)size, strerror(errno));
unlink(file);
return EXIT_FAILURE;
}
#endif
/* Map the file.
* If MAP_NORESERVE is not used, then the mapping size is limited
* to the amount of available RAM and swap combined in Linux.
* MAP_NORESERVE means that no swap is allocated for the mapping;
* the file itself acts as the backing store. That's why MAP_SHARED
* is also used. */
do {
data = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_NORESERVE,
descriptor, (off_t)0);
} while ((void *)data == MAP_FAILED && errno == EINTR);
if ((void *)data == MAP_FAILED) {
fprintf(stderr, "Cannot map file '%s': %s.\n", file, strerror(errno));
unlink(file);
return EXIT_FAILURE;
}
/* Notify of success. */
fprintf(stdout, "Mapped %lu bytes of file '%s'.\n", (unsigned long)size, file);
fflush(stdout);
#if defined(FILE_FILL)
memset(data, ~0UL, size);
#elif defined(FILE_ZERO)
memset(data, 0, size);
#elif defined(FILE_MIDDLE)
data[size/2] = 1; /* One byte in the middle set to one. */
#else
/*
* Do something with the mapping, data[0] .. data[size-1]
*/
#endif
/* Unmap. */
do {
result = munmap(data, size);
} while (result == -1 && errno == EINTR);
if (result == -1)
fprintf(stderr, "munmap(): %s.\n", strerror(errno));
/* Close the backing file. */
result = close(descriptor);
if (result)
fprintf(stderr, "close(): %s.\n", strerror(errno));
#ifndef FILE_KEEP
/* Remove the backing file. */
result = unlink(file);
if (result)
fprintf(stderr, "unlink(): %s.\n", strerror(errno));
#endif
/* We keep the file. */
fprintf(stdout, "Done.\n");
fflush(stdout);
return EXIT_SUCCESS;
}
To compile and run, use e.g.
gcc -W -Wall -O3 -DFILE_KEEP -DFILE_MIDDLE example.c -o example
./example
The above will create a three-gigabyte (3 × 1024^3 = 3221225472 bytes) sparse file data.map, and set the middle byte in it to 1 (\x01). All other bytes in the file remain zeroes. You can then run
du -h data.map
to see how much such a sparse file actually takes on-disk, and
hexdump -C data.map
if you wish to verify the file contents are what I claim they are.
There are a few compile-time flags (macros) you can use to change how the example program behaves:
'-DFILE_NAME="filename"'
Use file name filename instead of data.map. Note that the entire value is defined inside single quotes, so that the shell does not parse the double quotes. (The double quotes are part of the macro value.)
'-DFILE_SIZE=(1024*1024*1024)'
Use a 1024^3 = 1073741824 byte mapping instead of the default 3221225472. If the expression contains special characters the shell would try to evaluate, it is best to enclose it all in single or double quotes.
-DFILE_ALLOCATE
Allocate actual disk space for the entire mapping. By default, a sparse file is used instead.
-DFILE_FILL
Fill the entire mapping with (unsigned char)(~0UL), typically 255.
-DFILE_ZERO
Clear the entire mapping to zero.
-DFILE_MIDDLE
Set the middle byte in the mapping to 1. All other bytes are unchanged.
-DFILE_KEEP
Do not delete the data file. This is useful to explore how much data the mapping actually requires on disk; use e.g. du -h data.map.
There are three key limitations to consider when using memory-mapped files in Linux:
File size limits
Older file systems like FAT (MS-DOS) do not support large files, or sparse files. Sparse files are useful if the dataset is sparse (contains large holes); in that case the unset parts are not stored on disk, and simply read as zeroes.
Because many filesystems have problems with reads and writes larger than 2^31-1 bytes (2147483647 bytes), current Linux kernels internally limit each single operation to 2^31-1 bytes. The read or write call does not fail, it just returns a short count. I am not aware of any filesystem similarly limiting the llseek() syscall, but since the C library is responsible for mapping the lseek()/lseek64() functions to the proper syscalls, it is quite possible the C library (and not the kernel) limits the functionality. (In the case of the GNU C library and Embedded GNU C library, such syscall mapping depends on the compile-time flags. For example, see man 7 feature_test_macros, man 2 lseek and man 3 lseek64.)
Finally, file position handling is not atomic in most Linux kernels. (Patches are upstream, but I'm not sure which releases contain them.) This means that if more than one thread uses the same descriptor in a way that modifies the file position, it is possible the file position gets completely garbled.
Memory limits
By default, file-backed memory maps are still subject to available memory and swap limits. That is, default mmap() behaviour is to assume that at memory pressure, dirty pages are swapped, not flushed to disk. You'll need to use the Linux-specific MAP_NORESERVE flag to avoid those limits.
Address space limits
On 32-bit Linux systems, the address space available to a userspace process is typically less than 4 GiB; it is a kernel compile-time option.
On 64-bit Linux systems, large mappings consume significant amounts of RAM, even if the mapping contents themselves are not yet faulted in. Typically, each single page requires 8 bytes of metadata ("page table entry") in memory, or more, depending on architecture. Using 4096-byte pages, this means a minimum overhead of 0.1953125%, and setting up e.g. a terabyte map requires two gigabytes of RAM just in page table structures!
Many 64-bit systems in Linux support huge pages to avoid that overhead. In most cases, huge pages are of limited use due to the configuration and tweaking and limitations. Kernels also may have limitations on what a process can do with a huge page mapping; a robust application would need thorough fallbacks to normal page mappings.
The kernel may impose stricter limits than resource availability to user-space processes. Run bash -c 'ulimit -a' to see the currently-imposed limits. (Details are available in the ulimit section in man bash-builtins.)
This looks like a simple question, but I didn't find anything similar here.
Since there is no file copy function in C, we have to implement file copying ourselves, but I don't like reinventing the wheel even for trivial stuff like that, so I'd like to ask the cloud:
What code would you recommend for file copying using fopen()/fread()/fwrite()?
What code would you recommend for file copying using open()/read()/write()?
This code should be portable (windows/mac/linux/bsd/qnx/younameit), stable, time tested, fast, memory efficient and etc. Getting into specific system's internals to squeeze some more performance is welcomed (like getting filesystem cluster size).
This seems like a trivial question but, for example, source code for CP command isn't 10 lines of C code.
This is the function I use when I need to copy from one file to another - with test harness:
/*
#(#)File: $RCSfile: fcopy.c,v $
#(#)Version: $Revision: 1.11 $
#(#)Last changed: $Date: 2008/02/11 07:28:06 $
#(#)Purpose: Copy the rest of file1 to file2
#(#)Author: J Leffler
#(#)Modified: 1991,1997,2000,2003,2005,2008
*/
/*TABSTOP=4*/
#include "jlss.h"
#include "stderr.h"
#ifndef lint
/* Prevent over-aggressive optimizers from eliminating ID string */
const char jlss_id_fcopy_c[] = "#(#)$Id: fcopy.c,v 1.11 2008/02/11 07:28:06 jleffler Exp $";
#endif /* lint */
void fcopy(FILE *f1, FILE *f2)
{
char buffer[BUFSIZ];
size_t n;
while ((n = fread(buffer, sizeof(char), sizeof(buffer), f1)) > 0)
{
if (fwrite(buffer, sizeof(char), n, f2) != n)
err_syserr("write failed\n");
}
}
#ifdef TEST
int main(int argc, char **argv)
{
FILE *fp1;
FILE *fp2;
err_setarg0(argv[0]);
if (argc != 3)
err_usage("from to");
if ((fp1 = fopen(argv[1], "rb")) == 0)
err_syserr("cannot open file %s for reading\n", argv[1]);
if ((fp2 = fopen(argv[2], "wb")) == 0)
err_syserr("cannot open file %s for writing\n", argv[2]);
fcopy(fp1, fp2);
return(0);
}
#endif /* TEST */
Clearly, this version uses file pointers from standard I/O and not file descriptors, but it is reasonably efficient and about as portable as it can be.
Well, except the error function - that's peculiar to me. As long as you handle errors cleanly, you should be OK. The "jlss.h" header declares fcopy(); the "stderr.h" header declares err_syserr() amongst many other similar error reporting functions. A simple version of the function follows - the real one adds the program name and does some other stuff.
#include "stderr.h"
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
void err_syserr(const char *fmt, ...)
{
int errnum = errno;
va_list args;
va_start(args, fmt);
vfprintf(stderr, fmt, args);
va_end(args);
if (errnum != 0)
fprintf(stderr, "(%d: %s)\n", errnum, strerror(errnum));
exit(1);
}
The code above may be treated as having a modern BSD license or GPL v3 at your choice.
As far as the actual I/O goes, the code I've written a million times in various guises for copying data from one stream to another goes something like this. It returns 0 on success, or -1 with errno set on error (in which case any number of bytes might have been copied).
Note that for copying regular files, you can skip the EAGAIN stuff, since regular files are always blocking I/O. But inevitably if you write this code, someone will use it on other types of file descriptors, so consider it a freebie.
There's a file-specific optimisation that GNU cp does, which I haven't bothered with here, that for long blocks of 0 bytes instead of writing you just extend the output file by seeking off the end.
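For illustration only (the answer mentions this optimisation but does not implement it), a hedged sketch of the idea: skip an all-zero buffer by seeking forward instead of writing, which leaves a hole in the output file.
#include <string.h>
#include <unistd.h>

/* Sketch: returns 0 on success, -1 on error. Assumes fdout is a regular
 * file; the caller should ftruncate() to the final size at the end so a
 * trailing hole is not silently dropped. */
static int write_block_sparse(int fdout, const char *buf, size_t len)
{
    size_t i;
    for (i = 0; i < len && buf[i] == 0; i++)
        ;
    if (i == len)   /* whole buffer is zero bytes: just extend the file */
        return lseek(fdout, (off_t)len, SEEK_CUR) < 0 ? -1 : 0;

    return write(fdout, buf, len) == (ssize_t)len ? 0 : -1;
}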
// headers needed by this helper and the copy routines below
#include <errno.h>
#include <poll.h>
#include <stdlib.h>
#include <unistd.h>

void block(int fd, int event) {
    struct pollfd topoll;
    topoll.fd = fd;
    topoll.events = event;
    poll(&topoll, 1, -1);
    // no need to check errors - if the stream is bust then the
    // next read/write will tell us
}
int copy_data_buffer(int fdin, int fdout, void *buf, size_t bufsize) {
for(;;) {
char *pos; // char *, not void *, so the pointer arithmetic below is well-defined
// read data to buffer
ssize_t bytestowrite = read(fdin, buf, bufsize);
if (bytestowrite == 0) break; // end of input
if (bytestowrite == -1) {
if (errno == EINTR) continue; // signal handled
if (errno == EAGAIN) {
block(fdin, POLLIN);
continue;
}
return -1; // error
}
// write data from buffer
pos = buf;
while (bytestowrite > 0) {
ssize_t bytes_written = write(fdout, pos, bytestowrite);
if (bytes_written == -1) {
if (errno == EINTR) continue; // signal handled
if (errno == EAGAIN) {
block(fdout, POLLOUT);
continue;
}
return -1; // error
}
bytestowrite -= bytes_written;
pos += bytes_written;
}
}
return 0; // success
}
// Default value. I think it will get close to maximum speed on most
// systems, short of using mmap etc. But porters / integrators
// might want to set it smaller, if the system is very memory
// constrained and they don't want this routine to starve
// concurrent ops of memory. And they might want to set it larger
// if I'm completely wrong and larger buffers improve performance.
// It's worth trying several MB at least once, although with huge
// allocations you have to watch for the linux
// "crash on access instead of returning 0" behaviour for failed malloc.
#ifndef FILECOPY_BUFFER_SIZE
#define FILECOPY_BUFFER_SIZE (64*1024)
#endif
int copy_data(int fdin, int fdout) {
// optional exercise for reader: take the file size as a parameter,
// and don't use a buffer any bigger than that. This prevents
// memory-hogging if FILECOPY_BUFFER_SIZE is very large and the file
// is small.
for (size_t bufsize = FILECOPY_BUFFER_SIZE; bufsize >= 256; bufsize /= 2) {
void *buffer = malloc(bufsize);
if (buffer != NULL) {
int result = copy_data_buffer(fdin, fdout, buffer, bufsize);
free(buffer);
return result;
}
}
// could use a stack buffer here instead of failing, if desired.
// 128 bytes ought to fit on any stack worth having, but again
// this could be made configurable.
return -1; // errno is ENOMEM
}
To open the input file:
int fdin = open(infile, O_RDONLY|O_BINARY, 0);
if (fdin == -1) return -1;
Opening the output file is tricksy. As a basis, you want:
int fdout = open(outfile, O_WRONLY|O_BINARY|O_CREAT|O_TRUNC, 0x1ff);
if (fdout == -1) {
close(fdin);
return -1;
}
But there are confounding factors:
you need to special-case when the files are the same, and I can't remember how to do that portably.
if the output filename is a directory, you might want to copy the file into the directory.
if the output file already exists (open with O_EXCL to determine this and check for EEXIST on error), you might want to do something different, as cp -i does.
you might want the permissions of the output file to reflect those of the input file.
you might want other platform-specific meta-data to be copied.
you may or may not wish to unlink the output file on error.
Obviously the answers to all these questions could be "do the same as cp". In which case the answer to the original question is "ignore everything I or anyone else has said, and use the source of cp".
Btw, getting the filesystem's cluster size is next to useless. You'll almost always see speed increasing with buffer size long after you've passed the size of a disk block.
The size of each read needs to be a multiple of 512 (the sector size); 4096 is a good choice.
Here is a very easy and clear example: Copy a file. Since it is written in ANSI-C without any particular function calls I think this one would be pretty much portable.
Depending on what you mean by copying a file, it is certainly far from trivial. If you mean copying the content only, then there is almost nothing to do. But generally, you need to copy the metadata of the file, and that's surely platform dependent. I don't know of any C library which does what you want in a portable manner. Just handling the filename by itself is no trivial matter if you care about portability.
In C++, there is the file library in boost
One thing I found when implementing my own file copy, and it seems obvious but it's not: I/O's are slow. You can pretty much time your copy's speed by how many of them you do. So clearly you need to do as few of them as possible.
The best results I found were when I got myself a ginormous buffer, read the entire source file into it in one I/O, then wrote the entire buffer back out in one I/O. If I even had to do it in 10 batches, it got way slow. Trying to read and write each byte, like a naive coder might try first, was just painful.
The accepted answer written by Steve Jessop does not answer the first part of the question; Jonathan Leffler's does, but does it wrong: the code should be written as
while ((n = fread(buffer, 1, sizeof(buffer), f1)) > 0)
if (fwrite(buffer, n, 1, f2) != 1)
/* we got write error here */
/* test ferror(f1) for a read errors */
Explanation:
sizeof(char) = 1 by definition, always: it does not matter how many bits are in it, 8 (in most cases), 9, 11 or 32 (on some DSPs, for example) — the size of char is one. Note that this is not an error here, just redundant code.
The fwrite function writes up to nmemb (second argument) elements of the specified size (third argument); it is not required to write exactly nmemb elements. To fix this you must either write the rest of the data that was read, or just write one element of size n and let fwrite do all the work. (It is debatable whether fwrite must write all the data or not, but in my version a short write is impossible until an error occurs.)
You should test for read errors too: just test ferror(f1) at the end of the loop.
Note that you probably need to disable buffering on both input and output files to prevent triple buffering: first on read into f1's buffer, second in our code, third on write into f2's buffer:
setvbuf(f1, NULL, _IONBF, 0);
setvbuf(f2, NULL, _IONBF, 0);
(Internal buffers should, probably, be of size BUFSIZ.)