I can get the address of the end of the heap with sbrk(0), but is there any way to programmatically get the address of the start of the heap, other than by parsing the contents of /proc/self/maps?
I think parsing /proc/self/maps is the only reliable way on Linux to find the heap segment. Also keep in mind that some allocators (including the one on my SLES system) use mmap() for large blocks, so that memory is not part of the heap at all and can end up at any location.
Otherwise, ld normally adds a symbol called _end that marks the end of all segments in the ELF image. E.g.:
extern void *_end;
printf( "%p\n", &_end );
It matches the end of .bss, traditionally the last segment of the ELF image. The heap normally follows that address, after some alignment. Stacks and mmap()ed regions (including shared libraries) sit at higher addresses in the address space.
I'm not sure how portable this is, but apparently it works the same way on Solaris 10. On HP-UX 11 the map looks different and the heap appears to be merged with the data segment, but allocations do happen after _end. On AIX, procmap doesn't show a heap/data segment at all, but allocations likewise get addresses past the _end symbol. So at the moment it seems to be quite portable.
Though, all considered, I'm not sure how useful that is.
P.S. The test program:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h> /* for sleep() */
char *ppp1 = "hello world";
char ppp0[] = "hello world";
extern void *_end; /* any type would do, only its address is important */
int main()
{
void *p = calloc(10000,1);
printf( "end:%p heap:%p rodata:%p data:%p\n", &_end, p, ppp1, ppp0 );
sleep(10000); /* sleep to give chance to look at the process memory map */
return 0;
}
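The /proc/self/maps approach mentioned at the top of this answer could be sketched roughly as follows. This is a minimal, Linux-only sketch; find_heap_range is a hypothetical helper name, and it assumes the kernel labels the brk area with "[heap]", which only appears once something has actually grown the heap.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the original answer): look up the [heap]
 * mapping of the current process in /proc/self/maps.
 * Returns 0 and fills *start / *end on success, -1 if the file cannot be
 * read or no [heap] line exists yet. */
int find_heap_range(unsigned long *start, unsigned long *end)
{
    FILE *f = fopen("/proc/self/maps", "r");
    char line[512];
    int result = -1;

    if (f == NULL)
        return -1;

    while (fgets(line, sizeof line, f) != NULL) {
        /* Each line starts with "<start>-<end>" in hexadecimal. */
        if (strstr(line, "[heap]") != NULL &&
            sscanf(line, "%lx-%lx", start, end) == 2) {
            result = 0;
            break;
        }
    }
    fclose(f);
    return result;
}
```

The start value should correspond to what sbrk(0) returned before the first allocation, and the end value to the current sbrk(0).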
You may call sbrk(0) to get the start of the heap, but you have to make sure no memory has been allocated yet.
The best way to do this is to assign the return value at the very beginning of main(). Note that many functions do allocate memory under the hood, so a call to sbrk(0) after a printf, a memory utility like mtrace or even a call to putenv will already return an offset value.
Although much of what we can find says that the heap is right next to .bss, I am not sure what lies between end and the first break. Reading from there seems to result in a segmentation fault.
The difference between the first break and the first address returned by malloc consists of, among (probably) other things:
the head of the memory double-linked-list, including the next free block
a structure prefixed to the malloced block including:
the length of this block
the address of the previous free block
the address of the next free block
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <errno.h>
void print_heap_line();
int main(int argc, char const *argv[])
{
char* startbreak = sbrk(0);
printf("pid: %d\n", getpid()); // printf is allocating memory
char* lastbreak = sbrk(0);
printf("heap: [%p - %p]\n", startbreak, lastbreak);
long pagesize = sysconf(_SC_PAGESIZE);
long diff = lastbreak - startbreak;
printf("diff: %ld (%ld pages of %ld bytes)\n", diff, diff/pagesize, pagesize);
print_heap_line();
printf("\n\npress a key to finish...");
getchar(); // gives you a chance to inspect /proc/pid/maps yourself
return 0;
}
void print_heap_line() {
int mapsfd = open("/proc/self/maps", O_RDONLY);
if(mapsfd == -1) {
fprintf(stderr, "open() failed: %s.\n", strerror(errno));
exit(1);
}
char maps[BUFSIZ] = "";
if(read(mapsfd, maps, BUFSIZ - 1) == -1){ // leave room for the terminating NUL
fprintf(stderr, "read() failed: %s.\n", strerror(errno));
exit(1);
}
if(close(mapsfd) == -1){
fprintf(stderr, "close() failed: %s.\n", strerror(errno));
exit(1);
}
for(char* line = strtok(maps, "\n"); line != NULL; line = strtok(NULL, "\n")) { // check every line, including the first
if(strstr(line, "heap") != NULL) {
printf("\n\nfrom /proc/self/maps:\n%s\n", line);
return;
}
}
}
pid: 29825
heap: [0x55fe05739000 - 0x55fe0575a000]
diff: 135168 (33 pages of 4096 bytes)
from /proc/self/maps:
55fe05739000-55fe0575a000 rw-p 00000000 00:00 0 [heap]
press a key to finish...
I am learning about file descriptors by using the open, write and close functions. What I expect is a printf statement outputting the file descriptor after the open function, and another printf statement outputting a confirmation of the text being written. However, I get the following result:
.\File.exe "this is a test"
[DEBUG] buffer # 0x00b815c8: 'this is a test'
[DEBUG] datafile # 0x00b81638: 'C:\Users\____\Documents\Notes'
With a blank space where the further debugging output should be. The code block for the section is:
strcpy(buffer, argv[1]); //copy first vector into the buffer
printf("[DEBUG] buffer \t # 0x%08x: \'%s\'\n", buffer, buffer); //debug buffer
printf("[DEBUG] datafile # 0x%08x: \'%s\'\n", datafile, datafile); //debug datafile
strncat(buffer, "\n", 1); //adds a newline
fd = open(datafile, O_WRONLY|O_CREAT|O_APPEND, S_IRUSR|S_IWUSR); //opens file
if(fd == -1)
{
fatal("in main() while opening file");
}
printf("[DEBUG] file descriptor is %d\n", fd);
if(write(fd, buffer, strlen(buffer)) == -1) //wrting data
{
fatal("in main() while writing buffer to file");
}
if(close(fd) == -1) //closing file
{
fatal("in main() while closing file");
}
printf("Note has been saved.");
I basically copied the code word for word from the book I'm studying, so how could it not work?
The problem is that the printf function does not display anything, and the file descriptor is not returned.
Here is the full code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <fcntl.h>
#include <sys/stat.h>
void usage(char *pnt, char *opnt) //usage function
{
printf("Usage: %s <data to add to \"%s\">", pnt, opnt);
exit(0);
}
void fatal(char*); //fatal function for errors
void *ec_malloc(unsigned int); //wrapper for malloc error checking
int main(int argc, char *argv[]) //initiates argumemt vector/count variables
{
int fd; //file descriptor
char *buffer, *datafile;
buffer = (char*) ec_malloc(100); //buffer given 100 bytes of ec memory
datafile = (char*) ec_malloc(20); //datafile given 20 bytes of ec memory
strcpy(datafile, "C:\\Users\\____\\Documents\\Notes");
if(argc < 2) //if argument count is less than 2 i.e. no arguments provided
{
usage(argv[0], datafile); //print usage message from usage function
}
strcpy(buffer, argv[1]); //copy first vector into the buffer
printf("[DEBUG] buffer \t # %p: \'%s\'\n", buffer, buffer); //debug buffer
printf("[DEBUG] datafile # %p: \'%s\'\n", datafile, datafile); //debug datafile
strncat(buffer, "\n", 1); //adds a newline
fd = open(datafile, O_WRONLY|O_CREAT|O_APPEND, S_IRUSR|S_IWUSR); //opens file
if(fd == -1)
{
fatal("in main() while opening file");
}
printf("[DEBUG] file descriptor is %d\n", fd);
if(write(fd, buffer, strlen(buffer)) == -1) //wrting data
{
fatal("in main() while writing buffer to file");
}
if(close(fd) == -1) //closing file
{
fatal("in main() while closing file");
}
printf("Note has been saved.");
free(buffer);
free(datafile);
}
void fatal(char *message)
{
char error_message[100];
strcpy(error_message, "[!!] Fatal Error ");
strncat(error_message, message, 83);
perror(error_message);
exit(-1);
}
void *ec_malloc(unsigned int size)
{
void *ptr;
ptr = malloc(size);
if(ptr == NULL)
{
fatal("in ec_malloc() on memory allocation");
return ptr;
}
}
EDIT: the issue has been fixed. The reason for this bug was that the memory allocated within the ec_malloc function was not sufficient, which meant that the text could not be saved. I changed the byte value to 100 and the code now works.
I am not sure which compiler you are using, but the one I tried the code with (GCC) says:
main.c:34:5: warning: ‘strncat’ specified bound 1 equals source length [-Wstringop-overflow=]
34 | strncat(buffer, "\n", 1); //adds a newline
| ^~~~~~~~~~~~~~~~~~~~~~~~
In other words, the call to strncat in your code is highly suspicious. You are trying to append a single line-break character, which has a length of 1, and you pass that length as the third argument. But the third argument of strncat is the maximum number of characters to append, and the usual idiom is to pass the space remaining in buffer (minus one for the terminator), not the length of the string being appended.
A correct call would look a bit like this:
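size_t bufferLength = 100;
char* buffer = malloc(bufferLength);
buffer[0] = '\0'; /* strncat needs a valid, NUL-terminated destination */
strncat(buffer, "\n", bufferLength - strlen(buffer) - 1); /* remaining space, minus the terminator */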
In this case, however, you are saved, because strncat guarantees that the resulting buffer is NUL-terminated, meaning that it always writes one additional byte beyond the specified size.
All of this is complicated, and a common source of bugs. It's easier to simply use snprintf to build up the entire string at one go:
size_t bufferLength = 100;
char* buffer = malloc(bufferLength);
snprintf(buffer, bufferLength, "%s\n", argv[1]);
Another bug in your code is the ec_malloc function:
void *ec_malloc(unsigned int size)
{
void *ptr;
ptr = malloc(size);
if(ptr == NULL)
{
fatal("in ec_malloc() on memory allocation");
return ptr;
}
}
See if you can spot it: what happens if ptr is not NULL? Well, nothing! The function doesn't return a value in this case; execution just falls off the end.
If you're using GCC (and possibly other compilers) on x86, this code will appear to work fine, because the result of the malloc function will remain in the proper CPU register to serve as the result of the ec_malloc function. But the fact that it just happens to work by the magic of circumstance does not make it correct code. It is subject to stop working at any time, and it should be fixed. The function deserves a return value!
GCC does not warn about this mistake by default in C (it does once you enable -Wall, which includes -Wreturn-type), but Clang does:
<source>:64:1: warning: non-void function does not return a value in all control paths [-Wreturn-type]
}
^
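A corrected version of the wrapper might look like the sketch below. The switch from unsigned int to size_t and the direct use of perror instead of the question's fatal() are my own choices to keep the example self-contained, not something the original code does.

```c
#include <stdio.h>
#include <stdlib.h>

/* Error-checked malloc: terminates the program on failure and otherwise
 * ALWAYS returns the allocated pointer. */
void *ec_malloc(size_t size)
{
    void *ptr = malloc(size);

    if (ptr == NULL) {
        perror("[!!] Fatal Error in ec_malloc() on memory allocation");
        exit(EXIT_FAILURE);
    }
    return ptr; /* the return statement now sits outside the error branch */
}
```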
The major bug in your code is a buffer overrun. At the top, you allocate only 20 bytes for the datafile buffer:
datafile = (char*) ec_malloc(20); //datafile given 20 bytes of ec memory
which means it can only store 19 characters plus the terminating NUL. However, you proceed to write in more than that:
strcpy(datafile, "C:\\Users\\____\\Documents\\Notes");
As written, that string is 29 characters long (each \\ in the source is a single backslash in memory), not including the terminating NUL, so it needs a buffer of at least 30 bytes. With a buffer that is too small, the strcpy call overruns it: a classic "buffer overrun", which is undefined behavior and here manifests itself as corruption of your program's heap and thus premature termination.
Again, when I tried compiling and running the code, the program aborted at run time with this message from glibc:
malloc(): corrupted top size
glibc detected that you had overrun the dynamically allocated memory (returned by malloc). It was able to do this because, under the hood, malloc keeps bookkeeping information right after the allocated block, and your write past the end of the block had clobbered that information.
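One way to avoid guessing the size is to derive it from the string itself. The following is a minimal sketch; copy_string is a hypothetical helper name (POSIX strdup does essentially the same job).

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocate exactly strlen(src) + 1 bytes and copy
 * src into them. Returns NULL if the allocation fails; the caller frees. */
char *copy_string(const char *src)
{
    size_t size = strlen(src) + 1; /* +1 for the terminating NUL */
    char *dst = malloc(size);

    if (dst != NULL)
        memcpy(dst, src, size);
    return dst;
}
```

With this, `datafile = copy_string("C:\\Users\\____\\Documents\\Notes");` gets a 30-byte buffer without any hand counting.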
The whole code is a bit suspect; it was not written by someone who knows C very well, nor was it debugged or reviewed by anyone else.
There is no real need to use dynamic memory allocation here in order to allocate fixed-size buffers. If you're going to use dynamic memory allocation, then allocate the actual amount of space that you need. Otherwise, if you're allocating fixed-size buffers, then just allocate on the stack.
Don't bother with complex string-manipulation functions when you can get away with simply using snprintf.
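As an illustration of both points, here is a sketch that fills a caller-supplied (for example stack-allocated) buffer with snprintf and reports truncation. build_note is a hypothetical name, not part of the original program.

```c
#include <stdio.h>

/* Hypothetical helper: write "<text>\n" into dst, which has room for n bytes.
 * Returns 0 on success, -1 if the result did not fit (dst is then truncated
 * but still NUL-terminated) or if snprintf reported an error. */
int build_note(char *dst, size_t n, const char *text)
{
    int written = snprintf(dst, n, "%s\n", text);

    if (written < 0 || (size_t)written >= n)
        return -1;
    return 0;
}
```

In main this could be used as `char buffer[100]; if (build_note(buffer, sizeof buffer, argv[1]) == -1) { /* argument too long */ }`, with no malloc or free involved.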
And as a bonus tip: when debugging problems, try to reduce the code down as small as you can get it. None of the file I/O stuff was related to this problem, so when I was analyzing this code, I replaced that whole section with:
printf("[DEBUG] file descriptor is %d\n", 42);
Once the rest of the code is working, I can go back and add the real code back to that section, and then test it. (Which I didn't do, because I don't have a file system handy to test this.)
Dynamic memory and I are not friends. I always run into problems with it, even though the task is simple to understand.
Task: Write a function readText that reads an arbitrary text (finalized by return) from the user and returns it as a string. In this first version of the function assume that the text can't be longer than a certain length (e.g.1000 characters). After the text has been read, the memory should be shortened to the minimal needed length.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXC 5
char *readText(int*lengh);
int main()
{
char *str= malloc(MAXC*sizeof(char));
if(str == NULL) {
printf("Kein virtueller RAM mehr verfügbar ...\n");
return EXIT_FAILURE;
}
int length=0;
str=readText(&length);
printf("Text: %s %d %c\n",str,length,*str);
str= realloc(str,length+1);
if(str == NULL) {
printf("Kein virtueller RAM mehr verfügbar ...\n");
return EXIT_FAILURE;
}
printf("Text: %s\n",str);
free(str);
printf("free\n");
return 0;
}
char *readText(int*lengh){
char *result1;
char result[MAXC];
printf("Read Text: ");
scanf("%s",&result);
result1=result;
*lengh=strlen(result);
return result1;
}
Results (the garbled string only started happening a moment ago; before that, my only problem was with the realloc):
Read Text: hoi
Text: h╠ ` 3 h
Kein virtueller RAM mehr verf³gbar ... (No virtual RAM available)
Process returned 1 (0x1)
My worry is that my program is ok, but my RAM is not. So if this is the case or in general, please tell me how to fix RAM problems too. Would be amazing
Thanks for looking at this and help me to improve.
The function readText returns the address of a local variable. You cannot realloc memory that was not obtained by malloc (or calloc, or strdup, etc.), and the local variable from the function readText was certainly not obtained from malloc. So realloc fails.
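A version that allocates the buffer with malloc inside the function and shrinks it afterwards might look like this sketch. The FILE * parameter (pass stdin to read from the user), the size_t length, and the use of fgets instead of scanf("%s") are my own assumptions; the task statement does not prescribe them.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAXC 1000 /* maximum text length allowed by the task */

/* Read one line (up to MAXC characters) from `in`, store its length in
 * *length, and return a heap buffer shrunk to exactly length + 1 bytes.
 * Returns NULL on allocation failure or end of input; the caller frees. */
char *readText(FILE *in, size_t *length)
{
    char *buf = malloc(MAXC + 2); /* text + '\n' + terminating NUL */
    char *shrunk;

    if (buf == NULL)
        return NULL;
    if (fgets(buf, MAXC + 2, in) == NULL) {
        free(buf);
        return NULL;
    }
    buf[strcspn(buf, "\n")] = '\0'; /* drop the trailing newline, if any */
    *length = strlen(buf);

    shrunk = realloc(buf, *length + 1);
    return shrunk != NULL ? shrunk : buf; /* on failure, buf is still valid */
}
```

Because the returned pointer comes from malloc/realloc, main can pass it to free (or realloc) safely, and it no longer needs its own malloc before the call.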
Suppose I'm running a piece of code on a 32-bit CPU with plenty of memory, and a process uses mmap to map a total of 2.8GB worth of file into its address space. Then the process tries to allocate 500MB of memory using malloc. The allocation is bound to fail and return NULL because there is not enough address space, even though the system may have enough allocatable memory.
The code looks something like this:
int main()
{
int fd = open("some_2.8GB file",...);
void* file_ptr = mmap(..., fd, ...);
void* ptr = malloc(500*1024*1024);
// malloc will fail because on 32bit Linux, a process can only have 3GB of address space
assert(ptr == NULL);
if(out_of_address_space())
printf("You ran out of address space, but system still have free memory\n");
else
printf("Out of memory\n");
}
How could I detect the failure is caused by out of address space instead of allocate-able memory? Is out_of_address_space possible to implement?
How could I detect the failure is caused by out of address space instead of allocate-able memory?
You could calculate the amount of maximum virtual memory like bash does in ulimit -v - by querying getrlimit().
You can calculate the amount of "allocated" virtual memory by summing the differences between second and first column in /proc/pid/maps file.
Then the difference will give you the amount of "free" virtual space. You can compare that with the size you want to allocate and know if there is enough free virtual space.
Example: Let's compile a small program:
$ gcc -xc - <<EOF
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
int main() {
void *p = malloc(1024 * 1024);
printf("%ld %p\\n", (long)getpid(), p);
sleep(100);
}
EOF
The program will allocate 1MB, print its pid and the address, and sleep so we have time to do something. On my system, if I limit virtual memory to 2.5M the allocation fails:
$ ( ulimit -v 2500; ./a.out; )
94895 (nil)
If I then sum the maps file:
$ sudo cat /proc/94895/maps | awk -F'[- ]' --non-decimal-data '{a=sprintf("%d", "0x"$1); b=sprintf("%d", "0x"$2); sum += b-a; } END{print sum/1024 " Kb"}'
2320 Kb
Knowing that the limit was set to 2500 Kb and the process is using 2320 Kb, there is only space to allocate 180 Kb, not more.
Possible C implementation for fun:
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <stdbool.h>
size_t address_space_max(void) {
struct rlimit l;
if (getrlimit(RLIMIT_AS, &l) < 0) return -1;
return l.rlim_cur;
}
size_t address_space_used(void) {
const unsigned long long pid = getpid();
const char fmt[] = "/proc/%llu/maps";
const int flen = snprintf(NULL, 0, fmt, pid);
char * const fname = malloc(flen + 1);
if (fname == NULL) return -1;
sprintf(fname, fmt, pid);
FILE *f = fopen(fname, "r");
free(fname);
if (f == NULL) return -1;
unsigned long long a, b; /* %llx expects unsigned long long */
unsigned long long sum = 0;
while (fscanf(f, "%llx-%llx%*[^\n]", &a, &b) == 2) {
sum += b - a;
}
fclose(f);
return sum;
}
size_t address_space_free(void) {
const size_t max = address_space_max();
if (max == (size_t)-1) return -1;
const size_t used = address_space_used();
if (used == (size_t)-1) return -1;
return max - used;
}
/**
* Compares if there is enough address space for size
*/
bool out_of_address_space(size_t size) {
return address_space_free() < size;
}
int main() {
printf("%zu Kb\n", address_space_free()/1024);
// ie. use:
// if (out_of_address_space(500 * 1024 * 1024))
}
And a process uses mmap to map a total of 2.8GB worth of file into its address space. Then the process tries to allocate 500MB of memory using malloc.
Don't mmap(2) the entire file at once!
Map no more than one gigabyte at once (and no more than 2.5 gigabytes in total on 32-bit Linux, including malloc-related mmap or sbrk). Then use mremap(2) and/or munmap(2). See also madvise(2). Be aware of the m modifier to the fopen(3) mode string. In some cases, stdio(3) functions (using fseek and fread) might be enough and you could replace your mmap with them. Be aware of memory overcommitment and of the page cache. Both can be tuned through /sys/ or /proc/ (see sysconf(3), sysfs(5), proc(5)...) and might be monitored through inotify(7) or userfaultfd(2) and/or signal(7).
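The "map a piece, use it, unmap it" idea could be sketched as below. sum_file_windowed is a hypothetical function that just adds up all bytes of a file as a stand-in for real processing; it assumes window is a non-zero multiple of the page size, because mmap offsets must be page-aligned.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical example: process a file through a sliding mmap window so
 * that at most `window` bytes of address space are in use at any time.
 * Returns 0 and stores the byte sum in *sum, or -1 on error. */
int sum_file_windowed(const char *path, size_t window, uint64_t *sum)
{
    struct stat st;
    int fd = open(path, O_RDONLY);

    if (fd == -1)
        return -1;
    if (fstat(fd, &st) == -1) {
        close(fd);
        return -1;
    }

    *sum = 0;
    for (off_t off = 0; off < st.st_size; off += (off_t)window) {
        off_t remaining = st.st_size - off;
        size_t len = remaining < (off_t)window ? (size_t)remaining : window;
        unsigned char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, off);

        if (p == MAP_FAILED) {
            close(fd);
            return -1;
        }
        for (size_t i = 0; i < len; i++)
            *sum += p[i];
        munmap(p, len); /* release the address space before the next window */
    }
    close(fd);
    return 0;
}
```

On a 32-bit build the file offsets only reach past 2GB if the program is compiled with _FILE_OFFSET_BITS set to 64.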
Notice that malloc(3), dlopen(3) and shared libraries are also mmap-ing (thru ld.so(8)...) - and some malloc implementations are sometimes using sbrk(2) to manage small memory chunks. As an optimization, free(3) does not always munmap. Check with strace(1) and pmap(1) (or programmatically thru /proc/self/maps or /proc/self/status or /proc/self/statm, see proc(5)).
Some 32-bit Linux kernels can be specially configured (at compile time) to accept slightly more than 3GB of virtual address space. I forgot the details. Ask on https://kernelnewbies.org/
Study the source code of your C standard library (e.g. GNU glibc). Most of them are open-source so you can improve them, e.g. musl-libc. You could use others (e.g. dietlibc), and you usually can redefine malloc. Budget a few months of efforts.
Read also Advanced Linux Programming, Modern C, syscalls(2), the documentation of your C standard library, and a good operating system textbook.
I am having trouble trying to figure out why my program cannot save more than 2GB of data to a file. I cannot tell if this is a programming or environment (OS) problem. Here is my source code:
#define _LARGEFILE_SOURCE
#define _LARGEFILE64_SOURCE
#define _FILE_OFFSET_BITS 64
#include <math.h>
#include <time.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/*-------------------------------------*/
//for file mapping in Linux
#include<fcntl.h>
#include<unistd.h>
#include<sys/stat.h>
#include<sys/time.h>
#include<sys/mman.h>
#include<sys/types.h>
/*-------------------------------------*/
#define PERMS 0600
#define NEW(type) (type *) malloc(sizeof(type))
#define FILE_MODE (S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH)
void write_result(char *filename, char *data, long long length){
int fd, fq;
fd = open(filename, O_RDWR|O_CREAT|O_LARGEFILE, 0644);
if (fd < 0) {
perror(filename);
return -1;
}
if (ftruncate(fd, length) < 0)
{
printf("[%d]-ftruncate64 error: %s/n", errno, strerror(errno));
close(fd);
return 0;
}
fq = write (fd, data,length);
close(fd);
return;
}
main()
{
long long offset = 3000000000; // 3GB
char * ttt;
ttt = (char *)malloc(sizeof(char) *offset);
printf("length->%lld\n",strlen(ttt)); // length=0
memset (ttt,1,offset);
printf("length->%lld\n",strlen(ttt)); // length=3GB
write_result("test.big",ttt,offset);
return 1;
}
According to my tests, the program can create a file larger than 2GB and can allocate that much memory as well.
The weird thing happens when I try to write data into the file. I checked the file and it is empty, although it is supposed to be filled with 1s.
Can any one be kind and help me with this?
You need to read a little more about C strings and what malloc and calloc do.
In your original main ttt pointed to whatever garbage was in memory when malloc was called. This means a nul terminator (the end marker of a C String, which is binary 0) could be anywhere in the garbage returned by malloc.
Also, since malloc does not touch every byte of the allocated memory (and you're asking for a lot) you could get sparse memory which means the memory is not actually physically available until it is read or written.
calloc allocates and fills the allocated memory with 0. It is a little more prone to fail because of this (it touches every byte allocated, so if the OS left the allocation sparse it will not be sparse after calloc fills it.)
Here's your code with fixes for the above issues.
You should also always check the return value from write and react accordingly. I'll leave that to you...
main()
{
long long offset = 3000000000; // 3GB
char * ttt;
//ttt = (char *)malloc(sizeof(char) *offset);
ttt = (char *)calloc( sizeof( char ), offset ); // instead of malloc( ... )
if( !ttt )
{
puts( "calloc failed, bye bye now!" );
exit( 87 );
}
printf("length->%zu\n",strlen(ttt)); // length=0 (This now works as expected if calloc does not fail)
memset( ttt, 1, offset );
ttt[offset - 1] = 0; // Now it's nul terminated and the printf below will work
printf("length->%zu\n",strlen(ttt)); // length=offset-1, just under 3GB; %zu is the specifier for size_t
write_result("test.big",ttt,offset);
return 1;
}
Note to Linux gurus... I know sparse may not be the correct term. Please correct me if I'm wrong as it's been a while since I've been buried in Linux minutiae. :)
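On the point about checking the return value of write: a single write call may transfer fewer bytes than requested, and on Linux it transfers at most 0x7ffff000 bytes (just under 2GB) per call, so a 3GB buffer needs a loop in any case. A minimal sketch, with write_all as a hypothetical helper name:

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until all `length` bytes have
 * been written. Returns 0 on success, -1 on error (errno is set by write). */
int write_all(int fd, const char *data, size_t length)
{
    while (length > 0) {
        ssize_t n = write(fd, data, length);

        if (n == -1) {
            if (errno == EINTR)
                continue; /* interrupted before writing anything: retry */
            return -1;
        }
        data += n;            /* skip what has already been written */
        length -= (size_t)n;
    }
    return 0;
}
```

In write_result, `fq = write (fd, data,length);` would then become `if (write_all(fd, data, length) == -1) perror("write");`.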
Looks like you're hitting the internal file system's limitation for the iDevice: ios - Enterprise app with more than resource files of size 2GB
2GB+ files are simply not possible there. If you need to store that much data, you should consider using some other tool or writing a file chunk manager.
I'm going to go out on a limb here and say that your problem may lie in memset().
The best thing to do here is, I think, after memset() ing it,
for (unsigned long i = 0; i < 3000000000; i++) {
if (ttt[i] != 1) { printf("error in data at location %lu", i); break; }
}
Once you've validated that the data you're trying to write is correct, then you should look into writing a smaller file such as 1GB and see if you have the same problems. Eliminate each and every possible variable and you will find the answer.
The following program is killed by the kernel when memory runs out. I would like to know when the global variable errno gets set to "ENOMEM".
#define MEGABYTE 1024*1024
#define TRUE 1
int main(int argc, char *argv[]){
void *myblock = NULL;
int count = 0;
while(TRUE)
{
myblock = (void *) malloc(MEGABYTE);
if (!myblock) break;
memset(myblock,1, MEGABYTE);
printf("Currently allocating %d MB\n",++count);
}
exit(0);
}
First, fix your kernel not to overcommit:
echo "2" > /proc/sys/vm/overcommit_memory
Now malloc should behave properly.
As "R" hinted, the problem is the default behaviour of Linux memory management, which is "overcommitting". This means that the kernel claims to have allocated the memory successfully, but doesn't actually allocate it until later, when you try to access it. If the kernel finds out that it has handed out too much memory, it kills a process with "the OOM (Out Of Memory) killer" to free some up. The way it picks the process to kill is complicated, but if you have just allocated most of the memory in the system, it's probably going to be your process that gets the bullet.
If you think this sounds crazy, some people would agree with you.
To get it to behave as you expect, as R said:
echo "2" > /proc/sys/vm/overcommit_memory
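To check from inside a program which policy is currently active, the same file can be read back. This is a minimal Linux-only sketch; read_overcommit_mode is a hypothetical helper name.

```c
#include <stdio.h>

/* Hypothetical helper: return the current vm.overcommit_memory setting
 * (0 = heuristic overcommit, 1 = always overcommit, 2 = strict accounting),
 * or -1 if the file cannot be read or parsed. */
int read_overcommit_mode(void)
{
    FILE *f = fopen("/proc/sys/vm/overcommit_memory", "r");
    int mode = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &mode) != 1)
        mode = -1;
    fclose(f);
    return mode;
}
```

Only in mode 2 can the allocation loop from the question be expected to end with malloc returning NULL rather than with the OOM killer.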
It happens when you try to allocate too much memory at once.
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
int main(int argc, char *argv[])
{
void *p;
p = malloc(1024L * 1024 * 1024 * 1024);
if(p == NULL)
{
printf("%d\n", errno);
perror("malloc");
}
}
In your case the OOM killer is getting to the process first.
I think errno will be set to ENOMEM:
Macro defined in errno.h (not stdio.h). Here is the documentation.
#define ENOMEM 12 /* Out of Memory */
After you call malloc in this statement:
myblock = (void *) malloc(MEGABYTE);
and the function returns NULL because the system is out of memory.
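A small sketch of that check, assuming a POSIX system where a failed malloc sets errno. alloc_errno is a hypothetical helper name, and the volatile write is only there so an optimizing compiler cannot remove the malloc/free pair.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: try to allocate `size` bytes.
 * Returns 0 if the allocation succeeded (the block is freed again),
 * otherwise the errno value left behind by malloc, normally ENOMEM. */
int alloc_errno(size_t size)
{
    void *p;

    errno = 0;
    p = malloc(size);
    if (p == NULL)
        return errno;

    if (size > 0)
        ((volatile char *)p)[0] = 0; /* touch the block so it is really used */
    free(p);
    return 0;
}
```

For example, `alloc_errno(SIZE_MAX)` cannot be satisfied on any system and is expected to yield ENOMEM, whereas with overcommit enabled a merely "large" request may succeed and only fail later, when the memory is touched.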
I found this SO question very interesting.
Hope it helps!