malloc() does not allocate memory for large blocks of memory

I am trying to write a program that allocates memory, fills the allocated memory with data (in this case the character 'A'), repeats this a number of times, and prints how long each fill takes in milliseconds.
The program is supposed to allocate a multiple of 100 MB each time, up to 9000 MB. I am doing this to demonstrate how the OS behaves when it runs out of free memory and starts using swap space. However, I am having trouble getting it to work.
When I run the program it behaves as expected until it reaches 2100 MB. Then it stops allocating memory, and after a while the error handler kicks in and quits the program because malloc() returns NULL.
This is the code:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <sys/time.h>
#include <assert.h>
#include <unistd.h>
#define KILO 1024
#define MEGA (KILO*KILO)
#define INCREMENT 100
int diffTime(struct timeval * startTime, struct timeval * endTime) {
return ((endTime->tv_sec - startTime->tv_sec)*1000 + (endTime->tv_usec- startTime->tv_usec)/1000);
}
int createBigDatablock(long int storlek) {
char *arr = (char*) malloc(storlek*MEGA);
if (arr==NULL){
printf("Error: not allocating memory");
exit (-1);
}
for(int i = 0;i<storlek*MEGA;i++){
arr[i] = 'A';
}
fflush(stdin);
getchar();
free(arr);
return 0;
}
int main(void) {
long int i;
struct timeval startT, endT;
// struct timezone tzp;
for(i=INCREMENT;i<=9000;i=i+INCREMENT){
gettimeofday(&startT, NULL); //&tzp); /* hämta starttid */
createBigDatablock(i);
gettimeofday(&endT, NULL);//&tzp); /* hämta sluttid */
printf("Datablock %ld MB took %d msec\n",i, diffTime(&startT,&endT));
}
return 0;
}
I have tried running this on a virtual machine with Debian 8, 4 GB of memory and 570 MB of swap. The memory is never completely filled and the swap space is never touched. If someone can help me figure this out I will be grateful.

You probably have a 32-bit system, where a single process typically cannot allocate more than about 2 gigabytes of memory.
If malloc returns NULL, that means that it couldn't allocate memory.
So the behaviour of your program is normal.
Another possibility is that it simply cannot find a large enough free contiguous memory block.
The swap space may never be touched simply because there is always enough free memory available: your virtual machine has 4 GB of memory.
If you want to use the swap space, you could try to allocate lots of smaller chunks of memory, for example 40 chunks of 100 MB; the total amount of memory you are able to allocate may then also be higher than 2100 MB.

I stumbled on this code and compiled it on a Fedora 34 x86_64 system. It would segfault at Datablock 2100, in the loop that writes an 'A' to each memory location. I changed the loop counter from int i to long int i; my system ground almost to a halt, but watching free in another terminal showed that it was filling swap. I have not fully tested it, but it seems to be working. There are two i variables in this program (one in main and one in createBigDatablock), and perhaps they're considered separate.
Original loop:
for(int i = 0;i<storlek*MEGA;i++){
arr[i] = 'A';
}
Changed loop:
for(long int i = 0;i<storlek*MEGA;i++){
arr[i] = 'A';
}

Related

calloc function generating zeros instead of memory addresses

I'm trying to create a graph with 264346 positions. Do you know why, when it reaches about 26,000 positions, calloc stops returning memory addresses (e.g. 89413216) and starts returning zeros (0), and then all the processes on my computer crash?
calloc should fill the allocated memory with zeros, but it should not return zero as the address at this point in my code.
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <time.h>
#include <string.h>
#include <limits.h>
int maxV;
struct grafo {
int NumTotalVertices;
int NumVertices;
int NumArestas;
int **p;
};
typedef struct grafo MGrafo;
MGrafo* Aloca_grafo(int NVertices);
int main(){
MGrafo *MatrizGrafo;
MatrizGrafo = Aloca_grafo(264346);
return 0;
}
MGrafo* Aloca_grafo(int NVertices) {
int i, k;
MGrafo *Grafo ;
Grafo = (MGrafo*) malloc(sizeof(MGrafo));
Grafo->p = (int **) malloc(NVertices*sizeof(int*));
for(i=0; i<NVertices+1; i++){
Grafo->p[i] = (int*) calloc(NVertices,sizeof(int));// error at this point
//printf("%d - (%d)\n", i, Grafo->p[i]); // see impression
}
printf("%d - (%d)\n", i, Grafo->p[i]);
Grafo->NumTotalVertices = NVertices;
Grafo->NumArestas = 0;
Grafo->NumVertices = 0;
return Grafo;
}

You surely don't mean what you have in your code:
Grafo = (MGrafo*)malloc(sizeof(MGrafo));
Grafo->p = (int**)malloc(NVertices * sizeof(int*)); // <=== 264000 int pointers
for (i = 0; i < NVertices + 1; i++) { // <=== for each of those 264000 int pointers
Grafo->p[i] = (int*)calloc(NVertices, sizeof(int)); // <=== allocate 264000 ints
I ran this on my machine.
Its fans turned on, meaning it was trying very, very hard.
By the time the loop had reached only 32000 it had already allocated 33 GB of memory.
I think you only need to allocate one set of integers. Since I can't tell what you are trying to do, it's hard to know which allocation to remove, but this creates a 2D array of 264000 by 264000, which is huge (~70 billion ints = ~280 GB of memory). Surely you don't mean that.
OK, taking a comment from below, maybe you do mean it.
If this is what you really want then you are going to need a very chunky computer and a lot of time.
Plus you are definitely going to have to test the return value of those calloc and malloc calls to make sure that every allocation works.
A lot of the time you will see answers on SO saying 'check the return from malloc', but in fact most modern OSes on modern hardware will rarely fail memory allocations. But here you are pushing the edge, so test every one.
'Generating zeros' is how calloc tells you it failed: it returned NULL.
https://linux.die.net/man/3/calloc
Return Value
The malloc() and calloc() functions return a pointer to the allocated memory that is suitably aligned for any kind of variable. On error, these functions return NULL. NULL may also be returned by a successful call to malloc() with a size of zero, or by a successful call to calloc() with nmemb or size equal to zero.

Why does malloc() cause minor page fault?

I'm trying to learn about memory and page fault, so I wrote the code below to check my understanding. I don't understand why calling malloc caused MINFL to increase since malloc() shouldn't affect physical memory (from what I understand).
This is my code:
#include <stdio.h>
#include <stdlib.h>
void main() {
printf("Before malloc\n");
getchar();
malloc(1 << 20);
printf("After malloc\n");
getchar();
}
These are the terminal results of ps command.
Before malloc:
After malloc:
There are 2 things I don't understand:
why does MINFL increase?
why does VSZ increase by 1028 and not 1024?
Please help and Thank you.

The answer to both of them is the same and very simple indeed.
As you might know, glibc malloc will use mmap to directly allocate a block larger than 128 KiB. However, it needs to write bookkeeping information just below the returned pointer, because how else would free know what should be done when given just a pointer? If you print the pointer that malloc returned, you'll see that it is not page-aligned.
Here's a program that demonstrates all this:
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <sys/resource.h>
#define ALLOCATION_SIZE (1 << 20)
int main(void) {
struct rusage usage = {0};
getrusage(RUSAGE_SELF, &usage);
printf("1st before malloc: %lu\n", usage.ru_minflt);
getrusage(RUSAGE_SELF, &usage);
printf("2nd before malloc: %lu\n", usage.ru_minflt);
char *p = malloc(ALLOCATION_SIZE);
printf("pointer returned from malloc: %p\n", p);
getrusage(RUSAGE_SELF, &usage);
printf("after malloc: %lu\n", usage.ru_minflt);
p[0] = 42;
getrusage(RUSAGE_SELF, &usage);
printf("after writing to the beginning of the allocation: %lu\n", usage.ru_minflt);
for (size_t i = 0; i < ALLOCATION_SIZE; i++) {
p[i] = 42;
}
getrusage(RUSAGE_SELF, &usage);
printf("after writing to every byte of the allocation: %lu\n", usage.ru_minflt);
}
outputs something like
1st before malloc: 108
2nd before malloc: 118
pointer returned from malloc: 0x7fbcb32aa010
after malloc: 119
after writing to the beginning of the allocation: 119
after writing to every byte of the allocation: 375
i.e. getrusage and printf cause page faults the first time around, so we call it twice - now the fault count is 118 before the malloc call, and after malloc it is 119. If you look at the pointer, 0x010 is not 0x000 i.e. the allocation is not page-aligned - those first 16 bytes contain bookkeeping information for free so that it knows that it needs to use munmap to release the memory block, and the size of the allocated block!
Now naturally this explains why the size increase was 1028 Ki instead of 1024 Ki - one extra page had to be reserved so that there would be enough space for those 16 bytes! It also explains the source of the page fault - because malloc had to write the bookkeeping information to the copy-on-write zeroed page. This can be proved by writing to the first byte of the allocation - it doesn't cause a page fault any longer.
Finally the for loop will modify the pages and touch the remaining 256 pages out of those 257 mapped in.
And if you change ALLOCATION_SIZE to ((1 << 20) - 16), i.e. allocate just 16 bytes less, you'd see that both the virtual size and the number of page faults match the values you were expecting.

Why is function pointer 12 bytes long?

I've been inspecting the heap memory when executing the following code:
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <stdio.h>
#include <sys/types.h>
struct data {
char name[64];
};
struct fp {
int (*fp)();
};
void winner()
{
printf("level passed\n");
}
void nowinner()
{
printf("level has not been passed\n");
}
int main(int argc, char **argv)
{
struct data *d;
struct fp *f;
d = malloc(sizeof(struct data));
f = malloc(sizeof(struct fp));
f->fp = nowinner;
printf("data is at %p, fp is at %p\n", d, f);
strcpy(d->name, argv[1]);
f->fp();
}
The code is compiled like this:
gcc winner.c -w -g -fno-stack-protector -z norelro -z execstack -o winner
(More code will be added later on, which is why flags like -fno-stack-protector are there.)
I executed the program with the argument "HELLO", set a breakpoint on f->fp() and inspected the heap memory:
Everything after the first malloc() makes sense, but I'm puzzled about what happened after the second malloc(). The second chunk should only request 4 bytes of memory to store the function pointer, but instead it took 12 bytes, which is reflected in what is stored at address 0x804a04c (4 bytes of metadata + 12 bytes of requested memory + status bit 1 = 17 => 0x00000011).
And as you can see, the function pointer only took up four bytes at 0x804a050, holding the address of nowinner (0x080484a1).
I read up on this SO post and this article, but they still don't seem to explain why.

Your initial question can be answered very easily by printing sizeof of your pointer. You will not see a 12 here.
The answer to your question "Why is function pointer 12 bytes long?" is simply: "It's not!"
But your question describes a different underlying question:
"Why does allocating 4 bytes take 12 bytes on the heap?"
You are under the wrong impression that memory allocation only takes exactly what is needed to store the user data.
This is wrong.
Memory management also needs to store some management data for each allocation.
When you call free the runtime library needs to know the size of the allocated block.
Therefore you can take it for granted that every allocation consumes more memory than the requested amount.
Depending on the implementation of the heap, this management data can be stored within the heap itself or in a separate area.
You also cannot rely on the overhead being the same for each allocation. There are weird implementations out there.
Some implementations take the requested amount and add a fixed length of management data.
Some implementations use a buddy system and follow a sequence of Fibonacci numbers to determine the smallest suitable block size.

Why doesn't this memory eater really eat memory?

I want to create a program that will simulate an out-of-memory (OOM) situation on a Unix server. I created this super-simple memory eater:
#include <stdio.h>
#include <stdlib.h>
unsigned long long memory_to_eat = 1024 * 50000;
size_t eaten_memory = 0;
void *memory = NULL;
int eat_kilobyte()
{
memory = realloc(memory, (eaten_memory * 1024) + 1024);
if (memory == NULL)
{
// realloc failed here - we probably can't allocate more memory for whatever reason
return 1;
}
else
{
eaten_memory++;
return 0;
}
}
int main(int argc, char **argv)
{
printf("I will try to eat %i kb of ram\n", memory_to_eat);
int megabyte = 0;
while (memory_to_eat > 0)
{
memory_to_eat--;
if (eat_kilobyte())
{
printf("Failed to allocate more memory! Stucked at %i kb :(\n", eaten_memory);
return 200;
}
if (megabyte++ >= 1024)
{
printf("Eaten 1 MB of ram\n");
megabyte = 0;
}
}
printf("Successfully eaten requested memory!\n");
free(memory);
return 0;
}
It eats as much memory as defined in memory_to_eat, which is currently about 50 GB of RAM. It grows the allocation by 1 KB at a time and prints the exact point where it fails to allocate more, so that I know the maximum it managed to eat.
The problem is that it works. Even on a system with 1 GB of physical memory.
When I check top I see that the process eats 50 GB of virtual memory and only less than 1 MB of resident memory. Is there a way to create a memory eater that really does consume it?
System specifications: Linux kernel 3.16 (Debian) most likely with overcommit enabled (not sure how to check it out) with no swap and virtualized.

When your malloc() implementation requests memory from the system kernel (via an sbrk() or mmap() system call), the kernel only makes a note that you have requested the memory and where it is to be placed within your address space. It does not actually map those pages yet.
When the process subsequently accesses memory within the new region, the hardware raises a page fault and alerts the kernel to the condition. The kernel then looks up the page in its own data structures, finds that you should have a zero page there, maps in a zero page (possibly first evicting a page from the page cache) and returns from the interrupt. Your process does not realize that any of this happened; the kernel's operation is perfectly transparent (except for the short delay while the kernel does its work).
This optimization allows the system call to return very quickly and, most importantly, it avoids committing any resources to your process when the mapping is made. This allows processes to reserve rather large buffers that they never need under normal circumstances, without fear of gobbling up too much memory.
So, if you want to program a memory eater, you absolutely have to actually do something with the memory you allocate. For this, you only need to add a single line to your code:
int eat_kilobyte()
{
if (memory == NULL)
memory = malloc(1024);
else
memory = realloc(memory, (eaten_memory * 1024) + 1024);
if (memory == NULL)
{
return 1;
}
else
{
//Force the kernel to map the containing memory page.
((char*)memory)[1024*eaten_memory] = 42;
eaten_memory++;
return 0;
}
}
Note that it is perfectly sufficient to write to a single byte within each page (which contains 4096 bytes on X86). That's because all memory allocation from the kernel to a process is done at memory page granularity, which is, in turn, because of the hardware that does not allow paging at smaller granularities.

All the virtual pages start out copy-on-write mapped to the same zeroed physical page. To use up physical pages, you can dirty them by writing something to each virtual page.
If running as root, you can use mlock(2) or mlockall(2) to have the kernel wire up the pages when they're allocated, without having to dirty them. (Normal non-root users have a ulimit -l of only 64 KiB.)
As many others suggested, it seems that the Linux kernel doesn't really allocate the memory unless you write to it.
An improved version of the code, which does what the OP was wanting:
This also fixes the printf format string mismatches with the types of memory_to_eat and eaten_memory, using %zi to print size_t integers. The memory size to eat, in kiB, can optionally be specified as a command line arg.
The messy design using global variables, and growing by 1k instead of 4k pages, is unchanged.
#include <stdio.h>
#include <stdlib.h>
size_t memory_to_eat = 1024 * 50000;
size_t eaten_memory = 0;
char *memory = NULL;
void write_kilobyte(char *pointer, size_t offset)
{
int size = 0;
while (size < 1024)
{ // writing one byte per page is enough, this is overkill
pointer[offset + (size_t) size++] = 1;
}
}
int eat_kilobyte()
{
if (memory == NULL)
{
memory = malloc(1024);
} else
{
memory = realloc(memory, (eaten_memory * 1024) + 1024);
}
if (memory == NULL)
{
return 1;
}
else
{
write_kilobyte(memory, eaten_memory * 1024);
eaten_memory++;
return 0;
}
}
int main(int argc, char **argv)
{
if (argc >= 2)
memory_to_eat = atoll(argv[1]);
printf("I will try to eat %zi kb of ram\n", memory_to_eat);
int megabyte = 0;
int megabytes = 0;
while (memory_to_eat-- > 0)
{
if (eat_kilobyte())
{
printf("Failed to allocate more memory at %zi kb :(\n", eaten_memory);
return 200;
}
if (megabyte++ >= 1024)
{
megabytes++;
printf("Eaten %i MB of ram\n", megabytes);
megabyte = 0;
}
}
printf("Successfully eaten requested memory!\n");
free(memory);
return 0;
}

A sensible optimisation is being made here. The runtime does not actually acquire the memory until you use it.
A simple memcpy will be sufficient to circumvent this optimisation. (You might find that calloc still optimises out the memory allocation until the point of use.)

Not sure about this one, but the only explanation that I can think of is that Linux is a copy-on-write operating system. When one calls fork, both processes point to the same physical memory. The memory is only copied once one process actually WRITES to it.
I think here, the actual physical memory is only allocated when one tries to write something to it. Calling sbrk or mmap may well only update the kernel's memory bookkeeping. The actual RAM may only be allocated when we actually try to access the memory.

Basic Answer
As mentioned by others, allocating memory does not always commit the necessary RAM until the memory is used. This happens if you allocate a buffer larger than one page (usually 4 KB on Linux).
One simple answer would be for your "eat memory" function to always allocate 1 KB instead of increasingly larger blocks. This is because each allocated block starts with a header (a size for allocated blocks). So allocating a buffer of a size equal to or less than one page will always commit all of those pages.
Following Your Idea
To optimize your code as much as possible, you want to allocate blocks of memory aligned to 1 page size.
From what I can see in your code, you use 1024. I would suggest that you use:
int size;
size_t block_size;
size = getpagesize();
block_size = (size_t)size - sizeof(void *) * 2;
What voodoo magic is this sizeof(void *) * 2?! When using the default memory allocation library (i.e. not SAN, fence, valgrind, ...), there is a small header just before the pointer returned by malloc() which includes a pointer to the next block and a size.
struct mem_header { void * next_block; intptr_t size; };
Now, using block_size, all your malloc() should be aligned to the page size we found earlier.
If you want to properly align everything, the first allocation needs to use an aligned allocation:
char *p = NULL;
int err = posix_memalign((void **)&p, size, block_size); // returns 0 on success
Further allocations (assuming your tool only does that) can use malloc(). They will be aligned.
p = malloc(block_size);
Note: please verify that it is indeed aligned on your system... it works on mine.
As a result you can simplify your loop with:
for(;;)
{
p = malloc(block_size);
if (p == NULL)
break; // out of memory
*p = 1;
}
Until you create a thread, the malloc() does not use mutexes. But it still has to look for a free memory block. In your case, though, it will be one after the other and there will be no holes in the allocated memory so it will be pretty fast.
Can it be faster?
Further note about how memory is generally allocated in a Unix system:
the malloc() function and related functions will allocate a block in your heap; which at the start is pretty small (maybe 2Mb)
when the existing heap is full it gets grown using the sbrk() function; as far as your process is concerned, the memory address always increases, that's what sbrk() does (contrary to MS-Windows which allocates blocks all over the place)
using sbrk() once and then hitting the memory every "page size" bytes would be faster than using malloc()
char *p = malloc(size); // get current "highest address"
p += size;
p = (char *)((intptr_t)p & -(intptr_t)size); // clear low bits (alignment)
intptr_t total_mem = (intptr_t)50 * 1024 * 1024 * 1024; // 50 GB; needs a 64-bit intptr_t
void *start = sbrk(total_mem);
if (start == (void *)-1)
exit(1); // sbrk() failed
char *end = (char *)start + total_mem;
for (; p < end; p += size)
{
*p = 1;
}
Note that the malloc() above may give you the "wrong" start address. But your process really doesn't do much, so I think you'll always be safe. That for() loop, however, is going to be as fast as possible. As mentioned by others, you'll get the total_mem of virtual memory allocated "instantly" and then the RSS memory allocated each time you write to *p.
WARNING: Code not tested, use at your own risk.

char pointer in struct array memory leak

I'm having memory leaks in a larger program and I believe this is the cause of it.
#include <stdlib.h>
#include <Windows.h>
typedef struct _struct{
char* name;
} str;
int main() {
system("PAUSE");
str* Character = (str*)malloc(sizeof(str) * 20000);
for(int i = 0; i < 20000; i++){
Character[i].name = (char*)malloc(20000); // Assign memory.
}
for(int i = 0; i < 20000; i++){
free(Character[i].name); // Free memory.
}
free(Character);
system("PAUSE");
}
Memory at first pause: ~500K.
Memory at second pause: ~1.7M.
Using VS2012 for testing. Any ideas?

How are you measuring the amount of memory occupied by the program? One thing off the top of my head is that you're looking at the size of the working set the OS is keeping track of. Since you've allocated and freed a lot of memory, the size of that set has increased. Some OSs will adjust the size of the working set after a while, some won't. What OS are we looking at here?

When you call malloc, memory is allocated on the heap. If there is insufficient space left on the heap, the program will ask the OS for more memory and another chunk is acquired. Memory acquired from the OS is usually not returned until the program finishes (although this is up to the OS).
Program size alone cannot normally be used to check for memory leaks! Use Valgrind or a similar tool to check for memory that never gets freed.

str* Character = (str*)malloc(sizeof(str) * 20000);
In the above line the allocation size is computed from the size of the struct. Here the size of the struct is the width of one pointer, not the size of a char.
For example, if pointers are 32 bits wide, it will allocate (4 * 20000) = 80000 bytes.
That is exactly what an array of 20000 structs needs, so the line is correct as written; the strings themselves are allocated separately in the loop. Switching to sizeof(char), as in the line below, would allocate only 20000 bytes, which is too small for 20000 structs:
str* Character = (str*)malloc(sizeof(char) * 20000);