why does bigger malloc cause exception? [duplicate] - c

First of all, I noticed that when I malloc memory versus calloc it, the memory footprint is different. I am working with datasets of several GB, and it is OK for this data to be random.
I expected that I could just malloc a large amount of memory and read whatever random data was in it, cast to floats. However, looking at the memory footprint in the process viewer, the memory is obviously not being claimed (versus calloc, where I see a large footprint). When I ran a loop writing data into the memory, I then saw the footprint climb. Am I correct in saying that the memory isn't actually claimed until I initialize it?
Finally, after I passed 1024*1024*128 bytes (1024 MB in the process viewer) I started getting segfaults. calloc, however, seems to initialize the full amount up to 1 GB. Why do I get segfaults when initializing memory in a for loop with malloc at this number, 128 MB, and why does the memory footprint show 1024 MB?
If I malloc a large amount of memory and then read from it, what am I getting (since the process viewer shows almost no footprint until I initialize it)?
Finally, is there any way for me to allocate more than 4 GB? I am testing memory hierarchy performance.
Code for #2:
#include <stdlib.h>
#include <unistd.h>

long long int i;
long long int *test = malloc(1024 * 1024 * 1024);   /* 1 GB; the result should be checked for NULL */
for (i = 0; i < 1024 * 1024 * 128; i++)             /* 128M long longs * 8 bytes = exactly 1 GB */
    test[i] = i;
sleep(15);

Some notes:
As the comments note, Linux doesn't actually allocate your memory until you use it.
When you use calloc instead of malloc, it zeroes out all of the memory you requested. Writing those zeroes counts as using the memory, so the OS has to commit every page up front.
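A minimal sketch to watch this happen yourself (run it and observe the footprint in top or your process viewer; the size and sleep durations are arbitrary):

#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    size_t sz = (size_t)1024 * 1024 * 1024;   /* 1 GB */
    char *p = malloc(sz);        /* address space reserved, pages not yet committed */
    if (p == NULL) return 1;
    sleep(10);                   /* footprint is tiny here */
    memset(p, 1, sz);            /* touching every page forces the OS to commit it */
    sleep(10);                   /* footprint now shows ~1 GB */
    free(p);
    return 0;
}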

1- If you are working on a 32-bit machine, you can't allocate more than about 2 GB to a single variable; the process address space simply isn't big enough.
2- If you are working on a 64-bit machine, you can allocate as much as RAM + swap in total; however, allocating it all for one variable requires a big contiguous chunk of memory, which might not be available. Try it with a linked list, where each element has only 1 MB assigned, and you can achieve a higher total amount of allocated memory (see the sketch after this list).
3- As noted by you and Sharth, unless you use your memory, Linux won't allocate it.
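A minimal sketch of the linked-list idea from point 2; alloc_chunks is a hypothetical helper name:

#include <stdlib.h>

struct chunk {
    struct chunk *next;
    char data[1024 * 1024];          /* 1 MB of payload per node */
};

/* Allocate up to n megabytes as a chain of 1 MB nodes, stopping at the
   first failure; the chain ends up as long as the system allows. */
struct chunk *alloc_chunks(size_t n) {
    struct chunk *head = NULL;
    for (size_t i = 0; i < n; i++) {
        struct chunk *c = malloc(sizeof *c);
        if (c == NULL)
            break;
        c->next = head;
        head = c;
    }
    return head;
}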

Your #2 is failing with a segfault either because sizeof(long long int) > 8 or because your malloc returned NULL. The latter is very possible when you are requesting 1 GB of RAM.
More info on #2: from your 128 MB comment I get the idea that you may not realize what's happening. Because you declare the array pointer as long long int, each array element is 8 bytes; 1024/8 == 128, so a 1024 MB block holds exactly 1024*1024*128 elements. That is why your loop works. It did when I tried it, anyway.

Your for loop in your example code actually touches 1 GB of memory, since it indexes 128*1024*1024 long longs, and each long long is 8 bytes.

Related

Why does malloc(1) work for storing 4 byte integers?

From what I understand, malloc(x) returns a block of memory x bytes long.
So to store a 4 byte integer, I would do:
int *p = (int *)malloc(4);
*p = 100;
Because sizeof(int) returns 4 for me.
However, if I do:
int *p = (int *)malloc(1);
*p = 100;
It seems to work exactly the same, with no issues storing the value.
Why does the amount of memory requested with malloc() not seem to matter? Shouldn't a 4 byte integer require malloc(4)?
If this works in your case, it works only by chance and is not guaranteed. It is undefined behavior (compare this SO question), and anything can happen.
What did you expect to happen? That your program would crash?
That might still happen if you call malloc and free a bit more often. malloc often takes a few bytes more than requested and uses the extra space for management (a linked list of all memory blocks, the sizes of memory blocks). If you write some bytes before or after your allocated block, chances are high that you will mess with the internal management structures and that a subsequent malloc or free will crash.
If malloc internally always allocates a minimum of n bytes, then your program might only crash once you access byte n+1. Also, the operating system normally protects memory only at page granularity. If a page has a size of 512 bytes and your malloc-ed byte is in the middle of a page, then your process may be able to read and write the rest of the page, and will only crash when accessing the next memory page. But remember: even if this works, it is undefined behavior.
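A sketch of how this can bite, assuming a typical glibc-style allocator; the memset deliberately invokes undefined behavior, so the exact failure point is unpredictable:

#include <stdlib.h>
#include <string.h>

int main(void) {
    char *p = malloc(8);
    if (p == NULL) return 1;
    /* Deliberately undefined behavior: scribble over bytes adjacent to the
       block, where the allocator keeps its bookkeeping. */
    memset(p - 8, 0xff, 24);
    free(p);   /* with glibc this often aborts here with a corruption error */
    return 0;
}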
malloc, like all memory block allocation functions from the C runtime or the OS kernel, is optimized for memory access and object alignment.
Moreover, malloc specifically allocates a hidden control block in front of the allocated space to keep track of the allocation (space required, space allocated, etc.).
malloc must also guarantee that the allocated memory address is suitably aligned for any storage object; this means that the block will start on an 8, 16, 32, or even 64 or 128 byte boundary, depending on the processor and the hardware generally (e.g. some special MMU). The boundary also depends on access speed: some processors behave differently with different access widths (1, 2, 4, 8, ... bytes) and address boundaries. These constraints drive the malloc code and the allocator's partitioning of memory into logical blocks.
On the practical side, consider an allocator for an x86 processor: it generally gives back a block aligned on an 8-byte boundary (32-bit code), which is suitable for ints, floats, and even doubles. To do this, malloc divides the available memory arena into 'blocks' that are the minimal allocation unit. Even when you allocate 1 byte, the function allocates at least one block. That block may happen to be able to host an integer, or even a double, but this is implementation dependent, and you can't consider it deterministic, because in a future version of the same function the behavior can change.
Now that, I hope, it is clear why your code seems to work, keep well in mind that this is undefined behavior and you must treat it as such. It can work now and not with the next revision; it can crash on some hardware and not on another processor or machine.
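If you are on a C11 compiler, you can inspect the alignment guarantee yourself; a minimal sketch:

#include <stdio.h>
#include <stdint.h>
#include <stddef.h>
#include <stdlib.h>

int main(void) {
    /* C11: malloc results must be suitably aligned for any standard type */
    printf("strictest standard alignment: %zu bytes\n", _Alignof(max_align_t));
    void *p = malloc(1);
    printf("returned address modulo that alignment: %zu\n",
           (size_t)((uintptr_t)p % _Alignof(max_align_t)));   /* prints 0 */
    free(p);
    return 0;
}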
For this we should know how the malloc function works internally. To allocate memory dynamically, each operating system provides system calls, and we can allocate memory through them. These system calls differ from one OS to another.
So the system calls of one OS might not work on another, and if we used system calls directly, our program would be platform dependent. To avoid this dependency we use the malloc function; it is then malloc's responsibility to make the appropriate system calls for the OS it runs on.
malloc itself invokes system calls, and this would be very slow if it made one for every request. To avoid that, whenever we request dynamic memory, malloc usually allocates extra memory, so that next time the system call can be skipped and the remaining chunk of the previously allocated memory can be handed out. That is why your program works: malloc is allocating extra memory.
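On glibc you can observe that extra space directly with the non-standard malloc_usable_size(); a sketch, assuming a glibc system:

#include <stdio.h>
#include <stdlib.h>
#include <malloc.h>   /* glibc-specific header */

int main(void) {
    char *p = malloc(1);
    if (p == NULL) return 1;
    /* glibc extension: reports how many bytes are actually usable in the
       block, typically more than was requested */
    printf("requested 1 byte, usable: %zu bytes\n", malloc_usable_size(p));
    free(p);
    return 0;
}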
The C programming language gives you the ability to shoot yourself in the foot.
It intentionally burdens the programmer with the charge that they ought to know what they are doing. Broadly speaking the reasons are to achieve performance, readability, and portability.
The behaviour of your code is undefined. If you ask for 1 byte, then you should expect to get only one usable byte back. The fact that the operating system and C runtime library seem to be giving you back a little more than that is no more than a curious peculiarity.
On other occasions the compiler might simply eat your cat.
Finally, use sizeof in the call to malloc rather than hardcoding the size of an int type: on some systems sizeof(int) is 2, 4 is commonplace, and the standard even allows sizeof(int) == 1 when a byte has at least 16 bits. In your case, either sizeof(int) or sizeof(*p) will do. Some folk prefer the latter, since then you're not hardcoding the type of the variable in the sizeof call, guarding you against a possible change of the variable's type. (Note that sizeof(*p) is evaluated at compile time and uses static type information; hence it can be used before p itself "exists", if you get my meaning.)
It seems to work exactly the same, with no issues storing the value.
You invoke undefined behavior with your code, so you cannot say that it works. In order to allocate memory for an integer, you should do:
int *p;
p = malloc(sizeof *p);   /* sizeof *p works because p is already declared; it is the size of what p points to, i.e. an int */
if (p != NULL)
    *p = 100;
Simply, when you allocate 1 byte for the int, the 3 bytes following it are not actually allocated for it, but you can still reach them. You are getting lucky that they aren't being changed by something else during your tests, and that you aren't overwriting anything important (or maybe you are). So essentially this will cause an error once those 3 bytes are needed by something else -- always malloc the right size.
Usually malloc is implemented in such a way that it allocates memory in units no smaller than a "paragraph", which is 16 bytes.
So when you request, for example, 4 bytes of memory, malloc actually allocates 16 bytes. However, this behavior is not described in the C standard and you may not rely on it. As a result, the program you showed has undefined behavior.
I think this is because of padding: even though you are calling malloc(1), the padding bytes come along with the memory.
Please check this link: http://www.delorie.com/gnu/docs/glibc/libc_31.html

Heap memory exploration with malloc

I've written a program that takes 3 numbers as input:
The size of the memory to allocate on the heap with malloc()
Two int values
If q is an unsigned char pointer into the allocated block, the program sets q[i] = b from q[min] to q[max].
I thought that the heap was divided into pages and that the first call to malloc() would give a pointer to the first byte of a page of my process. So why, if I try to read q[-1], is my process not killed?
Then I tried with another pointer p, and I noticed that between the two pointers there is a distance of 32 bytes. Why are they not adjacent?
The last thing I noticed is that both p[-8] (= q[-40], i.e. -32 - 8) and q[-8] contain the number 33 (00100001), with all the other bytes set to 0. Does it mean anything?
Thank you!
I thought that the heap was divided into pages and that the first call to malloc would give a pointer to the first byte of a page of my process. So why, if I try to read q[-1], is my process not killed?
Most likely because your malloc implementation stores something there. Possibly the size of the block.
Then I tried with another pointer p, and I noticed that between the two pointers there is a distance of 32 bytes. Why are they not adjacent?
Same reason. Your implementation probably stores the size of the block in the block just before the address it returns.
The last thing I noticed is that both p[-8] (= q[-40], i.e. -32 - 8) and q[-8] contain the number 33 (00100001). Does it mean anything?
It probably means something to your malloc implementation. But you can't really tell what without looking at the implementation.
The standard library uses the heap before calling main, so anything you do won't be on a clean heap.
The heap implementation usually uses about 2 pointers' worth of space at the start of an allocation, and the total size is usually aligned to 2 pointers.
The heap implementation usually uses a lot of bytes at the start of each system allocation; it can sometimes be close to a page in size.
The heap is allocated in chunks much bigger than a page; on Windows it is at least 16 pages.
The heap can be adjacent to other allocations; on Linux it appears right after the main executable, so underflowing it won't crash.
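A sketch reproducing the asker's experiment; note that subtracting pointers from different allocations is itself not portable C, so treat the output as a curiosity of your particular allocator:

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    unsigned char *q = malloc(1);
    unsigned char *p = malloc(1);
    if (q == NULL || p == NULL) return 1;
    /* The gap exposes the per-block overhead; the asker observed 32 bytes. */
    printf("gap between consecutive 1-byte allocations: %td bytes\n", p - q);
    free(p);
    free(q);
    return 0;
}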

Large array declaration with gcc and its problems

I was writing code which requires a large int array to be allocated (size of 10^9).
While doing so I faced several issues, and after reading things on Google I came to the following conclusions of my own. Can someone look at this and point out if I am missing something, and also suggest a better way to do this?
(Machine config: VM running Ubuntu 10.04, gcc 4.4.3, 32-bit, 2 GB RAM (though my host machine has 6 GB))
1. I declared the array as 'unsigned long int' with size 1*10^9. It didn't work, as on compiling the code I got the error 'array size too long'.
So I searched around and finally realized that I can't allocate that much memory on the stack, as my physical memory is 2 GB. (I had already tried declaring the array as a global variable, which would put it in the global area instead of on the stack, but I got the same error.)
2. So I tried allocating the same amount of memory using malloc, but again got an error, this time from malloc: 'Cannot allocate memory'.
So after doing all this, my understanding/problems are as follows:
3. I can't allocate that much memory, be it stack or heap, as my physical memory is only 2 GB (so is this the actual problem, or do other factors also govern this memory allocation?).
4. Is there any possible workaround where I can allocate memory of size 10^9 on a 2 GB machine? (I know allocating an array or memory area this big is neither good algorithm design nor efficient, but I just want to know the limits.)
5. Any better solution for allocating this much memory (i.e. should I use 2 small arrays/heap allocations instead of one big chunk)?
(NOTE: Points 4 and 5 are two different approaches; I would appreciate suggestions for both.)
Many thanks.
P.S. Forgive me if I am being a novice.
You are compiling a 32-bit process, and there is simply not enough virtual address space for your huge data block. A 32-bit pointer can hold 2^32 distinct values, i.e. 4 GB. You can't allocate more than that because you would have no way to refer to the memory: each byte that is mapped into your process must have a unique address.
Your data is 10^9 ints, i.e. about 4 GB, so nothing is going to fit it into a 4 GB address space. Even if your array were somewhat less than 4 GB, you might still have trouble allocating a single contiguous block of memory.
You could use a 64-bit process, but you'd need to make sure you had enough physical memory to avoid disk thrashing when your array is swapped. Or you could find a different algorithm that does not require such a huge block of memory.
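For reference, this is roughly the request in question, with the failure checked rather than ignored (a sketch):

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    size_t n = 1000000000;            /* 10^9 elements */
    int *a = malloc(n * sizeof *a);   /* ~4 GB with 4-byte ints */
    if (a == NULL) {
        perror("malloc");             /* the expected outcome in a 32-bit process */
        return 1;
    }
    free(a);
    return 0;
}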

Under what circumstances can malloc return NULL?

It has never happened to me, and I've been programming for years now.
Can someone give me an example of a non-trivial program in which malloc will actually not work?
I'm not talking about memory exhaustion: I'm looking for a simple case where allocating just one memory block of a size bounded by user input, let's say a single integer, causes malloc to fail.
You need to do some work in embedded systems; you'll frequently get NULL returned there :-)
It's much harder to run out of memory on modern massive-address-space-and-backing-store systems, but it is still quite possible in applications that process large amounts of data, such as GIS or in-memory databases, or in places where buggy code results in a memory leak.
But it really doesn't matter whether you've never experienced it before - the standard says it can happen, so you should cater for it. I haven't been hit by a car in the last few decades either, but that doesn't mean I wander across roads without looking first.
And re your edit:
I'm not talking about memory exhaustion, ...
the very definition of memory exhaustion is malloc not giving you the desired space. It's irrelevant whether that's caused by having allocated all available memory; by heap fragmentation, meaning you cannot get a contiguous block even though the aggregate of all free blocks in the memory arena is larger; or by artificially limiting your address space usage, such as with the standards-compliant function:
void *malloc (size_t sz) { return NULL; }
The C standard doesn't distinguish between modes of failure, only that it succeeds or fails.
Yes.
Just try to malloc more memory than your system can provide (either by exhausting your address space, or virtual memory - whichever is smaller).
malloc(SIZE_MAX)
will probably do it. If not, repeat a few times until you run out.
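For example (a sketch; some compilers will even warn at build time that the request exceeds the largest possible object):

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

int main(void) {
    /* SIZE_MAX bytes can never be satisfied: the block plus the allocator's
       own overhead cannot fit in the address space */
    void *p = malloc(SIZE_MAX);
    if (p == NULL)
        puts("malloc(SIZE_MAX) returned NULL");
    return 0;
}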
Any program at all written in C that needs to dynamically allocate more memory than the OS currently allows.
For fun, if you are using Ubuntu, type in
ulimit -v 5000
This limits the virtual memory available to any one process in that shell to about 5 MB (the units are KB), so almost any program you then run will most likely crash due to a malloc failure.
Unless your memory is already completely reserved (or heavily fragmented), the only way to have malloc() return a NULL pointer is to request space of size zero:
char *foo = malloc(0);
Citing from the C99 standard, §7.20.3, subsection 1:
If the size of the space requested is zero, the behavior is implementation-defined: either a null pointer is returned, or the behavior is as if the size were some nonzero value, except that the returned pointer shall not be used to access an object.
In other words, malloc(0) may return a NULL pointer or a valid pointer to zero allocated bytes.
Pick any platform, though embedded is probably easier. malloc (or new) a ton of RAM (or leak RAM over time, or even fragment it with naive algorithms). Boom. malloc does return NULL for me on occasion when "bad" things are happening.
In response to your edit: yes again. Memory fragmentation over time can mean that even a single allocation of an int can fail. Also keep in mind that malloc doesn't just allocate 4 bytes for an int; it can grab as much space as it wants. It has its own book-keeping and quite often grabs 32-64 bytes minimum.
On a more-or-less standard system, using a standard one-parameter malloc, there are three possible failure modes (that I can think of):
1- The size of the allocation requested is not allowed. E.g., some systems may not allow an allocation > 16 MB, even if more storage is available.
2- A contiguous free area of the requested size, with the default boundary, cannot be located in the heap. There may still be plenty of heap, just not enough of it in one piece.
3- The total allocated heap has exceeded some "artificial" limit. E.g., the user may be prohibited from allocating more than 100 MB, even if there's 200 MB free and available to the "system" in a single combined heap.
(Of course, you can get combinations of 2 and 3, since some systems allocate non-contiguous blocks of address space to the heap as it grows, placing the "heap size limit" on the total of the blocks.)
Note that some environments support additional malloc parameters, such as alignment and pool ID, which can add their own twists.
Just check the manual page of malloc.
On success, a pointer to the memory block allocated by the function.
The type of this pointer is always void*, which can be cast to the desired type of data pointer in order to be dereferenceable.
If the function failed to allocate the requested block of memory, a null pointer is returned.
Yes. malloc will return NULL when the kernel/system library is certain that no memory can be allocated.
The reason you typically don't see this on modern machines is that malloc doesn't really allocate memory; rather, it requests that some "virtual address space" be reserved for your program so you may write in it. Kernels such as modern Linux actually overcommit: they let you allocate more memory than your system can actually provide (swap + RAM), as long as it all fits in the address space of the process (typically 48 bits on 64-bit platforms, IIRC). Thus on these systems you will probably trigger the OOM killer before you trigger a NULL return. A good example is a 32-bit machine with 512 MB of RAM: it's trivial to write a C program that gets eaten by the OOM killer because it tries to malloc all available RAM + swap.
(Overcommitting can be disabled on Linux via the vm.overcommit_memory sysctl, so whether a given system overcommits depends on how it is configured. Stock desktop distros do overcommit by default.)
Since you asked for an example, here's a program that will (eventually) see malloc return NULL:
perror();void*malloc();main(){for(;;)if(!malloc(999)){perror(0);return 0;}}
What? You don't like deliberately obfuscated code? ;) (If it runs for a few minutes and doesn't crash on your machine, kill it, change 999 to a bigger number and try again.)
EDIT: If it doesn't work no matter how big the number is, then what's happening is that your system is saying "Here's some memory!" but so long as you don't try to use it, it doesn't get allocated. In which case:
perror();char*p;void*malloc();main(){for(;;){p=malloc(999);if(p)*p=0;else{perror(0);return 0;}}}
Should do the trick. If we can use GCC extensions, I think we can get it even smaller by changing char*p;void*malloc(); to void*p,*malloc(); but if you really wanted to golf you'd be on the Code Golf SE.
malloc can also return NULL when the requested size is 0, when the argument is negative (it converts to an enormous size_t that cannot be satisfied), or when you have no memory left on the heap.
I had to correct somebody's code which looked like this:
#include <stdint.h>
#include <stdlib.h>

const int8_t bufferSize = 128;        /* 128 does not fit in int8_t, whose maximum is 127 */
void *buffer = malloc(bufferSize);
Here buffer is NULL because bufferSize is actually -128, which converts to a huge size_t when passed to malloc.
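The fix is to use a type that can actually represent the size (a sketch):

#include <stdlib.h>

int main(void) {
    const size_t bufferSize = 128;    /* size_t can hold the intended value */
    void *buffer = malloc(bufferSize);
    if (buffer == NULL) return 1;
    free(buffer);
    return 0;
}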

Why should I use malloc() when "char bigchar[ 1u << 31 - 1 ];" works just fine?

What's the advantage of using malloc (besides the NULL return on failure) over static arrays? The following program will eat up all my RAM and start filling swap only if the loops are uncommented. It does not crash.
...
#include <stdio.h>
unsigned int bigint[ 1u << 29 - 1 ];
unsigned char bigchar[ 1u << 31 - 1 ];
int main (int argc, char **argv) {
    int i;
    /* for (i = 0; i < 1u << 29 - 1; i++) bigint[i] = i; */
    /* for (i = 0; i < 1u << 31 - 1; i++) bigchar[i] = i & 0xFF; */
    getchar();
    return 0;
}
...
After some trial and error I found the above is the largest static array allowed on my 32-bit Intel machine with GCC 4.3. Is this a standard limit, a compiler limit, or a machine limit? Apparently I can have as many of them as I want. It will segfault, but only if I ask for (and try to use) more than malloc would give me anyway.
Is there a way to determine if a static array was actually allocated and is safe to use?
EDIT: I'm interested in why malloc is used to manage the heap instead of letting the virtual memory system handle it. Apparently I can size an array to many times the size I think I'll need, and the virtual memory system will only keep in RAM what is needed. If I never write to e.g. the end (or beginning) of these huge arrays, then the program doesn't use the physical memory. Furthermore, if I can write to every location, then what does malloc do besides increment a pointer in the heap or search around previous allocations in the same process?
Editor's note: 1 << 31 causes undefined behaviour if int is 32-bit, so I have modified the question to read 1u. The intent of the question is to ask about allocating large static buffers.
Well, for two reasons really:
Because of portability, since some systems won't do the virtual memory management for you.
You'll inevitably need to divide this array into smaller chunks for it to be useful, then keep track of all the chunks, and then eventually, as you start "freeing" some of the chunks you no longer require, you'll hit the problem of memory fragmentation.
All in all, you'll end up implementing a lot of memory management functionality (actually, pretty much reimplementing malloc) without the benefit of portability.
Hence the reasons:
Code portability via memory management encapsulation and standardisation.
Personal productivity enhancement by way of code re-use.
Please see:
malloc() and the C/C++ heap
Should a list of objects be stored on the heap or stack?
C++ Which is faster: Stack allocation or Heap allocation
Proper stack and heap usage in C++?
About C/C++ stack allocation
Stack, Static and Heap in C++
Of Memory Management, Heap Corruption, and C++
new on stack instead of heap (like alloca vs malloc)
With malloc you can grow and shrink your array: it becomes dynamic, so you can allocate exactly what you need; realloc is the standard tool for this (see the sketch below).
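A minimal sketch of growing a heap array with realloc:

#include <stdlib.h>

int main(void) {
    size_t n = 1024;
    int *a = malloc(n * sizeof *a);
    if (a == NULL) return 1;
    /* grow the array; realloc preserves the existing contents */
    int *tmp = realloc(a, 2 * n * sizeof *a);
    if (tmp == NULL) {
        free(a);        /* the old block is still valid when realloc fails */
        return 1;
    }
    a = tmp;
    free(a);
    return 0;
}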
This is called custom memory management, I guess.
You can do that, but you'll have to manage that chunk of memory yourself.
You'd end up writing your own malloc() working over this chunk.
Regarding:
After some trial and error I found the above is the largest static array allowed on my 32-bit Intel machine with GCC 4.3. Is this a standard limit, a compiler limit, or a machine limit?
One upper bound will depend on how the 4 GB (32-bit) virtual address space is partitioned between user-space and kernel-space. For Linux, I believe the most common partitioning scheme has a 3 GB range of addresses for user-space and a 1 GB range for kernel-space. The partitioning is configurable at kernel build time; 2GB/2GB and 1GB/3GB splits are also in use. When the executable is loaded, virtual address space must be allocated for every object, regardless of whether real memory is allocated to back it up.
You may be able to allocate that gigantic array in one context, but not in others. For example, if your array is a member of a struct and you wish to pass the struct around: some environments have a 32K limit on struct size.
As previously mentioned, you can also resize your memory to use exactly what you need. It's important in performance-critical contexts not to be paging out to virtual memory if it can be avoided.
There is no way to free a stack allocation other than going out of scope. So when you use a global allocation and the VM has to give you real physical memory, it stays allocated until your program exits. This means that any process can only grow in its virtual memory use (functions' local stack allocations are "freed" on return).
You cannot "keep" stack memory once it goes out of the scope of a function; it is always freed. So you must know at compile time how much memory you will use.
Which then boils down to how many int foo[1<<29]'s you can have. Each one is 2 GB, so the first takes up half the 32-bit address space (say it starts at 0x00000000), the second then ends around 0xffffffff, and a third would have to resolve to an address a 32-bit pointer cannot express. (Remember that stack reservations are resolved partly at compile time and partly at run time, via offsets - how far the stack pointer is moved when this or that variable is allocated.)
So the answer is pretty much that once you have int foo[1<<29], you can't have any reasonable depth of functions with other local stack variables anymore.
You really should avoid doing this unless you know what you're doing. Try to request only as much memory as you need. Even if excess memory is not being used or getting in the way of other programs, it can mess up the process itself. There are two reasons for this. First, on certain systems, particularly 32-bit ones, it can cause address space to be exhausted prematurely in rare circumstances. Additionally, many kernels have some kind of per-process limit on reserved/virtual/not-in-use memory. If your program asks for memory at points during run time, the kernel can kill the process if the requested reservation exceeds this limit. I've seen programs that either crashed or exited due to a failed malloc because they were reserving GBs of memory while only using a few MB.
