Stack and Heap Space for Modern Computers - c

When writing in C, how can I tell how much stack space is available in memory when I launch a program? How about heap space?
How can I tell how much memory is being used during the execution of my program?

This is all Win32-specific (not really C-specific, all just OS API):
When a thread is created, it gets 1 MB of stack space by default, but that can be modified in whichever CreateThread API you use.
You can peek into the Thread Information Block to find the actual stack info, but even though this is documented, the technique isn't officially supported; see http://en.wikipedia.org/wiki/Win32_Thread_Information_Block .
Also, a 32-bit application can only address up to 2GB, so for an app that by design uses lots of memory, the thing to watch out for is the total size of the process's virtual address space (committed + reserved), which includes all heap allocations. You can query this programmatically with the GlobalMemoryStatusEx API; look at the ullTotalVirtual and ullAvailVirtual members for the virtual address space. Once your process gets close to 1.8 or 1.9 GB of VAS, heap allocations and VirtualAlloc calls begin to fail. For "normal" apps you don't have to worry about running out of VAS, but it's always good to check for failed allocations. You also shouldn't get a stack overflow unless you have a bug or a bad design.
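For example, a minimal sketch of such a check might look like this (assuming a Win32 build; dwLength must be filled in before the call):
#include <windows.h>
#include <stdio.h>

int main(void)
{
    MEMORYSTATUSEX status;
    status.dwLength = sizeof(status);   /* required before calling GlobalMemoryStatusEx */

    if (GlobalMemoryStatusEx(&status)) {
        printf("Total virtual address space:     %llu bytes\n",
               (unsigned long long)status.ullTotalVirtual);
        printf("Available virtual address space: %llu bytes\n",
               (unsigned long long)status.ullAvailVirtual);
    } else {
        printf("GlobalMemoryStatusEx failed with error %lu\n",
               (unsigned long)GetLastError());
    }
    return 0;
}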

There is a school of thought that when you need to ask these kinds of questions for practical rather than educational or informational reasons, you are doing something seriously wrong.
If you are asking this for error checking, or to make sure your program has enough memory, etc., then seriously, don't worry about it. As for your program's memory usage, you can use Task Manager (on Windows) if this is just for debugging. If you need to know this from inside your program, I wouldn't count on any non-hacky solution.

Abstractions exist for a reason
Really, your program shouldn't have this as a concern. It is an OS concern; your program should just be efficient with what it needs and let the OS do its job.
If you insist, you could look into /proc/meminfo, brk(), getrlimit() and setrlimit() (here are some docs) with the RLIMIT_STACK and RLIMIT_DATA values for approximations and rough estimates.
#include <sys/resource.h>
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>

int main(int argc, char *argv[])
{
    struct rlimit limit;

    /* Get the stack limit. */
    if (getrlimit(RLIMIT_STACK, &limit) != 0) {
        printf("getrlimit() failed with errno=%d\n", errno);
        exit(1);
    }
    printf("The stack soft limit is %llu\n", (unsigned long long)limit.rlim_cur);
    printf("The stack hard limit is %llu\n", (unsigned long long)limit.rlim_max);
    exit(0);
}
Modified from here; also see man getrlimit on your system.
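If you want to go one step further and raise the soft limit, setrlimit() is the counterpart; a rough sketch, assuming the hard limit permits the new value (how much an already-running main stack benefits is platform-dependent):
#include <sys/resource.h>
#include <stdio.h>

int main(void)
{
    struct rlimit limit;

    if (getrlimit(RLIMIT_STACK, &limit) != 0) {
        perror("getrlimit");
        return 1;
    }

    /* Ask for a 64 MB soft limit, capped at the hard limit. */
    rlim_t wanted = 64UL * 1024 * 1024;
    if (limit.rlim_max != RLIM_INFINITY && wanted > limit.rlim_max)
        wanted = limit.rlim_max;
    limit.rlim_cur = wanted;

    if (setrlimit(RLIMIT_STACK, &limit) != 0) {
        perror("setrlimit");
        return 1;
    }

    printf("Stack soft limit is now %llu bytes\n",
           (unsigned long long)limit.rlim_cur);
    return 0;
}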
If you state what and why you want to do this, someone may have a better method or way of doing what you want.

Related

C program slowly takes up more memory [duplicate]

I wrote a C program on Linux that mallocs memory, ran it in a loop, and top didn't show any memory consumption.
Then I did something with that memory, and top did show memory consumption.
When I malloc, do I really "get memory", or is there "lazy" memory management that only gives me the memory if/when I use it?
(It's also possible that top only knows about memory consumption once I use it, so I'm not sure about this.)
Thanks
On Linux, malloc requests memory with sbrk() or mmap() - either way, your address space is expanded immediately, but Linux does not assign actual pages of physical memory until the first write to the page in question. You can see the address space expansion in the VIRT column, while RES shows the actual physical memory usage.
This starts a little off subject (and then I'll tie it in to your question), but what's happening is similar to what happens when you fork a process in Linux. When forking, there is a mechanism called copy-on-write which only copies the memory space for the new process when the memory is written to. This way, if the forked process execs a new program right away, you've saved the overhead of copying the original program's memory.
Getting back to your question, the idea is similar. As others have pointed out, requesting the memory gets you the virtual memory space immediately, but the actual pages are only allocated when you write to them.
What's the purpose of this? It basically makes mallocing memory a more or less constant-time, O(1) operation instead of an O(n) one (similar to the way the Linux scheduler spreads its work out instead of doing it in one big chunk).
To demonstrate what I mean I did the following experiment:
rbarnes@rbarnes-desktop:~/test_code$ time ./bigmalloc
real 0m0.005s
user 0m0.000s
sys 0m0.004s
rbarnes@rbarnes-desktop:~/test_code$ time ./deadbeef
real 0m0.558s
user 0m0.000s
sys 0m0.492s
rbarnes@rbarnes-desktop:~/test_code$ time ./justwrites
real 0m0.006s
user 0m0.000s
sys 0m0.008s
The bigmalloc program allocates 20 million ints but doesn't do anything with them. deadbeef writes one int to each page, resulting in 19531 writes, and justwrites allocates 19531 ints and zeros them out. As you can see, deadbeef takes about 100 times longer to execute than bigmalloc and about 50 times longer than justwrites.
#include <stdlib.h>

int main(int argc, char **argv) {
    int *big = malloc(sizeof(int) * 20000000); // allocate 80 million bytes
    return 0;
}
.
#include <stdlib.h>

int main(int argc, char **argv) {
    int *big = malloc(sizeof(int) * 20000000); // allocate 80 million bytes
    // immediately write to each page to simulate all-at-once allocation
    // assuming 4k page size on 32-bit machine
    for (int *end = big + 20000000; big < end; big += 1024) *big = 0xDEADBEEF;
    return 0;
}
.
#include <stdlib.h>

int main(int argc, char **argv) {
    int *big = calloc(sizeof(int), 19531); // number of writes
    return 0;
}
Yes, the memory isn't mapped into your address space until you touch it. mallocing memory only sets up the page tables, so that when a page fault occurs inside the allocated range, the kernel knows the memory should be mapped in.
Are you using compiler optimizations? Maybe the optimizer has removed the allocation since you're not using the allocated resources?
The feature is called overcommit: the kernel "promises" you memory by increasing the data segment size, but does not allocate physical memory for it. When you touch an address in that new space, the process page-faults into the kernel, which then tries to map physical pages to it.
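To watch this from inside a program, here is a rough sketch (Linux-specific assumption: getrusage() reports ru_maxrss in kilobytes there) that prints the peak resident set size before and after touching a large allocation:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/resource.h>

/* Print the peak resident set size reported by the kernel. */
static void print_rss(const char *label)
{
    struct rusage usage;
    if (getrusage(RUSAGE_SELF, &usage) == 0)
        printf("%s: ru_maxrss = %ld kB\n", label, usage.ru_maxrss);
}

int main(void)
{
    size_t size = 100 * 1024 * 1024;   /* 100 MB */
    char *p = malloc(size);
    if (p == NULL)
        return 1;

    print_rss("after malloc (pages not yet touched)");

    memset(p, 0xAB, size);             /* fault every page in */

    print_rss("after touching the allocation");

    free(p);
    return 0;
}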
Yes, note the VirtualAlloc flags MEM_RESERVE and MEM_COMMIT.
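A rough sketch of the reserve-then-commit pattern on Win32 (reserving costs only address space; commit charges the pagefile, and physical pages are assigned on first touch):
#include <windows.h>
#include <stdio.h>

int main(void)
{
    SIZE_T size = 100 * 1024 * 1024;   /* 100 MB */

    /* Reserve address space only: no physical pages yet. */
    void *region = VirtualAlloc(NULL, size, MEM_RESERVE, PAGE_NOACCESS);
    if (region == NULL) {
        printf("reserve failed: %lu\n", (unsigned long)GetLastError());
        return 1;
    }

    /* Commit the pages so they can actually be read and written. */
    if (VirtualAlloc(region, size, MEM_COMMIT, PAGE_READWRITE) == NULL) {
        printf("commit failed: %lu\n", (unsigned long)GetLastError());
        return 1;
    }

    ((char *)region)[0] = 1;           /* first touch: a physical page is assigned */

    VirtualFree(region, 0, MEM_RELEASE);
    return 0;
}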
Heh, but on Linux, or any POSIX/BSD/SVR# system, vfork() has been around for ages and provides similar functionality.
The vfork() function differs from fork() only in that the child process can share code and data with the calling process (parent process). This speeds cloning activity significantly at a risk to the integrity of the parent process if vfork() is misused.
The use of vfork() for any purpose except as a prelude to an immediate call to a function from the exec family, or to _exit(), is not advised.
The vfork() function can be used to create new processes without fully copying the address space of the old process. If a forked process is simply going to call exec, the data space copied from the parent to the child by fork() is not used. This is particularly inefficient in a paged environment, making vfork() particularly useful. Depending upon the size of the parent's data space, vfork() can give a significant performance improvement over fork().
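As a minimal sketch of the only sanctioned pattern (the child does nothing but exec or _exit; the command being exec'd, echo, is just an example):
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    pid_t pid = vfork();
    if (pid < 0) {
        perror("vfork");
        return 1;
    }
    if (pid == 0) {
        /* Child: shares the parent's address space until exec, so do nothing else here. */
        execlp("echo", "echo", "hello from the child", (char *)NULL);
        _exit(127);                    /* reached only if exec failed */
    }

    /* Parent resumes once the child has exec'd or exited. */
    int status;
    waitpid(pid, &status, 0);
    return 0;
}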

Exhaust memory usage with malloc and sleep in C [duplicate]

This code snippet will allocate 2 GB every time it reads the letter 'u' from stdin, and will initialize all the allocated chars once it reads 'a'.
#include <iostream>
#include <stdlib.h>
#include <stdio.h>
#include <vector>
#define bytes 2147483648
using namespace std;
int main()
{
    char input[2] = {0};   // room for one command character plus the terminator
    vector<char *> activate;
    while (input[0] != 'q')
    {
        gets(input);
        if (input[0] == 'u')
        {
            char *m = (char *)malloc(bytes);
            if (m == NULL) cout << "cant allocate mem" << endl;
            else cout << "ok" << endl;
            activate.push_back(m);
        }
        else if (input[0] == 'a')
        {
            for (int x = 0; x < activate.size(); x++)
            {
                char *m = activate[x];
                for (unsigned x = 0; x < bytes; x++)
                {
                    m[x] = 'a';
                }
            }
        }
    }
    return 0;
}
I am running this code on a Linux virtual machine that has 3 GB of RAM. While monitoring the system resource usage with the htop tool, I have realized that the malloc operation is not reflected in the resource usage.
For example, when I input 'u' only once (i.e. allocate 2 GB of heap memory), I don't see the memory usage increasing by 2 GB in htop. It is only when I input 'a' (i.e. initialize) that I see the memory usage increasing.
As a consequence, I am able to "malloc" more heap memory than there exists. For example, I can malloc 6 GB (which is more than my RAM and swap memory combined) and malloc allows it (i.e. NULL is not returned by malloc). But when I try to initialize the allocated memory, I can see the memory and swap filling up until the process is killed.
My questions:
1. Is this a kernel bug?
2. Can someone explain to me why this behavior is allowed?
It is called memory overcommit. You can disable it by running as root:
echo 2 > /proc/sys/vm/overcommit_memory
and it is not a kernel feature that I like (so I always disable it). See malloc(3) and mmap(2) and proc(5).
NB: echo 0 instead of echo 2 often, but not always, works too. Read the docs (in particular the proc man page I just linked to).
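If you want to check the current policy from inside a program rather than with cat, a minimal sketch is just to read the procfs file (0 = heuristic overcommit, 1 = always overcommit, 2 = strict accounting):
#include <stdio.h>

int main(void)
{
    FILE *f = fopen("/proc/sys/vm/overcommit_memory", "r");
    if (f == NULL) {
        perror("fopen");
        return 1;
    }

    int mode = -1;
    if (fscanf(f, "%d", &mode) == 1)
        printf("vm.overcommit_memory = %d\n", mode);

    fclose(f);
    return 0;
}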
from man malloc (online here):
By default, Linux follows an optimistic memory allocation strategy. This means that when malloc() returns non-NULL there is no guarantee that the memory really is available.
So when you merely ask for too much, it "lies" to you; when you actually use the allocated memory, the kernel tries to find enough physical memory for you, and your process may be killed if it can't.
No, this is not a kernel bug. You have discovered something known as late paging (or overcommit).
Until you write a byte to an address allocated with malloc(), the kernel does little more than "reserve" the address range. This really depends on the implementation of your memory allocator and operating system of course, but most good ones do not incur the majority of kernel overhead until the memory is first used.
The Hoard allocator is one big offender that comes to mind immediately; through extensive testing I have found it almost never takes advantage of a kernel that supports late paging. You can always mitigate the effects of late paging in any allocator if you zero-fill the entire memory range immediately after allocation.
Real-time operating systems like VxWorks will never allow this behavior because late paging introduces serious latency. Technically, all it does is put the latency off until a later indeterminate time.
For a more detailed discussion, you may be interested to see how IBM's AIX operating system handles page allocation and overcommitment.
This is a result of what Basile mentioned: overcommit memory. However, the explanation is kind of interesting.
Basically, when you attempt to map additional memory in Linux (POSIX?), the kernel will just reserve it, and will only actually back it if your application accesses one of the reserved pages. This allows multiple applications to reserve more than the actual total amount of RAM/swap.
This is desirable behavior on most Linux environments unless you've got a real-time OS or something where you know exactly who will need what resources, when and why.
Otherwise somebody could come along, malloc up all the ram (without actually doing anything with it) and OOM your apps.
Another example of this lazy allocation is mmap(), where you have a virtual mapping that the file you're mapping can fit inside - but you only have a small amount of real memory dedicated to the effort. This allows you to mmap() huge files (larger than your available RAM) and use them like normal file handles (which is nifty).
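A rough sketch of that pattern (the filename bigfile.dat is just a placeholder; the mapping consumes address space up front, but pages are read from the file only as they are touched):
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd = open("bigfile.dat", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) { close(fd); return 1; }

    /* The whole file gets a virtual mapping, but pages are faulted in on demand. */
    char *data = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (data == MAP_FAILED) { perror("mmap"); return 1; }

    printf("first byte: %d\n", data[0]);   /* touches (and loads) only one page */

    munmap(data, st.st_size);
    close(fd);
    return 0;
}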
Initializing / working with the memory should work:
memset(m, 0, bytes);
Also, you could use calloc, which not only allocates memory but also fills it with zeros for you:
char* m = (char*) calloc(1, bytes);
1. Is this a kernel bug?
No.
2. Can someone explain to me why this behavior is allowed?
There are a few reasons:
Mitigate the need to know the eventual memory requirement - it's often convenient for an application to be able to allocate an amount of memory that it considers an upper limit on what it might actually need. For example, if it's preparing some kind of report, either an initial pass just to calculate the eventual size of the report, or realloc() of successively larger areas (with the risk of having to copy), may significantly complicate the code and hurt performance, whereas multiplying some maximum length of each entry by the number of entries could be very quick and easy. If you know virtual memory is relatively plentiful as far as your application's needs are concerned, making a larger allocation of virtual address space is very cheap.
Sparse data - if you have the virtual address space to spare, being able to keep a sparse array and use direct indexing, or to allocate a hash table with a generous capacity() to size() ratio, can lead to a very high-performance system (see the sketch after this list). Both work best (in the sense of having low overheads/waste and efficient use of memory caches) when the data element size is a multiple of the memory paging size, or, failing that, much larger or a small integral fraction thereof.
Resource sharing - consider an ISP offering a "1 giga-bit per second" connection to 1000 consumers in a building - they know that if all the consumers use it simultaneously they'll get about 1 mega-bit, but rely on their real-world experience that, though people ask for 1 giga-bit and want a good fraction of it at specific times, there's inevitably some lower maximum and much lower average for concurrent usage. The same insight applied to memory allows operating systems to support more applications than they otherwise would, with reasonable average success at satisfying expectations. Much as the shared Internet connection degrades in speed as more users make simultaneous demands, paging from swap memory on disk may kick in and reduce performance. But unlike an internet connection, there's a limit to the swap memory, and if all the apps really do try to use the memory concurrently such that that limit's exceeded, some will start getting signals/interrupts/traps reporting memory exhaustion. Summarily, with this memory overcommit behaviour enabled, simply checking malloc()/new returned a non-NULL pointer is not sufficient to guarantee the physical memory is actually available, and the program may still receive a signal later as it attempts to use the memory.
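Here is a small sketch of the sparse-data point above (assuming a 64-bit system with overcommit enabled so the large malloc() succeeds; only the pages actually written ever consume physical memory):
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* Sparse direct-indexed table: a billion slots of virtual address space
       (~4 GB), but only the slots we write cost physical pages. */
    size_t slots = 1000UL * 1000 * 1000;
    int *table = malloc(slots * sizeof(int));
    if (table == NULL)
        return 1;

    table[12] = 1;            /* each write faults in just one page */
    table[999999999] = 2;

    printf("%d %d\n", table[12], table[999999999]);
    free(table);
    return 0;
}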

C When trying to calloc maximum free memory, no NULL return

I'm trying to initialize as much memory as possible (all of the free memory), then sleep for 10 seconds and free it up.
calloc initializes it and usage goes to a bit over 7800 MB out of the 8 GB that I have, so I think it does the job, but then (as far as I can tell from reading forums and such) the OOM killer comes and kills it. So the process gets killed instead of calloc returning NULL.
Is there any fix for that? How can I stop it right before it gets killed, or make calloc return NULL when it runs out of memory?
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main() {
    int *pointer;
    int megabajti = 1048576;      // MB
    int velikost_strani = 4096;   // page size
    long long int i = 0;
    while (1)
    {
        pointer = calloc(velikost_strani, sizeof(int));
        printf("Currently allocating %lld MB\n",
               (long long)(i * velikost_strani * sizeof(int) / megabajti));
        if (pointer == NULL)
        {
            printf("Max size is %lld MB\n",
                   (long long)(i * velikost_strani * sizeof(int) / megabajti));
            free(pointer);
            sleep(10);
        }
        ++i;
    }
    return 0;
}
Generally, malloc() and friends will not return NULL merely because you are out of physical RAM. They normally don't even know how much physical RAM you have, and will just try to get more from the OS, usually using mmap() or brk().
mmap(), too, will not return failure just because you're out of physical RAM, but will try to use virtual memory. This is common across UNIX systems, and it's generally not possible to work directly with physical memory instead of virtual memory. The OOM-killer is just Linux's specific implementation of what happens when virtual memory cannot handle the demand for backing store. One method to make the OOM-killer go away is to allocate more swap space (for this and similar reasons, I find that it is often a good idea to keep a lot of swap space around).
The most common case where mmap() and malloc() will return failure is when they cannot handle the allocation for internal reasons, such as being out of virtual address space (which is pretty rare on 64-bit systems).
That being said, there exist mechanisms for dealing more directly with physical RAM if you want to avoid the potential complexities of virtual memory. One POSIX-defined mechanism is mlock(), which will pin a certain amount of allocated RAM to physical pages; this is usually done to avoid the performance and/or security implications of swapping. It is restricted to the superuser (or to a small RLIMIT_MEMLOCK quota for unprivileged processes) and will usually only allow locking a small amount of memory in total, however. See its manpage for details.
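A small sketch of that mechanism, keeping the locked amount modest since unprivileged processes are usually limited by RLIMIT_MEMLOCK:
#include <sys/mman.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    size_t len = 16 * 4096;            /* a modest amount, 64 kB */
    char *buf = malloc(len);
    if (buf == NULL)
        return 1;

    /* Pin the pages so they are backed by physical RAM and never swapped out. */
    if (mlock(buf, len) != 0) {
        perror("mlock");               /* typically EPERM or ENOMEM without privileges */
    } else {
        memset(buf, 0, len);           /* guaranteed to touch resident pages */
        munlock(buf, len);
    }

    free(buf);
    return 0;
}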
On Linux, you can also tweak the overcommit behavior. I have to admit I've never tried it myself, but behavior #2 (as documented in the link) seems to promise some kind of behavior similar to what you seem to be looking for.
None of these mechanisms are "ordinary", however, so if you're looking for a way to limit your allocations to physical memory in a way that is portable and reproducible on systems that you don't personally manage, you're probably out of luck, quite simply.

C malloc and free

I was taught that if you malloc() but don't free(), the memory stays taken until a restart happens. Well, of course I tested it. A very simple piece of code:
#include <stdlib.h>
int main(void)
{
    while (1) malloc(1000);
}
And I watched over it in Task Manager (Windows 8.1).
Well, the program took up 2037.4 MB really quickly and just stayed like that. I understand it's probably Windows limiting the program.
But here is the weird part: When I closed the console, the memory use percentage went down, even though I was taught that it isn't supposed to!
Is it redundant to call free, since the operating system frees it up anyway?
(The question over here is related, but doesn't quite answer whether I should free or not.)
On Windows, a 32-bit process can by default only allocate about 2048 megabytes, because that's how much user-mode address space it gets. Some of this space is already used by the program and by Windows, so the achievable total is lower. malloc returns a null pointer when it fails, which is likely what happens at that point. You could modify your program like this to see that:
#include <stdlib.h>
#include <stdio.h>

int main(void)
{
    int counter = 0;
    while (1) {
        counter++;
        if (malloc(1000) == NULL) {
            printf("malloc failed after %d calls\n", counter);
            return 0;
        }
    }
}
Now you should get output like this:
$ ./mem
malloc failed after 3921373 calls
When a process terminates or when it is terminated from the outside (as you do by killing it through the task manager), all memory allocated by the process is released. The operating system manages what memory belongs to what process and can therefore free the memory of a process when it terminates. The operating system does not however know how each process uses the memory it requested from the operating system and depends on the process telling it when it doesn't need a chunk of memory anymore.
Why do you need free() then? Well, this only happens on program termination and does not discriminate between memory you still need and memory you don't need any more. When your process is doing complicated things, it is often constantly allocating and releasing memory for its own computations. It's important to release memory explicitly with free() because otherwise your process might at some point no longer be able to allocate new memory and will crash. It's also good programming practice to release memory when you can, so your process does not unnecessarily eat up tons of memory. Instead, other processes can use that memory.
It is advisable to call free after you are done with the memory you allocated, as you may need that memory later in your program, and it will be a problem if there is no space left for new allocations.
You should always seek portability for your code. Windows may free this space, but maybe other operating systems don't.
Every process in the operating system has a limited amount of addressable memory called the process address space. If you allocate a huge amount of memory and end up allocating all of the memory available to the process, malloc will fail and return NULL, and you will not be able to allocate any more memory for this process.
With all non-trivial OS, process resources are reclaimed by the OS upon process termination.
Unless there is a specific and overriding reason to explicitly free memory upon termination, you don't need to do it, and you should not try, for at least these reasons:
1) You would need to write code to do it, and test it, and debug it. Why do this, if the OS can do it? It's not just redundant, it reduces quality, because your explicit resource-releasing will never get as much testing as the OS has already had before it got released.
2) With a complex app, with a GUI and many subsystems and threads, cleanly freeing memory on shutdown is nigh-on impossible anyway, which leads to:
3) Many library developers have already given up on the 'you must explicitly release blah...' mantra because the complexity would result in the libs never being released. Many report unreleased (but not lost) memory to Valgrind and, with opaque libs, you can do nothing at all about it.
4) You must not free any memory that is in use by a running thread. To safely release all such memory in multithreaded apps, you must ensure that all process threads are stopped and cannot run again. User code does not have the tools to do this, the OS does. It is therefore not possible to explicitly free memory from user code in any kind of safe manner in such apps.
5) The OS can free the process memory in big chunks - much more quickly than messing around with dozens of sub-allocations in the C memory manager.
6) If the process is being terminated because it has failed due to memory management issues, calling free() many more times is not going to help at all.
7) Many teachers and profs say that you must explicitly free the memory, so it's obviously a bad plan.

C - wanted to know max memory allocable size in a program

I am a newbie in C.
I wanted to know the maximum memory an application is allowed to use.
So I wrote a little program like the following.
I have a machine with 16 GB total memory; 2 GB is used and 14 GB is free.
I expected this program to stop around 14 GB, but it runs forever.
What am I doing wrong here?
#include <stdlib.h>
#include <stdio.h>

int main(){
    long total = 0;
    void *v = malloc(1024768);
    while (1) {
        total += 1024768;
        printf("Total Memory allocated : %5.1f GB\n", (float)total / (1024 * 1024768));
        v = realloc(v, total);
        if (v == NULL) break;
    }
}
Edit: running this program on CentOS 5.4 64 bit.
On most modern operating systems, memory is allocated for each page which is used, not for each page which is "reserved". Your code doesn't use any pages, so no memory is really allocated.
Try clearing the memory you allocate with memset; eventually the program will crash because it can no longer allocate a page.
I tried to find a citation for this, but I was unsuccessful. Help with this is appreciated!
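For example, a variant of the program above that touches every page it allocates might look like this sketch (on Linux this can end in heavy swapping or the OOM killer stepping in rather than a clean NULL from realloc):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    size_t chunk = 1024 * 1024;        /* grow in 1 MB steps */
    size_t total = 0;
    void *v = NULL;

    while (1) {
        void *tmp = realloc(v, total + chunk);
        if (tmp == NULL) {
            printf("realloc failed at %.1f GB\n",
                   (double)total / (1024.0 * 1024.0 * 1024.0));
            break;
        }
        v = tmp;
        memset((char *)v + total, 0, chunk);   /* touch the newly added pages */
        total += chunk;
    }

    free(v);
    return 0;
}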
What am I doing wrong here?
Well you say that the machine you are running the application on has 16GB of RAM, so I'm going to assume it's 64-bit. This means that your application will run for ages before it exhausts 1/ the physical memory and 2/ the virtual memory.
On 32-bit Windows your application would stop at 4GB. On 64-bit Windows your application will stop at 16TB (assuming you have a page file that can grow automatically, and that much hard disk space).
http://support.microsoft.com/kb/294418
YMMV with other operating systems.
Edit: ruslik points out that in practice, on 32-bit Windows, your process will only be able to allocate up to 2GB or 3GB of memory (depending on how your binary is compiled). From the KB article I linked above, the total address space your process can occupy is 3GB or 4GB, with at least 1GB belonging to the OS that you can't use.
If you're on one specific platform/OS, you should use the reporting functions specific to that OS.
If you're writing a cross-platform program, you shouldn't rely on any free-memory-checking algorithm. Reasons are:
The OS may refuse to give you all available memory for its own reasons: fragmentation, allocation limits, and so on.
The OS may not really give you memory, just reserve the address space, if it has virtual memory management.
The check itself may change the internal state of the memory manager, so the memory available before and after the check may differ.
As the OS runs several processes in parallel, available memory may change spontaneously due to other processes' activity.

Resources