I'm trying to understand the CPU's role in tracking a program's stack/heap allocation.
Reading some material, I've come across this:
The stack area traditionally adjoined the heap area and grew the opposite direction; when the stack pointer met the heap pointer, free memory was exhausted.
Are the stack and heap pointers stored in program specific registers?
If the stack pointer is pointing to the top of the stack, and (I'm assuming) the heap pointer is pointing to the end of the heap, how would these pointers ever meet without overwriting memory (overflow)?
How does this work in modern systems?
Are the stack and heap pointers stored in program specific registers?
CPUs of stack-based architectures (which represent the overwhelming majority of the CPUs in use today) have a special register for the stack pointer. This is possible because the stack, by its very nature, does not get fragmented; hence a single pointer is sufficient.
There is no such thing as a "heap pointer", because the heap is potentially a fragmented data structure. Heap allocators keep a special table of memory fragments available for allocation, and adjust it as the program allocates and releases memory. The memory manager also keeps a pointer to the highest address that has been allocated from the heap.
If the stack pointer is pointing to the top of the stack, and (I'm assuming) the heap pointer is pointing to the end of the heap, how would these pointers ever meet without overwriting memory (overflow)?
Since the stack pointer cannot cross into the heap without causing an error, many systems limit the stack to a certain maximum size and make sure that the memory allocator never lets the high point of the heap cross into the region reserved for the stack.
Note: On systems that support concurrency, there may be more than one stack active at a time. In this case the stacks are set up next to each other, with their limits monitored to detect stack overflows.
Related
Ever since I was introduced to C, I was told that in C dynamic memory allocation is done using the functions in the malloc family. I also learned that memory dynamically allocated using malloc is allocated on the heap section of the process.
Various OS textbooks say that malloc sometimes (though not always) involves a system call to allocate structures on the heap for the process. Now, supposing that malloc returns a pointer to a chunk of bytes allocated on the heap, why should it need a system call? The activation records of a function are placed in the stack section of the process, and since the "stack section" is already part of the virtual address space of the process, pushing and popping activation records and manipulating the stack pointer simply start from the highest possible address of the virtual address space. None of that requires a system call.
Now, on the same grounds, since the "heap section" is also a part of the virtual address space of the process, why should a system call be necessary for allocating a chunk of bytes in this section? A routine like malloc could manage the "free" list and "allocated" list on its own. All it needs to know is the end of the "data section". Certain texts say that system calls are necessary to "attach memory to the process for dynamic memory allocation", but if malloc allocates memory on the "heap section", why is it at all necessary to attach memory to the process during malloc? The memory could simply be taken from a portion already part of the process.
While going through the text "The C Programming Language" [2e] by Kernighan and Ritchie, I came across their implementation of the malloc function [section 8.7, pages 185-189]. The authors say:
malloc calls upon the operating system to obtain more memory as necessary.
Which is what the OS texts say, but counter intuitive to my thought above (if malloc allocates space on heap).
Since asking the system for memory is a comparatively expensive operation, the authors do not do that on every call to malloc, so they create a function morecore which requests at least NALLOC units; this larger block is chopped up as needed. And the basic free list management is done by free.
But the thing is that the authors use sbrk() to ask the operating system for memory in morecore. Now Wikipedia says:
brk and sbrk are basic memory management system calls used in Unix and Unix-like operating systems to control the amount of memory allocated to the data segment of the process.
Where
a data segment (often denoted .data) is a portion of an object file or the corresponding address space of a program that contains initialized static variables, that is, global variables and static local variables.
Which I guess is not the "heap section". [Data section is the second section from bottom in the picture above, while heap is the third section from bottom.]
I am totally confused. I want to know what really happens and how both concepts are correct. Please help me understand by joining the scattered pieces together...
In your diagram, the section labeled "data" is more precisely called "static data"; the compiler pre-allocates this memory for all the global variables when the process starts.
The heap that malloc() uses is the rest of the process's data segment. This initially has very little memory assigned to it in the process. If malloc() needs more memory, it can use sbrk() to extend the size of the data segment, or it can use mmap() to create additional memory segments elsewhere in the address space.
Why does malloc() need to do this? Why not simply make the entire address space available for it to use? There are historical and practical reasons for this.
The historical reason is that early computers didn't have virtual memory. All the memory assigned to a process was swapped in bulk to disk when switching between processes. So it was important to only assign memory pages that were actually needed.
The practical reason is that this is useful for detecting various kinds of errors. If you've ever gotten a segmentation violation error because you dereferenced an uninitialized pointer, you've benefited from this. Much of the process's virtual address space is not allocated to the process, which makes it likely that uninitialized pointers point to unavailable memory, and you get an error trying to use it.
There's also an unallocated gap between the heap (growing upwards) and the stack (growing downward). This is used to detect stack overflow -- when the stack tries to use memory in that gap, it gets a fault that's translated to the stack overflow signal.
This is the Standard C Library specification for malloc(), in its entirety:
7.22.3.4 The malloc function
Synopsis
#include <stdlib.h>
void *malloc(size_t size);
Description
The malloc function allocates space for an object whose size is
specified by size and whose value is indeterminate.
Returns
The malloc function returns either a null pointer or a pointer to the
allocated space.
That's it. There's no mention of the Heap, the Stack or any other memory location, which means that the underlying mechanisms for obtaining the requested memory are implementation details.
In other words, you don't care where the memory comes from, from a C perspective. A conforming implementation is free to implement malloc() in any way it sees fit, so long as it conforms to the above specification.
I was told that in C dynamic memory allocation is done using the functions in the malloc family. I also learned that memory dynamically allocated using malloc is allocated on the heap section of the process.
Correct on both points.
Now supposing that malloc returns a pointer to a chunk of bytes allocated on the heap, why should it need a system call?
It needs to request an adjustment to the size of the heap, to make it bigger.
...the "stack section" is already a part of the virtual address space of the process, pushing and popping of activation records, manipulation of stack pointers, [...] does not even require a system call.
The stack segment is grown implicitly, yes, but that's a special feature of the stack segment. There's typically no such implicit growing of the data segment. (Note, too, that the implicit growing of the stack segment isn't perfect, as witnessed by the number of people who post questions to SO asking why their programs crash when they allocate huge arrays as local variables.)
Now, on the same grounds, since the "heap section" is also a part of the virtual address space of the process, why should a system call be necessary for allocating a chunk of bytes in this section?
Answer 1: because it's always been that way.
Answer 2: because you want accidental stray pointer references to crash, not to implicitly allocate memory.
malloc calls upon the operating system to obtain more memory as necessary.
Which is what the OS texts say, but counter intuitive to my thought above (if malloc allocates space on heap).
Again, malloc does request space on the heap, but it must use an explicit system call to do so.
But the thing is that the authors use sbrk() to ask the operating system for memory in morecore. Now Wikipedia says:
brk and sbrk are basic memory management system calls used in Unix and Unix-like operating systems to control the amount of memory allocated to the data segment of the process.
Different people use different nomenclatures for the different segments. There's not much of a distinction between the "data" and "heap" segments. You can think of the heap as a separate segment, or you can think of those system calls -- the ones that "allocate space on the heap" -- as simply making the data segment bigger. That's the nomenclature the Wikipedia article is using.
Some updates:
I said that "There's not much of a distinction between the 'data' and 'heap' segments." I suggested that you could think of them as subparts of a single, more generic data segment. And actually there are three subparts: initialized data, uninitialized data or "bss", and the heap. Initialized data has initial values that are explicitly copied out of the program file. Uninitialized data starts out as all bits zero, and so does not need to be stored in the program file; all the program file says is how many bytes of uninitialized data it needs. And then there's the heap, which can be thought of as a dynamic extension of the data segment, which starts out with a size of 0 but may be dynamically adjusted at runtime via calls to brk and sbrk.
I said, "you want accidental stray pointer references to crash, not to implicitly allocate memory", and you asked about this. This was in response to your supposition that explicit calls to brk or sbrk ought not to be required to adjust the size of the heap, and your suggestion that the heap could grow automatically, implicitly, just like the stack does. But how would that work, really?
The way automatic stack allocation works is that as the stack pointer grows (typically "downward"), it eventually reaches a point that it points to unallocated memory -- that blue section in the middle of the picture you posted. At that point, your program literally gets the equivalent of a "segmentation violation". But the operating system notices that the violation involves an address just below the existing stack, so instead of killing your program on an actual segmentation violation, it quick-quick makes the stack segment a little bigger, and lets your program proceed as if nothing had happened.
So I think your question was, why not have the upward-growing heap segment work the same way? And I suppose an operating system could be written that worked that way, but most people would say it was a bad idea.
I said that in the stack-growing case, the operating system notices that the violation involves an address "just below" the existing stack, and decides to grow the stack at that point. There's a definition of "just below", and I'm not sure what it is, but these days I think it's typically a few tens or hundreds of kilobytes. You can find out by writing a program that allocates a local variable
char big_stack_array[100000];
and seeing if your program crashes.
Now, sometimes a stray pointer reference -- that would otherwise cause a segmentation violation style crash -- is just the result of the stack normally growing. But sometimes it's a result of a program doing something stupid, like the common error of writing
char *retbuf;
printf("type something:\n");
fgets(retbuf, 100, stdin);
And the conventional wisdom is that you do not want to (that is, the operating system does not want to) coddle a broken program like this by automatically allocating memory for it (at whatever random spot in the address space the uninitialized retbuf pointer seems to point) to make it seem to work.
If the heap were set up to grow automatically, the OS would presumably define an analogous threshold of "close enough" to the existing heap segment. Apparently stray pointer references within that region would cause the heap to automatically grow, while references beyond that (farther into the blue region) would crash as before. That threshold would probably have to be bigger than the threshold governing automatic stack growth. malloc would have to be written to make sure not to try to grow the heap by more than that amount. And true, stray pointer references -- that is, program bugs -- that happened to reference unallocated memory in that zone would not be caught. (Which is, it's true, what can happen for buggy, stray pointer references just off the end of the stack today.)
But, really, it's not hard for malloc to keep track of things, and explicitly call sbrk when it needs to. The cost of requiring explicit allocation is small, and the cost of allowing automatic allocation -- that is, the cost of the stray pointer bugs not caught -- would be larger. This is a different set of tradeoffs than for the stack growth case, where an explicit test to see if the stack needed growing -- a test which would have to occur on every function call -- would be significantly expensive.
Finally, one more complication. The picture of the virtual memory layout that you posted -- with its nice little stack, heap, data, and text segments -- is a simple and perhaps outdated one. These days I believe things can be a lot more complicated. As @chux wrote in a comment, "your malloc() understanding is only one of many ways allocation is handled. A clear understanding of one model may hinder (or help) understanding of the many possibilities." Among those complicating possibilities are:
A program may have multiple stack segments maintaining multiple stacks, if it supports coroutines or multithreading.
The mmap and shm_open system calls may cause additional memory segments to be allocated, scattered anywhere within that blue region between the heap and the stack.
For large allocations, malloc may use mmap rather than sbrk to get memory from the OS, since it turns out this can be advantageous.
See also Why does malloc() call mmap() and brk() interchangeably?
As the bard said, "There are more things in heaven and earth, Horatio, than are dreamt of in your philosophy." :-)
Not all virtual addresses are available at the beginning of a process.
The OS does maintain a virtual-to-physical map, but (at any given time) only some of the virtual addresses are in the map. Reading or writing a virtual address that isn't in the map causes an instruction-level exception. sbrk puts more addresses into the map.
The stack is just like the data section, but it has a fixed size, and there is no sbrk-like system call to extend it. We could say there is no heap section at all: only a fixed-size stack section and a data section that can be grown upward by sbrk.
What you call the heap section is actually a part of the data section managed by malloc and free. Note that the code doing this heap management lives not in the OS kernel but in the C library, executing in CPU user mode.
According to many sources, the stack and the heap are memory regions divided by empty space. As far as I understand, the stack and the heap grow toward each other as the program runs.
Does that mean that the stack and the heap are both of variable size? If they are, can one of them occupy the same addresses the other occupied during the previous run (suppose the two runs took place at the same memory addresses)?
If they are of variable size, I guess a stack overflow occurs when the stack tries to take what belongs to the heap. If not, the stack bound is just that: a bound, placed somewhere in the middle of the initially empty space between the two memory regions.
A bit of clarification: by variability of size I mean a change in the reserved space. That is, if the stack size is variable, the space reserved for it changes (much as when you resize an array with realloc, though that is only an analogy). A stack overflow then occurs when the stack hits some bound, be it another region such as the heap or something else. If the size is not variable, the reserved space stays the same, and a stack overflow is simply caused by running out of it. I also wonder whether the heap is variably sized too.
The stack must be contiguous in address space. The heap has no such constraint. Anything beyond that depends how a given platform lays out the memory.
If your address space cannot have holes, a simple layout for an embedded device could be: from address n downward is the stack (stacks tend to grow down), and from n upward can be used for the heap. Static data then sits at some high address, and code is in ROM.
Linux and Windows run on platforms with an MMU, so they have many more options with regard to memory arrangement.
I have some questions related to memory structure.
Are the stack and heap determined by the OS, or are they physically separated?
If they are determined by the OS, which OSes other than Windows have the stack and heap as components of their memory structure?
As far as I know, the default stack size is 1 MB, and I can expand it manually, but why is the default so small?
And if the stack size is 1 MB, can I not hold data exceeding 1 MB in a local variable?
And my last question is: is there any reason a programmer needs to be more aware of memory structure when writing unmanaged code (e.g. native C++) rather than managed code (e.g. C#)?
The default stack size depends on the operating system. Some are small, some are large. The stack and the heap are "physically separated" in the sense that their addresses are different.
The reasons the differences are significant to programmers are many:
allocations on the stack are automatically deallocated when the function/method returns
large allocations on the stack may run out of memory even though the machine has free memory (due to stack size limitations)
in some environments (e.g. C#, Java, Python, JavaScript), while local variables are allocated on the stack (as in C and C++), the object variables on the stack are all references (similar to pointers) and so always use a constant, small amount of stack memory
recursion or deep function/method calling can run out of stack space
allocation and deallocation on the stack is faster than on the heap (the compiler computes the size and includes it in a single pointer move operation when the function is entered, no free lists to scan or anything -- this applies to C and C++ but not to C#, Java, etc.)
never return a pointer to data allocated on the stack; if you do then sometime later you will at best crash your program and at worst not know what data corruption has occurred
allocations on the stack can never cause memory leaks; allocations on the heap can be leaked
data on the stack is local to the function (unless it passes a pointer to it to a function it calls) and thus is thread-safe; data on the heap might be accessible from more than one thread and may need proper synchronization controls
Possible Duplicate:
What and where are the stack and heap?
With regard to the basic concepts of memory layout in a C program, I understand that:
The language uses two primary data structures: the stack and the heap.
The stack is created to store the local variables and bookkeeping data of subroutines.
The heap is created to store the dynamically allocated variables of the program.
The heap is variable in length. (Not so sure about the stack.)
Normally it is the responsibility of the compiler/language to request that the OS create these data structures before execution.
Questions
What is the initial size with which a stack/heap is created? and who decides it?
Where in physical memory are they created? I see a general description such as "Stack is created in the top-level-address and the heap at the low-level-address". Please elaborate on this.
"Stack is created in the top-level-address and the heap at the low-level-address." Please elaborate on this.
This is a myth. It may have a basis in historical truth. It might sometimes resonate with things you see in real life. But it is not literally true.
It's easy enough to explore, though:
#include <stdlib.h>
#include <stdio.h>

void check(int depth) {
    char c;
    char *ptr = malloc(1);
    printf("stack at %p, heap at %p\n", (void *)&c, (void *)ptr);
    if (depth <= 0) return;
    check(depth - 1);
}

int main() {
    check(10);
    return 0;
}
On my machine I see:
stack at 0x22ac3b, heap at 0x20010240
stack at 0x22ac0b, heap at 0x200485b0
stack at 0x22abdb, heap at 0x200485c0
stack at 0x22abab, heap at 0x200485d0
stack at 0x22ab7b, heap at 0x200485e0
stack at 0x22ab4b, heap at 0x200485f0
stack at 0x22ab1b, heap at 0x20048600
stack at 0x22aaeb, heap at 0x20048610
stack at 0x22aabb, heap at 0x20048620
stack at 0x22aa8b, heap at 0x20048630
stack at 0x22aa5b, heap at 0x20048640
So, the stack is going downwards and the heap is going upwards (as you might expect based on the myth), but the stack has the smaller address, and they are not growing toward each other (myth busted).
Btw, my check function is tail-recursive, and on some implementations with some compiler options you might see the stack not moving at all. Which tells you something about why the standard doesn't mandate how all this works -- if it did it might inadvertently forbid useful optimizations.
As mentioned already, sizes are OS-specific. For example, on Windows using Visual Studio, the default stack size is 1 MB (see MSDN).
On Linux, the following command shows your current limit:
ulimit -s (or ulimit -a)
On my Linux Mint 64-bit it shows 8192 KB.
Every program, when loaded in memory, has several segments. In assembly one can indicate each of these using directives such as .data and .code (Intel x86).
It is the data segment that has several subsections; both the stack and the heap are part of it, in addition to several others.
The stack can also grow implicitly: when you make another function call, an activation record is pushed onto the stack, thereby using more stack memory. That is why infinite recursion results in a crash -- the program runs out of allocated stack.
When a function call returns, that activation record is popped and stack shrinks.
In contrast heap grows from the opposite direction and contains all dynamically allocated memory.
The reason these two segments grow in opposite directions is to maximize the utilization of their combined memory. Note that, as mentioned in the comments, this is not part of the C standard, but most common OSes implement it this way.
------ stack starts ----------- (stack grows downward)

       Unless they cross each other, a program is okay to run.

------ heap starts ------------ (heap grows upward)
If your program uses no heap, your stack can utilize the maximum memory, including the heap's share. If the program makes few recursive calls and uses few local variables (i.e., uses little stack memory), it can use the heap to the fullest.
Other parts of the data segment include the BSS, which contains items such as uninitialized static variables.
What is the initial size with which a stack/heap is created? and who decides it?
This is compiler- and OS-specific.
Where in physical memory are they created? I see a general description such as "Heap is created in the top-level-address and stack at the low-level-address".
This is compiler- and OS-specific.
Really. The language standard does not mandate the minimum stack size nor specifies the location of either the stack or the heap in memory. And the reason for that is to make C programs less dependent on these details and therefore more portable to different platforms (read: different OSes, different CPUs, different compilers).
First of all, the C standard doesn't impose any requirements on how the stack/heap is implemented by
the platform.
What is the initial size with which a stack/heap is created? and who decides it?
Typically a fixed size of stack is allocated for every process by the OS, and that size is platform-specific.
There's no comparable limit on heap size; programs usually have all of the available virtual address space to draw on.
Wherein physical memory are they are created?
This is platform-specific. Typically the stack grows downward and the heap grows upward.
We have functions to allocate memory on the stack in both Windows and Linux systems, but their use is discouraged, and they are not part of the C standard. This means they provide non-standard behavior. As I'm not that experienced, I cannot understand what the problem could be with allocating memory from the stack rather than using the heap.
Thanks.
EDIT: My view: as Delan has explained, the amount of stack allocated to a program is decided at compile time, so we cannot ask the OS for more stack if we run out of it; the only way out would be a crash. So it's better to leave the stack for the storage of primary things like variables, function calls, arrays, structures, etc., and use the heap up to the capacity of the OS/machine.
Stack memory has the benefit of frequently being faster to allocate than heap memory.
However, the problem with this, at least in the specific case of alloca(3), is that in many implementations, it just decreases the stack pointer, without giving regard or notification as to whether or not there actually is any stack space left.
The stack memory is fixed at compile- or runtime, and does not dynamically expand when more memory is needed. If you run out of stack space, and call alloca, you have a chance of getting a pointer to non-stack memory. You have no way of knowing if you have caused a stack overflow.
Addendum: this does not mean that we should never dynamically allocate stack memory. If you are
in a heavily controlled and monitored environment, such as an embedded application, where the stack limits are known or can be set,
keeping track of all memory allocations carefully to avoid a stack overflow, and
ensuring that you don't recurse deeply enough to cause a stack overflow,
then stack allocations are fine, and can even be beneficial, saving time (a stack-pointer move is all that happens) and memory (you're using the pre-allocated stack rather than eating into the heap).
Memory on the stack (automatic, in the broader sense) is fast, safe, and foolproof compared to the heap.
Fast: because the frame layout is computed at compile time, allocation at runtime is just a stack-pointer adjustment, with no bookkeeping overhead.
Safe: it's exception-safe. The stack is automatically unwound when an exception is thrown.
Foolproof: you don't have to worry about scenarios like virtual destructors; destructors are called in the proper order.
Still, sometimes you do have to allocate memory at runtime; when you do, first resort to standard containers like vector, map, list, etc. Allocating memory through raw pointers should always be a judicious decision.