Can anyone please tell me the difference between internal and external fragmentation when allocating disk space for files?
External fragmentation
The total free memory is enough to satisfy a request or to hold a process, but it is not contiguous, so it cannot be used.
Internal fragmentation
The memory block assigned to a process is bigger than what the process requested. The leftover portion stays unused because it cannot be given to another process.
First of all, the term fragmentation suggests an entity divided into parts: fragments.
Internal fragmentation: A typical paper book is a collection of pages (text divided into pages). When a chapter ends partway down a page and the next chapter starts on a new page, the gap between them is wasted space: a chunk (a page, for a book) has unused space inside it (internally), the "white space".
External fragmentation: Say you have a paper diary and you didn't write your thoughts sequentially page after page, but rather at random. You might end up wanting to write 3 pages in a row but being unable to, because there are no 3 clean pages next to each other. You might have 15 clean pages in the diary in total, but they're not contiguous.
I am an operating system that only allocates memory in 10 MB partitions.
Internal Fragmentation
You ask for 17 MB of memory
I give you 20 MB of memory
Fulfilling this request has just led to 3 MB of internal fragmentation.
External Fragmentation
You ask for 20 MB of memory
I give you 20 MB of memory
The 20 MB of memory that I give you is not immediately adjacent to an existing piece of allocated memory. By handing you this memory, I have split a single unallocated space into two smaller spaces.
Fulfilling this request has just led to external fragmentation: the free memory is no longer one contiguous region.
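The internal fragmentation arithmetic above can be sketched in a few lines of C. This is only an illustration: the 10 MB partition size comes from the example, while the function names are made up for the sketch.

```c
#include <assert.h>

/* Partition size from the example above: memory is handed out in 10 MB units. */
#define PARTITION_MB 10u

/* Round a request up to a whole number of partitions. */
unsigned granted_mb(unsigned requested_mb)
{
    assert(requested_mb > 0);
    unsigned partitions = (requested_mb + PARTITION_MB - 1) / PARTITION_MB;
    return partitions * PARTITION_MB;
}

/* Internal fragmentation: space handed out but never asked for. */
unsigned internal_waste_mb(unsigned requested_mb)
{
    return granted_mb(requested_mb) - requested_mb;
}
```

A 17 MB request is granted 20 MB and wastes 3 MB, while a 20 MB request wastes nothing.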
Presumably from this site:
Internal Fragmentation Internal fragmentation occurs when the memory
allocator leaves extra space empty inside of a block of memory that
has been allocated for a client. This usually happens because the
processor’s design stipulates that memory must be cut into blocks of
certain sizes -- for example, blocks may be required to be evenly
divided by four, eight or 16 bytes. When this occurs, a client that
needs 57 bytes of memory, for example, may be allocated a block that
contains 60 bytes, or even 64. The extra bytes that the client doesn’t
need go to waste, and over time these tiny chunks of unused memory can
build up and create large quantities of memory that can’t be put to
use by the allocator. Because all of these useless bytes are inside
larger memory blocks, the fragmentation is considered internal.
External Fragmentation External fragmentation happens when the
memory allocator leaves sections of unused memory blocks between
portions of allocated memory. For example, if several memory blocks
are allocated in a continuous line but one of the middle blocks in the
line is freed (perhaps because the process that was using that block
of memory stopped running), the free block is fragmented. The block is
still available for use by the allocator later if there’s a need for
memory that fits in that block, but the block is now unusable for
larger memory needs. It cannot be lumped back in with the total free
memory available to the system, as total memory must be contiguous for
it to be useable for larger tasks. In this way, entire sections of
free memory can end up isolated from the whole that are often too
small for significant use, which creates an overall reduction of free
memory that over time can lead to a lack of available memory for key
tasks.
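To make the external case concrete, here is a small sketch in C that models memory as an array of equally sized blocks (a simplification chosen for the example, not how any particular allocator stores its state). It shows that the total amount of free memory can be large while the longest contiguous run is too short to satisfy a request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Count the free blocks in a map where used[i] is true for an allocated block. */
size_t total_free(const bool *used, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (!used[i])
            count++;
    return count;
}

/* Length of the longest run of adjacent free blocks. */
size_t largest_free_run(const bool *used, size_t n)
{
    size_t best = 0, run = 0;
    for (size_t i = 0; i < n; i++) {
        run = used[i] ? 0 : run + 1;
        if (run > best)
            best = run;
    }
    return best;
}

/* A request for `want` contiguous blocks succeeds only if one run is long enough. */
bool can_allocate(const bool *used, size_t n, size_t want)
{
    assert(want > 0);
    return largest_free_run(used, n) >= want;
}
```

With every second block in use, half of the memory is free, yet a request for two adjacent blocks fails.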
First of all, this may be more like a math problem.
I am writing a module that requests memory piece by piece and never releases it until its instance is destroyed, so I wrote a simple memory manager to reduce the number of malloc calls. The memory manager acquires a block of memory during initialization, with a size the user can control, and then hands out pieces of that block on request. If the manager runs out of memory, it doubles the size of its block with realloc. In the end, the relation between the required memory size x and the total wasted memory is:
f(x) = 2^k - x, where 2^(k-1) < x <= 2^k
Now I have several memory users. I can either create a memory manager for each of them (the per-manager overhead is negligible), or create a single memory manager and share it among all users. The number of users and the amount of memory each one uses may vary over a wide range. So, which strategy is more likely to waste less memory?
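One way to explore the question is to evaluate f(x) for both strategies on sample workloads. The sketch below assumes each manager starts at 1 byte and doubles, so its capacity is always the smallest power of two that fits; the function names are invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Capacity reached by doubling from 1 until x bytes fit: the smallest 2^k >= x. */
size_t doubled_capacity(size_t x)
{
    assert(x > 0 && x <= SIZE_MAX / 2);   /* keep the doubling from overflowing */
    size_t cap = 1;
    while (cap < x)
        cap *= 2;
    return cap;
}

/* f(x) = 2^k - x for 2^(k-1) < x <= 2^k */
size_t waste(size_t x)
{
    return doubled_capacity(x) - x;
}

/* Total waste when every user gets its own manager. */
size_t waste_separate(const size_t *usage, size_t n)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += waste(usage[i]);
    return total;
}

/* Total waste when all users share one manager. */
size_t waste_shared(const size_t *usage, size_t n)
{
    size_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += usage[i];
    return waste(sum);
}
```

For usages {5, 3} the separate managers waste 3 + 1 = 4 bytes and the shared one wastes 0. For {4, 4, 1} it is the other way round: 0 bytes separately against 7 shared. So under this model neither strategy wins for every workload.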
The memory manager hides the actual position of the memory block and gives users an offset instead, to avoid problems when realloc moves the block. The interface is quite simple:
void *memo_ref(Memo memo, MemoOffset offset)
{
panic(offset < memo->used, "invalid offset is passed to memo");
return &memo->memory[offset];
}
So I think the compiler will inline it, and optimization should not be a problem.
There is also no need to worry about data races, since all users of the manager run on the same thread. They just make their requests in an interleaved way.
In my opinion, one big manager leads to a faster program, since there are fewer realloc calls, which are expensive. So my focus is on memory usage. Thanks for your help.
This won't work anyway: realloc is not guaranteed to succeed in resizing in place - it's free to allocate a larger block and copy all the data into the larger block. I presume that the users expect the data to stay at fixed addresses.
The simplest way to address this issue is not to use the C library, but to use the platform-specific virtual memory APIs to reserve a large chunk of address space, then commit memory to it on demand. E.g. on Windows, you'd use VirtualAlloc(NULL, size, MEM_RESERVE, PAGE_NOACCESS) to reserve the needed contiguous address space, and then VirtualAlloc(addr, size, MEM_COMMIT, PAGE_READWRITE) to commit the pages as your used memory area grows. This means that you waste at most one partially used page per memory pool. If you stick to small (4k) pages, this means that there's never more than 4092 bytes wasted per pool (one 4-byte word short of a page).
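The same reserve-then-commit idea can be sketched on a POSIX system. This is an assumption beyond the Windows calls named above: it uses mmap with PROT_NONE to reserve address space and mprotect to make pages usable as the pool grows. The Pool type and the pool_* names are made up for the sketch.

```c
#define _DEFAULT_SOURCE            /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

typedef struct {
    unsigned char *base;   /* start of the reserved range; never moves */
    size_t reserved;       /* bytes of address space reserved */
    size_t committed;      /* bytes currently readable and writable */
} Pool;

/* Reserve address space only; nothing in it is accessible yet. */
int pool_reserve(Pool *p, size_t reserve_bytes)
{
    assert(p != NULL && reserve_bytes > 0);
    void *m = mmap(NULL, reserve_bytes, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (m == MAP_FAILED)
        return -1;
    p->base = m;
    p->reserved = reserve_bytes;
    p->committed = 0;
    return 0;
}

/* Make at least `need` bytes usable, rounding up to whole pages. */
int pool_commit(Pool *p, size_t need)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t want = (need + page - 1) / page * page;
    if (want > p->reserved)
        return -1;                       /* would exceed the reservation */
    if (want > p->committed) {
        if (mprotect(p->base + p->committed, want - p->committed,
                     PROT_READ | PROT_WRITE) != 0)
            return -1;
        p->committed = want;
    }
    return 0;
}

/* Give the whole range back to the operating system. */
void pool_release(Pool *p)
{
    munmap(p->base, p->reserved);
    p->base = NULL;
    p->reserved = p->committed = 0;
}
```

Because the base address never changes, data written before a grow is still at the same address afterwards, which is the property realloc does not guarantee.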
Furthermore, on 64-bit systems, there's no need to pass the addresses using a base+offset: the reallocations won't ever realistically run out of address space, so there's no need to move the mapped view of memory within the virtual address space. You can use plain pointers/references!
There are benefits to having a separate memory area for each user: it improves the locality of reference - the user's data is close together, and thus improves cache performance at all levels, including the page swapper should paging out occur.
In a 64-bit application, reserving large address spaces for each user is not an issue and has minimal overhead. E.g. you can reserve 1Gbyte for each user. It is worth reserving say twice the largest area a user may need.
In a 32-bit application, if the address spaces needed are large, the users may need to cope with having their data moved within the address space. This can be achieved by remapping the pages and thus doesn't need to copy anything. Taking a sensible assumption of there being a 64-bit OS backing the application, you may benefit from mapping the memory area to a file. This lets you completely unmap it from the address space without losing the contents, and those contents don't have to hit the disk either - the OS will cache them if it can. Thus you can grow the address space for a user without having to copy anything and without wasting the smaller address space during a grow operation: first unmap the smaller view of the file, then map the larger view of the file. The user will need to cope with being given a new starting address for the memory area. The user may refer to the memory by adding offsets to a base address: this performs well enough and allows the flexibility of a movable address space.
I'm a computer science student, and this has been puzzling me. As I understand it, all memory used by a process is tracked by either the stack or the heap. I understand the stack very well, right down to the assembly level; I feel that I know how it works and why it works the way it does. The heap, however, I know much less about.
I understand how allocations and deallocations happen on the stack, so I understand their cost. I do not understand how allocations and deallocations happen on the heap. I know heap allocations are slower than stack allocations because I have been told so, and because it makes sense that the heap must be more complicated, but I do not know exactly why. Why is the heap called the heap? Is it a heap of addresses of free blocks of memory?
Take C. In C, you interact with the heap mainly through malloc (and calloc, etc) and free. I understand the effect of calling these methods, but I have no idea what they actually do internally; the heap is a black box to me, so I don't intuitively understand the cost of my actions.
I understand the possibility that the implementation of malloc (for example) might vary depending on any number of things, but I find it difficult to believe it could vary too wildly. The fundamentals of the heap have to be constant across most implementations, otherwise it wouldn't have such a specific name. Any pointers (ha)?
Edit: This is not a duplicate of an earlier thread. I've read the thread in question and it does not answer my questions.
"The size of the heap is set on application startup, but can grow as space is needed (the allocator requests more memory from the operating system)."
I have said that I know this in my question; I understand how to get memory from the heap, I want to know what the heap itself is doing to get that memory (beyond "it requests it". Does this mean that the heap is tracked entirely by the OS? If so, what's a basic outline of how the OS tracks it? I understand it varies, but the fundamentals have to be the same).
"The stack is faster because the access pattern makes it trivial to allocate and deallocate memory from it (a pointer/integer is simply incremented or decremented), while the heap has much more complex bookkeeping involved in an allocation or deallocation."
My entire question is about what that complex bookkeeping actually is.
To start off with, stack and heap are just read/write memory. There is no distinction between the two. A stack is simply a stack because it is managed as a stack. A heap is simply a heap because it is managed as a heap. The memory is allocated from the operating system in the same way.
One difference, however, is that a stack must be contiguous in memory. A heap does not necessarily need to be contiguous.
Because of the ordering of allocations and deallocations and the contiguous memory in a stack, an allocation is a single instruction.
SUB #NUMBEROFBYTESYOUWANT, SP
Why is the heap called the heap?
There's probably a story on that but that's like asking why a cat is called a cat.
Is it a heap of addresses of free blocks of memory?
No. Memory becomes a heap when it is controlled by a heap manager. The problem with your question is that heaps can be managed in different ways. You find scores of malloc/free implementations on the internet that allocate memory in different ways.
I understand the effect of calling these methods, but I have no idea what they actually do internally; the heap is a black box to me, so I don't intuitively understand the cost of my actions.
The simplest would be a malloc that always returns a block of a fixed size. Let's say 1K bytes. The heap manager just maintains a list of free blocks and malloc just picks one off the list. If you do malloc with something greater than 1024, it fails. If there are no free blocks in the list, the heap manager calls an operating system service to map more memory to the process. The heap manager then puts that block of memory under management. When the application calls free, the block is put back on the list of those available.
That's a simple example. If you request 5 bytes, you still get a full 1024-byte block underneath.
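A minimal sketch of such a fixed-size heap manager in C follows. It is an illustration only: the arena is a static array of four blocks rather than memory mapped from the operating system, and the names fixed_malloc and fixed_free are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_SIZE  1024u
#define BLOCK_COUNT 4u     /* hypothetical; a real manager would ask the OS for more */

typedef union Block {
    union Block *next;                 /* valid only while the block is free */
    unsigned char bytes[BLOCK_SIZE];   /* payload while the block is in use */
} Block;

static Block arena[BLOCK_COUNT];
static Block *free_list;
static int initialized;

/* Thread every block onto the free list. */
static void heap_init(void)
{
    for (size_t i = 0; i < BLOCK_COUNT; i++) {
        arena[i].next = free_list;
        free_list = &arena[i];
    }
    initialized = 1;
}

/* Pick the first block off the free list, whatever size was requested. */
void *fixed_malloc(size_t size)
{
    if (!initialized)
        heap_init();               /* the heap is set up on the first call */
    if (size > BLOCK_SIZE || free_list == NULL)
        return NULL;
    Block *b = free_list;
    free_list = b->next;
    return b->bytes;
}

/* Put the block back on the free list. */
void fixed_free(void *ptr)
{
    if (ptr == NULL)
        return;
    /* Sanity check: ptr must be the start of a block in the arena. */
    assert(((unsigned char *)ptr - (unsigned char *)arena) % sizeof(Block) == 0);
    Block *b = (Block *)ptr;
    b->next = free_list;
    free_list = b;
}
```

Note that the free list costs no extra memory: the link is stored inside the free blocks themselves.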
A malloc implementation could:
Manage blocks that are powers of 2 bytes in size and maintain a separate free block list for each size of block.
Manage the entire heap as a pool of memory that can be chopped up on demand.
(And there are hundreds of other ways it could and has been done).
The operating system allocates memory in pages (usually around 512 bytes to 4 KB, depending upon the system). The purpose of malloc (or its equivalent in other languages) is to allocate memory in other sizes.
In regard to other points you made:
"The size of the heap is set on application startup, but can grow as space is needed (the allocator requests more memory from the operating system)."
That may be true on some systems but the first half is not always correct. The heap can be initialized upon the first call to malloc.
Does this mean that the heap is tracked entirely by the OS?
The operating system knows nothing about the heap. It just knows that it has allocated memory to the process. The process can do whatever it wants with the heap.
My entire question is about what that complex bookkeeping actually is.
Again that depends entirely upon the heap manager that you link to the application. If there is a way to do such "bookkeeping" it has already been done.
The heap is just another abstract data type in RAM that malloc splits into chunks and delivers pointers to. It is not physically different from the stack.
http://compgroups.net/comp.lang.asm.x86/heap/59347
In x86 and most other systems, both the stack and the heap are just memory in RAM and hopefully primarily from cache.
I've been told not to use stack allocated arrays because the stack is a precious resource.
Other people have suggested to me that in fact it is perfectly fine to use stack allocated arrays so long as the array is relatively small.
I would like to have a general rule of thumb: when should I use a stack allocated array?
And when should I use a heap allocated array?
While all of your memory is limited, even today with enormous amounts of RAM and virtual memory, the limit on the heap is rather large, especially compared with the stack, which can be anything from a couple of KB on small embedded systems to a couple of megabytes on a PC.
Besides that, there is also the question of how you are using it, and for what. For example, if you want to return an "array" from a function, it should never be on the stack, because a stack array ceases to exist when the function returns.
In general, I would say: try to keep arrays on the stack small if you can. If you are creating an array with thousands of entries on the stack you should stop and think about what you want it for.
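As a small illustration of the "returning an array" case, here is a sketch in which the array outlives the function that creates it because it lives on the heap. The function name make_squares is made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Returns a heap-allocated array holding the first n squares.
   The caller owns the result and must free() it. */
int *make_squares(size_t n)
{
    assert(n > 0);
    int *out = malloc(n * sizeof *out);
    if (out == NULL)
        return NULL;               /* allocation can fail; the caller must check */
    for (size_t i = 0; i < n; i++)
        out[i] = (int)(i * i);
    return out;
}
```

Had `out` been a local array, the returned pointer would refer to storage that no longer exists once the function returns.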
It depends on your platform.
Nowadays, if working on the popular x64 platform, you don't really have to worry about it.
Depending on the Operating System you use, you can check how much stack space and how much heap space a userland process is allowed to use.
For example, UNIX-like systems have soft and hard limits. Some you can crank up, some you can not.
Bottom line is that you don't usually need to worry about such things. And when you need to know, you are usually tied so closely to the platform you'll be developing for that you know all these details.
Hope I answered your question. If you want specific values please specify your exact hardware, operating system and user privileges.
The answer to this question is context dependent. When you write for an operating system kernel, for example, the stack might be quite limited, and allocating more than a thousand bytes in a stack frame could cause a problem.
In modern consumer systems, the space available for the stack is typically quite large. One problem systems used to have was that address space was limited and, once the stack was assigned an address, it could not grow any further than the next object in the address space in the direction of stack growth, regardless of the availability of physical memory or of virtual memory elsewhere in the address space. This is less of a problem with today’s address spaces.
Commonly, megabytes of space can be allocated in a stack frame, and doing so is cheap and easy. However, if many routines that allocate large amounts of space are called, or one or a few routines that allocate large amounts of space are called recursively, then problems can occur because too much space is used, running into some limit (such as address space or physical memory).
Of course, running into a physical memory limit will not be alleviated by allocating space from the heap. So only the issue of consuming the address space available for the stack is relevant to the question of whether to use stack or heap.
A simple test for whether this is a problem is to insert use of a great deal of stack space in your main routine. If you use additional stack space and your application still functions under a load that uses large amounts of stack space normally, then, when you remove this artificial reservation in main, you will have plenty of margin.
A better way would be to calculate the maximum your program could use and compare that to the stack space available from the system. But that is rarely easy with today’s software.
If you are running into stack space limits, your linker or your operating system may have options to make more available.
Global and static variables live for the whole life of a process. Memory for these variables is allocated when the process starts and is freed only when the process exits.
A local (stack) variable, by contrast, is scoped to the function in which it is defined. Its memory is allocated when the function is invoked and freed once control leaves the function.
The main purpose of dynamic memory is to create a variable with a user-defined lifetime. If you want to control how long a variable lives, you can allocate memory for a variable x in one function, pass its address to as many functions as you want, and finally free it.
So with dynamically allocated memory, we can create a variable that lives longer than a local variable but shorter than a global or static variable.
Apart from this, if the size is very large it's better to use dynamic memory, particularly if the architecture has memory constraints.
A good reason to use heap-allocated memory is to pass its ownership to some other function or struct. On the other hand, the stack gives you memory management for free: you cannot forget to deallocate stack memory, while there is a risk of a leak if you use the heap.
If you create an array just for local use, size is the criterion for which one to use; however, it is hard to give an exact size above which memory should be allocated on the heap. One could say that a few hundred bytes is enough to move to the heap; for others the threshold will be lower or higher.
Suppose we make a malloc request for a memory block of size n, where n is not a power of two (n != 2^k for any k > 0).
malloc returns space for the requested memory block, but how is the remaining space in the page handled? I have read that pages are generally blocks of memory whose sizes are powers of two.
Wiki states the following:
Like any method of memory allocation, the heap will become fragmented; that is,
there will be sections of used and unused memory in the allocated
space on the heap. A good allocator will attempt to find an unused area
of already allocated memory to use before resorting to expanding the heap.
So my question is how is this tracked?
EDIT: How is the unused memory tracked when using malloc ?
This really depends on the specific implementation, as Morten Siebuhr pointed out already. In very simple cases, there might be a list of free, fixed-size blocks of memory (possibly all having the same size), so the unused memory is simply wasted. Note that real implementations will never use such simplistic algorithms.
This is an overview over some simple possibilities: http://www.osdcom.info/content/view/31/39/
This Wikipedia entry has several interesting links, including the one above: http://en.wikipedia.org/wiki/Dynamic_memory_allocation#Implementations
As a final remark, googling "malloc implementation" turns up a heap (pun intended) of valuable links.
A standard BSD-style memory allocator basically works like this:
It keeps a linked list of pre-allocated memory blocks for sizes 2^k for k<=12 (for example).
In reality, each list for a given k is composed of memory-blocks from different areas, see below.
A malloc request for n bytes is serviced by calculating n', the smallest 2^k >= n, then looking up the first area in the list for k, and then returning the first free block in the free-list for the given area.
When there is no pre-allocated memory block for size 2^k, an area is allocated, an area being some larger piece of contiguous memory, say a 4kB piece of memory. This piece of memory is then chopped up into pieces that are 2^k bytes. At the beginning of the contiguous memory area there is book-keeping information such as where to find the linked list of free blocks within the area. A bitmap can also be used, but a linked list typically has better cache behavior (you want the next allocated block to return memory that is already in the cache).
The reason for using areas is that free(ptr) can be implemented efficiently. ptr & 0xfffff000 in this example (32-bit pointers, 4kB-aligned areas) points to the beginning of the area which contains the book-keeping structures and makes it possible to link the memory block back into the area.
The BSD allocator will waste space by always returning a memory block 2^k in size, but it can reuse the memory of the block to keep the free-list, which is a nice property. Also allocation is blazingly fast.
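The two calculations described above, picking the bucket for a request and finding the area that owns a pointer, can be sketched as follows. This is not code from any BSD libc; the names are invented, and the areas are assumed to be 4 kB in size and 4 kB-aligned as in the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define AREA_SIZE 4096u   /* assumed: areas are 4 kB and aligned to 4 kB */

/* Smallest k such that 2^k >= n; this selects the free list to use. */
unsigned size_class(size_t n)
{
    assert(n > 0 && n <= AREA_SIZE);
    unsigned k = 0;
    while (((size_t)1 << k) < n)
        k++;
    return k;
}

/* Bytes actually handed out for a request of n bytes. */
size_t block_size(size_t n)
{
    return (size_t)1 << size_class(n);
}

/* Start of the area containing ptr, found by clearing the low 12 bits.
   On a 32-bit system this is the ptr & 0xfffff000 from the text. */
void *area_of(const void *ptr)
{
    return (void *)((uintptr_t)ptr & ~(uintptr_t)(AREA_SIZE - 1));
}
```

A 57-byte request lands in the 64-byte bucket, so 7 bytes are lost to internal fragmentation; a 65-byte request gets 128 bytes.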
Modifications to the above general idea include:
Using anonymous mmap for large allocations. This shifts the work over to the kernel for handling large mallocs and avoids wasting a lot of memory in these cases.
The GNU version of malloc has special cases for non-power-of-two buckets. Nothing inherent in the BSD allocator requires returning 2^k memory blocks, only that there are pre-defined bucket sizes. The GNU allocator has more buckets and thus wastes less space.
Sharing memory between threads is a tricky subject. Lock contention during allocation is an important consideration, so the GNU allocator, for example, will eagerly create extra areas for different threads for a given bucket size if it ever encounters lock contention during allocation.
This varies a lot from implementation to implementation. Some waste the space, some sub-divide pages until they get the requested size (or close to it), etc.
If you are asking out of curiosity, I suggest you read the source code for the implementation in question.
If it's because of performance worries, try to benchmark it and see what happens.
I'm experiencing what appears to be a stack/heap collision in an embedded environment (see this question for some background).
I'd like to try rewriting the code so that it doesn't allocate memory on the heap.
Can I write an application without using the heap in C? For example, how would I use the stack only if I have a need for dynamic memory allocation?
I did it once in an embedded environment where we were writing "super safe" code for biomedical machines.
Malloc()s were explicitly forbidden, partly because of resource limits and partly because of the unexpected behavior you can get from dynamic memory (look for malloc(), VxWorks/Tornado and fragmentation and you'll have a good example).
Anyway, the solution was to plan in advance the needed resources and statically allocate the "dynamic" ones in a vector contained in a separate module, having some kind of special purpose allocator give and take back pointers. This approach avoided fragmentation issues altogether and helped getting finer grained error info, if a resource was exhausted.
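A sketch of what such a special-purpose allocator might look like is below. It is a guess at the general shape, not the code from that project: the Message type, the budget of three slots and the function names are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_MESSAGES 3u   /* hypothetical budget, fixed at design time */

typedef struct {
    unsigned char payload[64];
} Message;

/* All storage is static: its size is known at link time. */
static Message pool[MAX_MESSAGES];
static bool in_use[MAX_MESSAGES];

/* Hand out a free slot, or NULL when the planned budget is exhausted. */
Message *message_acquire(void)
{
    for (size_t i = 0; i < MAX_MESSAGES; i++) {
        if (!in_use[i]) {
            in_use[i] = true;
            return &pool[i];
        }
    }
    return NULL;   /* the caller knows exactly which resource ran out */
}

/* Take a slot back so it can be handed out again. */
void message_release(Message *m)
{
    assert(m >= pool && m < pool + MAX_MESSAGES);
    in_use[m - pool] = false;
}

/* Number of slots currently handed out; useful for sizing the budget. */
size_t messages_in_use(void)
{
    size_t count = 0;
    for (size_t i = 0; i < MAX_MESSAGES; i++)
        if (in_use[i])
            count++;
    return count;
}
```

Since every slot has the same size, a released slot always fits the next request, so the pool cannot fragment.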
This may sound silly on big iron, but on embedded systems, and particularly on safety-critical ones, it's better to have a very good understanding of which resources (time and space) are needed beforehand, if only for the purpose of sizing the hardware.
Funnily enough, I once saw a database application which relied completely on statically allocated memory. This application had a strong restriction on field and record lengths. Even the embedded text editor (I still shiver calling it that) was unable to create texts with more than 250 lines of text. That solved some question I had at the time: why are only 40 records allowed per client?
In serious applications you cannot calculate in advance the memory requirements of your running system. Therefore it is a good idea to allocate memory dynamically as you need it. Nevertheless, it is common in embedded systems to preallocate the memory you really need, to prevent unexpected failures due to memory shortage.
You might allocate dynamic memory on the stack using the alloca() library call. But this memory is tied to the execution context of the calling function, and it is a bad idea to return memory of this type to the caller, because it will be overwritten by later subroutine calls.
So I might answer your question with a crisp and clear "it depends"...
You can use the alloca() function, which allocates memory on the stack - this memory will be freed automatically when you exit the function. alloca() is not part of standard C, but you use GCC, which supports it, so it should be available.
See man alloca.
Another option is to use variable-length arrays, but you need to compile in C99 mode (they became an optional feature in C11).
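A minimal sketch of the variable-length array option, assuming a compiler that supports C99 VLAs. The function name and the 256-byte cap are arbitrary choices for the example; the cap is there because a VLA that is too large overflows the stack with no way to detect the failure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SCRATCH 256u   /* arbitrary cap to keep stack use bounded */

/* Reverse s in place using a stack-allocated scratch buffer (a C99 VLA).
   Returns 0 on success, -1 if the string is too long for the cap. */
int reverse_in_place(char *s)
{
    assert(s != NULL);
    size_t n = strlen(s);
    if (n == 0)
        return 0;              /* a zero-length VLA is not allowed */
    if (n > MAX_SCRATCH)
        return -1;
    char scratch[n];           /* size chosen at run time, lives on the stack */
    for (size_t i = 0; i < n; i++)
        scratch[i] = s[n - 1 - i];
    memcpy(s, scratch, n);
    return 0;                  /* scratch is released automatically here */
}
```

The buffer disappears when the function returns, so a pointer to it must never be handed back to the caller.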
It's possible to allocate a large amount of memory from the stack in main() and have your code sub-allocate it later on. It's a silly thing to do since it means your program is taking up memory that it doesn't actually need.
I can think of no reason (save some kind of silly programming challenge or learning exercise) for wanting to avoid the heap. If you've "heard" that heap allocation is slow and stack allocation is fast, it's simply because the heap involves dynamic allocation. If you were to dynamically allocate memory from a reserved block within the stack, it would be just as slow.
Stack allocation is easy and fast because you may only deallocate the "youngest" item on the stack. It works for local variables. It doesn't work for dynamic data structures.
Edit: Having seen the motivation for the question...
Firstly, the heap and the stack have to compete for the same amount of available space. Generally, they grow towards each other. This means that if you move all your heap usage into the stack somehow, then rather than stack colliding with heap, the stack size will just exceed the amount of RAM you have available.
I think you just need to watch your heap and stack usage (you can grab pointers to local variables to get an idea of where the stack is at the moment) and if it's too high, reduce it. If you have lots of small dynamically-allocated objects, remember that each allocation has some memory overhead, so sub-allocating them from a pool can help cut down on memory requirements. If you use recursion anywhere think about replacing it with an array-based solution.
You can't do dynamic memory allocation in C without using heap memory. It would be pretty hard to write a real world application without using Heap. At least, I can't think of a way to do this.
BTW, Why do you want to avoid heap? What's so wrong with it?
1: Yes you can - if you don't need dynamic memory allocation, but it could have a horrible performance, depending on your app. (i.e. not using the heap won't give you better apps)
2: No I don't think you can allocate memory dynamically on the stack, since that part is managed by the compiler.
Yes, it's doable. Shift your dynamic needs out of memory and onto disk (or whatever mass storage you have available) -- and suffer the consequent performance penalty.
E.g., You need to build and reference a binary tree of unknown size. Specify a record layout describing a node of the tree, where pointers to other nodes are actually record numbers in your tree file. Write routines that let you add to the tree by writing an additional record to file, and walk the tree by reading a record, finding its child as another record number, reading that record, etc.
This technique allocates space dynamically, but it's disk space, not RAM space. All the routines involved can be written using statically allocated space -- on the stack.
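Here is a rough sketch of the record-number approach for a binary search tree kept in a file. The record layout, the convention that record 0 is the root, and all function names are assumptions made for the illustration; error handling is minimal.

```c
#include <assert.h>
#include <stdio.h>

#define NO_NODE (-1L)

/* One fixed-size record per tree node; links are record numbers, not pointers. */
typedef struct {
    long key;
    long left;    /* record number of the left child, or NO_NODE */
    long right;   /* record number of the right child, or NO_NODE */
} Node;

static int read_node(FILE *f, long rec, Node *out)
{
    if (fseek(f, rec * (long)sizeof *out, SEEK_SET) != 0)
        return -1;
    return fread(out, sizeof *out, 1, f) == 1 ? 0 : -1;
}

static int write_node(FILE *f, long rec, const Node *in)
{
    if (fseek(f, rec * (long)sizeof *in, SEEK_SET) != 0)
        return -1;
    return fwrite(in, sizeof *in, 1, f) == 1 ? 0 : -1;
}

/* Append a new leaf record and return its record number, or -1 on error. */
static long append_node(FILE *f, long key)
{
    Node n = { key, NO_NODE, NO_NODE };
    if (fseek(f, 0, SEEK_END) != 0)
        return -1;
    long end = ftell(f);
    if (end < 0)
        return -1;
    return fwrite(&n, sizeof n, 1, f) == 1 ? end / (long)sizeof n : -1;
}

/* Insert key; record 0 is the root. Returns 0 on success, -1 on I/O error. */
int tree_insert(FILE *f, long key)
{
    assert(f != NULL);
    long rec = append_node(f, key);
    if (rec < 0)
        return -1;
    if (rec == 0)
        return 0;                      /* first record becomes the root */
    long cur = 0;
    Node n;                            /* the only node held in RAM */
    for (;;) {
        if (read_node(f, cur, &n) != 0)
            return -1;
        long *link = key < n.key ? &n.left : &n.right;
        if (*link == NO_NODE) {
            *link = rec;               /* hook the new leaf under its parent */
            return write_node(f, cur, &n);
        }
        cur = *link;
    }
}

/* Returns 1 if key is present, 0 if absent, -1 on I/O error. */
int tree_contains(FILE *f, long key)
{
    Node n;
    if (fseek(f, 0, SEEK_END) != 0)
        return -1;
    if (ftell(f) == 0)
        return 0;                      /* empty file, empty tree */
    for (long cur = 0; cur != NO_NODE; cur = key < n.key ? n.left : n.right) {
        if (read_node(f, cur, &n) != 0)
            return -1;
        if (key == n.key)
            return 1;
    }
    return 0;
}
```

However large the tree grows, the routines keep only a single Node on the stack at a time; the cost is one or more file accesses per step.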
Embedded applications need to be careful with memory allocations but I don't think using the stack or your own pre-allocated heap is the answer. If possible, allocate all required memory (usually buffers and large data structures) at initialization time from a heap. This requires a different style of program than most of us are used to now but it's the best way to get close to deterministic behavior.
A large heap that is sub-allocated later would still be subject to running out of memory and the only thing to do then is have a watchdog kick in (or similar action). Using the stack sounds appealing but if you're going to allocate large buffers/data structures on the stack you have to be sure that the stack is large enough to handle all possible code paths that your program could execute. This is not easy and in the end is similar to a sub-allocated heap.
My foremost concern is: does abolishing the heap really help?
Since your wish to avoid the heap stems from a stack/heap collision, and assuming the start of the stack and the start of the heap are set properly (e.g. in the same setting, small sample programs have no such collision problem), the collision means the hardware does not have enough memory for your program.
By not using the heap, one may indeed save some space otherwise wasted by heap fragmentation; but unless your program uses the heap for a lot of irregular, large allocations, that waste is probably small. I see your collision problem more as an out-of-memory problem, something not fixable by merely avoiding the heap.
My advice on tackling this case:
Calculate the total potential memory usage of your program. If it is too close to but not yet exceeding the amount of memory you prepared for the hardware, then you may
Try using less memory (improve the algorithms) or using the memory more efficiently (e.g. smaller and more-regular-sized malloc() to reduce heap fragmentation); or
Simply buy more memory for the hardware
Of course you may try pushing everything into pre-defined static memory space, but it is very probable that the stack will then overwrite static memory instead. So improve the algorithm to use less memory first, and buy more memory second.
I'd attack this problem in a different way: if you think the stack and heap are colliding, then test this by guarding against it.
For example (assuming a *nix system), try mprotect()ing the last stack page (assuming a fixed-size stack) so it is not accessible. Or, if your stack grows, mmap a page between the stack and the heap. If you get a segv on your guard page you know you've run off the end of the stack or heap; and by looking at the address of the seg fault you can see which of the stack and heap collided.
It is often possible to write your embedded application without using dynamic memory allocation. In many embedded applications the use of dynamic allocation is deprecated because of the problems that can arise due to heap fragmentation. Over time it becomes highly likely that there will not be a suitably sized region of free heap space to allow the memory to be allocated, and unless there is a scheme in place to handle this error the application will crash. There are various schemes to get around this, one being to always allocate fixed-size objects on the heap so that a new allocation will always fit into a freed memory area. Another is to detect the allocation failure and to perform a defragmentation process on all of the objects on the heap (left as an exercise for the reader!).
You do not say what processor or toolset you are using but in many the static, heap and stack are allocated to separate defined segments in the linker. If this is the case then it must be that your stack is growing outside the memory space that you have defined for it. The solution that you require is to reduce the heap and/or static variable size (assuming that these two are contiguous) so that there is more available for the stack. It may be possible to reduce the heap unilaterally although this can increase the probability of fragmentation problems. Ensuring that there are no unnecessary static variables will free some space at the cost of possibly increasing the stack usage if the variable is made auto.