I am new to the CS world. While reading some books I learned that dynamic allocation of memory allocates memory while the program runs, and that this memory is called the heap. Does that mean that whenever I create a new node in a linked list, it gets stored on the heap? Or is it stored somewhere else in memory and accessed at runtime?
I also read that whenever a program runs, the OS creates a PCB for it, including at least the following four segments:
Heap Segment
Stack segment
Data segment
Code segment
The heap segment and stack segment grow dynamically depending on the code (upwards or downwards depending on the system).
So my basic question is:
Can we keep adding elements to a linked list until system memory is exhausted, or only until heap memory is exhausted?
I read that it is until system memory is exhausted, but I wondered how.
Best explanation I've read yet:
Because the heap grows up and the stack grows down, they basically limit each other. Also, because both types of segments are writable, it wasn't always a violation for one of them to cross the boundary, so you could have a buffer or stack overflow. Now there are mechanisms to stop that from happening.
There is a set limit for the heap (and stack) for each process to start with. This limit can be changed at runtime (using brk()/sbrk()). Basically, when the process needs more heap space and has run out of allocated space, the standard library will issue the call to the OS. The OS will allocate a page, which will usually be managed by the user-space library for the program to use. I.e. if the program wants 1 KiB, the OS will give an additional 4 KiB, and the library will give 1 KiB to the program and keep 3 KiB for use the next time the program asks for more.
Excerpt from here.
If you dynamically allocate memory you get it from the heap.
From the statement above you can directly conclude that if you allocate a list's nodes dynamically, the maximum number of nodes is limited by the heap's size, that is, by the amount of heap memory.
Related
Here's my question: does calling free or delete ever release memory back to the "system"? By "system" I mean: does it ever reduce the data segment of the process?
Let's consider the memory allocator on Linux, i.e. ptmalloc.
From what I know (please correct me if I am wrong), ptmalloc maintains a free list of memory blocks, and when a request for memory allocation comes in, it tries to allocate a block from this free list (I know the allocator is much more complex than that, but I am putting it in simple words). If it fails, however, it gets the memory from the system using the sbrk or brk system calls. When memory is freed, the block is placed back on the free list.
Now consider this scenario: at peak load, a lot of objects have been allocated on the heap. When the load decreases, the objects are freed. So my question is: once an object is freed, will the allocator do some calculation to decide whether to just keep the block on the free list, or, depending on the current size of the free list, give that memory back to the system, i.e. decrease the data segment of the process using sbrk or brk?
The glibc documentation tells me that if the allocation request is much larger than the page size, it will be allocated using mmap and will be released directly back to the system once freed. Cool. But let's say I never ask for an allocation larger than, say, 50 bytes, and I allocate a lot of such 50-byte objects at peak load. Then what?
From what I know (correct me please), memory allocated with malloc will never be released back to the system until the process ends, i.e. the allocator will simply keep it on the free list when I free it. But the question troubling me is this: if I use a tool to see the memory usage of my process (I am using pmap on Linux; what do you use?), it should always show the memory used at peak load (since memory is never given back to the system, except when allocated using mmap)? That is, the memory used by the process should never decrease (except for stack memory)? Is that so?
I know I am missing something, so please shed some light on all this.
Experts, please clear my concepts regarding this. I will be grateful. I hope I was able to explain my question.
There isn't much overhead in malloc, so you are unlikely to achieve any run-time savings. There is, however, a good reason to implement an allocator on top of malloc: being able to trace memory leaks. For example, you can free all memory allocated by the program when it exits, and then check that your allocator's calls balance (i.e. the same number of calls to allocate and deallocate).
For your specific implementation, there is no reason to call free(), since malloc won't release memory to the system anyway; it would only release memory back to your own allocator.
Another reason for using a custom allocator is that you may be allocating many objects of the same size (i.e. you have some data structure that you allocate a lot of). You may want to maintain a separate free list for this type of object and allocate/free only from this special list. The advantage is that this avoids memory fragmentation.
No.
It's actually a bad strategy for a number of reasons, so it doesn't happen, except (as you note) for large allocations that can be made directly in whole pages.
It increases internal fragmentation and can therefore actually waste memory. (You can only return aligned pages to the OS, so pulling aligned pages out of a block will usually create two blocks that are guaranteed to be small (smaller than a page, anyway) on either side of the block. If this happens a lot, you end up with the same total amount of usefully allocated memory plus lots of useless small blocks.)
A kernel call is required, and kernel calls are slow, so it would slow down the program. It's much faster to just throw the block back into the heap.
Almost every program will either converge on a steady-state memory footprint or have an increasing footprint until exit (or until near exit). Therefore, all the extra processing needed by a page-return mechanism would be completely wasted.
It is entirely implementation dependent. On Windows, VC++ programs can return memory back to the system if the corresponding memory pages contain only freed blocks.
I think you have all the information you need to answer your own question. pmap shows the memory that is currently being used by the process. So if you call pmap before the process reaches peak memory, it will not show peak memory. If you call pmap just before the process exits, it will show peak memory for a process that does not use mmap. If the process does use mmap and you call pmap at the point where maximum memory is in use, it will show peak usage, but that point may not be at the end of the process (it could occur anywhere).
This applies only to your current system (i.e. based on the documentation you have provided for free, mmap, and malloc), but as the previous poster has stated, the behavior of these is implementation dependent.
This varies a bit from implementation to implementation.
Think of your memory as one massive long block. When you allocate from it, you take a bit out of it (labeled '1' below):
111
If I allocate more memory with malloc, it gets some from the system:
1112222
If I now free '1':
___2222
It won't be returned to the system, because '2' is in front of it (and memory is given out as one contiguous block). However, if the end of the memory is freed, that memory can be returned to the system. If I freed '2' instead of '1', I would get:
111
The bit where '2' was would be returned to the system.
The main benefit of freeing memory is that the freed bit can then be reallocated, instead of getting more memory from the system. E.g.:
33_2222
I believe that the memory allocator in glibc can return memory back to the system, but whether it will or not depends on your memory allocation patterns.
Let's say you do something like this:
void *pointers[10000];

for (int i = 0; i < 10000; i++)
    pointers[i] = malloc(1024);

for (int i = 0; i < 9999; i++)
    free(pointers[i]);
The only part of the heap that can be safely returned to the system is the "wilderness chunk" at the end of the heap. It can be returned to the system using another sbrk call, and the glibc memory allocator will do that when the size of this last chunk exceeds some threshold.
The above program makes 10,000 small allocations but frees only the first 9,999 of them. The last one should (assuming nothing else has called malloc, which is unlikely) be sitting right at the end of the heap. This prevents the allocator from returning any memory to the system at all.
If you were to free the remaining allocation, glibc's malloc implementation should be able to return most of the pages allocated back to the system.
If you're allocating and freeing small chunks of memory, a few of which are long-lived, you could end up in a situation where you have a large chunk of memory allocated from the system, but you're only using a tiny fraction of it.
Here are some "advantages" to never releasing memory back to the system:
Having already used a lot of memory makes it very likely you will do so again, and:
when you release memory, the OS has to do quite a bit of paperwork
when you need it again, your memory allocator has to re-initialise all its data structures in the region it just received
Freed memory that isn't needed gets paged out to disk, where it doesn't actually make much difference
Often, even if you free 90% of your memory, fragmentation means that very few pages can actually be released, so the effort of looking for empty pages isn't well spent
Many memory managers can perform TRIM operations where they return entirely unused blocks of memory to the OS. However, as several posts here have mentioned, it's entirely implementation dependent.
But let's say I never ask for an allocation larger than, say, 50 bytes, and I allocate a lot of such 50-byte objects at peak load. Then what?
This depends on your allocation pattern. Do you free ALL of the small allocations? If so, and if the memory manager has special handling for small-block allocations, then this may be possible. However, if you allocate many small items and then free all but a few scattered ones, you may fragment memory and make it impossible to TRIM blocks, since each block will have only a few straggling allocations. In this case, you may want to use a different allocation scheme for the temporary allocations and the persistent ones, so you can return the temporary allocations back to the OS.
I am currently taking an operating systems course, and we are discussing the memory portion of a process. When a program is loaded into memory and thus becomes a process, two forms of memory can be used: either a stack data structure or a heap (I am not quite sure whether it's called a heap because it actually uses the heap data structure). I have already looked at this link, Do the Stack and Heap both exist within your system's RAM?, which helped to a certain extent but did not actually answer my question.
Now my question is:
The stack is the memory where local variables are stored. When a function is called, a frame is pushed onto the stack along with all its variables, and when it returns, it is popped off the stack. It is a kind of temporary memory. With the heap, there is more control over the memory, since once allocated, the memory stays allocated until explicitly freed. However, are we talking about RAM when we refer to the stack and heap? Do the stack and heap of a process A reside in the address space of process A in RAM?
I know that due to paging this is not always the case, because the OS does memory management and can move some memory to disk while the process is not being used. Let us assume no paging is used, just to isolate the two ideas and understand more clearly what happens in memory for a particular process.
Yes, they are both in RAM. Perhaps this link from ARM about the microprocessor memory model will help with your understanding.
Why, after executing the following C code, does Xcode show 20 KB more memory usage than before?
void *p = malloc(sizeof(int)*1000);
free(p);
Do I have to free the memory another way? Or is it just an Xcode mistake?
When you say "Xcode shows 20KB more than it was", I presume you mean that the little bar graph goes up by 20kB.
When you malloc an object, the C library first checks the process's address space to see if there is enough free space to satisfy the request. If there isn't, it asks the operating system to allocate more virtual memory to the process. The graph in Xcode measures the amount of virtual memory the process has.
When you free an object, the memory is never returned to the operating system; rather, it is "just" placed on the list of free blocks for malloc to reuse. I put "just" in scare quotes because the actual algorithm can be quite complex, in order to minimise fragmentation of the heap and the time taken to malloc and free blocks. The reason memory is never returned to the operating system is that system calls to the OS to get and free memory are very expensive.
Thus, you will never see the memory usage of the process go down. If you malloc a gigabyte of memory and then free it, the process will still appear to be using a gigabyte of virtual memory.
If you want to see whether your program really leaks, you need to use the Leaks profiling tool. This intercepts malloc and free calls, so it knows which blocks are still nominally in use and which have been freed.
I am not clear on how memory is managed while a process is executing.
Here is a diagram
I am not clear with the following in the image:
1) What is the stack which this image is referring to?
2) What is the memory mapping segment, which refers to file mappings?
3) What does the heap have to do with a process? Is the heap handled only within a process, or is it something maintained by the operating system kernel, with memory allocated by malloc (using the heap) whenever a user-space application invokes it?
The article mentions
http://duartes.org/gustavo/blog/post/anatomy-of-a-program-in-memory/
virtual address space, which in 32-bit mode is always a 4GB block of
memory addresses. These virtual addresses are mapped to physical
memory by page tables,
4) Does this mean that only one program runs in memory at a time, occupying the entire 4 GB of RAM?
The same article also mentions
Linux randomizes the stack, memory mapping segment, and heap by adding
offsets to their starting addresses. Unfortunately the 32-bit address
space is pretty tight, leaving little room for randomization and
hampering its effectiveness.
5) Is it referring to randomizing the stack within a process, or to whatever space is left over after counting the space of all the processes?
1) What is the stack which this image is referring to?
The stack is for allocating local variables and function call frames (which include things like function parameters and the address to return to after the function finishes).
2) What is the memory mapping segment, which refers to file mappings?
The memory mapping segment holds linked libraries. It is also where mmap allocations go. In general, a memory-mapped file is simply a region of memory backed by a file.
3) What does the heap have to do with a process? Is the heap handled only within a process, or is it something maintained by the operating system kernel, with memory allocated by malloc (using the heap) whenever a user-space application invokes it?
The heap is process-specific and is managed by the process itself, but it must request memory from the OS to begin with (and as needed). You are correct: this is typically where malloc allocations come from. However, most malloc implementations also use mmap to request large chunks of memory, so there is really less of a distinction between the heap and the memory mapping segment. Really, the heap could be considered part of the memory-mapped segment.
4) Does this mean that only one program runs in memory at a time, occupying the entire 4 GB of RAM?
No. It means that the amount of addressable memory available to the program is limited to 4 GB. What is actually contained in physical memory at any given time depends on how the OS allocates physical memory, and is beyond the scope of this question.
5) Is it referring to randomizing the stack within a process, or to whatever space is left over after counting the space of all the processes?
I've never seen anything suggesting that 4 GB of space "hampers" the effectiveness of the memory allocation strategies used by the OS. Additionally, as @Jason notes, the locations of the stack, memory-mapped segment, and heap are randomized "to prevent predictable security exploits, or at least make them a lot harder than if every process the OS managed had each portion of the executable in the exact same virtual memory location." To be specific, the OS randomizes the virtual addresses of the stack, memory-mapped region, and heap. On that note, everything the process sees is a virtual address, which is then mapped to a physical address in memory, depending on where the specific page is located. More information about the mapping between virtual and physical addresses can be found here.
This Wikipedia article on paging is a good starting point for learning how the OS manages memory between processes, and is a good resource for answering questions 4 and 5. In short, memory is allocated to processes in pages, and these pages either exist in main memory or have been "paged out" to disk. When a process requests a memory address whose page is not present, the OS will move that page from disk to main memory, replacing another page if needed. There are various page replacement strategies in use, and I refer you to the article to learn more about the advantages and disadvantages of each.
Part 1. The Stack ...
A function can call a function, which might call another function. The variables allocated in each call end up on the stack, and are de-allocated as each function exits; hence "stack". You might consider Wikipedia for this stuff: http://en.wikipedia.org/wiki/Stack_%28abstract_data_type%29
Linux randomizes the stack, memory mapping segment, and heap by adding offsets to their starting addresses. Unfortunately the 32-bit address space is pretty tight, leaving little room for randomization and hampering its effectiveness.
I believe this is more of a generalization the article makes when comparing the ability to randomize in 32 vs. 64 bits. 3 GB of addressable memory in 32 bits is still quite a bit of space to "move around" in ... it's just not as much room as a 64-bit OS affords, and there are certain applications, such as image editors, that are very memory intensive and can easily use up the entire 3 GB of addressable memory available to them. Keep in mind I'm saying "addressable" memory ... this depends on the platform, not on the amount of physical memory in the system.
I created a linked list with 1,000,000 items, which took 16 MB of memory. Then I removed and freed half of them. I thought the memory usage would decrease, but it didn't.
Why was that?
I'm checking the memory usage via Activity Monitor on Mac OS X 10.8.2.
In case you want to check my code, here it is.
Generally speaking, free doesn't release memory back to the OS. It's still allocated to the process, so the OS reports it as allocated. From the point of view of your program, it's available to satisfy any new allocations you make.
Be aware that since you freed every other node, your memory is almost certainly very fragmented now. The free memory is in small chunks with allocated memory in between, so it can only be used to satisfy small allocations. If you make a larger allocation, the process will go to the OS for more memory.
Since the process gets memory from the OS one page at a time, it can't release such fragmented memory back to the OS even if it wanted to: you're using part of every page.