how to free memory assigned using malloc? - c

struct element {
    unsigned long int ip;
    int type;
    int rtt;
    struct element *next;
    struct element *edge;
};
I have a linked list. I create new nodes using malloc.
I tried to free up memory using free (ptr to node)
but when I run the traverse function again, I can still traverse the linked list, and the rtt value is correct, as are the next and edge pointers, since I can follow the linked list. Only the ip value is corrupted. Why is this?

The behaviour of malloc() and free() depends heavily on the operating system and C library that you are using. In most implementations there are actually two memory allocators in play:
The OS memory allocator, which uses the virtual memory facilities of the processor to provide a process with its own address space and maps physical memory pages into that address space for use.
The C library memory allocator, which is in fact part of the application code and uses the pages provided by the OS to provide fine-grained memory management facilities, as provided by malloc() and free().
In general, calling free() does one or more of the following:
It marks the pointed-to memory area as free in the C memory allocator. This allows that memory to be reused. free() does not zero-out freed memory.
It may return memory to the OS, depending on the settings of the C memory allocator and whether it is actually possible to free that part of the heap. If memory is not returned to the OS, it can be reused by future malloc() calls by the same application.
If you try to access memory that has been freed, usually one of three things will happen:
The memory has been returned to the OS, and your program, typically, crashes. If you ask me, that is probably the best scenario - you have a problem, sure, but you know it.
The memory has not been reused, therefore your old data is still there. Your program goes on as if nothing was wrong. This is in my opinion the worst case scenario. Your code appears to work correctly and, if Murphy has a say in this, it will continue to do so until it reaches your end users - then it will fail spectacularly.
The memory has been reused by your program, and your code will start messing around with its own data. If you are careful (and lucky?), you will probably notice that the results are off. If not, well...
If you are on Linux/Unix, Valgrind is a good tool to catch memory management problems like this. There are also replacement libraries for the C memory allocator, such as DUMA, that will also allow you to detect such issues.
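Applied to the linked list in the question, the safe pattern is to read each node's next pointer before the node is freed, and to clear the head pointer afterwards so nothing can walk into released memory. The following is only a sketch: the struct is copied from the question, free_all_elements is a made-up name, and it assumes that nodes reached through edge are owned by some other list and must not be freed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct element {
    unsigned long int ip;
    int type;
    int rtt;
    struct element *next;
    struct element *edge;
};

/* Frees every node reachable through `next` and returns how many were freed.
   Assumption: nodes reachable only through `edge` are owned elsewhere. */
size_t free_all_elements(struct element **head)
{
    assert(head != NULL);

    size_t freed = 0;
    struct element *cur = *head;

    while (cur != NULL) {
        struct element *next = cur->next; /* read the link before releasing the node */
        free(cur);
        cur = next;
        freed++;
    }

    *head = NULL; /* the caller's handle no longer points into freed memory */
    return freed;
}
```

After the call, a traverse function that starts from the head sees an empty list instead of stale nodes.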

Memory is not wiped when you free it - that would be a waste of processor time. The block is just added to the allocator's "free list". That is why your data is still there.
Whenever you free a block, you should set the corresponding pointer to NULL, so you don't accidentally reference it - it could be reused at any time.
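One way to make that a habit is a small macro that frees and clears in one step. This is a sketch; FREE_AND_NULL and release_twice_demo are names invented here, not part of any standard library. Note that it only clears the one pointer you pass in: any other copies of the address still dangle.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: releases the block and clears the caller's pointer. */
#define FREE_AND_NULL(p) \
    do {                 \
        free(p);         \
        (p) = NULL;      \
    } while (0)

/* Returns 1 if the pointer is NULL after being released twice in a row. */
int release_twice_demo(void)
{
    int *value = malloc(sizeof *value);
    if (value == NULL)
        return 0;
    *value = 42;

    FREE_AND_NULL(value);
    assert(value == NULL);
    FREE_AND_NULL(value); /* harmless: free(NULL) is defined to do nothing */

    return value == NULL;
}
```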

Actually free doesn't delete anything, it just tells the allocator (and, in some cases, the OS) that it can use that memory again; for example, the next time you call malloc() it could overwrite some of your nodes.

Freeing the memory releases it for reuse. It doesn't necessarily destroy the data that was at that location. It is up to you not to access a memory region that has been released because the behavior is undefined (i.e. usually very bad).
If you want the data destroyed for some strange reason, then overwrite the memory area prior to freeing it (e.g. memset(buf, 0, len)).
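One caveat with a plain memset right before free: a compiler may treat those stores as dead and remove them, since the buffer is never read again. A common workaround is to write through a volatile pointer, sketched below. wipe and wipe_and_free are made-up names, and the volatile trick is a widely used convention rather than a guarantee from the C standard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Overwrites len bytes with zeros through a volatile pointer,
   so the compiler is discouraged from dropping the stores. */
void wipe(void *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    volatile unsigned char *p = buf;
    while (len--)
        *p++ = 0;
}

/* Clears the block, then hands it back to the allocator. */
void wipe_and_free(void *buf, size_t len)
{
    if (buf == NULL)
        return;
    wipe(buf, len);
    free(buf);
}
```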

Related

Where does malloc() allocate memory? Is it the data section or the heap section of the virtual address space of the process?

Ever since I was introduced to C, I was told that in C dynamic memory allocation is done using the functions in the malloc family. I also learned that memory dynamically allocated using malloc is allocated on the heap section of the process.
Various OS textbooks say that malloc involves system call (though not always but at times) to allocate structures on heap to the process. Now supposing that malloc returns pointer to chunk of bytes allocated on the heap, why should it need a system call. The activation records of a function are placed in the stack section of the process and since the "stack section" is already a part of the virtual address space of the process, pushing and popping of activation records, manipulation of stack pointers, just start from the highest possible address of the virtual address space. It does not even require a system call.
Now on the same grounds since the "heap section" is also a part of the virtual address space of the process, why should a system call be necessary for allocating a chunk of bytes in this section. The routine like malloc could self handle the "free" list and "allocated" list on its own. All it needs to know is the end of the "data section". Certain texts say that system calls are necessary to "attach memory to the process for dynamic memory allocation", but if malloc allocates memory on "heap section" why is it at all required to attach memory to the process during malloc? Could be simply taken from portion already part of the process.
While going through the text "The C Programming Language" [2e] by Kernighan and Ritchie, I came across their implementation of the malloc function [section 8.7 pages 185-189]. The authors say :
malloc calls upon the operating system to obtain more memory as necessary.
Which is what the OS texts say, but counter intuitive to my thought above (if malloc allocates space on heap).
Since asking the system for memory is a comparatively expensive operation, the authors do not do that on every call to malloc, so they create a function morecore which requests at least NALLOC units; this larger block is chopped up as needed. And the basic free list management is done by free.
But the thing is that the authors use sbrk() to ask the operating system for memory in morecore. Now Wikipedia says:
brk and sbrk are basic memory management system calls used in Unix and Unix-like operating systems to control the amount of memory allocated to the data segment of the process.
Where
a data segment (often denoted .data) is a portion of an object file or the corresponding address space of a program that contains initialized static variables, that is, global variables and static local variables.
Which I guess is not the "heap section". [Data section is the second section from bottom in the picture above, while heap is the third section from bottom.]
I am totally confused. I want to know what really happens and how both the concepts are correct? Please help me understand the concept by joining the scattered pieces together...
In your diagram, the section labeled "data" is more precisely called "static data"; the compiler pre-allocates this memory for all the global variables when the process starts.
The heap that malloc() uses is the rest of the process's data segment. This initially has very little memory assigned to it in the process. If malloc() needs more memory, it can use sbrk() to extend the size of the data segment, or it can use mmap() to create additional memory segments elsewhere in the address space.
Why does malloc() need to do this? Why not simply make the entire address space available for it to use? There are historical and practical reasons for this.
The historical reason is that early computers didn't have virtual memory. All the memory assigned to a process was swapped in bulk to disk when switching between processes. So it was important to only assign memory pages that were actually needed.
The practical reason is that this is useful for detecting various kinds of errors. If you've ever gotten a segmentation violation error because you dereferenced an uninitialized pointer, you've benefited from this. Much of the process's virtual address space is not allocated to the process, which makes it likely that uninitialized pointers point to unavailable memory, and you get an error trying to use it.
There's also an unallocated gap between the heap (growing upwards) and the stack (growing downward). This is used to detect stack overflow -- when the stack tries to use memory in that gap, it gets a fault that's translated to the stack overflow signal.
This is the Standard C Library specification for malloc(), in its entirety:
7.22.3.4 The malloc function
Synopsis
#include <stdlib.h>
void *malloc(size_t size);
Description
The malloc function allocates space for an object whose size is
specified by size and whose value is indeterminate.
Returns
The malloc function returns either a null pointer or a pointer to the
allocated space.
That's it. There's no mention of the Heap, the Stack or any other memory location, which means that the underlying mechanisms for obtaining the requested memory are implementation details.
In other words, you don't care where the memory comes from, from a C perspective. A conforming implementation is free to implement malloc() in any way it sees fit, so long as it conforms to the above specification.
I was told that in C dynamic memory allocation is done using the functions in the malloc family. I also learned that memory dynamically allocated using malloc is allocated on the heap section of the process.
Correct on both points.
Now supposing that malloc returns pointer to chunk of bytes allocated on the heap, why should it need a system call.
It needs to request an adjustment to the size of the heap, to make it bigger.
...the "stack section" is already a part of the virtual address space of the process, pushing and popping of activation records, manipulation of stack pointers, [...] does not even require a system call.
The stack segment is grown implicitly, yes, but that's a special feature of the stack segment. There's typically no such implicit growing of the data segment. (Note, too, that the implicit growing of the stack segment isn't perfect, as witness the number of people who post questions to SO asking why their programs crash when they allocate huge arrays as local variables.)
Now on the same grounds since the "heap section" is also a part of the virtual address space of the process, why should a system call be necessary for allocating a chunk of bytes in this section.
Answer 1: because it's always been that way.
Answer 2: because you want accidental stray pointer references to crash, not to implicitly allocate memory.
malloc calls upon the operating system to obtain more memory as necessary.
Which is what the OS texts say, but counter intuitive to my thought above (if malloc allocates space on heap).
Again, malloc does request space on the heap, but it must use an explicit system call to do so.
But the thing is that the authors use sbrk() to ask the operating system for memory in morecore. Now Wikipedia says:
brk and sbrk are basic memory management system calls used in Unix and Unix-like operating systems to control the amount of memory allocated to the data segment of the process.
Different people use different nomenclatures for the different segments. There's not much of a distinction between the "data" and "heap" segments. You can think of the heap as a separate segment, or you can think of those system calls -- the ones that "allocate space on the heap" -- as simply making the data segment bigger. That's the nomenclature the Wikipedia article is using.
Some updates:
I said that "There's not much of a distinction between the 'data' and 'heap' segments." I suggested that you could think of them as subparts of a single, more generic data segment. And actually there are three subparts: initialized data, uninitialized data or "bss", and the heap. Initialized data has initial values that are explicitly copied out of the program file. Uninitialized data starts out as all bits zero, and so does not need to be stored in the program file; all the program file says is how many bytes of uninitialized data it needs. And then there's the heap, which can be thought of as a dynamic extension of the data segment, which starts out with a size of 0 but may be dynamically adjusted at runtime via calls to brk and sbrk.
I said, "you want accidental stray pointer references to crash, not to implicitly allocate memory", and you asked about this. This was in response to your supposition that explicit calls to brk or sbrk ought not to be required to adjust the size of the heap, and your suggestion that the heap could grow automatically, implicitly, just like the stack does. But how would that work, really?
The way automatic stack allocation works is that as the stack pointer grows (typically "downward"), it eventually reaches a point that it points to unallocated memory -- that blue section in the middle of the picture you posted. At that point, your program literally gets the equivalent of a "segmentation violation". But the operating system notices that the violation involves an address just below the existing stack, so instead of killing your program on an actual segmentation violation, it quick-quick makes the stack segment a little bigger, and lets your program proceed as if nothing had happened.
So I think your question was, why not have the upward-growing heap segment work the same way? And I suppose an operating system could be written that worked that way, but most people would say it was a bad idea.
I said that in the stack-growing case, the operating system notices that the violation involves an address "just below" the existing stack, and decides to grow the stack at that point. There's a definition of "just below", and I'm not sure what it is, but these days I think it's typically a few tens or hundreds of kilobytes. You can find out by writing a program that allocates a local variable
char big_stack_array[100000];
and seeing if your program crashes.
Now, sometimes a stray pointer reference -- that would otherwise cause a segmentation violation style crash -- is just the result of the stack normally growing. But sometimes it's a result of a program doing something stupid, like the common error of writing
char *retbuf;
printf("type something:\n");
fgets(retbuf, 100, stdin);
And the conventional wisdom is that you do not want to (that is, the operating system does not want to) coddle a broken program like this by automatically allocating memory for it (at whatever random spot in the address space the uninitialized retbuf pointer seems to point) to make it seem to work.
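For contrast, here is a sketch of the same read with the buffer actually allocated before fgets writes into it. read_line and RETBUF_SIZE are names made up for this example; the size of 100 is taken from the snippet above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define RETBUF_SIZE 100

/* Reads one line from `in` into a freshly allocated buffer.
   Returns NULL on allocation failure or end of input; the caller frees the result. */
char *read_line(FILE *in)
{
    assert(in != NULL);

    char *retbuf = malloc(RETBUF_SIZE); /* retbuf now points at memory we own */
    if (retbuf == NULL)
        return NULL;

    if (fgets(retbuf, RETBUF_SIZE, in) == NULL) {
        free(retbuf);
        return NULL;
    }

    retbuf[strcspn(retbuf, "\n")] = '\0'; /* drop the trailing newline, if any */
    return retbuf;
}
```

A caller would print the prompt, call read_line(stdin), use the string, and then free() it.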
If the heap were set up to grow automatically, the OS would presumably define an analogous threshold of "close enough" to the existing heap segment. Apparently stray pointer references within that region would cause the heap to automatically grow, while references beyond that (farther into the blue region) would crash as before. That threshold would probably have to be bigger than the threshold governing automatic stack growth. malloc would have to be written to make sure not to try to grow the heap by more than that amount. And true, stray pointer references -- that is, program bugs -- that happened to reference unallocated memory in that zone would not be caught. (Which is, it's true, what can happen for buggy, stray pointer references just off the end of the stack today.)
But, really, it's not hard for malloc to keep track of things, and explicitly call sbrk when it needs to. The cost of requiring explicit allocation is small, and the cost of allowing automatic allocation -- that is, the cost of the stray pointer bugs not caught -- would be larger. This is a different set of tradeoffs than for the stack growth case, where an explicit test to see if the stack needed growing -- a test which would have to occur on every function call -- would be significantly expensive.
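To make the bookkeeping concrete, here is a toy sketch of an allocator that tracks how much memory it has and explicitly asks for more only when it runs short. Everything in it is invented for illustration: fake_sbrk stands in for the real sbrk (it just advances a counter over a static array), tiny_alloc never frees, and the sizes are arbitrary. It assumes a C11 compiler for _Alignas and max_align_t.

```c
#include <assert.h>
#include <stddef.h>

#define ARENA_LIMIT 4096 /* total bytes the pretend OS will ever hand out */
#define GROW_STEP   1024 /* bytes requested per pretend system call */
#define ARENA_ALIGN _Alignof(max_align_t)

static _Alignas(max_align_t) unsigned char arena[ARENA_LIMIT];
static size_t arena_break = 0; /* bytes obtained from the pretend OS so far */
static size_t arena_used = 0;  /* bytes already handed out to callers */
static int grow_calls = 0;     /* how many times we had to ask for more */

/* Hypothetical stand-in for sbrk(): extends the break or fails. */
static int fake_sbrk(size_t increment)
{
    if (increment > ARENA_LIMIT - arena_break)
        return -1;
    arena_break += increment;
    grow_calls++;
    return 0;
}

int tiny_grow_calls(void)
{
    return grow_calls;
}

/* Bump allocator: carves pieces off the arena and grows it explicitly on demand. */
void *tiny_alloc(size_t size)
{
    if (size == 0 || size > ARENA_LIMIT)
        return NULL;

    size_t need = (size + ARENA_ALIGN - 1) / ARENA_ALIGN * ARENA_ALIGN;

    assert(arena_used <= arena_break);
    while (arena_break - arena_used < need) {
        if (fake_sbrk(GROW_STEP) != 0)
            return NULL; /* the pretend OS refused: report failure */
    }

    void *p = &arena[arena_used];
    arena_used += need;
    return p;
}
```

Most calls are served from memory the allocator already holds; only the occasional call pays for a trip to the (pretend) operating system.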
Finally, one more complication. The picture of the virtual memory layout that you posted -- with its nice little stack, heap, data, and text segments -- is a simple and perhaps outdated one. These days I believe things can be a lot more complicated. As @chux wrote in a comment, "your malloc() understanding is only one of many ways allocation is handled. A clear understanding of one model may hinder (or help) understanding of the many possibilities." Among those complicating possibilities are:
A program may have multiple stack segments maintaining multiple stacks, if it supports coroutines or multithreading.
The mmap and shm_open system calls may cause additional memory segments to be allocated, scattered anywhere within that blue region between the heap and the stack.
For large allocations, malloc may use mmap rather than sbrk to get memory from the OS, since it turns out this can be advantageous.
See also Why does malloc() call mmap() and brk() interchangeably?
As the bard said, "There are more things in heaven and earth, Horatio, than are dreamt of in your philosophy." :-)
Not all virtual addresses are available at the beginning of a process.
The OS does maintain a virtual-to-physical map, but (at any given time) only some of the virtual addresses are in the map. Reading or writing a virtual address that isn't in the map causes an instruction-level exception. sbrk puts more addresses in the map.
The stack is just like the data section but has a fixed size, and there is no sbrk-like system call to extend it. We can say there is no heap section, but only a fixed-size stack section and a data section which can be grown upward by sbrk.
The heap section you mention is actually a part of the data section that is managed by malloc and free. It's clear that the code relating to heap management is not in the OS kernel but in the C library, executing in CPU user mode.

Understanding C Memory Allocation and Deallocation

I have been recently trying to learn how to program in the C programming language.
I am currently having trouble understanding how memory is deallocated by free() in C.
What does it mean to free or release the memory?
For instance, if I have the following pointer:
int *p = malloc(sizeof(int));
When I deallocate it using free(p), what does it do? Does it somehow flag it as "deallocated", so the application may use it for new allocations?
Does it deallocate only the pointer address, or is the address being pointed to deallocated too?
I would do some experiments myself to better understand this, but I am so newbie in the subject that I don't know even how to debug a C program yet (I'm not using any IDE).
Also, what if int *p is actually a pointer to an array of int?
If I call free(p), does it deallocate the whole array or only the element it is pointing to?
I'm so eager to finally understand this, I would very much appreciate any help!
What does it mean to free or release the memory?
It means that you're done with the memory and are ready to give it back to the memory allocator.
When I deallocate it using free(p), what does it do?
The specifics are implementation dependent, but for a typical allocator it puts the block back on the free list. The allocator maintains a list of blocks that are available for use. When you ask for a chunk of memory (by calling malloc() or similar) the allocator finds an appropriate block in the list of free blocks, removes it (so it's no longer available), and gives you a pointer to the block. When you call free(), the process is reversed -- the block is put back on the free list and thereby becomes available to be allocated again.
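A toy version of that free list, for fixed-size blocks only, might look like the sketch below. All names (toy_alloc, toy_free, toy_pool, TOY_BLOCK_SIZE) are invented here, and real allocators are far more elaborate. One detail worth noticing: while a block sits on the free list, its first few bytes are reused to store the link to the next free block, and the rest of the old contents are left alone. That is one plausible reason a field at the start of a freed struct can look corrupted while later fields still read back as before.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_BLOCK_SIZE  32
#define TOY_BLOCK_COUNT 4

/* A block holds user data while allocated, or a link while it is free. */
union toy_block {
    union toy_block *next;
    unsigned char payload[TOY_BLOCK_SIZE];
};

static union toy_block toy_pool[TOY_BLOCK_COUNT];
static union toy_block *toy_free_list = NULL;
static int toy_ready = 0;

/* Puts every block of the pool on the free list. */
static void toy_init(void)
{
    for (size_t i = 0; i < TOY_BLOCK_COUNT; i++) {
        toy_pool[i].next = toy_free_list;
        toy_free_list = &toy_pool[i];
    }
    toy_ready = 1;
}

/* Removes a block from the free list and hands it out; NULL if none are left. */
void *toy_alloc(void)
{
    if (!toy_ready)
        toy_init();
    if (toy_free_list == NULL)
        return NULL;

    union toy_block *b = toy_free_list;
    toy_free_list = b->next;
    return b;
}

/* Puts a block back on the free list; it is the first candidate for reuse. */
void toy_free(void *p)
{
    if (p == NULL)
        return;
    assert(toy_ready);

    union toy_block *b = p;
    b->next = toy_free_list; /* overwrites only the first bytes of the old payload */
    toy_free_list = b;
}
```

Freeing a block and immediately allocating again returns the same address here, which is why stale pointers so often appear to keep working.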
Importantly, once you call free() on a pointer, you must not dereference that pointer again. A common source of memory-related errors is using a pointer after it has been freed. For that reason, some consider it a helpful practice to set a pointer to NULL immediately after freeing it. Similarly, you should avoid calling free() on a pointer that you didn't originally get from the allocator (e.g. don't free a pointer to a local variable), and it's never a good idea to call free() twice on the same pointer.
Does it deallocate only the pointer address, or is the address being pointed to deallocated too?
When you request a block of memory from the allocator, you specify the size of the block you want. The allocator keeps track of the size of the block so that when you free the block, it knows both the starting address and the block size. When you call free(p), the block that p points to is deallocated; nothing happens to the pointer p itself.
Also, what if int *p is actually a pointer to an array of int?
An array in C is a contiguous block of memory, so a pointer to the first element of the array is also a pointer to the entire block. Freeing that block will properly deallocate the entire array.
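As a small sketch of that point, the hypothetical function below allocates an array of n ints with one malloc() call and releases all of it with one free() call. The name sum_first_n and the cap of 10000 elements are arbitrary choices made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Builds the array 1..n on the heap, sums it, and frees it.
   Returns the sum, or -1 if n is out of range or allocation fails. */
long sum_first_n(size_t n)
{
    if (n == 0 || n > 10000)
        return -1;

    int *p = malloc(n * sizeof *p); /* one block holding n ints */
    if (p == NULL)
        return -1;

    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        p[i] = (int)(i + 1);
        sum += p[i];
    }
    assert(p[n - 1] == (int)n);

    free(p); /* one call releases all n elements; p itself is not changed by free */
    return sum;
}
```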
I'm so eager to finally understand this, I would very much appreciate any help!
There are a number of good pages about memory allocation in C that you should read for a much more detailed understanding. One place you could start is with the GNU C Library manual section on memory allocation.
As alluded to above and in the other answers, the actual behavior of the allocator depends on the implementation. Your code shouldn't have any particular expectations about how memory allocation works beyond what's documented in the standard library, i.e. call malloc(), calloc(), etc. to get a block of memory, and call free() to give it back when you're done so that it can be reused.
malloc and free do whatever they want. Their expected behaviour is that malloc allocates a block of desired size in dynamic memory and returns a pointer to it. free must be able to receive one such pointer and correctly deallocate the block. How they keep track of the block size is irrelevant.
Is int *p a pointer to an array of ints ? Maybe. If you allocated sufficient space for several ints, yes.
There is a fixed and limited amount of memory in your computer, and everybody wants some. The Operating system is charged with the task of assigning ownership to pieces of memory and keeping track of it all to assure that no one messes with anyone else's.
When you ask for memory with malloc(), you're asking the system (the C runtime and the OS) to give you the address of a block of memory that is now yours. You are free to write to it and read from it at will, and the system promises that no one else will mess with it while you own it. When you de-allocate it with free(), nothing happens to the memory itself, it's just no longer yours. What happens to it is none of your business. The system may keep it around for future allocations, it may give it to some other process.
The details of how this happens vary from one system to another, but they really don't concern the programmer (unless you're the one writing the code for malloc/free). Just use the memory while it's yours, and keep your hands off while it's not.

Control over memory while working with linked lists in C

While dealing with linked lists (LL), let us say we are writing a function insert(parameters) to insert a new node into the LL. For that we write in the function something similar to:
temp = (node *)malloc(sizeof(node));
That means we are allocating some space to 'temp'. After returning from the function this temp variable loses its scope and also its lifetime is over. So its dead now. But now my doubt is:
"Is the memory we have allocated now completely in our control even after returning from the function ?"
I am asking about OUR control on the newly allocated memory. We get results when we print or do any operations; but is that memory still dedicated to us? If the environment (OS) wants to use that memory is it restricted or it has permissions to use that memory?
If you didn't call free(), the memory is still allocated for your use. If temp goes out of scope, you leaked it though - how are you going to find out where it is to use it?
The memory stays allocated until it is free()d, even if you lose track of it by letting all of the pointers to it go out of scope. Someone is always "in charge" of freeing allocated memory; the documentation for functions should tell you if they either take control of a pointer from you or give it to you (such as malloc() itself does by passing back a pointer). You are probably leaking memory here.
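Here is a sketch of an insert that does not leak: the local variable temp disappears when the function returns, but the address it held is stored in the list head first, so the node stays reachable and can be freed later. The node layout and the names push_front and destroy_list are assumptions made for this example, since the question does not show them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Inserts a new node at the front of the list. Returns 0 on success, -1 on failure. */
int push_front(struct node **head, int value)
{
    assert(head != NULL);

    struct node *temp = malloc(sizeof *temp);
    if (temp == NULL)
        return -1;

    temp->value = value;
    temp->next = *head;
    *head = temp; /* the address outlives the local variable temp */
    return 0;
}

/* The list owns its nodes, so it is the list's job to free them. */
void destroy_list(struct node **head)
{
    assert(head != NULL);

    while (*head != NULL) {
        struct node *next = (*head)->next;
        free(*head);
        *head = next;
    }
}
```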
To be clear :
The memory is allocated by your program. Exiting your program will clean it up; until then it is reserved for your program's use.
The only limit is that your OS may refuse to allocate this memory (see the malloc manual).
Also, losing track of this allocated memory does not clean it up; it just becomes inaccessible. (This is called a leak: you have lost your memory. But don't worry, the OS will be able to reclaim it at exit.)

Heap Memory in C Programming

What exactly is heap memory?
Whenever a call to malloc is made, memory is assigned from something called the heap. Where exactly is the heap? I know that a program in main memory is divided into an instruction segment where program statements are present, a data segment where global data resides, and a stack segment where local variables and corresponding function parameters are stored. Now, what about the heap?
The heap is part of your process's address space. The heap can be grown or shrunk; you manipulate it by calling brk(2) or sbrk(2). This is in fact what malloc(3) does.
Allocating from the heap is more convenient than allocating memory on the stack because it persists after the calling routine returns; thus, you can call a routine, say funcA(), to allocate a bunch of memory and fill it with something; that memory will still be valid after funcA() returns. If funcA() allocates a local array (on the stack) then when funcA() returns, the on-stack array is gone.
A drawback of using the heap is that if you forget to release heap-allocated memory, you may exhaust it. The failure to release heap-allocated memory (e.g., failing to free() memory gotten from malloc()) is sometimes called a memory leak.
Another nice feature of the heap, vs. just allocating a local array/struct/whatever on the stack, is that you get a return value saying whether your allocation succeeded; if you try to allocate a local array on the stack and you run out, you don't get an error code; typically your thread will simply be aborted.
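Both points can be seen in a short sketch: the array built inside make_squares is still valid after that function returns, and a failed allocation comes back as NULL that the caller can test. The function names and the MAX_SQUARES cap are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_SQUARES 1000

/* Returns a heap array holding 0*0, 1*1, ..., (n-1)*(n-1), or NULL on failure.
   The caller owns the result and must free() it. */
int *make_squares(size_t n)
{
    if (n == 0 || n > MAX_SQUARES)
        return NULL;

    int *squares = malloc(n * sizeof *squares);
    if (squares == NULL)
        return NULL; /* failure is reported to the caller instead of crashing */

    for (size_t i = 0; i < n; i++)
        squares[i] = (int)(i * i);
    return squares;
}

/* Caller side: the array is still usable here, after make_squares() has returned. */
int last_square(size_t n)
{
    int *squares = make_squares(n);
    if (squares == NULL)
        return -1;

    assert(n > 0); /* make_squares() rejects n == 0 */
    int last = squares[n - 1];
    free(squares);
    return last;
}
```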
The heap is the diametrical opposite of the stack. The heap is a large pool of memory that can be used dynamically – it is also known as the “free store”. This is memory that is not automatically managed – you have to explicitly allocate (using functions such as malloc), and deallocate (e.g. free) the memory. Failure to free the memory when you are finished with it will result in what is known as a memory leak – memory that is still “being used”, and not available to other processes. Unlike the stack, there are generally no restrictions on the size of the heap (or the variables it creates), other than the physical size of memory in the machine. Variables created on the heap are accessible anywhere in the program.
Oh, and heap memory requires you to use pointers.
A summary of the heap:
the heap is managed by the programmer, the ability to modify it is somewhat boundless
in C, variables are allocated and freed using functions like malloc() and free()
the heap is large, and is usually limited by the physical memory available
the heap requires pointers to access it
credit to craftofcoding
Basically, after memory is consumed by the needs of programs, what is left is the heap. In C that will be the memory available for the computer, for virtual machines it will be less than that.
But, this is the memory that can be used at run-time as your program needs memory dynamically.
You may want to look at this for more info:
http://computer.howstuffworks.com/c28.htm
Reading through this, this is actually beyond the realms of C. C doesn't specify that there's a heap behind malloc; it could just as easily be called a linked list; you're just calling it a heap by convention.
What the standard guarantees is that malloc will return either a null pointer or a pointer to an object that has allocated storage duration, and your heap is just one type of data structure which facilitates the provision of such a storage duration. It's the common choice. Nonetheless, the very developers who wrote your heap have recognised that it might not be a heap, and so you'll see no reference to the term heap in the POSIX malloc manual, for example.
Other things that are beyond the realms of standard C include such details of the machine code binary which is no longer C source code following compilation. The layout details, though typical, are all implementation-specific as opposed to C-specific.
The heap, or whichever book-keeping data structure is used to account for allocations, is generated during runtime; as malloc is called, new entries are (presumably) added to it and as free is called, new entries are (again, presumably) removed from it.
As a result, there's generally no need to have a section in the machine code binary for objects allocated using malloc, however there are cases where applications are shipped standalone baked into microprocessors, and in some of these cases you might find that flash or otherwise non-volatile memory might be reserved for that use.

How MMU detects double free of a pointer?

How does the memory management unit (MMU) detect a double free of a pointer?
I know that it's good practice to set the pointer to NULL just after freeing it, but suppose the programmer does not do it. Is there any MMU mechanism to detect it?
The MMU has nothing to do with it. If you free a pointer allocated with malloc twice you will probably corrupt the C runtime heap. The heap (not the MMU) can in principle protect itself against such things, but most don't. Please note that this has nothing to do with the operating system - neither malloc() nor free() are system calls.
How does the memory management unit (MMU) detect a double free of a pointer?
The MMU just does virtual address space -> physical memory mapping, it doesn't know anything about how the heap is organized/how the allocation works/..., that is operating system/allocator work.
How does the OS detect the double free then? What's the mechanism?
It walks the list/bitmap/... of allocated blocks, sees that there's no allocated block with the address you passed to it, so it detects that it's a double free.
However, if that block has already been re-allocated, it finds it and frees it as usual => but now the code that used the re-allocated block will go nuts, since the memory it has correctly acquired and that it didn't release has become unallocated.
If the allocator protects the unallocated memory marking it as no-read and no-write/removing it from the committed pages of the virtual address space the program will die as soon as that memory is accessed again (but the code that apparently caused the crash will be actually innocent, since it didn't do anything wrong).
Otherwise, the application may still work for some time, until that memory block will be given to some other piece of code that requested some memory. At that point, two pieces of the same application will try to work on the same block of memory, with all the mess that can originate from this.
(Thanks to Pascal Cuoq for pointing out my error.)
No, there is no MMU mechanism to detect it. It is common that calling free on an already freed address causes the program to crash, as the implementation of free does something unexpected and causes a segmentation fault.
Running valgrind is a good way of checking for memory management problems, such as double freeing a pointer.
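Setting it to NULL isn't actually good practice, because it hides bugs, particularly double free()s (free(NULL) silently does nothing). Instead, set the pointer to a recognizable value that is never a valid address. Check the OS memory map to pick one; something like 0xfeeefeee or 0xdeadbeef is usually good.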
You can diagnose double free()s with a debug allocator. Most any decent CRT has one.
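To give a rough idea of what such a debug allocator does, here is a minimal sketch that remembers which addresses are currently live and refuses to pass anything else on to free(). dbg_malloc, dbg_free and the 64-entry table are made up for illustration; real debug allocators track far more. It also shares the limitation described earlier: if the address has already been handed out again by a later allocation, a stale free of it looks legitimate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_LIVE 64

static void *live[MAX_LIVE]; /* addresses handed out and not yet released */

/* Allocates a block and records its address. Returns NULL if malloc fails
   or the tracking table is full. */
void *dbg_malloc(size_t size)
{
    assert(size > 0); /* zero-size requests are excluded to keep the sketch simple */

    for (size_t i = 0; i < MAX_LIVE; i++) {
        if (live[i] == NULL) {
            void *p = malloc(size);
            live[i] = p; /* stays NULL if malloc failed */
            return p;
        }
    }
    return NULL;
}

/* Returns 0 on success, -1 if p is not a live block
   (a double free, or a pointer that never came from dbg_malloc). */
int dbg_free(void *p)
{
    if (p == NULL)
        return 0;

    for (size_t i = 0; i < MAX_LIVE; i++) {
        if (live[i] == p) {
            live[i] = NULL;
            free(p);
            return 0;
        }
    }
    return -1; /* the real free() is never reached, so the heap stays intact */
}
```

A program under test would call dbg_malloc and dbg_free in place of malloc and free, and report or abort whenever dbg_free returns -1.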

Resources