I know people keep telling me to read the documentation, and I haven't got the hang of that yet. I don't specifically know what to look for inside the docs. If you have a tip on how to get used to reading the docs, give me your tips as a bonus, but for now here is my question. Say in a simple program if we allocate some chunk of memory I can free it once I am done, right? here is one such a program that does nothing, but allocates and deallocates memory in the heap.
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
char *s = malloc(10);
free(s);
return (0);
}
After we compile this, if we run valgrind we can see that everything is freed. Now, here is just a little bit of a twist to the previous program.
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
char *s = malloc(10);
s++;
free(s);
return (0);
}
Here, before I free it, I increment the address by one, and all hell breaks loose. Free doesn't seem to now that this chunk of memory was allocated (even though it is a subset of allocated memory). In case you want to see it, here is my error message.
*** Error in `./a.out': free(): invalid pointer: 0x0000000001e00011 ***
Aborted (core dumped).
So this got me thinking.
Does c keep track of memory's allocated on the heap
If not, how does it know what to free and what to not?
And if it has such a knowledge, why doesn't c automatically deallocate such memories just before exiting. why memory leaks?
The C standard description of free says the following:
The free function causes the space pointed to by ptr to be deallocated, that is, made available for further allocation. If ptr is a null pointer, no action occurs. Otherwise, if the argument does not match a pointer earlier returned by a memory management function, or if the space has been deallocated by a call to free or realloc, the behavior is undefined.
Since you changed the pointer, it does not match a pointer returned by a memory management function (i.e. m/re/calloc), the behaviour is undefined, and anything can happen. Including the runtime library noticing that you've tried to free an invalid pointer, but the runtime is not required to do that either.
As for
Does C keep track of memory's allocated on the heap
It might... but it does not necessarily have to...
If not, how does it know what to free and what to not?
Well, if it does free the memory pointed to by a pointer, then it obviously need to have some kind of bookkeeping about the sizes of the allocations... but it does not need to be able to figure out if any pointers are still pointing to that memory area.
And if it has such a knowledge, why doesn't c automatically deallocate such memories just before exiting. why memory leaks?
Usually memory is freed after the process exits by the operating system. That's not the problem. The real problem are the leaks that happen while the program is still running.
My knowledge is limited. But I think I can solve some of your question
Does c keep track of memory's allocated on the heap
C does not track anything. The OS, however, tracks and knows which memory region are used and which is not.
If not, how does it know what to free and what to not?
See how-does-free-know-how-much-to-free
Put it simplly. When calling malloc, you give it a size. malloc uses an extra 8 bytes right in front of the pointer it returns to "remember" this size information. When you free said pointer, free will know both the address and read 8 bytes before the pointer to get the size info, then happily release the memory to the operating system.
And if it has such a knowledge, why doesn't c automatically deallocate such memories just before exiting. why memory leaks?
The OS knows the information. Thus when a C program exits, the OS will take in charge and free memory you didn't free explicitly.
*** Error in ./a.out': free(): invalid pointer: 0x0000000001e00011 ***
Aborted (core dumped).`
As for this. I give of snippet of glibc free function
glibc malloc will allocate memory region in an aligned pattern, say 32 byte aligned. So when you did s++, it is not 32 byte aligned any more.
Now you may think what if i do s += 32; and set a pretending size information before s. I tried this. Sadly I can't naively trick glibc' free function. It has other information to prevent this. I stopped to dig in right now...
Related
I have been taught in lectures, that calling free() on a pointer twice is really, really bad. I know that it is good practice, to set a pointer to NULL, right after having freed it.
However, I still have never heard any explanation as to why that is. From what I understand, the way malloc() works, it should technically keep track of the pointers it has allocated and given you to use. So why does it not know, whether a pointer it receives through free() has been freed yet or not?
I would love to understand, what happens internally, when you call free() on a location that has previously already been freed.
When you use malloc you are telling the PC that you want to reserve some memory location on the heap just for you. The computer gives back a pointer to the first byte of the addressed space.
When you use free you are actually telling the computer that you don't need that space anymore, so it marks that space as available for other data.
The pointer still points to that memory address. At this point that same space in the heap can be returned by another malloc call. When you invoke free a second time, you are not freeing the previous data, but the new data, and this may not be good for your program ;)
To answer your first question,
So why does it not know, whether a pointer it receives through free() has been freed yet or not?
because, the specification for malloc() in C standard does not mandate this. When you call malloc() or family of functions, what it does is to return you a pointer and internally it stores the size of the memory location allocated in that pointer. That is the reason free() does not need a size to clean up the memory.
Also, once free()-d, what happens with the actually allocated memory is still implelentation dependent. Calling free() is just a marker to point out that the allocated memory is no longer in use by the process and can be reclaimed and e re-allocated, if needed. So, keeping track of the allocated pointer is very needless at that point. It will be an unnecessary burden on the OS to keep all the backtracks.
For debugging purpose, however, some library implementations can do this job for you, like DUMA or dmalloc and last but not the least, memcheck tool from Valgrind.
Now, technically, the C standard does not specify any behaviour if you call free() on an already free-ed pointer. It is undefined behavior.
C11, chapter §7.22.3.3, free() function
[...] if
the argument does not match a pointer earlier returned by a memory management
function, or if the space has been deallocated by a call to free() or realloc(), the
behavior is undefined.
C standard only says that calling free twice on a pointer returned by malloc and its family function invoke undefined behavior. There is no further explanation why it is so.
But, why it is bad is explained here:
Freeing The Same Chunk Twice
To understand what this kind of error might cause, we should remember how the memory manager normally works. Often, it stores the size of the allocated chunk right before the chunk itself in memory. If we freed the memory, this memory chunk might have been allocated again by another malloc() request, and thus this double-free will actually free the wrong memory chunk - causing us to have a dangling pointer somewhere else in our application. Such bugs tend to show themselves much later than the place in the code where they occured. Sometimes we don't see them at all, but they still lurk around, waiting for an opportunity to rear their ugly heads.
Another problem that might occure, is that this double-free will be done after the freed chunk was merged together with neighbouring free chunks to form a larger free chunk, and then the larger chunk was re-allocated. In such a case, when we try to free() our chunk for the 2nd time, we'll actually free only part of the memory chunk that the application is currently using. This will cause even more unexpected problems.
When you are calling malloc you are getting a pointer. The runtime library needs to keep track of the malloced memory. Typically malloc does not store the memory management structures separated from the malloc ed memory but in one place. So a malloc for x bytes in fact takes x+n bytes, where one possible layout is that the first n bytes are containing a linked list struct with pointers to the next (and maybe previous) allocated memory block.
When you free a pointer then the function free could walk through it's internal memory management structures and check if the pointer you pass in is a valid pointer that was malloced. Only then it could access the hidden parts of the memory block. But doing this check would be very time consuming, especially if you allocate a lot. So free simply assumes that you pass in a valid pointer. That means it directly access the hidden parts of the memory block and assumes that the linked list pointers there are valid.
If you free a block twice then you might have the problem that someone did a new malloc, got the memory you just freed, overwrites it and the second free reads invalid pointers from it.
Setting a freed pointer to NULL is good practice because it helps debugging. If you access freed memory your program might crash, but it might also just read suspicious values and maybe crash later. Finding the root cause then might be hard. If you set freed pointers to NULL your program will immediately crash when you try to access the memory. That helps massively during debugging.
I have been recently trying to learn how to program in the C programming language.
I am currently having trouble understanding how memory is deallocated by free() in C.
What does it mean to free or release the memory?
For instance, if I have the following pointer:
int *p = malloc(sizeof(int));
When I deallocate it using free(p), what does it do? Does it somehow flag it as "deallocated", so the application may use it for new allocations?
Does it deallocates only the pointer address, or the address being pointed is also deallocated too?
I would do some experiments myself to better understand this, but I am so newbie in the subject that I don't know even how to debug a C program yet (I'm not using any IDE).
Also, what if int *p is actually a pointer to an array of int?
If I call free(p), does it deallocate the whole array or only the element it is pointing to?
I'm so eager to finally understand this, I would very much appreciate any help!
What does it mean to free or release the memory?
It means that you're done with the memory and are ready to give it back to the memory allocator.
When I deallocate it using free(p), what does it do?
The specifics are implementation dependent, but for a typical allocator it puts the block back on the free list. The allocator maintains a list of blocks that are available for use. When you ask for a chunk of memory (by calling malloc() or similar) the allocator finds an appropriate block in the list of free blocks, removes it (so it's no longer available), and gives you a pointer to the block. When you call free(), the process is reversed -- the block is put back on the free list and thereby becomes available to be allocated again.
Importantly, once you call free() on a pointer, you must not dereference that pointer again. A common source of memory-related errors is using a pointer after it has been freed. For that reason, some consider it a helpful practice to set a pointer to nil immediately after freeing it. Similarly, you should avoid calling free() on a pointer that you didn't originally get from the allocator (e.g. don't free a pointer to a local variable), and it's never a good idea to call free() twice on the same pointer.
Does it deallocates only the pointer address, or the address being pointed is also deallocated too?
When you request a block of memory from the allocator, you specify the size of the block you want. The allocator keeps track of the size of the block so that when you free the block, it knows both the starting address and the block size. When you call free(p), the block that p points to is deallocated; nothing happens to the pointer p itself.
Also, what if int *p is actually a pointer to an array of int?
An array in C is a contiguous block of memory, so a pointer to the first element of the array is also a pointer to the entire block. Freeing that block will properly deallocate the entire array.
I'm so eager to finally understand this, I would very much appreciate any help!
There are a number of good pages about memory allocation in C that you should read for a much more detailed understanding. One place you could start is with the GNU C Library manual section on memory allocation.
As alluded to above and in the other answers, the actual behavior of the allocator depends on the implementation. Your code shouldn't have any particular expectations about how memory allocation works beyond what's documented in the standard library, i.e. call malloc(), calloc(), etc. to get a block of memory, and call free() to give it back when you're done so that it can be reused.
malloc and free do whatever they want. Their expected behaviour is that malloc allocates a block of desired size in dynamic memory and returns a pointer to it. free must be able to receive one such pointer and correctly deallocate the block. How they keep track of the block size is irrelevant.
Is int *p a pointer to an array of ints ? Maybe. If you allocated sufficient space for several ints, yes.
There is a fixed and limited amount of memory in your computer, and everybody wants some. The Operating system is charged with the task of assigning ownership to pieces of memory and keeping track of it all to assure that no one messes with anyone else's.
When you ask for memory with malloc(), you're asking the system (the C runtime and the OS) to give you the address of a block of memory that is now yours. You are free to write to it and read from it at will, and the system promises that no one else will mess with it while you own it. When you de-allocate it with free(), nothing happens to the memory itself, it's just no longer yours. What happens to it is none of your business. The system may keep it around for future allocations, it may give it to some other process.
The details of how this happens vary from one system to another, but they really don't concern the programmer (unless you're the one writing the code for malloc/free). Just use the memory while it's yours, and keep your hands off while it's not.
I wrote a simple program to test the contents of a dynamically allocated memory after free() as below. (I know we should not access the memory after free. I wrote this to check what will be there in the memory after free)
#include <stdio.h>
#include <stdlib.h>
main()
{
int *p = (int *)malloc(sizeof(int));
*p = 3;
printf("%d\n", *p);
free(p);
printf("%d\n", *p);
}
output:
3
0
I thought it will print either junk values or crash by 2nd print statement. But it is always printing 0.
1) Does this behaviour depend on the compiler?
2) if I try to deallocate the memory twice using free(), core dump is getting generated. In the man pages, it is mentioned that program behaviour is abnormal. But I am always getting core dump. Does this behaviour also depend on the compiler?
Does free() remove the data stored in the dynamically allocated memory?
No. free just free the allocated space pointed by its argument (pointer). This function accepts a char pointer to a previously allocated memory chunk, and frees it - that is, adds it to the list of free memory chunks, that may be re-allocated.
The freed memory is not cleared/erased in any manner.
You should not dereference the freed (dangling) pointer. Standard says that:
7.22.3.3 The free function:
[...] Otherwise, if the argument does not match a pointer earlier returned by a memory management
function, or if the space has been deallocated by a call to free or realloc, the behavior is undefined.
The above quote also states that freeing a pointer twice will invoke undefined behavior. Once UB is in action, you may get either expected, unexpected results. There may be program crash or core dump.
As described in gnu website
Freeing a block alters the contents of the block. Do not expect to find any data (such as a pointer to the next block in a chain of blocks) in the block after freeing it.
So, accessing a memory location after freeing it results in undefined behaviour, although free doesnt change the data in the memory location. U may be getting 0 in this example, u might as well get garbage in some other example.
And, if you try to deallocate the memory twice, on the second attempt you would be trying to free a memory which is not allocated, thats why you are gettin the core dump.
In addition to all the above explanations for use-after-free semantics, you really may want to investigate the life-saver for every C programmer: valgrind. It will automatically detect such bugs in your code and generally save your behind in the real world.
Coverity and all the other static code checkers are also great, but valgrind is awesome.
As far as standard C is concerned, it’s just not specified, because it is not observable. As soon as you free memory, all pointers pointing there are invalid, so there is no way to inspect that memory.*)
Even if you happen to have some standard C library documenting a certain behaviour, your compiler may still assume pointers aren’t reused after being passed to free, so you still cannot expect any particular behaviour.
*) I think, even reading these pointers is UB, not only dereferencing, but this doesn’t matter here anyway.
As per my understanding,
free() is used to deallocate the memory that we allocated using malloc before.
In my following snippet, I have freed the memory i have allocated. But i was able to access the pointer even after freeing? How it is possible?
How free works internally?
#include<iostream>
using namespace std;
int main()
{
int *p=(int *)malloc(sizeof(int));
*p=17;
free(p);
*p=*p+1;
printf("\n After freeing memory :: %d ",*p );
return 0;
}
You can certainly continue to use p after calling free(p) and nothing will stop you. However the results will be completely undefined and unpredictable. It works by luck only. This is a common programming error called "use after free" which works in many programs for literally years without "problems" -- until it causes a problem.
There are tools which are quite good at finding such errors, such as Valgrind.
Accessing a dangling pointer will result in undefined behavior.
A dangling pointer is the one which is already freed.
After free(p), p is a dangling pointer which points to no where. free() just releases the memory block allocated by malloc() and it doesn't change the value of pointer pointing to heap in that process address space. On some platforms you might get segfault if you try dereferencing pointer after freeing. Its good practice if you assign pointer p to NULL after freeing.
In most systems and standard libraries, malloc is optimized by allocating a larger chunk of memory than required (up to 64K on some systems) from the OS, and then internally disbursing it as per requirement from this pool. The same applied to the deallocation (free) as well, in that, the free'd memory isn't freed, but put back in the memory pool, in case another request comes in, in which case the pool is reused.
Thus, the call to free hasn't actually freed the memory from the process's address space and is thus accessible even after free.
My understanding as to why this is done is that system calls are expensive, so the initial call to malloc will make a system call for enough memory that future requests for malloc do not immediately trigger another system call for more memory.
As for further reading, please take a look at this page on Wikipedia, How do malloc() and free() work? and How is malloc() implemented internally?
I realize the code sample below is something you should never do. My question is just one of interest. If you allocate a block of memory, and then move the pointer (a no-no), when you deallocate the memory, what is the size of the block that is deallocated, and where is it in memory? Here's the contrived code snippet:
#include <stdio.h>
#include <string.h>
int main(void) {
char* s = malloc(1024);
strcpy(s, "Some string");
// Advance the pointer...
s += 5;
// Prints "string"
printf("%s\n", s);
/*
* What exactly are the beginning and end points of the memory
* block now being deallocated?
*/
free(s);
return 0;
}
Here is what I think I happens. The memory block being deallocated begins with the byte that holds the letter "s" in "string". The 5 bytes that held "Some " are now lost.
What I'm wondering is: Are the 5 bytes whose location in memory immediately follows the end of the original 1024 bytes deallocated as well, or are they just left alone?
Anyone know for sure what is it the compiler does? Is it undefined?
Thanks.
You cannot pass a pointer that was not obtained from a malloc, calloc or realloc to free (except NULL).
Question 7.19 in the C FAQ is relevant to your question.
The consequences of invoking undefined behavior are explained here.
It's undefined behavior in the standard, so you can't rely on anything.
Remember that blocks are artificially delimited areas of memory, and don't automatically
show up. Something has to keep track of the block, in order to free everything necessary and nothing more. There's no possible termination, like C strings, since there's no value or combination of values that can be guaranteed not to be inside the block.
Last I looked, there were two basic implementation practices.
One is to keep a separate record of allocated blocks, along with the address allocated. The free() function looks up the block to see what to free. In this case, it's likely to simply not find it, and may well just do nothing. Memory leak. There are, however, no guarantees.
One is to keep the block information in a part of memory just before the allocation address. In this case, free() is using part of the block as a block descriptor, and depending on what's stored there (which could be anything) it will free something. It could be an area that's too small, or an area that's too large. Heap corruption is very likely.
So, I'd expect either a memory leak (nothing gets freed), or heap corruption (too much is marked free, and then reallocated).
Yes, it is undefined behavior. You're essentially freeing a pointer you didn't malloc.
You cannot pass a pointer you did not obtain from malloc (or calloc or realloc...) to free. That includes offsets into blocks you did obtain from malloc. Breaking this rule could result in anything happening. Usually this ends up being the worst imaginable possibility at the worst possible moment.
As a further note, if you wanted to truncate the block, there's a legal way to do this:
#include <stdio.h>
#include <string.h>
int main() {
char *new_s;
char *s = malloc(1024);
strcpy(s, "Some string");
new_s = realloc(s, 5);
if (!new_s) {
printf("Out of memory! How did this happen when we were freeing memory? What a cruel world!\n");
abort();
}
s = new_s;
s[4] = 0; // put the null terminator back on
printf("%s\n", s); // prints Some
free(s);
return 0;
}
realloc works both to enlarge and shrink memory blocks, but may (or may not) move the memory to do so.
It is not the compiler that does it, it is the standard library. The behavior is undefined. The library knows that it allocated the original s to you. The s+5 is not assigned to any memory block known by the library, even though it happens to be inside a known block. So, it won't work.
What I'm wondering is: Are the 5 bytes whose location in memory immediately follows the end of the original 1024 bytes deallocated as well, or are they just left alone?
Both. The result is undefined so a compiler is free to do either of those, or anything else they'd like really. Of course (as with all cases of "undefined behavior") for a particular platform and compiler there is a specific answer, but any code that relies on such behavior is a bad idea.
Calling free() on a ptr that wasnt allocated by malloc or its brethren is undefined.
Most implementations of malloc allocate a small (typically 4byte) header region immediately before the ptr returned. Which means when you allocated 1024 bytes, malloc actually reserved 1028 bytes. When free( ptr ) is called, if ptr is not 0, it inspects the data at ptr - sizeof(header). Some allocators implement a sanity check, to make sure its a valid header, and which might detect a bad ptr, and assert or exit. If there is no sanity check, or it erroneously passes, free routine will act on whatever data happens to be in the header.
Adding to the more formal answers: I'd compare the mechanics of this to one taking a book in the library (malloc), then tearing off a few dozen pages together with the cover (advance the pointer), and then attempting to return it (free).
You might find a librarian (malloc/free library implementation) that takes such a book back, but in a lot of case I'd expect you would pay a fine for negligent handling.
In the draft of C99 (I don't have the final C99 handy in front of me), there is something to say on this topic:
The free function causes the space pointed to by ptr to be deallocated,
that is, made available for further allocation. If ptr is a null pointer, no action
occurs. Otherwise, if the argument does not match a pointer earlier returned
by the calloc, malloc, or realloc function, or if the space has been
deallocated by a call to free or realloc, the behaviour is undefined.
In my experience, a double free or the free of the "pointer" that was not returned via malloc will result in a memory corruption and/or crash, depending on your malloc implementation. The security people from both sides of the fence used this behaviour not once, in order to do various interesting things at least in early versions of the widely used Doug Lea's malloc package.
The library implementation might put some data structure before the pointer it returns to you. Then in free() it decrements the pointer to get at the data structure telling it how to place the memory back into the free pool. So the 5 bytes at the beginning of your string "Some " is interpreted as the end of the struct used by the malloc() algorithm. Perhaps the end of a 32 bit value, like the size of memory allocated, or a link in a linked list. It depends on the implementation. Whatever the details, it'll just crash your program. As Sinan points out, if you're lucky!
Let's be smart here... free() is not a black hole. At the very least, you have the CRT source code. Beyond that, you need the kernel source code.
Sure, the behavior is undefined in that it is up to the CRT/OS to decide what to do. But that doesn't prevent you from finding out what your platform actualy does.
A quick look into the Windows CRT shows that free() leads right to HeapFree() using a CRT specific heap. Beoyond that you're into RtlHeapFree() and then into system space (NTOSKRN.EXE) with the memory manager Mm*().
There are consistancey checks throughout all these code paths. But doing differnt things to the memory will cause differnt code paths. Hence the true definition of undefined.
At a quick glance, I can see that an allocated block of memory has a marker at the end. When the memory is freed, each byte is written over with a distinct byte. The runtime can do a check to see if the end of block marker was overwritten and raise an exception if so.
This is a posiblility in your case of freeing memory a few bytes into your block (or over-writing your allocated size). Of course you can trick this and write the end of block marker yourself at the correct location. This will get you past the CRT check, but as the code-path goes futher, more undefined behavoir occurs. Three things can happen: 1) absolutely no harm, 2) memory corruption within the CRT heap, or 3) a thrown exception by any of the memory management functions.
Short version: It's undefined behavior.
Long version: I checked the CWE site and found that, while it's a bad idea all around, nobody seemed to have a solid answer. Probably because it's undefined.
My guess is that most implementations, assuming they don't crash, would either free 1019 bytes (in your example), or else free 1024 and get a double free or similar on the last five. Just speaking theoretically for now, it depends on whether the malloc routine's internal storage tables contains an address and a length, or a start address and an end address.
In any case, it's clearly not a good idea. :-)