Free memory in C

I have a doubt regarding:
void *a = malloc(40);
free(a);
Suppose malloc(40) allocates 40 bytes of memory and returns its address, and free(a) then deallocates that memory without touching the bit pattern stored in it. If the same memory is later allocated to, say, void *b, will printing the value at the address pointed to by b give me the value that was there before, or a garbage value, and why?

I assume that you have this situation in mind:
void * a = malloc(40);
free(a);
void * b = malloc(40);
assert(a == b);
This is of course entirely plausible, since memory is likely to be reused.
However, since a == b, you've answered your own question: The value of b is identical to the value of a!
I believe that you've asked the wrong question, and that you are actually interested in comparing the memory pointed to by b. That's a whole different kettle of fish. Anything could have happened in between the two malloc calls. Nothing is guaranteed. The memory pointed to by the return value of a malloc call is uninitialized, and you cannot make any assumptions about its content. It stands to reason that the memory will not have changed in a typical, optimized C library, but there's no guarantee. A "safe" runtime environment may well choose to overwrite freed or allocated memory with a specific test pattern to allow better detection of invalid accesses.

It can give you any value.
C/C++ standards do not mandate the value to be anything specific. In technical terms the value of any uninitialized variable/memory is Indeterminate.
In short, your program should not rely on this value being anything specific; if it does, it is non-portable.

Nothing is guaranteed. Printing the block you got from malloc may show the previous data, or it may not. Many things can change which block the next malloc returns (so the next address will be different) or alter the old memory block itself.

free will often modify the memory you're freeing.
It's a common trick, especially in debug mode, for free to overwrite memory with some fixed pattern to make it easier to tell if you're double freeing memory, or just manipulating freed memory.
Likewise malloc might overwrite memory with a different pattern to make it obvious that the memory is uninitialised.

malloc and free are C-style memory management rather than C++; the C++ alternatives are the new and delete operators. As for the bit pattern remaining in the memory after free(): it is often the same bit pattern, but nothing guarantees that. If you want to erase the bit pattern yourself, you can use memset(), or ZeroMemory() if you write WinAPI code.

It will give you some garbage value, because free() may set the freed bytes to a pattern that marks them as freed and uninitialized. That said, it is unlikely that you will hit the exact case described in your question. Even if you are given the same bytes again, believe me, you will by no means recognize them :-)
AFAIK, with Visual Studio free() generally sets the memory to 0xFE 0xEE, which roughly means that the memory was allocated but has now been freed. These values are known as sentinel values and indicate that the heap is still owned by the process but not in use. Memory that has been released from the process will show you "?? ??".

First, the code is not c++ but plain c.
The reason is that free() / delete exist so that the system can note that the memory region is available for allocation again.
Why should it do anything beyond that?
This is a security issue, however. I believe some security-oriented modern systems do zero the memory before giving it to an application. Then the first time you use malloc() you get the equivalent of calloc(). Nevertheless, if you free the memory and then allocate it again, you might be able to read your own data.
The plain reason for such behaviour is simple. Zeroing memory would be time consuming and you can do it by hand. Actually it has O(n) complexity. If you write a number cruncher that reuses its memory, you do not care what you get in there after malloc() because most likely you should be overwriting it, and you definitely do not want your FLOPS to be negatively affected by unnecessary memset() upon calling free().
If you want to be sure that nothing can read the memory after you call free you need to use memset(a, 0, SIZE) before calling free().
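One caveat with a plain memset() right before free(): a compiler is allowed to treat it as a dead store and drop it. Below is a minimal sketch of a wipe that is harder to optimize away, assuming the caller knows the block size. The names wipe_bytes and wipe_and_free are made up for illustration, and writing through a volatile pointer is a common idiom rather than a hard guarantee.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: overwrite n bytes with zeros through a volatile
   pointer, so the stores are less likely to be removed as dead writes. */
void wipe_bytes(void *p, size_t n)
{
    volatile unsigned char *vp = p;
    while (n--) {
        *vp++ = 0;
    }
}

/* Hypothetical helper: wipe a block of known size, then release it. */
void wipe_and_free(void *p, size_t n)
{
    if (p != NULL) {
        wipe_bytes(p, n);
        free(p);
    }
}
```

Usage would look like wipe_and_free(a, 40) in place of free(a) for the 40-byte block from the question.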

Related

Why does malloc initialize the values to 0 in gcc?

Maybe it is different from platform to platform, but
when I compile using gcc and run the code below, I get 0 every time in my ubuntu 11.10.
#include <stdio.h>
#include <stdlib.h>
int main()
{
double *a = malloc(sizeof(double)*100);
printf("%f", *a);
}
Why does malloc behave like this even though there is calloc?
Doesn't it mean that there is an unwanted performance overhead just to initialize the values to 0 even if you don't want it to be sometimes?
EDIT: Oh, my previous example was not initializing anything; it just happened to use a "fresh" block.
What I precisely was looking for was why it initializes it when it allocates a large block:
int main()
{
int *a = malloc(sizeof(int)*200000);
a[10] = 3;
printf("%d", *(a+10));
free(a);
a = malloc(sizeof(double)*200000);
printf("%d", *(a+10));
}
OUTPUT: 3
0 (initialized)
But thanks for pointing out that there is a SECURITY reason when mallocing! (Never thought about it). Sure it has to initialize to zero when allocating fresh block, or the large block.
Short Answer:
It doesn't, it just happens to be zero in your case.(Also your test case doesn't show that the data is zero. It only shows if one element is zero.)
Long Answer:
When you call malloc(), one of two things will happen:
It recycles memory that was previously allocated and freed from the same process.
It requests new page(s) from the operating system.
In the first case, the memory will contain data leftover from previous allocations. So it won't be zero. This is the usual case when performing small allocations.
In the second case, the memory will be from the OS. This happens when the program runs out of memory - or when you are requesting a very large allocation. (as is the case in your example)
Here's the catch: Memory coming from the OS will be zeroed for security reasons.*
When the OS gives you memory, it could have been freed from a different process. So that memory could contain sensitive information such as a password. So to prevent you reading such data, the OS will zero it before it gives it to you.
*I note that the C standard says nothing about this. This is strictly an OS behavior. So this zeroing may or may not be present on systems where security is not a concern.
To give more of a performance background to this:
As @R. mentions in the comments, this zeroing is why you should always use calloc() instead of malloc() + memset(). calloc() can take advantage of this fact to avoid a separate memset().
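A minimal sketch of the calloc() form, assuming a platform where an all-zero bit pattern reads back as 0.0 (true for IEEE 754 doubles). The function name zeroed_doubles is made up for illustration.

```c
#include <stdlib.h>

/* Hypothetical helper: allocate `count` doubles whose bytes are all zero.
   The allocator is free to skip the clearing work when the pages it hands
   out are already known to be zero-filled. */
double *zeroed_doubles(size_t count)
{
    return calloc(count, sizeof(double));
}
```

The caller still has to check the result for NULL and free() it when done.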
On the other hand, this zeroing is sometimes a performance bottleneck. In some numerical applications (such as the out-of-place FFT), you need to allocate a huge chunk of scratch memory. Use it to perform whatever algorithm, then free it.
In these cases, the zeroing is unnecessary and amounts to pure overhead.
The most extreme example I've seen is a 20-second zeroing overhead for a 70-second operation with a 48 GB scratch buffer. (Roughly 30% overhead.)
(Granted: the machine did have a lack of memory bandwidth.)
The obvious solution is to simply reuse the memory manually. But that often requires breaking through established interfaces. (especially if it's part of a library routine)
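A rough sketch of what "reusing the memory manually" could look like: a scratch buffer that outlives individual calls and only grows when a larger size is requested. The struct and function names here are hypothetical, not part of any library.

```c
#include <stdlib.h>

/* Hypothetical scratch buffer that is kept alive between calls. */
struct scratch {
    void  *data;
    size_t capacity;
};

/* Return a buffer of at least `need` bytes, growing only when required.
   The contents are whatever the previous user left behind. */
void *scratch_reserve(struct scratch *s, size_t need)
{
    if (need > s->capacity) {
        void *grown = realloc(s->data, need);
        if (grown == NULL) {
            return NULL;   /* the old buffer is still valid and owned by s */
        }
        s->data = grown;
        s->capacity = need;
    }
    return s->data;
}

/* Give the buffer back once it is no longer needed at all. */
void scratch_release(struct scratch *s)
{
    free(s->data);
    s->data = NULL;
    s->capacity = 0;
}
```

Because the buffer is never handed back between uses, no fresh pages (and no zeroing) are needed after the first call, at the cost of threading the struct through the interface.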
The OS will usually clear fresh memory pages it sends to your process so it can't look at an older process' data. This means that the first time you initialize a variable (or malloc something) it will often be zero but if you ever reuse that memory (by freeing it and malloc-ing again, for instance) then all bets are off.
This inconsistency is precisely why uninitialized variables are such hard-to-find bugs.
As for the unwanted performance overhead, avoiding unspecified behaviour is probably more important. Whatever small performance boost you could gain in this case won't compensate for the hard-to-find bugs you will have to deal with if someone slightly modifies the code (breaking previous assumptions) or ports it to another system (where the assumptions might have been invalid in the first place).
Why do you assume that malloc() initializes to zero? It just so happens to be that the first call to malloc() results in a call to sbrk or mmap system calls, which allocate a page of memory from the OS. The OS is obliged to provide zero-initialized memory for security reasons (otherwise, data from other processes gets visible!). So you might think there - the OS wastes time zeroing the page. But no! In Linux, there is a special system-wide singleton page called the 'zero page' and that page will get mapped as Copy-On-Write, which means that only when you actually write on that page, the OS will allocate another page and initialize it. So I hope this answers your question regarding performance. The memory paging model allows usage of memory to be sort-of lazy by supporting the capability of multiple mapping of the same page plus the ability to handle the case when the first write occurs.
If you call free(), the glibc allocator will return the region to its free lists, and when malloc() is called again, you might get that same region, but dirty with the previous data. Eventually, free() might return the memory to the OS by calling system calls again.
Notice that the glibc man page on malloc() strictly says that the memory is not cleared, so by the "contract" on the API, you cannot assume that it does get cleared. Here's the original excerpt:
malloc() allocates size bytes and returns a pointer to the allocated memory.
The memory is not cleared. If size is 0, then malloc() returns either NULL,
or a unique pointer value that can later be successfully passed to free().
If you would like, you can read more of that documentation if you are worried about performance or other side effects.
I modified your example to contain 2 identical allocations. Now it is easy to see malloc doesn't zero initialize memory.
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
{
double *a = malloc(sizeof(double)*100);
*a = 100;
printf("%f\n", *a);
free(a);
}
{
double *a = malloc(sizeof(double)*100);
printf("%f\n", *a);
free(a);
}
return 0;
}
Output with gcc 4.3.4
100.000000
100.000000
From gnu.org:
Very large blocks (much larger than a page) are allocated with mmap (anonymous or via /dev/zero) by this implementation.
The standard does not dictate that malloc() should initialize the values to zero. It just happens at your platform that it might be set to zero, or it might have been zero at the specific moment you read that value.
Your code doesn't demonstrate that malloc initialises its memory to 0. That could be done by the operating system, before the program starts. To see which is the case, write a different value to the memory, free it, and call malloc again. You will probably get the same address, but you will have to check this. If so, you can look to see what it contains. Let us know!
malloc doesn't initialize memory to zero. It returns it to you as it is without touching the memory or changing its value.
So, why do we get those zeros?
Before answering this question we should understand how malloc works:
When you call malloc it checks whether the glibc allocator has a memory of the requested size or not.
If it does, it will return this memory to you. This memory usually comes from a previous free operation, so in most cases it holds garbage values (which may or may not be zero).
On the other hand, if it can't find memory, it will ask the OS to allocate memory for it, by calling sbrk or mmap system calls.
The OS returns a zero-initialized page for security reasons as this memory may have been used by another process and carries valuable information such as passwords or personal data.
You can read about it yourself from this Link:
Neighboring chunks can be coalesced on a free no matter what their
size is. This makes the implementation suitable for all kinds of
allocation patterns without generally incurring high memory waste
through fragmentation.
Very large blocks (much larger than a page) are allocated with mmap
(anonymous or via /dev/zero) by this implementation
In some implementations calloc uses this property of the OS and asks the OS to allocate pages for it to make sure the memory is always zero-initialized without initializing it itself.
Do you know that it is definitely being initialised? Is it possible that the area returned by malloc() just frequently has 0 at the beginning?
Never ever count on any compiler to generate code that will initialize memory to anything. malloc simply returns a pointer to n bytes of memory someplace; hell, it might even be in swap.
If the contents of the memory are critical, initialize them yourself.
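"Initialize it yourself" can be as small as the sketch below. The helper name alloc_filled is made up for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocate n bytes and set every byte to `fill`,
   so the caller never observes whatever malloc() happened to return. */
unsigned char *alloc_filled(size_t n, unsigned char fill)
{
    unsigned char *p = malloc(n);
    if (p != NULL) {
        memset(p, fill, n);
    }
    return p;
}
```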

Allocating and freeing memory

My question is quite simple. We generally allocate memory by declaring a pointer and then assigning a block of memory to that pointer. Suppose somewhere in the code I happen to use
ptr = ptr + 1
and then I use
free(ptr)
Can someone tell me what will happen? Will the entire memory block get deallocated, or something else? Can I partially deallocate the memory?
You must always pass exactly the same pointer to free that you got from malloc (or realloc.) If you don't, the "behavior is undefined", which is a term of art that means you can't rely on the program behaving in any predictable way. In this case, though, you should expect it to crash immediately. (If you get unlucky, it will instead corrupt memory, causing a crash some time later, or worse, incorrect output.)
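A minimal sketch of the usual way around this: keep the pointer malloc returned untouched and walk the block with a copy. The function name fill_and_sum is made up for illustration.

```c
#include <stdlib.h>

/* Hypothetical example: fill an array with 0..n-1 and return the sum.
   The walk uses a copy of the pointer; free() gets the original.
   Returns -1 if the allocation fails. */
long fill_and_sum(size_t n)
{
    int *base = malloc(n * sizeof *base);
    if (base == NULL) {
        return -1;
    }
    long sum = 0;
    int *cur = base;                 /* cursor that is allowed to move */
    for (size_t i = 0; i < n; i++) {
        *cur = (int)i;
        sum += *cur;
        cur++;
    }
    free(base);                      /* exactly the pointer malloc() returned */
    return sum;
}
```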
The only way to partially deallocate memory is realloc with a smaller size, but that's only good for trimming at the end and isn't guaranteed to make the trimmed-off chunk available for some other allocation.
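Trimming with realloc might look like the sketch below. The helper name shrink_block is hypothetical, and it assumes new_size is greater than zero.

```c
#include <stdlib.h>

/* Hypothetical helper: shrink *block to new_size bytes (new_size > 0).
   Returns 0 on success; on failure *block is unchanged and still valid. */
int shrink_block(void **block, size_t new_size)
{
    void *smaller = realloc(*block, new_size);
    if (smaller == NULL) {
        return -1;
    }
    *block = smaller;   /* may or may not be the same address as before */
    return 0;
}
```

The first new_size bytes keep their contents; everything past that point must no longer be accessed.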
It's impossible to deallocate part of a memory block. The only thing you can do is reallocate the block giving it a different size. However this does not guarantee that the block will land in the same place in memory (it might be copied somewhere else).
You must pass to free() the same pointer to the same location malloc() returned. That's because the allocator keeps a sort of list of the blocks you allocated (@everyone: feel free to add/modify if I'm wrong), and if the pointer you pass to free() isn't found in this list, free() complains ("bad memory block" maybe?). To partly deallocate the memory you should use realloc(), which changes the size of that block of memory but can be slow, since the data may have to be copied. You should use it only when you're sure of the new size of the block, or when leaving more space to fill in the future.
malloc(), free() and realloc() are not part of the core C language.
They are functions defined in the standard library. The code in the standard library that deals with them is usually called the "allocator". So the actual answer is "it depends on the C library".
On Linux, glibc would crash the program.
On VC++, the C runtime would corrupt the allocator's state.
Source code is available for these libraries, so you can actually put a breakpoint and step into free.

I can use more memory than how much I've allocated with malloc(), why?

char *cp = (char *) malloc(1);
strcpy(cp, "123456789");
puts(cp);
output is "123456789" on both gcc (Linux) and Visual C++ Express, does that mean when there is free memory, I can actually use more than what I've allocated with malloc()?
And why doesn't malloc(0) cause a runtime error?
Thanks.
You've asked a very good question and maybe this will whet your appetite about operating systems. Already you know you've managed to achieve something with this code that you wouldn't ordinarily expect to do. So you would never do this in code you want to make portable.
To be more specific, and this depends entirely on your operating system and CPU architecture, the operating system allocates "pages" of memory to your program - typically this can be in the order of 4 kilobytes. The operating system is the guardian of pages and will immediately terminate any program that attempts to access a page it has not been assigned.
malloc, on the other hand, is not an operating system function but a C library call. It can be implemented in many ways. It is likely that your call to malloc resulted in a page request from the operating system. Then malloc would have decided to give you a pointer to a single byte inside that page. When you wrote to the memory from the location you were given you were just writing in a "page" that the operating system had granted your program, and thus the operating system will not see any wrong doing.
The real problems, of course, will begin when you continue to call malloc to assign more memory. It will eventually return pointers to the locations you just wrote over. This is called a "buffer overflow" when you write to memory locations that are legal (from an operating system perspective) but could potentially be overwriting memory another part of the program will also be using.
If you continue to learn about this subject you'll begin to understand how programs can be exploited using such "buffer overflow" techniques - even to the point where you begin to write assembly language instructions directly into areas of memory that will be executed by another part of your program.
When you get to this stage you'll have gained much wisdom. But please be ethical and do not use it to wreak havoc in the universe!
PS when I say "operating system" above I really mean "operating system in conjunction with privileged CPU access". The CPU and MMU (memory management unit) triggers particular interrupts or callbacks into the operating system if a process attempts to use a page that has not been allocated to that process. The operating system then cleanly shuts down your application and allows the system to continue functioning. In the old days, before memory management units and privileged CPU instructions, you could practically write anywhere in memory at any time - and then your system would be totally at the mercy of the consequences of that memory write!
No. You get undefined behavior. That means anything can happen, from it crashing (yay) to it "working" (boo), to it reformatting your hard drive and filling it with text files that say "UB, UB, UB..." (wat).
There's no point in wondering what happens after that, because it depends on your compiler, platform, environment, time of day, favorite soda, etc., all of which can do whatever they want as (in)consistently as they want.
More specifically, using any memory you have not allocated is undefined behavior. You get one byte from malloc(1), that's it.
When you ask malloc for 1 byte, it will probably get 1 page (typically 4KB) from the operating system. This page will be allocated to the calling process so as long as you don't go out of the page boundary, you won't have any problems.
Note, however, that it is definitely undefined behavior!
Consider the following (hypothetical) example of what might happen when using malloc:
malloc(1)
If malloc is internally out of memory, it will ask the operating system for some more. It will typically receive a page. Say it's 4KB in size with addresses starting at 0x1000.
Your call returns giving you the address 0x1000 to use. Since you asked for 1 byte, it is defined behavior if you only use the address 0x1000.
Since the operating system has just allocated 4KB of memory to your process starting at address 0x1000, it will not complain if you read/write something from/to addresses 0x1000-0x1fff. So you can happily do so but it is undefined behavior.
Let's say you do another malloc(1)
Now malloc still has some memory left so it doesn't need to ask the operating system for more. It will probably return the address 0x1001.
If you had written more than 1 byte using the address given by the first malloc, you will run into trouble when you use the address from the second malloc, because you will overwrite the data.
So the point is that you definitely get 1 byte from malloc, but malloc may internally have more memory allocated to your process.
No. It means that your program behaves badly. It writes to a memory location that it does not own.
You get undefined behavior - anything can happen. Don't do it and don't speculate about whether it works. Maybe it corrupts memory and you don't see it immediately. Only access memory within the allocated block size.
You may be allowed to keep writing until you reach some other program memory, or another point at which your application will most likely crash for accessing protected memory.
So many responses and only one that gives the right explanation. While the page size, buffer overflow and undefined behaviour stories are true (and important), they do not exactly answer the original question. In fact, any sane malloc implementation will allocate at least the size of the alignment requirement of an int or a void *. Why? Because if it allocated only 1 byte, the next chunk of memory wouldn't be aligned anymore. There is always some bookkeeping data around your allocated blocks, and these data structures are nearly always aligned to some multiple of 4. While some architectures (x86) can access words at unaligned addresses, they do incur a penalty for doing so, so allocator implementers avoid it. Even in slab allocators there is no point in having a 1-byte pool, as such small allocations are rare in practice. So it is very likely that there are 4 or 8 bytes of real room behind your malloc'd byte (this doesn't mean you may use that 'feature'; it's wrong).
EDIT: Besides, most malloc implementations reserve bigger chunks than asked for, to avoid too many copy operations when calling realloc. As a test, you can call realloc in a loop with a growing allocation size and compare the returned pointers; you will see that the pointer changes only after a certain threshold.
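The realloc experiment could be sketched like this. The function name count_realloc_moves is made up, and the result depends entirely on the allocator, so no particular number should be expected.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical experiment: grow a block one byte at a time, `steps`
   times, and count how often realloc() hands back a different address.
   Returns -1 if an allocation fails. */
int count_realloc_moves(size_t steps)
{
    size_t size = 1;
    char *p = malloc(size);
    if (p == NULL) {
        return -1;
    }
    int moves = 0;
    for (size_t i = 0; i < steps; i++) {
        uintptr_t before = (uintptr_t)p;   /* remember the address as an integer */
        char *q = realloc(p, ++size);
        if (q == NULL) {
            free(p);
            return -1;
        }
        if ((uintptr_t)q != before) {
            moves++;
        }
        p = q;
    }
    free(p);
    return moves;
}
```

A count well below the number of steps would suggest that blocks have slack beyond the requested size, which is still not memory you are allowed to use without asking for it.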
You just got lucky there. You are writing to locations that you don't own, which leads to undefined behavior.
On most platforms you cannot allocate just one byte. malloc often also does a bit of housekeeping to remember the amount of allocated memory. As a result, the memory you "allocate" is usually rounded up to the next 4 or 8 bytes. But this is not defined behaviour.
If you use a few bytes more, you'll very likely get an access violation.
To answer your second question, the standard specifically mandates that malloc(0) be legal. Returned value is implementation-dependent, and can be either NULL or a regular memory address. In either case, you can (and should) legally call free on the return value when done. Even when non-NULL, you must not access data at that address.
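In code, handling malloc(0) correctly comes down to the sketch below. The function name zero_size_roundtrip is made up for illustration.

```c
#include <stdlib.h>

/* Hypothetical example: request zero bytes and release the result.
   Returns 1 if this implementation handed back NULL, 0 otherwise. */
int zero_size_roundtrip(void)
{
    void *p = malloc(0);          /* NULL or a unique pointer; never dereference it */
    int was_null = (p == NULL);
    free(p);                      /* valid in both cases; free(NULL) does nothing */
    return was_null;
}
```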
malloc allocates the amount of memory you ask for on the heap and then returns a pointer to void (void *) that can be cast to whatever you want.
It is the responsibility of the programmer to use only the memory that has been allocated.
Writing (and even reading, in a protected environment) where you are not supposed to can cause all sorts of random problems at execution time. If you are lucky, your program crashes immediately with an exception and you can quite easily find the bug and fix it. If you aren't lucky, it will crash randomly or produce unexpected behavior.
As Murphy's Law says, "Anything that can go wrong, will go wrong", and as a corollary, "It will go wrong at the right time, producing the largest amount of damage".
It is sadly true. The only way to prevent that is to use a language in which you cannot actually do something like that.
Modern languages do not allow the programmer to write to memory where he/she is not supposed to (at least in standard programming). That is how Java got a lot of its traction. I prefer C++ to C. You can still do damage using pointers, but it is less likely. That is the reason why smart pointers are so popular.
To fix these kinds of problems, a debug version of the malloc library can be handy. You need to call a check function periodically to detect whether the memory was corrupted.
When I used to work intensively on C/C++ at work, we used Rational Purify, which in practice replaces the standard malloc (new in C++) and free (delete in C++) and is able to return a quite accurate report on where the program did something it was not supposed to. However, you will never be 100% sure that you do not have any error in your code. If you have a condition that happens extremely rarely, you may not hit it when you execute the program. It will eventually happen in production on the busiest day on the most sensitive data (according to Murphy's Law ;-)
It could be that you're in Debug mode, where a call to malloc will actually call _malloc_dbg. The debug version will allocate more space than you have requested to cope with buffer overflows. I guess that if you ran this in Release mode you might (hopefully) get a crash instead.
You should use the new and delete operators in C++... and a safe pointer to make sure that operations don't go past the end of the allocated array...
There is no "C runtime". C is glorified assembler. It will happily let you walk all over the address space and do whatever you want with it, which is why it's the language of choice for writing OS kernels. Your program is an example of a heap corruption bug, which is a common security vulnerability. If you wrote a long enough string to that address, you'd eventually overrun the end of the heap and get a segmentation fault, but not before you overwrote a lot of other important things first.
When malloc() doesn't have enough free memory in its reserve pool to satisfy an allocation, it grabs pages from the kernel in chunks of at least 4 kb, and often much larger, so you're probably writing into reserved but un-malloc()ed space when you initially exceed the bounds of your allocation, which is why your test case always works. Actually honoring allocation addresses and sizes is completely voluntary, so you can assign a random address to a pointer, without calling malloc() at all, and start working with that as a character string, and as long as that random address happens to be in a writable memory segment like the heap or the stack, everything will seem to work, at least until you try to use whatever memory you were corrupting by doing so.
strcpy() doesn't check if the memory it's writing to is allocated. It just takes the destination address and writes the source character by character until it reaches the '\0'. So, if the destination memory allocated is smaller than the source, you just wrote over memory. This is a dangerous bug because it is very hard to track down.
puts() writes the string until it reaches '\0'.
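Since strcpy() will not check for you, the destination has to be sized up front. A minimal sketch, with the helper name copy_string made up for illustration:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy s into a freshly allocated buffer that is
   large enough for every character plus the terminating '\0'. */
char *copy_string(const char *s)
{
    size_t len = strlen(s) + 1;      /* +1 for the terminator */
    char *copy = malloc(len);
    if (copy != NULL) {
        memcpy(copy, s, len);
    }
    return copy;
}
```

For the string in the question this allocates 10 bytes rather than 1, and the caller frees the result when done.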
My guess is that malloc(0) simply returns NULL and does not cause a run-time error.
My answer is in response to "Why does printf not seg fault or produce garbage?"
From
The C Programming Language by Brian Kernighan and Dennis Ritchie
typedef long Align; /* for alignment to long boundary */
union header { /* block header */
struct {
union header *ptr; /* next block if on free list */
unsigned size; /* size of this block */
} s;
Align x; /* force alignment of blocks */
};
typedef union header Header;
The Align field is never used; it just forces each header to be aligned on a worst-case boundary.
In malloc, the requested size in characters is rounded up to the proper number of header-sized units; the block that will be allocated contains
one more unit, for the header itself, and this is the value recorded in the
size field of the header.
The pointer returned by malloc points at the free space, not at the header itself.
The user can do anything with the space requested, but if anything is written outside of the allocated space the list is likely to be scrambled.
points to next free block
   |
-----------------------------------------
|      | SIZE |                         |
-----------------------------------------
              ^
              address returned to user

a block returned by malloc
In statement
char* test = malloc(1);
malloc() searches the heap section of RAM for enough consecutive free bytes and, if they are available, returns the address as shown below:
--------------------------------------------------------------
| free memory | memory in size allocated for user |          |
--------------------------------------------------------------
              ^
              0x100 (assume this is the address returned by malloc and stored in test)
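So when malloc(1) is executed, it doesn't allocate just 1 byte; it allocates some extra bytes to maintain the structure/heap table above. On an allocator laid out like this, you can peek at the bookkeeping data just before the block, for example by printing test[-1]. Note that this reads outside the allocated block, so it is undefined behavior, the layout is implementation-specific, and a single char can only show one byte of the size field.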
char* test = malloc(1);
printf("memory allocated in bytes = %d\n",test[-1]);
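If the goal is only to see how much room the allocator really set aside, glibc has a documented way to ask that does not read outside the block. The sketch below assumes glibc (malloc_usable_size and <malloc.h> are not standard C), and the helper name usable_bytes_for is made up.

```c
#include <malloc.h>   /* glibc-specific header for malloc_usable_size() */
#include <stdlib.h>

/* Hypothetical helper: report how many bytes glibc says are usable in a
   block that was requested with `request` bytes. Returns 0 on failure. */
size_t usable_bytes_for(size_t request)
{
    void *p = malloc(request);
    if (p == NULL) {
        return 0;
    }
    size_t usable = malloc_usable_size(p);
    free(p);
    return usable;
}
```

The number is meant for diagnostics; code should still treat the requested size as the limit.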
As for realloc(): on some implementations, if the size passed is zero and ptr is not NULL, then the call is equivalent to free(ptr).

Weird behavior of malloc()

Trying to understand answers to my question
what happens when tried to free memory allocated by heap manager, which allocates more than asked for?
I wrote this function and puzzled by its output
int main(int argc,char **argv){
char *p,*q;
p=malloc(1);
strcpy(p,"01234556789abcdefghijklmnopqrstuvwxyz"); //since malloc allocates atleast 1 byte
q=malloc(2);
// free(q);
printf("q=%s\n",q);
printf("p=%s\n",p);
return 0;
}
Output
q=vwxyz
p=01234556789abcdefghijklm!
Can any one explain this behavior? or is this implementation specific?
also if free(q) is uncommented, I am getting SIGABRT.
You are copying more bytes to *p than you have allocated, overwriting whatever might have been at the memory locations after the allocated space.
When you then call malloc again, it takes a part of memory it knows to be unused at the moment (which happens to be a few bytes after *p this time), writes some bookkeeping information there and returns a new pointer to that location.
The bookkeeping information malloc writes happens to start with a '!' in this run, followed by a zero byte, so your first string is truncated. The new pointer happens to point to the end of the memory you overwrote before.
All this is implementation specific and might lead to different results each run or depending on the phase of the moon. The second call to malloc() would also absolutely be in its right to just crash the program in horrible ways (especially since you might be overwriting memory that malloc uses internally).
You are just lucky this time: this is undefined behavior, so don't count on it.
Usually, but depending on the OS, memory is allocated in "pages" (i.e. multiple bytes). malloc(), on the other hand, allocates memory from those "pages" in a more "granular" way: there is "overhead" associated with each allocation managed through malloc.
The signal you are getting from free is most probably related to the fact that you mess up the memory management by writing past what you were allocated with p i.e. writing on the overhead information used by the memory manager to keep track of memory blocks etc.
This is a classic heap overflow. p has only 1 byte, but the heap manager pads the allocation (32 bytes in your case). q is allocated right after p, so it naturally gets the next available spot. For example, if the address of p is 0x1000, the address that gets assigned to q is 0x1020. This explains why q points to part of the string.
The more interesting question is why p is only "01234556789abcdefghijklm" and not "01234556789abcdefghijklmnopqrstuvwxyz". The reason is that memory manager uses the gaps between allocation for its internal bookkeeping. From a memory manager perspective the memory layout is as following:
p D q
where D is the internal data structure of the memory manager (0x1010 to 0x1020 in our example). While allocating memory for q, the heap manager writes its data to the bookkeeping area (0x1010 to 0x1020). A byte that gets changed to 0 truncates the string, since it is treated as the null terminator.
THE VALUE OF "p":
you allocated enough space to fit this: ""
[[ strings are null terminated, remember? you don't see it, but it's there -- so that's one byte used up. ]]
but you are trying to store this: "01234556789abcdefghijklmnopqrstuvwxyz"
the result, therefore, is that the "stuff" starting with "123.." is being stored beyond the memory you allocated -- possibly writing over other "stuff" elsewhere. as such your results will be messy, and as "jidupont" said you're lucky that it doesn't just crash.
OUTPUT OF PRINTING [BROKEN] "p"
as said, you've written way past the end of "p"; but malloc doesn't know this. so when you asked for another block of memory for "q", maybe it gave you the memory following what it gave you for "p"; and maybe it aligned the memory (typical) so its pointer is rounded up to some nice number; and then maybe it uses some of this memory to store bookkeeping information you're not supposed to be concerned with. but you don't know, do you? you're not supposed to know either -- you're just not supposed to write to memory that you haven't allocated yourself!
and the result? you see some of what you expected -- but it's truncated! because ... another block was perhaps allocated IN the memory you used (and used without permission, i might add), or something else owned that block and changed it, and in any case some values were changed -- resulting in: "01234556789abcdefghijklm!". again, lucky that things didn't just explode.
FREEING "q"
if you free "q", then try to access it -- as you are doing by trying to print it -- you will (usually) get a nasty error. this is well deserved. you shouldn't uncomment that "free(q)". but you also shouldn't try to print "q", because you haven't put anything there yet! for all you know, it might contain gibberish, and so print will continue until it encounters a NULL -- which may not happen until the end of the world -- or, more likely, until your program accesses yet more memory that it shouldn't, and crashes because the OS is not happy with you. :)
It shouldn't be that puzzling that intentionally misusing these functions will give nonsensical results.
Two consecutive mallocs are not guaranteed to give you two consecutive areas of memory. malloc may choose to allocate more than the amount of memory you requested, but not less if the allocation succeeds. The behavior of your program when you choose to overwrite unallocated memory is not guaranteed to be predictable.
This is just the way C is. You can easily misuse the returned memory areas from malloc and the language doesn't care. It just assumes that in a correct program you will never do so, and everything else is up for grabs.
Malloc is a function just like yours :)
There are a lot of malloc implementations, so I won't go into useless details.
On the first call, malloc asks the system for memory. For this example, let's say 4096 bytes, which is a common memory page size. So you call malloc asking for 1 byte. malloc will ask the system for 4096 bytes. Next, it will use a small part of this memory to store internal data, such as the positions of the available blocks. Then it will cut off one part of this block and hand it back to you.
An internal algorithm will try to reuse blocks after a call to free, to avoid asking the system for memory again.
So with this little explanation you can now understand why your code is working.
You are writing in the memory that malloc asked the system for. This behavior doesn't bother the system, because you stay within the memory allocated to your process. The problem is that you can't know for sure that you are not writing over critical parts of your program's memory. This kind of error is called a buffer overflow, and it causes most of the "mystical bugs".
The best way to avoid them is to use valgrind on Linux. This tool will tell you if you are writing or reading where you are not supposed to.
Is that clear enough?
I suggest reading this introduction.
Pointers And Memory
It helped me understand the difference between stack and heap allocation, very good introduction.

Memory leak question in C after moving pointer (What exactly is deallocated?)

I realize the code sample below is something you should never do. My question is just one of interest. If you allocate a block of memory, and then move the pointer (a no-no), when you deallocate the memory, what is the size of the block that is deallocated, and where is it in memory? Here's the contrived code snippet:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
char* s = malloc(1024);
strcpy(s, "Some string");
// Advance the pointer...
s += 5;
// Prints "string"
printf("%s\n", s);
/*
* What exactly are the beginning and end points of the memory
* block now being deallocated?
*/
free(s);
return 0;
}
Here is what I think happens. The memory block being deallocated begins with the byte that holds the letter "s" in "string". The 5 bytes that held "Some " are now lost.
What I'm wondering is: Are the 5 bytes whose location in memory immediately follows the end of the original 1024 bytes deallocated as well, or are they just left alone?
Does anyone know for sure what it is the compiler does? Is it undefined?
Thanks.
You cannot pass a pointer that was not obtained from a malloc, calloc or realloc to free (except NULL).
Question 7.19 in the C FAQ is relevant to your question.
The consequences of invoking undefined behavior are explained here.
It's undefined behavior in the standard, so you can't rely on anything.
Remember that blocks are artificially delimited areas of memory, and don't automatically show up. Something has to keep track of the block, in order to free everything necessary and nothing more. There's no possible termination, like C strings, since there's no value or combination of values that can be guaranteed not to be inside the block.
Last I looked, there were two basic implementation practices.
One is to keep a separate record of allocated blocks, along with the address allocated. The free() function looks up the block to see what to free. In this case, it's likely to simply not find it, and may well just do nothing. Memory leak. There are, however, no guarantees.
One is to keep the block information in a part of memory just before the allocation address. In this case, free() is using part of the block as a block descriptor, and depending on what's stored there (which could be anything) it will free something. It could be an area that's too small, or an area that's too large. Heap corruption is very likely.
So, I'd expect either a memory leak (nothing gets freed), or heap corruption (too much is marked free, and then reallocated).
Yes, it is undefined behavior. You're essentially freeing a pointer you didn't malloc.
You cannot pass a pointer you did not obtain from malloc (or calloc or realloc...) to free. That includes offsets into blocks you did obtain from malloc. Breaking this rule could result in anything happening. Usually this ends up being the worst imaginable possibility at the worst possible moment.
As a further note, if you wanted to truncate the block, there's a legal way to do this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main() {
char *new_s;
char *s = malloc(1024);
strcpy(s, "Some string");
new_s = realloc(s, 5);
if (!new_s) {
printf("Out of memory! How did this happen when we were freeing memory? What a cruel world!\n");
abort();
}
s = new_s;
s[4] = 0; // put the null terminator back on
printf("%s\n", s); // prints Some
free(s);
return 0;
}
realloc works both to enlarge and shrink memory blocks, but may (or may not) move the memory to do so.
It is not the compiler that does it, it is the standard library. The behavior is undefined. The library knows that it allocated the original s to you. The s+5 is not assigned to any memory block known by the library, even though it happens to be inside a known block. So, it won't work.
What I'm wondering is: Are the 5 bytes whose location in memory immediately follows the end of the original 1024 bytes deallocated as well, or are they just left alone?
Both. The result is undefined so a compiler is free to do either of those, or anything else they'd like really. Of course (as with all cases of "undefined behavior") for a particular platform and compiler there is a specific answer, but any code that relies on such behavior is a bad idea.
Calling free() on a ptr that wasn't allocated by malloc or its brethren is undefined.
Most implementations of malloc allocate a small (typically 4-byte) header region immediately before the ptr returned. Which means when you allocated 1024 bytes, malloc actually reserved 1028 bytes. When free( ptr ) is called, if ptr is not 0, it inspects the data at ptr - sizeof(header). Some allocators implement a sanity check, to make sure it's a valid header, which might detect a bad ptr and assert or exit. If there is no sanity check, or it erroneously passes, the free routine will act on whatever data happens to be in the header.
Adding to the more formal answers: I'd compare the mechanics of this to one taking a book in the library (malloc), then tearing off a few dozen pages together with the cover (advance the pointer), and then attempting to return it (free).
You might find a librarian (malloc/free library implementation) that takes such a book back, but in a lot of cases I'd expect you would pay a fine for negligent handling.
In the draft of C99 (I don't have the final C99 handy in front of me), there is something to say on this topic:
The free function causes the space pointed to by ptr to be deallocated,
that is, made available for further allocation. If ptr is a null pointer, no action
occurs. Otherwise, if the argument does not match a pointer earlier returned
by the calloc, malloc, or realloc function, or if the space has been
deallocated by a call to free or realloc, the behaviour is undefined.
In my experience, a double free or the free of a "pointer" that was not returned via malloc will result in memory corruption and/or a crash, depending on your malloc implementation. Security people on both sides of the fence have exploited this behaviour more than once in order to do various interesting things, at least in early versions of the widely used Doug Lea's malloc package.
The library implementation might put some data structure before the pointer it returns to you. Then in free() it decrements the pointer to get at the data structure telling it how to place the memory back into the free pool. So the 5 bytes at the beginning of your string "Some " is interpreted as the end of the struct used by the malloc() algorithm. Perhaps the end of a 32 bit value, like the size of memory allocated, or a link in a linked list. It depends on the implementation. Whatever the details, it'll just crash your program. As Sinan points out, if you're lucky!
Let's be smart here... free() is not a black hole. At the very least, you have the CRT source code. Beyond that, you need the kernel source code.
Sure, the behavior is undefined in that it is up to the CRT/OS to decide what to do. But that doesn't prevent you from finding out what your platform actually does.
A quick look into the Windows CRT shows that free() leads right to HeapFree() using a CRT specific heap. Beyond that you're into RtlFreeHeap() and then into system space (NTOSKRNL.EXE) with the memory manager Mm*().
There are consistency checks throughout all these code paths. But doing different things to the memory will cause different code paths. Hence the true definition of undefined.
At a quick glance, I can see that an allocated block of memory has a marker at the end. When the memory is freed, each byte is written over with a distinct byte. The runtime can do a check to see if the end of block marker was overwritten and raise an exception if so.
This is a possibility in your case of freeing memory a few bytes into your block (or overwriting your allocated size). Of course you can trick this and write the end of block marker yourself at the correct location. This will get you past the CRT check, but as the code path goes further, more undefined behavior occurs. Three things can happen: 1) absolutely no harm, 2) memory corruption within the CRT heap, or 3) a thrown exception by any of the memory management functions.
Short version: It's undefined behavior.
Long version: I checked the CWE site and found that, while it's a bad idea all around, nobody seemed to have a solid answer. Probably because it's undefined.
My guess is that most implementations, assuming they don't crash, would either free 1019 bytes (in your example), or else free 1024 and get a double free or similar on the last five. Just speaking theoretically for now, it depends on whether the malloc routine's internal storage tables contains an address and a length, or a start address and an end address.
In any case, it's clearly not a good idea. :-)

Resources