Negative indexing; gives error upon freeing - c

In C, I encountered an error when I coded the following example:
int *pointer;
int i = 0;
pointer = malloc(10 * sizeof(int));
pointer[i - 1] = 4;
Clearly i is a negative index into pointer.
Even if the incorrect piece memory was altered, why was an error only triggered upon free(pointer) [later on in the code]? The error was: double free or corruption (out).
EDIT: The compiler used was gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3

Most if not all memory managers (eg: malloc) allocate extra data around the memory you've requested. This extra is to help the memory manager manage the various allocations by adding some of its own data to the allocation. When you did your negative index, you overwrote this extra data. It's not invalid memory per se, so the CPU can't protect against it, but malloc will indeed complain when it sees that its data has been borked. That will occur on free().

Fundamentally, accessing pointer[-1] the way you did yields Undefined Behavior; in theory, anything could happen. But in practice, there’s a good explanation for the particular error message you got.
Recall that malloc has to save the size of the allocated chunk of memory somewhere so that free can work correctly. On some systems — and it seems yours is among that bunch — this information is stored in memory just before the address that malloc returns. (It might be interesting to print out that value, if you want to learn about your system’s implementation.)
When you overwrote that chunk of memory, you left the internals of the malloc-free system in an inconsistent state, leading free to report an error when this was discovered. The most likely time for such a discovery, of course, is when the memory management functions are called about this area of memory. (Some malloc corruptions will be detected the next time you call malloc, but that usually takes more deliberate effort to achieve.) Calling realloc should give you a similar error message.

Related

How can a memory block created using malloc store more memory than it was initialized with? [duplicate]

when I try the code below it works fine. Am I missing something?
main()
{
int *p;
p=malloc(sizeof(int));
printf("size of p=%d\n",sizeof(p));
p[500]=999999;
printf("p[0]=%d",p[500]);
return 0;
}
I tried it with malloc(0*sizeof(int)) or anything but it works just fine. The program only crashes when I don't use malloc at all. So even if I allocate 0 memory for the array p, it still stores values properly. So why am I even bothering with malloc then?
It might appear to work fine, but it isn't very safe at all. By writing data outside the allocated block of memory you are overwriting some data you shouldn't. This is one of the greatest causes of segfaults and other memory errors, and what you're observing with it appearing to work in this short program is what makes it so difficult to hunt down the root cause.
Read this article, in particular the part on memory corruption, to begin understanding the problem.
Valgrind is an excellent tool for analysing memory errors such as the one you provide.
#David made a good comment. Compare the results of running your code to running the following code. Note the latter results in a runtime error (with pretty much no useful output!) on ideone.com (click on links), whereas the former succeeds as you experienced.
int main(void)
{
int *p;
p=malloc(sizeof(int));
printf("size of p=%d\n",sizeof(p));
p[500]=999999;
printf("p[0]=%d",p[500]);
p[500000]=42;
printf("p[0]=%d",p[500000]);
return 0;
}
If you don't allocate memory, p has garbage in it, so writing to it likely will fail. Once you made a valid malloc call, p is pointing to valid memory location and you can write to it. You are overwriting memory that you shouldn't write to, but nobody's going to hold your hand and tell you about it. If you run your program and a memory debugger such as valgrind, it will tell you.
Welcome to C.
Writing past the end of your memory is Undefined Behaviour™, which means that anything could happen- including your program operating as if what you just did was perfectly legal. The reason for your program running as if you had done malloc(501*sizeof(int)) are completely implementation-specific, and can indeed be specific to anything, including the phase of the moon.
This is because P would be assigned some address no matter what size you use with malloc(). Although, with a zero size you would be referencing invalid memory as the memory hasn't been allocated, but it may be within a location which wouldn't cause program crash, though the behavior will be undefined.
Now if you do not use malloc(), it would be pointing to a garbaging location and trying to access that is likely to cause program crash.
I tried it with malloc(0*sizeof(int))
According to C99 if the size passed to malloc is 0, a C runtime can return either a NULL pointer or the allocation behaves as if the request was for non-zero allocation, except that the returned pointer should not be dereferenced. So it is implementation defined (e.g. some implementations return a zero-length buffer) and in your case you do not get a NULL pointer back, but you are using a pointer you should not be using.If you try it in a different runtime you could get a NULL pointer back.
When you call malloc() a small chunk of memory is carved out of a larger page for you.
malloc(sizeof(int));
Does not actually allocate 4 bytes on a 32bit machine (the allocator pads it up to a minimum size) + size of heap meta data used to track the chunk through its lifetime (chunks are placed in bins based on their size and marked in-use or free by the allocator). hxxp://en.wikipedia.org/wiki/Malloc or more specifically hxxp://en.wikipedia.org/wiki/Malloc#dlmalloc_and_its_derivatives if you're testing this on Linux.
So writing beyond the bounds of your chunk doesn't necessarily mean you are going to crash. At p+5000 you are not writing outside the bounds of the page allocated for that initial chunk so you are technically writing to a valid mapped address. Welcome to memory corruption.
http://www.google.com/search?sourceid=chrome&ie=UTF-8&q=heap+overflows
Our CheckPointer tool can detect this error. It knows that the allocation of p was to a chunk of 4 bytes, and thus the assignment is made, it is outside the area for which p was allocated. It will tell you that the p[500] assignment is wrong.

Malloc -> how much memory has been allocated?

# include <stdio.h>
# include <stdbool.h>
# include <string.h>
# include <stdlib.h>
int main ()
{
char * buffer;
buffer = malloc (2);
if (buffer == NULL){
printf("big errors");
}
strcpy(buffer, "hello");
printf("buffer is %s\n", buffer);
free(buffer);
return 0;
}
I allocated 2 bytes of memory to the pointer/char buffer yet if I assign the C-style string hello to it, it still prints the entire string, without giving me any errors. Why doesn't the compiler give me an error telling me there isn't enough memory allocated? I read a couple of questions that ask how to check how much memory malloc actually allocates but I didn't find a concrete answer. Shouldn't the free function have to know exactly how much memory is allocated to buffer?
The compiler doesn't know. This is the joy and terror of C. malloc belongs to the runtime. All the compilers knows is that you have told it that it returns a void*, it has no idea how much, or how much strcpy is going to copy.
Tools like valgrind detect some of these errors. Other programming languages make it harder to shoot yourself in the foot. Not C.
No production malloc() implementation should prevent you from trying to write past what you allocated. It is assumed that if you allocate 123 bytes, you will use all or less than what you allocated. malloc(), for efficiency sake, has to assume that a programmer is going to keep track of their pointers.
Using memory that you didn't explicitly and successfully ask malloc() to give you is undefined behavior. You might have asked for n bytes but got n + x, due to the malloc() implementation optimizing for byte alignment. Or you could be writing to a black hole. You never can know, that's why it's undefined behavior.
That being said ...
There are malloc() implementations that give you built in statistics and debugging, however these need to be used in lieu of the standard malloc() facility just like you would if you were using a garbage collected variety.
I've also seen variants designed strictly for LD_PRELOAD that expose a function to allow you to define a callback with at least one void pointer as an argument. That argument expects a structure that contains the statistical data. Other tools like electric fence will simply halt your program on the exact instruction that resulted in an overrun or access to invalid blocks. As #R.. points out in comments, that is great for debugging but horribly inefficient.
In all honesty or (as they say) 'at the end of the day' - it's much easier to use a heap profiler such as Valgrind and its associated tools (massif) in this case which will give you quite a bit of information. In this particular case, Valgrind would have pointed out the obvious - you wrote past the allocated boundary. In most cases, however when this is not intentional, a good profiler / error detector is priceless.
Using a profiler isn't always possible due to:
Timing issues while running under a profiler (but those are common any time calls to malloc() are intercepted).
Profiler is not available for your platform / arch
The debug data (from a logging malloc()) must be an integral part of the program
We used a variant of the library that I linked in HelenOS (I'm not sure if they're still using it) for quite a while, as debugging at the VMM was known to cause insanity.
Still, think hard about future ramifications when considering a drop in replacement, when it comes to the malloc() facility you almost always want to use what the system ships.
How much malloc internally allocates is implementation-dependent and OS-dependent (e.g. multiples of 8 bytes or more). Your writing into the un-allocated bytes may lead to overwriting other variable's values even if your compiler and run-time dont detect the error. The free-function remembers the number of bytes allocated separate from the allocated region, for example in a free-list.
Why doesnt the compiler give me an
error telling me there isnt enough
memory allocated ?
C does not block you from using memory you should not. You can use that memory, but it is bad and result in Undefined Behaviour. You are writing in a place you should not. This program might appear as running correctly, but might later crash. This is UB. you do not know what might happen.
This is what is happening with your strcpy(). You write in place you do not own, but the language does not protect you from that. So you should make sure you always know what and where you are writing, or make sure you stop when you are about to exceed valid memory bounds.
I read a couple of questions that ask
how to check how much memory malloc
actually allocates but I didn't find a
concrete answer. Shouldn't the 'free'
function have to know how much memory
is exactly allocated to 'buffer' ?
malloc() might allocate more memory than you request cause of bit padding.
More : http://en.wikipedia.org/wiki/Data_structure_alignment
free() free-s the exact same amount you allocated with malloc(), but it is not as smart as you think. Eg:
int main()
{
char * ptr = malloc(10);
if(ptr)
{
++ptr; // Now on ptr+1
free(ptr); // Undefined Behaviour
}
}
You should always free() a pointer which points to the first block. Doing a free(0) is safe.
You've written past the end of the buffer you allocated. The result is undefined behavior. Some run time libraries with the right options have at least some ability to diagnose problems like this, but not all do, and even those that do only do so at run-time, and usually only when compiled with the correct options.
Malloc -> how much memory has been allocated?
When you allocate memory using malloc. On success it allocates memory and default allocation is 128k. first call to malloc gives you 128k.
what you requested is buffer = malloc (2); Though you requested 2 bytes. It has allocated 128k.
strcpy(buffer, "hello"); Allocated 128k chunk it started processing your request. "Hello"
string can fit into this.
This pgm will make you clear.
int main()
{
int *p= (int *) malloc(2);---> request is only 2bytes
p[0]=100;
p[1]=200;
p[2]=300;
p[3]=400;
p[4]=500;
int i=0;
for(;i<5;i++,p++)enter code here
printf("%d\t",*p);
}
On first call to malloc. It allocates 128k---> from that it process your request (2 bytes). The string "hello" can fit into it. Again when second call to malloc it process your request from 128k.
Beyond 128k it uses mmap interface. You can refer to man page of malloc.
There is no compiler/platform independent way of finding out how much memory malloc actually allocated. malloc will in general allocation slightly more than you ask it for see here:
http://41j.com/blog/2011/09/finding-out-how-much-memory-was-allocated/
On Linux you can use malloc_usable_size to find out how much memory you can use. On MacOS and other BSD platforms you can use malloc_size. The post linked above has complete examples of both these techniques.

Writing to pointer out of bounds after malloc() not causing error

when I try the code below it works fine. Am I missing something?
main()
{
int *p;
p=malloc(sizeof(int));
printf("size of p=%d\n",sizeof(p));
p[500]=999999;
printf("p[0]=%d",p[500]);
return 0;
}
I tried it with malloc(0*sizeof(int)) or anything but it works just fine. The program only crashes when I don't use malloc at all. So even if I allocate 0 memory for the array p, it still stores values properly. So why am I even bothering with malloc then?
It might appear to work fine, but it isn't very safe at all. By writing data outside the allocated block of memory you are overwriting some data you shouldn't. This is one of the greatest causes of segfaults and other memory errors, and what you're observing with it appearing to work in this short program is what makes it so difficult to hunt down the root cause.
Read this article, in particular the part on memory corruption, to begin understanding the problem.
Valgrind is an excellent tool for analysing memory errors such as the one you provide.
#David made a good comment. Compare the results of running your code to running the following code. Note the latter results in a runtime error (with pretty much no useful output!) on ideone.com (click on links), whereas the former succeeds as you experienced.
int main(void)
{
int *p;
p=malloc(sizeof(int));
printf("size of p=%d\n",sizeof(p));
p[500]=999999;
printf("p[0]=%d",p[500]);
p[500000]=42;
printf("p[0]=%d",p[500000]);
return 0;
}
If you don't allocate memory, p has garbage in it, so writing to it likely will fail. Once you made a valid malloc call, p is pointing to valid memory location and you can write to it. You are overwriting memory that you shouldn't write to, but nobody's going to hold your hand and tell you about it. If you run your program and a memory debugger such as valgrind, it will tell you.
Welcome to C.
Writing past the end of your memory is Undefined Behaviour™, which means that anything could happen- including your program operating as if what you just did was perfectly legal. The reason for your program running as if you had done malloc(501*sizeof(int)) are completely implementation-specific, and can indeed be specific to anything, including the phase of the moon.
This is because P would be assigned some address no matter what size you use with malloc(). Although, with a zero size you would be referencing invalid memory as the memory hasn't been allocated, but it may be within a location which wouldn't cause program crash, though the behavior will be undefined.
Now if you do not use malloc(), it would be pointing to a garbaging location and trying to access that is likely to cause program crash.
I tried it with malloc(0*sizeof(int))
According to C99 if the size passed to malloc is 0, a C runtime can return either a NULL pointer or the allocation behaves as if the request was for non-zero allocation, except that the returned pointer should not be dereferenced. So it is implementation defined (e.g. some implementations return a zero-length buffer) and in your case you do not get a NULL pointer back, but you are using a pointer you should not be using.If you try it in a different runtime you could get a NULL pointer back.
When you call malloc() a small chunk of memory is carved out of a larger page for you.
malloc(sizeof(int));
Does not actually allocate 4 bytes on a 32bit machine (the allocator pads it up to a minimum size) + size of heap meta data used to track the chunk through its lifetime (chunks are placed in bins based on their size and marked in-use or free by the allocator). hxxp://en.wikipedia.org/wiki/Malloc or more specifically hxxp://en.wikipedia.org/wiki/Malloc#dlmalloc_and_its_derivatives if you're testing this on Linux.
So writing beyond the bounds of your chunk doesn't necessarily mean you are going to crash. At p+5000 you are not writing outside the bounds of the page allocated for that initial chunk so you are technically writing to a valid mapped address. Welcome to memory corruption.
http://www.google.com/search?sourceid=chrome&ie=UTF-8&q=heap+overflows
Our CheckPointer tool can detect this error. It knows that the allocation of p was to a chunk of 4 bytes, and thus the assignment is made, it is outside the area for which p was allocated. It will tell you that the p[500] assignment is wrong.

I can use more memory than how much I've allocated with malloc(), why?

char *cp = (char *) malloc(1);
strcpy(cp, "123456789");
puts(cp);
output is "123456789" on both gcc (Linux) and Visual C++ Express, does that mean when there is free memory, I can actually use more than what I've allocated with malloc()?
and why malloc(0) doesn't cause runtime error?
Thanks.
You've asked a very good question and maybe this will whet your appetite about operating systems. Already you know you've managed to achieve something with this code that you wouldn't ordinarily expect to do. So you would never do this in code you want to make portable.
To be more specific, and this depends entirely on your operating system and CPU architecture, the operating system allocates "pages" of memory to your program - typically this can be in the order of 4 kilobytes. The operating system is the guardian of pages and will immediately terminate any program that attempts to access a page it has not been assigned.
malloc, on the other hand, is not an operating system function but a C library call. It can be implemented in many ways. It is likely that your call to malloc resulted in a page request from the operating system. Then malloc would have decided to give you a pointer to a single byte inside that page. When you wrote to the memory from the location you were given you were just writing in a "page" that the operating system had granted your program, and thus the operating system will not see any wrong doing.
The real problems, of course, will begin when you continue to call malloc to assign more memory. It will eventually return pointers to the locations you just wrote over. This is called a "buffer overflow" when you write to memory locations that are legal (from an operating system perspective) but could potentially be overwriting memory another part of the program will also be using.
If you continue to learn about this subject you'll begin to understand how programs can be exploited using such "buffer overflow" techniques - even to the point where you begin to write assembly language instructions directly into areas of memory that will be executed by another part of your program.
When you get to this stage you'll have gained much wisdom. But please be ethical and do not use it to wreak havoc in the universe!
PS when I say "operating system" above I really mean "operating system in conjunction with privileged CPU access". The CPU and MMU (memory management unit) triggers particular interrupts or callbacks into the operating system if a process attempts to use a page that has not been allocated to that process. The operating system then cleanly shuts down your application and allows the system to continue functioning. In the old days, before memory management units and privileged CPU instructions, you could practically write anywhere in memory at any time - and then your system would be totally at the mercy of the consequences of that memory write!
No. You get undefined behavior. That means anything can happen, from it crashing (yay) to it "working" (boo), to it reformatting your hard drive and filling it with text files that say "UB, UB, UB..." (wat).
There's no point in wondering what happens after that, because it depends on your compiler, platform, environment, time of day, favorite soda, etc., all of which can do whatever they want as (in)consistently as they want.
More specifically, using any memory you have not allocated is undefined behavior. You get one byte from malloc(1), that's it.
When you ask malloc for 1 byte, it will probably get 1 page (typically 4KB) from the operating system. This page will be allocated to the calling process so as long as you don't go out of the page boundary, you won't have any problems.
Note, however, that it is definitely undefined behavior!
Consider the following (hypothetical) example of what might happen when using malloc:
malloc(1)
If malloc is internally out of memory, it will ask the operating system some more. It will typically receive a page. Say it's 4KB in size with addresses starting at 0x1000
Your call returns giving you the address 0x1000 to use. Since you asked for 1 byte, it is defined behavior if you only use the address 0x1000.
Since the operating system has just allocated 4KB of memory to your process starting at address 0x1000, it will not complain if you read/write something from/to addresses 0x1000-0x1fff. So you can happily do so but it is undefined behavior.
Let's say you do another malloc(1)
Now malloc still has some memory left so it doesn't need to ask the operating system for more. It will probably return the address 0x1001.
If you had written to more than 1 byte using the address given from the first malloc, you will get into troubles when you use the address from the second malloc because you will overwrite the data.
So the point is you definitely get 1 byte from malloc but it might be that malloc internally has more memory allocated to you process.
No. It means that your program behaves badly. It writes to a memory location that it does not own.
You get undefined behavior - anything can happen. Don't do it and don't speculate about whether it works. Maybe it corrupts memory and you don't see it immediately. Only access memory within the allocated block size.
You may be allowed to use until the memory reaches some program memory or other point at which your applicaiton will most likely crash for accessing protected memory
So many responses and only one that gives the right explanation. While the page size, buffer overflow and undefined behaviour stories are true (and important) they do not exactly answer the original question. In fact any sane malloc implementation will allocate at least in size of the alignment requirement of an intor a void *. Why, because if it allocated only 1 byte then the next chunk of memory wouldn't be aligned anymore. There's always some book keeping data around your allocated blocks, these data structures are nearly always aligned to some multiple of 4. While some architectures can access words on unaligned addresses (x86) they do incure some penalties for doing that, so allocator implementer avoid that. Even in slab allocators there's no point in having a 1 byte pool as small size allocs are rare in practice. So it is very likely that there's 4 or 8 bytes real room in your malloc'd byte (this doesn't mean you may use that 'feature', it's wrong).
EDIT: Besides, most malloc reserve bigger chunks than asked for to avoid to many copy operations when calling realloc. As a test you can try using realloc in a loop with growing allocation size and compare the returned pointer, you will see that it changes only after a certain threshold.
You just got lucky there. You are writing to locations which you don't own this leads to undefined behavior.
On most platforms you can not just allocate one byte. There is often also a bit of housekeeping done by malloc to remember the amount of allocated memory. This yields to the fact that you usually "allocate" memory rounded up to the next 4 or 8 bytes. But this is not a defined behaviour.
If you use a few bytes more you'll very likeley get an access violation.
To answer your second question, the standard specifically mandates that malloc(0) be legal. Returned value is implementation-dependent, and can be either NULL or a regular memory address. In either case, you can (and should) legally call free on the return value when done. Even when non-NULL, you must not access data at that address.
malloc allocates the amount of memory you ask in heap and then return a pointer to void (void *) that can be cast to whatever you want.
It is responsibility of the programmer to use only the memory that has been allocate.
Writing (and even reading in protected environment) where you are not supposed can cause all sort of random problems at execution time. If you are lucky your program crash immediately with an exception and you can quite easily find the bug and fix it. If you aren't lucky it will crash randomly or produce unexpected behaviors.
For the Murphy's Law, "Anything that can go wrong, will go wrong" and as a corollary of that, "It will go wrong at the right time, producing the most large amount of damage".
It is sadly true. The only way to prevent that, is to avoid that in the language that you can actually do something like that.
Modern languages do not allow the programmer to do write in memory where he/she is not supposed (at least doing standard programming). That is how Java got a lot of its traction. I prefer C++ to C. You can still make damages using pointers but it is less likely. That is the reason why Smart Pointers are so popular.
In order to fix these kind of problems, a debug version of the malloc library can be handy. You need to call a check function periodically to sense if the memory was corrupted.
When I used to work intensively on C/C++ at work, we used Rational Purify that in practice replace the standard malloc (new in C++) and free (delete in C++) and it is able to return quite accurate report on where the program did something it was not supposed. However you will never be sure 100% that you do not have any error in your code. If you have a condition that happen extremely rarely, when you execute the program you may not incur in that condition. It will eventually happen in production on the most busy day on the most sensitive data (according to Murphy's Law ;-)
It could be that you're in Debug mode, where a call to malloc will actually call _malloc_dbg. The debug version will allocate more space than you have requested to cope with buffer overflows. I guess that if you ran this in Release mode you might (hopefully) get a crash instead.
You should use new and delete operators in c++... And a safe pointer to control that operations doesn't reach the limit of the array allocated...
There is no "C runtime". C is glorified assembler. It will happily let you walk all over the address space and do whatever you want with it, which is why it's the language of choice for writing OS kernels. Your program is an example of a heap corruption bug, which is a common security vulnerability. If you wrote a long enough string to that address, you'd eventually overrun the end of the heap and get a segmentation fault, but not before you overwrote a lot of other important things first.
When malloc() doesn't have enough free memory in its reserve pool to satisfy an allocation, it grabs pages from the kernel in chunks of at least 4 kb, and often much larger, so you're probably writing into reserved but un-malloc()ed space when you initially exceed the bounds of your allocation, which is why your test case always works. Actually honoring allocation addresses and sizes is completely voluntary, so you can assign a random address to a pointer, without calling malloc() at all, and start working with that as a character string, and as long as that random address happens to be in a writable memory segment like the heap or the stack, everything will seem to work, at least until you try to use whatever memory you were corrupting by doing so.
strcpy() doesn't check if the memory it's writing to is allocated. It just takes the destination address and writes the source character by character until it reaches the '\0'. So, if the destination memory allocated is smaller than the source, you just wrote over memory. This is a dangerous bug because it is very hard to track down.
puts() writes the string until it reaches '\0'.
My guess is that malloc(0) only returns NULL and not cause a run-time error.
My answer is in responce to Why does printf not seg fault or produce garbage?
From
The C programming language by Denis Ritchie & Kernighan
typedef long Align; /* for alignment to long boundary */
union header { /* block header */
struct {
union header *ptr; /* next block if on free list */
unsigned size; /* size of this block */
} s;
Align x; /* force alignment of blocks */
};
typedef union header Header;
The Align field is never used;it just forces each header to be aligned on a worst-case boundary.
In malloc,the requested size in characters is rounded up to the proper number of header-sized units; the block that will be allocated contains
one more unit, for the header itself, and this is the value recorded in the
size field of the header.
The pointer returned by malloc points at the free space, not at the header itself.
The user can do anything with the space requested, but if anything is written outside of the allocated space the list is likely to be scrambled.
-----------------------------------------
| | SIZE | |
-----------------------------------------
| |
points to |-----address returned touser
next free
block
-> a block returned by malloc
In statement
char* test = malloc(1);
malloc() will try to search consecutive bytes from the heap section of RAM if requested bytes are available and it returns the address as below
--------------------------------------------------------------
| free memory | memory in size allocated for user | |
----------------------------------------------------------------
0x100(assume address returned by malloc)
test
So when malloc(1) executed it won't allocate just 1 byte, it allocated some extra bytes to maintain above structure/heap table. you can find out how much actual memory allocated when you requested only 1 byte by printing test[-1] because just to before that block contain the size.
char* test = malloc(1);
printf("memory allocated in bytes = %d\n",test[-1]);
If the size passed is zero, and ptr is not NULL then the call is equivalent to free.

Memory leak question in C after moving pointer (What exactly is deallocated?)

I realize the code sample below is something you should never do. My question is just one of interest. If you allocate a block of memory, and then move the pointer (a no-no), when you deallocate the memory, what is the size of the block that is deallocated, and where is it in memory? Here's the contrived code snippet:
#include <stdio.h>
#include <string.h>
int main(void) {
char* s = malloc(1024);
strcpy(s, "Some string");
// Advance the pointer...
s += 5;
// Prints "string"
printf("%s\n", s);
/*
* What exactly are the beginning and end points of the memory
* block now being deallocated?
*/
free(s);
return 0;
}
Here is what I think I happens. The memory block being deallocated begins with the byte that holds the letter "s" in "string". The 5 bytes that held "Some " are now lost.
What I'm wondering is: Are the 5 bytes whose location in memory immediately follows the end of the original 1024 bytes deallocated as well, or are they just left alone?
Anyone know for sure what is it the compiler does? Is it undefined?
Thanks.
You cannot pass a pointer that was not obtained from a malloc, calloc or realloc to free (except NULL).
Question 7.19 in the C FAQ is relevant to your question.
The consequences of invoking undefined behavior are explained here.
It's undefined behavior in the standard, so you can't rely on anything.
Remember that blocks are artificially delimited areas of memory, and don't automatically
show up. Something has to keep track of the block, in order to free everything necessary and nothing more. There's no possible termination, like C strings, since there's no value or combination of values that can be guaranteed not to be inside the block.
Last I looked, there were two basic implementation practices.
One is to keep a separate record of allocated blocks, along with the address allocated. The free() function looks up the block to see what to free. In this case, it's likely to simply not find it, and may well just do nothing. Memory leak. There are, however, no guarantees.
One is to keep the block information in a part of memory just before the allocation address. In this case, free() is using part of the block as a block descriptor, and depending on what's stored there (which could be anything) it will free something. It could be an area that's too small, or an area that's too large. Heap corruption is very likely.
So, I'd expect either a memory leak (nothing gets freed), or heap corruption (too much is marked free, and then reallocated).
Yes, it is undefined behavior. You're essentially freeing a pointer you didn't malloc.
You cannot pass a pointer you did not obtain from malloc (or calloc or realloc...) to free. That includes offsets into blocks you did obtain from malloc. Breaking this rule could result in anything happening. Usually this ends up being the worst imaginable possibility at the worst possible moment.
As a further note, if you wanted to truncate the block, there's a legal way to do this:
#include <stdio.h>
#include <string.h>
int main() {
char *new_s;
char *s = malloc(1024);
strcpy(s, "Some string");
new_s = realloc(s, 5);
if (!new_s) {
printf("Out of memory! How did this happen when we were freeing memory? What a cruel world!\n");
abort();
}
s = new_s;
s[4] = 0; // put the null terminator back on
printf("%s\n", s); // prints Some
free(s);
return 0;
}
realloc works both to enlarge and shrink memory blocks, but may (or may not) move the memory to do so.
It is not the compiler that does it, it is the standard library. The behavior is undefined. The library knows that it allocated the original s to you. The s+5 is not assigned to any memory block known by the library, even though it happens to be inside a known block. So, it won't work.
What I'm wondering is: Are the 5 bytes whose location in memory immediately follows the end of the original 1024 bytes deallocated as well, or are they just left alone?
Both. The result is undefined so a compiler is free to do either of those, or anything else they'd like really. Of course (as with all cases of "undefined behavior") for a particular platform and compiler there is a specific answer, but any code that relies on such behavior is a bad idea.
Calling free() on a ptr that wasnt allocated by malloc or its brethren is undefined.
Most implementations of malloc allocate a small (typically 4byte) header region immediately before the ptr returned. Which means when you allocated 1024 bytes, malloc actually reserved 1028 bytes. When free( ptr ) is called, if ptr is not 0, it inspects the data at ptr - sizeof(header). Some allocators implement a sanity check, to make sure its a valid header, and which might detect a bad ptr, and assert or exit. If there is no sanity check, or it erroneously passes, free routine will act on whatever data happens to be in the header.
Adding to the more formal answers: I'd compare the mechanics of this to one taking a book in the library (malloc), then tearing off a few dozen pages together with the cover (advance the pointer), and then attempting to return it (free).
You might find a librarian (malloc/free library implementation) that takes such a book back, but in a lot of case I'd expect you would pay a fine for negligent handling.
In the draft of C99 (I don't have the final C99 handy in front of me), there is something to say on this topic:
The free function causes the space pointed to by ptr to be deallocated,
that is, made available for further allocation. If ptr is a null pointer, no action
occurs. Otherwise, if the argument does not match a pointer earlier returned
by the calloc, malloc, or realloc function, or if the space has been
deallocated by a call to free or realloc, the behaviour is undefined.
In my experience, a double free or the free of the "pointer" that was not returned via malloc will result in a memory corruption and/or crash, depending on your malloc implementation. The security people from both sides of the fence used this behaviour not once, in order to do various interesting things at least in early versions of the widely used Doug Lea's malloc package.
The library implementation might put some data structure before the pointer it returns to you. Then in free() it decrements the pointer to get at the data structure telling it how to place the memory back into the free pool. So the 5 bytes at the beginning of your string "Some " is interpreted as the end of the struct used by the malloc() algorithm. Perhaps the end of a 32 bit value, like the size of memory allocated, or a link in a linked list. It depends on the implementation. Whatever the details, it'll just crash your program. As Sinan points out, if you're lucky!
Let's be smart here... free() is not a black hole. At the very least, you have the CRT source code. Beyond that, you need the kernel source code.
Sure, the behavior is undefined in that it is up to the CRT/OS to decide what to do. But that doesn't prevent you from finding out what your platform actualy does.
A quick look into the Windows CRT shows that free() leads right to HeapFree() using a CRT specific heap. Beoyond that you're into RtlHeapFree() and then into system space (NTOSKRN.EXE) with the memory manager Mm*().
There are consistancey checks throughout all these code paths. But doing differnt things to the memory will cause differnt code paths. Hence the true definition of undefined.
At a quick glance, I can see that an allocated block of memory has a marker at the end. When the memory is freed, each byte is written over with a distinct byte. The runtime can do a check to see if the end of block marker was overwritten and raise an exception if so.
This is a posiblility in your case of freeing memory a few bytes into your block (or over-writing your allocated size). Of course you can trick this and write the end of block marker yourself at the correct location. This will get you past the CRT check, but as the code-path goes futher, more undefined behavoir occurs. Three things can happen: 1) absolutely no harm, 2) memory corruption within the CRT heap, or 3) a thrown exception by any of the memory management functions.
Short version: It's undefined behavior.
Long version: I checked the CWE site and found that, while it's a bad idea all around, nobody seemed to have a solid answer. Probably because it's undefined.
My guess is that most implementations, assuming they don't crash, would either free 1019 bytes (in your example), or else free 1024 and get a double free or similar on the last five. Just speaking theoretically for now, it depends on whether the malloc routine's internal storage tables contains an address and a length, or a start address and an end address.
In any case, it's clearly not a good idea. :-)

Resources