free() replaces string with zero? - c

I could use a little help with free().
When I run the following:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main(int argc, const char *argv[])
{
char *mystring = malloc(6 * sizeof(char));
strcpy(mystring, "Hello");
printf("%p\n", mystring);
printf("%s\n", mystring);
free(mystring);
printf("%p\n", mystring);
printf("%d\n", *mystring);
return 0;
}
I get:
0x8f46008
Hello
0x8f46008
0
Did free() replace the string 'Hello' from memory with zero?
Note: This is just for academic purposes. I would never reference freed memory for real.
Thanks,
Frank

Once you have freed a block of memory, reading that memory again results in undefined behavior and is a serious security and stability hazard. You cannot count on anything holding true for memory that has been freed, so there is no guarantee whether the memory will be zeroed or not. Unless you are absolutely sure of what you're doing, don't reference memory after you've freed it.
As an amusing anecdote about this, the original SimCity game had a bug in it where it referenced memory that had been freed. In DOS, this ended up not causing a crash, but when it was ported to Windows the game started crashing repeatedly. The Windows team had to specifically build in a case into the OS such that if SimCity was run, a custom memory manager would be used to prevent this sort of crash. You can read the full story here.
In short, once it's freed, don't touch it. Otherwise you risk bugs that some poor programmer years down the line will have to fix for you. Tools like valgrind exist to detect these sorts of errors specifically because they're so nasty.
Hope this helps!

The contents of mystring (*mystring, mystring[0], and friends) are undefined after you free the memory. You cannot rely on it containing "Hello". You also cannot rely on it containing an ASCII NUL (as you see here).
You also cannot rely on reading it not causing a segmentation fault. Don't do it.
If you were to run this program in a memory checker like valgrind, you would see an error here about access to freed memory.

Maybe it did, in debug mode on your specific computer and compiled with your specific compiler. In general, though, you should expect that piece of code to crash (or worse).

Related

Using free(), what is happening later?

From what I have read, you need to call free(), but what happens next? Say I have something like this:
char word[] = "abc";
char *copy;
copy = (char*) malloc(sizeof(char) * (strlen(word) + 1));
strcpy(copy, word);
free(copy);
printf("%s", copy);
It prints "abc". Why?
After using free(), your pointer copy still points to the same memory location. free() does not actually delete what is written there in memory but rather tells the memory management that you do not need that part of memory anymore.
That is why it still outputs abc. However, your OS could have reassigned that memory to another application or some new thing you allocate in your application. If you are unlucky, you will get a segmentation fault.
free() deallocates the memory previously allocated by calloc, malloc, or realloc. You should not access memory that has been freed, as the behaviour is not defined. It's only a coincidence that it still holds its previous content.
It is a good idea to use tools such as valgrind, which can tell you (among other things) whether or not you are trying to access deallocated memory. In a Linux terminal, you can do it like this:
valgrind ./yourProgram
It is explained quite well here:
C Reference -- free()
Deallocating memory does not mean the data is gone.
The memory is just free for new allocations.
Accessing it will result in undefined behavior.
As others have said, the behavior is undefined when the code references a freed pointer. In this case you are reading it. Writing to it is just as undefined: you may see a segmentation fault, or the write may silently corrupt the heap.
I recommend that you run it with the MALLOCDEBUG (e.g. on AIX it would be MALLOCDEBUG=validate_ptrs) or a similar environment variable on your platform, so that you will catch this error. However turning on MALLOCDEBUG can have a serious performance impact on your program. An alternative is to write your own free routine that also sets the freed pointer to NULL explicitly as shown below:
#define MYFREE(x) do { free((x)); (x) = NULL; } while(0)

Odd behavior regarding malloc()

Why does this work?
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main()
{
char * abc = malloc(1) + 4; //WRONG use of malloc.
char * xyz = "abc";
strcpy(abc, xyz); //Should fail.
printf("%s\n", abc); //Prints abc
}
I would expect the strcpy to fail for not having enough memory, as I'm passing in 1 to the argument of malloc(). Instead, this compiles and runs flawlessly (in both GCC on linux and dev c++ on Windows).
Is this expected behavior, or a happy coincidence?
I assume this isn't good practice, but why does it work?
Without the +4 at the end of malloc(), I get a segmentation fault. This is mostly what I'm curious about.
This is undefined behavior. Don't do that!
You're trying to access a memory location beyond the allocated region. So, the memory location is invalid and accessing invalid memory invokes UB.
FWIW,
there is nothing in the C standard that stops you from accessing out-of-bounds (invalid) memory and
neither does strcpy() check for the size of the destination buffer compared to the source length
so, this code (somehow) compiles. As soon as you run it and it hits UB, nothing is guaranteed anymore.
P.S - the only guaranteed thing here is undefined behavior.
This is basically another demonstration of the fact that pointers in C are low-level and (typically) not checked. You said you expected it to "fail for not having enough memory", but think about it: what did you expect to fail? The strcpy function most assuredly does not check that there's enough room for the string it's copying. It has no way to do so; all it gets is a pointer. It just starts copying characters, and in practice it either succeeds or dies on a segmentation violation. (But the point is it does not die on "out of memory".)
Do not rely on that behavior. The answerers responding "vigorously" are justified in that relying on such behavior can lurk undetected for years and then, one day, a minor adjustment to the runtime system suddenly causes catastrophic failure.
It seems to work because, since the advent of 32-bit computers, many, if not most, C runtime libraries implement malloc/free so that they manage the heap with 16-byte granularity. That is, calling malloc() with a parameter from 1 to 16 provides the same allocation. So you get a little more memory than you asked for, and that allows it to execute.
A tool like valgrind would certainly detect a problem.

getchar() and malloc returning good result when it shouldn't

Can anyone explain to me why this code works perfectly?
int main(int argc, char const *argv[])
{
char* str = (char*)malloc(sizeof(char));
int c, i = 0;
while ((c = getchar()) != EOF)
{
str[i] = c;
i++;
}
printf("\n%s\n", str);
return 0;
}
Shouldn't this program crash when I enter for example "aaaaaassssssssssssddddddddddddddd"? here is what I get with this input :
aaaaaassssssssssssddddddddddddddd
aaaaaassssssssssssddddddddddddddd
And I really don't get why is it so.
As you've presumably identified, you're overrunning the sizeof(char) (exactly 1 byte) block of memory you've asked malloc to give you, and you are printing a string that you have not explicitly null-terminated.
Either of these two things could lead to badness such as crashes, but they don't right now. Overrunning your allocated block of memory simply means that you are running into memory that you didn't ask malloc to give you. It could be memory malloc gave you anyway; a minimum allocation greater than 64 bytes would not be particularly surprising. Additionally, since this is the only place you allocate memory on the heap, you are unlikely to overwrite a memory address you use somewhere else (i.e. if you allocated a second string you might overrun the buffer of the first string and write into the space used for the second string).
Even if you had multiple allocations, your program might not crash until you tried to write to a memory address the operating system hadn't allocated to the process. Typically operating systems allocate virtual memory as pages, and then a memory allocator such as malloc is used within the process to distribute that memory and request more from the operating system. You probably had several MB of read/write virtual address space already allocated to the process and wouldn't crash until you exceeded that. Were you to have tried to write to the memory that contained your code, you would likely have caused a crash due to the OS protecting that from writes (or, if it didn't, you would crash due to garbage instructions getting executed).
That's probably enough on why you didn't crash due to an overflow. I'd suggest having fun experimenting by sending it more data to see how much you can get to work correctly without it crashing, though it may vary from run to run.
Now the other place you could have crashed or gotten incorrect behavior is in printing out your string because printf assumes a null byte terminated string, that it starts at the address of the pointer and prints until it reads a byte with value 0. Since you didn't initialize the memory yourself this could have been forever. However, it terminated printing in exactly the right spot. This means that byte 'just happened' to be 0. But that's a simplification. On a 'reasonable' modern OS the kernel will zero (write 0s to) the memory that it allocates to the process to prevent leaking information from prior users of the memory. Since this is the first/only allocation you've done the memory is all shiny and clean, but had you freed memory previously malloc might reuse it and then it would have non zero values from stuff your process had written.
Now useful advice to detect these problems in future even on programs that appear to work perfectly. If you are working on Linux (on OS X you'll need to install it) I suggest running 'small' programs through valgrind to see if they produce errors. As an exercise and an easy way to learn what the output looks like where you already know the errors try it on this program. Since valgrind slows things down you may get frustrated running a 'large' program through it, but 'small' will cover most single projects (ie always run valgrind for a school project and fix the errors).
Additional information about the environment your program is running in could lead to further explanations of implementation-specific behavior, e.g. the C implementation or the OS's memory-zeroing behavior.

Malloc -> how much memory has been allocated?

# include <stdio.h>
# include <stdbool.h>
# include <string.h>
# include <stdlib.h>
int main ()
{
char * buffer;
buffer = malloc (2);
if (buffer == NULL){
printf("big errors");
}
strcpy(buffer, "hello");
printf("buffer is %s\n", buffer);
free(buffer);
return 0;
}
I allocated 2 bytes of memory to the pointer/char buffer yet if I assign the C-style string hello to it, it still prints the entire string, without giving me any errors. Why doesn't the compiler give me an error telling me there isn't enough memory allocated? I read a couple of questions that ask how to check how much memory malloc actually allocates but I didn't find a concrete answer. Shouldn't the free function have to know exactly how much memory is allocated to buffer?
The compiler doesn't know. This is the joy and terror of C. malloc belongs to the runtime. All the compiler knows is that you have told it that malloc returns a void*; it has no idea how much memory that is, or how much strcpy is going to copy.
Tools like valgrind detect some of these errors. Other programming languages make it harder to shoot yourself in the foot. Not C.
No production malloc() implementation should prevent you from trying to write past what you allocated. It is assumed that if you allocate 123 bytes, you will use all or less than what you allocated. malloc(), for efficiency's sake, has to assume that a programmer is going to keep track of their pointers.
Using memory that you didn't explicitly and successfully ask malloc() to give you is undefined behavior. You might have asked for n bytes but got n + x, due to the malloc() implementation optimizing for byte alignment. Or you could be writing to a black hole. You never can know, that's why it's undefined behavior.
That being said ...
There are malloc() implementations that give you built in statistics and debugging, however these need to be used in lieu of the standard malloc() facility just like you would if you were using a garbage collected variety.
I've also seen variants designed strictly for LD_PRELOAD that expose a function to allow you to define a callback with at least one void pointer as an argument. That argument expects a structure that contains the statistical data. Other tools like electric fence will simply halt your program on the exact instruction that resulted in an overrun or access to invalid blocks. As @R.. points out in comments, that is great for debugging but horribly inefficient.
In all honesty or (as they say) 'at the end of the day' - it's much easier to use a heap profiler such as Valgrind and its associated tools (massif) in this case which will give you quite a bit of information. In this particular case, Valgrind would have pointed out the obvious - you wrote past the allocated boundary. In most cases, however when this is not intentional, a good profiler / error detector is priceless.
Using a profiler isn't always possible due to:
Timing issues while running under a profiler (but those are common any time calls to malloc() are intercepted).
Profiler is not available for your platform / arch
The debug data (from a logging malloc()) must be an integral part of the program
We used a variant of the library that I linked in HelenOS (I'm not sure if they're still using it) for quite a while, as debugging at the VMM was known to cause insanity.
Still, think hard about future ramifications when considering a drop in replacement, when it comes to the malloc() facility you almost always want to use what the system ships.
How much malloc internally allocates is implementation-dependent and OS-dependent (e.g. multiples of 8 bytes or more). Writing into the unallocated bytes may overwrite other variables' values even if your compiler and runtime don't detect the error. The free function remembers the number of bytes allocated separately from the allocated region, for example in a free list.
Why doesn't the compiler give me an error telling me there isn't enough memory allocated?
C does not block you from using memory you should not. You can use that memory, but it is bad and results in Undefined Behaviour. You are writing in a place you should not. This program might appear to run correctly, but might crash later. This is UB: you do not know what might happen.
This is what is happening with your strcpy(). You write in place you do not own, but the language does not protect you from that. So you should make sure you always know what and where you are writing, or make sure you stop when you are about to exceed valid memory bounds.
I read a couple of questions that ask how to check how much memory malloc actually allocates but I didn't find a concrete answer. Shouldn't the 'free' function have to know how much memory is exactly allocated to 'buffer'?
malloc() might allocate more memory than you request because of alignment padding.
More : http://en.wikipedia.org/wiki/Data_structure_alignment
free() free-s the exact same amount you allocated with malloc(), but it is not as smart as you think. Eg:
int main()
{
char * ptr = malloc(10);
if(ptr)
{
++ptr; // Now on ptr+1
free(ptr); // Undefined Behaviour
}
}
You should always free() the pointer that malloc() returned, i.e. one that points to the start of the block. Calling free(0) is safe.
You've written past the end of the buffer you allocated. The result is undefined behavior. Some run time libraries with the right options have at least some ability to diagnose problems like this, but not all do, and even those that do only do so at run-time, and usually only when compiled with the correct options.
Malloc -> how much memory has been allocated?
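When you allocate memory using malloc, the allocator asks the OS for memory in large chunks rather than once per request. With glibc on Linux, for example, the first call to malloc typically grows the heap by roughly 128 KiB. This is an implementation detail, not something the C standard guarantees.
You requested buffer = malloc (2); and only those 2 bytes are yours to use, but the block sits inside that larger region, which is already mapped into your process.
strcpy(buffer, "hello"); therefore writes past your 2 bytes into memory that still belongs to the heap, so the string appears to fit and no fault occurs. It is still undefined behavior and can corrupt the allocator's bookkeeping.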
#include <stdio.h>
#include <stdlib.h>
/* Demonstrates the overrun only: this is undefined behavior. */
int main()
{
int *p = (int *) malloc(2); /* request is only 2 bytes */
p[0]=100;
p[1]=200;
p[2]=300;
p[3]=400;
p[4]=500;
int i=0;
for(;i<5;i++,p++)
printf("%d\t",*p);
}
On the first call to malloc, the allocator obtains about 128 KiB from the OS and serves your request (2 bytes) from it. The string "hello" lands inside that region, which is why nothing faults. A second call to malloc is served from the same region.
Requests beyond about 128 KiB are served through the mmap interface instead. You can refer to the man page of malloc.
There is no compiler- or platform-independent way of finding out how much memory malloc actually allocated. malloc will in general allocate slightly more than you ask it for; see here:
http://41j.com/blog/2011/09/finding-out-how-much-memory-was-allocated/
On Linux you can use malloc_usable_size to find out how much memory you can use. On MacOS and other BSD platforms you can use malloc_size. The post linked above has complete examples of both these techniques.

Memory leak question in C after moving pointer (What exactly is deallocated?)

I realize the code sample below is something you should never do. My question is just one of interest. If you allocate a block of memory, and then move the pointer (a no-no), when you deallocate the memory, what is the size of the block that is deallocated, and where is it in memory? Here's the contrived code snippet:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
char* s = malloc(1024);
strcpy(s, "Some string");
// Advance the pointer...
s += 5;
// Prints "string"
printf("%s\n", s);
/*
* What exactly are the beginning and end points of the memory
* block now being deallocated?
*/
free(s);
return 0;
}
Here is what I think happens. The memory block being deallocated begins with the byte that holds the letter "s" in "string". The 5 bytes that held "Some " are now lost.
What I'm wondering is: Are the 5 bytes whose location in memory immediately follows the end of the original 1024 bytes deallocated as well, or are they just left alone?
Anyone know for sure what is it the compiler does? Is it undefined?
Thanks.
You cannot pass a pointer that was not obtained from a malloc, calloc or realloc to free (except NULL).
Question 7.19 in the C FAQ is relevant to your question.
The consequences of invoking undefined behavior are explained here.
It's undefined behavior in the standard, so you can't rely on anything.
Remember that blocks are artificially delimited areas of memory, and don't automatically show up. Something has to keep track of the block, in order to free everything necessary and nothing more. There's no possible termination, like C strings, since there's no value or combination of values that can be guaranteed not to be inside the block.
Last I looked, there were two basic implementation practices.
One is to keep a separate record of allocated blocks, along with the address allocated. The free() function looks up the block to see what to free. In this case, it's likely to simply not find it, and may well just do nothing. Memory leak. There are, however, no guarantees.
One is to keep the block information in a part of memory just before the allocation address. In this case, free() is using part of the block as a block descriptor, and depending on what's stored there (which could be anything) it will free something. It could be an area that's too small, or an area that's too large. Heap corruption is very likely.
So, I'd expect either a memory leak (nothing gets freed), or heap corruption (too much is marked free, and then reallocated).
Yes, it is undefined behavior. You're essentially freeing a pointer you didn't malloc.
You cannot pass a pointer you did not obtain from malloc (or calloc or realloc...) to free. That includes offsets into blocks you did obtain from malloc. Breaking this rule could result in anything happening. Usually this ends up being the worst imaginable possibility at the worst possible moment.
As a further note, if you wanted to truncate the block, there's a legal way to do this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main() {
char *new_s;
char *s = malloc(1024);
strcpy(s, "Some string");
new_s = realloc(s, 5);
if (!new_s) {
printf("Out of memory! How did this happen when we were freeing memory? What a cruel world!\n");
abort();
}
s = new_s;
s[4] = 0; // put the null terminator back on
printf("%s\n", s); // prints Some
free(s);
return 0;
}
realloc works both to enlarge and shrink memory blocks, but may (or may not) move the memory to do so.
It is not the compiler that does it, it is the standard library. The behavior is undefined. The library knows that it allocated the original s to you. The s+5 is not assigned to any memory block known by the library, even though it happens to be inside a known block. So, it won't work.
What I'm wondering is: Are the 5 bytes whose location in memory immediately follows the end of the original 1024 bytes deallocated as well, or are they just left alone?
Both. The result is undefined so a compiler is free to do either of those, or anything else they'd like really. Of course (as with all cases of "undefined behavior") for a particular platform and compiler there is a specific answer, but any code that relies on such behavior is a bad idea.
Calling free() on a ptr that wasn't allocated by malloc or its brethren is undefined.
Most implementations of malloc allocate a small (typically 4-byte) header region immediately before the ptr returned. This means that when you allocated 1024 bytes, malloc actually reserved 1028 bytes. When free( ptr ) is called, if ptr is not 0, it inspects the data at ptr - sizeof(header). Some allocators implement a sanity check to make sure it's a valid header, which might detect a bad ptr and assert or exit. If there is no sanity check, or it erroneously passes, the free routine will act on whatever data happens to be in the header.
Adding to the more formal answers: I'd compare the mechanics of this to one taking a book in the library (malloc), then tearing off a few dozen pages together with the cover (advance the pointer), and then attempting to return it (free).
You might find a librarian (malloc/free library implementation) that takes such a book back, but in a lot of cases I'd expect you would pay a fine for negligent handling.
In the draft of C99 (I don't have the final C99 handy in front of me), there is something to say on this topic:
The free function causes the space pointed to by ptr to be deallocated,
that is, made available for further allocation. If ptr is a null pointer, no action
occurs. Otherwise, if the argument does not match a pointer earlier returned
by the calloc, malloc, or realloc function, or if the space has been
deallocated by a call to free or realloc, the behaviour is undefined.
In my experience, a double free or the free of a "pointer" that was not returned via malloc will result in memory corruption and/or a crash, depending on your malloc implementation. Security people on both sides of the fence have used this behaviour more than once to do various interesting things, at least in early versions of the widely used Doug Lea's malloc package.
The library implementation might put some data structure before the pointer it returns to you. Then in free() it decrements the pointer to get at the data structure telling it how to place the memory back into the free pool. So the 5 bytes at the beginning of your string "Some " is interpreted as the end of the struct used by the malloc() algorithm. Perhaps the end of a 32 bit value, like the size of memory allocated, or a link in a linked list. It depends on the implementation. Whatever the details, it'll just crash your program. As Sinan points out, if you're lucky!
Let's be smart here... free() is not a black hole. At the very least, you have the CRT source code. Beyond that, you need the kernel source code.
Sure, the behavior is undefined in that it is up to the CRT/OS to decide what to do. But that doesn't prevent you from finding out what your platform actually does.
A quick look into the Windows CRT shows that free() leads right to HeapFree() using a CRT specific heap. Beyond that you're into RtlFreeHeap() and then into system space (NTOSKRNL.EXE) with the memory manager Mm*().
There are consistency checks throughout all these code paths. But doing different things to the memory will cause different code paths. Hence the true definition of undefined.
At a quick glance, I can see that an allocated block of memory has a marker at the end. When the memory is freed, each byte is written over with a distinct byte. The runtime can do a check to see if the end of block marker was overwritten and raise an exception if so.
This is a possibility in your case of freeing memory a few bytes into your block (or overwriting your allocated size). Of course you can trick this and write the end of block marker yourself at the correct location. This will get you past the CRT check, but as the code path goes further, more undefined behavior occurs. Three things can happen: 1) absolutely no harm, 2) memory corruption within the CRT heap, or 3) a thrown exception by any of the memory management functions.
Short version: It's undefined behavior.
Long version: I checked the CWE site and found that, while it's a bad idea all around, nobody seemed to have a solid answer. Probably because it's undefined.
My guess is that most implementations, assuming they don't crash, would either free 1019 bytes (in your example), or else free 1024 and get a double free or similar on the last five. Just speaking theoretically for now, it depends on whether the malloc routine's internal storage tables contains an address and a length, or a start address and an end address.
In any case, it's clearly not a good idea. :-)

Resources