Writing to pointer out of bounds after malloc() not causing error - c

when I try the code below it works fine. Am I missing something?
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
int *p;
p=malloc(sizeof(int));
printf("size of p=%zu\n",sizeof(p));
p[500]=999999; /* out of bounds: p has room for only one int */
printf("p[500]=%d",p[500]);
return 0;
}
I tried it with malloc(0*sizeof(int)) or anything but it works just fine. The program only crashes when I don't use malloc at all. So even if I allocate 0 memory for the array p, it still stores values properly. So why am I even bothering with malloc then?

It might appear to work fine, but it isn't very safe at all. By writing data outside the allocated block of memory you are overwriting some data you shouldn't. This is one of the greatest causes of segfaults and other memory errors, and what you're observing with it appearing to work in this short program is what makes it so difficult to hunt down the root cause.
Read this article, in particular the part on memory corruption, to begin understanding the problem.
Valgrind is an excellent tool for analysing memory errors such as the one you provide.
@David made a good comment. Compare the results of running your code with those of running the following code. The latter results in a runtime error (with pretty much no useful output!) on ideone.com, whereas the former succeeds, as you experienced.
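#include <stdio.h>
#include <stdlib.h>
int main(void)
{
int *p;
p=malloc(sizeof(int));
printf("size of p=%zu\n",sizeof(p));
p[500]=999999;
printf("p[500]=%d",p[500]);
p[500000]=42; /* far enough out of bounds to hit unmapped memory */
printf("p[500000]=%d",p[500000]);
return 0;
}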

If you don't allocate memory, p has garbage in it, so writing through it will likely fail. Once you have made a valid malloc call, p points to a valid memory location and you can write to it. With p[500] you are overwriting memory that you shouldn't write to, but nobody's going to hold your hand and tell you about it. If you run your program under a memory debugger such as valgrind, it will tell you.
Welcome to C.

Writing past the end of your memory is Undefined Behaviour™, which means that anything could happen, including your program operating as if what you just did was perfectly legal. The reasons for your program running as if you had done malloc(501*sizeof(int)) are completely implementation-specific, and can indeed be specific to anything, including the phase of the moon.

This is because p would be assigned some address no matter what size you pass to malloc(). With a zero size you would be referencing invalid memory, since no memory has been allocated, but the address may fall within a region that doesn't cause a program crash. The behavior is still undefined.
Now if you do not use malloc() at all, p points to a garbage location, and trying to access that is likely to crash the program.

I tried it with malloc(0*sizeof(int))
According to C99, if the size passed to malloc is 0, a C runtime can either return a NULL pointer or behave as if the request were for some non-zero size, except that the returned pointer must not be used to access an object. So it is implementation-defined (e.g. some implementations return a zero-length buffer). In your case you do not get a NULL pointer back, but you are using a pointer you should not be using. If you try it with a different runtime, you could get a NULL pointer back.

When you call malloc(), a small chunk of memory is carved out of a larger page for you.
malloc(sizeof(int));
This does not allocate exactly 4 bytes on a 32-bit machine: the allocator pads the request up to a minimum chunk size and adds heap metadata used to track the chunk through its lifetime (chunks are placed in bins based on their size and marked in-use or free by the allocator). See hxxp://en.wikipedia.org/wiki/Malloc or, more specifically, hxxp://en.wikipedia.org/wiki/Malloc#dlmalloc_and_its_derivatives if you're testing this on Linux.
So writing beyond the bounds of your chunk doesn't necessarily mean you are going to crash. At p+500 you are not writing outside the bounds of the page allocated for that initial chunk, so you are technically writing to a valid mapped address. Welcome to memory corruption.
http://www.google.com/search?sourceid=chrome&ie=UTF-8&q=heap+overflows

Our CheckPointer tool can detect this error. It knows that p was allocated as a chunk of 4 bytes, so when the assignment is made it sees that the target is outside the area allocated for p. It will tell you that the p[500] assignment is wrong.

Related

What happens if I set a value outside of the memory allocated with calloc?

Consider the following:
int* x = calloc(3,sizeof(int));
x[3] = 100;
which is located inside of a function.
I get no error when I compile and run the program, but when I run it with valgrind I get an "Invalid write of size 4".
I understand that I am accessing a memory place outside of what I have allocated with calloc, but I'm trying to understand what actually happens.
Does some address in the stack(?) still have the value 100? Because there must certainly be more available memory than what I have allocated with calloc. Is the valgrind error more of a "Hey, you probably did not mean to do that"?
I understand that I am accessing a memory place outside of what I have allocated with calloc, but I'm trying to understand what actually happens.
"What actually happens" is not well-defined; it depends entirely on what gets overwritten. As long as you don't overwrite anything important, your code will appear to run as expected.
You could wind up corrupting other data that was allocated dynamically. You could wind up corrupting some bit of heap bookkeeping.
The language does not enforce any kind of bounds-checking on array accesses, so if you read or write past the end of the array, there are no guarantees on what will happen.
Does some address in the stack(?) still have the value 100?
First of all, calloc allocates memory on the heap not stack.
Now, regarding the error.
Sure, most of the time there is plenty of memory available when your program is running. However, when you allocate x bytes, the memory manager looks for a free chunk of at least that size (plus maybe some more if the allocator needs room for auxiliary info). There are no guarantees about what the bytes after that chunk are used for, and no guarantees that they are not read-only or that your program can access them at all.
So anything can happen. If the memory was just sitting there waiting to be used by your program, nothing horrible happens. But if that memory was used by something else in your program, those values would get messed up, or, worst of all, the program could crash because it accessed something that wasn't supposed to be accessed.
So the valgrind error should be treated very seriously.
The C language doesn't require bounds checking on array accesses, and most C compilers don't implement it. Besides, if you used a variable size instead of the constant 3, the array size could be unknown at compile time, and there would be no way for the compiler to check whether the access is out of bounds.
There are no guarantees about what was allocated in the space at x[3] and beyond, or what will be written there in the future. alinsoar mentioned that computing the address of x[3] (one past the end) does not itself cause undefined behavior, but you should not attempt to fetch or store a value there. Often you will be able to write and read this memory location without visible problems, but writing code that relies on reaching outside of your allocated arrays is setting yourself up for very hard-to-find errors in the future.
Does some address in the stack(?) still have the value 100?
When using calloc or malloc, the values of the array are not actually on the stack. These calls are used for dynamic memory allocation, meaning the memory is allocated in a separate area known as the "heap". This allows you to access these arrays from different parts of the stack as long as you have a pointer to them. If the array were on the stack, writing past the bounds would risk overwriting other information contained in your function (in the worst case, the return address).
The act of doing that is what is called undefined behavior.
Literally anything can happen, or nothing at all.
I give you extra points for testing with Valgrind.
In practice, it is likely you will find the value 100 in the memory space after your array.
Beware of nasal demons.
You're allocating memory for 3 integer elements but accessing the 4th element (x[3]), hence the error message from valgrind. The compiler will not complain about it.

C malloc function size

So I have this little tricky question I need to answer:
On which segment in memory is c+9 pointing to if the function is:
void f()
{
int *c=(int*)malloc(10);
}
I think I know how malloc works, and I looked up other questions, so this should allocate 10 bytes of memory and return the address of the first one, and the cast converts that address to int *.
So because sizeof(int) is 4 bytes, I thought that there wouldn't be enough space for 9 integers, and c+9 would point to some memory outside the range of the allocated memory and it would return an error, but no, the program works just fine, as if there was 10*sizeof(int) allocated. So where am I making a mistake?
Your mistake is believing that it works just fine.
Yes, it probably runs, yes, it probably doesn't segfault, but no, it is not correct.
So because sizeof(int) is 4 bytes, I thought that there wouldn't be enough space for 9 integers, and c+9 would point to some memory outside the range of the allocated memory
This is correct
and it would return an error
But this is unfortunately not correct in every case. The OS can only hand out memory in whole pages, so a process gets memory in multiples of the page size (commonly 4096 bytes). This means that even though malloc (which is implemented in userspace) gives you 10 bytes, your program will have at least one full page from the OS. BUT: malloc will eventually hand out the rest of that page to other allocations, and then your out-of-bounds write will probably introduce a bug.
TLDR: This is UB, even though it looks like it works, never do this.
You're making the mistake in assuming that undefined behavior implies that something "bad" is guaranteed to happen. Accessing c[9] in your case is undefined behavior as you haven't malloced enough memory - and is something that you should not do.
Undefined behavior means that the standard allows any behavior. For this particular error you would often get a non-localized misbehavior: accessing c[9] appears to work fine and nothing odd happens when you do it, but then an unrelated piece of code accessing an unrelated piece of data fails. These kinds of mistakes often also corrupt the data used by the memory allocation system, which may make malloc and free misbehave.
C programs will not return an error if you poke outside of the assigned memory range. The result is not defined, it may hang or (apparently) work fine. But it is not fine.
You are right in that malloc gives you 10 characters (usually 8-bit bytes). Allocating an area for ints that is not a multiple of the int size is in itself fishy... but not illegal. The resulting address is interpreted as a pointer to an int (typically 32 bits), and you are asking for the address 9 ints beyond the start of the allocated area. Trying to access that is undefined behaviour: anything might happen, including whatever you naïvely expect, nothing whatsoever, a crash, or the end of the universe. What will usually happen is that you get assigned memory from a bigger area containing other objects and free space (and the extra data malloc uses to keep track of the whole mess). Reading there usually causes no visible harm (though it is still undefined behaviour); writing could damage other data or mess up malloc's data structures, leading to mysterious behaviour or crashes later on. If you are lucky, the new space is allocated at a boundary, and the out-of-limits access gives a segmentation fault, pointing at the culprit.

Does free() remove the data stored in the dynamically allocated memory?

I wrote a simple program to test the contents of a dynamically allocated memory after free() as below. (I know we should not access the memory after free. I wrote this to check what will be there in the memory after free)
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
int *p = (int *)malloc(sizeof(int));
*p = 3;
printf("%d\n", *p);
free(p);
printf("%d\n", *p); /* use after free: undefined behaviour, shown on purpose */
return 0;
}
output:
3
0
I thought it will print either junk values or crash by 2nd print statement. But it is always printing 0.
1) Does this behaviour depend on the compiler?
2) if I try to deallocate the memory twice using free(), core dump is getting generated. In the man pages, it is mentioned that program behaviour is abnormal. But I am always getting core dump. Does this behaviour also depend on the compiler?
Does free() remove the data stored in the dynamically allocated memory?
No. free just frees the allocated space pointed to by its argument. The function accepts a void pointer to a previously allocated memory chunk and frees it, that is, adds it to the list of free memory chunks that may be re-allocated.
The freed memory is not cleared/erased in any manner.
You should not dereference the freed (dangling) pointer. Standard says that:
7.22.3.3 The free function:
[...] Otherwise, if the argument does not match a pointer earlier returned by a memory management
function, or if the space has been deallocated by a call to free or realloc, the behavior is undefined.
The above quote also states that freeing a pointer twice invokes undefined behavior. Once UB is in action, you may get expected or unexpected results. There may be a program crash or a core dump.
As described on the GNU website:
Freeing a block alters the contents of the block. Do not expect to find any data (such as a pointer to the next block in a chain of blocks) in the block after freeing it.
So accessing a memory location after freeing it results in undefined behaviour. free is not required to clear the data, but it may alter it. You may be getting 0 in this example; you might just as well get garbage in some other example.
And if you try to deallocate the memory twice, on the second attempt you are trying to free memory that is no longer allocated; that's why you are getting the core dump.
In addition to all the above explanations for use-after-free semantics, you really may want to investigate the life-saver for every C programmer: valgrind. It will automatically detect such bugs in your code and generally save your behind in the real world.
Coverity and all the other static code checkers are also great, but valgrind is awesome.
As far as standard C is concerned, it’s just not specified, because it is not observable. As soon as you free memory, all pointers pointing there are invalid, so there is no way to inspect that memory.*)
Even if you happen to have some standard C library documenting a certain behaviour, your compiler may still assume pointers aren’t reused after being passed to free, so you still cannot expect any particular behaviour.
*) I think, even reading these pointers is UB, not only dereferencing, but this doesn’t matter here anyway.

Memory leak question in C after moving pointer (What exactly is deallocated?)

I realize the code sample below is something you should never do. My question is just one of interest. If you allocate a block of memory, and then move the pointer (a no-no), when you deallocate the memory, what is the size of the block that is deallocated, and where is it in memory? Here's the contrived code snippet:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
char* s = malloc(1024);
strcpy(s, "Some string");
// Advance the pointer...
s += 5;
// Prints "string"
printf("%s\n", s);
/*
* What exactly are the beginning and end points of the memory
* block now being deallocated?
*/
free(s);
return 0;
}
Here is what I think happens. The memory block being deallocated begins with the byte that holds the letter "s" in "string". The 5 bytes that held "Some " are now lost.
What I'm wondering is: Are the 5 bytes whose location in memory immediately follows the end of the original 1024 bytes deallocated as well, or are they just left alone?
Does anyone know for sure what the compiler does? Is it undefined?
Thanks.
You cannot pass a pointer that was not obtained from a malloc, calloc or realloc to free (except NULL).
Question 7.19 in the C FAQ is relevant to your question.
The consequences of invoking undefined behavior are explained here.
It's undefined behavior in the standard, so you can't rely on anything.
Remember that blocks are artificially delimited areas of memory, and don't automatically show up. Something has to keep track of the block, in order to free everything necessary and nothing more. There's no possible termination, like C strings, since there's no value or combination of values that can be guaranteed not to be inside the block.
Last I looked, there were two basic implementation practices.
One is to keep a separate record of allocated blocks, along with the address allocated. The free() function looks up the block to see what to free. In this case, it's likely to simply not find it, and may well just do nothing. Memory leak. There are, however, no guarantees.
One is to keep the block information in a part of memory just before the allocation address. In this case, free() is using part of the block as a block descriptor, and depending on what's stored there (which could be anything) it will free something. It could be an area that's too small, or an area that's too large. Heap corruption is very likely.
So, I'd expect either a memory leak (nothing gets freed), or heap corruption (too much is marked free, and then reallocated).
Yes, it is undefined behavior. You're essentially freeing a pointer you didn't malloc.
You cannot pass a pointer you did not obtain from malloc (or calloc or realloc...) to free. That includes offsets into blocks you did obtain from malloc. Breaking this rule could result in anything happening. Usually this ends up being the worst imaginable possibility at the worst possible moment.
As a further note, if you wanted to truncate the block, there's a legal way to do this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
char *new_s;
char *s = malloc(1024);
strcpy(s, "Some string");
new_s = realloc(s, 5);
if (!new_s) {
printf("Out of memory! How did this happen when we were freeing memory? What a cruel world!\n");
abort();
}
s = new_s;
s[4] = 0; // put the null terminator back on
printf("%s\n", s); // prints Some
free(s);
return 0;
}
realloc works both to enlarge and shrink memory blocks, but may (or may not) move the memory to do so.
It is not the compiler that does it, it is the standard library. The behavior is undefined. The library knows that it allocated the original s to you. The s+5 is not assigned to any memory block known by the library, even though it happens to be inside a known block. So, it won't work.
What I'm wondering is: Are the 5 bytes whose location in memory immediately follows the end of the original 1024 bytes deallocated as well, or are they just left alone?
Both. The result is undefined so a compiler is free to do either of those, or anything else they'd like really. Of course (as with all cases of "undefined behavior") for a particular platform and compiler there is a specific answer, but any code that relies on such behavior is a bad idea.
Calling free() on a ptr that wasn't allocated by malloc or its brethren is undefined.
Many implementations of malloc place a small header region (typically 4 bytes or more) immediately before the ptr returned. This means that when you allocate 1024 bytes, malloc actually reserves slightly more, for example 1028 bytes. When free(ptr) is called and ptr is not 0, it inspects the data at ptr - sizeof(header). Some allocators implement a sanity check to make sure it's a valid header, which might detect a bad ptr and assert or exit. If there is no sanity check, or it erroneously passes, the free routine will act on whatever data happens to be in the header.
Adding to the more formal answers: I'd compare the mechanics of this to one taking a book in the library (malloc), then tearing off a few dozen pages together with the cover (advance the pointer), and then attempting to return it (free).
You might find a librarian (malloc/free library implementation) that takes such a book back, but in a lot of cases I'd expect you to pay a fine for negligent handling.
In the draft of C99 (I don't have the final C99 handy in front of me), there is something to say on this topic:
The free function causes the space pointed to by ptr to be deallocated,
that is, made available for further allocation. If ptr is a null pointer, no action
occurs. Otherwise, if the argument does not match a pointer earlier returned
by the calloc, malloc, or realloc function, or if the space has been
deallocated by a call to free or realloc, the behaviour is undefined.
In my experience, a double free or the free of a "pointer" that was not returned by malloc will result in memory corruption and/or a crash, depending on your malloc implementation. Security people on both sides of the fence have used this behaviour more than once to do various interesting things, at least in early versions of the widely used Doug Lea's malloc package.
The library implementation might put some data structure before the pointer it returns to you. Then in free() it decrements the pointer to get at the data structure telling it how to place the memory back into the free pool. So the 5 bytes at the beginning of your string "Some " is interpreted as the end of the struct used by the malloc() algorithm. Perhaps the end of a 32 bit value, like the size of memory allocated, or a link in a linked list. It depends on the implementation. Whatever the details, it'll just crash your program. As Sinan points out, if you're lucky!
Let's be smart here... free() is not a black hole. At the very least, you have the CRT source code. Beyond that, you need the kernel source code.
Sure, the behavior is undefined in that it is up to the CRT/OS to decide what to do. But that doesn't prevent you from finding out what your platform actually does.
A quick look into the Windows CRT shows that free() leads right to HeapFree() using a CRT-specific heap. Beyond that you're into RtlFreeHeap() and then into system space (NTOSKRNL.EXE) with the memory manager Mm*().
There are consistency checks throughout all these code paths. But doing different things to the memory will cause different code paths to be taken. Hence the true definition of undefined.
At a quick glance, I can see that an allocated block of memory has a marker at the end. When the memory is freed, each byte is written over with a distinct byte. The runtime can do a check to see if the end of block marker was overwritten and raise an exception if so.
This is a possibility in your case of freeing memory a few bytes into your block (or over-writing your allocated size). Of course you can trick this and write the end-of-block marker yourself at the correct location. This will get you past the CRT check, but as the code path goes further, more undefined behavior occurs. Three things can happen: 1) absolutely no harm, 2) memory corruption within the CRT heap, or 3) an exception thrown by any of the memory management functions.
Short version: It's undefined behavior.
Long version: I checked the CWE site and found that, while it's a bad idea all around, nobody seemed to have a solid answer. Probably because it's undefined.
My guess is that most implementations, assuming they don't crash, would either free 1019 bytes (in your example), or else free 1024 and get a double free or similar on the last five. Just speaking theoretically for now, it depends on whether the malloc routine's internal storage tables contains an address and a length, or a start address and an end address.
In any case, it's clearly not a good idea. :-)

Resources