That is my question. Say my array is size 10, and I get a segmentation fault when I loop through and fill the array to 13, because I corrupted some important information on the stack. If I stuck it on the heap instead, am I immune to segmentation faults? This is more of a conceptual question.
No. If you overrun the allocated space, you are using memory that does not belong to the application, or which belongs to some other part of the application.
What then happens is undefined. I would be surprised in either case if overrunning by just three bytes directly caused a segmentation fault - the page granularity is not that small. Seg-faults are a function of the processor and operating system, not the C language, and occur when you access memory not allocated to the process.
In the case of a stack buffer overrun, you will most likely corrupt some adjacent data in the current or calling function. If a seg-fault occurs, it will be due to acting upon the corrupted data, such as popping an invalid return address into the program counter, rather than to the overrun itself.
Similarly if you overrun the heap allocation, the result depends on what you are corrupting and how that is subsequently used. Heap corruption is particularly insidious, because the results of the error may remain undetected (latent), or result in failure long after the actual error in some unrelated area of the code - typically when you attempt to free or allocate some other allocation where the heap structures have been destroyed. The memory you have corrupted may be part of some other existing allocation, and the error may manifest itself only when that corrupted data is utilised.
The error you observe is entirely non-deterministic - an immediate seg-fault is perhaps unlikely in the scenario you have described, but would in fact be the best you could hope for, since all other possible manifestations of failure are particularly difficult to debug. A failure from a stack data overrun is likely to be more localised - typically you will see corrupted data within the function, or the function will fail on return, whereas a heap error is often less immediately obvious because the data you are corrupting can be associated with any code within your application. If that code does not run, or runs infrequently, you may never observe any failure.
The "solution" to your problem is not to write code that overruns - it is always an error, and using a different type of memory allocation is not going to save you from that. To use a coding practice that simply "hides" bugs or makes them less apparent or deterministic is not a good strategy.
This program should crash due to buffer overrun. But I am getting output as "stackoverflow". How?
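As a rough illustration of "not writing code that overruns", here is a minimal sketch that derives the loop bound from the array itself instead of from a hard-coded number. The names ARRAY_COUNT and fill_bounded are made up for this example and do not come from the question.

```c
#include <assert.h>
#include <stddef.h>

/* Number of elements in a true array (does not work on a pointer). */
#define ARRAY_COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical helper: writes `value` into at most `capacity` elements,
   even if the caller asks for more. Returns how many were written. */
size_t fill_bounded(int *dst, size_t capacity, size_t requested, int value)
{
    assert(dst != NULL);
    size_t n = requested < capacity ? requested : capacity;
    for (size_t i = 0; i < n; i++) {
        dst[i] = value;   /* i is always below capacity */
    }
    return n;
}
```

With int a[10], a call such as fill_bounded(a, ARRAY_COUNT(a), 13, 7) writes only 10 elements and returns 10, so the caller can tell that the request was cut short. The same idea applies whether the array lives on the stack or on the heap.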
#include <stdio.h>
#include <stdlib.h>   /* malloc */
#include <string.h>
int main()
{
    char *src;
    char dest[10];
    src = (char*)malloc(5);
    strcpy(src, "stackoverflow");   /* writes 14 bytes into a 5-byte allocation */
    printf("%s\n", src);
    return 0;
}
It does crash due to a buffer overrun.
The behaviour of your code is undefined as you are overrunning your buffer. You can't expect the behaviour to be in any way predictable.
It's difficult - and not required by the C standard - to issue an appropriate diagnostic in such cases.
Buffer overflows are not guaranteed to crash you: they cause undefined behavior. While a lot of platforms make the sequence of events that may or may not culminate in a crash rather predictable, one very important thing to consider is that the possible crash almost never happens at the same time that the damage is caused.
In a stack buffer overflow, possible crashes happen when you read the value of a variable that sat on the stack and was overflowed onto, or when you return from the function and the return address has been overwritten.
However, you're not overflowing a stack buffer: you're overflowing a heap buffer that you got from malloc. Typically, possible crashes there happen when you free that buffer or try to use a buffer that happened to be contiguous to it (there is, on purpose, no way to predict this). You allocate only one buffer and never free it, so you're not going to observe any problem from a small overflow.
In addition, I don't know any mainstream malloc implementation on desktops that returns blocks of less than 32 bytes, so even though you said malloc(5), you probably have room for 32 bytes, so your short write is not overflowing on anything (although you must not rely on this).
The only case where an overflow will straight-up crash your program is if you overflow to a memory location that has not been assigned any meaning. For instance, if you do something like memset(dest, 'c', 100000000), that will probably happen because you'll be busting out of the memory area that is reserved to the stack and there is probably nothing next to it.
Copying to a buffer that is too small is undefined behavior; that doesn't necessarily mean it's guaranteed to crash. For all we know those other bytes occupying the "overflow\0" part of your string aren't being used anyway.
Because unless you are using some overrun-protection library or debugging tool, nothing will notice that you're writing to memory you shouldn't be. If you run this under valgrind, it will report that you wrote to memory you shouldn't have. But malloc(5) returns a pointer into a likely larger block of memory, so the chance of the overflow touching an unmapped address is low. If you had other malloc() calls, you might notice the "overflow" part ending up in one of those other buffers, but that depends on the implementation of malloc(), and which code the overflow breaks won't be deterministic.
Your buffer is allocated on the heap, so src points to a buffer of 5 chars, which is 5 bytes because sizeof(char) is 1. If the extra bytes written by strcpy land in memory that still belongs to the heap and is not used by anything else, the program appears to work. On the other hand, if they overwrite memory allocated through another pointer, or run past the end of the heap, you can get a crash.
In conclusion, avoid this kind of code because its behavior is unpredictable.
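For comparison, here is one possible way to write the copy so that the allocation is always large enough. This is only a sketch; copy_string is a hypothetical helper name, not something from the question.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of `text`, or NULL if the
   allocation fails. The caller owns the result and must free() it once. */
char *copy_string(const char *text)
{
    assert(text != NULL);
    size_t size = strlen(text) + 1;   /* +1 for the terminating '\0' */
    char *copy = malloc(size);
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, text, size);         /* copies the terminator as well */
    return copy;
}
```

Calling copy_string("stackoverflow") allocates 14 bytes (13 characters plus the terminator), so nothing is written outside the block. The key point is that the size is computed from the data rather than guessed.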
I have been taught in lectures that calling free() on a pointer twice is really, really bad. I know that it is good practice to set a pointer to NULL right after having freed it.
However, I still have never heard any explanation as to why that is. From what I understand, the way malloc() works, it should technically keep track of the pointers it has allocated and given you to use. So why does it not know, whether a pointer it receives through free() has been freed yet or not?
I would love to understand, what happens internally, when you call free() on a location that has previously already been freed.
When you use malloc you are telling the PC that you want to reserve some memory location on the heap just for you. The computer gives back a pointer to the first byte of the addressed space.
When you use free you are actually telling the computer that you don't need that space anymore, so it marks that space as available for other data.
The pointer still points to that memory address. At this point that same space in the heap can be returned by another malloc call. When you invoke free a second time, you are not freeing the previous data, but the new data, and this may not be good for your program ;)
To answer your first question,
So why does it not know, whether a pointer it receives through free() has been freed yet or not?
Because the specification for malloc() in the C standard does not mandate this. When you call malloc() or one of its family of functions, it returns a pointer to you, and the allocator internally records the size of the block that belongs to that pointer. That is the reason free() does not need a size argument to release the memory.
Also, once free()-d, what happens with the actually allocated memory is still implementation dependent. Calling free() is just a marker to point out that the allocated memory is no longer in use by the process and can be reclaimed and re-allocated, if needed. So, keeping track of the freed pointer is needless at that point. It would be an unnecessary burden on the allocator to keep all that history.
For debugging purpose, however, some library implementations can do this job for you, like DUMA or dmalloc and last but not the least, memcheck tool from Valgrind.
Now, technically, the C standard does not specify any behaviour if you call free() on an already freed pointer. It is undefined behavior.
C11, chapter §7.22.3.3, free() function
[...] if the argument does not match a pointer earlier returned by a memory management function, or if the space has been deallocated by a call to free() or realloc(), the behavior is undefined.
The C standard only says that calling free twice on a pointer returned by malloc or one of its family of functions invokes undefined behavior. There is no further explanation of why that is so.
But, why it is bad is explained here:
Freeing The Same Chunk Twice
To understand what this kind of error might cause, we should remember how the memory manager normally works. Often, it stores the size of the allocated chunk right before the chunk itself in memory. If we freed the memory, this memory chunk might have been allocated again by another malloc() request, and thus this double-free will actually free the wrong memory chunk - causing us to have a dangling pointer somewhere else in our application. Such bugs tend to show themselves much later than the place in the code where they occurred. Sometimes we don't see them at all, but they still lurk around, waiting for an opportunity to rear their ugly heads.
Another problem that might occur is that this double-free will be done after the freed chunk was merged together with neighbouring free chunks to form a larger free chunk, and then the larger chunk was re-allocated. In such a case, when we try to free() our chunk for the 2nd time, we'll actually free only part of the memory chunk that the application is currently using. This will cause even more unexpected problems.
When you call malloc you get a pointer. The runtime library needs to keep track of the malloced memory. Typically malloc does not store the memory management structures separately from the malloced memory but in the same place. So a malloc for x bytes in fact takes x+n bytes, where one possible layout is that the first n bytes contain a linked list struct with pointers to the next (and maybe previous) allocated memory block.
When you free a pointer, the function free could walk through its internal memory management structures and check whether the pointer you pass in is a valid pointer that was malloced. Only then could it safely access the hidden parts of the memory block. But doing this check would be very time consuming, especially if you allocate a lot. So free simply assumes that you pass in a valid pointer. That means it directly accesses the hidden parts of the memory block and assumes that the linked list pointers there are valid.
If you free a block twice then you might have the problem that someone did a new malloc, got the memory you just freed, overwrites it and the second free reads invalid pointers from it.
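To make the "hidden header in front of the block" idea concrete, here is a deliberately simplified toy allocator. It is only a sketch under loud assumptions: toy_malloc, toy_free, toy_block_size and the header layout are invented for illustration and do not describe any real malloc implementation. It never reuses memory, which is the only reason its double-free check is reliable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical header stored directly in front of every payload. */
typedef union toy_header {
    struct {
        size_t size;    /* payload size requested by the caller */
        int    in_use;  /* 1 while allocated, 0 after toy_free() */
    } info;
    max_align_t align;  /* forces suitable alignment for the payload */
} toy_header;

#define TOY_UNITS 64
static toy_header toy_arena[TOY_UNITS];  /* fixed pool, carved up in header-sized units */
static size_t toy_next = 0;              /* index of the first unused unit */

void *toy_malloc(size_t size)
{
    if (size > sizeof(toy_arena)) {
        return NULL;
    }
    size_t payload_units = (size + sizeof(toy_header) - 1) / sizeof(toy_header);
    if (payload_units + 1 > TOY_UNITS - toy_next) {
        return NULL;                     /* pool exhausted */
    }
    toy_header *h = &toy_arena[toy_next];
    h->info.size = size;
    h->info.in_use = 1;
    toy_next += payload_units + 1;       /* one unit for the header itself */
    return h + 1;                        /* caller sees only the payload */
}

/* Returns 0 on success, -1 if the block was already freed. */
int toy_free(void *ptr)
{
    if (ptr == NULL) {
        return 0;
    }
    toy_header *h = (toy_header *)ptr - 1;  /* step back to the hidden header */
    if (!h->info.in_use) {
        return -1;
    }
    h->info.in_use = 0;
    return 0;
}

size_t toy_block_size(const void *ptr)
{
    assert(ptr != NULL);
    return ((const toy_header *)ptr - 1)->info.size;
}
```

Notice that toy_free trusts the bytes in front of the pointer, just as described above. In an allocator that hands freed memory out again, those bytes may already belong to someone else by the time of the second free, so a flag like in_use cannot be relied on there.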
Setting a freed pointer to NULL is good practice because it helps debugging. If you access freed memory your program might crash, but it might also just read suspicious values and maybe crash later. Finding the root cause then might be hard. If you set freed pointers to NULL, your program will typically crash immediately when you try to access the memory through them. That helps massively during debugging.
I have an array that's declared as char buff[8]. That should only be 8 bytes, but looking at the assembly and testing the code, I get a segmentation fault when I input something larger than 32 characters into that buff, whereas I would expect it to be for larger than 8 characters. Why is this?
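A small sketch of that habit, assuming a hypothetical helper called release_buffer (the name is invented here). It relies on the documented fact that free(NULL) performs no operation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: frees the buffer and clears the caller's pointer,
   so a second call becomes free(NULL), which does nothing. */
void release_buffer(char **bufp)
{
    assert(bufp != NULL);
    free(*bufp);
    *bufp = NULL;
}
```

Calling release_buffer(&buf) twice in a row is harmless. Keep in mind that this only clears the one pointer you pass in; any other copies of the same address elsewhere in the program are still dangling.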
What you're saying is not a contradiction:
You have space for 8 characters.
You get an error when you input more than 32 characters.
So what?
The point is that nobody told you that you would be guaranteed to get an error if you input more than 8 characters. That's simply undefined behaviour, and anything can (and will) happen.
You absolutely mustn't think that the absence of obvious misbehaviour is proof of the correctness of your code. Code correctness can only be verified by checking the code against the rules of the language (though some automated tools such as valgrind are an immense help).
Writing beyond the end of the array is undefined behavior. Undefined behavior means nothing (including a segmentation fault) is guaranteed.
In other words, it might do anything. More practically, it's likely the write didn't touch anything protected, so from the point of view of the OS everything is still OK until 32.
This raises an interesting point. What is "totally wrong" from the point of view of C might be OK with the OS. The OS only cares about what pages you access:
Is the address mapped for your process?
Does your process have the rights?
You shouldn't count on the OS slapping you if anything goes wrong. A useful tool for this (slapping) is valgrind, if you are using Unix. It will warn you if your process is doing nasty things, even if those nasty things are technically OK with the OS.
C arrays have no bound checking.
As others said, you are hitting undefined behavior; as long as you stay inside the bounds of the array, everything works fine. If you cheat, as far as the standard is concerned, anything can happen, including your program seeming to work right as well as the explosion of the Sun.
What happens in practice is that with stack-allocated variables you are likely to overwrite other variables on the stack, getting "impossible" bugs, or, if you hit a canary value put by the compiler, it may detect the buffer overflow on return from the function. For variables allocated in the so-called heap, the heap allocator may have given some more room than requested, so the mistake may be less easy to spot, although you may easily mess up the internal structures of the heap.
In both cases you can also hit a protected memory page, which will result in your program being terminated forcibly (for the stack this happens less often because usually you have to overwrite the entire stack to get to a protected page).
Your declaration char buff[8] sounds like a stack allocated variable, although it could be heap allocated if part of a struct. Accessing out of bounds of an array is undefined behaviour and is known as a buffer overrun. Buffer overruns on stack allocated memory may corrupt the current stack frame and possibly other stack frames in the call stack. With undefined behaviour, anything could happen, including no apparent error. You would not expect a seg fault immediately because the stack is typically allocated when the thread starts.
For heap allocated memory, memory managers typically allocate large blocks of memory and then sub-allocate from those larger blocks. That is why you often don't get a seg fault when you access beyond the end of a block of memory.
It is undefined behaviour to access beyond the end of a memory block. And it is perfectly valid, according to the standard, for such out of bounds accesses to result in seg faults or indeed an apparently successful read or write. I say apparently successful because if you are writing then you will quite possibly produce a heap corruption by writing out of bounds.
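Since the overrun here comes from reading input, one way to stay inside the 8 bytes is to let fgets enforce the limit. The following is a minimal sketch; read_line_bounded and its return convention are assumptions made up for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line into buf, storing at most size-1 chars.
   Returns 1 if the whole line fit, 0 if it was truncated (the rest of the
   line is discarded), and -1 on end of input or read error. */
int read_line_bounded(FILE *in, char *buf, size_t size)
{
    assert(in != NULL && buf != NULL && size > 1);
    int n = size > INT_MAX ? INT_MAX : (int)size;   /* fgets takes an int */
    if (fgets(buf, n, in) == NULL) {
        return -1;
    }
    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';                        /* strip the newline */
        return 1;
    }
    /* No newline stored: the line was too long, exactly filled the buffer,
       or the input ended without a newline. Peek at what follows. */
    int c = fgetc(in);
    if (c == EOF || c == '\n') {
        return 1;
    }
    while (c != '\n' && c != EOF) {
        c = fgetc(in);                              /* drop the excess */
    }
    return 0;
}
```

With char buff[8], a call like read_line_bounded(stdin, buff, sizeof buff) never writes more than 8 bytes, and a return value of 0 tells you the user typed more than would fit, instead of leaving you to find out at 32 characters or later.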
Unless you are not telling us something, you answered your own question.
declaring
char buff[8] ;
means that the compiler grabs 8 bytes of memory. If you try to stuff 32 chars into it, you overflow the buffer, and a seg fault is one likely result. That's called a buffer overflow.
Each char is one byte (wide characters use wchar_t, which is larger), so you are trying to put 4x the number of chars that will fit in your buffer.
Is this your first time coding in C?
I'm supporting some C code on Solaris, and I've seen something weird, or at least I think it is:
char new_login[64];
...
strcpy(new_login, (char *)login);
...
free(new_login);
My understanding is that since the variable is a local array the memory comes from the stack and does not need to be freed, and moreover since no malloc/calloc/realloc was used the behaviour is undefined.
This is a real-time system so I think it is a waste of cycles. Am I missing something obvious?
You can only free() something you got from malloc(), calloc() or realloc(). Freeing something on the stack yields undefined behaviour; you're lucky this doesn't cause your program to crash, or worse.
Consider that a serious bug, and delete that line asap.
No. This is a bug.
According to free(3)....
free() frees the memory space pointed to by ptr, which must have been returned by a previous call to malloc(), calloc() or realloc(). Otherwise, or if free(ptr) has already been called before, undefined behaviour occurs. If ptr is NULL, no operation is performed.
So you have undefined behavior happening in your program.
IN MOST CASES, you can only free() something allocated on the heap. See http://www.opengroup.org/onlinepubs/009695399/functions/free.html .
HOWEVER: One way to go about doing what you'd like to be doing is to scope temporary variables allocated on the stack, like so:
{
char new_login[64];
... /* No later-used variables should be allocated on the stack here */
strcpy(new_login, (char *)login);
}
...
The free() is definitely a bug.
However, it's possible there's another bug here:
strcpy(new_login, (char *)login);
If the function isn't pedantically confirming that login is 63 or fewer characters with the appropriate null termination, then this code has a classic buffer overflow bug. If a malicious party can fill login with the right bytes, they can overwrite the return pointer on the stack and execute arbitrary code. One solution is:
new_login[sizeof(new_login)-1]='\0';
strncpy(new_login, (char *)login, sizeof(new_login)-1 );
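An alternative sketch that also tells the caller when the login did not fit, rather than silently truncating it. The helper name copy_checked is hypothetical and not part of the original code.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copies src into dst (capacity dst_size bytes),
   always null-terminating. Returns 1 if the whole string fit, 0 if it
   had to be truncated. */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    size_t len = strlen(src);
    size_t n = len < dst_size ? len : dst_size - 1;  /* leave room for '\0' */
    memcpy(dst, src, n);
    dst[n] = '\0';
    return len < dst_size;
}
```

In the code above this might be used as copy_checked(new_login, sizeof(new_login), (char *)login), treating a return value of 0 as an error. Note that it still assumes login itself is a properly terminated string; if that is not guaranteed, the length of the source has to be bounded as well.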
Definitely a bug. free() MUST ONLY be used for heap alloc'd memory, unless it's redefined to do something completely different, which I doubt to be the case.