Not getting a segmentation fault when expecting it - c

I'm toying with buffer overflows, but I'm confused by what I'm finding when running the following simple C program on Mac OS.
#include <stdio.h>

int main(void) {
    char buf[2];
    scanf("%s", buf);
    printf("%s\n", buf);
}
By setting the length of buf to 2 bytes, I expected to cause a segmentation fault when entering the string "CCC", but that doesn't happen. Only when entering a string 24 characters in length do I incur a segmentation fault.
What's going on? Is it something to do with character encoding?
Thanks.

The behavior of your program is undefined as soon as you overflow the buffer. Anything can happen. You can't predict it.
There might or might not be some padding bytes after your buffer that happen to be unimportant to your code's execution. You can't rely on that. A different compiler, compiling for 32-bit vs. 64-bit, different debug settings... all of that could alter what happens after the overflow.

Because buf is on the stack. When you overrun it, you start overwriting other parts of the stack, which still belong to your program, so the OS won't necessarily catch it; what actually gets clobbered depends on what else is allocated there (e.g. spill slots for registers created by the compiler). Only once you cross the boundary of the allocated stack does the OS have a chance to raise a segfault.

I guess it's related to the memory layout. If what you are overwriting is accessible to your process (a page mapped writable), the OS has no way to see that you're doing something "wrong".
Indeed, in the eyes of a C programmer, doing something like this is totally wrong. But in the eyes of the OS it's just: "Okay, the process is writing to some page. Is that page mapped with the adequate permissions? If it is, OKAY."
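To see the OS's point of view directly, you can create a page your process owns but is not allowed to write to, and watch the write fault immediately. A minimal sketch of that idea, assuming a POSIX-ish system where mmap/mprotect and MAP_ANON are available; the variable names are just for illustration:
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void) {
    /* Get one page of writable memory from the OS. */
    char *page = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANON, -1, 0);
    if (page == MAP_FAILED) return 1;

    strcpy(page, "writing here is fine");   /* page is mapped writable */
    puts(page);

    /* Now take away write permission and try again. */
    mprotect(page, 4096, PROT_READ);
    page[0] = 'X';   /* violates the page permissions: the OS kills the process */
    return 0;
}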

There is no guarantee that you will get a segmentation fault at all. There is other data after char buf[2]; overwriting it may or may not cause a segmentation fault.

buf is allocated on the stack. If you're just overwriting an area that isn't being used, there is a good chance that nobody will complain about it. On some platforms your code will happily accept whole paragraphs.
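One way to make the overflow visible without waiting for a crash is to put another local variable near the buffer and watch it change. A minimal sketch of the idea; the sentinel name is made up, the exact frame layout is compiler-dependent, and a stack canary or hardened libc may abort the program instead of letting you see the clobber:
#include <stdio.h>

int main(void) {
    int sentinel = 0x11223344;   /* a neighbour on the stack */
    char buf[2];

    /* Type something much longer than 1 character: the extra bytes land
       somewhere after buf. This is undefined behaviour on purpose. */
    scanf("%s", buf);

    /* If the input ran over sentinel, its value is no longer 0x11223344.
       Whether it does depends entirely on how the compiler laid out the
       frame, which is exactly the point. */
    printf("sentinel is now 0x%x\n", sentinel);
    return 0;
}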

Related

New to C, segmentation fault?

I see the error in this code now: in fscanf, the address of buffer needs to be passed (&buffer). Could someone explain the error "Segmentation fault"? I am new to compiling things with gcc, and I don't understand what it is trying to tell me.
int buffer;
char junk;
while (fscanf(fp, "%d%c", buffer, &junk) != EOF)
{
    printf("%d\n", buffer);
}
fclose(fp);
return 0;
}
I note that none of the answers have actually addressed your question:
Could some one explain the error "Segmentation fault"?
In C it is extremely easy to write a program that has "undefined behaviour". You have done so. A program that has undefined behaviour can literally do anything. It can go into an infinite loop. It can give a segmentation fault. It can work normally. It can erase your hard disk after emailing your files to North Korea. It can do anything whatsoever.
An extremely common symptom of undefined behaviour is a segmentation fault. Basically what this means is that you have written a program with undefined behaviour, and you got lucky. Your program attempts to access memory that it has no right to access. And instead of deleting your hard disk, the operating system gives you a segmentation fault. You should be thankful every time you get a segmentation fault; it could have been much, much worse. A seg fault calls attention to the error so that you can fix it easily.
Specifically what is happening here is:
buffer is not initialized to any value. Its value could be any legal integer.
fscanf expects a pointer. Pointers have the property that when dereferenced they turn into a variable. Pointers are often implemented as integers that store the address of the variable's memory location. (Note that pointers are not required to be implemented like this, but it is a common choice.)
Instead of a pointer you are giving fscanf an integer, which it interprets as a pointer to a storage location. But the integer contains any possible integer value.
The operating system maintains a list of memory pages that are known to be in use. If buffer just happens to have a value which, when interpreted as a pointer, happens to refer to a page that is not in use, then the operating system will produce a seg fault when fscanf attempts to turn the pointer into a storage location.
Now think of what could have happened in other circumstances. buffer could have happened to contain an integer which when interpreted as a pointer yields a valid memory address, and that valid memory address might have happened to contain the return address of the current method. And the value put into that location by fscanf might happen to be the address of the "format the hard disk" library routine. You would not get a segmentation fault in that case; instead, when the current method returned, it would format your hard disk instead of terminating the program. Again, make sure you understand this: undefined behaviour can literally do anything.
Most of the time you will get lucky and get a segmentation fault. Do not rely on this safety net! Do not write undefined behaviour in the first place.
As a historical note, the term "segmentation fault" comes from the common practice of the operating system "segmenting" memory into sections for code, for data, and so on. There is of course again no requirement that an operating system do this, but most modern operating systems use some form of segmentation to help catch these sorts of errors.
You were close.
fscanf(fp, "%d%c", &buffer, &junk)
You missed the & operator before buffer in fscanf.
When a function needs an address you have two possibilities: define a pointer (in this case int *buffer), in which case passing buffer will do fine; or, if you use a regular variable, give its address by adding the '&' operator.
Second, I see that you use a variable called "fp". I can't find the definition of that variable in your code.
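Putting the two fixes together, a complete version of the corrected loop might look like the sketch below; the file name "data.txt" and the surrounding main and error handling are assumptions, since the original snippet doesn't show how fp was opened:
#include <stdio.h>

int main(void) {
    /* Assumption: the original code opened some text file into fp. */
    FILE *fp = fopen("data.txt", "r");
    if (fp == NULL) {
        perror("fopen");
        return 1;
    }

    int buffer;
    char junk;

    /* Pass &buffer so fscanf has somewhere to store the number. */
    while (fscanf(fp, "%d%c", &buffer, &junk) != EOF)
    {
        printf("%d\n", buffer);
    }

    fclose(fp);
    return 0;
}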

Strcpy a bigger string to a smaller array of char

Why when I do this:
char teststrcpy[5];
strcpy(teststrcpy,"thisisahugestring");
I get this message in run time:
Abort trap: 6
Shouldn't it just overwrite whatever is to the right of teststrcpy in memory? If not, what does "Abort trap" mean?
I'm using the GCC compiler under Mac OS X.
As a note, and in answer to some comments, I am doing this just to play around with C; I'm not going to do this in production code. Don't you worry, folks! :)
Thanks
I don't own a Mac, but I've read that Mac OS treats overflows differently: it won't allow you to overwrite memory in certain instances, strcpy() being one of them.
On a Linux machine this code successfully overwrites the adjacent stack memory, but on Mac OS it is prevented (Abort trap) by a stack canary.
You might be able to get around that with the gcc option -fno-stack-protector.
OK, since you're seeing an abort from __strcpy_chk, that would mean it's specifically checking strcpy (and probably friends). So in theory you could do the following*:
char teststrcpy[5];
gets(teststrcpy);
Then enter your really long string and it should behave badly, as you wish.
*I am only advising gets in this specific instance, as an attempt to get around the OS's protection mechanisms that are in place. Under NO other circumstances would I suggest anyone use it. gets is not safe.
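If you'd rather not touch gets at all, another way to sidestep the checked strcpy (and, combined with -fno-stack-protector, the canary) is to copy the bytes yourself so that no library call is involved. A minimal sketch of that idea; unchecked_copy is a made-up helper, and at higher optimization levels the compiler might still recognize the loop, so compile with -O0 for the experiment:
#include <stdio.h>

/* A deliberately unchecked copy: no library call, so nothing like
   __strcpy_chk can intercept it. Do not use this outside of experiments. */
static void unchecked_copy(char *dst, const char *src) {
    while ((*dst++ = *src++) != '\0')
        ;
}

int main(void) {
    char teststrcpy[5];
    unchecked_copy(teststrcpy, "thisisahugestring");  /* undefined behaviour on purpose */
    printf("%s\n", teststrcpy);
    return 0;
}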
Shouldn't it just overwrite what is in the right of the memory of teststrcpy?
Not necessarily, it's undefined behaviour to write outside the allocated memory. In your case, something detected the out-of-bounds write and aborted the programme.
In C there is nobody who tells you "this buffer is too small". If you insist on copying too many characters into a buffer that is too small, you are in undefined behavior territory.
If you actually WANT to overwrite what comes after the 5th char of teststrcpy, you are a scary man. You can copy a string of at most 4 characters into teststrcpy (the fifth char SHOULD be reserved for the terminating null).
Most likely your compiler is using a canary for buffer overflow protection and, thus, raising this exception when there is an overflow, preventing you from writing outside the buffer.
See http://en.wikipedia.org/wiki/Buffer_overflow_protection#Canaries

Array is larger than allocated?

I have an array that's declared as char buff[8]. That should only be 8 bytes, but looking at the assembly and testing the code, I only get a segmentation fault when I input something larger than 32 characters into buff, whereas I would expect it for anything larger than 8 characters. Why is this?
What you're saying is not a contradiction:
You have space for 8 characters.
You get an error when you input more than 32 characters.
So what?
The point is that nobody told you that you would be guaranteed to get an error if you input more than 8 characters. That's simply undefined behaviour, and anything can (and will) happen.
You absolutely mustn't think that the absence of obvious misbehaviour is proof of the correctness of your code. Code correctness can only be verified by checking the code against the rules of the language (though some automated tools such as valgrind are an immense help).
Writing beyond the end of the array is undefined behavior. Undefined behavior means nothing (including a segmentation fault) is guaranteed.
In other words, it might do anything. More practically, it's likely the write didn't touch anything protected, so from the point of view of the OS everything is still OK until you pass 32.
This raises an interesting point. What is "totally wrong" from the point of view of C might be OK with the OS. The OS only cares about what pages you access:
Is the address mapped for your process?
Does your process have the necessary rights?
You shouldn't count on the OS slapping you if anything goes wrong. A useful tool for this (slapping) is valgrind, if you are using Unix. It will warn you if your process is doing nasty things, even if those nasty things are technically OK with the OS.
C arrays have no bounds checking.
As others said, you are hitting undefined behavior; as long as you stay inside the bounds of the array, everything works fine. If you cheat, as far as the standard is concerned, anything can happen, including your program seeming to work right as well as the explosion of the Sun.
What happens in practice is that with stack-allocated variables you are likely to overwrite other variables on the stack, getting "impossible" bugs, or, if you hit a canary value put by the compiler, it may detect the buffer overflow on return from the function. For variables allocated in the so-called heap, the heap allocator may have given some more room than requested, so the mistake may be less easy to spot, although you may easily mess up the internal structures of the heap.
In both cases you can also hit a protected memory page, which will result in your program being terminated forcibly (for the stack this happens less often because usually you have to overwrite the entire stack to get to a protected page).
Your declaration char buff[8] sounds like a stack-allocated variable, although it could be heap-allocated if it is part of a dynamically allocated struct. Accessing out of bounds of an array is undefined behaviour and is known as a buffer overrun. Buffer overruns on stack-allocated memory may corrupt the current stack frame and possibly other stack frames in the call stack. With undefined behaviour, anything could happen, including no apparent error. You would not expect a seg fault immediately because the whole stack has typically already been allocated when the thread starts.
For heap allocated memory, memory managers typically allocate large blocks of memory and then sub-allocate from those larger blocks. That is why you often don't get a seg fault when you access beyond the end of a block of memory.
It is undefined behaviour to access beyond the end of a memory block. And it is perfectly valid, according to the standard, for such out of bounds accesses to result in seg faults or indeed an apparently successful read or write. I say apparently successful because if you are writing then you will quite possibly produce a heap corruption by writing out of bounds.
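To see that slack concretely, glibc exposes malloc_usable_size, which reports how big the block you actually received is (a non-standard extension; on Mac OS the analogous call is malloc_size from <malloc/malloc.h>). A small sketch assuming glibc:
#include <stdio.h>
#include <stdlib.h>
#include <malloc.h>   /* glibc: malloc_usable_size */

int main(void) {
    char *p = malloc(8);           /* ask for 8 bytes... */
    if (p == NULL) return 1;

    /* ...but the allocator typically hands back a larger usable chunk
       (often 24 bytes with glibc on 64-bit). Writes within that slack
       won't segfault, even though they are still undefined behaviour
       as far as the C language is concerned. */
    printf("requested 8, usable %zu\n", malloc_usable_size(p));

    free(p);
    return 0;
}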
Unless you are not telling us something, you answered your own question.
Declaring
char buff[8];
means that the compiler reserves 8 bytes of memory. If you try to stuff 32 chars into it you are overflowing the buffer, and you may well get a seg fault.
Each char is one byte (wide characters are another story), so you are trying to put 4x the number of chars that will fit in your buffer.
Is this your first time coding in C?

Exceed the buffer size?

Why is it possible to exceed the buffer size in C up to a certain limit without any error (segmentation fault)?
For example, I was playing with this code:
#include <stdio.h>
#include <string.h>

void function1(char *a) {
    char buf[10];
    strcpy(buf, a);
    printf("End of function1\n");
}

int main(int argc, char *argv[]) {
    function1(argv[1]);
    printf("End of main\n");
}
I was able to pass as an argument up to 23 characters instead of 10 characters without any errors, but when I use 24 characters I get a segmentation fault.
I know that with the 24th character I hit the return address. But what about the previous 13?!
You did get an error: you exceeded the buffer and nothing terrible happened. Naively, something terrible should happen when you exceed a buffer. What you expected did not happen, which is the definition of an error.
I'm not trying to be flippant. My point is a serious one -- if you break the rules, you have no idea what will happen. You might get an error. It might appear fine. Something else might happen. In principle, it's unpredictable. It might change from compiler to compiler, operating system to operating system, or even run to run.
Likely what's happened in this case is that buf is the last thing on the stack and the space after it isn't used for anything critical. So using some of the space after it is harmless. You may eventually hit a critical structure or hit a page that's not writable, resulting in a fault.
That's the beauty of undefined behavior.
For C, writing outside the array is illegal
For your operating system, writing at an unmapped address or at an address mapped with the wrong permissions (read-only) is illegal
These two ideas of what a process is permitted to do don't always match up perfectly.
It's perfectly possible for a C program to do something completely brain-damaged that makes the OS say "that's OK with me" because it's indistinguishable from normal operation.
Back to your question, it's likely the first 13 bytes didn't actually bother the OS (they were written in a valid page). Then the next byte probably touched read-only memory or an unmapped address and the OS had a chance to spot the error.
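If you want to see why the magic number is 24 on your particular build, you can print where buf sits relative to the rest of the frame before doing the copy. A rough sketch; what it prints is entirely implementation-specific, and __builtin_frame_address is a GCC/Clang extension:
#include <stdio.h>
#include <string.h>

void function1(char *a) {
    char buf[10];
    /* Distance from the start of buf to the base of the stack frame
       gives a hint of how many bytes you can scribble before reaching
       the saved frame pointer / return address. */
    printf("buf is at %p, frame base is at %p (%ld bytes apart)\n",
           (void *)buf, __builtin_frame_address(0),
           (long)((char *)__builtin_frame_address(0) - buf));
    strcpy(buf, a);
}

int main(int argc, char *argv[]) {
    if (argc > 1) function1(argv[1]);
    return 0;
}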

Writing more characters than malloced. Why does it not fail?

Why does the following work and not throw some kind of segmentation fault?
char *path = "/usr/bin/";
char *random = "012";
// path + random + \0
// so it's malloc(13), but I get 16 bytes due to memory alignment (I'm on 32-bit)
newPath = (char *) malloc(strlen(path) + strlen(random) + 1);
strcat(newPath, path);
strcat(newPath, "random");
// newPath is now "/usr/bin/012\0", which makes 13 characters.
However, if I add
strcat(newPath, "RANDOMBUNNIES");
shouldn't this call fail, because strcat uses more memory than allocated? Consequently, shouldn't
free(newPath)
also fail because it tries to free 16 bytes but I used 26 bytes ("/usr/bin/012RANDOMBUNNIES\0")?
Thank you so much in advance!
Most often this kind of overrun problem doesn't make your program explode in a cloud of smoke and the smell of burnt sulphur. It's more subtle: the variable that happens to be allocated after the overrun one gets altered, causing unexplainable and seemingly random behavior of the program later on.
The whole program snippet is wrong. You are assuming that malloc() returns something that has at least the first byte set to 0. This is not generally the case, so even your "safe" strcat() is wrong.
But otherwise, as others have said, undefined behavior doesn't mean your program will crash. It only means it can do anything (including crashing, but also not crashing, if you are unlucky).
(Also, you shouldn't cast the return value of malloc().)
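For reference, a cleaned-up sketch of the snippet that addresses those points (copying with strcpy first instead of strcat'ing into uninitialized memory, using the random variable rather than the "random" literal, and dropping the cast) might look like this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    const char *path = "/usr/bin/";
    const char *random = "012";

    /* +1 for the terminating '\0'. */
    char *newPath = malloc(strlen(path) + strlen(random) + 1);
    if (newPath == NULL) return 1;

    strcpy(newPath, path);      /* initializes the buffer, including the terminator */
    strcat(newPath, random);    /* the variable, not the string literal "random" */

    printf("%s\n", newPath);    /* "/usr/bin/012" */
    free(newPath);
    return 0;
}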
Writing more characters than you malloc'd is undefined behavior.
Undefined behavior means anything can happen and the behavior cannot be explained.
A segmentation fault generally occurs when you access an invalid memory section. Here you don't get one because you can still access that memory: your code appears to run fine, but you are overwriting other memory locations, which is undefined behavior.
It may or may not fail, seemingly at random, depending on what happens to live just after the malloc'd memory.
Also, when you want to concatenate random you shouldn't put it in quotes; that should be
strcat(newPath, random);
Many C library functions do not check whether they overrun. It's up to the programmer to manage the memory allocated. You may just be writing over another variable in memory, with unpredictable effects on the operation of your program. C is designed for efficiency, not for pointing out errors in programming.
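When you want the library to respect a size limit, the usual approach is to pass the limit explicitly, for example with snprintf (or strncat with a carefully computed remaining length). A small sketch of the snprintf idiom; the fixed 16-byte buffer is just for illustration:
#include <stdio.h>

int main(void) {
    const char *path = "/usr/bin/";
    const char *random = "012";

    char newPath[16];
    /* snprintf never writes more than sizeof(newPath) bytes, including
       the terminator; it returns the length the full string would have
       had, so truncation can be detected. */
    int needed = snprintf(newPath, sizeof newPath, "%s%s", path, random);
    if (needed >= (int)sizeof newPath) {
        fprintf(stderr, "result truncated (needed %d bytes)\n", needed + 1);
    }

    printf("%s\n", newPath);
    return 0;
}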
You are lucky with this call. You don't get a segfault because the writes presumably stay within an allocated part of the address space. This is undefined behaviour: the last characters you wrote are not guaranteed to survive, and these calls may just as well fail.
Buffer overruns aren't guaranteed to cause a segfault. The behavior is simply undefined. You may get away with writing to memory that's not yours one time, cause a crash another time, and silently overwrite something completely unrelated a third time. Which one of these happens depends on the OS (and OS version), the hardware, the compiler (and compiler flags), and pretty much everything else that is running on your system.
This is what makes buffer overruns such nasty sources of bugs: Often, the apparent symptom shows in production, but not when run through a debugger; and the symptoms usually don't show in the part of the program where they originate. And of course, they are a welcome vulnerability to inject your own code.
Operating systems allocate memory at a certain granularity, which on my system is a page size of 4 KB (typical on 32-bit machines). Whether a malloc() always takes a fresh page from the OS depends on your C runtime library.
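If you want to watch that granularity in action, you can, purely as an experiment in undefined behaviour, keep writing further and further past a tiny allocation until the OS (or a corrupted allocator) finally objects. Where it stops is entirely system-dependent; a rough sketch, best compiled without optimization:
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    char *p = malloc(13);   /* a tiny allocation */
    if (p == NULL) return 1;

    /* Deliberately walk past the end of the block, one byte at a time.
       The program keeps running until a write finally lands on memory
       the process cannot touch, typically somewhere past a page
       boundary rather than at byte 13. Undefined behaviour on purpose. */
    for (size_t i = 0; ; ++i) {
        p[i] = 'x';
        if (i % 4096 == 0)
            fprintf(stderr, "still alive after writing byte %zu\n", i);
    }
}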
