Does buffer overflow still exist? - C

I was watching a university lecture about buffer overflow, and the professor ended up saying that even if we were able to fill the buffer with exploit code and jump into that code, we still could not execute it.
the reasons - he mentioned - are:
programmers avoid the use of functions that cause overflow.
randomized stack offsets: at the start of the program, allocate a random amount of space on the stack to make it difficult to predict the beginning of the inserted code.
use techniques to detect stack corruption.
non-executable code segments: only allow code to execute from "text" sections of memory.
Now I wonder: do buffer overflow attacks still exist nowadays, or are they out of date?
A detailed answer would be very much appreciated!

Not all of us. There's a bunch of new programmers every day. Does our collective knowledge that strcpy is bad get disseminated to them magically? I don't think so.
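To make the strcpy point concrete, here is a minimal sketch (the function names are hypothetical) contrasting the unbounded copy with a bounded one that can never overflow its destination:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Unbounded: strcpy() keeps writing until it sees a NUL in src, with no
 * idea how big dst is. An 8-byte buffer overflows on any 8+ char input. */
void risky_copy(char *dst, const char *src) {
    strcpy(dst, src);                    /* classic overflow-prone call */
}

/* Bounded: snprintf() never writes more than dstsize bytes, including
 * the terminating NUL, so the destination cannot overflow. */
void bounded_copy(char *dst, size_t dstsize, const char *src) {
    snprintf(dst, dstsize, "%s", src);   /* truncates instead of overflowing */
}
```

With an 8-byte destination, `bounded_copy` silently truncates long input to 7 characters plus the NUL, where `risky_copy` would have written past the end of the buffer.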
Difficult, yes. Impossible, no. Any vulnerability that can be turned into an arbitrary read can defeat such protections trivially.
Indeed we can detect stack corruption, under certain circumstances. Canaries, for instance, may themselves be overwritten, their value is compiler-dependent, and they might not protect against all kinds of stack corruption (e.g. GCC's -fstack-protector-strong protects against EIP overwrite, but not against other kinds of overrun).
W^X memory is a reality, but how many OSes have adopted it for the stack? That'd be an interesting little research project for your weekend. :) Additionally, if you look into return-oriented programming (ROP) techniques (return-to-libc is an application of it), you'll see that non-executable memory can also be bypassed.

Related

Is buffer overflow the only possible bug associated with program stack?

Is buffer overflow the only possible bug associated with the C/C++ program stack? Are there any other bugs that can happen on the program stack in a single- or multi-threaded C/C++ program?
I was reading this paper (Learning from Mistakes — A Comprehensive Study on Real World Concurrency Bug Characteristics) on concurrency bugs, and started thinking that such concurrency bugs cannot happen on the stack, as it is private to each thread.
You can attempt to use too much stack, typically resulting in a segmentation fault: the hardware reports that the program has attempted to access memory that is not mapped (or not mapped for the type of access attempted), and the operating system does not handle the fault by changing the memory mapping to provide access.
You can also use pointers or array indices incorrectly (not just buffer overflows but "underflows" going in the other direction, or other incorrect address calculations), corrupting the stack. That can alter program execution in a variety of ways: transferring program control to undesired locations, or corrupting data and causing undesired computations.
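As a small illustration of the stack-exhaustion case, here is a hedged sketch (names are made up) that caps recursion depth so a runaway recursion fails cleanly instead of blowing the stack:

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_DEPTH 10000   /* conservative cap, well under a typical 8 MiB stack */

/* Recurses n times, but refuses to go deeper than MAX_DEPTH frames.
 * Without the guard, a huge n would exhaust the stack and segfault. */
static bool countdown(unsigned n, unsigned depth) {
    if (depth > MAX_DEPTH)
        return false;              /* fail cleanly instead of overflowing */
    if (n == 0)
        return true;
    return countdown(n - 1, depth + 1);
}
```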

Is using canaries for bss or data-sections to detect overflows/smashing useful?

In our GCC-based embedded C system we are using the -ffunction-sections and -fdata-sections options to allow the linker, when linking the final executable, to remove unused (unreferenced) sections. This has worked well for years.
In the same system most of the data structures and buffers are allocated statically (often as static variables at file scope).
Of course we have bugs, sometimes nasty ones, where we would like to quickly exclude the possibility of buffer-overflows.
One idea we have is to place canaries in between each bss section and data section, each section representing exactly one symbol (because of -fdata-sections), much like what the compiler does for function stacks when stack-smashing protection is activated. Checking these canaries could be done from the host by reading the canary addresses from time to time.
Modifying the linker script (placing the sections manually and adding a canary word in between) seems feasible, but does it make sense?
Is there a project or an article in the wild? Using my keywords I couldn't find anything.
Canaries are mostly useful for the stack, since it expands and collapses beyond the programmer's direct control. The things you have in data/bss do not behave like that: either they are static variables, or, in the case of buffers, they should keep within their fixed size, which should be checked with defensive programming in place within the algorithm, rather than with unorthodox tricks.
Also, stack canaries are used specifically in RAM-based, PC-like systems that don't know any better way. In embedded systems, they aren't very meaningful. Some useful things you can do instead:
Memory-map the stack so that it grows into a memory area where writes will yield a hardware exception. For example, your MCU may have the ability to separate executable memory from data memory and raise an exception if you try to execute code in the data area, or write to the executable area.
Ensure that everything in your program that deals with buffers performs its error checks and does not write out of bounds. Static analysis tools are usually decent at spotting out-of-bounds bugs. Even some compilers can do this.
Add lots of defensive programming with static asserts. Check the sizes of structs, buffers etc. at compile time; it's free.
Run-time defensive programming. For example, if (x == good) { ... } else if (x == bad) { ... } is missing a final else, and switch (x) { case A: ... } is missing a default. "But it can't go there in theory!" No, but in practice it can: through runaway code caused by bugs (very likely), flash data-retention failures (100% likely, eventually), or EMI affecting RAM (quite unlikely).
And so on.
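The compile-time and run-time checks above can be sketched like this (the struct, enum, and function names are illustrative only):

```c
#include <assert.h>
#include <stdint.h>

/* Compile-time defensive programming: size checks cost nothing at run-time. */
typedef struct {
    uint8_t  id;
    uint8_t  flags;
    uint16_t length;
} header_t;

_Static_assert(sizeof(header_t) == 4, "header_t must pack into 4 bytes");

enum state { STATE_IDLE, STATE_RUN, STATE_STOP };

/* Run-time defensive programming: every switch gets a default, even for
 * values that "can't happen in theory" (runaway code, corrupted RAM...). */
static int step(enum state s) {
    switch (s) {
    case STATE_IDLE: return 0;
    case STATE_RUN:  return 1;
    case STATE_STOP: return 2;
    default:         return -1;   /* impossible value detected */
    }
}
```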

Feasibility to Bypass Address randomization and Stack Smash Protection - buffer overflow attack

I just went through the logic behind buffer overflow attacks and the protection mechanisms available in kernel versions above 2.6 to mitigate them (address randomization and stack-smashing protection).
Each time, we disable address randomization (writing '0' to the kernel's address-randomization setting) and stack-smashing protection (compiling with -fno-stack-protector) before analyzing buffer overflow attacks.
I'm just curious: is there any way to bypass these protection mechanisms while they are still enforced, without the two steps mentioned above? If any such mechanism exists, I would be glad to hear about it.
The best way I know of to avoid buffer overflow is to make use of 100% fully exhaustive unit tests that check any function that deals with a buffer of any size and type. This is not always realistic, of course.
"exhaustive" means that all possible cases are taken in account, no matter whether your application would ever generate all those specific cases at time you first write your code.
There are tools out there that can help you in that arena, though. Some are quite well automated and will generate unit tests automatically. I have never tried one of those, so I cannot vouch for any of them, but if you are in a time crunch, they could help.
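For instance, a hand-rolled "exhaustive" test of a bounded copy routine (all names here are hypothetical) might walk every source length from empty to well past the destination's capacity:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical buffer routine under test: copy at most cap-1 bytes,
 * always NUL-terminate, and report whether truncation happened. */
static int copy_capped(char *dst, size_t cap, const char *src) {
    size_t n = strlen(src);
    size_t keep = n < cap ? n : cap - 1;
    memcpy(dst, src, keep);
    dst[keep] = '\0';
    return n >= cap;               /* 1 if the input was truncated */
}

/* "Exhaustive" here: every source length from 0 up to well past the cap. */
static void test_copy_capped(void) {
    char src[64], dst[8];
    for (size_t len = 0; len < sizeof src - 1; len++) {
        memset(src, 'x', len);
        src[len] = '\0';
        int truncated = copy_capped(dst, sizeof dst, src);
        assert(strlen(dst) <= sizeof dst - 1);    /* never overflows */
        assert(truncated == (len >= sizeof dst)); /* reports correctly */
    }
}
```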
Another way, which somewhat works, is to run a static analyzer against your code. Coverity is the one I have used in the past, but there are many others too. In most cases, static analysis will only catch problems where you declare fixed-size buffers on your stack, as in:
char buf[256];
...
char a = buf[256]; // <- bug here, although not too bad
buf[256] = a; // <- bug here, could be bad, you're writing to the stack!
Now... under Unix you have two problems with buffer overflows. In most cases it will make your program crash. However, if the hacker has access to your code, they may be able to call a specific system function (a kernel function, to be clear). What is potentially problematic then is if your process runs as an elevated user (worst-case scenario: root). At that point the hacker may have obtained permissions to do more things without your authorization. To mitigate this risk you have two main solutions:
Use a chroot environment; this can be difficult to set up if you are new to Linux, but it works on virtually all Unices.
Use a VirtualBox environment (or some other virtual system like qemu); getting such an environment set up is generally pretty easy, although if you want to automatically generate new environments, there is an API and it can be tedious.
There is one last way, but it can be slow. The CPU has an MMU. You can use the MMU to protect/unprotect memory and ensure that each read and write happens to a buffer that was allocated (in the case of the stack, the frame pointer is used to make sure you are within the correct window). As you can imagine, for each write (and possibly many reads) you get an interrupt, and the handler is not small. It's a good tool to debug software that has many buffer overflows, but in general it is not usable in production.
Unfortunately, none of these options are part of the g++ suite by default.

Allocating a new call stack

(I think there's a high chance of this question either being a duplicate or otherwise answered here already, but searching for the answer is hard thanks to interference from "stack allocation" and related terms.)
I have a toy compiler I've been working on for a scripting language. In order to be able to pause the execution of a script while it's in progress and return to the host program, it has its own stack: a simple block of memory with a "stack pointer" variable that gets incremented using the normal C code operations for that sort of thing and so on and so forth. Not interesting so far.
At the moment I compile to C. But I'm interested in investigating compiling to machine code as well - while keeping the secondary stack and the ability to return to the host program at predefined control points.
So... I figure it's not likely to be a problem to use the conventional stack registers within my own code, I assume what happens to registers there is my own business as long as everything is restored when it's done (do correct me if I'm wrong on this point). But... if I want the script code to call out to some other library code, is it safe to leave the program using this "virtual stack", or is it essential that it be given back the original stack for this purpose?
Answers like this one and this one indicate that the stack isn't a conventional block of memory, but that it relies on special, system specific behaviour to do with page faults and whatnot.
So:
is it safe to move the stack pointers into some other area of memory? Stack memory isn't "special"? I figure threading libraries must do something like this, as they create more stacks...
assuming any area of memory is safe to manipulate using the stack registers and instructions, I can think of no reason why it would be a problem to call any functions with a known call depth (i.e. no recursion, no function pointers) as long as that amount is available on the virtual stack. Right?
stack overflow is obviously a problem in normal code anyway, but would there be any extra-disastrous consequences to an overflow in such a system?
This is obviously not actually necessary, since simply returning the pointers to the real stack would be perfectly serviceable, or for that matter not abusing them in the first place and just putting up with fewer registers, and I probably shouldn't try to do it at all (not least due to being obviously out of my depth). But I'm still curious either way. Want to know how these sorts of things work.
EDIT: Sorry of course, should have said. I'm working on x86 (32-bit for my own machine), Windows and Ubuntu. Nothing exotic.
All of these answers are based on "common processor architectures", and since this involves generating assembler code, it has to be target-specific: if you decide to do this on processor X, which has some weird handling of its stack, the below is obviously not worth the screen surface it's written on [substitute for paper]. For x86 in general, the below holds unless otherwise stated.
is it safe to move the stack pointers into some other area of memory?
Stack memory isn't "special"? I figure threading libraries
must do something like this, as they create more stacks...
The memory as such is not special. This does however assume that it's not on an x86 architecture where the stack segment is used to limit stack usage. Whilst that is possible, it's rather rare to see in an implementation. I know that some years ago Nokia had a special operating system using segments in 32-bit mode. As far as I can think of right now, that's the only one I've had any contact with that actually uses the stack segment the way x86 segmentation describes.
Assuming any area of memory is safe to manipulate using the stack
registers and instructions, I can think of no reason why it would be a
problem to call any functions with a known call depth (i.e. no
recursion, no function pointers) as long as that amount is available
on the virtual stack. Right?
Correct. Just as long as you don't expect to be able to get back to some other function without switching back to the original stack. A limited level of recursion would also be acceptable, as long as the stack is deep enough [there are certain types of problems that are definitely hard to solve without recursion; binary tree traversal, for example].
stack overflow is obviously a problem in normal code anyway,
but would there be any extra-disastrous consequences to an overflow in
such a system?
Indeed, it would be a tough bug to crack if you are a little unlucky.
I would suggest that you use a call to VirtualProtect() (Windows) or mprotect() (Linux etc.) to mark the "end of the stack" as unreadable and unwritable, so that if your code accidentally walks off the stack it crashes properly, rather than invoking some other, more subtle undefined behaviour. It's not guaranteed that the memory just below (at lower addresses) is unavailable, so you could overwrite some other useful things if the code does walk off the stack, and that would cause some very hard-to-debug bugs.
Add a bit of code that occasionally checks the stack depth: you know where your stack starts and ends, so it shouldn't be hard to check whether a particular stack value is outside that range. Give yourself some extra buffer space between the top of the stack and the "we're dead" zone that you protected (a "crumple zone", as it would be called if this were a car in a crash). You can also fill the entire stack with a recognisable pattern and check how much of it is untouched.
Typically, on x86, you can use the existing stack without any problems so long as:
you don't overflow it
you don't move the stack pointer register upward (with pop or add esp, positive_value) beyond where your code started (if you do, interrupts, asynchronous callbacks (signals), or any other activity using the stack will trash its contents)
you don't cause any CPU exception (if you do, the exception handling code might not be able to unwind the stack to the nearest point where the exception can be handled)
The same applies to using a different block of memory for a temporary stack and pointing esp to its end.
The problem with exception handling and stack unwinding has to do with the fact that your compiled C and C++ code contains exception-handling-related data structures, such as ranges of eip values with links to their respective exception handlers (this tells where the closest exception handler is for every piece of code), plus some information for identifying the calling function (i.e. where the return address is on the stack, etc.), so exceptions can bubble up. If you just plug raw machine code into this "framework", you won't properly extend these exception-handling data structures to cover it, and if things go wrong, they'll likely go very wrong (the entire process may crash or become damaged, despite you having exception handlers around the generated code).
So, yeah, if you're careful, you can play with stacks.
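If you'd rather not hand-roll the ESP switching, POSIX ucontext (obsolescent, but still present on Linux) does exactly this dance: it runs a function on a heap-allocated stack and switches back when it returns. A minimal sketch, with made-up names:

```c
#include <assert.h>
#include <stdlib.h>
#include <ucontext.h>

static ucontext_t main_ctx, script_ctx;
static int ran_on_new_stack = 0;

static void script_entry(void) {
    int marker;                        /* this local lives on the new stack */
    (void)marker;
    ran_on_new_stack = 1;
    /* returning falls through to uc_link, i.e. back to main_ctx */
}

/* Run script_entry() on a separate, heap-allocated stack. */
static int run_on_new_stack(size_t stack_size) {
    void *stack = malloc(stack_size);
    if (!stack)
        return -1;
    getcontext(&script_ctx);
    script_ctx.uc_stack.ss_sp = stack;
    script_ctx.uc_stack.ss_size = stack_size;
    script_ctx.uc_link = &main_ctx;    /* where to go when the function returns */
    makecontext(&script_ctx, script_entry, 0);
    swapcontext(&main_ctx, &script_ctx);  /* switch stacks, run, switch back */
    free(stack);
    return 0;
}
```

This is essentially what coroutine and green-thread libraries do under the hood; a compiler emitting machine code would do the same register juggling directly.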
You can use any region you like for the processor's stack (modulo the memory protections).
Essentially, you simply load the ESP register ("MOV ESP, ...") with a pointer to the new area, however you managed to allocate it.
You have to have enough for your program, whatever it might call (e.g. a Windows OS API), and whatever funny behaviours the OS has. You might be able to figure out how much space your code needs; a good compiler can easily do that. Figuring out how much is needed by Windows is harder; you can always allocate "way too much", which is what Windows programs tend to do.
If you decide to manage this space tightly, you'll probably have to switch stacks to call Windows functions. Even that won't be enough; you'll likely get burned by various Windows surprises. I describe one of them here: Windows: avoid pushing full x86 context on stack. I have mediocre solutions, but no good ones for this.

What is address space layout randomization? [duplicate]

Possible Duplicate:
Memory randomization as application security enhancement?
Can someone please explain what address space layout randomization is and how it is implemented? How does this technique affect the stack, heap, and static data? I am also interested in any papers that explain address space layout randomization.
ASLR is a technique designed to make various types of buffer overruns more difficult to exploit, by moving segments around a bit. The stack could be shifted a few bytes (or pages), the sections of your program (and even the libraries your code uses) can be loaded at different addresses, etc.
Buffer overflows usually work by tricking the CPU into running code at a certain address (often on the stack). ASLR complicates that by making the address harder to predict, since it can change every time the program runs. So often, instead of running arbitrary code, the program will just crash. That is obviously a bad thing, but not as bad as some random joker being allowed to take control of your server.
A very simple, crude form of ASLR can actually be implemented without any help from the OS, by simply subtracting some small random amount from the stack pointer at startup. (It's a little tricky to do in higher-level languages, but somewhat simpler in C, and downright trivial in ASM.) That will only protect against overflows that target the stack, though. The OS can be more helpful; it can move all sorts of things around if it feels like it. How much it does depends on your OS.
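The C trick alluded to can be sketched with a variable-length array that eats an arbitrary amount of stack before anything else runs (a toy illustration, not real ASLR; all names here are invented):

```c
#include <assert.h>
#include <stdint.h>

/* Returns the address of a local, i.e. roughly where the stack is now. */
static uintptr_t probe_stack_addr(void) {
    int local;
    return (uintptr_t)&local;
}

/* A VLA consumes `shift` bytes of stack before the next call, so code
 * running "below" it sees its locals at a shifted address. Real ASLR
 * would pick `shift` at random, once, at program start. */
static uintptr_t run_with_shifted_stack(unsigned shift) {
    volatile char pad[shift > 0 ? shift : 1];
    pad[0] = 0;                  /* keep the VLA from being optimized away */
    return probe_stack_addr();
}
```

Calling `run_with_shifted_stack` with two sufficiently different shift values yields different addresses for the same local variable, which is the whole point of the technique.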
