Function call after buffer overflow - c

I've seen a video: https://www.youtube.com/watch?v=AXQefYKWjz4
I don't understand 2 things:
1. I can't see the function call in the code, yet the function still runs.
2. How did he get the specific number that he wrote to the file?
He seems to be writing a specific value (perhaps the address of a function) to some position on the stack. Why is this possible? How can I repeat it?

First of all, what happens here is that he stores the hardcoded address of the function foo() in the 'file' that he reads into the variable 'x'. He stored it as '134513853', which converted to hexadecimal is 0x80484bd; this must be the address of the function foo().
So, in order of execution,
the program reads the address of foo() from the file and copies it into x. Then it overwrites the buffer with this address such that after it overflows the buffer, it overwrites the return address.
For example:
If this is what the function stack looks like,
Buffer----------------->
EBP ----------------->
Return address --------> some 0x value <--- EIP
Post overflow it will look like this:
Buffer-----------------> 0x80484bd
EBP--------------------> 0x80484bd
Return Address---------> 0x80484bd <----EIP
Let's not bother with little-endian byte order for now. When the function main() ends, execution resumes from the address stored at the 'Return address', thereby diverting execution to the function foo() and printing the string "Welcome to my...".
As for your second question, I think the guy who made the video has disabled ASLR and stack cookies.
ASLR or Address Space Layout Randomization randomizes key parts of the executable such that a function exists at different addresses on every new instance.
A stack cookie (or canary) is a random value generated at run time and placed between the local variables and the return address, so that any overflow has to overwrite the cookie first. The cookie is checked before the function returns; if it does not match, the program aborts instead of returning, so execution is never diverted to the attacker-controlled return address.
In order to repeat this, you will have to disable ASLR on your system. On Ubuntu this can be done by typing the following in a terminal:
echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
Then, you will have to compile your program without the stack cookie in the following way:
gcc -fno-stack-protector -z execstack -o test test.c
For more information:
ASLR: http://en.wikipedia.org/wiki/Address_space_layout_randomization
Stack canaries: http://en.wikipedia.org/wiki/Buffer_overflow_protection#Canaries
Hope this helps.

A buffer overflow is a bug that lets a program write past the end of a buffer, potentially over its own stack memory.
This can do two things:
overwrite the saved return address (the value later loaded into the instruction pointer, IP) with an arbitrary value
(over)write data on the stack that is meant to be used as function parameters.
When the current function exits, the instruction pointer is set to the address saved on the stack. If that address is that of malicious code, the code will be executed as if it were the program's own.
This could allow one to execute code with, for example, root privileges, if done inside a program that runs with such privileges.

Related

Running own code with a buffer overflow exploit

I am trying to understand the buffer overflow exploit and more specifically, how it can be used to run own code - e.g. by starting our own malicious application or anything similar.
While I do understand the idea of the buffer overflow exploit using the gets() function (overwriting the return address with a long enough string and then jumping to the said address), there are a few things I am struggling to understand in real application, those being:
Do I put my own code into the string just behind the return address? If so, how do I know the address to jump to? And if not, where do I jump and where is the actual code located?
Is the actual payload that runs the code my own software that's running and the other program just jumps into it or are all the instructions provided in the payload? Or more specifically, what does the buffer overflow exploit implementation actually look like?
What can I do when the address (or any instruction) contains 0? gets() function stops reading when it reads 0 so how is it possible to get around this problem?
As a homework, I am trying to exploit a very simple program that just asks for an input with gets() (ASLR turned off) and then prints it. While I can find the memory address of the function which calls it and the return, I just can't figure out how to actually implement the exploit.
You understand how changing the return address lets you jump to an arbitrary location.
But as you have correctly identified, you don't know where the code you want to execute has been loaded. You just copied it into a local buffer (which is most likely somewhere on the stack).
But there is something that always points to this stack: the stack pointer register. (Let's assume x64, where it is %rsp.)
Assume your custom code is at the top of the stack. (It could be at an offset, but that can be handled similarly.)
Now we need an instruction that
1. Allows us to jump to the address held in the stack pointer
2. Is located at a fixed address.
Most binaries use some kind of shared library. On Windows you have kernel32.dll. With ASLR turned off, such a library is mapped at the same address in every program that loads it, so you know the exact location of every instruction in it.
All you have to do is disassemble one such library and find an instruction like
jmp *%rsp // or a sequence of instructions that lets you jump to an offset
Then the address of this instruction is what you will place where the return address is supposed to be.
The function will then return and jump to the stack (of course you need an executable stack for this). Then it will execute your arbitrary code.
Hope that clears some confusion on how to get the exploit running.
To answer your other questions -
Yes, you can place your code in the buffer directly. Or, if you can find the exact code you want to execute (again in a shared library), you can simply jump to that.
gets() stops reading at a newline (\n), and string functions such as strcpy() stop at a 0 byte. But usually you can get away with changing your instructions a bit so that the code doesn't contain these bytes at all.
You try different instructions and check the assembled bytes.

GDB: Find stack memory address where return address of a function is stored?

I'm working on producing a buffer overflow on my Raspberry Pi (ASLR disabled).
I have a program, which has a main function, a vulnerable function and a function which should not be called, the evil function.
My main function calls the vulnerable function at some point, but the evil function obviously never gets called. I need to make sure it does, using a buffer overflow.
So what I have got so far is the return address of the vulnerable function in the main function, which I want to overwrite with the starting address of the evil function. I think this is the correct approach.
However I wasn't able to figure out how I examine the memory in gdb in such a way so that I find at what stack address the return address is stored. There is an example available, which inputs a string of characters through gdb while the program is running, then they look up the memory around the stackpointer and somehow that is where the return address is stored. This seems rather weird to me, since how could they know that their input gets stored just a couple addresses away from the desperately wanted return address.
My question is if I can 'search' the stack for my return address using gdb.
The Raspberry Pi runs on an ARM processor, so you should read more about the ARM architecture and calling convention.
You should read more about ARM registers, especially the stack pointer (abbreviated SP) and the link register (abbreviated LR: this is where the return address of a function is placed when it is called). See for instance this question for a good explanation.
To inspect the values of those registers in gdb, you can use the command info registers (i r for short). See the doc for more details.

Examining local variables returned function

I have a coredump of a process that has crashed (hard to reproduce).
I have figured out that something goes wrong in a function that has just returned (it returned a NULL pointer rather than a non-NULL pointer).
It would be of great help for me to know the contents of the stack variables in that function. I think on most architectures, returning from a function just means changing the stack pointer. In other words, those values are still there (below the stack pointer then if we take x86 as an example).
Can anyone confirm my reasoning is correct and maybe provide an example of how to do this with gdb?
Does my reasoning also hold for MIPS ?
Local variables might have been stored on stack, but not necessarily. If there is only a small number of variables that fit into registers and code is optimized, then local variables were never saved on stack.
Depending on calling convention used, final values of local variables may still persist in registers.
Disassemble the function in question (you can use objdump -dS to do this, so you can easily correlate it with the source). See how local variables were accessed. Were they stored in memory or in registers? Were the registers already restored to the values relevant for the caller?
If the original register value was not restored, you can just examine the register that was used to store the local. If it was already restored, then the value is probably lost.
If local values were stored on the stack, then the function prologue (first instructions) should tell you how the stack and frame pointers were manipulated. Taking into account that the call also saved the PC on the stack, you can calculate the value of the stack/frame pointer used in that function. Then use x to examine memory locations.
Depending on called function, you could also be able to examine its arguments (when called) and recalculate the value of local variables.
You may see local variables that haven't been optimised out using:
info locals
It may not work in a function that has already returned, though. If you can run that program again, try to put a breakpoint just before the function returns.
Otherwise, you can manually investigate the stack using x/x, and use info registers to find the stack pointer address.
You may then browse the stack frames using up and down.

Understanding Format String Exploits

I'm learning various exploits and I can't quite get my head around Format String exploits. I've got a fairly simple program set up in an environment that allows the exploit.
int woah(char *arg){
    char buf[200];
    snprintf(buf, sizeof buf, arg);
    return 0;
}
I'm able to control the arg being passed into the function, which is how the attack will happen, with the end result of the program running my shellcode and giving me root. Making the program crash is easy: just feed it "%s%s" and it segfaults. We want to do more than that, so we feed it something like "AAAA%x%x%x%x%x%x%x". Looking at the program in gdb, we look at the buffer right after the snprintf and we can see:
"AAAA849541414141353934....blah blah blah"
That's good! We can see the A's on the stack as well as the 41s, which is A in hex. But then what comes next? I get that the general idea here is to overwrite the instruction pointer with four bytes by having the address at the start of the string that we feed in, and then somewhere along the line we have it pointing to our shellcode.
How would I find the address of the saved EIP (the return address) to overwrite?
When snprintf() is called, a stack frame (a region of stack memory) is created for it. Before that happens, the call instruction saves the address of the instruction that follows the call, the return address, so the program knows where to continue. This address is kept alongside the stack frame, so when the frame unwinds, that is, when the function finishes its work, execution goes back to that address and the program continues to run. What you are trying to do is overwrite this address with the address of your shellcode. Research more on stack frames, ESP, EBP and EIP to get the idea.

The Notesearch Exploit anomalies (Hacking: Art of Exploitation)

This question is about the exploit for the program notesearch on pg 121 of the book Hacking: Art of Exploitation 2nd Edition.
There is something I do not understand in the exploit:
When the system executes ./notesearch 'xyz....', the argument 'xyz...' overflows the string buffer in the child program, thereby overwriting the return address. That much is clear.
The assumption here is that the notesearch program's stack frame comes on top of the calling exploit's stack frame. This holds true when the compiled versions exist on the same system.
My first question is 1. Will this work even as a remote hack?
My second question is
2. Since the buffer has been used to overwrite all variables including and beyond the return address, how does the notesearch program work as intended?
Variables like "printing" etc which sit in this stackframe and decide whether messages are printed or not all seem to work fine.
Even though the calling functions sit on top of the relevant stack frame, where the string buffer being flooded sits, there are certain key variables which would have been overwritten.
Question no. 3.
Given that String buffer is part of a new stack frame pushed in after execution of notesearch starts, the buffer overwrites all the given variables in that notesearch program. Also the buffer is the value for the search string. By the program logic since the search string does not match with message, the program should not output details of the User messages. In this case, the messages appear. I want to know why?
(For reference: the book is http://www.tinker.tv/download/hacking2_sample.pdf and the code is downloadable for free from http://www.nostarch.com/hacking2.htm.)
Keep reading the book; another example is given on page 122, and then there's plenty of explanatory text that tells all about the exploits.
Here's the relevant part of notesearch's code:
int main(int argc, char *argv[]) {
    int userid, printing=1, fd; // file descriptor
    char searchstring[100];
    if(argc > 1) // If there is an arg
        strcpy(searchstring, argv[1]); // that is the search string
    else // otherwise
        searchstring[0] = 0; // search string is empty
    userid = getuid();
    fd = open(FILENAME, O_RDONLY); // open the file for read-only access
You wrote:
The assumption here is that the notesearch program's stack frame comes on top of the calling exploit's stack frame.
No, that's wrong. There is only one stack frame that's relevant here: the stack frame of the main() function in notesearch. The fact that we invoke ./notesearch xyz... via a system() call inside exploit_notesearch is irrelevant; we could just as well invoke ./notesearch xyz... directly on the bash command line, or trick some other process (such as, you know, bash) into executing it on our behalf.
Will this work even as a remote hack?
Of course.
Since the buffer has been used to overwrite all variables including and beyond the return address, how does the notesearch program work as intended?
Well, it doesn't really work as intended. Look at the output again:
reader@hacking:~/booksrc $ gcc exploit_notesearch.c
reader@hacking:~/booksrc $ ./a.out
[DEBUG] found a 34 byte note for user id 999
[DEBUG] found a 41 byte note for user id 999
-------[ end of note data ]-------
sh-3.2#
Giving you a shell clearly doesn't count as "working as intended". But even before that, the program claims to find notes for userid 999 in /var/notes, which might indicate that it's a little bit confused. In our role as the malicious hacker, we don't care about this garbage output from the notesearch program; all we care about is that it eventually reaches the end of main() and returns to our shellcode, giving us access to the shell.
But, if you're wondering how we managed to overwrite the return address without overwriting local variables userid, printing, and fd, there are at least three obvious possibilities:
A. Maybe those variables are allocated below searchstring on the stack.
B. Maybe those variables are allocated in registers instead of on the stack.
C. Overwhelmingly likely, those variables are being overwritten, but their initial values simply don't matter to the program. For example, userid can get any value at all, because that garbage value will immediately be overwritten with getuid() on the next line. The only variable whose initial value matters is printing. And even printing changes the behavior of the program only if it happens to get the value 0 — and it can't get the value 0, because the data we're copying in consists entirely of non-zero bytes, by design.
I think you don't really understand what a buffer overflow is. The searchstring variable is allocated 100 bytes on the stack. Now you are copying a larger chunk of data into searchstring without checking its length. Therefore the data overflows into other parts of the stack frame of notesearch's main function. The return address is also overwritten. That's how it works.
I think that the most important assumption here is that the notesearch stack is similar to that of exploit_notesearch. That is why he uses an exploit_notesearch local variable (unsigned int i) to calculate ret. He assumes (of course, knowing the source code of notesearch) that when both programs are loaded in memory they will have similar frame addresses (around 0xffff7..)
Of course, the two programs do not share memory; they are different processes.

Resources