I'm experimenting with buffer overflows and try to overwrite the return address of the stack with a certain input of fgets
This is the code:
void foo()
{
fprintf(stderr, "You did it.\n");
}
void bar()
{
char buf[20];
puts("Input:");
fgets(buf, 24, stdin);
printf("Your input:.\n", strlen(buf));
}
int main(int argc, char **argv)
{
bar();
return 0;
}
On a normal execution the program just returns your input. I want it to output foo() without modifying the code.
My idea was to overflow the buffer of buf by entering 20 'A's. This works and causes a segmentation fault.
My next idea was to find out the address of foo() which is \x4006cd and append this to the 20 'A's.
From my understanding this should overwrite the return address of the stack and make it jump to foo. But it only causes a segfault.
What am I doing wrong?
Update: Assembler dumps
main
Dump of assembler code for function main:
0x000000000040073b <+0>: push %rbp
0x000000000040073c <+1>: mov %rsp,%rbp
0x000000000040073f <+4>: sub $0x10,%rsp
0x0000000000400743 <+8>: mov %edi,-0x4(%rbp)
0x0000000000400746 <+11>: mov %rsi,-0x10(%rbp)
0x000000000040074a <+15>: mov $0x0,%eax
0x000000000040074f <+20>: callq 0x4006f1 <bar>
0x0000000000400754 <+25>: mov $0x0,%eax
0x0000000000400759 <+30>: leaveq
0x000000000040075a <+31>: retq
End of assembler dump.
foo
Dump of assembler code for function foo:
0x00000000004006cd <+0>: push %rbp
0x00000000004006ce <+1>: mov %rsp,%rbp
0x00000000004006d1 <+4>: mov 0x200990(%rip),%rax # 0x601068 <stderr##GLIBC_2.2.5>
0x00000000004006d8 <+11>: mov %rax,%rcx
0x00000000004006db <+14>: mov $0x15,%edx
0x00000000004006e0 <+19>: mov $0x1,%esi
0x00000000004006e5 <+24>: mov $0x400804,%edi
0x00000000004006ea <+29>: callq 0x4005d0 <fwrite#plt>
0x00000000004006ef <+34>: pop %rbp
0x00000000004006f0 <+35>: retq
End of assembler dump.
bar:
Dump of assembler code for function bar:
0x00000000004006f1 <+0>: push %rbp
0x00000000004006f2 <+1>: mov %rsp,%rbp
0x00000000004006f5 <+4>: sub $0x20,%rsp
0x00000000004006f9 <+8>: mov $0x40081a,%edi
0x00000000004006fe <+13>: callq 0x400570 <puts#plt>
0x0000000000400703 <+18>: mov 0x200956(%rip),%rdx # 0x601060 <stdin##GLIBC_2.2.5>
0x000000000040070a <+25>: lea -0x20(%rbp),%rax
0x000000000040070e <+29>: mov $0x18,%esi
0x0000000000400713 <+34>: mov %rax,%rdi
0x0000000000400716 <+37>: callq 0x4005b0 <fgets#plt>
0x000000000040071b <+42>: lea -0x20(%rbp),%rax
0x000000000040071f <+46>: mov %rax,%rdi
0x0000000000400722 <+49>: callq 0x400580 <strlen#plt>
0x0000000000400727 <+54>: mov %rax,%rsi
0x000000000040072a <+57>: mov $0x400821,%edi
0x000000000040072f <+62>: mov $0x0,%eax
0x0000000000400734 <+67>: callq 0x400590 <printf#plt>
0x0000000000400739 <+72>: leaveq
0x000000000040073a <+73>: retq
End of assembler dump.
You did not count with memory aligment. I changed the code a litte bit to make it easier to find the right spot.
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
int **x;
int z;
void foo()
{
fprintf(stderr, "You did it.\n");
}
void bar()
{
char buf[2];
//puts("Input:");
//fgets(buf, 70, stdin);
x = (int**) buf;
for(z=0;z<8;z++)
printf("%d X=%x\n", z, *(x+z));
*(x+3) = foo;
printf("Your input: %d %s\n", strlen(buf), buf);
}
int main(int argc, char **argv)
{
printf("Foo: %x\n", foo);
printf("Main: %x\n", main);
bar();
return 0;
}
With a smaller buffer, 2 in my example, I found the return address 24 bytes away (x+3, for 8 byte pointers; 64 bits, no debug, no optimization...) from the beginning of the buffer. This position can change depending on the buffer size, architecture, etc. In this example, I manage to change the return address of bar to foo. Anyway you will get a segmentation fault at foo return, as it was not properly set to return to main.
I added x and z as global vars to not change the stack size of bar. The code will display an array of pointer like values, starting at buf[0]. In my case, I found the address in main in the position 3. That's why the final code has *(x+3) = foo. As I said, this position can change depending on compilation options, machine etc. To find the correct position, find the address of main (printed before calling bar) on the address list.
It is important to note that I said address in main, not the address of main, bacause the return address was set to the line after the call to bar and not to the beginning of main. So, in my case, it was 0x4006af instead of 0x400668.
In your example, with 20 bytes buffer, as far as I know, it was aligned to 32 bytes (0x20).
If you want to do the same with fgets, you have to figure out how to type the address of foo, but if you are running a x86/x64 machine, remember to add it in little enddian. You can change the code to display the values byte per byte, so you can get them in the right order and type them using ALT+number. Remember that the numbers you type while holding ALT are decimal numbers. Some terminals won't be friendly handling 0x00.
My output looks like:
$ gcc test.c -o test
test.c: In function ‘bar’:
test.c:21: warning: assignment from incompatible pointer type
$ ./test
Foo: 400594
Main: 400668
0 X=9560e9f0
1 X=95821188
2 X=889350f0
3 X=4006af
4 X=889351d8
5 X=0
6 X=0
7 X=95a1ed1d
Your input: 5 ▒▒`▒9
You did it.
Segmentation fault
void bar()
{
char buf[20];
puts("Input:");
fgets(buf, 24, stdin);
printf("Your input:.\n", strlen(buf));
}
... This works and causes a segmentation fault...
The compiler is probably replacing fgets with a safer variant that includes a check on the destination buffer size. If the check fails, the the prgram unconditionally calls abort().
In this particular case, you should compile the program with -U_FORTIFY_SOURCE or -D_FORTIFY_SOURCE=0.
Related
I am following this buffer overflow tutorial: https://insecure.org/stf/smashstack.html
I want to make this program work in Windows
#include <stdio.h>
void f(int x, int y)
{
char buffer1[5];
char buffer2[10];
int *ret = buffer1 + 12;
(*ret) += 7;
}
int main()
{
int x;
x = 10;
f(1, 2);
x = 21;
printf("%d\n", x);
return 0;
}
This program attempts to modify the return address of function f so that this line x = 21; is ignored in main (ie. the program jumps directly to executing printf).
For some reason this trivial buffer overflow attack didn't work in Windows and I am not sure why.
This is how I understand the stack layout after the function f is called in x64 machines
high address
...
return address (4 bytes)
saved frame pointer (4 bytes)
1 (first argument for f; 4 bytes)
2 (second argument for f; 4 bytes)
buffer1 (1*8=8 bytes)
buffer2 (1*12=12 bytes)
...
low address
Using gdb, I get the following disassembly
Dump of assembler code for function main:
0x0000000000401590 <+0>: push %rbp
0x0000000000401591 <+1>: mov %rsp,%rbp
0x0000000000401594 <+4>: sub $0x30,%rsp
0x0000000000401598 <+8>: callq 0x401690 <__main>
0x000000000040159d <+13>: movl $0xa,-0x4(%rbp)
0x00000000004015a4 <+20>: mov $0x2,%edx
0x00000000004015a9 <+25>: mov $0x1,%ecx
0x00000000004015ae <+30>: callq 0x401560 <f>
0x00000000004015b3 <+35>: movl $0x15,-0x4(%rbp)
0x00000000004015ba <+42>: mov -0x4(%rbp),%eax
0x00000000004015bd <+45>: mov %eax,%edx
0x00000000004015bf <+47>: lea 0x2a3a(%rip),%rcx # 0x404000
0x00000000004015c6 <+54>: callq 0x402b70 <printf>
0x00000000004015cb <+59>: mov $0x0,%eax
0x00000000004015d0 <+64>: add $0x30,%rsp
0x00000000004015d4 <+68>: pop %rbp
0x00000000004015d5 <+69>: retq
End of assembler dump.
So in order to get return address in the stack, I must add 4+4+4=12 bytes. From the disassembly, in order to skip x = 21;, I must skip movl $0x15,-0x4(%rbp). So I need to add the additional 42-35=7 bytes to the return address.
Is there anything wrong with my understanding so far?
I'm having a difficult to debug a program at assembly level because GDB is jumping some parts of the code. The code is:
#include <stdio.h>
#define BUF_SIZE 8
void getInput(){
char buf[BUF_SIZE];
gets(buf);
puts(buf);
}
int main(int argc, char* argv){
printf("Digite alguma coisa, tamanho do buffer eh: %d\n", BUF_SIZE);
getInput();
return 0;
}
The program was compiled with gcc -ggdb -fno-stack-protector -mpreferred-stack-boundary=4 -o exploit1 exploit1.c
In gdb, I added break getInput and when I run disas getInput it returns me:
Dump of assembler code for function getInput:
0x00000000004005cc <+0>: push %rbp
0x00000000004005cd <+1>: mov %rsp,%rbp
0x00000000004005d0 <+4>: sub $0x10,%rsp
0x00000000004005d4 <+8>: lea -0x10(%rbp),%rax
0x00000000004005d8 <+12>: mov %rax,%rdi
0x00000000004005db <+15>: mov $0x0,%eax
0x00000000004005e0 <+20>: callq 0x4004a0 <gets#plt>
0x00000000004005e5 <+25>: lea -0x10(%rbp),%rax
0x00000000004005e9 <+29>: mov %rax,%rdi
0x00000000004005ec <+32>: callq 0x400470 <puts#plt>
0x00000000004005f1 <+37>: nop
0x00000000004005f2 <+38>: leaveq
0x00000000004005f3 <+39>: retq
If I type run I noticed that the program stops at the line 0x00000000004005d4 and not in the first line of the function 0x00000000004005cc as I expected. Why is this happening?
By the way, this is messing me up because I'm noticing that some extra data is being added to the Stack and I want to see step by step the stack growing.
If I type run I noticed that the program stops at the line 0x00000000004005d4 and not in the first line of the function 0x00000000004005cc as I expected.
Your expectation is incorrect.
Why is this happening?
Because when you set breakpoint via break getInput, GDB sets the breakpoint after function prolog. From documentation:
-function function
The value specifies the name of a function. Operations on function locations
unmodified by other options (such as -label or -line) refer to the line that
begins the body of the function. In C, for example, this is the line with the
open brace.
If you want to set breakpoint on the first instruction, use break *getInput instead.
Documentation here and here.
I am trying to modify the following C program so that the main function will skip the printf("x is 1") line and only print "x is 0".
void func(char *str) {
char buffer[24];
int *ret;
ret = buffer + 28; // Supposed to set ret to the return address of func
(*ret) += 32; // Add the offset needed so that func will skip over printf("x is 1")
strcpy(buffer, str);
}
int main(int argc, char **argv) {
int x;
x = 0;
func(argv[1]);
x = 1;
printf("x is 1");
printf("x is 0");
getchar();
}
As the comments imply, the ret pointer needs to first be set to the return address of the function. I then need to add on an offset that will push it over the line I want to skip. I am running this code on a Linux system with 2 x Intel(R) Xeon(TM) CPU 3.20GHz. I am using gcc version 4.7.2 (Debian 4.7.2-5) to compile. I'm also trying to use example3.c from this (http://insecure.org/stf/smashstack.html) link for reference. Here is a disassembly of the main function using gdb:
Dump of assembler code for function main:
0x0000000000400641 <+0>: push %rbp
0x0000000000400642 <+1>: mov %rsp,%rbp
0x0000000000400645 <+4>: sub $0x20,%rsp
0x0000000000400649 <+8>: mov %edi,-0x14(%rbp)
0x000000000040064c <+11>: mov %rsi,-0x20(%rbp)
0x0000000000400650 <+15>: movl $0x0,-0x4(%rbp)
0x0000000000400657 <+22>: mov -0x20(%rbp),%rax
0x000000000040065b <+26>: add $0x8,%rax
0x000000000040065f <+30>: mov (%rax),%rax
0x0000000000400662 <+33>: mov %rax,%rdi
0x0000000000400665 <+36>: callq 0x4005ac <func>
0x000000000040066a <+41>: movl $0x1,-0x4(%rbp)
0x0000000000400671 <+48>: mov $0x40075b,%edi
0x0000000000400676 <+53>: mov $0x0,%eax
0x000000000040067b <+58>: callq 0x400470 <printf#plt>
0x0000000000400680 <+63>: mov $0x400762,%edi
0x0000000000400685 <+68>: mov $0x0,%eax
0x000000000040068a <+73>: callq 0x400470 <printf#plt>
0x000000000040068f <+78>: callq 0x400490 <getchar#plt>
0x0000000000400694 <+83>: leaveq
0x0000000000400695 <+84>: retq
End of assembler dump.
Using what I've read from the example, my buffer is 24 bytes long and I should add an extra 4 bytes for the SFP size. This would mean I add 28 bytes to get to the return address of <+41>. It then looks like I want to jump to the last printf call at <+73>. This should be an offset of 32. However, when I execute the code, "x is 1" is still printed. I can't seem to find out why. Is there something wrong with my math or assumptions?
This looks to be an ideal time to get experience with gdb and verify that your expectations regarding the stack and the function return address locations are correct!
I will, however suggest that your modified return address should probably be at <+63>, not <+73>, as you need to run the function setup code (to pass the argument, etc).
I'm playing with one stack overflow example. This example looks like this:
void return_input (void){
char array[30];
gets (array);
printf("%s\n", array);
}
main() {
return_input();
return 0;
}
All this code is in the file called overflow.c. We have vulnerable function called return_input, particularly it's 30 byte char array. I compiled it and opened vulnerable function in gdb and got following output:
(gdb) disas return_input
0x08048464 <+0>: push %ebp
0x08048465 <+1>: mov %esp,%ebp
0x08048467 <+3>: sub $0x48,%esp
0x0804846a <+6>: mov %gs:0x14,%eax
0x08048470 <+12>: mov %eax,-0xc(%ebp)
0x08048473 <+15>: xor %eax,%eax
0x08048475 <+17>: lea -0x2a(%ebp),%eax
0x08048478 <+20>: mov %eax,(%esp)
0x0804847b <+23>: call 0x8048360 <gets#plt>
0x08048480 <+28>: lea -0x2a(%ebp),%eax
0x08048483 <+31>: mov %eax,(%esp)
0x08048486 <+34>: call 0x8048380 <puts#plt>
0x0804848b <+39>: mov -0xc(%ebp),%eax
0x0804848e <+42>: xor %gs:0x14,%eax
0x08048495 <+49>: je 0x804849c <return_input+56>
0x08048497 <+51>: call 0x8048370 <__stack_chk_fail#plt>
0x0804849c <+56>: leave
0x0804849d <+57>: ret
End of assembler dump.
As you see from the function prologue we reserved hex48(dec 72) bytes on the stack for local variables. First I was trying to find the address where our vulnerable array starts on the stack. I think it's -0x2a(%ebp), am I right? Hex2a is 42 decimal. As I understand it means that we can write safely 42 bytes before we start to overwrite EBP saved on the stack. But when I run this example it's enough to right only 37 bytes to get segmentation fault:
rustam#rustam-laptop:~/temp/ELF_reader$ ./overflow
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Segmentation fault (core dumped)
How is 37 bytes enough to overflow buffer? If our local char array is -42 bytes from saved EBP
It is hard to tell without seeing the whole disassembly of the function.
However, my guess is that the %gs:0x14 stored at -0xc(%ebp) may be your stack canary that causes a clean exit if a stack coruption is detected.
So this value is stored at -0xc(%ebp), which means that your buffer is in fact only 30 bytes large, followed by whatever comes after.
The code below is from the well-known article Smashing The Stack For Fun And Profit.
void function(int a, int b, int c) {
char buffer1[5];
char buffer2[10];
int *ret;
ret = buffer1 + 12;
(*ret)+=8;
}
void main() {
int x;
x=0;
function(1,2,3);
x=1;
printf("%d\n",x);
}
I think I must explain my target of this code.
The stack model is below. The number below the word is the number of bytes of the variable in the stack. So, if I want to rewrite RET to skip the statement I want, I calculate the offset from buffer1 to RET is 8+4=12. Since the architecture is x86 Linux.
buffer2 buffer1 BSP RET a b c
(12) (8) (4) (4) (4) (4) (4)
I want to skip the statement x=1; and let printf() output 0 on the screen.
I compile the code with:
gcc stack2.c -g
and run it in gdb:
gdb ./a.out
gdb gives me the result like this:
Program received signal SIGSEGV, Segmentation fault.
main () at stack2.c:17
17 x = 1;
I think Linux uses some mechanism to protect against stack overflow. Maybe Linux stores the RET address in another place and compares the RET address in the stack before functions return.
And what is the detail about the mechanism? How should I rewrite the code to make the program output 0?
OK,the disassemble code is below.It comes form the output of gdb since I think is more easy to read for you.And anybody can tell me how to paste a long code sequence?Copy and paste one by one makes me too tired...
Dump of assembler code for function main:
0x08048402 <+0>: push %ebp
0x08048403 <+1>: mov %esp,%ebp
0x08048405 <+3>: sub $0x10,%esp
0x08048408 <+6>: movl $0x0,-0x4(%ebp)
0x0804840f <+13>: movl $0x3,0x8(%esp)
0x08048417 <+21>: movl $0x2,0x4(%esp)
0x0804841f <+29>: movl $0x1,(%esp)
0x08048426 <+36>: call 0x80483e4 <function>
0x0804842b <+41>: movl $0x1,-0x4(%ebp)
0x08048432 <+48>: mov $0x8048520,%eax
0x08048437 <+53>: mov -0x4(%ebp),%edx
0x0804843a <+56>: mov %edx,0x4(%esp)
0x0804843e <+60>: mov %eax,(%esp)
0x08048441 <+63>: call 0x804831c <printf#plt>
0x08048446 <+68>: mov $0x0,%eax
0x0804844b <+73>: leave
0x0804844c <+74>: ret
Dump of assembler code for function function:
0x080483e4 <+0>: push %ebp
0x080483e5 <+1>: mov %esp,%ebp
0x080483e7 <+3>: sub $0x14,%esp
0x080483ea <+6>: lea -0x9(%ebp),%eax
0x080483ed <+9>: add $0x3,%eax
0x080483f0 <+12>: mov %eax,-0x4(%ebp)
0x080483f3 <+15>: mov -0x4(%ebp),%eax
0x080483f6 <+18>: mov (%eax),%eax
0x080483f8 <+20>: lea 0x8(%eax),%edx
0x080483fb <+23>: mov -0x4(%ebp),%eax
0x080483fe <+26>: mov %edx,(%eax)
0x08048400 <+28>: leave
0x08048401 <+29>: ret
I check the assemble code and find some mistake about my program,and I have rewrite (*ret)+=8 to (*ret)+=7,since 0x08048432 <+48>minus0x0804842b <+41> is 7.
Because that article is from 1996 and the assumptions are incorrect.
Refer to "Smashing The Modern Stack For Fun And Profit"
http://www.ethicalhacker.net/content/view/122/24/
From the above link:
However, the GNU C Compiler (gcc) has evolved since 1998, and as a result, many people are left wondering why they can't get the examples to work for them, or if they do get the code to work, why they had to make the changes that they did.
The function function overwrites some place of the stack outside of its own, which is this case is the stack of main. What it overwrites I don't know, but it causes the segmentation fault you see. It might be some protection employed by the operating system, but it might as well be the generated code just does something wrong when wrong value is at that position on the stack.
This is a really good example of what may happen when you write outside of your allocated memory. It might crash directly, it might crash somewhere completely different, or if might not crash at all but instead just do some calculation wrong.
Try ret = buffer1 + 3;
Explanation: ret is an integer pointer; incrementing it by 1 adds 4 bytes to the address on 32bit machines.