I've reproduced Example 3 from Smashing the Stack for Fun and Profit on Linux x86_64. However, I'm having trouble working out how many bytes should be added to the return address in order to skip past the instruction:
0x0000000000400595 <+35>: movl $0x1,-0x4(%rbp)
which is where I think the x = 1 instruction is. I've written the following:
#include <stdio.h>
void fn(int a, int b, int c) {
char buf1[5];
char buf2[10];
int *ret;
ret = buf1 + 24;
(*ret) += 7;
}
int main() {
int x;
x = 0;
fn(1, 2, 3);
x = 1;
printf("%d\n", x);
}
and disassembled it in gdb. I have disabled address randomization and compiled the program with the -fno-stack-protector option.
Question 1
I can see from the disassembler output below that I want to skip past the instruction at address 0x0000000000400595, which is both the return address from callq <fn> and the address of the movl instruction. Therefore, if the return address is 0x0000000000400595 and the next instruction is at 0x000000000040059c, I should add 7 bytes to the return address?
0x0000000000400572 <+0>: push %rbp
0x0000000000400573 <+1>: mov %rsp,%rbp
0x0000000000400576 <+4>: sub $0x10,%rsp
0x000000000040057a <+8>: movl $0x0,-0x4(%rbp)
0x0000000000400581 <+15>: mov $0x3,%edx
0x0000000000400586 <+20>: mov $0x2,%esi
0x000000000040058b <+25>: mov $0x1,%edi
0x0000000000400590 <+30>: callq 0x40052d <fn>
0x0000000000400595 <+35>: movl $0x1,-0x4(%rbp)
0x000000000040059c <+42>: mov -0x4(%rbp),%eax
0x000000000040059f <+45>: mov %eax,%esi
0x00000000004005a1 <+47>: mov $0x40064a,%edi
0x00000000004005a6 <+52>: mov $0x0,%eax
0x00000000004005ab <+57>: callq 0x400410 <printf@plt>
0x00000000004005b0 <+62>: leaveq
0x00000000004005b1 <+63>: retq
Question 2
I notice that I can add 5 bytes to the return address in place of 7 and achieve the same result. When I do so, am I not jumping into the middle of the instruction 0x0000000000400595 <+35>: movl $0x1,-0x4(%rbp)? In which case, why does this not crash the program, as it does when I add 6 bytes to the return address instead of 5 or 7?
Question 3
Just before buffer1[] on the stack is SFP, and before it, the return address.
That is 4 bytes pass the end of buffer1[]. But remember that buffer1[] is
really 2 word so its 8 bytes long. So the return address is 12 bytes from
the start of buffer1[].
In the example by Aleph 1, he/she calculates the offset of the return address as 12 bytes from the start of buffer1[]. Since I am on x86_64, and not x86_32, I need to recalculate the offset to the return address. When on x86_64, is it the case that buffer1[] is still 2 words, which is 16 bytes; and the SFP and return address are 8 bytes each (as we're on 64 bit) and therefore the return address is at: buf1 + (8 * 2) + 8 which is equivalent to buf1 + 24?
The first, and very important, thing to note: all numbers and offsets are very compiler-dependent. Different compilers, and even the same compiler with different settings, can produce drastically different assembly. For example, many compilers can (and will) remove buf2 because it's not used. They can also remove x = 0, as its effect is not used and it is later overwritten. They can also remove x = 1 and replace all occurrences of x with a constant 1, etc, etc.
That said, you absolutely need to work out the numbers for the specific assembly you're getting from your specific compiler and its settings.
Question 1
Since you provided the assembly for main(), I can confirm that you need to add 7 bytes to the return address, which would normally be 0x0000000000400595, to skip over x=1 and go to 0x000000000040059c, which loads x into a register for later use. 0x000000000040059c - 0x0000000000400595 = 7.
Question 2
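Adding just 5 bytes instead of 7 does indeed jump into the middle of the instruction. However, the 2-byte tail of that instruction happens (by pure chance) to be another valid instruction, which is why it doesn't crash. The movl encodes as c7 45 fc 01 00 00 00, so its last two bytes are 00 00, which decode as add %al,(%rax); that is only harmless if %rax still holds a writable address at that point, which it apparently does here.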
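To make the byte-level picture concrete, here is a minimal sketch that lays out the encoding of the skipped instruction and reports what is left when execution enters part-way through it. The array name movl_x_1 and both helper functions are made up for this illustration; the bytes are the standard encoding of movl $0x1,-0x4(%rbp).

```c
#include <assert.h>
#include <stddef.h>

/* Encoding of `movl $0x1,-0x4(%rbp)`: opcode C7, ModRM 45, disp8 FC,
   then the 32-bit immediate 1 in little-endian order. Its length (7)
   is the number of bytes to add in order to skip it cleanly. */
const unsigned char movl_x_1[7] = { 0xC7, 0x45, 0xFC, 0x01, 0x00, 0x00, 0x00 };

/* Number of bytes of the instruction that remain when execution
   enters at `offset` instead of at its first byte. */
size_t tail_length(size_t offset)
{
    assert(offset <= sizeof movl_x_1);
    return sizeof movl_x_1 - offset;
}

/* Nonzero when the two bytes starting at `offset` are 00 00, which
   x86 decodes as `add %al,(%rax)`. */
int tail_is_add_al(size_t offset)
{
    return offset + 2 <= sizeof movl_x_1
        && movl_x_1[offset] == 0x00
        && movl_x_1[offset + 1] == 0x00;
}
```

Entering at offset 5 leaves exactly the two bytes 00 00. Entering at offset 6 leaves a single 00, which the CPU decodes together with the first bytes of the following instruction as something else entirely; that is presumably what faults in the +6 case.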
Question 3
This is again very compiler- and settings-dependent. Pretty much anything can happen there. Since you didn't provide the disassembly of fn, I can only make guesses. The guess would be the following: buf1 and buf2 are rounded up to the next stack unit boundary (8 bytes on x64). buf1 becomes 8 bytes, and buf2 becomes 16 bytes. Frame pointers are often not saved to the stack on x64, in which case there is no "SFP". That's 24 bytes total.
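Rather than guessing the layout, the distance can be measured from inside the function. The sketch below is one way to do that, under stated assumptions: it relies on __builtin_frame_address, which is a GCC/Clang extension, and on the x86/x86-64 convention that the return address sits one pointer above the saved frame pointer. The function name ret_slot_distance is invented for the example, and the number it reports applies only to this function's own frame, not to fn above.

```c
#include <assert.h>
#include <stdint.h>

/* Byte distance from the start of a local buffer to the slot holding
   this function's return address. Assumes x86 or x86-64, where the
   return address sits one pointer above the saved frame pointer.
   __builtin_frame_address is a GCC/Clang extension. */
__attribute__((noinline))
long ret_slot_distance(void)
{
    volatile char buf1[5];              /* volatile so it is not optimized away */
    buf1[0] = 0;

    uintptr_t frame = (uintptr_t)__builtin_frame_address(0);
    uintptr_t ret_slot = frame + sizeof(void *);
    return (long)(ret_slot - (uintptr_t)buf1);
}
```

To get the figure for a particular function, the same three lines would have to be placed in that function, compiled with the same flags as the real program.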
Why does this program need more than 45 bytes of input to cause a buffer overflow (segmentation fault)?
#include <stdio.h>
#include <string.h>
int main(int argc, char *argv[])
{
char whatever[20];
strcpy(whatever, argv[1]);
return 0;
}
I mean, it should only take a bit more than 24 characters of input. By the way, grsecurity is not enabled on my system, and I'm using Ubuntu 7.04 32-bit on VirtualBox.
Ok, what's interesting here is the disassembly of main:
push %ebp
mov %esp,%ebp
sub $0x38,%esp
and $0xfffffff0,%esp
mov $0x0,%eax
sub %eax,%esp
mov 0xc(%ebp),%eax
add $0x4,%eax
mov (%eax),%eax
mov %eax,0x4(%esp)
lea 0xffffffd8(%ebp),%eax
mov %eax,(%esp)
call 80482a0 <strcpy@plt>
mov $0x0,%eax
leave
ret
Before entering main, the stack pointer esp points to the return address pushed by call. Let's call that &ret.
The first opcode in the function pushes the base pointer of the previous frame, and then sets the current base pointer to the stack pointer. So ebp = &ret - 4.
When setting up the call to strcpy, the value right at esp is the first parameter. Here:
mov %eax,(%esp)
call 80482a0 <strcpy@plt>
So the value in eax is the first parameter. If we look at the previous instruction, we can see what that value is:
lea 0xffffffd8(%ebp),%eax
Ok, this notation basically means: eax = ebp + 0xffffffd8, which is equivalent to eax = ebp - 40 (see Two's Complement). Basically, you flip all the bits (and get 0x27=39), stick a minus sign (-39), and subtract 1 (-40).
And in relation to the frame's return address: eax = &ret - 44
So it would take at least 45 bytes to overrun the return address.
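The arithmetic above can be sketched in C. Both function names are made up for this illustration, and the +4 assumes the 32-bit frame layout described here (saved ebp at ebp, return address at ebp + 4).

```c
#include <assert.h>
#include <stdint.h>

/* Interpret a 32-bit displacement as printed by the disassembler
   (e.g. 0xffffffd8) as a signed two's-complement value. */
int32_t signed_disp(uint32_t raw)
{
    if (raw & 0x80000000u) {
        /* negative: flip the bits, add one, then negate */
        return (int32_t)-(int64_t)(~raw + 1u);
    }
    return (int32_t)raw;
}

/* Bytes that can be written, starting at a buffer located at
   ebp + buf_disp, before the first byte of the return address
   (at ebp + 4 on 32-bit x86) is touched. */
int32_t bytes_before_ret(int32_t buf_disp)
{
    return 4 - buf_disp;
}
```

With the displacement from the lea, signed_disp(0xffffffd8) is -40 and bytes_before_ret(-40) is 44, so byte number 45 is the first one that lands on the return address.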
But you say 47. This is interesting, and it might have to do with the specific input you supplied.
You see, x86 is a little-endian machine, which means that in memory, integers are stored LSB-first. So, when overwriting the stored return address, you first overwrite its LSB.
If your input ends in the vicinity of the LSB, you might cause a faulty termination but not a segmentation fault, as you will cause a branch to a legitimate address.
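A small model of that partial overwrite, as a sketch: it stores a 32-bit return address in little-endian order, lets an overflow clobber its first n bytes, and reads the result back. The function name and the addresses used with it are hypothetical; only the byte ordering is taken from the explanation above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of a 32-bit return address sitting in little-endian memory
   while an overflow writes `n` bytes (n <= 4) over its start. */
uint32_t partially_overwrite(uint32_t saved, const unsigned char *data, size_t n)
{
    unsigned char slot[4];
    assert(n <= sizeof slot);

    for (size_t i = 0; i < 4; i++)          /* store LSB first */
        slot[i] = (unsigned char)(saved >> (8 * i));

    for (size_t i = 0; i < n; i++)          /* the overflow reaches n bytes */
        slot[i] = data[i];

    uint32_t result = 0;
    for (size_t i = 0; i < 4; i++)          /* read it back as `ret` would */
        result |= (uint32_t)slot[i] << (8 * i);
    return result;
}
```

For instance, if only strcpy's terminating NUL reaches the slot, a made-up saved address of 0x08048456 becomes 0x08048400: a different but possibly still mapped address, which fits the "faulty termination but no segfault" scenario.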
If you'll share your input, it might help shed some light on those two missing bytes :)
I have a simple program called demo.c which allocates space for a char array with the length of 8 on the stack
#include<stdio.h>
main()
{
char buffer[8];
return 0;
}
I thought that 8 bytes would be allocated on the stack for the eight chars, but if I check this in gdb, 10 bytes are subtracted from the stack pointer.
I compile the program with this command on my Ubuntu 32-bit machine:
$ gcc -ggdb -o demo demo.c
Then I analyze the program with:
$ gdb demo
$ disassemble main
(gdb) disassemble main
Dump of assembler code for function main:
0x08048404 <+0>: push %ebp
0x08048405 <+1>: mov %esp,%ebp
0x08048407 <+3>: and $0xfffffff0,%esp
0x0804840a <+6>: sub $0x10,%esp
0x0804840d <+9>: mov %gs:0x14,%eax
0x08048413 <+15>: mov %eax,0xc(%esp)
0x08048417 <+19>: xor %eax,%eax
0x08048419 <+21>: mov $0x0,%eax
0x0804841e <+26>: mov 0xc(%esp),%edx
0x08048422 <+30>: xor %gs:0x14,%edx
0x08048429 <+37>: je 0x8048430 <main+44>
0x0804842b <+39>: call 0x8048340 <__stack_chk_fail@plt>
0x08048430 <+44>: leave
0x08048431 <+45>: ret
End of assembler dump.
0x0804840a <+6>: sub $0x10,%esp says that 10 bytes are allocated on the stack, right?
Why are there 10 bytes allocated and not 8?
No, 0x10 means it's hexadecimal, i.e. 10 in base 16, which is 16 bytes in decimal.
Probably due to alignment requirements for the stack.
Please note that the constant $0x10 is in hexadecimal; it is equal to 16 bytes.
Take a look at the machine code:
0x08048404 <+0>: push %ebp
0x08048405 <+1>: mov %esp,%ebp
0x08048407 <+3>: and $0xfffffff0,%esp
0x0804840a <+6>: sub $0x10,%esp
...
0x08048430 <+44>: leave
0x08048431 <+45>: ret
As you can see, before we subtract 16 from esp we first make sure esp points to a 16-byte-aligned address (take a look at the and $0xfffffff0,%esp instruction).
I guess the compiler tries to respect that alignment, so it simply reserves 16 bytes as well. It does not matter anyway, because 8 bytes fit into 16 bytes very well.
sub $0x10, %esp is saying that there are 16 bytes on the stack, not 10 since 0x is hexadecimal notation.
The amount of space reserved on the stack is completely dependent on the compiler. In this case it's most likely an alignment issue, where the alignment is 16 bytes and you've requested 8, so it gets increased to 16.
If you requested 17 bytes, it would most likely have been sub $0x20, %esp or 32 bytes instead of 17.
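The rounding described here can be written as a one-line helper. This is only a model of the alignment step, assuming a power-of-two boundary; a real compiler also reserves room for things like saved registers, a canary or outgoing arguments, so the actual sub operand can be larger. The name round_up is chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Round `size` up to the next multiple of `align`.
   `align` must be a power of two (16 for the stack boundary
   discussed above). */
size_t round_up(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & ~(align - 1);
}
```

round_up(8, 16) gives 16 (the 0x10 in the question) and round_up(17, 16) gives 32 (0x20).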
(I skipped over some things the other answers explain in more detail).
You compiled with -O0, so gcc is operating in a super-simple way that tells you something about compiler internals, but little about how to make good code from C.
gcc is keeping the stack 16B-aligned at all times. The 32bit SysV ABI only guarantees 4B stack alignment, but GNU/Linux systems actually assume and maintain gcc's default -mpreferred-stack-boundary=4 (16B-aligned).
Your version of gcc also defaults to using -fstack-protector, so it checks for stack-smashing in functions with local char arrays with 4 or more elements:
-fstack-protector
Emit extra code to check for buffer overflows, such as stack smashing attacks. This is done by adding a guard variable to functions with vulnerable objects. This includes functions that call "alloca", and functions with buffers larger than 8 bytes. The guards are initialized when a function is entered and then checked when the function exits. If a guard check fails, an error message is printed and the program exits.
For some reason, this is actually kicking in with char arrays >= 4B, but not with integer arrays. (At least, not when they're unused!). char pointers can alias anything, which may have something to do with it.
See the code on godbolt, with asm output. Note how main is special: it uses andl $-16, %esp to align the stack on entry to main, but other functions assume the stack was 16B-aligned before the call instruction that called them. So they'll typically sub $24, %esp, after pushing %ebp. (%ebp and the return address are 8B total, so the stack is 8B away from being 16B-aligned). This leaves room for the stack-protector canary.
The 32bit SysV ABI only requires arrays to be aligned to the natural alignment of their elements, so this 16B alignment for the char array is just what the compiler decided to do in this case, not something you can count on.
The 64bit ABI is different:
An array uses the same alignment as its elements, except that a local
or global array variable of length at least 16 bytes or a C99
variable-length array variable always has alignment of at least 16
bytes
(links from the x86 tag wiki)
So you can count on char buf[1024] being 16B-aligned on SysV, allowing you to use SSE aligned loads/stores on it.
I am trying to make the buffer exploitation example (example3.c from http://insecure.org/stf/smashstack.html) work on Debian Lenny (2.6). I know the gcc version and the OS version are different from the ones used by Aleph One. I have disabled stack protection mechanisms using -fno-stack-protector and sysctl -w kernel.randomize_va_space=0. To account for the differences between my setup and Aleph One's, I introduced two parameters: offset1, the offset from the buffer1 variable to the return address, and offset2, the number of bytes to jump in order to skip a statement. I tried to figure out these parameters by analyzing the assembly code but was not successful. So I wrote a shell script that runs the buffer overflow program with every combination of offset1 and offset2 from 1 to 60. But much to my surprise, I am still not able to break this program. It would be great if someone could guide me. I have attached the code and assembly output for consideration. Sorry for the really long post :)
Thanks.
// Modified example3.c from Aleph One paper - Smashing the stack
void function(int a, int b, int c, int offset1, int offset2) {
char buffer1[5];
char buffer2[10];
int *ret;
ret = (int *)buffer1 + offset1;// how far is return address from buffer ?
(*ret) += offset2; // modify the value of return address
}
int main(int argc, char* argv[]) {
int x;
x = 0;
int offset1 = atoi(argv[1]);
int offset2 = atoi(argv[2]);
function(1,2,3, offset1, offset2);
x = 1; // Goal is to skip this statement using buffer overflow
printf("X : %d\n",x);
return 0;
}
-----------------
// Execute the buffer overflow program with varying offsets
#!/bin/bash
for ((i=1; i<=60; i++))
do
for ((j=1; j<=60; j++))
do
echo "`./test $i $j`"
done
done
-- Assembler output
(gdb) disassemble main
Dump of assembler code for function main:
0x080483c2 <main+0>: lea 0x4(%esp),%ecx
0x080483c6 <main+4>: and $0xfffffff0,%esp
0x080483c9 <main+7>: pushl -0x4(%ecx)
0x080483cc <main+10>: push %ebp
0x080483cd <main+11>: mov %esp,%ebp
0x080483cf <main+13>: push %ecx
0x080483d0 <main+14>: sub $0x24,%esp
0x080483d3 <main+17>: movl $0x0,-0x8(%ebp)
0x080483da <main+24>: movl $0x3,0x8(%esp)
0x080483e2 <main+32>: movl $0x2,0x4(%esp)
0x080483ea <main+40>: movl $0x1,(%esp)
0x080483f1 <main+47>: call 0x80483a4 <function>
0x080483f6 <main+52>: movl $0x1,-0x8(%ebp)
0x080483fd <main+59>: mov -0x8(%ebp),%eax
0x08048400 <main+62>: mov %eax,0x4(%esp)
0x08048404 <main+66>: movl $0x80484e0,(%esp)
0x0804840b <main+73>: call 0x80482d8 <printf@plt>
0x08048410 <main+78>: mov $0x0,%eax
0x08048415 <main+83>: add $0x24,%esp
0x08048418 <main+86>: pop %ecx
0x08048419 <main+87>: pop %ebp
0x0804841a <main+88>: lea -0x4(%ecx),%esp
0x0804841d <main+91>: ret
End of assembler dump.
(gdb) disassemble function
Dump of assembler code for function function:
0x080483a4 <function+0>: push %ebp
0x080483a5 <function+1>: mov %esp,%ebp
0x080483a7 <function+3>: sub $0x20,%esp
0x080483aa <function+6>: lea -0x9(%ebp),%eax
0x080483ad <function+9>: add $0x30,%eax
0x080483b0 <function+12>: mov %eax,-0x4(%ebp)
0x080483b3 <function+15>: mov -0x4(%ebp),%eax
0x080483b6 <function+18>: mov (%eax),%eax
0x080483b8 <function+20>: lea 0x7(%eax),%edx
0x080483bb <function+23>: mov -0x4(%ebp),%eax
0x080483be <function+26>: mov %edx,(%eax)
0x080483c0 <function+28>: leave
0x080483c1 <function+29>: ret
End of assembler dump.
The disassembly for function you provided seems to use hardcoded values of offset1 and offset2, contrary to your C code.
The address for ret should be calculated using byte/char offsets: ret = (int *)(buffer1 + offset1). Otherwise you'll get hit by pointer arithmetic: (int *)buffer1 + offset1 advances by offset1 * sizeof(int) bytes, not offset1 bytes (this matters especially here, where buffer1 is not at a nicely aligned offset from the return address).
offset1 should be equal to 0x9 + 0x4 (the offset used in lea + 4 bytes for the push %ebp). However, this can change unpredictably each time you compile - the stack layout might be different, the compiler might create some additional stack alignment, etc.
offset2 should be equal to 7 (the length of the instruction you're trying to skip).
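The pointer-arithmetic point above is easy to check in isolation. The following sketch (helper names invented for the example) measures how many bytes an offset of n moves a pointer when it is applied to an int pointer and when it is applied to a char pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes actually advanced by `(int *)p + n`. */
size_t bytes_advanced_as_int_ptr(size_t n)
{
    int cells[64];
    assert(n <= 64);
    return (size_t)((char *)(cells + n) - (char *)cells);
}

/* Bytes advanced by `(char *)p + n`. */
size_t bytes_advanced_as_char_ptr(size_t n)
{
    char cells[64];
    assert(n <= 64);
    return (size_t)((cells + n) - cells);
}
```

With a 4-byte int, an offset of 12 applied to an int pointer moves 48 bytes, which would match the add $0x30,%eax in the disassembly of function if that build used 12 as the offset.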
Note that you're getting a little lucky here - the function uses the cdecl calling convention, which means the caller is responsible for removing arguments off the stack after returning from the function, which normally looks like this:
push arg3
push arg2
push arg1
call func
add esp, 0Ch ; remove as many bytes as were used by the pushed arguments
Your compiler chose to combine this correction with the one after printf, but it could also decide to do this after your function call. In this case the add esp, <number> instruction would be present between your return address and the instruction you want to skip - you can probably imagine that this would not end well.
I need your help. Here is the source code of my program. I need to understand what manipulations are being done by the score1, score2, score3 and score4 functions.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <pwd.h>
#include <unistd.h>

#include "score.h"

int main(int argc, char *argv[])
{
    int i, j, k, l, s;
    struct passwd *pw;
    char cmd[1024];

    /* Make sure that we have exactly 5 arguments: the name of the executable, and 4 numbers */
    if (argc != 5) {
        printf("Usage: %s i j k l\n where i,j,k,l are integers.\n Try to get as high a score as you can.\n", argv[0]);
        exit(8);
    }

    initialize();

    /* Convert the inputs to ints */
    i = atoi(argv[1]);
    j = atoi(argv[2]);
    k = atoi(argv[3]);
    l = atoi(argv[4]);

    printf("You entered the integers %d, %d, %d, and %d.\n", i, j, k, l);
    s = score1(i) + score2(j) + score3(k) + score4(l);

    printf("Your score is %d.\n", s);
    if (s > 0) {
        pw = getpwuid(getuid());

        printf("Thank you!\n");
        system(cmd);
I have started disassembling the code as follows:
(gdb) disas score1
Dump of assembler code for function score1:
0x080488b0 <score1+0>: push %ebp
0x080488b1 <score1+1>: mov %esp,%ebp
0x080488b3 <score1+3>: cmpl $0xe1e4,0x8(%ebp)
0x080488ba <score1+10>: setne %al
0x080488bd <score1+13>: movzbl %al,%eax
0x080488c0 <score1+16>: sub $0x1,%eax
0x080488c3 <score1+19>: and $0xa,%eax
0x080488c6 <score1+22>: pop %ebp
0x080488c7 <score1+23>: ret
(gdb) disas score2
Dump of assembler code for function score2:
0x080488c8 <score2+0>: push %ebp
0x080488c9 <score2+1>: mov %esp,%ebp
0x080488cb <score2+3>: mov 0x8049f88,%eax
0x080488d0 <score2+8>: sub $0x2,%eax
0x080488d3 <score2+11>: mov %eax,0x8049f88
0x080488d8 <score2+16>: cmp 0x8(%ebp),%eax
0x080488db <score2+19>: setne %al
0x080488de <score2+22>: movzbl %al,%eax
0x080488e1 <score2+25>: sub $0x1,%eax
0x080488e4 <score2+28>: and $0xa,%eax
0x080488e7 <score2+31>: pop %ebp
0x080488e8 <score2+32>: ret
(gdb) disas score3
Dump of assembler code for function score3:
0x080488e9 <score3+0>: push %ebp
0x080488ea <score3+1>: mov %esp,%ebp
0x080488ec <score3+3>: mov 0x8(%ebp),%eax
0x080488ef <score3+6>: and $0xf,%eax
0x080488f2 <score3+9>: mov 0x8048e00(,%eax,4),%eax
0x080488f9 <score3+16>: pop %ebp
0x080488fa <score3+17>: ret
(gdb) disas score4
Dump of assembler code for function score4:
0x080488fb <score4+0>: push %ebp
0x080488fc <score4+1>: mov %esp,%ebp
0x080488fe <score4+3>: push %ebx
0x080488ff <score4+4>: mov 0x8(%ebp),%eax
0x08048902 <score4+7>: movzwl %ax,%edx
0x08048905 <score4+10>: mov %eax,%ecx
0x08048907 <score4+12>: shr $0x10,%ecx
0x0804890a <score4+15>: lea 0x0(,%edx,8),%eax
0x08048911 <score4+22>: sub %edx,%eax
0x08048913 <score4+24>: cmp %ecx,%eax
0x08048915 <score4+26>: jne 0x8048920 <score4+37>
0x08048917 <score4+28>: mov $0x8000ffff,%ebx
0x0804891c <score4+33>: test %edx,%ecx
0x0804891e <score4+35>: jne 0x8048940 <score4+69>
0x08048920 <score4+37>: mov %ecx,%eax
0x08048922 <score4+39>: xor %edx,%eax
0x08048924 <score4+41>: cmp $0xf00f,%eax
0x08048929 <score4+46>: jne 0x804893b <score4+64>
0x0804892b <score4+48>: mov %ecx,%eax
0x0804892d <score4+50>: or %edx,%eax
0x0804892f <score4+52>: mov $0xa,%ebx
0x08048934 <score4+57>: cmp $0xf42f,%eax
0x08048939 <score4+62>: je 0x8048940 <score4+69>
0x0804893b <score4+64>: mov $0x0,%ebx
0x08048940 <score4+69>: mov %ebx,%eax
0x08048942 <score4+71>: pop %ebx
0x08048943 <score4+72>: pop %ebp
0x08048944 <score4+73>: ret
I've started examining score2.
What I have done is:
(gdb) x 0x8049f88
0x8049f88 <secret>: "Чй"
(gdb) disas 0x8049f88
Dump of assembler code for function secret:
0x08049f88 <secret+0>: dec %dl
0x08049f8a <secret+2>: add %al,(%eax)
End of assembler dump.
And I'm lost here.
Here's what I think happens so far (See comments):
(gdb) disas score2
Dump of assembler code for function score2:
0x080488c8 <score2+0>: push %ebp
0x080488c9 <score2+1>: mov %esp,%ebp 'Copy %esp into %ebp
0x080488cb <score2+3>: mov 0x8049f88,%eax 'executing: decrement and add
0x080488d0 <score2+8>: sub $0x2,%eax ' subtract $0x2 from %eax (How can I figure out what $0x2
0x080488d3 <score2+11>: mov %eax,0x8049f88 'Have no idea what this does
0x080488d8 <score2+16>: cmp 0x8(%ebp),%eax compare of %ebp to %eax (why %ebp has 0x8 preceding it?)
0x080488db <score2+19>: setne %al 'I have no idea what this does
0x080488de <score2+22>: movzbl %al,%eax
0x080488e1 <score2+25>: sub $0x1,%eax
0x080488e4 <score2+28>: and $0xa,%eax
0x080488e7 <score2+31>: pop %ebp
0x080488e8 <score2+32>: ret
If you could help me understand what kind of transformations score2 performs to an integer and what commands can I run in gdb that could help me, I would really appreciate it and would try to figure rest of it(score1-3) by myself. I'm just lost here.
There are really only 2 things you need to know to understand a disassembly. The first is all the instructions and addressing modes supported by the CPU and how they work. The second is the syntax used by the assembler/disassembler. Without being familiar with both of these things you will get nowhere.
For an example of "you will get nowhere", here's score2:
0x080488c8 <score2+0>: push %ebp ;Save EBP
0x080488c9 <score2+1>: mov %esp,%ebp ;EBP = address of stack frame
0x080488cb <score2+3>: mov 0x8049f88,%eax ;EAX = the data at address 0x8049f88
0x080488d0 <score2+8>: sub $0x2,%eax ;EAX = EAX - 2
0x080488d3 <score2+11>: mov %eax,0x8049f88 ;The value at address 0x8049f88 = eax
0x080488d8 <score2+16>: cmp 0x8(%ebp),%eax ;Compare the int at offset 8 in the stack frame with EAX
0x080488db <score2+19>: setne %al ;If the int at offset 8 in the stack frame wasn't equal to EAX, set AL to 1, otherwise set AL to 0
0x080488de <score2+22>: movzbl %al,%eax ;Zero-extend AL to EAX (so EAX = 0 or 1)
0x080488e1 <score2+25>: sub $0x1,%eax ;Decrease EAX (so EAX = -1 or 0)
0x080488e4 <score2+28>: and $0xa,%eax ;EAX = EAX AND 0x0A (so EAX = 0xA or 0)
0x080488e7 <score2+31>: pop %ebp ;Restore previous EBP
0x080488e8 <score2+32>: ret ;Return
Converting back into C, this might look something like:
int score2(int something) {
some_global_int -= 2;
if(some_global_int == something) return 0x0A;
else return 0;
}
Of course I only slapped this together in 5 minutes, and haven't double checked anything or tested anything, so it could be wrong.
After reading the above "score2" code, are you any closer to understanding the disassembly of any of the other functions?
Based on your initial attempt at commenting score2, you should either ask someone to do all the work for you (and learn nothing, and have no way of knowing if that person is right or wrong), or ask for the best place to learn 80x86 assembly (and AT&T syntax).
I'm assuming you're given some kind of compiled library with the score functions in it, and you're trying to reverse engineer it as some kind of homework project. In that case, I suggest you start familiarizing yourself with the standard C calling convention cdecl.
Basically, esp points to the top of the stack, onto which the arguments to the function are pushed before it's called, so a C function first copies esp into ebp and can then access the arguments by adding offsets to ebp (the first argument is at 0x8(%ebp)) and dereferencing the resulting address. It uses ebp for this purpose so it can still modify esp in order to make room for local variables on the stack without losing track of where the arguments are stored.
Anyway, here's an overview of score2 to help get you started:
(gdb) disas score2
Dump of assembler code for function score2:
0x080488c8 <score2+0>: push %ebp
0x080488c9 <score2+1>: mov %esp,%ebp ; This just saves a copy of the top of our stack to read arguments with
0x080488cb <score2+3>: mov 0x8049f88,%eax ; Load a value from a memory location (the number is a memory address, probably to a global variable)
0x080488d0 <score2+8>: sub $0x2,%eax ; Subtract 2
0x080488d3 <score2+11>: mov %eax,0x8049f88 ; Store the new value into the same memory location
0x080488d8 <score2+16>: cmp 0x8(%ebp),%eax ; Compare the first argument of the function to that value
0x080488db <score2+19>: setne %al ; Sets the lower byte of eax to 1 if they don't match
0x080488de <score2+22>: movzbl %al,%eax ; Copies al into eax, zeroing the upper bytes so eax is just 1 or 0 now
0x080488e1 <score2+25>: sub $0x1,%eax ; Subtract 1 from eax
0x080488e4 <score2+28>: and $0xa,%eax ; eax = eax & 0xa
0x080488e7 <score2+31>: pop %ebp
0x080488e8 <score2+32>: ret ; Return eax
So that means there is some kind of global variable stored at 0x8049f88
(let's call it x), and score2 literally translates to:
int score2(int n) {
x -= 2;
if (n != x)
n = 1;
else
n = 0;
n--;
n = n & 0xa;
return n;
}
EDIT: Brendan's example is the same, but probably looks more like the original code. Look over it a few times and compare it to the assembly output.
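As a cross-check, here is a sketch that mirrors the score2 listing one instruction at a time. It follows the setne semantics directly: setne writes 1 when the operands differ, so the function yields 10 only when the argument equals the freshly decremented global. The names secret_model and score2_model are stand-ins invented for this example, and the global's starting value is left for the caller to set.

```c
#include <assert.h>

/* Stand-in for the word at 0x8049f88; callers set the starting
   value themselves. */
int secret_model = 0;

/* Instruction-by-instruction model of score2. */
int score2_model(int n)
{
    secret_model -= 2;                /* mov, sub $0x2, mov back            */
    int eax = (n != secret_model);    /* cmp + setne + movzbl: 1 if different */
    eax -= 1;                         /* sub $0x1: 0 if different, -1 if equal */
    eax &= 0xa;                       /* and $0xa: 0 if different, 10 if equal */
    return eax;
}
```

Starting from secret_model = 12, score2_model(10) returns 10 and leaves the global at 10; calling score2_model(10) again decrements it to 8 and returns 0.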
The next step is now to see what's in the variable at 0x8049f88. Try running awatch *0x8049f88 inside of gdb to make it stop on every access and also print *0x8049f88 to see what's stored there.
You should also run set disassembly-flavor intel if you're not too familiar with assembly language. The syntax will then match the examples you're more likely to find on the Internet.
I presume you don't have access to the source code of the functions, and for the puzzle or homework you are supposed to try to find numbers to get a big score. You probably should edit your question to display contents of score.h, or just the relevant portions if it's quite lengthy. Also, note that disassembly at 0x8049f88 doesn't make sense. Instead use gdb's x command to display that location, and edit accordingly.
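Even the misleading disas output carries the information, because it shows which bytes sit at that address. A minimal sketch, assuming the two instructions gdb printed (dec %dl, add %al,(%eax)) correspond to the bytes FE CA 00 00; that is an inference from the listing, not something the question states. The helper name load_le32 is made up.

```c
#include <assert.h>
#include <stdint.h>

/* Assemble four bytes, lowest address first, into the 32-bit value
   an x86 `mov` from that address would load. */
uint32_t load_le32(const unsigned char b[4])
{
    return (uint32_t)b[0]
         | (uint32_t)b[1] << 8
         | (uint32_t)b[2] << 16
         | (uint32_t)b[3] << 24;
}
```

Feeding it { 0xfe, 0xca, 0x00, 0x00 } yields 0xcafe. If that reading is right, the first call to score2 would compare its argument against 0xcafe - 2. Running x/wx 0x8049f88 in gdb should print the same word directly, which is the easier way to confirm it.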
While you can attack the problem via disassembly (as above) you can also try using a different main program, that reports the results of individual score?() calls, and that loops some of them through a series of values looking for big values.
With score2(), looping within main() won't work, because score2() subtracts 2 from a word in memory on every call. So, if you wanted to try out a lot of inputs, you'd need to call the program with different arguments in a shell loop. E.g., if you are using bash:
for i in {1..1000}; do testScore2 $i; done
where testScore2 is a main program that only runs score2() with its parameter and reports the result.
Of course, because score2() can produce only two different results, as explained in detail in two previous answers, it won't actually make sense to test score2() with more than two argument values. I showed the shell code above because you might want to use such a technique with some of the other score functions.
void function(int a, int b, int c) {
char buffer1[5];
char buffer2[10];
int *ret;
ret = buffer1 + 12;
(*ret) += 8;//why is it 8??
}
void main() {
int x;
x = 0;
function(1,2,3);
x = 1;
printf("%d\n",x);
}
The above demo is from here:
http://insecure.org/stf/smashstack.html
But it's not working here:
D:\test>gcc -Wall -Wextra hw.cpp && a.exe
hw.cpp: In function `void function(int, int, int)':
hw.cpp:6: warning: unused variable 'buffer2'
hw.cpp: At global scope:
hw.cpp:4: warning: unused parameter 'a'
hw.cpp:4: warning: unused parameter 'b'
hw.cpp:4: warning: unused parameter 'c'
1
And I don't understand why it's 8 though the author thinks:
A little math tells us the distance is
8 bytes.
My gdb dump is as follows:
Dump of assembler code for function main:
0x004012ee <main+0>: push %ebp
0x004012ef <main+1>: mov %esp,%ebp
0x004012f1 <main+3>: sub $0x18,%esp
0x004012f4 <main+6>: and $0xfffffff0,%esp
0x004012f7 <main+9>: mov $0x0,%eax
0x004012fc <main+14>: add $0xf,%eax
0x004012ff <main+17>: add $0xf,%eax
0x00401302 <main+20>: shr $0x4,%eax
0x00401305 <main+23>: shl $0x4,%eax
0x00401308 <main+26>: mov %eax,0xfffffff8(%ebp)
0x0040130b <main+29>: mov 0xfffffff8(%ebp),%eax
0x0040130e <main+32>: call 0x401b00 <_alloca>
0x00401313 <main+37>: call 0x4017b0 <__main>
0x00401318 <main+42>: movl $0x0,0xfffffffc(%ebp)
0x0040131f <main+49>: movl $0x3,0x8(%esp)
0x00401327 <main+57>: movl $0x2,0x4(%esp)
0x0040132f <main+65>: movl $0x1,(%esp)
0x00401336 <main+72>: call 0x4012d0 <function>
0x0040133b <main+77>: movl $0x1,0xfffffffc(%ebp)
0x00401342 <main+84>: mov 0xfffffffc(%ebp),%eax
0x00401345 <main+87>: mov %eax,0x4(%esp)
0x00401349 <main+91>: movl $0x403000,(%esp)
0x00401350 <main+98>: call 0x401b60 <printf>
0x00401355 <main+103>: leave
0x00401356 <main+104>: ret
0x00401357 <main+105>: nop
0x00401358 <main+106>: add %al,(%eax)
0x0040135a <main+108>: add %al,(%eax)
0x0040135c <main+110>: add %al,(%eax)
0x0040135e <main+112>: add %al,(%eax)
End of assembler dump.
Dump of assembler code for function function:
0x004012d0 <function+0>: push %ebp
0x004012d1 <function+1>: mov %esp,%ebp
0x004012d3 <function+3>: sub $0x38,%esp
0x004012d6 <function+6>: lea 0xffffffe8(%ebp),%eax
0x004012d9 <function+9>: add $0xc,%eax
0x004012dc <function+12>: mov %eax,0xffffffd4(%ebp)
0x004012df <function+15>: mov 0xffffffd4(%ebp),%edx
0x004012e2 <function+18>: mov 0xffffffd4(%ebp),%eax
0x004012e5 <function+21>: movzbl (%eax),%eax
0x004012e8 <function+24>: add $0x5,%al
0x004012ea <function+26>: mov %al,(%edx)
0x004012ec <function+28>: leave
0x004012ed <function+29>: ret
In my case the distance should be - = 5,right?But it seems not working..
Why function needs 56 bytes for local variables?( sub $0x38,%esp )
As joveha pointed out, the value of EIP saved on the stack (return address) by the call instruction needs to be incremented by 7 bytes (0x00401342 - 0x0040133b = 7) in order to skip the x = 1; instruction (movl $0x1,0xfffffffc(%ebp)).
You are correct that 56 bytes are being reserved for local variables (sub $0x38,%esp), so the missing piece is how many bytes past buffer1 on the stack is the saved EIP.
A bit of test code and inline assembly tells me that the magic value is 28 for my test. I cannot provide a definitive answer as to why it is 28, but I would assume the compiler is adding padding and/or stack canaries.
The following code was compiled using GCC 3.4.5 (MinGW) and tested on Windows XP SP3 (x86).
unsigned long get_ebp() {
__asm__("pop %ebp\n\t"
"movl %ebp,%eax\n\t"
"push %ebp\n\t");
}
void function(int a, int b, int c) {
char buffer1[5];
char buffer2[10];
int *ret;
/* distance in bytes from buffer1 to return address on the stack */
printf("test %lu\n", ((get_ebp() + 4) - (unsigned long)&buffer1));
ret = (int *)(buffer1 + 28);
(*ret) += 7;
}
void main() {
int x;
x = 0;
function(1,2,3);
x = 1;
printf("%d\n",x);
}
I could have just as easily used gdb to determine this value.
(compiled w/ -g to include debug symbols)
(gdb) break function
...
(gdb) run
...
(gdb) p $ebp
$1 = (void *) 0x22ff28
(gdb) p &buffer1
$2 = (char (*)[5]) 0x22ff10
(gdb) quit
(0x22ff28 + 4) - 0x22ff10 = 28
(ebp value + size of word) - address of buffer1 = number of bytes
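The same calculation as a tiny helper, in case you want to redo it for other gdb readings. The function name offset_to_ret is invented here; the word size of 4 assumes 32-bit x86 as in this session.

```c
#include <assert.h>
#include <stdint.h>

/* Distance in bytes from a buffer to the saved return address,
   given the frame pointer value and buffer address printed by gdb.
   `word` is the pointer size: 4 on 32-bit x86. */
uint32_t offset_to_ret(uint32_t ebp, uint32_t buf_addr, uint32_t word)
{
    return (ebp + word) - buf_addr;
}
```

offset_to_ret(0x22ff28, 0x22ff10, 4) evaluates to 28, the value used in the test program above.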
In addition to Smashing The Stack For Fun And Profit, I would suggest reading some of the articles I mentioned in my answer to a previous question of yours and/or other material on the subject. Having a good understanding of exactly how this type of exploit works should help you write more secure code.
It's hard to predict what buffer1 + 12 really points to. Your compiler can put buffer1 and buffer2 in any location on the stack it desires, even going as far as to not save space for buffer2 at all. The only way to really know where buffer1 goes is to look at the assembler output of your compiler, and there's a good chance it would jump around with different optimization settings or different versions of the same compiler.
I have not tested the code on my own machine yet, but have you taken memory alignment into consideration?
Try disassembling the code compiled with gcc. I think the assembly code may give you a better understanding of what is going on. :-)
This code prints out 1 as well on OpenBSD and FreeBSD, and gives a segmentation fault on Linux.
This kind of exploit is heavily dependent on both the instruction set of the particular machine, and the calling conventions of the compiler and operating system. Everything about the layout of the stack is defined by the implementation, not the C language. The article assumes Linux on x86, but it looks like you're using Windows, and your system could be 64-bit, although you can switch gcc to 32-bit with -m32.
The parameters you'll have to tweak are 12, which is the offset from buffer1 to the return address, and 8, which is how many bytes of main you want to jump over. As the article says, you can use gdb to inspect the disassembly of the function to see (a) how far the stack gets pushed when you call function, and (b) the byte offsets of the instructions in main.
The +8 bytes part is how much he wants the saved EIP to be incremented by. The EIP was saved so the program could return to the last assignment after the function is done - now he wants to skip over it by adding 8 bytes to the saved EIP.
So all he is trying to do is "skip" the
x = 1;
In your case the saved EIP will point to 0x0040133b, the first instruction after function returns. To skip the assignment you need to make the saved EIP point to 0x00401342. That's 7 bytes.
It's really a "mess with RET EIP" example rather than a buffer overflow example.
And as far as the 56 bytes for local variables goes, that could be anything your compiler comes up with like padding, stack canaries, etc.
Edit:
This shows how difficult it is to make buffer overflow examples in C. The offset of 12 from buffer1 assumes a certain padding style and compile options. GCC will happily insert stack canaries nowadays (which become a local variable that "protects" the saved EIP) unless you tell it not to. Also, the new address he wants to jump to (the start instruction for the printf call) really has to be resolved manually from the assembly. In his case, on his machine, with his OS, with his compiler, on that day.... it was 8.
You're compiling a C program with the C++ compiler. Rename hw.cpp to hw.c and you'll find it will compile.