Printf arguments not pushed on the stack - c

I'm in the process of trying to understand the stack mechanisms.
From the theory I have seen, before a function is called, its arguments are pushed onto the stack.
However when calling printf in the code below, none of them are pushed:
#include<stdio.h>
int main(){
char *s = " test string";
printf("Print this: %s and this %s \n", s, s);
return 1;
}
I set a breakpoint in gdb at the printf call, and when I display the stack, none of the 3 arguments have been pushed onto it.
The only thing pushed to the stack is the string address s as can be seen in the disassembled code below:
0x000000000040052c <+0>: push %rbp
0x000000000040052d <+1>: mov %rsp,%rbp
0x0000000000400530 <+4>: sub $0x10,%rsp
0x0000000000400534 <+8>: movq $0x400604,-0x8(%rbp) // variable pushed on the stack
0x000000000040053c <+16>: mov -0x8(%rbp),%rdx
0x0000000000400540 <+20>: mov -0x8(%rbp),%rax
0x0000000000400544 <+24>: mov %rax,%rsi
0x0000000000400547 <+27>: mov $0x400611,%edi
0x000000000040054c <+32>: mov $0x0,%eax
0x0000000000400551 <+37>: callq 0x400410 <printf@plt>
0x0000000000400556 <+42>: mov $0x1,%eax
0x000000000040055b <+47>: leaveq
Actually, the only argument appearing so far in the disassembled code is when "Print this: %s and this %s \n" is put in %edi...
0x0000000000400547 <+27>: mov $0x400611,%edi
So my question is: why am I not seeing 3 push instructions, one for each of my three arguments?
uname -a:
3.8.0-31-generic #46-Ubuntu SMP Tue Sep 10 20:03:44 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

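On 64-bit x86-64 Linux systems, the x86-64 ABI (Application Binary Interface) does not push the first arguments on the stack; it passes them in registers instead, which is slightly faster. The first six integer or pointer arguments go in %rdi, %rsi, %rdx, %rcx, %r8 and %r9.
If you pass many arguments (e.g. a dozen), some of them are pushed on the stack.
It may help to read the Wikipedia page on x86 calling conventions before reading the x86-64 ABI specification.
For variadic functions like printf the details are a bit scary; for instance, the mov $0x0,%eax before the call sets %al to the number of vector registers used for arguments, which is zero here.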

Depending on your compiler, you will need to allocate space on the heap for your pointer 's'.
Instead of
char *s;
use
char s[300];
to allocate 300 bytes of room
Otherwise 's' is simply pointing up the stack - which can be random
This could be partly why you are not seeing PUSH instructions.
Also, I don't see why there should be a PUSH instruction for the pointers required in printf? The assembler is simply copying (MOV) the value of the pointers

Related

Difference in x86-32 and x64 Assembly stack allocation for a fixed-size buffer with unoptimized C (GCC)

While doing some basic disassembly, I noticed that my buffer is being given additional space for some reason, although the tutorial I am following uses the same code and gets exactly the expected 500 bytes. Why is this?
My code:
#include <stdio.h>
#include <string.h>
int main (int argc, char** argv){
char buffer[500];
strcpy(buffer, argv[1]);
return 0;
}
Compiled with GCC, the disassembled code is:
0x0000000000001139 <+0>: push %rbp
0x000000000000113a <+1>: mov %rsp,%rbp
0x000000000000113d <+4>: sub $0x210,%rsp
0x0000000000001144 <+11>: mov %edi,-0x204(%rbp)
0x000000000000114a <+17>: mov %rsi,-0x210(%rbp)
0x0000000000001151 <+24>: mov -0x210(%rbp),%rax
0x0000000000001158 <+31>: add $0x8,%rax
0x000000000000115c <+35>: mov (%rax),%rdx
0x000000000000115f <+38>: lea -0x200(%rbp),%rax
0x0000000000001166 <+45>: mov %rdx,%rsi
0x0000000000001169 <+48>: mov %rax,%rdi
0x000000000000116c <+51>: call 0x1030 <strcpy@plt>
0x0000000000001171 <+56>: mov $0x0,%eax
0x0000000000001176 <+61>: leave
0x0000000000001177 <+62>: ret
However, this video https://www.youtube.com/watch?v=1S0aBV-Waeo clearly only has 500 bytes assigned.
Why is this the case? The only difference I can see is that the video is 32-bit and my build is x86-64.
500 is not a multiple of 16.
The x86-64 ABI (application binary interface) requires the stack pointer to be a multiple of 16 whenever a call instruction is about to happen. (Since call pushes an 8-byte return address, this means the stack pointer is always congruent to 8, mod 16, when control reaches the first instruction of a called function.) For the code shown, it is convenient for the compiler to achieve this requirement by increasing the value it uses in the sub instruction, making it be a multiple of 16.
The x86-32 ABI did not make this requirement, so there was no reason for the compiler used in the video to increase the size of the stack frame.
Note that you appear to have compiled your code without optimization. I get this at -O2:
0x0000000000000000 <+0>: sub $0x208,%rsp
0x0000000000000007 <+7>: mov 0x8(%rsi),%rsi
0x000000000000000b <+11>: mov %rsp,%rdi
0x000000000000000e <+14>: call <strcpy@PLT>
0x0000000000000013 <+19>: xor %eax,%eax
0x0000000000000015 <+21>: add $0x208,%rsp
0x000000000000001c <+28>: ret
The stack adjustment is still somewhat larger than the size of the array, but not as big as what you had, and no longer a multiple of 16; the difference is that with optimization on, the frame pointer is eliminated, so %rbp does not need to be saved and restored, and so the stack pointer is not a multiple of 16 at the point of the sub instruction.
(Incidentally, there is no requirement anywhere for a stack frame to be as small as possible. "Quality of implementation" dictates that it should be as small as possible, but for various reasons it's quite common for the compiler to miss that target. In my optimized code dump, I don't see any reason why the immediate operand to sub and add couldn't have been 0x1f8 (504).)

Understanding x86-64 assembly for simple program in C with a function call

I have a simple C program that produces the x86-64 assembly below for the function func:
#include <stdio.h>
#include <string.h>
void func(char *name)
{
char buf[90];
strcpy(buf, name);
printf("Welcome %s\n", buf);
}
int main(int argc, char *argv[])
{
func(argv[1]);
return 0;
}
So I think this
0x000000000000118d <+4>: push %rbp
pushes the base pointer, like a placed argument, which is char *name.
Then 0x000000000000118e <+5>: mov %rsp,%rbp sets the stack pointer to what is at the base pointer. I believe that this and the instruction above make the stack pointer point to char *name at this point.
Then
0x0000000000001191 <+8>: add $0xffffffffffffff80,%rsp
I am a little unsure about this. Why is 0xffffffffffffff80 added to rsp? What is the point of this instruction? Can anyone please tell me?
then in next instruction 0x0000000000001195 <+12>: mov %rdi,-0x78(%rbp)
its just setting -128 decimal to rdi. But I still cannot see the buffer char buf[90]. Where is my buffer in the following assembly? Can anyone please tell me?
also what this line 0x00000000000011a2 <+25>: mov %rax,-0x8(%rbp)
Dump of assembler code for function func:
0x0000000000001189 <+0>: endbr64
0x000000000000118d <+4>: push %rbp
0x000000000000118e <+5>: mov %rsp,%rbp
0x0000000000001191 <+8>: add $0xffffffffffffff80,%rsp
0x0000000000001195 <+12>: mov %rdi,-0x78(%rbp)
0x0000000000001199 <+16>: mov %fs:0x28,%rax
0x00000000000011a2 <+25>: mov %rax,-0x8(%rbp)
0x00000000000011a6 <+29>: xor %eax,%eax
0x00000000000011a8 <+31>: mov -0x78(%rbp),%rdx
0x00000000000011ac <+35>: lea -0x70(%rbp),%rax
0x00000000000011b0 <+39>: mov %rdx,%rsi
0x00000000000011b3 <+42>: mov %rax,%rdi
0x00000000000011b6 <+45>: call 0x1070 <strcpy@plt>
0x00000000000011bb <+50>: lea -0x70(%rbp),%rax
0x00000000000011bf <+54>: mov %rax,%rsi
0x00000000000011c2 <+57>: lea 0xe3b(%rip),%rax # 0x2004
0x00000000000011c9 <+64>: mov %rax,%rdi
0x00000000000011cc <+67>: mov $0x0,%eax
0x00000000000011d1 <+72>: call 0x1090 <printf@plt>
0x00000000000011d6 <+77>: nop
0x00000000000011d7 <+78>: mov -0x8(%rbp),%rax
0x00000000000011db <+82>: sub %fs:0x28,%rax
0x00000000000011e4 <+91>: je 0x11eb <func+98>
0x00000000000011e6 <+93>: call 0x1080 <__stack_chk_fail@plt>
0x00000000000011eb <+98>: leave
0x00000000000011ec <+99>: ret
End of assembler dump.
also what in above assembly the use of fs register what this instruction actually doing 0x0000000000001199 <+16>: mov %fs:0x28,%rax
As already mentioned in comments, your buffer is on the stack.
In the beginning of the function the rsp is decreased to allow more space (stack grows towards lower addresses, thus rsp is decreased as stack grows). This space is generally used for local variables, arguments passed to the function, and also for other purposes (will get back to it below).
In your case, you may trace back where your buffer buf is by looking at what arguments are passed to the strcpy - the first argument is passed in rdi register, the second - in rsi.
0x00000000000011b0 <+39>: mov %rdx,%rsi
0x00000000000011b3 <+42>: mov %rax,%rdi
0x00000000000011b6 <+45>: call 0x1070 <strcpy@plt>
In the snippet above you can see that the pointer to buf (first argument to strcpy) was in rax prior to being put to rdi. And rax got its value from this instruction:
0x00000000000011ac <+35>: lea -0x70(%rbp),%rax
which means "load effective address": compute the address at offset -0x70 from the value in rbp and put that address (i.e. a pointer) into rax, without reading memory. rbp points to where the stack pointer was right after the function saved the caller's rbp (it is the function's frame pointer).
So it answers where the compiler has put your buffer.
Now for other questions:
then in next instruction 0x0000000000001195 <+12>:
mov %rdi,-0x78(%rbp) its just setting -128 decimal to rdi.
As we said, rdi holds the first argument to a function. Here it holds the first argument to func(), which is the pointer name. This instruction stores that argument on the stack at offset -0x78 from rbp, the 8 bytes right below the space reserved for your buffer buf. Nothing is being set to -128 here; -0x78 is only an address offset.
And the last two questions are related:
also what this line 0x00000000000011a2 <+25>: mov %rax,-0x8(%rbp)
and
also what in above assembly the use of fs register what this instruction actually doing 0x0000000000001199 <+16>: mov %fs:0x28,%rax
0x0000000000001199 <+16>: mov %fs:0x28,%rax
0x00000000000011a2 <+25>: mov %rax,-0x8(%rbp)
...
...
0x00000000000011d7 <+78>: mov -0x8(%rbp),%rax
0x00000000000011db <+82>: sub %fs:0x28,%rax
0x00000000000011e4 <+91>: je 0x11eb <func+98>
0x00000000000011e6 <+93>: call 0x1080 <__stack_chk_fail@plt>
0x00000000000011eb <+98>: leave
There is some value at %fs:0x28 (which denotes offset 0x28 in the fs segment). This value is copied (via rax) onto the stack at -0x8(%rbp), the 8 bytes just below the saved rbp and above your buffer. There it stays, hopefully untouched, until the function is about to return. At that point the code checks whether the value on the stack has changed. If it is unchanged, the jump (je) takes you to the leave and the function returns. If the value on the stack has changed, your code has overflowed a buffer on the stack (aha!) and __stack_chk_fail is called, which reports the problem and aborts the program. So the value at %fs:0x28 is a kind of magic/canary value.
And one last thing - about why add $0xffffffffffffff80,%rsp was used to allocate space on the stack, and not sub - other compilers do use sub as did GCC (version 8.5.0 20210514):
sub $0x70,%rsp
It allocated less, and one of the reasons is that the compiler did not reserve space for the stack overflow check.
As to "why use an add %rsp rather than a sub %rsp instruction":
On x86_64 there are actually two encodings of these add/sub instructions with an immediate operand and rsp:
a 4 byte version with a 1 byte immediate
a 7 byte version with a 4 byte immediate
For both versions, the immediate will be sign-extended to 64 bits and then added to (or subtracted from) %rsp. Now because of that sign extension, a 1-byte immediate can be any value from -128 (-0x80) up to 127 (0x7f). So the instruction
add $-0x80, %rsp
can use the 4-byte encoding, while the instruction
sub $0x80, %rsp
would require the 7 byte encoding. All else being equal (as it never is), the shorter encoding is better as it occupies less memory/cache.

Overflow buffer in C on x86_64 to call function

Hello, I have the following code:
#include <stdio.h>
#include <string.h>
#define SECRET "1234567890AZXCVBNFRT"
int checksecret(){
char buf[32];
gets(buf);
if(strcmp(SECRET,buf)==0) return 1;
else return 0;
}
void outsecret(){
printf("%s\n",SECRET);
}
int main(int argc, char** argv){
if (checksecret()){
outsecret();
};
}
Disassembly of outsecret:
(gdb) disassemble outsecret
Dump of assembler code for function outsecret:
0x00000000004005f4 <+0>: push %rbp
0x00000000004005f5 <+1>: mov %rsp,%rbp
0x00000000004005f8 <+4>: mov $0x4006b4,%edi
0x00000000004005fd <+9>: callq 0x400480 <puts@plt>
0x0000000000400602 <+14>: pop %rbp
0x0000000000400603 <+15>: retq
Assuming that I don't know SECRET, I try to run my program with a string generated by python -c 'print "A" * 32 + "\x40\x05\xf4"[::-1]'. But it fails with a segmentation fault. What am I doing wrong? Thank you for any help.
PS
I want to call function outsecret by overwriting return code in checksecret
You have to remember that every string has an extra character that terminates it, so if you input 32 characters then gets will write 33 characters to the buffer. Writing beyond the bounds of an array leads to undefined behavior, which often leads to crashes.
The gets function has no bounds checking and is very dangerous to use. It has long been deprecated, and in the C11 standard it has even been removed.
$ python -c 'print "A" * 32 + "\x40\x05\xf4"[::1]'
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA#
$ perl -le 'print length("AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA#")'
33
Your input string is too long for a buffer of 32 characters (one extra is needed for the terminating null character '\0'). You are a victim of a buffer or array overflow (sometimes also called an array overrun).
Note that gets() was deprecated in C99 and was eventually dropped from the C11 Standard for security reasons.
I want to call function outsecret by overwriting return code in
checksecret
Beware, you are about to leave the relatively safe regions of the C Standard. This means that the behaviour depends on the compiler, its version, optimization settings, the ABI and so on (maybe including the current phase of the moon).
Under the x86 calling conventions, an integer return value is stored directly in the %eax register (assuming that you have an x86 or x86-64 CPU). The array buf, most likely located on the stack, is addressed by offsets from %rbp within the current stack frame. Let's consult gdb's disassemble command:
$ gcc -O0 test.c
$ gdb -q a.out
(gdb) b checksecret
(gdb) r
Breakpoint 1, 0x0000000000400631 in checksecret ()
(gdb) disas
Dump of assembler code for function checksecret:
0x000000000040062d <+0>: push %rbp
0x000000000040062e <+1>: mov %rsp,%rbp
=> 0x0000000000400631 <+4>: sub $0x30,%rsp
0x0000000000400635 <+8>: mov %fs:0x28,%rax
0x000000000040063e <+17>: mov %rax,-0x8(%rbp)
0x0000000000400642 <+21>: xor %eax,%eax
0x0000000000400644 <+23>: lea -0x30(%rbp),%rax
0x0000000000400648 <+27>: mov %rax,%rdi
0x000000000040064b <+30>: callq 0x400530 <gets@plt>
0x0000000000400650 <+35>: lea -0x30(%rbp),%rax
0x0000000000400654 <+39>: mov %rax,%rsi
0x0000000000400657 <+42>: mov $0x400744,%edi
0x000000000040065c <+47>: callq 0x400510 <strcmp@plt>
0x0000000000400661 <+52>: test %eax,%eax
0x0000000000400663 <+54>: jne 0x40066c <checksecret+63>
0x0000000000400665 <+56>: mov $0x1,%eax
0x000000000040066a <+61>: jmp 0x400671 <checksecret+68>
0x000000000040066c <+63>: mov $0x0,%eax
0x0000000000400671 <+68>: mov -0x8(%rbp),%rdx
0x0000000000400675 <+72>: xor %fs:0x28,%rdx
0x000000000040067e <+81>: je 0x400685 <checksecret+88>
0x0000000000400680 <+83>: callq 0x4004f0 <__stack_chk_fail@plt>
0x0000000000400685 <+88>: leaveq
0x0000000000400686 <+89>: retq
There is no way to overwrite %eax directly from C code, but what you could do is overwrite a selected fragment of the code section. In your case what you want is to replace:
0x000000000040066c <+63>: mov $0x0,%eax
with
0x000000000040066c <+63>: mov $0x1,%eax
It's easy to accomplish by gdb itself:
(gdb) x/2bx 0x40066c
0x40066c <checksecret+63>: 0xb8 0x00
(gdb) set {unsigned char}0x40066d = 1
Now let's confirm it:
(gdb) x/i 0x40066c
0x40066c <checksecret+63>: mov $0x1,%eax
From that point on, checksecret() returns 1 even if SECRET does not match. However, it wouldn't be so easy to do this through buf itself, as you would need to know (guess somehow?) the correct offset of the particular instruction in the code section.
The answers above describe clearly and correctly how to exploit the buffer overflow vulnerability. But there is a different way to get the same result without exploiting the vulnerability.
mince@rootlab tmp $ gcc test.c -o test
mince@rootlab tmp $ strings test
/lib64/ld-linux-x86-64.so.2
libc.so.6
gets
puts
__stack_chk_fail
strcmp
__libc_start_main
__gmon_start__
GLIBC_2.4
GLIBC_2.2.5
UH-X
UH-X
[]A\A]A^A_
1234567890AZXCVBNFRT
;*3$
Look at the last two rows: your secret key is the second to last.

How does GDB determine the address to break at when you do "break function-name"?

A simple example that demonstrates my issue:
// test.c
#include <stdio.h>
int foo1(int i) {
i = i * 2;
return i;
}
void foo2(int i) {
printf("greetings from foo! i = %i", i);
}
int main() {
int i = 7;
foo1(i);
foo2(i);
return 0;
}
$ clang -o test -O0 -Wall -g test.c
Inside GDB I do the following and start the execution:
(gdb) b foo1
(gdb) b foo2
After reaching the first breakpoint, I disassemble:
(gdb) disassemble
Dump of assembler code for function foo1:
0x0000000000400530 <+0>: push %rbp
0x0000000000400531 <+1>: mov %rsp,%rbp
0x0000000000400534 <+4>: mov %edi,-0x4(%rbp)
=> 0x0000000000400537 <+7>: mov -0x4(%rbp),%edi
0x000000000040053a <+10>: shl $0x1,%edi
0x000000000040053d <+13>: mov %edi,-0x4(%rbp)
0x0000000000400540 <+16>: mov -0x4(%rbp),%eax
0x0000000000400543 <+19>: pop %rbp
0x0000000000400544 <+20>: retq
End of assembler dump.
I do the same after reaching the second breakpoint:
(gdb) disassemble
Dump of assembler code for function foo2:
0x0000000000400550 <+0>: push %rbp
0x0000000000400551 <+1>: mov %rsp,%rbp
0x0000000000400554 <+4>: sub $0x10,%rsp
0x0000000000400558 <+8>: lea 0x400644,%rax
0x0000000000400560 <+16>: mov %edi,-0x4(%rbp)
=> 0x0000000000400563 <+19>: mov -0x4(%rbp),%esi
0x0000000000400566 <+22>: mov %rax,%rdi
0x0000000000400569 <+25>: mov $0x0,%al
0x000000000040056b <+27>: callq 0x400410 <printf@plt>
0x0000000000400570 <+32>: mov %eax,-0x8(%rbp)
0x0000000000400573 <+35>: add $0x10,%rsp
0x0000000000400577 <+39>: pop %rbp
0x0000000000400578 <+40>: retq
End of assembler dump.
GDB obviously uses different offsets (+7 in foo1 and +19 in foo2), with respect to the beginning of the function, when setting the breakpoint. How can I determine this offset by myself without using GDB?
gdb uses a few methods to determine this information.
First, the very best way is if your compiler emits DWARF describing the function. Then gdb can decode the DWARF to find the end of the prologue.
However, this isn't always available. GCC emits it, but IIRC only when optimization is used.
I believe there's also a convention that if the first line number of a function is repeated in the line table, then the address of the second instance is used as the end of the prologue. That is if the lines look like:
< function f >
line 23 0xffff0000
line 23 0xffff0010
Then gdb will assume that the function f's prologue is complete at 0xffff0010.
I think this is the mode used by gcc when not optimizing.
Finally gdb has some prologue decoders that know how common prologues are written on many platforms. These are used when debuginfo isn't available, though offhand I don't recall what the purpose of that is.
As others mentioned, even without debugging symbols GDB has a function prologue decoder, i.e. heuristic magic.
To disable that, you can add an asterisk before the function name:
break *func
In Binutils 2.25 the skip algorithm seems to be at symtab.c:skip_prologue_sal, which breakpoints.c:break_command, the command definition, calls indirectly.
The prologue is a common "boilerplate" used at the start of function calls.
The prologue of foo2 is longer than that of foo1 by two instructions, for these reasons:
sub $0x10,%rsp
foo2 calls another function, so it is not a leaf function. This prevents some optimizations; in particular, it must decrease rsp before the call to reserve room for its local state.
Leaf functions don't need that because of the 128 byte ABI red zone, see also: Why does the x86-64 GCC function prologue allocate less stack than the local variables?
foo1 however is a leaf function.
lea 0x400644,%rax
For some reason, clang stores the address of local string constants (stored in .rodata) in registers as part of the function prologue.
We know that rax contains "greetings from foo! i = %i" because it is then passed to %rdi, the first argument of printf.
foo1 does not have local strings constants however.
The other instructions of the prologue are common to both functions:
rbp manipulation is discussed at: What is the purpose of the EBP frame pointer register?
mov %edi,-0x4(%rbp) stores the first argument on the stack. This is not required on leaf functions, but clang does it anyways. It makes register allocation easier.
On ELF platforms like linux, debug information is stored in a separate (non-executable) section in the executable. In this separate section there is all the information that is needed by the debugger. Check the DWARF2 specification for the specifics.

My overflow code does not work

The code below is from the well-known article Smashing The Stack For Fun And Profit.
void function(int a, int b, int c) {
char buffer1[5];
char buffer2[10];
int *ret;
ret = buffer1 + 12;
(*ret)+=8;
}
void main() {
int x;
x=0;
function(1,2,3);
x=1;
printf("%d\n",x);
}
I think I must explain my target of this code.
The stack model is below. The number below each name is the number of bytes that item occupies on the stack. To overwrite RET and skip the statement I want, I calculate the offset from buffer1 to RET as 8+4=12, since the architecture is x86 Linux.
buffer2 buffer1 BSP RET a b c
(12) (8) (4) (4) (4) (4) (4)
I want to skip the statement x=1; and let printf() output 0 on the screen.
I compile the code with:
gcc stack2.c -g
and run it in gdb:
gdb ./a.out
gdb gives me the result like this:
Program received signal SIGSEGV, Segmentation fault.
main () at stack2.c:17
17 x = 1;
I think Linux uses some mechanism to protect against stack overflow. Maybe Linux stores the RET address in another place and compares it with the RET address on the stack before a function returns.
What are the details of this mechanism? How should I rewrite the code to make the program output 0?
OK, the disassembled code is below. It comes from the output of gdb, since I think that is easier for you to read. Also, can anybody tell me how to paste a long code sequence? Copying and pasting line by line is tiring...
Dump of assembler code for function main:
0x08048402 <+0>: push %ebp
0x08048403 <+1>: mov %esp,%ebp
0x08048405 <+3>: sub $0x10,%esp
0x08048408 <+6>: movl $0x0,-0x4(%ebp)
0x0804840f <+13>: movl $0x3,0x8(%esp)
0x08048417 <+21>: movl $0x2,0x4(%esp)
0x0804841f <+29>: movl $0x1,(%esp)
0x08048426 <+36>: call 0x80483e4 <function>
0x0804842b <+41>: movl $0x1,-0x4(%ebp)
0x08048432 <+48>: mov $0x8048520,%eax
0x08048437 <+53>: mov -0x4(%ebp),%edx
0x0804843a <+56>: mov %edx,0x4(%esp)
0x0804843e <+60>: mov %eax,(%esp)
0x08048441 <+63>: call 0x804831c <printf@plt>
0x08048446 <+68>: mov $0x0,%eax
0x0804844b <+73>: leave
0x0804844c <+74>: ret
Dump of assembler code for function function:
0x080483e4 <+0>: push %ebp
0x080483e5 <+1>: mov %esp,%ebp
0x080483e7 <+3>: sub $0x14,%esp
0x080483ea <+6>: lea -0x9(%ebp),%eax
0x080483ed <+9>: add $0x3,%eax
0x080483f0 <+12>: mov %eax,-0x4(%ebp)
0x080483f3 <+15>: mov -0x4(%ebp),%eax
0x080483f6 <+18>: mov (%eax),%eax
0x080483f8 <+20>: lea 0x8(%eax),%edx
0x080483fb <+23>: mov -0x4(%ebp),%eax
0x080483fe <+26>: mov %edx,(%eax)
0x08048400 <+28>: leave
0x08048401 <+29>: ret
I checked the assembly code and found a mistake in my program, so I rewrote (*ret)+=8 as (*ret)+=7, since 0x08048432 <+48> minus 0x0804842b <+41> is 7.
Because that article is from 1996 and the assumptions are incorrect.
Refer to "Smashing The Modern Stack For Fun And Profit"
http://www.ethicalhacker.net/content/view/122/24/
From the above link:
However, the GNU C Compiler (gcc) has evolved since 1998, and as a result, many people are left wondering why they can't get the examples to work for them, or if they do get the code to work, why they had to make the changes that they did.
The function function overwrites some place on the stack outside of its own frame, which in this case is the stack frame of main. What it overwrites I don't know, but it causes the segmentation fault you see. It might be some protection employed by the operating system, but it might as well be that the generated code just does something wrong when a wrong value is at that position on the stack.
This is a really good example of what may happen when you write outside of your allocated memory. It might crash directly, it might crash somewhere completely different, or it might not crash at all but instead just do some calculation wrong.
Try ret = buffer1 + 3;
Explanation: ret is an integer pointer; incrementing it by 1 adds 4 bytes to the address on 32bit machines.

Resources