I've been tasked with converting IA32 code to Y86. The original program was written in C and is intended to take an array of integers in which the even positioned values call one of three functions and the odd positioned values are operated on within that function. The functions include the negation of a number, the square of a number, and the sum from 1 to the supplied number.
Most of the instructions are easily converted from IA32 to Y86, but there are a number of instructions that are giving me a really hard time.
0000001e <negation>:
1e: 55 push %ebp
1f: 89 e5 mov %esp,%ebp
21: 8b 45 08 mov 0x8(%ebp),%eax
24: f7 d8 neg %eax
26: 5d pop %ebp
27: c3 ret
The neg instruction is not a valid instruction in Y86. This is what I have in Y86:
# int Negation(int x)
Negation:
pushl %ebp
pushl %esi
rrmovl %esp,%ebp
mrmovl 0x8(%ebp),%eax
irmovl %esi,$0
subl %eax, %esi
rrmovl %esi, %eax
popl %esi
popl %ebp
ret
Is this the correct way to go about this problem?
Another instruction is the imul instruction in my square function:
00000028 <square>:
28: 55 push %ebp
29: 89 e5 mov %esp,%ebp
2b: 8b 45 08 mov 0x8(%ebp),%eax
2e: 0f af c0 imul %eax,%eax
31: 5d pop %ebp
32: c3 ret
Does anyone know how the "imul" instruction can be converted in this situation?
Thanks for the help! Any tips on IA32/Y86 Conversion would be greatly appreciated too.
For implementing imul, you might want to look at using a shift and add routine to implement a mul routine:
http://en.wikipedia.org/wiki/Multiplication_algorithm#Peasant_or_binary_multiplication
Then for imul just use the following steps:
1) figure out what sign the result should have
2) convert the operands to absolute values (using your negation routine)
3) call your mul routine on the positive values
4) convert the result to negative if necessary
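As a rough illustration, the four steps above can be modelled in C before hand-translating them to Y86. This is only a sketch: the names `mul_shift_add` and `imul_sketch` are made up for this example, and the C shift operators stand in for whatever doubling and halving you end up building in Y86 (assuming the standard Y86 instruction set, which has no shift instructions).

```c
#include <stdint.h>

/* Unsigned shift-and-add ("peasant") multiplication. */
uint32_t mul_shift_add(uint32_t a, uint32_t b)
{
    uint32_t result = 0;
    while (b != 0) {
        if (b & 1u)        /* lowest bit of b set: add the current a */
            result += a;
        a <<= 1;           /* double a */
        b >>= 1;           /* halve b */
    }
    return result;
}

/* Signed multiply built on top of the unsigned routine. */
int32_t imul_sketch(int32_t x, int32_t y)
{
    int negative = (x < 0) != (y < 0);                       /* step 1: sign of result */
    uint32_t ux = x < 0 ? 0u - (uint32_t)x : (uint32_t)x;    /* step 2: absolute values */
    uint32_t uy = y < 0 ? 0u - (uint32_t)y : (uint32_t)y;
    uint32_t product = mul_shift_add(ux, uy);                /* step 3: unsigned multiply */
    if (negative)
        product = 0u - product;                              /* step 4: restore the sign */
    /* Converting back to int32_t is implementation-defined for values
       above INT32_MAX; on common two's complement targets it wraps. */
    return (int32_t)product;
}
```

For the square function in the question, the call would be `imul_sketch(x, x)`; since a square is never negative, steps 1 and 4 could be dropped there.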
1) is mrmovl 0x4(%esp),%eax allowed?
ixorl %eax, 0xffffffff
iaddl %eax, 1
should be slightly more efficient (also, %ebp can be used as a general-purpose register, so there is no need to push %esi).
2) For multiplication there are indeed shift-and-add options,
but also a LUT-based approach, exploiting the fact that 4*a*b = (a+b)^2 - (a-b)^2
for each 8x8-bit or NxN-bit multiplication.
For a = (h<<8) + l and B = (H<<8) + L, aB = L*l + ((h*L + H*l) << 8) + ((h*H) << 16),
which could be handled using 3 different tables:
s1[n] = n^2 >> 2; s2[n] = n^2 << 6; s3[n] = n^2 << 14;
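The identity can be checked with a small C model of the 8x8-bit case. This is a sketch under assumptions: the table name `quarter_sq` and the functions `init_quarter_squares` and `mul8_lut` are invented for this example, and the table is filled with the recurrence n^2 = (n-1)^2 + 2n - 1 so that no multiply is needed even while building it.

```c
#include <stdint.h>

/* quarter_sq[n] = floor(n*n / 4) for n = 0..510 (510 = 255 + 255). */
static uint32_t quarter_sq[511];

void init_quarter_squares(void)
{
    uint32_t square = 0;                 /* running value of n*n */
    quarter_sq[0] = 0;
    for (uint32_t n = 1; n < 511; n++) {
        square += n + n - 1;             /* n^2 = (n-1)^2 + 2n - 1 */
        quarter_sq[n] = square >> 2;
    }
}

/* a*b = floor((a+b)^2 / 4) - floor((a-b)^2 / 4).
   The result is exact because a+b and a-b always have the same parity,
   so both floors discard the same fraction. */
uint32_t mul8_lut(uint8_t a, uint8_t b)
{
    uint32_t sum  = (uint32_t)a + b;
    uint32_t diff = a > b ? (uint32_t)a - b : (uint32_t)b - a;
    return quarter_sq[sum] - quarter_sq[diff];
}
```

`init_quarter_squares()` has to run once before the first `mul8_lut` call. In a Y86 program the table would more likely be emitted as static data than computed at start-up.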
For negation, you reversed the operands for the irmovl instruction.
The following code works:
#
# Negate a number in %ebx by subtracting it from 0
#
Start:
irmovl $999, %eax // Some random value to prove non-destructiveness
irmovl Stack, %esp // Set the stack
pushl %eax // Preserve
Go:
irmovl $300, %ebx
xorl %eax, %eax
subl %ebx,%eax
rrmovl %eax, %ebx
Finish:
popl %eax // Restore
halt
.pos 0x0100
Stack:
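For reference, here is what the two negation strategies mentioned in this thread compute, written as a small C model. The function names are made up for illustration, and unsigned arithmetic is used so that the wraparound is well defined in C. It is a sketch of the arithmetic only, not Y86 code.

```c
#include <stdint.h>

/* Negate by subtracting from zero, like xorl %eax,%eax followed by subl %ebx,%eax. */
uint32_t negate_by_sub(uint32_t x)
{
    return 0u - x;
}

/* Negate via two's complement: flip every bit, then add one. */
uint32_t negate_by_xor(uint32_t x)
{
    return (x ^ 0xffffffffu) + 1u;
}
```

Both functions return the same bit pattern for every input, which is why either approach can replace the IA32 neg instruction.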
Let test_speed.c be the following C code:
#include <stdio.h>
int main(){
int i;
for(i=0; i < 1000000000; i++) {}
printf("%d", i);
}
I run this in the terminal:
gcc -o test_speed test_speed.c
and then:
time ./test_speed
I get:
Now I run the following:
gcc -O3 -o test_speed test_speed.c
and then:
time ./test_speed
I get:
How can the second run be this fast? Is the result already computed during compilation?
That's because the aggressive optimization at -O3 determines that
for(i=0; i < 1000000000; i++) {}
has no side effect (except for the value of i) and removes the loop completely (directly setting i to 1000000000).
Disassembly (x86):
00000000 <_main>:
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: 83 e4 f0 and $0xfffffff0,%esp
6: 83 ec 10 sub $0x10,%esp
9: e8 00 00 00 00 call e <_main+0xe>
e: c7 44 24 04 00 ca 9a movl $0x3b9aca00,0x4(%esp) <== 1000000000 in hex, no loop
15: 3b
16: c7 04 24 00 00 00 00 movl $0x0,(%esp)
1d: e8 00 00 00 00 call 22 <_main+0x22>
22: 31 c0 xor %eax,%eax
24: c9 leave
25: c3 ret
That optimization level is not suitable for calibrated active-CPU (busy-wait) loops, as you can see (the result is the same with -O2, but the loop remains unoptimized with just -O).
GCC "knows" that there is no body in the loop, and no dependency on any result, temporary or real, so it removes the loop.
A good tool for analysis like this is godbolt.org, which shows you the generated assembly. The difference between no optimization at all and the -O3 optimization is stark:
[Figure: No optimization]
[Figure: With -O3]
A compiler only has to keep the observable behavior of a program. Counting a variable without any I/O, interaction, or just using its value isn't observable, so as your loop doesn't do anything, the optimizer just throws it away completely and directly assigns the final value.
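To illustrate the point about observable behavior: if the loop counter is declared volatile, every read and write of it counts as observable, so the compiler has to keep the loop. A minimal sketch, where the function name `spin` is made up for this example:

```c
/* Busy-wait for the given number of iterations.
   volatile forces each access to i to really happen, so the loop
   survives -O2/-O3 instead of being folded into a constant. */
unsigned long spin(unsigned long iterations)
{
    volatile unsigned long i;
    for (i = 0; i < iterations; i++) {
        /* intentionally empty */
    }
    return i;
}
```

How long each iteration takes still depends on the compiler and the machine, so this keeps the loop alive but does not make it a reliable timer.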
The compiler recognizes that the loop does nothing, and that removing it would not change the output of the program, so the loop was optimized away entirely.
Here's the assembly with -O0:
.L3:
.loc 1 4 0 is_stmt 0 discriminator 3
addl $1, -4(%rbp)
.L2:
.loc 1 4 0 discriminator 1
cmpl $999999999, -4(%rbp) # loop
jle .L3
.loc 1 5 0 is_stmt 1
movl -4(%rbp), %eax
movl %eax, %esi
movl $.LC0, %edi
movl $0, %eax
call printf
movl $0, %eax
.loc 1 6 0
leave
.cfi_def_cfa 7, 8
ret
And with -O3:
main:
.LFB23:
.file 1 "x1.c"
.loc 1 2 0
.cfi_startproc
.LVL0:
subq $8, %rsp
.cfi_def_cfa_offset 16
.LBB4:
.LBB5:
.file 2 "/usr/include/x86_64-linux-gnu/bits/stdio2.h"
.loc 2 104 0
movl $1000000000, %edx # stored value, no loop
movl $.LC0, %esi
movl $1, %edi
xorl %eax, %eax
call __printf_chk
.LVL1:
.LBE5:
.LBE4:
.loc 1 6 0
xorl %eax, %eax
addq $8, %rsp
.cfi_def_cfa_offset 8
ret
You can see that in the -O3 case the loop is removed entirely and the final value of i, 1000000000, is stored directly.
I am trying to find the meaning of the assembly code generated from a C program. Here is the program in C:
int* a = &argc;
int b = 8;
a = &b;
Here is the assembly code generated with explanations. There is one part that I do not understand:
Prologue of the main:
leal 4(%esp), %ecx
andl $-16, %esp
pushl -4(%ecx)
pushl %ebp
movl %esp, %ebp
pushl %ecx
subl $36, %esp
Load the address of argc in %eax:
movl %ecx, %eax
The part I do not get:
movl 4(%eax), %edx
movl %edx, -28(%ebp)
Stack-Smashing Protector code (setup):
movl %gs:20, %ecx
movl %ecx, -12(%ebp)
xorl %ecx, %ecx
Load values in a and b (see in main.c):
movl %eax, -16(%ebp)
movl $8, -20(%ebp)
Modify the value of a (a = &b):
leal -20(%ebp), %eax
movl %eax, -16(%ebp)
Stack-Smashing Protector code (verify the stack is ok):
movl $0, %eax
movl -12(%ebp), %edx
xorl %gs:20, %edx
je .L7
call __stack_chk_fail
If the stack is Ok:
.L7:
addl $36, %esp
popl %ecx
popl %ebp
leal -4(%ecx), %esp
ret
So the part I do not understand is the store to -28(%ebp), an address that is never used afterwards. Does anyone know why this part is generated?
Here is a good way to see what the compiler does. I assume you have a file called main.c:
int main(int argc, char **argv)
{
int* a = &argc;
int b = 8;
a = &b;
}
Compile with debug info to an object file:
$ gcc -c -g main.c
View the assembly:
$ objdump -S main.o
main.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <main>:
int main(int argc, char **argv)
{
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: 89 7d ec mov %edi,-0x14(%rbp)
7: 48 89 75 e0 mov %rsi,-0x20(%rbp)
int* a = &argc;
b: 48 8d 45 ec lea -0x14(%rbp),%rax
f: 48 89 45 f8 mov %rax,-0x8(%rbp)
int b = 8;
13: c7 45 f4 08 00 00 00 movl $0x8,-0xc(%rbp)
a = &b;
1a: 48 8d 45 f4 lea -0xc(%rbp),%rax
1e: 48 89 45 f8 mov %rax,-0x8(%rbp)
22: b8 00 00 00 00 mov $0x0,%eax
}
27: 5d pop %rbp
28: c3 retq
Then do the same with full optimization:
$ gcc -c -g -O3 main.c
And view the assembly again:
$ objdump -S main.o
main.o: file format elf64-x86-64
Disassembly of section .text.startup:
0000000000000000 <main>:
int main(int argc, char **argv)
{
int* a = &argc;
int b = 8;
a = &b;
}
0: 31 c0 xor %eax,%eax
2: c3 retq
So the answer is yes: the compiler can produce instructions that are not needed. That's why you turn on optimizations. When they are turned off, the compiler does its job in a very generic way, without analyzing whether the result is actually used. For example, it reserves space for variables that are never used.
I know that register variables are stored in CPU registers,
and that the same variables are stored on the stack if the CPU registers are busy/full.
How can I know whether a variable is stored on the stack or in a CPU register?
No, you can't.
It's decided by the compiler, and might change between compilations if, for instance, the surrounding code changes the register pressure or if compiler flags are changed.
I agree with Mr. Unwind's answer, but to some extent this approach may be helpful to you:
File name x.c:
#include <stdio.h>
int main(){
register int i=0;
i++;
printf("%d",i);
}
Generate the assembly:
~$ gcc x.c -S
The output file name is x.s.
In my case the ebx register is used, which may differ from one compilation to another.
~$ cat x.s
.file "x.c"
.section .rodata
.LC0:
.string "%d"
.text
.globl main
.type main, @function
main:
pushl %ebp
movl %esp, %ebp
andl $-16, %esp
pushl %ebx
subl $28, %esp
movl $0, %ebx
addl $1, %ebx // because i++
movl $.LC0, %eax
movl %ebx, 4(%esp)
movl %eax, (%esp)
call printf
addl $28, %esp
popl %ebx
movl %ebp, %esp
popl %ebp
ret
You can also disassemble your executable using objdump:
$ gcc x.c -o x
$ objdump x -d
Partial assembly output using objdump command:
080483c4 <main>:
80483c4: 55 push %ebp
80483c5: 89 e5 mov %esp,%ebp
80483c7: 83 e4 f0 and $0xfffffff0,%esp
80483ca: 53 push %ebx
80483cb: 83 ec 1c sub $0x1c,%esp
80483ce: bb 00 00 00 00 mov $0x0,%ebx
80483d3: 83 c3 01 add $0x1,%ebx //due to i++
80483d6: b8 b0 84 04 08 mov $0x80484b0,%eax
80483db: 89 5c 24 04 mov %ebx,0x4(%esp)
80483df: 89 04 24 mov %eax,(%esp)
80483e2: e8 0d ff ff ff call 80482f4 <printf@plt>
80483e7: 83 c4 1c add $0x1c,%esp
80483ea: 5b pop %ebx
80483eb: 89 ec mov %ebp,%esp
80483ed: 5d pop %ebp
80483ee: c3 ret
80483ef: 90 nop
The %ebx register is reserved for the register variable.
I also agree with unwind's answer. On the other hand, inspecting the code in GDB may reveal where the variables are stored. Debugging some arbitrary code I have gives the locals of that frame as below,
(gdb) info locals
i = 0
ret = <value optimized out>
k = 0
ctx = (BN_CTX *) 0x632e1cc8
A1 = (BIGNUM *) 0x632e1cd0
A1_odd = (BIGNUM *) 0x632e1ce8
check = <value optimized out>
mont = (BN_MONT_CTX *) 0x632e2108
A = (const BIGNUM *) 0x632e2028
Now if I try printing the address of the locals, it tells me the storage location as below,
(gdb) p &i
$16 = (int *) 0x143fba40
(gdb) p &k
$17 = (int *) 0x143fba38
(gdb) p &mont
Address requested for identifier "mont" which is in register $s7
(gdb)
Here objects i and k are on stack and mont is in register $s7.
According to the book "The C Programming Language, Second Edition" by Brian W. Kernighan and Dennis M. Ritchie (Ritchie being the creator of the C language), you cannot.
Chapter 4, Page 84,
"... And it is not possible to take the address of a register variable,
regardless of whether the variable is actually placed in a register."
Hope that helps!
Best of luck in the future,
Ron
I am a little bit confused about the difference between
leal -4(%ebp), %eax
and
movl -4(%ebp), %eax
Can someone explain this to me?
LEA (load effective address) just computes the address of the operand, it does not actually dereference it. Most of the time, it's just doing a calculation like a combined multiply-and-add for, say, array indexing.
In this case, it's doing a simple numeric subtraction: leal -4(%ebp), %eax just assigns to the %eax register the value of %ebp - 4. It's equivalent to a single sub instruction, except a sub requires the destination to be the same as one of the sources.
The movl instruction, in contrast, accesses the memory location at %ebp - 4 and stores that value into %eax.
If you wish to look at this in terms of a different programming language, then:
int var;
[ ... ]
func (var, &var);
evaluates to the following (Linux x86_64) assembly code:
[ ... ]
4: 8b 7c 24 0c mov 0xc(%rsp),%edi
8: 48 8d 74 24 0c lea 0xc(%rsp),%rsi
d: e8 xx xx xx xx callq ... <func>
[ ... ]
Since %rdi / %rsi are the 1st / 2nd arguments, you can see that lea ... retrieves the address &var of a variable, while mov ... loads/stores the value var of the same.
I.e. in assembly, the use of lea instead of mov is similar to using the address-of & operator in C/C++, not the (value of) a variable itself.
lea has far more uses than that, but you explicitly asked about the difference between the two.
For illustration: mov with a memory operand always performs a memory access (load or store), while the memory operand to lea is merely treated as pointer arithmetic - i.e. the address is calculated and resolved but no memory access happens at the instruction itself. These two:
lea 1234(%eax, %ebx, 8), %ecx
movl (%ecx), %ecx
result in the same as:
movl 1234(%eax, %ebx, 8), %ecx
while the following:
leal (%eax, %eax, 4), %eax
multiplies the value in %eax by five.
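The distinction can also be modelled in C. The sketch below is illustrative only, and the names `lea_like` and `mov_like` are made up: the first function does nothing but the address arithmetic of an operand such as disp(base, index, scale), while the second computes an address and then actually reads memory there.

```c
#include <stdint.h>
#include <stddef.h>

/* What lea does: compute base + index*scale + disp, with no memory access. */
uintptr_t lea_like(uintptr_t base, uintptr_t index, uintptr_t scale, intptr_t disp)
{
    return base + index * scale + (uintptr_t)disp;
}

/* What mov with a memory operand does: form the address, then load from it. */
int32_t mov_like(const int32_t *base, size_t index)
{
    return base[index];
}
```

With base and index both set to the same value and a scale of 4, `lea_like` returns five times that value, which mirrors the leal (%eax, %eax, 4), %eax trick; with index 0 and disp -4 it returns base minus 4, like leal -4(%ebp), %eax.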
leal is equivalent to LEA in Intel syntax: load effective address. The l suffix marks a long (32-bit) operand.
I am trying to understand the assembly level code for a simple C program by inspecting it with gdb's disassembler.
Following is the C code:
#include <stdio.h>
void function(int a, int b, int c) {
char buffer1[5];
char buffer2[10];
}
void main() {
function(1,2,3);
}
Following is the disassembly code for both main and function
(gdb) disass main
Dump of assembler code for function main:
0x08048428 <main+0>: push %ebp
0x08048429 <main+1>: mov %esp,%ebp
0x0804842b <main+3>: and $0xfffffff0,%esp
0x0804842e <main+6>: sub $0x10,%esp
0x08048431 <main+9>: movl $0x3,0x8(%esp)
0x08048439 <main+17>: movl $0x2,0x4(%esp)
0x08048441 <main+25>: movl $0x1,(%esp)
0x08048448 <main+32>: call 0x8048404 <function>
0x0804844d <main+37>: leave
0x0804844e <main+38>: ret
End of assembler dump.
(gdb) disass function
Dump of assembler code for function function:
0x08048404 <function+0>: push %ebp
0x08048405 <function+1>: mov %esp,%ebp
0x08048407 <function+3>: sub $0x28,%esp
0x0804840a <function+6>: mov %gs:0x14,%eax
0x08048410 <function+12>: mov %eax,-0xc(%ebp)
0x08048413 <function+15>: xor %eax,%eax
0x08048415 <function+17>: mov -0xc(%ebp),%eax
0x08048418 <function+20>: xor %gs:0x14,%eax
0x0804841f <function+27>: je 0x8048426 <function+34>
0x08048421 <function+29>: call 0x8048340 <__stack_chk_fail@plt>
0x08048426 <function+34>: leave
0x08048427 <function+35>: ret
End of assembler dump.
I am seeking answers to the following:
How does the addressing work, i.e. (main+0), (main+1), (main+3)?
In main, why is and $0xfffffff0,%esp being used?
In function, why are mov %gs:0x14,%eax and mov %eax,-0xc(%ebp) being used?
If someone can explain what is happening step by step, that would be greatly appreciated.
The reason for the "strange" addresses such as main+0, main+1, main+3, main+6 and so on, is that each instruction takes up a variable number of bytes. For example:
main+0: push %ebp
is a one-byte instruction so the next instruction is at main+1. On the other hand,
main+3: and $0xfffffff0,%esp
is a three-byte instruction so the next instruction after that is at main+6.
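Put differently, each main+N label is just the running sum of the lengths of the instructions before it. A small C sketch of that bookkeeping (the function name `offsets_from_lengths` is made up for this example):

```c
#include <stddef.h>

/* Fill offsets[i] with the byte offset of instruction i,
   given the encoded length of each instruction. */
void offsets_from_lengths(const unsigned *lengths, size_t count, unsigned *offsets)
{
    unsigned offset = 0;
    for (size_t i = 0; i < count; i++) {
        offsets[i] = offset;     /* instruction i starts here */
        offset += lengths[i];    /* the next one starts right after it */
    }
}
```

The lengths implied by the dump of main in the question are 1, 2, 3, 3, 8, 8, 7, 5, 1, 1 bytes; feeding those in gives the offsets 0, 1, 3, 6, 9, 17, 25, 32, 37, 38 that gdb prints.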
And, since you ask in the comments why movl seems to take a variable number of bytes, the explanation for that is as follows.
Instruction length depends not only on the opcode (such as movl) but also on the addressing modes for the operands (the things the opcode is operating on). I haven't checked specifically for your code but I suspect the
movl $0x1,(%esp)
instruction is probably shorter because there's no offset involved - it just uses esp as the address. Whereas something like:
movl $0x2,0x4(%esp)
requires everything that movl $0x1,(%esp) does, plus an extra byte for the offset 0x4.
In fact, here's a debug session showing what I mean:
Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.
c:\pax> debug
-a
0B52:0100 mov word ptr [di],7
0B52:0104 mov word ptr [di+2],8
0B52:0109 mov word ptr [di+0],7
0B52:010E
-u100,10d
0B52:0100 C7050700 MOV WORD PTR [DI],0007
0B52:0104 C745020800 MOV WORD PTR [DI+02],0008
0B52:0109 C745000700 MOV WORD PTR [DI+00],0007
-q
c:\pax> _
You can see that the second instruction with an offset is actually different to the first one without it. It's one byte longer (5 bytes instead of 4, to hold the offset) and actually has a different encoding c745 instead of c705.
You can also see that you can encode the first and third instruction in two different ways but they basically do the same thing.
The and $0xfffffff0,%esp instruction is a way to force esp to be on a specific boundary. This is used to ensure proper alignment of variables. Many memory accesses on modern processors will be more efficient if they follow the alignment rules (such as a 4-byte value having to be aligned to a 4-byte boundary). Some modern processors will even raise a fault if you don't follow these rules.
After this instruction, you're guaranteed that esp is both less than or equal to its previous value and aligned to a 16 byte boundary.
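The effect of the mask is easy to model in C. This is a minimal sketch; the function name `align_down_16` is made up, and any stack pointer value you feed it is hypothetical.

```c
#include <stdint.h>

/* Model of: and $0xfffffff0,%esp
   Clearing the low four bits rounds the value down to a multiple of 16. */
uint32_t align_down_16(uint32_t esp)
{
    return esp & 0xfffffff0u;
}
```

For example, a hypothetical stack pointer of 0xbffff7ac becomes 0xbffff7a0, while a value that is already a multiple of 16 is left unchanged.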
The gs: prefix simply means to use the gs segment register to access memory rather than the default.
The instruction mov %eax,-0xc(%ebp) means to take the contents of the ebp register, subtract 12 (0xc) and then put the value of eax into that memory location.
Re the explanation of the code. Your function function is basically one big no-op. The assembly generated is limited to stack frame setup and teardown, along with some stack frame corruption checking which uses the aforementioned %gs:0x14 memory location.
It loads the value from that location (probably something like 0xdeadbeef) into the stack frame, does its job, then checks the stack to ensure it hasn't been corrupted.
Its job, in this case, is nothing. So all you see is the function administration stuff.
Stack set-up occurs between function+0 and function+12. Everything after that is setting up the return code in eax and tearing down the stack frame, including the corruption check.
Similarly, main consists of stack frame set-up, pushing the parameters for function, calling function, tearing down the stack frame and exiting.
Comments have been inserted into the code below:
0x08048428 <main+0>: push %ebp ; save previous value.
0x08048429 <main+1>: mov %esp,%ebp ; create new stack frame.
0x0804842b <main+3>: and $0xfffffff0,%esp ; align to boundary.
0x0804842e <main+6>: sub $0x10,%esp ; make space on stack.
0x08048431 <main+9>: movl $0x3,0x8(%esp) ; push values for function.
0x08048439 <main+17>: movl $0x2,0x4(%esp)
0x08048441 <main+25>: movl $0x1,(%esp)
0x08048448 <main+32>: call 0x8048404 <function> ; and call it.
0x0804844d <main+37>: leave ; tear down frame.
0x0804844e <main+38>: ret ; and exit.
0x08048404 <func+0>: push %ebp ; save previous value.
0x08048405 <func+1>: mov %esp,%ebp ; create new stack frame.
0x08048407 <func+3>: sub $0x28,%esp ; make space on stack.
0x0804840a <func+6>: mov %gs:0x14,%eax ; get sentinel value.
0x08048410 <func+12>: mov %eax,-0xc(%ebp) ; put on stack.
0x08048413 <func+15>: xor %eax,%eax ; set return code 0.
0x08048415 <func+17>: mov -0xc(%ebp),%eax ; get sentinel from stack.
0x08048418 <func+20>: xor %gs:0x14,%eax ; compare with actual.
0x0804841f <func+27>: je <func+34> ; jump if okay.
0x08048421 <func+29>: call <_stk_chk_fl> ; otherwise corrupted stack.
0x08048426 <func+34>: leave ; tear down frame.
0x08048427 <func+35>: ret ; and exit.
I think the reason for the %gs:0x14 may be evident from above but, just in case, I'll elaborate here.
It uses this value (a sentinel) to put in the current stack frame so that, should something in the function do something silly like write 1024 bytes to a 20-byte array created on the stack or, in your case:
char buffer1[5];
strcpy (buffer1, "Hello there, my name is Pax.");
then the sentinel will be overwritten and the check at the end of the function will detect that, calling the failure function to let you know, and then probably aborting so as to avoid any other problems.
If it placed 0xdeadbeef onto the stack and this was changed to something else, then an xor with 0xdeadbeef would produce a non-zero value which is detected in the code with the je instruction.
The relevant bit is paraphrased here:
mov %gs:0x14,%eax ; get sentinel value.
mov %eax,-0xc(%ebp) ; put on stack.
;; Weave your function
;; magic here.
mov -0xc(%ebp),%eax ; get sentinel back from stack.
xor %gs:0x14,%eax ; compare with original value.
je stack_ok ; zero/equal means no corruption.
call stack_bad ; otherwise corrupted stack.
stack_ok: leave ; tear down frame.
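The comparison itself can be modelled in a couple of lines of C. This is only a sketch of the xor-and-test step: the function name `sentinel_intact` is made up, and 0xdeadbeef is just the placeholder used in this answer, since the real guard value is typically chosen at run time.

```c
#include <stdint.h>

/* Returns 1 when the copy kept in the stack frame still matches the guard,
   0 when something has overwritten it. */
int sentinel_intact(uint32_t saved_copy, uint32_t guard)
{
    return (saved_copy ^ guard) == 0;   /* xor is zero only if every bit matches */
}
```

With the placeholder value, `sentinel_intact(0xdeadbeef, 0xdeadbeef)` is 1, whereas any overwritten copy, such as one holding bytes from an overlong strcpy, yields 0 and corresponds to the call to the failure routine.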
Pax has produced a definitive answer. However, for completeness, I thought I'd add a note on getting GCC itself to show you the assembly it generates.
The -S option to GCC tells it to stop compilation and write the assembly to a file. Normally, it either passes that file to the assembler or for some targets writes the object file directly itself.
For the sample code in the question:
#include <stdio.h>
void function(int a, int b, int c) {
char buffer1[5];
char buffer2[10];
}
void main() {
function(1,2,3);
}
the command gcc -S q3654898.c creates a file named q3654898.s:
.file "q3654898.c"
.text
.globl _function
.def _function; .scl 2; .type 32; .endef
_function:
pushl %ebp
movl %esp, %ebp
subl $40, %esp
leave
ret
.def ___main; .scl 2; .type 32; .endef
.globl _main
.def _main; .scl 2; .type 32; .endef
_main:
pushl %ebp
movl %esp, %ebp
subl $24, %esp
andl $-16, %esp
movl $0, %eax
addl $15, %eax
addl $15, %eax
shrl $4, %eax
sall $4, %eax
movl %eax, -4(%ebp)
movl -4(%ebp), %eax
call __alloca
call ___main
movl $3, 8(%esp)
movl $2, 4(%esp)
movl $1, (%esp)
call _function
leave
ret
One thing that is evident is that my GCC (gcc (GCC) 3.4.5 (mingw-vista special r3)) doesn't include the stack check code by default. I imagine that there is a command line option, or that if I ever got around to nudging my MinGW install up to a more current GCC that it could.
Edit: Nudged to do so by Pax, here's another way to get GCC to do more of the work.
C:\Documents and Settings\Ross\My Documents\testing>gcc -Wa,-al q3654898.c
q3654898.c: In function `main':
q3654898.c:8: warning: return type of 'main' is not `int'
GAS LISTING C:\DOCUME~1\Ross\LOCALS~1\Temp/ccLg8pWC.s page 1
1 .file "q3654898.c"
2 .text
3 .globl _function
4 .def _function; .scl 2; .type
32; .endef
5 _function:
6 0000 55 pushl %ebp
7 0001 89E5 movl %esp, %ebp
8 0003 83EC28 subl $40, %esp
9 0006 C9 leave
10 0007 C3 ret
11 .def ___main; .scl 2; .type
32; .endef
12 .globl _main
13 .def _main; .scl 2; .type 32;
.endef
14 _main:
15 0008 55 pushl %ebp
16 0009 89E5 movl %esp, %ebp
17 000b 83EC18 subl $24, %esp
18 000e 83E4F0 andl $-16, %esp
19 0011 B8000000 movl $0, %eax
19 00
20 0016 83C00F addl $15, %eax
21 0019 83C00F addl $15, %eax
22 001c C1E804 shrl $4, %eax
23 001f C1E004 sall $4, %eax
24 0022 8945FC movl %eax, -4(%ebp)
25 0025 8B45FC movl -4(%ebp), %eax
26 0028 E8000000 call __alloca
26 00
27 002d E8000000 call ___main
27 00
28 0032 C7442408 movl $3, 8(%esp)
28 03000000
29 003a C7442404 movl $2, 4(%esp)
29 02000000
30 0042 C7042401 movl $1, (%esp)
30 000000
31 0049 E8B2FFFF call _function
31 FF
32 004e C9 leave
33 004f C3 ret
C:\Documents and Settings\Ross\My Documents\testing>
Here we see an output listing produced by the assembler. (Its name is GAS, because it is GNU's version of the classic *nix assembler as. There's humor there somewhere.)
Each line has most of the following fields: a line number, an address in the current section, bytes stored at that address, and the source text from the assembly source file.
The addresses are offsets into that portion of each section provided by this module. This particular module only has content in the .text section which stores executable code. You will typically find mention of sections named .data and .bss as well. Lots of other names are used and some have special purposes. Read the manual for the linker if you really want to know.
You could also try the -fno-stack-protector flag with gcc to disable the canary and compare the results.
I'd like to add that for simple stuff, GCC's assembly output is often easier to read if you turn on a little optimization. Here's the sample code again...
void function(int a, int b, int c) {
char buffer1[5];
char buffer2[10];
}
/* corrected signature of main() */
int main() {
function(1,2,3);
return 0;
}
this is what I get without optimization (OSX 10.6, gcc 4.2.1+Apple patches)
.globl _function
_function:
pushl %ebp
movl %esp, %ebp
pushl %ebx
subl $36, %esp
call L4
"L00000000001$pb":
L4:
popl %ebx
leal L___stack_chk_guard$non_lazy_ptr-"L00000000001$pb"(%ebx), %eax
movl (%eax), %eax
movl (%eax), %edx
movl %edx, -12(%ebp)
xorl %edx, %edx
leal L___stack_chk_guard$non_lazy_ptr-"L00000000001$pb"(%ebx), %eax
movl (%eax), %eax
movl -12(%ebp), %edx
xorl (%eax), %edx
je L3
call ___stack_chk_fail
L3:
addl $36, %esp
popl %ebx
leave
ret
.globl _main
_main:
pushl %ebp
movl %esp, %ebp
subl $24, %esp
movl $3, 8(%esp)
movl $2, 4(%esp)
movl $1, (%esp)
call _function
movl $0, %eax
leave
ret
Whew, one heck of a mouthful! But look what happens with -O on the command line...
.text
.globl _function
_function:
pushl %ebp
movl %esp, %ebp
leave
ret
.globl _main
_main:
pushl %ebp
movl %esp, %ebp
movl $0, %eax
leave
ret
Of course, you do run the risk of your code being rendered completely unrecognizable, especially at higher optimization levels and with more complicated stuff. Even here, we see that the call to function has been discarded as pointless. But I find that not having to read through dozens of unnecessary stack spills is generally more than worth a little extra scratching my head over the control flow.