Assembly - js versus ja instruction - c

My goal is to write out the C code that corresponds to this assembly:
0: 85 f6 test %esi,%esi
2: 78 13 js 17 <part3+0x17>
4: 83 fe 07 cmp $0x7,%esi
7: 77 14 ja 1d <part3+0x1d>
9: 8d 0c f5 00 00 00 00 lea 0x0(,%rsi,8),%ecx
10: 48 d3 ff sar %cl,%rdi
13: 48 89 f8 mov %rdi,%rax
16: c3 retq
17: b8 00 00 00 00 mov $0x0,%eax
1c: c3 retq
1d: b8 00 00 00 00 mov $0x0,%eax
22: c3 retq
I am a little confused because the first check, which tests the %esi register, ends (at 0x17) before the second check ends (at 0x1d). These are conditional branches, not loops.
Is the second if statement, which compares %esi to 7, nested inside the first one, or is this an if / else if situation?

Let me sum up what has already been said:
0: 85 f6 test %esi,%esi
2: 78 13 js 17 <part3+0x17>
this is " if (esi < 0) goto 17; "
4: 83 fe 07 cmp $0x7,%esi
7: 77 14 ja 1d <part3+0x1d>
this is "if (esi > 7) goto 1d;" (ja is an unsigned comparison, but negative values were already caught by the js above)
9: 8d 0c f5 00 00 00 00 lea 0x0(,%rsi,8),%ecx
"ecx = 8*rsi" // not that obvious, but it's "just" a multiplication
10: 48 d3 ff sar %cl,%rdi
rdi >>= cl; // cl, not ecx, but ecx is guaranteed to be <= 7*8 here, so that's the same
13: 48 89 f8 mov %rdi,%rax
16: c3 retq
return rdi;
17: b8 00 00 00 00 mov $0x0,%eax
1c: c3 retq
17: "return 0"
1d: b8 00 00 00 00 mov $0x0,%eax
22: c3 retq
1d: another "return 0"
so the C-Code is:
{
if (esi < 0) return 0;
if (esi > 7) return 0;
return rdi >> ( 8 * rsi );
}
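PS: the two separate "return 0" blocks (17 and 1d) strongly suggest that, in the C code, the two ifs were NOT combined into one.
PPS: the C code was probably compiled with only light optimization (the arguments stay in registers, but the duplicate return blocks were not merged) :P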
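The reconstruction above can be written as a compilable function. This is only a minimal sketch: the name part3 comes from the labels in the disassembly, while the parameter names and types (a signed 64-bit value in %rdi, a 32-bit int in %esi) are assumptions inferred from the register widths and from the arithmetic shift sar.

```c
/* Assumed signature: x arrives in %rdi, n in %esi (System V x86-64). */
long part3(long x, int n)
{
    if (n < 0)              /* test %esi,%esi ; js : taken when the sign flag is set */
        return 0;
    if (n > 7)              /* cmp $0x7,%esi ; ja : unsigned "above" */
        return 0;
    /* lea computes 8*n into %ecx; sar shifts right arithmetically.
       Right-shifting a negative long is implementation-defined in C;
       gcc on x86-64 uses an arithmetic shift, matching sar. */
    return x >> (8 * n);
}
```

Since each if returns on its own, this is neither a nested if nor a loop; it behaves like an if / else if whose two branches happen to return the same value. For example, part3(0x1234, 1) yields 0x12.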

Related

Creating a print function in C 32-bit protected mode

I've been trying to develop a small OS and managed to switch into protected mode so that I can write C code instead of assembly. Since this means I can't use interrupt 10h anymore, I have to write characters directly to the video memory address. So I tried creating a new print function to print whole strings instead of printing each char separately. That's where the problems came in: printing single chars with the printChar function works, but the new print function doesn't, no matter what I try.
Here's my C Code:
void print(char* message, int offset);
void printChar(char character, int offset);
void start() {
printChar('M', 2);
print("Test String", 4);
while (1) {
}
}
void print(char* msg, int offset) {
for (int i = 0; msg[i] != '\0'; i++)
{
printChar(msg[i], (i * 2) + offset);
}
}
void printChar(char character, int offset) {
unsigned char* vidmem = (unsigned char*)0xB8000;
*(vidmem + offset + 1) = character;
*(vidmem + offset + 2) = 0x0f;
}
I then use these commands to convert my code to binary and put it onto the second sector of a floppy disk with sectedit.
gcc -c test.c
objcopy -O binary -j .text test.o test.bin
Here is also the assembly code that objdump -d test.o shows:
0000000000000000 <start>:
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: 48 83 ec 20 sub $0x20,%rsp
8: ba 02 00 00 00 mov $0x2,%edx
d: b9 4d 00 00 00 mov $0x4d,%ecx
12: e8 73 00 00 00 call 8a <printChar>
17: ba 04 00 00 00 mov $0x4,%edx
1c: 48 8d 05 00 00 00 00 lea 0x0(%rip),%rax # 23 <start+0x23>
23: 48 89 c1 mov %rax,%rcx
26: e8 02 00 00 00 call 2d <print>
2b: eb fe jmp 2b <start+0x2b>
000000000000002d <print>:
2d: 55 push %rbp
2e: 48 89 e5 mov %rsp,%rbp
31: 48 83 ec 30 sub $0x30,%rsp
35: 48 89 4d 10 mov %rcx,0x10(%rbp)
39: 89 55 18 mov %edx,0x18(%rbp)
3c: c7 45 fc 00 00 00 00 movl $0x0,-0x4(%rbp)
43: eb 29 jmp 6e <print+0x41>
45: 8b 45 fc mov -0x4(%rbp),%eax
48: 8d 14 00 lea (%rax,%rax,1),%edx
4b: 8b 45 18 mov 0x18(%rbp),%eax
4e: 01 c2 add %eax,%edx
50: 8b 45 fc mov -0x4(%rbp),%eax
53: 48 63 c8 movslq %eax,%rcx
56: 48 8b 45 10 mov 0x10(%rbp),%rax
5a: 48 01 c8 add %rcx,%rax
5d: 0f b6 00 movzbl (%rax),%eax
60: 0f be c0 movsbl %al,%eax
63: 89 c1 mov %eax,%ecx
65: e8 20 00 00 00 call 8a <printChar>
6a: 83 45 fc 01 addl $0x1,-0x4(%rbp)
6e: 8b 45 fc mov -0x4(%rbp),%eax
71: 48 63 d0 movslq %eax,%rdx
74: 48 8b 45 10 mov 0x10(%rbp),%rax
78: 48 01 d0 add %rdx,%rax
7b: 0f b6 00 movzbl (%rax),%eax
7e: 84 c0 test %al,%al
80: 75 c3 jne 45 <print+0x18>
82: 90 nop
83: 90 nop
84: 48 83 c4 30 add $0x30,%rsp
88: 5d pop %rbp
89: c3 ret
000000000000008a <printChar>:
8a: 55 push %rbp
8b: 48 89 e5 mov %rsp,%rbp
8e: 48 83 ec 10 sub $0x10,%rsp
92: 89 c8 mov %ecx,%eax
94: 89 55 18 mov %edx,0x18(%rbp)
97: 88 45 10 mov %al,0x10(%rbp)
9a: 48 c7 45 f8 00 80 0b movq $0xb8000,-0x8(%rbp)
a1: 00
a2: 8b 45 18 mov 0x18(%rbp),%eax
a5: 48 98 cltq
a7: 48 8d 50 01 lea 0x1(%rax),%rdx
ab: 48 8b 45 f8 mov -0x8(%rbp),%rax
af: 48 01 c2 add %rax,%rdx
b2: 0f b6 45 10 movzbl 0x10(%rbp),%eax
b6: 88 02 mov %al,(%rdx)
b8: 8b 45 18 mov 0x18(%rbp),%eax
bb: 48 98 cltq
bd: 48 8d 50 02 lea 0x2(%rax),%rdx
c1: 48 8b 45 f8 mov -0x8(%rbp),%rax
c5: 48 01 d0 add %rdx,%rax
c8: c6 00 0f movb $0xf,(%rax)
cb: 90 nop
cc: 48 83 c4 10 add $0x10,%rsp
d0: 5d pop %rbp
d1: c3 ret
d2: 90 nop
d3: 90 nop
d4: 90 nop
d5: 90 nop
d6: 90 nop
d7: 90 nop
d8: 90 nop
d9: 90 nop
da: 90 nop
db: 90 nop
dc: 90 nop
dd: 90 nop
de: 90 nop
df: 90 nop
Edit: The problem basically lay in me not doing this on a Linux distribution, and in not having properly set up everything that would be needed to do it on Windows. Huge thanks to MichaelPetch, who explained the problems to me. I've now switched to a Linux VM and, after slightly correcting the code, it works. (As the comments pointed out, my offset was weird: I used it because it worked in the broken setup I had, but normally it shouldn't.)
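For illustration, here is a minimal sketch of what the corrected routines might look like, assuming the standard VGA text-mode layout in which each cell is a character byte followed by an attribute byte. The function names and the extra vidmem parameter are hypothetical (passing the base in lets the code be exercised against an ordinary array); in the kernel the base would be (volatile unsigned char *)0xB8000. Note also that the objdump listing above uses 64-bit registers such as %rbp, so that test.o was not 32-bit code; building with something like gcc -m32 -ffreestanding -c test.c is an assumption about the fixed setup, not something stated in the post.

```c
/* Write one character cell: byte 0 is the character, byte 1 the attribute. */
void print_char_at(volatile unsigned char *vidmem, char character, int offset)
{
    vidmem[offset] = (unsigned char)character;
    vidmem[offset + 1] = 0x0f;          /* bright white on black */
}

/* Print a NUL-terminated string; each character occupies two bytes. */
void print_at(volatile unsigned char *vidmem, const char *msg, int offset)
{
    for (int i = 0; msg[i] != '\0'; i++)
        print_char_at(vidmem, msg[i], offset + i * 2);
}
```

With this layout, an even offset always lands on a character byte, so print_at(vidmem, "Test String", 4) starts at the third cell of the first row.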

Deciphering x86 assembly function

I am currently working on phase 2 of the binary bomb assignment. I'm having trouble deciphering exactly what a certain function does when called. I've been stuck on it for days.
The function is:
0000000000400f2a <func2a>:
400f2a: 85 ff test %edi,%edi
400f2c: 74 1d je 400f4b <func2a+0x21>
400f2e: b9 cd cc cc cc mov $0xcccccccd,%ecx
400f33: 89 f8 mov %edi,%eax
400f35: f7 e1 mul %ecx
400f37: c1 ea 03 shr $0x3,%edx
400f3a: 8d 04 92 lea (%rdx,%rdx,4),%eax
400f3d: 01 c0 add %eax,%eax
400f3f: 29 c7 sub %eax,%edi
400f41: 83 04 be 01 addl $0x1,(%rsi,%rdi,4)
400f45: 89 d7 mov %edx,%edi
400f47: 85 d2 test %edx,%edx
400f49: 75 e8 jne 400f33 <func2a+0x9>
400f4b: f3 c3 repz retq
It gets called in the larger function "phase_2":
0000000000400f4d <phase_2>:
400f4d: 53 push %rbx
400f4e: 48 83 ec 60 sub $0x60,%rsp
400f52: 48 c7 44 24 30 00 00 movq $0x0,0x30(%rsp)
400f59: 00 00
400f5b: 48 c7 44 24 38 00 00 movq $0x0,0x38(%rsp)
400f62: 00 00
400f64: 48 c7 44 24 40 00 00 movq $0x0,0x40(%rsp)
400f6b: 00 00
400f6d: 48 c7 44 24 48 00 00 movq $0x0,0x48(%rsp)
400f74: 00 00
400f76: 48 c7 44 24 50 00 00 movq $0x0,0x50(%rsp)
400f7d: 00 00
400f7f: 48 c7 04 24 00 00 00 movq $0x0,(%rsp)
400f86: 00
400f87: 48 c7 44 24 08 00 00 movq $0x0,0x8(%rsp)
400f8e: 00 00
400f90: 48 c7 44 24 10 00 00 movq $0x0,0x10(%rsp)
400f97: 00 00
400f99: 48 c7 44 24 18 00 00 movq $0x0,0x18(%rsp)
400fa0: 00 00
400fa2: 48 c7 44 24 20 00 00 movq $0x0,0x20(%rsp)
400fa9: 00 00
400fab: 48 8d 4c 24 58 lea 0x58(%rsp),%rcx
400fb0: 48 8d 54 24 5c lea 0x5c(%rsp),%rdx
400fb5: be 9e 26 40 00 mov $0x40269e,%esi
400fba: b8 00 00 00 00 mov $0x0,%eax
400fbf: e8 6c fc ff ff callq 400c30 <__isoc99_sscanf@plt>
400fc4: 83 f8 02 cmp $0x2,%eax
400fc7: 74 05 je 400fce <phase_2+0x81>
400fc9: e8 c1 06 00 00 callq 40168f <explode_bomb>
400fce: 83 7c 24 5c 64 cmpl $0x64,0x5c(%rsp)
400fd3: 76 07 jbe 400fdc <phase_2+0x8f>
400fd5: 83 7c 24 58 64 cmpl $0x64,0x58(%rsp)
400fda: 77 05 ja 400fe1 <phase_2+0x94>
400fdc: e8 ae 06 00 00 callq 40168f <explode_bomb>
400fe1: 48 8d 74 24 30 lea 0x30(%rsp),%rsi
400fe6: 8b 7c 24 5c mov 0x5c(%rsp),%edi
400fea: e8 3b ff ff ff callq 400f2a <func2a>
400fef: 48 89 e6 mov %rsp,%rsi
400ff2: 8b 7c 24 58 mov 0x58(%rsp),%edi
400ff6: e8 2f ff ff ff callq 400f2a <func2a>
400ffb: bb 00 00 00 00 mov $0x0,%ebx
401000: 8b 04 1c mov (%rsp,%rbx,1),%eax
401003: 39 44 1c 30 cmp %eax,0x30(%rsp,%rbx,1)
401007: 74 05 je 40100e <phase_2+0xc1>
401009: e8 81 06 00 00 callq 40168f <explode_bomb>
40100e: 48 83 c3 04 add $0x4,%rbx
401012: 48 83 fb 28 cmp $0x28,%rbx
401016: 75 e8 jne 401000 <phase_2+0xb3>
401018: 48 83 c4 60 add $0x60,%rsp
40101c: 5b pop %rbx
40101d: c3 retq
I completely understand what phase_2 is doing, I just don't understand what func2a is doing and how it affects the values at 0x30(%rsp) and so on. Because of this I always get to the comparison statement at 0x401003, and the bomb eventually explodes there.
My problem is I don't understand how the input (phase solution) is affecting the values at 0x30(%rsp) via func2a.
400f2a: 85 ff test %edi,%edi
400f2c: 74 1d je 400f4b <func2a+0x21>
This is just an early exit for when edi is zero (je is the same as jz).
400f2e: b9 cd cc cc cc mov $0xcccccccd,%ecx
400f33: 89 f8 mov %edi,%eax
400f35: f7 e1 mul %ecx
400f37: c1 ea 03 shr $0x3,%edx
This is a classic optimization trick: it is the integer-arithmetic equivalent of dividing by multiplying by the inverse. mul leaves the upper 32 bits of edi * 0xcccccccd in edx, and the shift by 3 completes a total right shift of 35 bits; in practice, it's the same as saying edx = edi / 10;
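The same computation can be sketched in C to check that it really is a division by 10. The function name div10 is made up for this illustration; the constant and the shift are taken from the disassembly.

```c
#include <stdint.h>

/* Divide by 10 the way the compiler did: multiply by 0xCCCCCCCD,
   which is ceil(2^35 / 10), and keep the bits above bit 35.
   mul leaves the top 32 bits of the product in %edx, and shr $3
   supplies the remaining 3 bits of the 35-bit shift. */
uint32_t div10(uint32_t x)
{
    uint64_t product = (uint64_t)x * 0xCCCCCCCDu;
    return (uint32_t)(product >> 35);
}
```

For instance, div10(12345) gives 1234, the same as 12345 / 10.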
400f3a: 8d 04 92 lea (%rdx,%rdx,4),%eax
400f3d: 01 c0 add %eax,%eax
Here it is exploiting lea to perform arithmetic (and it's way clearer in Intel syntax, where it is lea eax,[rdx+rdx*4] => eax = edx*5), then sums the result with itself. It all boils down to eax = edx*10.
400f3f: 29 c7 sub %eax,%edi
Then subtract that from edi, which leaves edi - (edi / 10) * 10, i.e. edi % 10.
So, all in all this is a complicated (but fast) way to compute the last decimal digit of edi; what we have until now is something like:
void func2a(unsigned edi) {
if(edi==0) return;
label1:
edx=edi/10;
edi%=10;
// ...
}
(label1: is there because 400f33 is a jump target later)
Going on:
400f41: 83 04 be 01 addl $0x1,(%rsi,%rdi,4)
Again, this is way clearer to me in Intel syntax - add dword [rsi+rdi*4],byte +0x1. It is a regular increment into an array of 32-bit int (rdi is multiplied by 4); so, we can imagine that rsi points to an array of integers, indexed with the just-calculated last digit of edi.
void func2a(unsigned edi, int rsi[]) {
if(edi==0) return;
label1:
edx=edi/10;
edi%=10;
rsi[edi]++;
}
Then:
400f45: 89 d7 mov %edx,%edi
400f47: 85 d2 test %edx,%edx
400f49: 75 e8 jne 400f33 <func2a+0x9>
Move the result of the division we calculated above to edi, and loop if it's different from zero.
400f4b: f3 c3 repz retq
Return (using an unusual encoding of the instruction that is optimal for certain AMD processors).
So, by rewriting the jumps with a while loop and giving some meaningful names...
// number is edi, digits_count is rsi, as per regular
// x64 SystemV calling convention
void count_digits(unsigned number, int digits_count[]) {
while(number) {
digits_count[number%10]++;
number/=10;
}
}
I.e., this is a function that, given an integer, counts the occurrences of the single decimal digits, by incrementing the corresponding buckets in the digits_count array.
Fun fact: if we give the C code above to gcc (almost any recent version at -O1) we obtain back exactly the assembly you provided.
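To connect this back to phase_2: it calls the function once per input number, filling two 10-entry count arrays (one at 0x30(%rsp), one at (%rsp)), and then compares them entry by entry. A rough sketch of that check follows, assuming the two inputs are read as unsigned integers; digits_match is a hypothetical name, not something from the bomb.

```c
#include <string.h>

void count_digits(unsigned number, int digits_count[])
{
    while (number) {
        digits_count[number % 10]++;
        number /= 10;
    }
}

/* Hypothetical helper mirroring phase_2: returns 1 when the bomb
   would not explode, 0 otherwise. */
int digits_match(unsigned first, unsigned second)
{
    int counts_first[10] = {0};
    int counts_second[10] = {0};

    /* cmpl $0x64 with jbe / ja: both numbers must be above 100. */
    if (first <= 100 || second <= 100)
        return 0;
    count_digits(first, counts_first);
    count_digits(second, counts_second);
    /* The loop at 401000 compares the ten 4-byte counters one by one. */
    return memcmp(counts_first, counts_second, sizeof counts_first) == 0;
}
```

Under these assumptions, two numbers pass when both exceed 100 and contain the same decimal digits the same number of times, e.g. digits_match(123, 321) is 1 while digits_match(123, 124) is 0.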

C: float changing to 'nan' value

I have an inner function in a larger program that is somehow changing a float value to "nan" when I expect it to be zero. I have trimmed the function down to the simplest form, with no parameters:
static void func(void)
{
int a = 1;
float x = 0.0f;
float v = 0.0f;
printf("x(%f), ", x);
x += (float)a * v;
printf("x(%f), ", x);
printf("(int)x: %d, ", (int)x);
}
This gives the output:
x(0.000000), x(nan), (int)x: -2147483648
If I remove the variable a and hardcode the value (1), the nan value goes away. Similarly, if I remove the line x += (float)a * v;, everything prints as expected (all zeroes).
The frustrating part is that I can't reproduce this by just creating a new program and tossing this in main(). When I try that, the program works perfectly and outputs:
x(0.000000), x(0.000000), (int)x: 0
Disassembly from the function in the actual program:
00000029 <_func>:
29: 55 push %ebp
2a: 89 e5 mov %esp,%ebp
2c: 83 ec 38 sub $0x38,%esp
2f: c7 45 f4 01 00 00 00 movl $0x1,-0xc(%ebp)
36: a1 18 00 00 00 mov 0x18,%eax
3b: 89 45 f0 mov %eax,-0x10(%ebp)
3e: a1 18 00 00 00 mov 0x18,%eax
43: 89 45 ec mov %eax,-0x14(%ebp)
46: d9 45 f0 flds -0x10(%ebp)
49: dd 5c 24 04 fstpl 0x4(%esp)
4d: c7 04 24 00 00 00 00 movl $0x0,(%esp)
54: e8 a7 ff ff ff call 0 <_printf>
59: d9 45 f0 flds -0x10(%ebp)
5c: db 45 f4 fildl -0xc(%ebp)
5f: d9 5d e4 fstps -0x1c(%ebp)
62: d9 45 e4 flds -0x1c(%ebp)
65: d9 45 ec flds -0x14(%ebp)
68: de c9 fmulp %st,%st(1)
6a: de c1 faddp %st,%st(1)
6c: d9 5d f0 fstps -0x10(%ebp)
6f: d9 45 f0 flds -0x10(%ebp)
72: dd 5c 24 04 fstpl 0x4(%esp)
76: c7 04 24 00 00 00 00 movl $0x0,(%esp)
7d: e8 7e ff ff ff call 0 <_printf>
82: d9 45 f0 flds -0x10(%ebp)
85: d9 7d e2 fnstcw -0x1e(%ebp)
88: 0f b7 45 e2 movzwl -0x1e(%ebp),%eax
8c: b4 0c mov $0xc,%ah
8e: 66 89 45 e0 mov %ax,-0x20(%ebp)
92: d9 6d e0 fldcw -0x20(%ebp)
95: db 5d dc fistpl -0x24(%ebp)
98: d9 6d e2 fldcw -0x1e(%ebp)
9b: 8b 45 dc mov -0x24(%ebp),%eax
9e: 89 44 24 04 mov %eax,0x4(%esp)
a2: c7 04 24 08 00 00 00 movl $0x8,(%esp)
a9: e8 52 ff ff ff call 0 <_printf>
ae: c9 leave
af: c3 ret
Disassembly from stand-alone function (as main()):
00000000 <_main>:
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: 83 e4 f0 and $0xfffffff0,%esp
6: 83 ec 30 sub $0x30,%esp
9: e8 00 00 00 00 call e <_main+0xe>
e: c7 44 24 2c 01 00 00 movl $0x1,0x2c(%esp)
15: 00
16: a1 18 00 00 00 mov 0x18,%eax
1b: 89 44 24 28 mov %eax,0x28(%esp)
1f: a1 18 00 00 00 mov 0x18,%eax
24: 89 44 24 24 mov %eax,0x24(%esp)
28: d9 44 24 28 flds 0x28(%esp)
2c: dd 5c 24 04 fstpl 0x4(%esp)
30: c7 04 24 00 00 00 00 movl $0x0,(%esp)
37: e8 00 00 00 00 call 3c <_main+0x3c>
3c: db 44 24 2c fildl 0x2c(%esp)
40: d8 4c 24 24 fmuls 0x24(%esp)
44: d9 44 24 28 flds 0x28(%esp)
48: de c1 faddp %st,%st(1)
4a: d9 5c 24 28 fstps 0x28(%esp)
4e: d9 44 24 28 flds 0x28(%esp)
52: dd 5c 24 04 fstpl 0x4(%esp)
56: c7 04 24 00 00 00 00 movl $0x0,(%esp)
5d: e8 00 00 00 00 call 62 <_main+0x62>
62: d9 44 24 28 flds 0x28(%esp)
66: d9 7c 24 1e fnstcw 0x1e(%esp)
6a: 0f b7 44 24 1e movzwl 0x1e(%esp),%eax
6f: b4 0c mov $0xc,%ah
71: 66 89 44 24 1c mov %ax,0x1c(%esp)
76: d9 6c 24 1c fldcw 0x1c(%esp)
7a: db 5c 24 18 fistpl 0x18(%esp)
7e: d9 6c 24 1e fldcw 0x1e(%esp)
82: 8b 44 24 18 mov 0x18(%esp),%eax
86: 89 44 24 04 mov %eax,0x4(%esp)
8a: c7 04 24 08 00 00 00 movl $0x8,(%esp)
91: e8 00 00 00 00 call 96 <_main+0x96>
96: b8 00 00 00 00 mov $0x0,%eax
9b: c9 leave
9c: c3 ret
9d: 90 nop
9e: 90 nop
9f: 90 nop
This issue is often the result of undefined behavior. In this specific instance, there was an implicit function declaration (a header file hadn't been included elsewhere in the program) which caused UB, and resulted in this bug.

Memory Size Load and Store penalty analysis?

Profiling the code with ocount shows more cycles when PENALTY_ON is defined and fewer when it is not. I'm trying to understand why the penalty is higher with PENALTY_ON.
uint16_t arr[1010];
uint32_t r[500];
void func()
{
uint32_t i = 0;
for (i = 0; i < 1000; i+=2)
{
arr[i] = i;
arr[i+1] = i+10;
#ifdef PENALTY_ON
r[i/2] = *(uint32_t *)((uint16_t *)&arr[i+1]);
#endif
}
#ifndef PENALTY_ON
for (i = 0; i < 1000; i+=2)
{
r[i/2] = *(uint32_t *)((uint16_t *)&arr[i+1]);
}
#endif
}
Compiling both with gcc on a 32-bit machine with -O3
With PENALTY_ON
00000000 <func>:
0: 31 c0 xor %eax,%eax
2: 8d b6 00 00 00 00 lea 0x0(%esi),%esi
8: 8d 50 0a lea 0xa(%eax),%edx
b: 66 89 94 00 02 00 00 mov %dx,0x2(%eax,%eax,1)
12: 00
13: 8b 8c 00 02 00 00 00 mov 0x2(%eax,%eax,1),%ecx
1a: 89 c2 mov %eax,%edx
1c: 66 89 84 00 00 00 00 mov %ax,0x0(%eax,%eax,1)
23: 00
24: 83 c0 02 add $0x2,%eax
27: d1 ea shr %edx
29: 3d e8 03 00 00 cmp $0x3e8,%eax
2e: 89 0c 95 00 00 00 00 mov %ecx,0x0(,%edx,4)
35: 75 d1 jne 8 <func+0x8>
37: f3 c3 repz ret
Without PENALTY_ON
00000000 <func>:
0: 31 c0 xor %eax,%eax
2: 8d b6 00 00 00 00 lea 0x0(%esi),%esi
8: 8d 50 0a lea 0xa(%eax),%edx
b: 66 89 84 00 00 00 00 mov %ax,0x0(%eax,%eax,1)
12: 00
13: 66 89 94 00 02 00 00 mov %dx,0x2(%eax,%eax,1)
1a: 00
1b: 83 c0 02 add $0x2,%eax
1e: 3d e8 03 00 00 cmp $0x3e8,%eax
23: 75 e3 jne 8 <func+0x8>
25: 66 31 c0 xor %ax,%ax
28: 8b 8c 00 02 00 00 00 mov 0x2(%eax,%eax,1),%ecx
2f: 89 c2 mov %eax,%edx
31: 83 c0 02 add $0x2,%eax
34: d1 ea shr %edx
36: 3d e8 03 00 00 cmp $0x3e8,%eax
3b: 89 0c 95 00 00 00 00 mov %ecx,0x0(,%edx,4)
42: 75 e4 jne 28 <func+0x28>
44: f3 c3 repz ret
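I think the reason is a read-after-write stall with PENALTY_ON: the 32-bit load at 0x13 immediately follows the 16-bit store at 0xb to the same address, and since the load is wider than the store it presumably cannot be satisfied by store forwarding: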
b: 66 89 94 00 02 00 00 mov %dx,0x2(%eax,%eax,1)
12: 00
13: 8b 8c 00 02 00 00 00 mov 0x2(%eax,%eax,1),%ecx
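The two access patterns can be sketched side by side as follows. This is an illustration only: the function names are made up, the arrays are passed as parameters, and memcpy stands in for the original pointer cast so that the 32-bit read is well defined regardless of alignment and aliasing rules.

```c
#include <stdint.h>
#include <string.h>

enum { COUNT = 1000 };

/* Read 32 bits starting at a 16-bit element. */
static uint32_t load32(const uint16_t *p)
{
    uint32_t value;
    memcpy(&value, p, sizeof value);
    return value;
}

/* PENALTY_ON shape: each 32-bit load directly follows the
   16-bit store that it overlaps. */
void fill_interleaved(uint16_t *arr, uint32_t *r)
{
    for (uint32_t i = 0; i < COUNT; i += 2) {
        arr[i] = (uint16_t)i;
        arr[i + 1] = (uint16_t)(i + 10);
        r[i / 2] = load32(&arr[i + 1]);
    }
}

/* Shape without PENALTY_ON: all stores finish before the loads start. */
void fill_split(uint16_t *arr, uint32_t *r)
{
    for (uint32_t i = 0; i < COUNT; i += 2) {
        arr[i] = (uint16_t)i;
        arr[i + 1] = (uint16_t)(i + 10);
    }
    for (uint32_t i = 0; i < COUNT; i += 2)
        r[i / 2] = load32(&arr[i + 1]);
}
```

One thing worth noticing is that the two shapes do not produce the same r: in the interleaved loop, half of each load comes from arr[i + 2] before that element has been written, so the two builds are not measuring identical work.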

Disassembly -- buffer overflow attack (homework)

I am working on a buffer overflow attack program for a class assignment. I have provided the C code, as well as the disassembled code, and one of my jobs is to annotate the disassembly code. I don't need anyone to annotate the whole thing, but am I on the right track with my comments? If not, maybe annotate a couple of lines to get me on the right track. Thanks!
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
/* Like gets, except that characters are typed as pairs of hex digits.
Nondigit characters are ignored. Stops when encounters newline */
char *getxs(char *dest)
{
int c;
int even = 1; /* Have read even number of digits */
int otherd = 0; /* Other hex digit of pair */
char* sp = dest;
while ((c = getchar()) != EOF && c != '\n')
{
if (isxdigit(c))
{
int val;
if ('0' <= c && c <= '9')
val = c - '0';
else if ('A' <= c && c <= 'F')
val = c - 'A' + 10;
else
val = c - 'a' + 10;
if (even)
{
otherd = val;
even = 0;
}
else
{
*sp++ = otherd * 16 + val;
even = 1;
}
}
}
*sp++ = '\0';
return dest;
}
int getbuf()
{
char buf[12];
getxs(buf);
return 1;
}
void test()
{
int val;
printf("Type hex string: ");
val = getbuf();
printf("getbuf returned 0x%x\n", val);
}
int main()
{
int buf[16];
/* This little hack is an attempt to get the stack to be in a
stable position
*/
int offset = (((int)buf) & 0xFFF);
int* space = (int*) alloca(offset);
*space = 0; /* So that we don't get complaint of unused variable */
test();
return 0;
}
The annotated disassembly is:
buffer.o: file format elf32-i386
disassembly of section .text:
0000000 <getxs>:
0: 55 push %ebp // pushes stack pointer to top
1: 89 e5 mov %esp,%ebp // stack pointer = c
3: 83 ec 28 sub $0x28,%esp // allocates space for c
6: c7 45 e8 01 00 00 00 movl $0x1,-0x18(%ebp) // even = 1
d: c7 45 ec 00 00 00 00 movl $0x0,-0x14(%ebp) // otherd = 0
14: 8b 45 08 mov 0x8(%ebp),%eax // sp = dest
17: 89 45 f0 mov %eax,-0x10(%ebp) // conditional setup
1a: e9 89 00 00 00 jmp a8 <getxs+0xa8>
1f: e8 fc ff ff ff call 20 <getxs+0x20>
24: 8b 00 mov (%eax),%eax
26: 8b 55 e4 mov -0x1c(%ebp),%edx
29: 01 d2 add %edx,%edx
2b: 01 d0 add %edx,%eax
2d: 0f b7 00 movzwl (%eax),%eax
30: 0f b7 c0 movzwl %ax,%eax
33: 25 00 10 00 00 and $0x1000,%eax
38: 85 c0 test %eax,%eax
3a: 74 6c je a8 <getxs+0xa8>
3c: 83 7d e4 2f cmpl $0x2f,-0x1c(%ebp)
40: 7e 11 jle 53 <getxs+0x53>
42: 83 7d e4 39 cmpl $0x39,-0x1c(%ebp)
46: 7f 0b jg 53 <getxs+0x53>
48: 8b 45 e4 mov -0x1c(%ebp),%eax
4b: 83 e8 30 sub $0x30,%eax
4e: 89 45 f4 mov %eax,-0xc(%ebp)
51: eb 20 jmp 73 <getxs+0x73>
53: 83 7d e4 40 cmpl $0x40,-0x1c(%ebp)
57: 7e 11 jle 6a <getxs+0x6a>
59: 83 7d e4 46 cmpl $0x46,-0x1c(%ebp)
5d: 7f 0b jg 6a <getxs+0x6a>
5f: 8b 45 e4 mov -0x1c(%ebp),%eax
62: 83 e8 37 sub $0x37,%eax
65: 89 45 f4 mov %eax,-0xc(%ebp)
68: eb 09 jmp 73 <getxs+0x73>
6a: 8b 45 e4 mov -0x1c(%ebp),%eax
6d: 83 e8 57 sub $0x57,%eax
70: 89 45 f4 mov %eax,-0xc(%ebp)
73: 83 7d e8 00 cmpl $0x0,-0x18(%ebp)
77: 74 0f je 88 <getxs+0x88>
79: 8b 45 f4 mov -0xc(%ebp),%eax
7c: 89 45 ec mov %eax,-0x14(%ebp)
7f: c7 45 e8 00 00 00 00 movl $0x0,-0x18(%ebp)
86: eb 20 jmp a8 <getxs+0xa8>
88: 8b 45 ec mov -0x14(%ebp),%eax
8b: 89 c2 mov %eax,%edx
8d: c1 e2 04 shl $0x4,%edx
90: 8b 45 f4 mov -0xc(%ebp),%eax
93: 8d 04 02 lea (%edx,%eax,1),%eax
96: 89 c2 mov %eax,%edx
98: 8b 45 f0 mov -0x10(%ebp),%eax
9b: 88 10 mov %dl,(%eax)
9d: 83 45 f0 01 addl $0x1,-0x10(%ebp)
a1: c7 45 e8 01 00 00 00 movl $0x1,-0x18(%ebp)
a8: e8 fc ff ff ff call a9 <getxs+0xa9>
ad: 89 45 e4 mov %eax,-0x1c(%ebp)
b0: 83 7d e4 ff cmpl $0xffffffff,-0x1c(%ebp)
b4: 74 0a je c0 <getxs+0xc0>
b6: 83 7d e4 0a cmpl $0xa,-0x1c(%ebp)
ba: 0f 85 5f ff ff ff jne 1f <getxs+0x1f>
c0: 8b 45 f0 mov -0x10(%ebp),%eax
c3: c6 00 00 movb $0x0,(%eax)
c6: 83 45 f0 01 addl $0x1,-0x10(%ebp)
ca: 8b 45 08 mov 0x8(%ebp),%eax
cd: c9 leave
ce: c3 ret
00000cf <getbuf>:
cf: 55 push %ebp // pushes stack pointer to the top
d0: 89 e5 mov %esp,%ebp // stack pointer = buf[12]
d2: 83 ec 28 sub $0x28,%esp // allocates space (40 bits)
d5: 8d 45 ec lea -0x14(%ebp),%eax // rv = stack pointer - 20
d8: 89 04 24 mov %eax,(%esp)
db: e8 fc ff ff ff call dc <getbuf+0xd>
e0: b8 01 00 00 00 mov $0x1,%eax // return 1 -- want to return ef be ad de
e5: c9 leave
e6: c3 ret
00000e7 <test>:
e7: 55 push %ebp
e8: 89 e5 mov %esp,%ebp
ea: 83 ec 28 sub $0x28,%esp
ed: b8 00 00 00 00 mov $0x0,%eax
f2: 89 04 24 mov %eax,(%esp)
f5: e8 fc ff ff ff call f6 <test+0xf>
fa: e8 fc ff ff ff call fb <test+0x14>
ff: 89 45 f4 mov %eax,-0xc(%ebp)
102: b8 13 00 00 00 mov $0x13,%eax
107: 8b 55 f4 mov -0xc(%ebp),%edx
10a: 89 54 24 04 mov %edx,0x4(%esp)
10e: 89 04 24 mov %eax,(%esp)
111: e8 fc ff ff ff call 112 <test+0x2b>
116: c9 leave
117: c3 ret
0000118 <main>:
118: 8d 4c 24 04 lea 0x4(%esp),%ecx
11c: 83 e4 f0 and $0xfffffff0,%esp
11f: ff 71 fc pushl -0x4(%ecx)
122: 55 push %ebp
123: 89 e5 mov %esp,%ebp
125: 51 push %ecx
126: 83 ec 54 sub $0x54,%esp
129: 8d 45 b0 lea -0x50(%ebp),%eax
12c: 25 ff 0f 00 00 and $0xfff,%eax
131: 89 45 f0 mov %eax,-0x10(%ebp)
134: 8b 45 f0 mov -0x10(%ebp),%eax
137: 83 c0 0f add $0xf,%eax
13a: 83 c0 0f add $0xf,%eax
13d: c1 e8 04 shr $0x4,%eax
140: c1 e0 04 shl $0x4,%eax
143: 29 c4 sub %eax,%esp
145: 89 e0 mov %esp,%eax
147: 83 c0 0f add $0xf,%eax
14a: c1 e8 04 shr $0x4,%eax
14d: c1 e0 04 shl $0x4,%eax
150: 89 45 f4 mov %eax,-0xc(%ebp)
153: 8b 45 f4 mov -0xc(%ebp),%eax
156: c7 00 00 00 00 00 movl $0x0,(%eax)
15c: e8 fc ff ff ff call 15d <main+0x45>
161: b8 00 00 00 00 mov $0x0,%eax
166: 8b 4d fc mov -0x4(%ebp),%ecx
169: c9 leave
16a: 8d 61 fc lea -0x4(%ecx),%esp
16d: c3 ret
The annotations should describe the intent of the instruction or block of instructions. They shouldn't just parrot what the instruction does (and certainly not incorrectly).
In the first line:
0: 55 push %ebp // pushes stack pointer to top
We can see that the instruction pushes the base pointer onto the stack, but the annotation incorrectly states that we're pushing the stack pointer on the stack.
Rather, the sequence of instructions:
0: 55 push %ebp // pushes stack pointer to top
1: 89 e5 mov %esp,%ebp // stack pointer = c
3: 83 ec 28 sub $0x28,%esp // allocates space for c
is a standard function entry preamble that establishes the stack frame and allocates 0x28 bytes of local storage. It is useful to document the layout of the stack frame, including the location of the function arguments:
0x08(%ebp): dest
0x04(%ebp): return-address
0x00(%ebp): prev %ebp
-0x04(%ebp): ?
-0x08(%ebp): ?
-0x0c(%ebp): ?
-0x10(%ebp): sp
-0x14(%ebp): otherd
-0x18(%ebp): even
-0x1c(%ebp): ?
-0x20(%ebp): ?
-0x24(%ebp): ?
-0x28(%ebp): ?
In the following:
14: 8b 45 08 mov 0x8(%ebp),%eax // sp = dest
17: 89 45 f0 mov %eax,-0x10(%ebp) // conditional setup
%eax is not really sp; it holds dest temporarily while it is moved from the function argument at 0x8(%ebp) to the local variable sp at -0x10(%ebp). There is no "conditional setup".

Resources