Doubts in executable and relocatable object file - c

I have written a simple Hello World program.
#include <stdio.h>
int main() {
printf("Hello World");
return 0;
}
I wanted to understand how the relocatable object file and executable file look like.
The object file corresponding to the main function is
0000000000000000 <main>:
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: bf 00 00 00 00 mov $0x0,%edi
9: b8 00 00 00 00 mov $0x0,%eax
e: e8 00 00 00 00 callq 13 <main+0x13>
13: b8 00 00 00 00 mov $0x0,%eax
18: c9 leaveq
19: c3 retq
Here the function call for printf is callq 13. One thing i don't understand is why is it 13. That means call the function at adresss 13, right??. 13 has the next instruction, right?? Please explain me what does this mean??
The executable code corresponding to main is
00000000004004cc <main>:
4004cc: 55 push %rbp
4004cd: 48 89 e5 mov %rsp,%rbp
4004d0: bf dc 05 40 00 mov $0x4005dc,%edi
4004d5: b8 00 00 00 00 mov $0x0,%eax
4004da: e8 e1 fe ff ff callq 4003c0 <printf#plt>
4004df: b8 00 00 00 00 mov $0x0,%eax
4004e4: c9 leaveq
4004e5: c3 retq
Here it is callq 4003c0. But the binary instruction is e8 e1 fe ff ff. There is nothing that corresponds to 4003c0. What is that i am getting wrong?
Thanks.
Bala

In the first case, take a look at the instruction encoding - it's all zeroes where the function address would go. That's because the object hasn't been linked yet, so the addresses for external symbols haven't been hooked up yet. When you do the final link into the executable format, the system sticks another placeholder in there, and then the dynamic linker will finally add the correct address for printf() at runtime. Here's a quick example for a "Hello, world" program I wrote.
First, the disassembly of the object file:
00000000 <_main>:
0: 8d 4c 24 04 lea 0x4(%esp),%ecx
4: 83 e4 f0 and $0xfffffff0,%esp
7: ff 71 fc pushl -0x4(%ecx)
a: 55 push %ebp
b: 89 e5 mov %esp,%ebp
d: 51 push %ecx
e: 83 ec 04 sub $0x4,%esp
11: e8 00 00 00 00 call 16 <_main+0x16>
16: c7 04 24 00 00 00 00 movl $0x0,(%esp)
1d: e8 00 00 00 00 call 22 <_main+0x22>
22: b8 00 00 00 00 mov $0x0,%eax
27: 83 c4 04 add $0x4,%esp
2a: 59 pop %ecx
2b: 5d pop %ebp
2c: 8d 61 fc lea -0x4(%ecx),%esp
2f: c3 ret
Then the relocations:
main.o: file format pe-i386
RELOCATION RECORDS FOR [.text]:
OFFSET TYPE VALUE
00000012 DISP32 ___main
00000019 dir32 .rdata
0000001e DISP32 _puts
As you can see there's a relocation there for _puts, which is what the call to printf turned into. That relocation will get noticed at link time and fixed up. In the case of dynamic library linking, the relocations and fixups might not get fully resolved until the program is running, but you'll get the idea from this example, I hope.

The target of the call in the E8 instruction (call) is specified as relative offset from the current instruction pointer (IP) value.
In your first code sample the offset is obviously 0x00000000. It basically says
call +0
The actual address of printf is not known yet, so the compiler just put the 32-bit value 0x00000000 there as a placeholder.
Such incomplete call with zero offset will naturally be interpreted as the call to the current IP value. On your platform, the IP is pre-incremented, meaning that when some instruction is executed, the IP contains the address of the next instruction. I.e. when instruction at the address 0xE is executed the IP contains value 0x13. And the call +0 is naturally interpreted as the call to instruction 0x13. This is why you see that 0x13 in the disassembly of the incomplete code.
Once the code is complete, the placeholder 0x00000000 offset is replaced with the actual offset of printf function in the code. The offset can be positive (forward) or negative (backward). In your case the IP at the moment of the call is 0x4004DF, while the address of printf function is 0x4003C0. For this reason, the machine instruction will contain a 32-bit offset value equal to 0x4003C0 - 0x4004DF, which is negative value -287. So what you see in the code is actually
call -287
-287 is 0xFFFFFEE1 in binary. This is exactly what you see in your machine code. It is just that the tool you are using displayed it backwards.

Calls are relative in x86, IIRC if you have e8 , the call location is addr+5.
e1 fe ff ff a is little endian encoded relative jump. It really means fffffee1.
Now add this to the address of the call instruction + 5:
(0xfffffee1 + 0x4004da + 5) % 2**32 = 0x4003c0

Related

Switch methods calls inside an executable

I wonder how to switch the call to a function for another inside an executable (.exe in my case)
Here is the code I try to play with
#include <stdio.h>
void hello()
{
printf("Hello world!");
}
void investigate()
{
printf("Investigate all the things!");
}
main()
{
hello();
}
Once I compiled the above code (with gcc) and got an executable (.exe) out of it, I want to switch the "hello" call with "investigate".
--Edit--
My environment: Windows 10 (64bit), mingw with gcc/g++ 4.8.1
--Edit 2--
I'm fine with Linux answer (any Ubuntu or any OpenSuse and any architecture) too as for me it's very important to have a proof-of-concept.
Assuming the compiler is not omitting dead functions entirely, that it is not inlining the function and that the call won't go through the PLT, once you have compiled the executable, you can simply edit the call instruction.
Note that the two functions must be "compatible", where the notion of compatibility is fuzzy, it means "the new must satisfy at least the same assumptions the compiler made when calling the old one".
The ABI is of course one such assumption but it may not be the only one.
If your compiler omitted dead function, you can't switch the function (one is missing).
If your compiler inlined the call, you can't switch the function (there is no call). You can work against the compiler and rewrite the code at the call-site (in the C source), this is called patching.
If your compiler used the PLT, you need to change the GOT entry used by the PLT stub. You may need to document your self a bit but changing the linked procedure is actually a feature of PLT machinery.
If your compiler did nothing of that, this should be case for such a simple source when no optimisations are enabled, you can use objdump -d <file> to find the call-site and the address of the new function:
000000000040051d <hello>:
40051d: 55 push %rbp
40051e: 48 89 e5 mov %rsp,%rbp
400521: bf f0 05 40 00 mov $0x4005f0,%edi
400526: b8 00 00 00 00 mov $0x0,%eax
40052b: e8 d0 fe ff ff callq 400400 <printf#plt>
400530: 5d pop %rbp
400531: c3 retq
0000000000400532 <investigate>:
400532: 55 push %rbp
400533: 48 89 e5 mov %rsp,%rbp
400536: bf fd 05 40 00 mov $0x4005fd,%edi
40053b: b8 00 00 00 00 mov $0x0,%eax
400540: e8 bb fe ff ff callq 400400 <printf#plt>
400545: 5d pop %rbp
400546: c3 retq
0000000000400547 <main>:
400547: 55 push %rbp
400548: 48 89 e5 mov %rsp,%rbp
40054b: b8 00 00 00 00 mov $0x0,%eax
400550: e8 c8 ff ff ff callq 40051d <hello>
400555: b8 00 00 00 00 mov $0x0,%eax
40055a: 5d pop %rbp
40055b: c3 retq
40055c: 0f 1f 40 00 nopl 0x0(%rax)
Then change the immediate value of the call instruction with the difference between the target address and the address after the end of the call instruction (it doesn't matter where the origin is as long as it's the same for both addresses).
Target = 400532
After the end of call = 400555
Difference = 400532 - 400555 = -23 = 0xFFFFFFDD
Change from:
400550: e8 c8 ff ff ff
to:
400550: e8 dd ff ff ff
Note that immediates are little-endiands.
You can use an hexeditor to edit the code, to find the offset into the file you can either use an elf reader and do a bit of math your self or you can simply search for the bytes of the call instruction (check also the bytes around the call to be sure).
After the edit, the binary has been patched:
0000000000400532 <investigate>:
400532: 55 push %rbp
400533: 48 89 e5 mov %rsp,%rbp
400536: bf fd 05 40 00 mov $0x4005fd,%edi
40053b: b8 00 00 00 00 mov $0x0,%eax
400540: e8 bb fe ff ff callq 400400 <printf#plt>
400545: 5d pop %rbp
400546: c3 retq
0000000000400547 <main>:
400547: 55 push %rbp
400548: 48 89 e5 mov %rsp,%rbp
40054b: b8 00 00 00 00 mov $0x0,%eax
400550: e8 dd ff ff ff callq 400532 <investigate>
400555: b8 00 00 00 00 mov $0x0,%eax
40055a: 5d pop %rbp
40055b: c3 retq

System V AMD64 ABI floating point varargs order

I compiled a call to printf with different kinds of args.
Here's the code + generated asm:
int main(int argc, char const *argv[]){
// 0: 55 push rbp
// 1: 48 89 e5 mov rbp,rsp
// 4: 48 83 ec 20 sub rsp,0x20
// 8: 89 7d fc mov DWORD PTR [rbp-0x4],edi
// b: 48 89 75 f0 mov QWORD PTR [rbp-0x10],rsi
printf("%s %f %d %f\n", "aye u gonna get some", 133.7f, 420, 69.f);
// f: f2 0f 10 05 00 00 00 00 movsd xmm0,QWORD PTR [rip+0x0] # 17 <main+0x17> 13: R_X86_64_PC32 .rodata+0x2c 69
// 17: 48 8b 05 00 00 00 00 mov rax,QWORD PTR [rip+0x0] # 1e <main+0x1e> 1a: R_X86_64_PC32 .rodata+0x34 133.7
// 1e: 66 0f 28 c8 movapd xmm1,xmm0
// 22: ba a4 01 00 00 mov edx,0x1a4 (420)
// 27: 48 89 45 e8 mov QWORD PTR [rbp-0x18],rax
// 2b: f2 0f 10 45 e8 movsd xmm0,QWORD PTR [rbp-0x18]
// 30: 48 8d 35 00 00 00 00 lea rsi,[rip+0x0] # 37 <main+0x37> 33: R_X86_64_PC32 .rodata-0x4 "aye u wanna get some"
// 37: 48 8d 3d 00 00 00 00 lea rdi,[rip+0x0] # 3e <main+0x3e> 3a: R_X86_64_PC32 .rodata+0x18 "%s %f %d %f\n"
// 3e: b8 02 00 00 00 mov eax,0x2
// 43: e8 00 00 00 00 call 48 <main+0x48> 44: R_X86_64_PLT32 printf-0x4
return 0;
// 48: b8 00 00 00 00 mov eax,0x0
// 4d: c9 leave
// 4e: c3 ret
}
Most stuff here makes sense to me.
In fact everything here makes some level of sense to me.
"%s %f %d %f\n" -> rdi
"aye u gonna get some" -> rsi
133.7 -> xmm0
420 -> rdx
69 -> xmm1
2 -> rax (to indicate there are 2 floating point arguments)
Now what I don't understand is how printf (or any other varargs function) would figure out the position of these floating point arguments among the others.
It can't be compiler magic either since it's dynamically linked.
So the only thing I can think of is maybe it's just va_arg internals, and how when you provide a type, if it's floating point, it must get from the xmms (or stack) instead of otherwise.
Is that correct? If not, how does the other side know where to get 'em? Thanks in advance.
For printf the format string indicates the type of the remaining argments.
The implementation of va_arg knows the type as it is an argument of va_arg, and the correct register can be deduced from the types.

How does a compiled "Hello World" C program store the String using machine language?

so I've started learning about machine language today. I wrote a basic "Hello World" program in C which prints "Hello, world!" ten times using a for loop. I then used the Gnu Debugger to disassemble main and look at the code in machine language (my computer has a x86 processor and I've set gdb up to use intel syntax):
user#PC:~/Path/To/Code$ gdb -q ./a.out
Reading symbols from ./a.out...done.
(gdb) list
1 #include <stdio.h>
2
3 int main()
4 {
5 int i;
6 for(i = 0; i < 10; i++) {
7 printf("Hello, world!\n");
8 }
9 return 0;
10 }
(gdb) disassemble main
Dump of assembler code for function main:
0x0804841d <+0>: push ebp
0x0804841e <+1>: mov ebp,esp
0x08048420 <+3>: and esp,0xfffffff0
0x08048423 <+6>: sub esp,0x20
0x08048426 <+9>: mov DWORD PTR [esp+0x1c],0x0
0x0804842e <+17>: jmp 0x8048441 <main+36>
0x08048430 <+19>: mov DWORD PTR [esp],0x80484e0
0x08048437 <+26>: call 0x80482f0 <puts#plt>
0x0804843c <+31>: add DWORD PTR [esp+0x1c],0x1
0x08048441 <+36>: cmp DWORD PTR [esp+0x1c],0x9
0x08048446 <+41>: jle 0x8048430 <main+19>
0x08048448 <+43>: mov eax,0x0
0x0804844d <+48>: leave
0x0804844e <+49>: ret
End of assembler dump.
(gdb) x/s 0x80484e0
0x80484e0: "Hello, world!"
I understand most of the machine code and what each of the commands do. If I understood it correctly, the address "0x80484e0" is loaded into the esp register so that can use the memory at this address. I examined the address, and to no surprise it contained the desired string. My question now is - how did that string get there in the first place? I can't find a part in the program that sets the string up at this location.
I also don't understand something else: When I first start the program, the eip points to , where the variable i is initialized at [esp+0x1c]. However, the address that esp points to is changed later on in the program (to 0x80484e0), but [esp+0x1c] is still used for "i" after that change. Shouldn't the adress [esp+0x1c] change when the address esp points to changes?
I binary or program is made up of both machine code and data. In this case your string which you put in the source code, the compiler too that data which is just bytes, and because of how it was used was considered read only data, so depending on the compiler that might land in .rodata or .text or some other name the compiler might use. Gcc would probably call it .rodata. The program itself is in .text. The linker comes along and when it links things finds a place for .text, .data, .bss, .rodata, and any other items you may have and then connects the dots. In the case of your call to printf the linker knows where it put the string, the array of bytes, and it was told what its name was (some internal temporary name no doubt) and the printf call was told about that name to so the linker patches up the instruction to grab the address to the format string before calling printf.
Disassembly of section .text:
0000000000400430 <main>:
400430: 53 push %rbx
400431: bb 0a 00 00 00 mov $0xa,%ebx
400436: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
40043d: 00 00 00
400440: bf e4 05 40 00 mov $0x4005e4,%edi
400445: e8 b6 ff ff ff callq 400400 <puts#plt>
40044a: 83 eb 01 sub $0x1,%ebx
40044d: 75 f1 jne 400440 <main+0x10>
40044f: 31 c0 xor %eax,%eax
400451: 5b pop %rbx
400452: c3 retq
400453: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
40045a: 00 00 00
40045d: 0f 1f 00 nopl (%rax)
Disassembly of section .rodata:
00000000004005e0 <_IO_stdin_used>:
4005e0: 01 00 add %eax,(%rax)
4005e2: 02 00 add (%rax),%al
4005e4: 48 rex.W
4005e5: 65 6c gs insb (%dx),%es:(%rdi)
4005e7: 6c insb (%dx),%es:(%rdi)
4005e8: 6f outsl %ds:(%rsi),(%dx)
4005e9: 2c 20 sub $0x20,%al
4005eb: 77 6f ja 40065c <__GNU_EH_FRAME_HDR+0x68>
4005ed: 72 6c jb 40065b <__GNU_EH_FRAME_HDR+0x67>
4005ef: 64 21 00 and %eax,%fs:(%rax)
the compiler will have encoded this instruction but left the address as zeros probably or some fill
400440: bf e4 05 40 00 mov $0x4005e4,%edi
so that the linker could fill it in later. The gnu disassembler attempts to disassemble the .rodata (and .data, etc) blocks which doesnt make sense, so ignore the instructions it is trying to interpret your string which starts at address 0x4005e4.
Before linking a disassembly of the object shows the two sections .text and .rodata
Disassembly of section .text.startup:
0000000000000000 <main>:
0: 53 push %rbx
1: bb 0a 00 00 00 mov $0xa,%ebx
6: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
d: 00 00 00
10: bf 00 00 00 00 mov $0x0,%edi
15: e8 00 00 00 00 callq 1a <main+0x1a>
1a: 83 eb 01 sub $0x1,%ebx
1d: 75 f1 jne 10 <main+0x10>
1f: 31 c0 xor %eax,%eax
21: 5b pop %rbx
22: c3 retq
0000000000000000 <.rodata.str1.1>:
0: 48 rex.W
1: 65 6c gs insb (%dx),%es:(%rdi)
3: 6c insb (%dx),%es:(%rdi)
4: 6f outsl %ds:(%rsi),(%dx)
5: 2c 20 sub $0x20,%al
7: 77 6f ja 78 <main+0x78>
9: 72 6c jb 77 <main+0x77>
b: 64 21 00 and %eax,%fs:(%rax)
unlinked it has to just pad this address/offset for the linker to fill in later.
10: bf 00 00 00 00 mov $0x0,%edi
also note the object contains only the string in .rodata. linking with libraries and other items to make it a complete program clearly added more .rodata, but the linker manages all of that.
Perhaps easier to see with this example
void more_fun ( unsigned int, unsigned int, unsigned int );
unsigned int a;
unsigned int b=5;
const unsigned int c=7;
void fun ( void )
{
more_fun(a,b,c);
}
disassembled as a object
Disassembly of section .text:
0000000000000000 <fun>:
0: 8b 35 00 00 00 00 mov 0x0(%rip),%esi # 6 <fun+0x6>
6: 8b 3d 00 00 00 00 mov 0x0(%rip),%edi # c <fun+0xc>
c: ba 07 00 00 00 mov $0x7,%edx
11: e9 00 00 00 00 jmpq 16 <fun+0x16>
Disassembly of section .data:
0000000000000000 <b>:
0: 05 .byte 0x5
1: 00 00 add %al,(%rax)
...
Disassembly of section .rodata:
0000000000000000 <c>:
0: 07 (bad)
1: 00 00 add %al,(%rax)
...
and for whatever reason you have to link it to see the .bss section. The point of the example is the machine code for the function is in .text, the uninitialized global is in .bss, the initialized global is .data and the const initialized global is .rodata. The compiler was smart enough to know that a const even if it is global wont change so it can just hardcode that value into the math and not need to read from ram, but the other two variables it has to read from ram so generates an instruction with the address zeros to be filled in by the linker at link time.
In your case your read only/const data was a collection of bytes and it wasnt a math operation so the bytes as defined in your source file were placed in memory so they could be pointed at as the first parameter to printf.
There is more to a binary than just machine code. And the compiler and linker can have things placed in memory for the machine code to get, the machine code itself does not have to write every value that will be used by the rest of the machine code.
The compiler 'hard wires' the string into the object code and the linker then 'hard wires' it into the machine code.
Not that the string is embedded into the code, and not stored in a data area meaning that if you took a pointer to the string and attempted to change it you would get an exception.

Create a Data Hazard in a C Program

I am working on a problem where I am attempting to create different scenarios in different C programs such as
Data Hazard
Branch Evaluation
Procedure Call
This is in an attempt at learning pipelining and the different hazards that come up.
So I am writing simple C programs and disassembling to assembly language to see if a hazard gets created. But I cannot figure out how to create these hazards. Do yall have any idea how I could do this? Here is some of the simple code I have written.
I compile using.
gcc -g -c programName.c -o programName.o
gcc programName.o -o programName
objdump -d programName.o > programName.asm
Code:
#include <stdio.h>
int main()
{
int i = 0;
int size = 5;
int num[5] = {1,2,3,4,5};
int sum=0;
int average = 0;
for(i = 0; i < size; i++)
{
sum += num[i];
}
average=sum/size;
return 0;
}
...and here is the assembly for that.
average.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <main>:
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: c7 45 f0 00 00 00 00 movl $0x0,0xfffffffffffffff0(%rbp)
b: c7 45 f4 05 00 00 00 movl $0x5,0xfffffffffffffff4(%rbp)
12: c7 45 d0 01 00 00 00 movl $0x1,0xffffffffffffffd0(%rbp)
19: c7 45 d4 02 00 00 00 movl $0x2,0xffffffffffffffd4(%rbp)
20: c7 45 d8 03 00 00 00 movl $0x3,0xffffffffffffffd8(%rbp)
27: c7 45 dc 04 00 00 00 movl $0x4,0xffffffffffffffdc(%rbp)
2e: c7 45 e0 05 00 00 00 movl $0x5,0xffffffffffffffe0(%rbp)
35: c7 45 f8 00 00 00 00 movl $0x0,0xfffffffffffffff8(%rbp)
3c: c7 45 fc 00 00 00 00 movl $0x0,0xfffffffffffffffc(%rbp)
43: c7 45 f0 00 00 00 00 movl $0x0,0xfffffffffffffff0(%rbp)
4a: eb 10 jmp 5c <main+0x5c>
4c: 8b 45 f0 mov 0xfffffffffffffff0(%rbp),%eax
4f: 48 98 cltq
51: 8b 44 85 d0 mov 0xffffffffffffffd0(%rbp,%rax,4),%eax
55: 01 45 f8 add %eax,0xfffffffffffffff8(%rbp)
58: 83 45 f0 01 addl $0x1,0xfffffffffffffff0(%rbp)
5c: 8b 45 f0 mov 0xfffffffffffffff0(%rbp),%eax
5f: 3b 45 f4 cmp 0xfffffffffffffff4(%rbp),%eax
62: 7c e8 jl 4c <main+0x4c>
64: 8b 55 f8 mov 0xfffffffffffffff8(%rbp),%edx
67: 89 d0 mov %edx,%eax
69: c1 fa 1f sar $0x1f,%edx
6c: f7 7d f4 idivl 0xfffffffffffffff4(%rbp)
6f: 89 45 fc mov %eax,0xfffffffffffffffc(%rbp)
72: b8 00 00 00 00 mov $0x0,%eax
77: c9 leaveq
78: c3 retq
Would appreciate any insight or help. Thanks!
Since this is homework, I'm not going to give you a straight answer, but some food for thought to push you in the right direction.
x86 is a terrible ISA to be using to try and comprehend pipelining. A single x86 instruction can hide two or three side-effects, making it difficult to tease out how a given instruction would perform in even the simplest of pipelines. Are you sure you're not provided a RISC ISA to use for this problem?
Put your loop/hazard code into a function and preferably randomize the creation of the array. Make the array much longer. A good compiler will basically figure out the answer otherwise and remove most of the code you wrote! For reasons I don't understand it's putting your variables in memory.
A good compiler will also do things such as loop unrolling in attempt to hide data hazards and get better code scheduling. Learn how to defeat that (or if you can, give the flag the compiler telling it to NOT do those things if messing around with the compiler is allowed).
The keyword "volatile" can be very helpful in telling the compiler to not optimize around/away certain variables (it tells the compiler this value can change at any moment, so don't be clever and optimize code with it and also don't keep the variable inside the register file).
A data hazard means the pipeline will stall waiting on data. Normally instructions get bypassed just in time, so no stalling occurs. Think about which types of instructions may not be able to be bypassed and could cause a stall on a data hazard. This is dependent on the pipeline, so code that stalls for a specific processor may not stall for another. Modern out-of-order Intel processors are excellent at avoiding these stalls and compilers are great at re-scheduling code so they won't occur even on an in-order core.

analyzing i386 assembled function... line by line

hi i'm such newb in assemble and OS world. and yes this is my homework which i'm in stuck in deep dark of i386 manual. please help me or give me some hint.. here's code i have to analyze ine by line. this function is part of EOS(educational OS), doing about interrupt request in hal(hardware abstraction layer). i did "objdump -d interrupt.o" and got this assemble code. of course in i386.
00000000 <eos_ack_irq>:
0: 55 push %ebp ; push %ebp to stack to save stack before
1: b8 fe ff ff ff mov $0xfffffffe,%eax ; what is this??
6: 89 e5 mov %esp,%ebp ; couple with "push %ebp". known as prolog assembly function.
8: 8b 4d 08 mov 0x8(%ebp),%ecx ; set %ecx as value of (%ebp+8)...and what is this do??
b: 5d pop %ebp ; pop the top of stack to %ebp. i know this is for getting back to callee..
c: d3 c0 rol %cl,%eax ; ????? what is this for???
e: 21 05 00 00 00 00 and %eax,0x0 ; make %eax as 0. for what??
14: c3 ret ; return what register??
00000015 <eos_get_irq>:
15: 8b 15 00 00 00 00 mov 0x0,%edx
1b: b8 1f 00 00 00 mov $0x1f,%eax
20: 55 push %ebp
21: 89 e5 mov %esp,%ebp
23: 56 push %esi
24: 53 push %ebx
25: bb 01 00 00 00 mov $0x1,%ebx
2a: 89 de mov %ebx,%esi
2c: 88 c1 mov %al,%cl
2e: d3 e6 shl %cl,%esi
30: 85 d6 test %edx,%esi
32: 75 06 jne 3a <eos_get_irq+0x25>
34: 48 dec %eax
35: 83 f8 ff cmp $0xffffffff,%eax
38: 75 f0 jne 2a <eos_get_irq+0x15>
3a: 5b pop %ebx
3b: 5e pop %esi
3c: 5d pop %ebp
3d: c3 ret
0000003e <eos_disable_irq_line>:
3e: 55 push %ebp
3f: b8 01 00 00 00 mov $0x1,%eax
44: 89 e5 mov %esp,%ebp
46: 8b 4d 08 mov 0x8(%ebp),%ecx
49: 5d pop %ebp
4a: d3 e0 shl %cl,%eax
4c: 09 05 00 00 00 00 or %eax,0x0
52: c3 ret
00000053 <eos_enable_irq_line>:
53: 55 push %ebp
54: b8 fe ff ff ff mov $0xfffffffe,%eax
59: 89 e5 mov %esp,%ebp
5b: 8b 4d 08 mov 0x8(%ebp),%ecx
5e: 5d pop %ebp
5f: d3 c0 rol %cl,%eax
61: 21 05 00 00 00 00 and %eax,0x0
67: c3 ret
and here's pre-assembled C code
/* ack the specified irq */
void eos_ack_irq(int32u_t irq) {
/* clear the corresponding bit in _irq_pending register */
_irq_pending &= ~(0x1<<irq);
}
/* get the irq number */
int32s_t eos_get_irq() {
/* get the highest bit position in the _irq_pending register */
int i = 31;
for(; i>=0; i--) {
if (_irq_pending & (0x1<<i)) {
return i;
}
}
return -1;
}
/* mask an irq */
void eos_disable_irq_line(int32u_t irq) {
/* turn on the corresponding bit */
_irq_mask |= (0x1<<irq);
}
/* unmask an irq */
void eos_enable_irq_line(int32u_t irq) {
/* turn off the corresponding bit */
_irq_mask &= ~(0x1<<irq);
}
so these functions do ack and get and mask and unmask an interrupt request. and i'm stuck at the first one. so if you are mercy enough, would you please get me some hint or answer to analyze the first function? i'll try to get the others... and i'm very sorry for another homework.. (my TA doesn't look email)
21 05 00 00 00 00 (that and) is actually an and with a memory operand (namely and [0], eax) which the AT&T syntax obscures (but technically it does say that, note the absence of a $ sign). It makes more sense that way (the offset of 0 suggests you didn't link the code before disassembling).
mov $0xfffffffe, %eax is doing exactly what it looks like it's doing (note that 0xfffffffe is all ones except the lowest bit), and that means the function has been implemented like this:
_irq_pending &= rotate_left(0xFFFFFFFE, irq);
Saving a not operation. It has to be a rotate there instead of a shift in order to make the low bits 1 if necessary.

Resources