So lately I've been looking at the disassembly of my C++ code, and having to manually track what's in each register, like this:
95: 48 8b 16 mov (%rsi),%rdx ; %rdx = raggedCross.sink
98: 48 8b 42 38 mov 0x38(%rdx),%rax ; %rax = sink.table
9c: 8b 4a 10 mov 0x10(%rdx),%ecx ; %ecx = sink.baseCol
9f: 48 8b 70 50 mov 0x50(%rax),%rsi ; %rsi = table.starts
a3: 89 c8 mov %ecx,%eax ; %eax = baseCol
a5: 83 c1 1c add $0x1c,%ecx ; %ecx = baseCol + 0x1c
And so on. The comments are mine, added by hand, from looking up the offset of various fields (e.g. sink, table, baseCol, starts) in the C++ classes.
It's straightforward to do, but tedious and time-consuming: the perfect thing for a program to be doing. gdb seems to know the offset of various fields within a struct: I can do &((Table *)0x1200)->starts and it tells me the right address. So, this information is around.
Is there some disassembler that can use this info to annotate the code for me?
Failing that, I could write my own. Where does gdb get the offsets?
GDB uses the debugging information you included in the build to determine that sort of thing; it's not part of a normal executable. DWARF is one common format used to store debug information.
You can use the debugging information (DWARF2) in order to look at the object files. As you're using GCC, you can do an annotated dump using the binutils utility objdump -S. If you dump all sections, the DWARF information is dumped as well.
You could take a look at IDA Pro. It won't completely automate the process, but it'll at least let you define your structure/class in one place, and it'll handle most things from there.
Related
Note: this question already has similar answers here, which I want to point out:
"global main" in Assembly
What is global _start in assembly language?
However this question is asking more about the return formats of them and how they relate to each other (which I don't think is entirely covered in the above questions).
What are the differences between _start and main ? It seems to me like ld uses _start, but that gcc uses main as the entry point. The other difference that I've noticed is that main seems to return the value in %rax, whereas _start returns the value in %rbx
The following is an example of the two ways I'm seeing this:
.globl _start
_start:
mov $1, %rax
mov $2, %rbx
int $0x80
And to run it:
$ as script.s -o script.o; ld script.o -o script; ./script; echo $?
# 2
And the other way:
.globl main
main:
mov $3, %rax
ret
And to run it:
$ gcc script.s -o script; ./script; echo $?
3
What is the difference between these two methods? Does main automatically invoke _start somewhere, or how do they relate to each other? Why does one return their value in rbx whereas the other one returns it in rax ?
TL:DR: function return values and system-call arguments use separate registers because they're completely unrelated.
When you compile with gcc, it links CRT startup code that defines a _start. That _start (indirectly) calls main, and passes main's return value (which main leaves in EAX) to the exit() library function. (Which eventually makes an exit system call, after doing any necessary libc cleanup like flushing stdio buffers.)
See also Return vs Exit from main function in C - this is exactly analogous to what you're doing, except you're using _exit() which bypasses libc cleanup, instead of exit(). Syscall implementation of exit()
An int $0x80 system call takes its argument in EBX, as per the 32-bit system-call ABI (which you shouldn't be using in 64-bit code). It's not a return value from a function, it's the process exit status. See Hello, world in assembly language with Linux system calls? for more about system calls.
Note that _start is not a function; it can't return in that sense because there's no return address on the stack. You're taking a casual description like "return to the OS" and conflating that with a function's "return value". You can call exit from main if you want, but you can't ret from _start.
EAX is the return-value register for int-sized values in the function-calling convention. (The high 32 bits of RAX are ignored because main returns int. But also, $? exit status can only get the low 8 bits of the value passed to exit().)
Related:
Why am I allowed to exit main using ret?
What happens with the return value of main()?
where goes the ret instruction of the main
What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code? explains why you should use syscall, and shows some of what happens inside the kernel after a system call.
_start is the entry point for the binary. Main is the entry point for the C code.
_start is specific to a toolchain, main() is specific to a language.
You can't simply start executing compiled C code. You need a bootstrap: some code that preps the minimum things a high-level language like that requires. Other languages have a longer list of requirements, but for C you need, through the loader (if on an operating system), the bootstrap, or both, a solution for the stack pointer so that there is a stack, the read/write global data (often called .data) initialized, and the zeroed data (often called .bss) zeroed. Then the bootstrap can call main().
Because most code runs on some operating system, and the operating system loads that code into RAM, there is no hard entry point requirement of the kind you have when booting a processor, where there is a hard entry point or a hard vector table address. So GNU is flexible enough, and some operating systems are flexible enough, that the entry point of the code doesn't have to be the first machine code in the binary. That doesn't mean that _start indicates the entry point per se: you need to tell the linker the entry point, with ENTRY(_start) for example if you use a linker script for GNU ld. But the tools do expect to find a label called _start, and if the linker doesn't find one it issues a warning and keeps going.
main() is specific to the C language as the C entry point, the label the bootstrap calls after it does its job and is ready to run the compiled C code.
If loading into ram and if the binary file format supports it and the operating system's loader supports it the entry point into the binary can be anywhere in the binary, indicated in the binary file.
You can kind of think of _start as the entry point into the binary and main as the entry point into the compiled C code.
The return for a C function is defined by the calling convention that the C compiler uses. The compiler authors are free to do whatever they want, but in modern times they often conform to a convention defined for the target (ARM, x86, MIPS, etc). That calling convention defines exactly how to return something depending on the thing, so int main () is a return of an int, but float myfun() might have a different rule within the convention.
The return from a binary, if you can even return, is defined by the operating system or operating environment, which is independent of the high-level language. So on a Mac on an x86 processor the rule may be one thing, on Windows on x86 another, on Ubuntu Linux on the same x86 another, on BSD another, on Mint Linux (probably not, but possibly) another, and so on.
The rules and system calls are specific to the operating system, not the processor or computer, and certainly not the high-level language, which does not directly touch the operating system anyway (that is handled in bootstrap or library code, not in high-level language code). On a number of them you are supposed to make a system call, not simply return a value in a register, but clearly the operating system needs to be robust enough to handle an improper return from malformed binaries. Alternatively it may allow that as a legal return without an exiting system call, in which case it would define a rule for how to return without a system call.
As for how _start and main relate, you can easily see this yourself:
int main ( void )
{
return(5);
}
readelf shows:
Entry point address: 0x500
objdump shows (not the whole output here)
Disassembly of section .init:
00000000000004b8 <_init>:
4b8: 48 83 ec 08 sub $0x8,%rsp
4bc: 48 8b 05 25 0b 20 00 mov 0x200b25(%rip),%rax # 200fe8 <__gmon_start__>
4c3: 48 85 c0 test %rax,%rax
4c6: 74 02 je 4ca <_init+0x12>
4c8: ff d0 callq *%rax
4ca: 48 83 c4 08 add $0x8,%rsp
4ce: c3 retq
...
Disassembly of section .text:
00000000000004f0 <main>:
4f0: b8 05 00 00 00 mov $0x5,%eax
4f5: c3 retq
4f6: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
4fd: 00 00 00
...
0000000000000500 <_start>:
500: 31 ed xor %ebp,%ebp
502: 49 89 d1 mov %rdx,%r9
505: 5e pop %rsi
506: 48 89 e2 mov %rsp,%rdx
509: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
50d: 50 push %rax
50e: 54 push %rsp
50f: 4c 8d 05 6a 01 00 00 lea 0x16a(%rip),%r8 # 680 <__libc_csu_fini>
516: 48 8d 0d f3 00 00 00 lea 0xf3(%rip),%rcx # 610 <__libc_csu_init>
51d: 48 8d 3d cc ff ff ff lea -0x34(%rip),%rdi # 4f0 <main>
524: ff 15 b6 0a 20 00 callq *0x200ab6(%rip) # 200fe0 <__libc_start_main@GLIBC_2.2.5>
52a: f4 hlt
52b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
So you can see everything I mentioned above. The entry point for the binary is not at the beginning of the binary. The entry point (for the binary) is _start, somewhere in the middle of the binary. And somewhere after _start (not necessarily as close as seen here; it could be buried under other nested calls) main is called from the bootstrap code. In this case it is assumed that .data, .bss and the stack are set up by the loader, not by the bootstrap, before calling the C entry point.
So in this case, which is typical, _start is the entry point for the binary, and somewhere after it bootstraps for C it calls the C entry point main(). As the programmer, though, you control which linker script and bootstrap are used, and as a result you don't have to use _start as the entry point; you can create your own (it certainly can't be main() though, unless you are not fully supporting C, and possibly other exceptions related to the operating system).
If I compile this program:
#include <stdio.h>
int main(int argc, char** argv) {
printf("hello world!\n");
return 0;
}
for x86-64, the asm output uses movl $.LC0, %edi / call puts. (See full asm output / compile options on godbolt.)
My question is: How can GCC know that the string's address can fit in a 32-bit immediate operand? Why doesn't it need to use movabs $.LC0, %rdi (i.e. a mov r64, imm64, not a zero- or sign-extended imm32)?
AFAIK, there's nothing saying the loader has to decide to load the data section at any particular address. If the string is stored at some address above 1ULL << 32 then the higher bits will be ignored by the movl. I get similar behavior with clang, so I don't think this is unique to GCC.
The reason I care is I want to create my own data segment that lives in memory at any arbitrary address I choose (above 2^32 potentially).
In GCC manual:
https://gcc.gnu.org/onlinedocs/gcc-4.5.3/gcc/i386-and-x86_002d64-Options.html
3.17.15 Intel 386 and AMD x86-64 Options
-mcmodel=small
Generate code for the small code model: the program and its symbols
must be linked in the lower 2 GB of the address space. Pointers are 64
bits. Programs can be statically or dynamically linked. This is the
default code model.
-mcmodel=kernel
Generate code for the kernel code model. The kernel runs in the
negative 2 GB of the address space. This model has to be
used for Linux kernel code.
-mcmodel=medium
Generate code for the medium model: The program is linked in the lower
2 GB of the address space. Small symbols are also placed there.
Symbols with sizes larger than -mlarge-data-threshold are put into
large data or bss sections and can be located above 2GB. Programs can
be statically or dynamically linked.
-mcmodel=large
Generate code for the large model: This model makes no assumptions
about addresses and sizes of sections.
https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html
3.18.1 AArch64 Options
-mcmodel=tiny
Generate code for the tiny code model. The program and its statically defined symbols must be within 1MB of each other. Pointers
are 64 bits. Programs can be statically or dynamically linked. This
model is not fully implemented and mostly treated as ‘small’.
-mcmodel=small
Generate code for the small code model. The program and its statically defined symbols must be within 4GB of each other. Pointers
are 64 bits. Programs can be statically or dynamically linked. This is
the default code model.
-mcmodel=large
Generate code for the large code model. This makes no assumptions about addresses and sizes of sections. Pointers are 64 bits. Programs
can be statically linked only.
I can confirm that this happens on 64-bit compilation:
gcc -O1 foo.c
Then objdump -d a.out (notice also that printf("%s\n") can be optimized into puts!):
0000000000400536 <main>:
400536: 48 83 ec 08 sub $0x8,%rsp
40053a: bf d4 05 40 00 mov $0x4005d4,%edi
40053f: e8 cc fe ff ff callq 400410 <puts@plt>
400544: b8 00 00 00 00 mov $0x0,%eax
400549: 48 83 c4 08 add $0x8,%rsp
40054d: c3 retq
40054e: 66 90 xchg %ax,%ax
The reason is that GCC defaults to -mcmodel=small where the static data is linked in the bottom 2G of address space.
Notice that string constants do not go to the data segment, but they're within the code segment instead, unless -fwritable-strings. Also if you want to relocate the object code freely in memory, you'd probably want to compile with -fpic to make the code RIP relative instead of putting 64-bit addresses everywhere.
#include <stdio.h>
void demo()
{
printf("demo");
}
int main()
{
printf("%p",(void*)demo);
return 0;
}
The above code prints the address of function demo.
So if we can print the address of a function, that means that this function is present in the memory and is occupying some space in it.
So how much space it is occupying in the memory?
You can see for yourself using objdump -r -d:
0000000000000000 <demo>:
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: bf 00 00 00 00 mov $0x0,%edi
5: R_X86_64_32 .rodata
9: b8 00 00 00 00 mov $0x0,%eax
e: e8 00 00 00 00 callq 13 <demo+0x13>
f: R_X86_64_PC32 printf-0x4
13: 5d pop %rbp
14: c3 retq
0000000000000015 <main>:
EDIT
I took your code and compiled (but not linked!) it. Using objdump you can see the actual way the compiler lays out the code to be run. At the end of the day there is no such thing as a function: for the CPU it's just a jump to some location (that in this listing happens to be labeled). So the size of the "function" is the size of the code that comprises it.
There seems to be some confusion that this is somehow not "real code". Here is what GDB says:
Dump of assembler code for function demo:
0x000000000040052d <+0>: push %rbp
0x000000000040052e <+1>: mov %rsp,%rbp
0x0000000000400531 <+4>: mov $0x400614,%edi
0x0000000000400536 <+9>: mov $0x0,%eax
0x000000000040053b <+14>: callq 0x400410 <printf@plt>
0x0000000000400540 <+19>: pop %rbp
0x0000000000400541 <+20>: retq
This is exactly the same code, with exactly the same size, patched by the linker to use real addresses. gdb prints offsets in decimal while objdump uses the more favourable hex. As you can see, in both cases the size is 21 bytes.
So if we can print the address of a function, that means that this
function is present in the memory and is occupying some space in it.
Yes, the functions you write are compiled into code that's stored in memory. (In the case of an interpreted language, the code itself is kept in memory and executed by an interpreter.)
So how much space it is occupying in the memory?
The amount of memory depends entirely on the function. You can write a very long function or a very short one. The long one will require more memory. Space used for code generally isn't something you need to worry about, though, unless you're working in an environment with severe memory constraints, such as on a very small embedded system. On a desktop computer (or even a mobile device) with a modern operating system, the virtual memory system will take care of moving pages of code into or out of physical memory as they're needed, so there's very little chance that your code will consume too much memory.
Of course it's occupying space in memory, the entire program is loaded in memory once you execute it. Typically, the program instructions are stored in the lowest bytes of the memory space, known as the text section. You can read more about that here: http://www.geeksforgeeks.org/memory-layout-of-c-program/
Yes, all functions that you use in your code do occupy memory space. However, the memory space does not necessarily belong exclusively to your function. For example, an inline function would occupy space inside each function from where it is called.
The standard does not provide a way to tell how much space a function occupies in memory, as pointer arithmetic, the trick that lets you compute sizes of contiguous memory regions in the data memory, is not defined for function pointers. Moreover, ISO C forbids conversion of function pointer to object pointer type, so you cannot get around this restriction by casting your function pointer to, say, a char*.
printf("%p",demo);
The above code prints the address of function demo().
That is undefined behavior: %p expects a void*, while you are passing it a void (*)(). You should see a compiler warning telling you that what you are doing is not valid (demo).
As for determining the amount of memory it is occupying, this is not possible at run-time. However, there are other ways you can determine it:
How to get the length of a function in bytes?
The functions are compiled into machine code that will run only on a specific ISA (x86, probably ARM if it's going to run on your phone, etc.) Since different processors may need more or fewer instructions to run the same function, and the length of instructions can also vary, there is no way to know in advance exactly how big the function will be until you compile it.
Even if you know what processor and operating system it will be compiled for, different compilers will create different, equivalent representations of the function depending on which instructions they use and how they optimize the code.
Also, keep in mind a function occupies memory in different ways. I think you are talking about the code itself, which is its own section. During execution, the function can also occupy space on the stack - every time the function is called, more memory is taken up in the form of a stack frame. The amount depends on the number and type of local variables and arguments declared by the function.
Yes; however, you can declare it as inline, so the compiler will take the code and place it wherever you call that function. Or you can use preprocessor macros. Do keep in mind that using inline can generate larger code in exchange for faster execution, and the compiler can decide to ignore your inline request if it feels that the result would become too large.
Basically, I used objdump -D to disassemble an object file and an ELF file. The major difference I see between the two is the following.
I see that the instructions in the object file (of the individual segments) have addresses that start at 0. The consecutive addresses are then offset by a certain value, probably depending on the length of the opcode of each specific instruction.
Disassembly of section .text:
00000000 <main>:
0: 8d 4c 24 04 lea 0x4(%esp),%ecx
4: 83 e4 f0 and $0xfffffff0,%esp
7: ff 71 fc pushl -0x4(%ecx)
a: 55 push %ebp
On the other hand, for an ELF file I see a 32-bit address space for the instructions. Also, if I print the address of main in my program, it is the same as the address in my disassembled ELF.
08048394 <main>:
8048394: 8d 4c 24 04 lea 0x4(%esp),%ecx
8048398: 83 e4 f0 and $0xfffffff0,%esp
804839b: ff 71 fc pushl -0x4(%ecx)
804839e: 55 push %ebp
The questions here are.
What do the addresses in the ELF file actually refer to?
How does the linker compute them?
The ELF file contains code linked together against the preferred load address of the executable (and you can change the preference through linker options). The addresses you're seeing are computed by objdump against that address, which is part of the ELF format.
The object code has no load address (yet) because it isn't linked into a loadable image. Once it is stitched together by the linker (along with the rest of the object code and shared object references), the final output moves all that code into position against the preferred load address (sort of... the loader is what actually does this when the ELF image is loaded for execution). Suggested further reading (and there are a TON of links out there)
I have a strange problem after turning on level 1 optimization in gcc. What I do is save the label and jmp back to it from a different function later.
void
UMS__suspend_procr( VirtProcr *animatingPr )
{
animatingPr->nextInstrPt = &&ResumePt;
[Some Code and inline volatile asm]
ResumePt:
return;
}
I do some of these jumps and they all work fine.
The problem is that when I turn on O1 it does not save the right label address. Instead it does this:
804b14e: 8b 45 08 mov 0x8(%ebp),%eax
804b151: c7 40 14 4e b1 04 08 movl $0x804b14e,0x14(%eax)
804b158: 8b 55 08 mov 0x8(%ebp),%edx
So the program is jumping back even before the assignment.
This code is not valid GNU C. To begin with, computed gotos (&&label) are a feature specific to GNU C, not part of the C language, but that's ok if you're using GNU C. However, the only place they're valid in GNU C is with a goto statement. You cannot use the pointer with inline asm as an indirect jump/call destination, because adjusting the stack frame is up to the compiler, and the current logical view of the stack frame from the point of the inline asm and the label destination might not match. With an explicit goto statement, the compiler can patch this up, but with asm it can't even tell it's happening.
As for the bigger picture, if you're writing code like this, you should really rethink some of your assumptions. There's certainly a better way to accomplish what you want.