Hi, I'm learning how to write a C compiler with this book: https://www.sigbus.info/compilerbook
I want to get the same result as the book shows. What should I do? I think I need to change the version of gcc or objdump, or the options.
The book says that it is also possible to compile from the following expected assembly output.
expected
.intel_syntax noprefix
.global main
main:
mov rax, 42
ret
actual
00000000000005fa <main>:
5fa: 55 push rbp
5fb: 48 89 e5 mov rbp,rsp
5fe: b8 2a 00 00 00 mov eax,0x2a
603: 5d pop rbp
604: c3 ret
605: 66 2e 0f 1f 84 00 00 nop WORD PTR cs:[rax+rax*1+0x0]
60c: 00 00 00
60f: 90 nop
what I did
root@686394c78009:/zcc# uname -a
Linux 686394c78009 4.9.125-linuxkit #1 SMP Fri Sep 7 08:20:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
root@686394c78009:/zcc# objdump -v
GNU objdump (GNU Binutils for Ubuntu) 2.30
Copyright (C) 2018 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or (at your option) any later version.
This program has absolutely no warranty.
root@686394c78009:/zcc# gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/7/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 7.4.0-1ubuntu1~18.04.1' --with-bugurl=file:///usr/share/doc/gcc-7/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --prefix=/usr --with-gcc-major-version-only --program-suffix=-7 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 7.4.0 (Ubuntu 7.4.0-1ubuntu1~18.04.1)
root@686394c78009:/zcc# cat test1.c
int main() {
    return 42;
}
root@686394c78009:/zcc# gcc -o test1 test1.c
root@686394c78009:/zcc# ./test1
root@686394c78009:/zcc# echo $?
42
root@686394c78009:/zcc# objdump -d -M intel ./test1
Update 1
I generated assembly code with the -S option, and compiling from the generated assembly worked.
There are still some differences from my reference book, but I will learn more.
Another curious thing is that a different register name is used. I will look into that too. (I have realized I need to learn the basics.)
// expected
mov rax, 42
// actual
mov eax, 42
root@686394c78009:/zcc# gcc -S -masm=intel test1.c
root@686394c78009:/zcc# cat test1.s
.file "test1.c"
.intel_syntax noprefix
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
push rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
mov rbp, rsp
.cfi_def_cfa_register 6
mov eax, 42
pop rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0"
.section .note.GNU-stack,"",@progbits
root@686394c78009:/zcc# gcc -o test1 test1.s
root@686394c78009:/zcc# ./test1
root@686394c78009:/zcc# echo $?
42
Instead of dumping with objdump, try to directly generate assembly code with the -S option for the compiler. With -masm=intel, the output should look similar to what you expect.
Do not expect the compiler to generate the exact same code though. Different compilers and different compiler versions or even the same compiler with different flags may make different choices and generate different assembly for the same code. That's normal.
I wrote this code and found that it acts differently with different versions of gcc.
The source code,
#include <stdio.h>

int *fun();

int main(int argc, char *argv[])
{
    int *ptr;
    ptr = fun();
    printf("%x", *ptr);
}

int *fun()
{
    int *ptr;
    int foo = 0xdeadbeef;
    ptr = &foo;
    return ptr;
}
The code is wrong. After fun() returns, the local variable foo no longer exists, but main still tries to use it, so I expected a segmentation fault.
But I tried the same code on three versions of gcc and they act differently.
In 10.2.0
$ gcc -v | bin/pbcopy
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-pc-linux-gnu/10.2.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /build/gcc/src/gcc/configure --prefix=/usr --libdir=/usr/lib --libexecdir=/usr/lib --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=https://bugs.archlinux.org/ --enable-languages=c,c++,ada,fortran,go,lto,objc,obj-c++,d --with-isl --with-linker-hash-style=gnu --with-system-zlib --enable-__cxa_atexit --enable-cet=auto --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-default-ssp --enable-gnu-indirect-function --enable-gnu-unique-object --enable-install-libiberty --enable-linker-build-id --enable-lto --enable-multilib --enable-plugin --enable-shared --enable-threads=posix --disable-libssp --disable-libstdcxx-pch --disable-libunwind-exceptions --disable-werror gdc_include_dir=/usr/include/dlang/gdc
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 10.2.0 (GCC)
$ gcc a.c && a.out
deadbeef%
It prints deadbeef.
Its assembly code:
(gdb) disassemble fun
Dump of assembler code for function fun:
0x000000000000119d <+23>: movl $0xdeadbeef,-0x14(%rbp)
0x00000000000011a4 <+30>: lea -0x14(%rbp),%rax
0x00000000000011a8 <+34>: mov %rax,-0x10(%rbp)
0x00000000000011ac <+38>: mov -0x10(%rbp),%rax
0x00000000000011b0 <+42>: mov -0x8(%rbp),%rdx
0x00000000000011b4 <+46>: sub %fs:0x28,%rdx
0x00000000000011bd <+55>: je 0x11c4 <fun+62>
0x00000000000011bf <+57>: call 0x1030 <__stack_chk_fail@plt>
0x00000000000011c4 <+62>: leave
0x00000000000011c5 <+63>: ret
End of assembler dump.
(gdb) disass main
0x000000000000116c <+35>: mov %eax,%esi
0x000000000000116e <+37>: lea 0xe8f(%rip),%rdi # 0x2004
0x0000000000001175 <+44>: mov $0x0,%eax
0x000000000000117a <+49>: call 0x1040 <printf@plt>
The assembly shows fun leaving the address of foo in %rax (the lea at <+30>); main then apparently reads the value through that pointer and passes it to printf in %esi, so it prints deadbeef.
In 9.3.0:
coolder@ASUS:~$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/9/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:hsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 9.3.0-15' --with-bugurl=file:///usr/share/doc/gcc-9/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-9 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none=/build/gcc-9-0xEOmg/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-build-config=bootstrap-lto-lean --enable-link-mutex
Thread model: posix
gcc version 9.3.0 (Debian 9.3.0-15)
coolder@ASUS:~$ gcc a.c && ./a.out
a.c: In function ‘fun’:
a.c:14:9: warning: function returns address of local variable [-Wreturn-local-addr]
14 | return &a;
| ^~
0coolder@ASUS:~$
It prints 0.
Its assembly code,
(gdb) disassemble fun
Dump of assembler code for function fun:
0x000055555555515e <+0>: push %rbp
0x000055555555515f <+1>: mov %rsp,%rbp
0x0000555555555162 <+4>: movl $0xdeadbeef,-0x4(%rbp)
0x0000555555555169 <+11>: mov $0x0,%eax
0x000055555555516e <+16>: pop %rbp
(gdb) disass main
0x000055555555513e <+9>: call 0x55555555515e <fun>
0x0000555555555143 <+14>: mov %rax,%rsi
0x0000555555555146 <+17>: lea 0xeb7(%rip),%rdi # 0x555555556004
0x000055555555514d <+24>: mov $0x0,%eax
The assembly shows fun moving 0 into %eax and main passing %rax to printf in %rsi, so it prints 0.
In 5.4.1
➜ ~ gcc a.c && ./a.out
a.c: In function ‘fun’:
a.c:17:9: warning: function returns address of local variable [-Wreturn-local-addr]
return &a;
^
[1] 3566 segmentation fault (core dumped) ./a.out
It gets segmentation fault, as I expected.
Its assembly code,
(gdb) disassemble fun
Dump of assembler code for function fun:
0x08048448 <+0>: push %ebp
0x08048449 <+1>: mov %esp,%ebp
0x0804844b <+3>: sub $0x10,%esp
0x0804844e <+6>: movl $0xdeadbeef,-0x4(%ebp)
0x08048455 <+13>: mov $0x0,%eax
0x0804845a <+18>: leave
0x0804845b <+19>: ret
(gdb) disass main
0x0804841d <+17>: call 0x8048448 <fun>
0x08048422 <+22>: mov %eax,-0xc(%ebp)
0x08048425 <+25>: mov -0xc(%ebp),%eax
0x08048428 <+28>: mov (%eax),%eax
The assembly shows that fun moves 0x0 into %eax and main then dereferences %eax, which leads to the segmentation fault.
So why is the assembly code so different?
Any help will be appreciated.
Returning the address of a local variable and accessing it after its lifetime has ended is undefined behavior. Rationalizing what happens under the hood is a fool's errand, because there are no standard rules to be followed (apart, of course, from the aforementioned UB rules). It is quite common for different compiler versions to change how a situation like this is handled.
So I'm trying to compile a C file to .bin and then add it to an .img file after my first stage bootloader.
I have found these bash commands in this answer by user Michael Petch:
gcc -g -m32 -c -ffreestanding -o kernel.o kernel.c -lgcc
ld -melf_i386 -Tlinker.ld -nostdlib --nmagic -o kernel.elf kernel.o
objcopy -O binary kernel.elf kernel.bin
and used this C code (taken from the same answer, saved as kernel.c):
/* This code will be placed at the beginning of the object by the linker script */
__asm__ ("jmp _main\r\n");
int main(){
/* Do Stuff Here*/
return 0; /* return back to bootloader */
}
I executed those commands in cygwin and it produced the following result:
ld: kernel.o: in function `main':
/cygdrive/d/Work/asm/kernel.c:4: undefined reference to `___main'
objcopy: 'kernel.elf': No such file
The linker.ld file is here:
OUTPUT_FORMAT(elf32-i386)
ENTRY(_main)
SECTIONS
{
. = 0x9000;
.text : { *(.text.start) *(.text) }
.data : { *(.data) }
.bss : { *(.bss) *(COMMON) }
}
I have disassembled the kernel.o file using objdump, the result of which is here:
> objdump -d -j .text kernel.o
kernel.o: file format pe-i386
Disassembly of section .text:
00000000 <.text>:
0: eb 00 jmp 2 <_main>
00000002 <_main>:
2: 55 push %ebp
3: 89 e5 mov %esp,%ebp
5: 83 e4 f0 and $0xfffffff0,%esp
8: e8 00 00 00 00 call d <_main+0xb>
d: b8 00 00 00 00 mov $0x0,%eax
12: c9 leave
13: c3 ret
Here is the result of gcc -v if that helps also:
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-pc-cygwin/10/lto-wrapper.exe
Target: x86_64-pc-cygwin
Configured with: /mnt/share/cygpkgs/gcc/gcc.x86_64/src/gcc-10.2.0/configure --srcdir=/mnt/share/cygpkgs/gcc/gcc.x86_64/src/gcc-10.2.0 --prefix=/usr --exec-prefix=/usr --localstatedir=/var --sysconfdir=/etc --docdir=/usr/share/doc/gcc --htmldir=/usr/share/doc/gcc/html -C --build=x86_64-pc-cygwin --host=x86_64-pc-cygwin --target=x86_64-pc-cygwin --without-libiconv-prefix --without-libintl-prefix --libexecdir=/usr/lib --with-gcc-major-version-only --enable-shared --enable-shared-libgcc --enable-static --enable-version-specific-runtime-libs --enable-bootstrap --enable-__cxa_atexit --with-dwarf2 --with-tune=generic --enable-languages=c,c++,fortran,lto,objc,obj-c++ --enable-graphite --enable-threads=posix --enable-libatomic --enable-libgomp --enable-libquadmath --enable-libquadmath-support --disable-libssp --enable-libada --disable-symvers --with-gnu-ld --with-gnu-as --with-cloog-include=/usr/include/cloog-isl --without-libiconv-prefix --without-libintl-prefix --with-system-zlib --enable-linker-build-id --with-default-libstdcxx-abi=gcc4-compatible --enable-libstdcxx-filesystem-ts
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 10.2.0 (GCC)
What am I doing wrong? Is this caused by cygwin? If yes, is there any other option I could use on windows? (I tried MSVC but that is just plainly horrible)
Also, my bootloader is not using any .section pseudo-ops (I have no idea on how to correctly work with them), will this cause any problems in the future and will it work correctly with the compiled C program?
Searching further, it turns out that __main (which gets an additional leading underscore at the symbol level, hence ___main in the error) is a function that GCC calls at the start of main on some targets.
The same problem is mentioned in the following two answers:
https://stackoverflow.com/a/32164910/14320958
https://stackoverflow.com/a/45442576/14320958
Both of them point to some form of connection to the -lgcc option and the libgcc library.
Renaming main to __main works, but is not recommended (the entry point for kernels is apparently by convention kmain, as seen in other questions and answers).
As far as I can tell, __main is an initialization helper that normally comes from the compiler's runtime library. On targets such as Cygwin and MinGW, GCC inserts a call to it at the beginning of main, which is the unresolved call at offset 8 in the disassembly above. It is not something the OS itself calls (more research would need to be done on the system-specific details).
Because GCC generates that call for a function named main even in a standalone compilation, the link fails unless something defines __main or the entry function is given a different name.
I have a simple C program. Let's say, for example, I have an int and a char array of length 20. I need 24 bytes in total.
int main()
{
char buffer[20];
int x = 0;
buffer[0] = 'a';
buffer[19] = 'a';
}
The stack needs to be aligned to a 16-byte boundary, so I presumed the compiler would reserve 32 bytes. But when I compile such a program with gcc for x86-64 and read the output assembly, the compiler reserves 64 bytes.
..\gcc -S -o main.s main.c
Gives me:
.file "main.c"
.def __main; .scl 2; .type 32; .endef
.text
.globl main
.def main; .scl 2; .type 32; .endef
.seh_proc main
main:
pushq %rbp # RBP is pushed, so no need to reserve more for it
.seh_pushreg %rbp
movq %rsp, %rbp
.seh_setframe %rbp, 0
subq $64, %rsp # Reserving the 64 bytes
.seh_stackalloc 64
.seh_endprologue
call __main
movl $0, -4(%rbp) # Using the first 4 bytes to store the int
movb $97, -32(%rbp) # Using from RBP-32
movb $97, -13(%rbp) # to RBP-13 to store the char array
movl $0, %eax
addq $64, %rsp # Restoring the stack with the last 32 bytes unused
popq %rbp
ret
.seh_endproc
.ident "GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 5.2.0"
Why is that? When I program assembly, I always reserve only the minimum memory I need without any problem. Is that a limitation of the compiler which has trouble evaluating the needed memory or is there a reason for that?
Here is gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=D:/Mingw64/bin/../libexec/gcc/x86_64-w64-mingw32/5.2.0/lto-wrapper.exe
Target: x86_64-w64-mingw32
Configured with: ../../../src/gcc-5.2.0/configure --host=x86_64-w64-mingw32 --build=x86_64-w64-mingw32 --target=x86_64-w64-mingw32 --prefix=/mingw64 --with-sysroot=/c/mingw520/x86_64-520-posix-seh-rt_v4-rev0/mingw64 --with-gxx-include-dir=/mingw64/x86_64-w64-mingw32/include/c++ --enable-shared --enable-static --disable-multilib --enable-languages=c,c++,fortran,objc,obj-c++,lto --enable-libstdcxx-time=yes --enable-threads=posix --enable-libgomp --enable-libatomic --enable-lto --enable-graphite --enable-checking=release --enable-fully-dynamic-string --enable-version-specific-runtime-libs --disable-isl-version-check --disable-libstdcxx-pch --disable-libstdcxx-debug --enable-bootstrap --disable-rpath --disable-win32-registry --disable-nls --disable-werror --disable-symvers --with-gnu-as --with-gnu-ld --with-arch=nocona --with-tune=core2 --with-libiconv --with-system-zlib --with-gmp=/c/mingw520/prerequisites/x86_64-w64-mingw32-static --with-mpfr=/c/mingw520/prerequisites/x86_64-w64-mingw32-static --with-mpc=/c/mingw520/prerequisites/x86_64-w64-mingw32-static --with-isl=/c/mingw520/prerequisites/x86_64-w64-mingw32-static --with-pkgversion='x86_64-posix-seh-rev0, Built by MinGW-W64 project' --with-bugurl=http://sourceforge.net/projects/mingw-w64 CFLAGS='-O2 -pipe -I/c/mingw520/x86_64-520-posix-seh-rt_v4-rev0/mingw64/opt/include -I/c/mingw520/prerequisites/x86_64-zlib-static/include -I/c/mingw520/prerequisites/x86_64-w64-mingw32-static/include' CXXFLAGS='-O2 -pipe -I/c/mingw520/x86_64-520-posix-seh-rt_v4-rev0/mingw64/opt/include -I/c/mingw520/prerequisites/x86_64-zlib-static/include -I/c/mingw520/prerequisites/x86_64-w64-mingw32-static/include' CPPFLAGS= LDFLAGS='-pipe -L/c/mingw520/x86_64-520-posix-seh-rt_v4-rev0/mingw64/opt/lib -L/c/mingw520/prerequisites/x86_64-zlib-static/lib -L/c/mingw520/prerequisites/x86_64-w64-mingw32-static/lib '
Thread model: posix
gcc version 5.2.0 (x86_64-posix-seh-rev0, Built by MinGW-W64 project)
Compilers may indeed reserve additional memory for themselves.
GCC has a flag, -mpreferred-stack-boundary, to set the alignment it will maintain. According to the documentation, the default is 4, which should produce 16-byte alignment (2^4), which is needed for SSE instructions.
As VermillionAzure noted in a comment, you should provide your gcc version and compile-time options (use gcc -v to show these).
Because you haven't enabled optimization.
Without optimization, the compiler makes no attempt to minimize the amount of space or time it needs for anything in the generated code; it just generates code in the most straightforward way possible.
Add -O2 (or even just -O1) or -Os if you want the compiler to produce decent code.
I need 24 bytes in total.
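The compiler needs space for a return address and a base pointer. As you are in 64-bit mode, that's another 16 bytes, for a total of 40. Round that up to a 32-byte boundary and you get 64. Note, however, that in the listing above those two values are pushed by the call and by pushq %rbp, not reserved by subq $64. A likely contributor is the Windows x64 calling convention, under which a caller reserves 32 bytes of shadow space for any function it calls (here __main): 32 bytes for the aligned locals plus 32 bytes of shadow space also gives 64.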
I am following this half-completed tutorial to develop a simple OS. One step (on page 50) is to compile a simple kernel with $ ld -o kernel.bin -Ttext 0x1000 kernel.o --oformat binary. However, I don't really understand what the -Ttext option is doing.
To make the question specific: why, in the following experiment, are the md5 sums of kernel_1000.bin & kernel.bin equal, kernel_1001.bin & kernel_1009.bin equal, and kernel_1007.bin & kernel_1017.bin equal, while all other pairs are not equal?
My experiment
I tried to compile several different kernels with different -Ttext values, as in the following Makefile:
...
kernel.o: kernel.c
	gcc -ffreestanding -c kernel.c
kernel.bin: kernel.o
	ld -o $@ kernel.o --oformat binary
kernel_1000.bin: kernel.o
	ld -o $@ -Ttext 0x1000 kernel.o --oformat binary
kernel_1001.bin: kernel.o
	ld -o $@ -Ttext 0x1001 kernel.o --oformat binary
...
And then I checked their md5:
$ ls *.bin | xargs md5sum
d9248440a2c816e41527686cdb5118e4 kernel_1000.bin
65db5ab465301da1176b523dec387a40 kernel_1001.bin
819a5638827494a4556b7a96ee6e14b2 kernel_1007.bin
d9248440a2c816e41527686cdb5118e4 kernel_1008.bin
65db5ab465301da1176b523dec387a40 kernel_1009.bin
216b24060abce034911642acfa880403 kernel_1015.bin
e92901b1d12d316c564ba7916abca20c kernel_1016.bin
819a5638827494a4556b7a96ee6e14b2 kernel_1017.bin
d9248440a2c816e41527686cdb5118e4 kernel.bin
kernel.c
void main() {
char* video_memory = (char*) 0xb8000;
*video_memory = 'X';
}
Development environment
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.9/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 4.9.2-10' --with-bugurl=file:///usr/share/doc/gcc-4.9/README.Bugs --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.9 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.9 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-4.9-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-4.9-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-4.9-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --with-arch-32=i586 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.9.2 (Debian 4.9.2-10)
$ uname -a
Linux localhost 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt20-1+deb8u1 (2015-12-14) x86_64 GNU/Linux
The -Ttext option places the .text section of your program at the given address. For example, if you compile this assembly code:
section .text
global _start
_start:
mov al, '!'
jmp l
l: mov ah, 0x0e
mov bh, 0x00
mov bl, 0x07
int 0x10
jmp $
times 510-($-$$) db 0
db 0x55
db 0xaa
with:
nasm -f elf64 -o test.o test.S
ld -o test test.o
and look at it with objdump, you will see that it was linked at the default address, somewhere around 0x0000000000400000 for x86_64:
~$ objdump -D test
test: file format elf64-x86-64
Disassembly of section .text:
0000000000400080 <_start>:
400080: b0 21 mov $0x21,%al
400082: eb 00 jmp 400084 <l>
0000000000400084 <l>:
400084: b4 0e mov $0xe,%ah
...
...
...
All addresses in the program (at least in the .text section) will be relative to this address. If you add the -Ttext 1000 option, you will see:
~$ objdump -D test
test: file format elf64-x86-64
Disassembly of section .text:
0000000000001000 <_start>:
1000: b0 21 mov $0x21,%al
1002: eb 00 jmp 1004 <l>
0000000000001004 <l>:
1004: b4 0e mov $0xe,%ah
That is, your program will be linked to start at address 0x1000, and all addresses (including jmp targets, etc.) will be relative to 0x1000 too.
This is important for two reasons. In short, when an operating system kernel loads your program, it loads your executable, which is in ELF or another binary format, and reads where the .text section starts. In our case, you can link your kernel.bin however you want, because there is no loader such as an operating system kernel and you are the master of the whole memory space.
So if you link your kernel.bin to start at 0x1000, you will know where the code starts (provided, of course, that it is actually loaded at that place in memory), and if you know the base address of your code, you can compute every address inside it, with something like my_label_inside_program - _start.