I am in the process of writing a small operating system in C. I have written a bootloader and I'm now trying to get a simple C file (the "kernel") to compile with gcc:
int main(void) { return 0; }
I compile the file with the following command:
gcc kernel.c -o kernel.o -nostdlib -nostartfiles
I use the linker to create the final image using this command:
ld kernel.o -o kernel.bin -T linker.ld --oformat=binary
The contents of the linker.ld file are as follows:
. = 0x7e00;
.text ALIGN (0x00) :
(The bootloader loads the image at address 0x7e00.)
This seems to work quite well - ld produces a 128-byte file containing the following instructions in the first 11 bytes:
00000000 55 push ebp
00000001 48 dec eax
00000002 89 E5 mov ebp, esp
00000004 B8 00 00 00 00 mov eax, 0x00000000
00000009 5D pop ebp
0000000A C3 ret
However, I can't figure out what the other 117 bytes are for. Disassembling them seems to produce a bunch of garbage that doesn't make any sense. The existence of the additional bytes has me wondering if I'm doing something wrong.
Should I be concerned?

These are additional sections, which were not stripped and not discarded. You want your linker.ld file to look like this:
. = 0x7e00;
.text ALIGN (0x00) :
I know what sections to discard from the output of objdump -t kernel.o.

Simple: you're using gcc, and it always puts its initialization code in before passing control to your main.
I don't know what's in that startup code, but it is there. As you can see, there's also a 'GNU' comment in your binary; you can print specific sections by using objdump -s -j 'section name'.


Assembly code different from gdb display of code

I'm learning about operating systems from the book Operating Systems from 0 to 1, and I'm trying to display the code in my kernel called main, however the code displayed in GDB is not the same even though I jumped to the address that is the entry point.
; bootloader.asm
; A Simple Bootloader
bits 16
start: jmp boot
;; constants and variable definitions
msg db "Welcome to My Operating System!", 0ah, 0dh, 0h
cli ; no interrupts
cld ; all that we need to init
mov ax, 0x0000
;; set buffer
mov es, ax
mov bx, 0x0600
mov al, 1 ; read one sector
mov ch, 0 ; track 0
mov cl, 2 ; sector to read
mov dh, 0 ; head number
mov dl, 0 ; drive number
mov ah, 0x02 ; read sectors from disk
int 0x13 ; call the BIOS routine
jmp 0x0000:0x0600 ; jump and execute the sector!
hlt ; halt the system
; We have to be 512 bytes. Clear the rest of the bytes with 0
times 510 - ($-$$) db 0
dw 0xAA55 ; Boot Signature
readelf -h main
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: Intel 80386
Version: 0x1
Entry point address: 0x600
Start of program headers: 52 (bytes into file)
Start of section headers: 12888 (bytes into file)
Flags: 0x0
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 3
Size of section headers: 40 (bytes)
Number of section headers: 12
Section header string table index: 11
readelf -l main
Elf file type is EXEC (Executable file)
Entry point 0x600
There are 3 program headers, starting at offset 52
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000000 0x00000000 0x00000000 0x00094 0x00094 R 0x4
LOAD 0x000000 0x00000000 0x00000000 0x00094 0x00094 R 0x4
LOAD 0x000100 0x00000600 0x00000600 0x00006 0x00006 R E 0x100
Section to Segment mapping:
Segment Sections...
02 .text
void main(){}
objdump -z -M intel -S -D build/os/main
Disassembly of section .text:
00000600 <main>:
void main(){}
600: 55 push ebp
601: 89 e5 mov ebp,esp
603: 90 nop
604: 5d pop ebp
605: c3 ret
But this is GDB's output after setting a breakpoint at main (0x600):
0x600 <main> jg 0x647
0x602 <main+2> dec esp
0x603 <main+3> inc esi
0x604 <main+4> add DWORD PTR [ecx],eax
Why is this happening? Am I loading at the wrong address? How do I find the correct address to load at?
Here are the commands I use for compiling:
nasm -f elf bootloader.asm -F dwarf -g -o ../build/bootloader/bootloader.o
ld -m elf_i386 -T bootloader.lds ../build/bootloader/bootloader.o -o ../build/bootloader/bootloader.o.elf
objcopy -O binary ../build/bootloader/bootloader.o.elf ../build/bootloader/bootloader.o
gcc -ffreestanding -nostdlib -fno-pic -gdwarf-4 -m16 -ggdb3 -c main.c -o ../build/os/main.o
ld -m elf_i386 -nmagic -T os.lds ../build/os/main.o -o ../build/os/main
dd if=/dev/zero of=disk.img bs=512 count=2880
2880+0 records in
2880+0 records out
1474560 bytes (1.5 MB, 1.4 MiB) copied, 0.0150958 s, 97.7 MB/s
dd conv=notrunc if=build/bootloader/bootloader.o of=disk.img bs=512 count=1 seek=0
1+0 records in
1+0 records out
512 bytes copied, 0.000127745 s, 4.0 MB/s
dd conv=notrunc if=build/os/main.o of=disk.img bs=512 count=$((8504/512))
16+0 records in
16+0 records out
8192 bytes (8.2 kB, 8.0 KiB) copied, 0.000184251 s, 44.5 MB/s
qemu-system-i386 -machine q35 -fda disk.img -gdb tcp::26000 -S
And here are the GDB commands for displaying the code of main:
set architecture i8086
target remote localhost:26000
b *0x7c00
set disassembly-flavor intel
layout asm
layout reg
symbol-file build/os/main
b main
jg / dec esp / inc esi is the ELF magic number, not machine code! You'll see the same thing from the start of the output of ndisasm -b32 /bin/ls. (ndisasm always treats its input as a flat binary; it doesn't look for any metadata.)
7F 45 4C 46 is the string "ELF" after a 0x7F byte, the ELF magic number that identifies the file format as ELF. It's followed by more ELF header bytes before the actual machine code for main. objdump -D disassembles all ELF sections, but it still parses the ELF headers, not disassembling them like ndisasm does. So you still just end up seeing the code from the .text section because the others are empty (because you linked this executable without libc or CRT startfiles, and with C main as the ELF entry point?!?)
You're jumping to the start of the ELF file as if it were a flat binary. It's not, and writing an ELF program loader is not that simple. The ELF program headers (which readelf can parse) tell you which file offset goes at which address. The start of the .text section will be at some offset into the file, not overlapping the ELF magic number for obvious reasons. (Although it can overlap with the ELF header if you can find a way to make it fit: http://www.muppetlabs.com/~breadbox/software/tiny/teensy.html)
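To make the offset-versus-address point concrete, here is a minimal C sketch (not from the question) that checks the ELF magic and walks the program headers to find where in the file the entry point's bytes actually live. The type and function names (elf32_ehdr, elf32_phdr, elf32_entry_file_offset) are made up for illustration; the field layout follows the ELF32 specification, and the code assumes a little-endian host reading a little-endian ELF file.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Minimal ELF32 layouts (field order per the ELF specification).
   These type names are illustrative, not taken from the question. */
typedef struct {
    uint8_t  e_ident[16];
    uint16_t e_type, e_machine;
    uint32_t e_version, e_entry, e_phoff, e_shoff, e_flags;
    uint16_t e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx;
} elf32_ehdr;

typedef struct {
    uint32_t p_type, p_offset, p_vaddr, p_paddr;
    uint32_t p_filesz, p_memsz, p_flags, p_align;
} elf32_phdr;

#define ELF_PT_LOAD 1u   /* value of PT_LOAD in the ELF specification */

/* Return the file offset of the byte that belongs at the entry address,
   or -1 if the image is not ELF or no LOAD segment covers the entry. */
int64_t elf32_entry_file_offset(const uint8_t *img, size_t len)
{
    elf32_ehdr eh;
    if (len < sizeof eh)
        return -1;
    memcpy(&eh, img, sizeof eh);
    if (memcmp(eh.e_ident, "\x7f" "ELF", 4) != 0)   /* 7F 45 4C 46 */
        return -1;
    if (eh.e_phentsize < sizeof(elf32_phdr))
        return -1;

    for (uint16_t i = 0; i < eh.e_phnum; i++) {
        size_t off = (size_t)eh.e_phoff + (size_t)i * eh.e_phentsize;
        elf32_phdr ph;
        if (off > len || len - off < sizeof ph)
            return -1;
        memcpy(&ph, img + off, sizeof ph);
        if (ph.p_type != ELF_PT_LOAD)
            continue;
        /* Is the entry address inside the file-backed part of this segment? */
        if (eh.e_entry >= ph.p_vaddr && eh.e_entry - ph.p_vaddr < ph.p_filesz)
            return (int64_t)ph.p_offset + (eh.e_entry - ph.p_vaddr);
    }
    return -1;
}
```

For the main in the question, the LOAD segment with file offset 0x100 and virtual address 0x600 would make this return 0x100, which is where the bootloader would have to read from instead of the start of the file.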
Then once you have the file mapped into memory as specified in the program headers, you jump to the ELF entry point address (0x600 in your case). (Which is normally not a function; under a real OS like Linux, you can't ret from the entry point. Instead you need to make an exit system call.) You can't here, either, because you jmp to it instead of call.
This is why _start is separate from main; building a program with a compiler-generated main as its entry point doesn't work.
Of course most of this effort is doomed because you're jumping to your main with the CPU still in 16-bit real mode. But your main is compiled/assembled for 32-bit mode. You could somewhat work around that with gcc -m16 to assemble gcc output for 16-bit mode, using operand-size + address-size prefixes as necessary.
The machine code for that do-nothing main will actually work in both 16 and 32-bit mode. If you'd used a return 0 without optimization, that wouldn't be the case: the opcode (without prefixes) for mov eax, imm32 implies a different instruction length depending on what mode the CPU decodes it in, so decoding in 16-bit mode would write AX and leave 2 bytes of zeros to be decoded as the next instruction.
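As a rough illustration of that length difference, the following C sketch computes how many bytes a decoder consumes for mov r, imm (opcodes B8 to BF) in 16-bit versus 32-bit mode. The function name mov_r_imm_length is hypothetical, and the sketch only handles this one opcode range with at most one 0x66 operand-size prefix; it is not a general x86 decoder.

```c
#include <stddef.h>

/* Length in bytes of "mov r, imm" (opcodes B8..BF), including an optional
   0x66 operand-size prefix. mode_bits is 16 or 32. Returns 0 if the bytes
   are not this instruction. */
size_t mov_r_imm_length(const unsigned char *code, int mode_bits)
{
    size_t n = 0;
    int operand_bits = mode_bits;      /* default operand size follows the mode */

    if (code[n] == 0x66) {             /* prefix flips 16 <-> 32 */
        operand_bits = (mode_bits == 16) ? 32 : 16;
        n++;
    }
    if (code[n] < 0xB8 || code[n] > 0xBF)
        return 0;
    return n + 1 + (size_t)operand_bits / 8;   /* prefix + opcode + immediate */
}
```

With the bytes B8 00 00 00 00, this gives 5 in 32-bit mode but 3 in 16-bit mode, so a 16-bit decoder is left with 00 00 to interpret as a separate instruction.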
Most likely the easiest thing to do is turn your "kernel" into a flat binary, instead of writing an ELF program loader in your bootloader. Follow an osdev tutorial because lots can go wrong, and you have to be careful about static data for example.
Or see How to make the kernel for my bootloader? for an example bootloader that calls a C function after switching to 32-bit protected mode.
See more links in https://stackoverflow.com/tags/x86/info.

gcc subtracting from esp before call

I am planning to use C to write a small kernel and I really don't want it to bloat with unnecessary instructions.
I have two C files which are called main.c and hello.c. I compile and link them using the following GCC command:
gcc -Wall -T lscript.ld -m16 -nostdlib main.c hello.c -o main.o
I dump the .text section using the following objdump command:
objdump -w -j .text -D -mi386 -Maddr16,data16,intel main.o
and get the following dump:
00001000 <main>:
1000: 67 66 8d 4c 24 04 lea ecx,[esp+0x4]
1006: 66 83 e4 f0 and esp,0xfffffff0
100a: 67 66 ff 71 fc push DWORD PTR [ecx-0x4]
100f: 66 55 push ebp
1011: 66 89 e5 mov ebp,esp
1014: 66 51 push ecx
1016: 66 83 ec 04 sub esp,0x4
101a: 66 e8 10 00 00 00 call 1030 <hello>
1020: 90 nop
1021: 66 83 c4 04 add esp,0x4
1025: 66 59 pop ecx
1027: 66 5d pop ebp
1029: 67 66 8d 61 fc lea esp,[ecx-0x4]
102e: 66 c3 ret
00001030 <hello>:
1030: 66 55 push ebp
1032: 66 89 e5 mov ebp,esp
1035: 90 nop
1036: 66 5d pop ebp
1038: 66 c3 ret
My question is: why are the instructions at the following lines being generated?
I can see that the subtraction and addition cancel each other out, but why are they generated? I don't have any variables to be allocated on the stack. I'd also appreciate a source to read about the usage of ECX.
1016: 66 83 ec 04 sub esp,0x4
1021: 66 83 c4 04 add esp,0x4
/* main.c */
extern void hello();
void main(){
    hello();
}
/* hello.c */
void hello(){}
/* lscript.ld */
SECTIONS{
    .text 0x1000 : {*(.text)}
}
As I mentioned in my comments:
The first few lines (plus the push ecx) are to ensure the stack is aligned on a 16-byte boundary which is required by the Linux System V i386 ABI. The pop ecx and lea before the ret in main is to undo that alignment work.
@RossRidge has provided a link to another Stackoverflow answer that details this quite well.
In this case you seem to be doing real mode development. GCC isn't well suited for this but it can work and I will assume you know what you are doing. I mention some of the pitfalls of using -m16 in this Stackoverflow answer. I put this warning in that answer regarding real mode development with GCC:
There are so many pitfalls in doing this that I recommend against it.
If you remain undeterred and wish to continue forward you can do a few things to minimize the code. The 16-byte alignment of the stack at the point a function call is made is part of the more recent Linux System V i386 ABIs. Since you are generating code for a non-Linux environment you can change the stack alignment to 4 bytes using the compiler option -mpreferred-stack-boundary=2. The GCC manual says:
Attempt to keep the stack boundary aligned to a 2 raised to num byte boundary. If -mpreferred-stack-boundary is not specified, the default is 4 (16 bytes or 128 bits).
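To show what that boundary means in practice, here is a small C sketch of the rounding that the and esp,0xfffffff0 instruction in the original listing performs, generalized to a 2-raised-to-num boundary. The function name align_stack_down is made up for illustration, and it assumes num is less than 32.

```c
#include <stdint.h>

/* Round a stack pointer down to a 2^num byte boundary. With num = 4 this
   is the same operation as "and esp, 0xfffffff0". Assumes num < 32. */
uint32_t align_stack_down(uint32_t sp, unsigned num)
{
    uint32_t boundary = UINT32_C(1) << num;   /* num = 4 -> 16, num = 2 -> 4 */
    return sp & ~(boundary - 1);
}
```

For example, a stack pointer of 0x7bfc stays at 0x7bfc with num = 2 but drops to 0x7bf0 with num = 4.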
If we add that to your GCC command we get gcc -Wall -T lscript.ld -m16 -nostdlib main.c hello.c -o main.o -mpreferred-stack-boundary=2:
00001000 <main>:
1000: 66 55 push ebp
1002: 66 89 e5 mov ebp,esp
1005: 66 e8 04 00 00 00 call 100f <hello>
100b: 66 5d pop ebp
100d: 66 c3 ret
0000100f <hello>:
100f: 66 55 push ebp
1011: 66 89 e5 mov ebp,esp
1014: 66 5d pop ebp
1016: 66 c3 ret
Now all the extra alignment code to get it on a 16-byte boundary has disappeared. We are left with typical function frame pointer prologue and epilogue code. This is often in the form of push ebp and mov ebp,esp at the start and pop ebp before the ret. We can remove these with the -fomit-frame-pointer option, described in the GCC manual as:
The option -fomit-frame-pointer removes the frame pointer for all functions which might make debugging harder.
If we add that option we get gcc -Wall -T lscript.ld -m16 -nostdlib main.c hello.c -o main.o -mpreferred-stack-boundary=2 -fomit-frame-pointer:
00001000 <main>:
1000: 66 e8 02 00 00 00 call 1008 <hello>
1006: 66 c3 ret
00001008 <hello>:
1008: 66 c3 ret
You can then optimize for size with -Os. The GCC manual says this:
Optimize for size. -Os enables all -O2 optimizations that do not typically increase code size. It also performs further optimizations designed to reduce code size.
This has a side effect that main will be placed into a section called .text.startup. If we display both with objdump -w -j .text -j .text.startup -D -mi386 -Maddr16,data16,intel main.o we get:
Disassembly of section .text:
00001000 <hello>:
1000: 66 c3 ret
Disassembly of section .text.startup:
00001002 <main>:
1002: e9 fb ff jmp 1000 <hello>
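If you want to check the jump and call targets in these listings by hand, the displacement in a relative jmp or call is added to the address of the next instruction. A minimal C sketch of that arithmetic follows; the function name branch_target is hypothetical, and it ignores the 16-bit wraparound of IP that applies to rel16 branches in real mode.

```c
#include <stdint.h>

/* Target of a relative jmp/call: address of the next instruction
   (insn_addr + insn_len) plus the signed displacement. */
uint32_t branch_target(uint32_t insn_addr, uint32_t insn_len, int32_t disp)
{
    return insn_addr + insn_len + (uint32_t)disp;
}
```

For the jmp above, the bytes e9 fb ff at 0x1002 are 3 bytes long with a displacement of 0xfffb, which is -5 as a signed 16-bit value, so the target is 0x1002 + 3 - 5 = 0x1000. Likewise, 66 e8 10 00 00 00 at 0x101a in the first listing gives 0x101a + 6 + 0x10 = 0x1030.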
If you have functions in separate objects you can alter the calling convention so the first 3 Integer class parameters are passed in registers rather than the stack. The Linux kernel uses this method as well. Information on this can be found in the GCC documentation:
regparm (number)
On the Intel 386, the regparm attribute causes the compiler to pass arguments number one to number if they are of integral type in registers EAX, EDX, and ECX instead of on the stack. Functions that take a variable number of arguments will continue to be passed all of their arguments on the stack.
I wrote a Stackoverflow answer with code that uses __attribute__((regparm(3))) that may be a useful source of further information.
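As a minimal sketch of how the attribute is applied (this is not the code from the linked answer), the macro FASTARGS and the function add3 below are made-up names. The attribute only has an effect on 32-bit x86, so the sketch falls back to the default calling convention elsewhere. The declaration seen by callers and the definition must both carry the attribute, otherwise caller and callee disagree on where the arguments are.

```c
/* regparm only applies on 32-bit x86; use the default convention elsewhere. */
#if defined(__i386__)
#define FASTARGS __attribute__((regparm(3)))
#else
#define FASTARGS
#endif

/* Declaration, as it would appear in a header shared by caller and callee. */
FASTARGS int add3(int a, int b, int c);

/* Definition: on i386 the three arguments arrive in EAX, EDX and ECX. */
FASTARGS int add3(int a, int b, int c)
{
    return a + b + c;
}
```

A caller simply writes add3(1, 2, 3); the compiler handles loading the registers.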
Other Suggestions
I recommend you consider compiling each object individually rather than all together. This is also advantageous since it can more easily be done in a Makefile later on.
If we look at your command line with the extra options mentioned above you'd have:
gcc -Wall -T lscript.ld -m16 -nostdlib main.c hello.c -o main.o \
-mpreferred-stack-boundary=2 -fomit-frame-pointer -Os
I recommend you do it this way:
gcc -c -Os -Wall -m16 -ffreestanding -nostdlib -mpreferred-stack-boundary=2 \
-fomit-frame-pointer main.c -o main.o
gcc -c -Os -Wall -m16 -ffreestanding -nostdlib -mpreferred-stack-boundary=2 \
-fomit-frame-pointer hello.c -o hello.o
The -c option (I added it to the beginning) forces the compiler to just generate the object file from the source and not to perform linking. You will also notice the -T lscript.ld has been removed. We have created .o files above. We can now use GCC to link all of them together:
gcc -ffreestanding -nostdlib -Wl,--build-id=none -m16 \
-Tlscript.ld main.o hello.o -o main.elf
The -nostdlib option keeps GCC from linking in the C runtime startup files and standard libraries (-ffreestanding only affects compilation), and -Wl,--build-id=none tells the linker not to emit a build-id note in the executable. In order for this to really work you'll need a slightly more complex linker script that places .text.startup before .text. This script also adds the .data, .rodata and .bss sections. The /DISCARD/ directive removes exception handling data and other unneeded information.
.text 0x1000 : SUBALIGN(4) {
.data : SUBALIGN(4) {
.bss : SUBALIGN(4) {
__bss_start = .;
. = ALIGN(4);
__bss_end = .;
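The __bss_start and __bss_end symbols exist so that startup code can zero the BSS before any C code relies on zero-initialized variables, since nothing does that for you when you load a flat binary yourself. A minimal sketch of such a routine, assuming the symbols are declared as arrays in C; the function name zero_range is made up for illustration:

```c
/* Zero the half-open byte range [start, end). A freestanding kernel has no
   libc memset; the volatile qualifier also keeps the compiler from turning
   this loop back into a memset call. */
void zero_range(volatile unsigned char *start, volatile unsigned char *end)
{
    while (start < end)
        *start++ = 0;
}

/* Assumed usage in the kernel, with the symbols from the linker script:
 *
 *     extern unsigned char __bss_start[], __bss_end[];
 *     zero_range(__bss_start, __bss_end);
 */
```

This would have to run before any function that reads an uninitialized global or static variable.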
If we look at a complete OBJDUMP with objdump -w -D -mi386 -Maddr16,data16,intel main.elf we would see:
Disassembly of section .text:
00001000 <main>:
1000: e9 01 00 jmp 1004 <hello>
1003: 90 nop
00001004 <hello>:
1004: 66 c3 ret
If you want to convert main.elf to a binary file that you can place in a disk image and read (e.g. via BIOS interrupt 0x13), you can create it this way:
objcopy -O binary main.elf main.bin
If you dump main.bin with NDISASM using ndisasm -b16 -o 0x1000 main.bin you'd see:
00001000 E90100 jmp word 0x1004
00001003 90 nop
00001004 66C3 o32 ret
Cross Compiler
I can't stress this enough: you should consider using a GCC cross compiler. The OSDev Wiki has information on building one. It also has this to say about why:
Why do I need a Cross Compiler?
You need to use a cross-compiler unless you are developing on your own operating system. The compiler must know the correct target platform (CPU, operating system), otherwise you will run into trouble. If you use the compiler that comes with your system, then the compiler won't know it is compiling something else entirely. Some tutorials suggest using your system compiler and passing a lot of problematic options to the compiler. This will certainly give you a lot of problems in the future and the solution is build a cross-compiler.

Change entry point with gnu linker

I have an assembly file with a _start label as the first thing in the .text segment. I would like this label to be the entry point of my application.
Whenever I pass this file together with another file that has a function called main, that main function ends up being the entry point of my application no matter what.
I am using the GNU linker and have tried the -e _start flag, along with changing the input file order. As long as a main function exists, it becomes the entry point. If I rename the main function, it works fine and my _start label becomes the entry point.
EDIT: Seems like it is because of -O2 flag to the compiler.
# as.s
.global _start
_start:
    jmp main
/* main.c */
int main(){
    return 0;
}
gcc -O2 -c as.s -o as.o
gcc -O2 -c main.c -o main.o
ld -e _start as.o main.o -o test
00000000004000b0 <main>:
4000b0: 31 c0 xor %eax,%eax
4000b2: c3 retq
00000000004000b3 <_start>:
4000b3: e9 f8 ff ff ff jmpq 4000b0 <main>
Any ideas?
It appears your question really is How can I place a particular function before all others in the generated executable?
First thing is that doing this only has value in certain circumstances. An ELF executable has the entry point encoded in the ELF header. The placement of the entry point in the executable isn't relevant.
One special circumstance is a non-multiboot-compatible kernel where a custom bootloader loads a kernel that was generated by GCC and converted to binary output. Looking through your question history suggests that bootloader / kernel development might be a possibility for your requirement.
When using GCC you can't assume that the generated code will be in the order you want. As you have found out options (like optimizations) may reorder the functions relative to each other or eliminate some altogether.
One way to put a function first in an ELF executable is to place it into its own section and then create a linker script to position that section first. An example linker script link.ld that should work with C would be:
/* This should be your memory offset (VMA) where the code and data
* will be loaded. In Linux this is 0x400000, multiboot loader is
* 0x100000 etc */
. = 0x400000;
/* Place special section .text.prologue before everything else */
.text : {
/* Output the data sections */
.data : {
.rodata : {
/* The BSS section for uninitialized data */
.bss : {
__bss_start = .;
. = ALIGN(4);
__bss_end = .;
/* Size of the BSS section in case it is needed */
__bss_size = ((__bss_end)-(__bss_start));
/* Remove the note that may be placed before the code by LD */
This script explicitly places whatever is in the section .text.prologue before any other code. We just need to place _start into that section. Your as.s file could be modified to do this:
.global _start
# Start a special section called .text.prologue making it
# allocatable and executable
.section .text.prologue, "ax"
_start:
    jmp main
# All other regular code in the normal .text section
.text
You'd compile, assemble and link them like this:
gcc -O2 -c main.c -o main.o
gcc -O2 -c as.s -o as.o
ld -Tlink.ld main.o as.o -o test
An objdump -D test should show the function _start before main:
test: file format elf32-i386
Disassembly of section .text:
00400000 <_start>:
400000: e9 0b 00 00 00 jmp 400010 <main>
400005: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%eax,%eax,1)
40000c: 00 00 00
40000f: 90 nop
00400010 <main>:
400010: 31 c0 xor %eax,%eax
400012: c3 ret

gcc/ld script ignores the start address of the .text section and adds a lot of junk to my binary

I am trying to build the following really small C program into a raw binary file:
asm ("call sys_main\n" // Immediately run sys_main at start of code
"jmp __asm_loop_halt\n"); // Then halt
int sys_main() {
short *addr = (short*) 0x08b000; // Address start of EGA-VRAM
*addr = 0x0f41; // Write white 'A' on black to screen
Because I am trying to create a raw binary, I use a linker script (386.ld) and build with gcc -std=gnu99 -Os -nostdlib -m32 -march=i386 -ffreestanding -Wl,--nmagic,--script=386.ld -o test test.c. The linker script is:
.text 0x0500 :
.data :
_heap = ALIGN(4);
The script is supposed to tell the linker that the code starts running at 0x500 and that it should only create a binary file. However when I disassemble the binary, I get:
00000000 E802000000 call dword 0x7
00000005 EBFE jmp short 0x5
00000007 55 push ebp
00000008 89E5 mov ebp,esp
0000000A 66C70500B0080041 mov word [dword 0x8b000],0xf41
00000013 5D pop ebp
00000014 C3 ret
00000015 0000 add [eax],al
00000017 001400 add [eax+eax],dl
Apparently the linker still took 0x0 as the start address of the code and also added a bunch of random data after the last meaningful 'ret' instruction, in total 4 times as big as the code.
What is this data, why is it there and what did I do wrong to have my code start at 0x0?
Edit: Thanks to Eugene's tip with the map I discovered that the bytes after the .text section are the .eh_frame section, responsible for exception handling, which can easily be removed by calling gcc with -fno-asynchronous-unwind-tables.

How can I get the _GLOBAL_OFFSET_TABLE_ address in my program?

I want to get the address of _GLOBAL_OFFSET_TABLE_ in my program. One way is to use the nm command in Linux, maybe redirect the output to a file and parse that file to get address of _GLOBAL_OFFSET_TABLE_. However, that method seems to be quite inefficient. What are some more efficient methods of doing it?
This appears to work:
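// test.c
#include <stdio.h>
extern void *_GLOBAL_OFFSET_TABLE_;
int main()
{
    /* Print the address of the linker-defined symbol itself */
    printf("_GLOBAL_OFFSET_TABLE = %p\n", (void *)&_GLOBAL_OFFSET_TABLE_);
    return 0;
}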
In order to get consistent address of _GLOBAL_OFFSET_TABLE_, matching nm's result, you will need to compile your code with -fPIE to do code-gen as if linking into a position-independent executable. (Otherwise you get a small integer like 0x2ed6 with -fno-pie -no-pie). The GCC default for most modern Linux distros is -fPIE -pie, which would make nm addresses be just offsets relative to an image base, and the runtime address be ASLRed. (This is normally good for security, but you may not want it.)
$ gcc -fPIE -no-pie test.c -o test
It gives:
$ ./test
However, nm thinks different:
$ nm test | fgrep GLOBAL
0000000000600868 d _GLOBAL_OFFSET_TABLE_
Or with a GCC too old to know about PIEs at all, let alone have -fPIE -pie as the default, -fpic can work.
If you use assembly language, you can get the address of _GLOBAL_OFFSET_TABLE_ without get_pc_thunk.
It is a tricky way. :)
Here is the sample code:
$ cat test.s
.global main
main:
    lea HEREIS, %eax # Now %eax holds address of _GLOBAL_OFFSET_TABLE_
.section .got
HEREIS:
$ gcc -o test test.s
This works because the .got section is adjacent to .got.plt, so the symbol HEREIS and _GLOBAL_OFFSET_TABLE_ are located at the same address.
PS. You can check that it works with objdump.
Disassembly of section .got:
080495e8 <HEREIS-0x4>:
80495e8: 00 00 add %al,(%eax)
Disassembly of section .got.plt:
80495ec: 00 95 04 08 00 00 add %dl,0x804(%ebp)
80495f2: 00 00 add %al,(%eax)
80495f4: 00 00 add %al,(%eax)
