I'm learning about operating systems from the book Operating Systems from 0 to 1, and I'm trying to display the code in my kernel called main, however the code displayed in GDB is not the same even though I jumped to the address that is the entry point.
bootloader.asm
;*************************************************
; bootloader.asm
; A Simple Bootloader
;*************************************************
bits 16
start: jmp boot
;; constants and variable definitions
msg db "Welcome to My Operating System!", 0ah, 0dh, 0h
boot:
cli ; no interrupts
cld ; all that we need to init
mov ax, 0x0000
;; set buffer
mov es, ax
mov bx, 0x0600
mov al, 1 ; read one sector
mov ch, 0 ; track 0
mov cl, 2 ; sector to read
mov dh, 0 ; head number
mov dl, 0 ; drive number
mov ah, 0x02 ; read sectors from disk
int 0x13 ; call the BIOS routine
jmp 0x0000:0x0600 ; jump and execute the sector!
hlt ; halt the system
; We have to be 512 bytes. Clear the rest of the bytes with 0
times 510 - ($-$$) db 0
dw 0xAA55 ; Boot Signature
readelf -l main
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: Intel 80386
Version: 0x1
Entry point address: 0x600
Start of program headers: 52 (bytes into file)
Start of section headers: 12888 (bytes into file)
Flags: 0x0
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 3
Size of section headers: 40 (bytes)
Number of section headers: 12
Section header string table index: 11
readelf -l main
Elf file type is EXEC (Executable file)
Entry point 0x600
There are 3 program headers, starting at offset 52
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000000 0x00000000 0x00000000 0x00094 0x00094 R 0x4
LOAD 0x000000 0x00000000 0x00000000 0x00094 0x00094 R 0x4
LOAD 0x000100 0x00000600 0x00000600 0x00006 0x00006 R E 0x100
Section to Segment mapping:
Segment Sections...
00
01
02 .text
main.c
void main(){}
objdump -z -M intel -S -D build/os/main
Disassembly of section .text:
00000600 <main>:
void main(){}
600: 55 push ebp
601: 89 e5 mov ebp,esp
603: 90 nop
604: 5d pop ebp
605: c3 ret
But this is GDB's output by setting a breakpoint at main 0x600
0x600 <main> jg 0x647 │
│ 0x602 <main+2> dec esp │
│ 0x603 <main+3> inc esi │
│ 0x604 <main+4> add DWORD PTR [ecx],eax │
why is this happening? Am I loading at the wrong address? How do I find the correct address to load at?
edit:
here is the code for compiling;
nasm -f elf bootloader.asm -F dwarf -g -o ../build/bootloader/bootloader.o
ld -m elf_i386 -T bootloader.lds ../build/bootloader/bootloader.o -o ../build/bootloader/bootloader.o.elf
objcopy -O binary ../build/bootloader/bootloader.o.elf ../build/bootloader/bootloader.o
gcc -ffreestanding -nostdlib -fno-pic -gdwarf-4 -m16 -ggdb3 -c main.c -o ../build/os/main.o
ld -m elf_i386 -nmagic -T os.lds ../build/os/main.o -o ../build/os/main
dd if=/dev/zero of=disk.img bs=512 count=2880
2880+0 records in
2880+0 records out
1474560 bytes (1.5 MB, 1.4 MiB) copied, 0.0150958 s, 97.7 MB/s
dd conv=notrunc if=build/bootloader/bootloader.o of=disk.img bs=512 count=1 seek=0
1+0 records in
1+0 records out
512 bytes copied, 0.000127745 s, 4.0 MB/s
dd conv=notrunc if=build/os/main.o of=disk.img bs=512 count=$((8504/512))
seek=1
16+0 records in
16+0 records out
8192 bytes (8.2 kB, 8.0 KiB) copied, 0.000184251 s, 44.5 MB/s
qemu-system-i386 -machine q35 -fda disk.img -gdb tcp::26000 -S
and gdb code for displaying main code;
set architecture i8086
target remote localhost:26000
b *0x7c00
set disassembly-flavor intel
layout asm
layout reg
symbol-file build/os/main
b main
jg / dec esp / inc esi is the ELF magic number, not machine code! You'll see the same thing from the start of the output of ndisasm -b32 /bin/ls. (ndisasm always treats its input as a flat binary; it doesn't look for any metadata.)
7F 45 4C 46 is the string "ELF" after a 0x7F byte, the ELF magic number that identifies the file format as ELF. It's followed by more ELF header bytes before the actual machine code for main. objdump -D disassembles all ELF sections, but it still parses the ELF headers, not disassembling them like ndisasm does. So you still just end up seeing the code from the .text section because the others are empty (because you linked this executable without libc or CRT startfiles, and with C main as the ELF entry point?!?)
You're jumping to the start of the ELF file as if it was a flat binary. It's not, writing an ELF program loader is not that simple. The ELF program headers (which readelf can parse) tell you which file offset goes at which address. The start of the .text section will be at some offset into the file, not overlapping the ELF magic number for obvious reasons. (Although it can overlap with the ELF header if you can find a way to make it fit: http://www.muppetlabs.com/~breadbox/software/tiny/teensy.html)
Then once you have the file mapped into memory as specified in the program headers, you jump to the ELF entry point address (0x600 in your case). (Which is normally not a function; under a real OS like Linux, you can't ret from the entry point. Instead you need to make an exit system call.) You can't here, either, because you jmp to it instead of call.
This is why _start is separate from main; building a program with a compiler-generated main as its entry point doesn't work.
Of course most of this effort is doomed because you're jumping to your main with the CPU still in 16-bit real mode. But your main is compiled/assembled for 32-bit mode. You could somewhat work around that with gcc -m16 to assemble gcc output for 16-bit mode, using operand-size + address-size prefixes as necessary.
The machine code for that do-nothing main will actually work the in both 16 and 32-bit mode. If you'd used a return 0 without optimization, that wouldn't be the case: the opcode (without prefixes) for mov eax, imm32 implies a different instruction length depending on what mode the CPU decodes it in, so decoding in 16-bit mode would write AX and leave 2 bytes of zeros.
Most likely the easiest thing to do is turn your "kernel" into a flat binary, instead of writing an ELF program loader in your bootloader. Follow an osdev tutorial because lots can go wrong, and you have to be careful about static data for example.
Or see How to make the kernel for my bootloader? for an example bootloader that calls a C function after switching to 32-bit protected mode.
See more links in https://stackoverflow.com/tags/x86/info.
Related
While debugging a small kernel I am writing for fun/learning experience, I encountered a somewhat puzzling issue with gdb where it is apparently not correctly resolving local variable addresses on the stack. My investigation so far suggests that the debugging symbols are correct but somehow gdb still reads from a wrong memory location when displaying the contents of that variable.
The relevant C code in question is:
typedef union
{
uint16_t packed;
struct __attribute__((packed))
{
uint8_t PhysicalLimit;
uint8_t LinearLimit;
} limits;
} MemAddrLimits;
void KernelMain32()
{
ClearScreen();
SimplePrint("kernelMain32");
MemAddrLimits memAddr;
memAddr.packed = GetMemoryAddressLimits();
for (;;) {}
}
where GetMemoryAddressLimits() returns the memory address widths provided by the cpuid instruction in a 2-byte integer (0x3028 currently for my tests). However, when stepping through this function using gdb to print the value of memAddr does not show the right result:
gdb> p memAddr
$1 = {packed = 0, limits = {PhysicalLimit = 0 '\000', LinearLimit = 0 '\000'}}
gdb> info locals
memAddr = {packed = 0, limits = {PhysicalLimit = 0 '\000', LinearLimit = 0 '\000'}}
gdb> info addr memAddr prints Symbol "memAddr" is a variable at frame base reg $ebp offset 8+-18. i.e., memAddr is located at ebp-10 and, indeed, inspecting that address shows the expected content:
gdb> x/hx $ebp-10
0x8ffee: 0x3028
In contrast gdb> p &memAddr gives a value of (MemAddrLimits *) 0x7f6 at which location the memory is zeroed.
When declaring memAddr as a uint16_t instead of my union type these issues do not occur. In that case we get
gdb> info addr memAddr
Symbol "memAddr" is multi-location:
Range 0x8b95-0x8b97: a variable in $eax
.
However, the result is still (also) written to ebp-10, i.e., the disassembly of the function is identical - the only difference is in debug symbols.
Am I missing something here or has someone a good idea on what might be going wrong in this case?
More Details
Program versions and build flags
Using gcc (Ubuntu 9.3.0-10ubuntu2) 9.3.0 and GNU gdb (Ubuntu 9.1-0ubuntu1) 9.1.
Compiling with flags
-ffreestanding -m32 -fcf-protection=none -fno-pie -fno-pic -O0 -gdwarf-2 -fvar-tracking -fvar-tracking-assignments
and linking with
-m elf_i386 -nodefaultlibs -nostartfiles -Ttext 0x7c00 -e start -g
The linking phase produces the kernel.elf which I postprocess to extract the raw executable binary as well as a symbols file to load into gdb. So far, this has been working well for me.
There's obviously more code involved in the binary than what I have shown, most of which written in assembly, which shouldn't be relevant here.
Compiled Files
gcc generates the following code (snippet from objdump -d kernel.elf):
00008b74 <KernelMain32>:
8b74: 55 push ebp
8b75: 89 e5 mov ebp,esp
8b77: 83 ec 18 sub esp,0x18
8b7a: e8 f0 fe ff ff call 8a6f <ClearScreen>
8b7f: 68 41 8c 00 00 push 0x8c41
8b84: e8 7a ff ff ff call 8b03 <SimplePrint>
8b89: 83 c4 04 add esp,0x4
8b8c: e8 0f 00 00 00 call 8ba0 <GetMemoryAddressLimits>
8b91: 66 89 45 f6 mov WORD PTR [ebp-0xa],ax
8b95: eb fe jmp 8b95 <KernelMain32+0x21>
From that we can see that memAddr is indeed located at ebp-10 on the stack, consistent to what gdb> info addr memAddr told us.
Dwarf information (objdump --dwarf kernel.elf):
<1><4ff>: Abbrev Number: 20 (DW_TAG_subprogram)
<500> DW_AT_external : 1
<501> DW_AT_name : (indirect string, offset: 0x23c): KernelMain32
<505> DW_AT_decl_file : 2
<506> DW_AT_decl_line : 79
<507> DW_AT_decl_column : 6
<508> DW_AT_low_pc : 0x8b74
<50c> DW_AT_high_pc : 0x8b97
<510> DW_AT_frame_base : 0x20 (location list)
<514> DW_AT_GNU_all_call_sites: 1
<515> DW_AT_sibling : <0x544>
<2><519>: Abbrev Number: 21 (DW_TAG_variable)
<51a> DW_AT_name : (indirect string, offset: 0x2d6): memAddr
<51e> DW_AT_decl_file : 2
<51f> DW_AT_decl_line : 86
<520> DW_AT_decl_column : 19
<521> DW_AT_type : <0x4f3>
<525> DW_AT_location : 2 byte block: 91 6e (DW_OP_fbreg: -18)
and relevant snippet from objdump --dwarf=loc kernel.elf:
Offset Begin End Expression
00000000 <End of list>
objdump: Warning: There is an overlap [0x8 - 0x0] in .debug_loc section.
00000000 <End of list>
objdump: Warning: There is a hole [0x8 - 0x20] in .debug_loc section.
00000020 00008b74 00008b75 (DW_OP_breg4 (esp): 4)
0000002c 00008b75 00008b77 (DW_OP_breg4 (esp): 8)
00000038 00008b77 00008b97 (DW_OP_breg5 (ebp): 8)
00000044 <End of list>
[...]
These all seem to be what I'd expect. (I'm not sure if the warnings in the last one have significance, though).
Additional Note
If I change compilation flag -gdwarf-2 to just -g I get
gdb> p &memAddr
$1 = (MemAddrLimits *) 0x8ffde
gdb> info addr memAddr
Symbol "memAddr" is a complex DWARF expression:
0: DW_OP_fbreg -18
.
gdb> p memAddr
$2 = {packed = 0, limits = {PhysicalLimit = 0 '\000', LinearLimit = 0 '\000'}}
gdb> p/x $ebp-10
$3 = 0x8ffee
So memAddr is still not resolved correctly but p &memAddr at least is in the stack frame and not somewhere completely different. However, info addr memAddr seems to have problems now...
After some more investigation, I have tracked this to being due to remote debugging 32-bit code (my kernel not yet having switched to long mode) on a x86-64 qemu emulated system.
If I debug the same code with qemu-system-i386 everything works just as it should.
I am using Windows 10 and Windows subsystem for linux.
I have started creating my own Operating System in Assembly and C.
I am following a tutorial.
I got stuck into 2 problems.
Error 1:
When I Link and create bin files, i am getting a warning:
"ld: warning: cannot find entry symbol _start; defaulting to 0000000000001000"
Does this matter?
Error 2:
After compiling my code, there was no error. But when i boot my operating system, it shows an error: Disk read error!
Please help me.
Boot.asm
[org 0x7c00]
KERNEL_OFFSET equ 0x1000
mov [BOOT_DRIVE], dl
mov bp, 0x9000
mov sp, bp
mov si, MSG_REAL_MODE
call print
call load_kernel
call switch_to_pm
jmp $
%include "printstr.asm"
%include "diskload.asm"
[bits 16]
load_kernel :
mov si, MSG_LOAD_KERNEL
call print
mov bx, KERNEL_OFFSET
mov dh, 15
mov dl, [ BOOT_DRIVE ]
call disk_load
ret
[bits 32]
BEGIN_PM :
mov ebx, MSG_PROT_MODE
call print_string_pm
call KERNEL_OFFSET
jmp $
BOOT_DRIVE db 0
MSG_REAL_MODE db " Started in 16 - bit Real Mode " , 0
MSG_PROT_MODE db " Successfully landed in 32 - bit Protected Mode " , 0
MSG_LOAD_KERNEL db " Loading kernel into memory. " , 0
times 510 -( $ - $$ ) db 0
dw 0xaa55
diskload.asm
disk_load :
push dx
mov ah , 0x02
mov al , dh ;
mov ch , 0x00
mov dh , 0x00
mov cl , 0x02
int 0x13
jc disk_error
pop dx
cmp dh , al
jne disk_error
ret
disk_error :
mov si , DISK_ERROR_MSG
call prints
jmp $
; Variables
DISK_ERROR_MSG db " Disk read error !" , 0
prints:
lodsb
or al, al
jz printdones
mov ah, 0eh
int 10h
jmp prints
printdones:
ret
Compiling commands:
nasm boot.asm -f bin -o boot.bin
nasm kernel_entry.asm -f elf64 -o kernel_entry.o
gcc -ffreestanding -c kernel.c -o kernel.o
ld -o kernel.bin -Ttext 0x1000 kernel_entry.o kernel.o --oformat binary
cat boot.bin kernel.bin > os.iso
ld: warning: cannot find entry symbol _start; defaulting to 0000000000001000
The effect of this error will be that the entry point address stored in the executable file will not be correct.
However, "raw" binary files only containing some memory content; they don't contain any additional information - such as the entry point - as an ELF or COFF file would do.
In other words: For "raw" binary files (--oformat binary) this warning message has no meaning at all.
But when i boot my operating system, it shows an error: Disk read error!
I'm not sure, but there are two possible errors:
Are you sure that the ES register contains the correct value?
(If you want to load your kernel into absolute address 0x1000, you have to set ES to 0.)
Many drive types don't support reading too many sectors at once.
(However a real 1440K drive should support reading 15 sectors starting at sector #2.)
I had this problem too , the disk read error may be caused by your final os image being too small for 15 sectors read.
if you use qemu , try to resize your os image with :
qemu-img resize os.iso +20K
this is going to resize your image to 20 KB
(you can put your own value instead of 20K).
I think qemu considers the os image as the whole disk , so you should resize it accordingly .
I've seen in some posts/videos/files that they are zero-padded to look bigger than they are, or match "same file size" criteria some file system utilities have for moving files, mostly they are either prank programs, or malware.
But I often wondered, what would happen if the file corrupted, and would "load" the next set of "instructions" that are in the big zero-padded space at the end of the file?
Would anything happen? What's the instruction set for 0x0?
The decoding of 0 bytes completely depends on the CPU architecture. On many architectures, instruction are fixed length (for example 32-bit), so the relevant thing would be 00 00 00 00 (using hexdump notation).
On most Linux distros, clang/llvm comes with support for multiple target architectures built-in (clang -target and llvm-objdump), unlike gcc / gas / binutils, so I was able to use that to check for some architectures I didn't have cross-gcc / binutils installed for. Use llvm-objdump --version to see the supported list. (But I didn't figure out how to get it to disassemble a raw binary like binutils objdump -b binary, and my clang won't create SPARC binaries on its own.)
On x86, 00 00 (2 bytes) decodes (http://ref.x86asm.net/coder32.html) as an 8-bit add with a memory destination. The first byte is the opcode, the 2nd byte is the ModR/M that specifies the operands.
This usually segfaults right away (if eax/rax isn't a valid pointer), or segfaults once execution falls off the end of the zero-padded part into an unmapped page. (This happens in real life because of bugs like falling off the end of _start without making an exit system call), although in those cases the following bytes aren't always all zero. e.g. data, or ELF metadata.)
x86 64-bit mode: ndisasm -b64 /dev/zero | head:
address machine code disassembly
00000000 0000 add [rax],al
x86 32-bit mode (-b32):
00000000 0000 add [eax],al
x86 16-bit mode: (-b16):
00000000 0000 add [bx+si],al
AArch32 ARM mode: cd /tmp && dd if=/dev/zero of=zero bs=16 count=1 && arm-none-eabi-objdump -z -D -b binary -marm zero. (Without -z, objdump skips over large blocks of all-zero and shows ...)
addr machine code disassembly
0: 00000000 andeq r0, r0, r0
ARM Thumb/Thumb2: arm-none-eabi-objdump -z -D -b binary -marm --disassembler-options=force-thumb zero
0: 0000 movs r0, r0
2: 0000 movs r0, r0
AArch64: aarch64-linux-gnu-objdump -z -D -b binary -maarch64 zero
0: 00000000 .inst 0x00000000 ; undefined
MIPS32: echo .long 0 > zero.S && clang -c -target mips zero.S && llvm-objdump -d zero.o
zero.o: file format ELF32-mips
Disassembly of section .text:
0: 00 00 00 00 nop
PowerPC 32 and 64-bit: -target powerpc and -target powerpc64. IDK if any extensions to PowerPC use the 00 00 00 00 instruction encoding for anything, or if it's still an illegal instruction on modern IBM POWER chips.
zero.o: file format ELF32-ppc (or ELF64-ppc64)
Disassembly of section .text:
0: 00 00 00 00 <unknown>
IBM S390: clang -c -target systemz zero.S
zero.o: file format ELF64-s390
Disassembly of section .text:
0: 00 00 <unknown>
2: 00 00 <unknown>
I am writing a bootloader as follows:
bits 16
[org 0x7c00]
KERN_OFFSET equ 0x1000
mov [BOOTDISK], dl
mov dl, 0x0 ;0 is for floppy-disk
mov ah, 0x2 ;Read function for the interrupt
mov al, 0x15 ;Read 15 sectors conating kernel
mov ch, 0x0 ;Use cylinder 0
mov cl, 0x2 ;Start from the second sector which contains kernel
mov dh, 0x0 ;Read head 0
mov bx, KERN_OFFSET
int 0x13
jc disk_error
cmp al, 0x15
jne disk_error
jmp KERN_OFFSET:0x0
jmp $
disk_error:
jmp $
BOOTDISK: db 0
times 510-($-$$) db 0
dw 0xaa55
The kernel is a simple C program which prints "e" on the VGA display (seen on QEmu):
void main()
{
extern void put_in_mem();
char c = 'e';
put_in_mem(c, 0xA0);
}
I am using this code in 16 bit (real mode) in QEmu so I am using the compiler bcc for this code using:
bcc -ansi -c -o kernel.o kernel.c
I have the following questions:
1. When I try to disassemble this code, using
objdump -D -b binary -mi386 kernel.o
I get an output like this (only initial portion of output):
kernel.o: file format binary
Disassembly of section .data:
00000000 <.data>:
0: a3 86 01 00 2a mov %eax,0x2a000186
5: 3e 00 00 add %al,%ds:(%eax)
8: 00 22 add %ah,(%edx)
a: 00 00 add %al,(%eax)
c: 00 19 add %bl,(%ecx)
e: 00 00 add %al,(%eax)
10: 00 55 55 add %dl,0x55(%ebp)
13: 55 push %ebp
14: 55 push %ebp
15: 00 00 add %al,(%eax)
17: 00 02 add %al,(%edx)
19: 22 00 and (%eax),%al
This output does not seem to correspond to the kernel.c file I made. For example I could not see where 'e' is stored as ASCII 0x65 or where is the call to put_in_mem made. Is something wrong with the way I am disassembling the code?
To make the object file of the kernel for QEmu I used the following command:
ld86 -o kernel -d kernel.o put_in_mem.o
Here put_in_mem.o is the object file created after assembling the put_in_mem.asm file which contains the definition of the function put_in_mem() used in kernel.c.
Then floppy image for QEmu is made using:
cat boot.o kernel > floppy_img
But when I try to look at the address 0x10000 (using GDB), where the kernel was supposed to be present after loading (using the boot.asm program), it was not present.
Why is this happening?
Further, in ld command we used -Ttext option to specify the load address of the binary, should we use some similar option here with ld86?
Your kernel.o is in an object file format not understood by objdump so it tries to disassemble everything in it, including headers and whatnot. Try to disassemble the linked output kernel instead. Also objdump might not understand 16 bit code. Better try objdump86 if you have that available.
As to why it's not present: you are looking at the wrong place. You are loading it to offset 0x1000 (3 zeroes) but you are looking at 0x10000 (4 zeroes). Also note that you don't set up ES which is bad practice. Maybe you intended to set ES to 0x1000 and BX to 0x0000 and then you would find your kernel at 0x10000 physical address.
The -Ttext doesn't influence loading, it only specifies where the code expects to find itself.
I am in the process of writing a small operating system in C. I have written a bootloader and I'm now trying to get a simple C file (the "kernel") to compile with gcc:
int main(void) { return 0; }
I compile the file with the following command:
gcc kernel.c -o kernel.o -nostdlib -nostartfiles
I use the linker to create the final image using this command:
ld kernel.o -o kernel.bin -T linker.ld --oformat=binary
The contents of the linker.ld file are as follows:
SECTIONS
{
. = 0x7e00;
.text ALIGN (0x00) :
{
*(.text)
}
}
(The bootloader loads the image at address 0x7e00.)
This seems to work quite well - ld produces a 128-byte file containing the following instructions in the first 11 bytes:
00000000 55 push ebp
00000001 48 dec eax
00000002 89 E5 mov ebp, esp
00000004 B8 00 00 00 00 mov eax, 0x00000000
00000009 5D pop ebp
0000000A C3 ret
However, I can't figure out what the other 117 bytes are for. Disassembling them seems to produce a bunch of garbage that doesn't make any sense. The existence of the additional bytes has me wondering if I'm doing something wrong.
Should I be concerned?
These are additional sections, which were not stripped and not discarded. You want your linker.ld file to look like this:
SECTIONS
{
. = 0x7e00;
.text ALIGN (0x00) :
{
*(.text)
}
/DISCARD/ :
{
*(.comment)
*(.eh_frame_hdr)
*(.eh_frame)
}
}
I know what sections to discard from the output of objdump -t kernel.o.
Simple, you're using gcc, and it always put its initialization code before passing control to your main.
What's on that start up code I don't know, but they are there. As you may see there's also an comment 'GNU' on your binary, you can't print specific sectors by using objdump -s -j 'section name'.