Okay, so I've been trying to make a two-step bootloader in assembly/C but I haven't been able to get the JMP working. At first I thought the read was failing, but, after the following test I ruled that out:
__asm__ __volatile__(
"xorw %ax, %ax;"
"movw %ax, %ds;"
"movw %ax, %es;"
"movb $0x02, %ah;"
"movb $0x01, %al;"
"movw $0x7E00, %bx;"
"movw $0x0003, %cx;"
"xorb %dh, %dh;"
"int $0x13;"
"movb 0x7E00, %al;"
"movb $0x0e, %ah;"
"int $0x10;"
//"jmp 0x7E00"
);
This printed 'f' as expected (the first byte of the sector is 0x66 which is the ASCII code for 'f') proving that the read is successful and the jmp is the problem. This is my code:
__asm__(".code16\n");
__asm__(".code16gcc\n");
__asm__("jmpl $0x0000, $main\n");
void main(){
__asm__ __volatile__(
"xorw %ax, %ax;"
"movw %ax, %ds;"
"movw %ax, %es;"
"movb $0x02, %ah;"
"movb $0x01, %al;"
"movw $0x7E00, %bx;"
"movw $0x0003, %cx;"
"xorb %dh, %dh;"
"int $0x13;"
"jmp $0x200;"
);
}
When run, my program simply hangs, this means that the program is probably jumping to the wrong location in memory. By the way, I am obviously running this in real mode under VMWare player. I am compiling this with the following commands:
gcc -c -0s -march=i686 -ffreestanding -Wall -Werror boot.c -o boot.o
ld -static -Ttest.ld -nostdlib --nmagic -o boot.elf boot.o --no-check-sections
objcopy -0 binary boot.elf boot.bin
and this is test.ld:
ENTRY(main);
SECTIONS
{
. = 0x7C00;
.text : AT(0x7C00)
{
*(.test);
}
.sig : AT(0x7DFE)
{
SHORT(0xAA55);
}
}
Note: I have confirmed this is not a problem with the inline asm - I have tried a pure assembly implementation too with the same results - the only reason I am using C is because I plan on expanding this a bit and I am much more comfortable with C loops and functions...
EDIT: I've uploaded the first three sectors of my floppy drive here
EDIT 2: I have been unable to get my boot loader working using any of the suggestions and, based on advice from #RossRidge I have written an assembly version of the same program and a simple assembly program to echo input. Sadly these aren't working either..
Bootloader:
org 0x7c00
xor ax, ax
mov ds, ax
mov es, ax
mov ah, 0x02
mov al, 0x01
mov bx, 0x7E00
mov cx, 0x0003
xor dh, dh
int 0x13
jmp 0x7E00
Program in sector 3:
xor ax, ax
int 0x16
mov ah, 0xe
int 0x10
These are both compiled with: nasm Linux/boot.S -o Linux/asm.bin and behave the same as their C counterparts..
I am pretty sure your assembler is generating the wrong jump offset, it's probably interpreting the 0x7e00 as relative in the current text section, but that's mapped at 0x7c00 by the linker script so your jump probably goes to 0x7c00+0x7e00=0xfa00 instead. As an ugly workaround you could try jmp 0x200 instead or use an indirect jump to hide it from your tools such as mov $0x7e00, %ax; jmp *%ax. Alternatively jmp .text+0x200 could also work.
Also note that writing 16 bit asm in a 32 bit C compiler will generate wrong code, which is also hinted by the first byte of the loaded sector being 0x66 which is the operand size override prefix. Since the compiler assumes 32 bit mode, it will generate your 16 bit code with prefixes, which when run in 16 bit mode will switch back to 32 bits. In general, it's not a good idea to abuse inline asm for such purpose, you should use a separate asm file and make sure you tell your assembler that the code is intended for 16 bit real mode.
PS: learn to use a debugger.
Update: The following code when assembled using nasm boot.asm -o boot.bin works fine in qemu and bochs:
org 0x7c00
xor ax, ax
mov ds, ax
mov es, ax
mov ah, 0x02
mov al, 0x01
mov bx, 0x7E00
mov cx, 0x0003
xor dh, dh
int 0x13
jmp 0x7E00
times 510-($-$$) db 0
dw 0xaa55
; second sector
times 512 db 0
; Program in sector 3:
xor ax, ax
int 0x16
mov ah, 0xe
int 0x10
jmp $
times 1536-($-$$) db 0
You can download an image here (limited offer ;)).
I have written many boot loaders.
There needs to be two(2) separate executables,
1) the boot loader
2) the main code
so the loaded code (the main program) can be updated without doing anything to the boot loader.
The boot loader, when done needs to jump to a known location.
The main program needs to place a jump instruction at the known location, that jumps to a function in the librt library. (usually start())
start() will set I/O, heap, etc and call main()
Both the linker command file for the boot loader and the linker command file for the main program must agreed on the address of the known location
The known location is usually the only thing in a unique section of both linker command files.
The boot loader must be smart enough to place all the parts of the main program in the right areas in memory.
It is best if the main program is in motorola S1 (or similar) format rather than a raw .elf or .coff format; otherwise the boot loader can get very big very quickly.
Related
is there a way to push this line to asm from C from GCC
mov ah,1h
int 21h
I can't find a way to set the AH register to 1h
asm("mov %ah,1h");
asm("int 21h");
1h means 1 in hexadecimal number. You can use $0x1 to express that. ($ is required for integer literals in GCC assembly language and 0x is marking the number as hexadecimal).
Also note that in GCC assembly language, the destination of mov instruction (and other instructions with two operands) should be the 2nd operand.
asm("mov $0x1, %ah");
asm("int $0x21");
One more note is that if you want to make sure that %ah is 0x1 when the int is executed, the two lines should be put into one asm statement not to let the compiler put other instructions between them.
asm(
"mov $0x1, %ah\n\t"
"int $0x21"
);
I have been trying to compile this C program to assembly but it hasn't been working fine.
I am reading
Dennis Yurichev Reverse Engineering for Beginner but I am not getting the same output. Its a simple hello world statement. I am trying to get the 32 bit output
#include <stdio.h>
int main()
{
printf("hello, world\n");
return 0;
}
Here is what the book says the output should be
main proc near
var_10 = dword ptr -10h
push ebp
mov ebp, esp
and esp, 0FFFFFFF0h
sub esp, 10h
mov eax, offset aHelloWorld ; "hello, world\n"
mov [esp+10h+var_10], eax
call _printf
mov eax, 0
leave
retn
main endp
Here are the steps;
Compile the print statement as a 32bit (I am currently running a 64bit pc)
gcc -m32 hello_world.c -o hello_world
Use gdb to disassemble
gdb file
set disassembly-flavor intel
set architecture i386:intel
disassemble main
And i get;
lea ecx,[esp+0x4]
and esp,0xfffffff0
push DWORD PTR [ecx-0x4]
push ebp
mov ebp,esp
push ebx
push ecx
call 0x565561d5 <__x86.get_pc_thunk.ax>
add eax,0x2e53
sub esp,0xc
lea edx,[eax-0x1ff8]
push edx
mov ebx,eax
call 0x56556030 <puts#plt>
add esp,0x10
mov eax,0x0
lea esp,[ebp-0x8]
pop ecx
pop ebx
pop ebp
lea esp,[ecx-0x4]
ret
I have also used
objdump -D -M i386,intel hello_world> hello_world.txt
ndisasm -b32 hello_world > hello_world.txt
But none of those are working either. I just cant figure out what's wrong. I need some help. Looking at you Peter Cordes ^^
The output from the book looks like MSVC, not GCC. GCC will definitely not ever emit main proc because that's MASM syntax, not valid GAS syntax. And it won't do stuff like var_10 = dword ptr -10h.
(And even if it did, you wouldn't see assemble-time constant definitions in disassembly, only in the compiler's asm output which is what the book suggested you look at. gcc -S -masm=intel output. How to remove "noise" from GCC/clang assembly output?)
So there are lots of differences because you're using a different compiler. Even modern versions of MSVC (on the Godbolt compiler explorer) make somewhat different asm, for example not bothering to align ESP by 16, perhaps because more modern Windows versions, or CRT startup code, already does that?
Also, your GCC is making PIE executables by default, so use -fno-pie -no-pie. 32-bit PIE sucks for efficiency and for ease of understanding. See How do i get rid of call __x86.get_pc_thunk.ax. (Also 32-bit absolute addresses no longer allowed in x86-64 Linux? for more about PIE executables, mostly focused on 64-bit code)
The extra clunky stack-alignment in main's prologue is something that GCC8 optimized for functions that don't also need alloca. But it seems even current GCC10 emits the full un-optimized version when you don't enable optimization :(.
Why is gcc generating an extra return address? and Trying to understand gcc's complicated stack-alignment at the top of main that copies the return address
Optimizing printf to puts: see How to get the gcc compiler to not optimize a standard library function call like printf? and -O2 optimizes printf("%s\n", str) to puts(str). gcc -fno-builtin-printf would be one way to make that not happen, or just get used to it. GCC does a few optimizations even at -O0 that other compilers only do at higher optimization levels.
MSVC 19.10 compiles your function like this (on the Godbolt compiler explorer) with optimization disabled (the default, no compiler options).
_main PROC
push ebp
mov ebp, esp
push OFFSET $SG4501
call _printf
add esp, 4
xor eax, eax
pop ebp
ret 0
_main ENDP
_DATA SEGMENT
$SG4501 DB 'hello, world', 0aH, 00H
GCC10.2 still uses an over-complicated stack alignment dance in the prologue.
.LC0:
.string "hello, world"
main:
lea ecx, [esp+4]
and esp, -16
push DWORD PTR [ecx-4]
push ebp
mov ebp, esp
push ecx
sub esp, 4
# end of function prologue, I think.
sub esp, 12 # make sure arg will be 16-byte aligned
push OFFSET FLAT:.LC0 # push a pointer
call puts
add esp, 16 # pop the arg-passing space
mov eax, 0 # return 0
mov ecx, DWORD PTR [ebp-4] # undo stack alignment.
leave
lea esp, [ecx-4]
ret
Yes, this is super inefficient. If you called your function anything other than main, it would already assume ESP was aligned by 16 on function entry:
# GCC10.2 -m32 -O0
.LC0:
.string "hello, world"
foo:
push ebp
mov ebp, esp
sub esp, 8 # reach a 16-byte boundary, assuming ESP%16 = 12 on entry
#
sub esp, 12
push OFFSET FLAT:.LC0
call puts
add esp, 16
mov eax, 0
leave
ret
So it still doesn't combine the two sub instructions, but you did tell it not to optimize so braindead code is expected. See Why does clang produce inefficient asm with -O0 (for this simple floating point sum)? for example.
My GCC will very eagerly swap a call to printf to puts! I did not manage to find the command line options that would make the compiler to not do this. I.e. the program has the same external behaviour but the machine code is that of
#include <stdio.h>
int main(void)
{
puts("hello, world");
}
Thus, you'll have really hard time trying to get the exact same assembly as in the book, as the assembly from that book has a call to printf instead of puts!
First of all you compile not decompile.
You get a lots of noise as you compile without the optimizations. If you compile with optimizations you will get much smaller code almost identical with the one you have (to prevent change from printf to puts you need to remove the '\n' https://godbolt.org/z/cs4qe9):
.LC0:
.string "hello, world"
main:
lea ecx, [esp+4]
and esp, -16
push DWORD PTR [ecx-4]
push ebp
mov ebp, esp
push ecx
sub esp, 16
push OFFSET FLAT:.LC0
call puts
mov ecx, DWORD PTR [ebp-4]
add esp, 16
xor eax, eax
leave
lea esp, [ecx-4]
ret
https://godbolt.org/z/xMqo33
I am using Windows 10 and Windows subsystem for linux.
I have started creating my own Operating System in Assembly and C.
I am following a tutorial.
I got stuck into 2 problems.
Error 1:
When I Link and create bin files, i am getting a warning:
"ld: warning: cannot find entry symbol _start; defaulting to 0000000000001000"
Does this matter?
Error 2:
After compiling my code, there was no error. But when i boot my operating system, it shows an error: Disk read error!
Please help me.
Boot.asm
[org 0x7c00]
KERNEL_OFFSET equ 0x1000
mov [BOOT_DRIVE], dl
mov bp, 0x9000
mov sp, bp
mov si, MSG_REAL_MODE
call print
call load_kernel
call switch_to_pm
jmp $
%include "printstr.asm"
%include "diskload.asm"
[bits 16]
load_kernel :
mov si, MSG_LOAD_KERNEL
call print
mov bx, KERNEL_OFFSET
mov dh, 15
mov dl, [ BOOT_DRIVE ]
call disk_load
ret
[bits 32]
BEGIN_PM :
mov ebx, MSG_PROT_MODE
call print_string_pm
call KERNEL_OFFSET
jmp $
BOOT_DRIVE db 0
MSG_REAL_MODE db " Started in 16 - bit Real Mode " , 0
MSG_PROT_MODE db " Successfully landed in 32 - bit Protected Mode " , 0
MSG_LOAD_KERNEL db " Loading kernel into memory. " , 0
times 510 -( $ - $$ ) db 0
dw 0xaa55
diskload.asm
disk_load :
push dx
mov ah , 0x02
mov al , dh ;
mov ch , 0x00
mov dh , 0x00
mov cl , 0x02
int 0x13
jc disk_error
pop dx
cmp dh , al
jne disk_error
ret
disk_error :
mov si , DISK_ERROR_MSG
call prints
jmp $
; Variables
DISK_ERROR_MSG db " Disk read error !" , 0
prints:
lodsb
or al, al
jz printdones
mov ah, 0eh
int 10h
jmp prints
printdones:
ret
Compiling commands:
nasm boot.asm -f bin -o boot.bin
nasm kernel_entry.asm -f elf64 -o kernel_entry.o
gcc -ffreestanding -c kernel.c -o kernel.o
ld -o kernel.bin -Ttext 0x1000 kernel_entry.o kernel.o --oformat binary
cat boot.bin kernel.bin > os.iso
ld: warning: cannot find entry symbol _start; defaulting to 0000000000001000
The effect of this error will be that the entry point address stored in the executable file will not be correct.
However, "raw" binary files only containing some memory content; they don't contain any additional information - such as the entry point - as an ELF or COFF file would do.
In other words: For "raw" binary files (--oformat binary) this warning message has no meaning at all.
But when i boot my operating system, it shows an error: Disk read error!
I'm not sure, but there are two possible errors:
Are you sure that the ES register contains the correct value?
(If you want to load your kernel into absolute address 0x1000, you have to set ES to 0.)
Many drive types don't support reading too many sectors at once.
(However a real 1440K drive should support reading 15 sectors starting at sector #2.)
I had this problem too , the disk read error may be caused by your final os image being too small for 15 sectors read.
if you use qemu , try to resize your os image with :
qemu-img resize os.iso +20K
this is going to resize your image to 20 KB
(you can put your own value instead of 20K).
I think qemu considers the os image as the whole disk , so you should resize it accordingly .
I'm trying to call a golang function from my C code. Golang does not use the standard x86_64 calling convention, so I have to resort to implementing the transition myself. As gcc does not want to mix cdecl with the x86_64 convention,
I'm trying to call the function using inline assembly:
void go_func(struct go_String filename, void* key, int error){
void* f_address = (void*)SAVEECDSA;
asm volatile(" sub rsp, 0xe0; \t\n\
mov [rsp+0xe0], rbp; \t\n\
mov [rsp], %0; \t\n\
mov [rsp+0x8], %1; \t\n\
mov [rsp+0x18], %2; \t\n\
call %3; \t\n\
mov rbp, [rsp+0xe0]; \t\n\
add rsp, 0xe0;"
:
: "g"(filename.str), "g"(filename.len), "g"(key), "g"(f_address)
: );
return;
}
Sadly the compiler always throws an error at me that I dont understand:
./code.c:241: Error: too many memory references for `mov'
This corresponds to this line: mov [rsp+0x18], %2; \t\n\ If I delete it, the compilation works. I don't understand what my mistake is...
I'm compiling with the -masm=intel flag so I use Intel syntax. Can someone please help me?
A "g" constraint allows the compiler to pick memory or register, so obviously you'll end up with mov mem,mem if that happens. mov can have at most 1 memory operand. (Like all x86 instructions, at most one explicit memory operand is possible.)
Use "ri" constraints for the inputs that will be moved to a memory destination, to allow register or immediate but not memory.
Also, you're modifying RSP so you can't safely use memory source operands. The compiler is going to assume it can use addressing modes like [rsp+16] or [rsp-4]. So you can't use push instead of mov.
You also need to declare clobbers on all the call-clobbered registers, because the function call will do that. (Or better, maybe ask for the inputs in those call-clobbered registers so the compiler doesn't have to bounce them through call-preserved regs like RBX. But you need to make those operands read/write or declare separate output operands for the same registers to let the compiler know they'll be modified.)
So probably your best bet for efficiency is something like
int ecx, edx, edi, esi; // dummy outputs as clobbers
register int r8 asm("r8d"); // for all the call-clobbered regs in the calling convention
register int r9 asm("r9d");
register int r10 asm("r10d");
register int r11 asm("r11d");
// These are the regs for x86-64 System V.
// **I don't know what Go actually clobbers.**
asm("sub rsp, 0xe0\n\t" // adjust as necessary to align the stack before a call
// "push args in reverse order"
"push %[fn_len] \n\t"
"push %[fn_str] \n\t"
"call \n\t"
"add rsp, 0xe0 + 3*8 \n\t" // pop red-zone skip space + pushed args
// real output in RAX, and dummy outputs in call-clobbered regs
: "=a"(retval), "=c"(ecx), "=d"(edx), "=D"(edi), "=S"(esi), "=r"(r8), "=r"(r9), "=r"(r10), "=r"(r11)
: [fn_str] "ri" (filename.str), [fn_len] "ri" (filename.len), etc. // inputs can use the same regs as dummy outputs
: "xmm0", "xmm1", "xmm2", "xmm3", "xmm4", "xmm5", "xmm6", "xmm7", // All vector regs are call-clobbered
"xmm8", "xmm9", "xmm10", "xmm11", "xmm12", "xmm13", "xmm14", "xmm15",
"memory" // if you're passing any pointers (even read-only), or the function accesses any globals,
// best to make this a compiler memory barrier
);
Notice that the output are not early-clobber, so the compiler can (at its option) use those registers for inputs, but we're not forcing it so the compiler is still free to use some other register or an immediate.
Upon further discussion, Go functions don't clobber RBP, so there's no reason to save/restore it manually. The only reason you might have wanted to is that locals might use RBP-relative addressing modes, and older GCC made it an error to declare a clobber on RBP when compiling without -fomit-frame-pointer. (I think. Or maybe I'm thinking of EBX in 32-bit PIC code.)
Also, if you're using the x86-64 System V ABI, beware that inline asm must not clobber the red-zone. The compiler assumes that doesn't happen and there's no way to declare a clobber on the red zone or even set -mno-redzone on a per-function basis. So you probably need to sub rsp, 128 + 0xe0. Or 0xe0 already includes enough space to skip the red-zone if that's not part of the callee's args.
The original poster added this solution as an edit to their question:
If someone ever finds this, the accepted answer does not help you when you try to call golang code with inline asm! The accepted answer only helps with my initial problem, which helped me to fix the golangcall. Use something like this:**
void* __cdecl go_call(void* func, __int64 p1, __int64 p2, __int64 p3, __int64 p4){
void* ret;
asm volatile(" sub rsp, 0x28; \t\n\
mov [rsp], %[p1]; \t\n\
mov [rsp+0x8], %[p2]; \t\n\
mov [rsp+0x10], %[p3]; \t\n\
mov [rsp+0x18], %[p4]; \t\n\
call %[func_addr]; \t\n\
add rsp, 0x28; "
:
: [p1] "ri"(p1), [p2] "ri"(p2),
[p3] "ri"(p3), [p4] "ri"(p4), [func_addr] "ri"(func)
: );
return ret;
}
I am following a tutorial about creating a 32bit operating system. I have managed to switch to the 32bit protected mode happily but could not get further.
Below, boot_sect.asm is the main assembly code. After switching to protected mode inside boot_sect.asm, an external C function (kmain) is called via kernel_entry.asm (calls the c function) from boot_sect.asm. The location of the C function in the ram is 0x1000000.
The kmain() is supposed to print an x on the screen. However, this does not happen. Indeed, to diagnose the code, I put an infinite while loop inside the kmain(), but the program did not hang there; instead it directly ended. I guess the files are not linked properly or there is a memory overlap problem. I could not get any further, and I am stuck at this point.
I would appreciate any idea or help to get over this step.
boot_sect.asm (the main asm code)
; A boot sector that boots a C kernel in 32-bit protected mode
[bits 16]
[org 0x7c00]
KERNEL_OFFSET equ 0x100000 ; This is the memory offset to where kernel will be loaded
mov [BOOT_DRIVE], dl
mov bp, 0x9000 ; stack origin.
mov sp, bp
mov bx, MSG_REAL_MODE ;print that we are starting
call print_string ; booting from 16-bit real mode
call load_kernel ; Load our kernel
call switch_to_pm ; Switch to protected mode, from which
; we will not return
jmp $
; Include our useful , hard-earned routines
%include "print/print_string.asm"
%include "disk/disk_load.asm"
%include "pm/gdt.asm"
%include "pm/print_string_pm.asm"
%include "pm/switch_to_pm.asm"
[bits 16]
; load_kernel
load_kernel:
mov bx, MSG_LOAD_KERNEL ; Print a message to say we are loading the kernel
call print_string
mov bx, KERNEL_OFFSET ; Set-up parameters for our disk_load routine , so
mov dh, 15 ; that we load the first 15 sectors (excluding
mov dl, [BOOT_DRIVE] ; the boot sector) from the boot disk (i.e. our
call disk_load ; kernel code) to address KERNEL_OFFSET
ret
[bits 32]
BEGIN_PM:
mov ebx, MSG_PROT_MODE ; Use 32-bit print routine to
call print_string_pm ; announce its in protected mode
;the program manages to come here perfectly.
;here the kmain() function is called via kernel_entry.asm
;but after this call, the program gets "lost", it ends somehow.
call KERNEL_OFFSET
;it cant even get here, it ends before getting here. Probably the kmain() is not located at KERNEL_OFFSET due to a problem.
jmp $ ;Hang.
; Global variables
BOOT_DRIVE db 0
MSG_REAL_MODE db "Started in 16-bit Real Mode", 0
MSG_PROT_MODE db "Successfully landed in 32-bit Protected Mode", 0
MSG_LOAD_KERNEL db "Loading kernel into memory.", 0
; Bootsector padding
times 510-($-$$) db 0
dw 0xaa55
kernel.c (c code)
void kmain()
{
char* screen = (char*)0xb8000;
screen[0] = 'X';
}
kernel_entry.asm (calls the c code)
[bits 32]
[extern _kmain]
call _kmain
jmp $
Finally, to obtain the os-image , I used the following commands
kernel.c and kernel_entry.asm are converted to object files (nasm and gcc commands), then they are linked to create kernel.bin file (by using ld command and the address 0x100000 of the kmain()).
Then kernel.bin and boot_sect.bin are combined using type command to obtain an os image.
gcc -m32 -ffreestanding -c kernel.c -o kernel.o
nasm kernel_entry.asm -f elf -o kernel_entry.o
ld -m i386pe -T NUL -o kernel.bin -Ttext-segment 0x100000 kernel_entry.o kernel.o
nasm boot_sect.asm -f bin -o boot_sect.bin
type boot_sect.bin kernel.bin > os-image