From bootsector to C++ kernel - c

I decided to write a simple asm bootloader and a C++ kernel. I have read a lot of tutorials, but I can't assemble a file that looks like this:
[BITS 32]
[global start]
[extern _k_main]
start:
call _k_main
cli
hlt
(I would like to call the k_main function from a C file.)
Compile/assemble/linking errors:
nasm -f bin -o kernelstart.asm -o kernelstart.bin:
error: bin file cannot contain external references
Okay, then I tried to create a .o file:
nasm -f aout -o kernelstart.asm -o kernelstart.o (That's right)
ld -i -e _main -Ttext 0x1000 kernel.o kernelstart.o main.o
error: File format not recognized
Could someone please give me a working example or explain how to compile this? :/
(I have been browsing tutorials and help pages for 2 days but cannot find the right answer.)

I don't have a direct answer on where your error comes from. However, I do see a lot of things going wrong so I'll write these here:
nasm
nasm -f aout -o kernelstart.asm -o kernelstart.o
Does that even work? It passes -o twice and names no input file. It should be something like
nasm -f aout -o kernelstart.o kernelstart.asm
ld
ld -i -e _main -Ttext 0x1000 kernel.o kernelstart.o main.o
Since you said you wanted to make a bootloader and a kernel, I'm assuming your goal here is to make ld output something that can be put in the MBR. If that's the case, here are some things to keep in mind:
You didn't specify the output format. If you want to make an MBR image, add --oformat=binary to the command line options. This makes sure a flat binary file is generated.
You set the entry point to _main. I'm not sure where that symbol is defined, but I guess you want your entry point to be start because that's where you call your kernel.
You link your text section starting at 0x1000. If you want to put your image in the MBR to be loaded by the BIOS, it should be linked at 0x7c00.
As a side note: it seems you're trying to link your bootloader and kernel together in one image. Just remember that the MBR can only hold 512 bytes (well, actually 510 bytes, since the last 2 must contain a magic value), so you won't be able to fit much of a kernel there. What you should do is create a separate kernel image and load it from your bootloader.
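To make that size limit concrete, here is a minimal sketch in C of how a 512-byte boot sector is laid out: up to 510 bytes of code, zero padding, and the two-byte signature 0x55, 0xAA at the very end. The function name make_boot_sector and the constants are made up for illustration; they are not part of nasm, ld, or any other tool.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE   512u
#define MAX_CODE_SIZE 510u  /* the last two bytes are reserved for the signature */

/* Hypothetical helper: copy code into a zero-padded sector and append the
 * boot signature. Returns 0 on success, -1 if the code does not fit. */
int make_boot_sector(uint8_t sector[SECTOR_SIZE], const uint8_t *code, size_t code_len)
{
    if (code_len > MAX_CODE_SIZE)
        return -1;                      /* too big for a single sector */

    memset(sector, 0, SECTOR_SIZE);     /* pad unused space with zeros */
    memcpy(sector, code, code_len);
    sector[510] = 0x55;                 /* signature byte at offset 510 */
    sector[511] = 0xAA;                 /* signature byte at offset 511 */
    return 0;
}
```

Anything longer than 510 bytes is rejected, which is why the kernel itself has to live in a separate image that the bootloader loads.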
I hope these points will help you in solving your problem.
Also, you'll find a lot of useful information at OSDev. Here is a tutorial on writing a real mode "kernel" that only uses the MBR. The tutorial contains working code.

Related

make an object file(*.o) compiled by gcc to be a binary file(*.bin)

I am learning the os tutorial.
https://github.com/cfenollosa/os-tutorial
After using gcc to convert my basic.c to basic.o,
gcc -ffreestanding -c basic.c -o basic.o
as described in the tutorial, I get an object file named basic.o.
According to the next step, I need to convert basic.o to basic.bin:
ld -o basic.bin -Ttext 0x0 --oformat binary basic.o
The problem is that my platform is a MacBook Pro (M1) running macOS 12.4.
It seems that there are differences between the macOS ld and the Windows ld.
If I run ld -o basic.bin -Ttext 0x0 --oformat binary basic.o, I get the following error:
ld: unknown option: -Ttext
Which ld command do I need to convert basic.o to basic.bin on my platform?
Yes, they are different. In fact, gcc will just be a symlink to clang unless it is explicitly installed and configured otherwise.
As for the -Ttext option, I'm not sure whether you know what it means (if you don't: it tells the linker to put the text section at the given address, here 0x0).
You might want to invest the time to learn how to write a linker script, because you will probably need one later anyway.
How to build a linkerScript
If the link is dead at some point, you can just google "linkerScript ld" and it will show some tutorials.

OCaml: Issues linking C and OCaml

I am able to wrap C code and access it from the OCaml interpreter, but cannot build a binary! I'm 98% sure it is some linking problem, but can't find the tools to explore the linkage.
Getting even to this point was a chore, (endless quantities of Error: The external function is not available messages) so I'll document everything I did.
A 'system' file stuff.c
#include <stdio.h>
int fun(int z) // Emulate a "real" subroutine
{
printf("duuude whoa z=%d\n", z);
return 42;
}
Compile above as
cc -fPIC -c stuff.c
ld -shared -o libstuff.so stuff.o
An OCaml wrapper around above, in ocmstuff.c:
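#include <caml/mlvalues.h>
int fun(int z); /* defined in stuff.c; declared here so the call below is not an implicit declaration */
CAMLprim value yofun(value z) {
return Val_long(fun(Long_val(z)));
}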
Build above as
cc -fPIC -c ocmstuff.c
ld -shared -o dllostuff.so ocmstuff.o -L . -lstuff -lc -rpath .
Yes, the rpath really is needed, else the next steps suffer. (Edit: If you don't use rpath, you'll need to use LD_LIBRARY_PATH=. instead. For the final 'production' version, you'd change the rpath to the actual library path, or do ld.so.conf trickery or install into 'standard' locations, or tell your users about LD_LIBRARY_PATH. This is just like what you'd do for any other system. The rpath solution seems to be the most stable and reliable solution.)
Next, a module declaration, stored in fapi.mli
module Fapi : sig
external ofun : int -> int = "yofun" ;;
end
Build above as:
ocamlc -a -o fapi.cma -intf fapi.mli -dllib -lostuff
Does it work? Yes it does:
$ rlwrap ocaml fapi.cma
OCaml version 4.11.1
open Fapi ;;
Fapi.ofun 33 ;;
duuude whoa z=33
- : int = 42
#
So the wrapper works fine. Now let's compile with it. Here's myprog.ml:
open Fapi ;;
Fapi.ofun 33 ;;
Compile it:
ocamlc -c myprog.ml
ocamlc -o myprog myprog.cmo fapi.cma
The very last command spews:
File "_none_", line 1:
Error: Required module `Fapi' is unavailable
I am 98% sure the above error is due to some silly linking error, but I cannot track it down. Why do I think this? Well, here's a related problem that provides a hint.
$ rlwrap ocaml
open Fapi ;;
# Fapi.ofun 33 ;;
Error: The external function `yofun' is not available
#
Well, that's odd. It clearly must have found fapi.cma because that is the only way it can know about yofun. But somehow, it doesn't know it needs to dig into dllostuff.so for that. Or possibly dllostuff.so is failing to correctly link/load libstuff.so? Or maybe libc.so to get printf? I'm pretty sure it's one of these last few, but I just can't get it to work, and don't have the tools to debug it. (nm and ldd -r look healthy. Are there some similar tools for the assorted cma, cmo, cmi, cmx files?)
Interfacing with C is much easier if you use dune. You don't need to know the low-level details; it is all handled for you.
Now, back to your example. This is definitely not how OCaml users usually interface with C, but if you really want to learn about it, here are a few notes.
The reason why you have the error is that:
you specified the modules in an incorrect order; it should be topological, not reverse topological, order, i.e., a dependency comes before its dependents
you do not have the .ml file (the -intf option means something very different)
The reason why the last snippet doesn't work is that you're not loading the library. The ocaml binary obviously doesn't have any fapi units linked into it, so you have to load it explicitly, either with the #load directive or by passing it on the command line.
Also the following line is not necessary,
ld -shared -o dllostuff.so ocstuff.o -L . -lstuff -lc -rpath .
First of all, there is no need to link a stub file into a shared library. It is counterproductive and doesn't really bring you anything. Second, passing -rpath . will render the end executable unusable, unless the shared objects are stored in the same folder as the executable. Just remove this.
Just to complete your exercise, here is how it could be built and run. First, let's fix the stub file. We need the ml file and we also need to remove an extra module definition,
$ cat fapi.{ml,mli}
external ofun : int -> int = "yofun" ;;
external ofun : int -> int = "yofun" ;;
Yes, they are the same. The mli file is not really needed here, but let's keep it for the sake of completeness.
The way you build the pure C part is fine; as long as you get a relocatable .so file, it works.
Now to build the ocstuff.c (which we conventionally call stubs) you just need to do,
ocamlc -c ocstuff.c
Don't turn it into a shared library, don't do anything else with it. Now let's build the fapi library,
ocamlc -c fapi.mli
ocamlc -c fapi.ml
Now let's build the library that contains both OCaml and C code,
ocamlmklib -o fapi fapi.cmo ocstuff.o -lstuff -L.
Now we can finally build the executable,
ocamlc -c myprog.ml
LD_LIBRARY_PATH=. ocamlc -o myprog fapi.cma myprog.cmo
and run it,
LD_LIBRARY_PATH=. ./myprog
duuude whoa z=33
Notice that we have to use the LD_LIBRARY_PATH to tell the system dynamic loader where to look for the external dependency libstuff.so. You can, of course, use rpath to specify its location (pass it to ocamlmklib via -ccopt) but in general it is assumed that the external library is installed at some location that the system loader knows.
Again, unless you're developing your own build system, please use dune or oasis for building OCaml programs. These systems will handle all low-level details in the best possible way.
P.S. It is also worth mentioning that you're not building a binary, but a bytecode executable. For binaries, you will have to use the ocamlopt compiler. And this would be a completely different story. Again, dune is the solution.
There is a lot to take in here, but these lines are suspicious:
ocamlc -c myprog.ml
ocamlc -o myprog myprog.cmo fapi.cma
OCaml expects modules in topologically sorted order, with a module appearing on the command line before the modules that refer to it.
So it would seem the last line should be this:
ocamlc -o myprog fapi.cma myprog.cmo
I hope this helps, it's just a quick response.
The answer provided by ivg works. It also provides enough hints to retrofit the original question to get the correct behavior. The changes to the original recipe are:
Create fapi.mli and fapi.ml which both have the same content: external ofun : int -> int = "yofun" ;;
Compile both of the above with ocamlc -c. The mli must be compiled first: it yields an interface file cmi, which is needed before the ml file can be compiled into its object file cmo.
The name dllostuff.so was wrong: it must be dllfapi.so to maintain naming consistency.
Build the cma archive/library as ocamlc -a -o fapi.cma fapi.cmo -dllib -lfapi
That's it! Other than these, the original instructions work. The answer from ivg suggests using
ocamlmklib -o fapi fapi.cmo ostuff.o -L. -lstuff
instead of
ld -shared -o dllfapi.so ostuff.o -L. -lstuff
Either of these works. The primary difference is that ocamlmklib also creates a statically linked library libfapi.a. Other than that, it creates the dllfapi.so as before. (That version also contains a motley assortment of typical gcc symbols, for handling exceptions, library ctors, etc. It's not clear why these are needed here, since they'll show up sooner or later anyway.)

How to remove 'bloat' from a compiled shared object?

I have a gcc C application which compiles to a shared object using the -fpic option. The intent is to create an 'executable' that can run from anywhere in memory. This is how a sample C program is compiled:
./armeb-eabi-gcc -march=armv5t -mbig-endian -nostdlib -fpic -c main.c
main.c
int main(){
void (*UART)() = 0x594323 | 1;
UART("Hello");
}
The problem is that the compiled executable has 'bloat', whereas I am only looking for the machine code with no symbols. I was unable to extract the exact portions with objcopy and objdump, which did absolutely nothing. The file size is around 948 bytes, which is insane for such a simple program.
Here is a snippet of the 'portion' of the file I am looking for.
(The exact highlighted parts could be skewed)
Running
objcopy -I elf32-big -O binary main.o test.bin
gives a 64 byte file which, for some odd reason, moves part of the string to the top of the file, which makes tools like Ghidra and IDA unable to disassemble it properly.
Hopefully it can be seen that the reference to "Hello" is incorrect.

Referring to a specific symbol in a static library with the GNU gold linker

When laying out symbols in the address space using a linker script, ld allows one
to refer to a specific symbol coming from a static library with the following
syntax:
archive.a:object_file.o(.section.symbol_name)
Using gold rather than ld, it seems that such a directive is ignored. The
linking process succeeds. However, when using this instruction to put a specific
symbol at a specific location with gold and checking the resulting symbol layout
using nm or having a look at the Map file, the symbol is not in the expected
location.
I made a small test case using a dummy hello world program statically compiled
in its entirety with gcc 5.4.0. The C library is musl libc (last commit on the
master branch from the official git repository). For binutils, I also use the
last commit on the master branch from the official git repository.
I use the linker script to place a specific symbol (.text.exit) from a static
library (musl C library: libc.a) at a specific location in the address space
which is: the first position in the .text section.
My linker script is:
ENTRY(_start)
SECTIONS
{
    . = 0x10000;
    .text :
    {
        /* Forcing .text.exit in the first position in .text section */
        musl/lib/libc.a:exit.o(.text.exit);
        *(.text*);
    }
    . = 0x8000000;
    .data : { *(.data*) }
    .rodata : { *(.rodata*) }
    .bss : { *(.bss*) }
}
My Makefile:
# Set this to 1 to link with gold, 0 to link with ld
GOLD=1
SRC=test.c
OBJ=test.o
LIBS=musl/lib/crt1.o \
musl/lib/libc.a \
musl/lib/crtn.o
CC=gcc
CFLAGS=-nostdinc -I musl/include -I musl/obj/include
BIN=test
LDFLAGS=-static
SCRIPT=linker-script.x
MAP=map
ifeq ($(GOLD), 1)
LD=binutils-gdb/gold/ld-new
else
LD=binutils-gdb/ld/ld-new
endif
all:
	$(CC) $(CFLAGS) -c $(SRC) -o $(OBJ)
	$(LD) --output $(BIN) $(LDFLAGS) $(OBJ) $(LIBS) -T $(SCRIPT) \
		-Map $(MAP)
clean:
	rm -rf $(OBJ) $(BIN) $(MAP)
After compiling and linking I'm checking the map file (obtained using the -Map
ld/gold flag) to have a look at the location of .text.exit. Using ld as the
linker, it is indeed in the first position of the text section. Using gold, it
is not (it is present farther in the address space, as if my directive was not
taken into account).
Now, while neither of these work with gold:
musl/lib/libc.a:exit.o(.text.exit);
musl/lib/libc.a(.text.exit)
This works:
*(.text.exit);
Is that a missing feature in gold? or am I doing something wrong, maybe there is
another way to refer to a specific symbol of a specific object file in an
archive using gold?
When laying out symbols in the address space using a linker script, ld allows one
to refer to a specific symbol coming from a specific object file inside a static
library with the following syntax:
archive.a:object_file.o(.section.symbol_name)
That isn't quite what that syntax means. When you see
".section.symbol_name" in the linker script (or in a readelf or
objdump list of sections), that is the whole name of the section, and
you'll only see sections with names like that if you use the
-ffunction-sections option when compiling. Given that your script
works with ld, and if you just use the full filename wild card with
gold, it looks like your musl libraries were indeed compiled with
-ffunction-sections, but that's not something you can always assume is
true for system libraries. So the linker isn't really searching for a
section named ".text" that defines a symbol named "exit" -- instead,
it's simply looking for a section named ".text.exit". Subtle
difference, but you should be aware of it.
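As a small illustration of that naming, assuming GCC or Clang targeting ELF: compiled with -ffunction-sections, each function below gets its own section named after it, and a section attribute can set the name explicitly. The file name, function names, and the IN_SECTION macro are made up for this sketch.

```c
/* sections_demo.c (hypothetical file name)
 *
 * Assumed build and inspection commands on an ELF target:
 *   gcc -ffunction-sections -c sections_demo.c
 *   readelf -S sections_demo.o
 */

#if defined(__ELF__)
/* Explicitly choose the section name; works without -ffunction-sections. */
#define IN_SECTION(name) __attribute__((section(name)))
#else
#define IN_SECTION(name) /* other object formats use different section naming */
#endif

/* With -ffunction-sections this lands in a section named .text.add_one;
 * without the option it goes into plain .text. */
int add_one(int x)
{
    return x + 1;
}

/* On ELF this is always placed in the section .text.pinned, so a linker
 * script can match it with an input pattern such as *(.text.pinned). */
IN_SECTION(".text.pinned") int twice(int x)
{
    return x * 2;
}
```

In both cases the linker script pattern matches the section name as a whole; it never looks up the symbol name itself.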
Now, while neither of these work with gold:
musl/lib/libc.a:exit.o(.text.exit);
musl/lib/libc.a(.text.exit);
This works:
*(.text.exit);
Is that a missing feature in gold? or am I doing something wrong, maybe there is
another way to refer to a specific symbol of a specific object file in an
archive using gold?
If you look at the resulting -Map output file, I suspect you'll see
the name of the object file is written as "musl/lib/libc.a(exit.o)".
That's the spelling you need to use in the script, and because of the
parentheses, you need to quote it. This:
"musl/lib/libc.a(exit.o)"(.text.exit)
should work. If you want something that will work in both linkers, try
something like this:
"musl/lib/libc.a*exit.o*"(.text.exit)
or just
"*exit.o*"(.text.exit)

How do I emulate objdump --dwarf=decodedline in .bundle files?

I've been successfully using objdump --dwarf=decodedline to find the source location of each offset in a .so file on Linux.
Unfortunately, on Mac OS X it seems that .bundle files (used as shared libraries) cannot be queried in this manner.
I'm optimistic that there's something I can do, because gdb is able to correctly debug and step through code in these bundles — does anyone know what it's doing?
Further information:
The dwarfdump utility claims that the .bundle file contains no DWARF data, but that it does contain STABS data; however objdump --stabs cannot find any stabs data either.
(If it makes the question easier to answer, I don't actually need all of the offsets; being able to query the source location of any given offset would be good enough).
The bundle file I've been testing this on was generated using:
cc -dynamic -bundle -undefined suppress -flat_namespace -g -o c_location.bundle c_location.o -L. -L/Users/User/.rvm/rubies/ruby-1.8.7-p357/lib -L. -lruby -ldl -lobjc
The original c_location.o file does contain the necessary information for objdump --dwarf=decodedline to work.
So it turns out that one way to do this is to use Apple's nm -pa *.bundle to find the symbol name and the original object file for a given offset.
Once you have that, you can first use objdump -tT to find the offset of the symbol name in the original object file; and then use objdump --dwarf=decodedline as before.
Each step requires a little bit of simplistic output parsing, but it does seem to work™. I'd be interested if there are more robust approaches.
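The offset-to-symbol lookup in the first step could be sketched in C roughly as follows, assuming the nm output has already been parsed into an array of (address, name) pairs sorted by address. The struct and function names are hypothetical, and the parsing itself is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* One parsed line of symbol-table output (hypothetical layout). */
struct symbol {
    uint64_t    address;  /* start address of the symbol */
    const char *name;
};

/* Return the symbol with the greatest start address <= offset, or NULL if
 * the offset lies before the first symbol. syms must be sorted by address. */
const struct symbol *find_symbol(const struct symbol *syms, size_t count, uint64_t offset)
{
    const struct symbol *best = NULL;
    size_t lo = 0, hi = count;          /* binary search over [lo, hi) */

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (syms[mid].address <= offset) {
            best = &syms[mid];          /* candidate; look for a later one */
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    return best;
}
```

Subtracting the returned symbol's address from the queried offset gives the position inside that symbol, which can then be added to the symbol's offset in the original object file before running objdump --dwarf=decodedline on it.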

Resources