Logical and physical address in C code in real mode - c

Suppose I write a boot loader in C. What happens when I create a global variable? What is its logical address, and how does it correspond to a physical address? For example, suppose I create a global string:
const char* s = "some string";
Am I right that s is stored in the .data section? What would the physical address of s be, and what would the logical one be? Do we need to do any extra work to make these addresses correspond to each other?
My OS is Linux and I compile my code like this:
as --32 boot.S -o boot.o
gcc -c -m32 -g -Os -ffreestanding -Wall -Werror -I. -o mbr.o mbr.c
ld -Tlinker.ld -nostdlib -o mbr boot.o mbr.o
boot.S is just where I initialize some registers and call the C code:
.code16
.text
.global _start
_start:
cli
xor %ax, %ax
mov %ax, %ds
mov %ax, %es
mov %ax, %ss
mov $0x7c00, %sp
ljmp $0, $mmain
mmain is a function in the C code. My linker script is:
OUTPUT_FORMAT(binary)
OUTPUT_ARCH(i8086)
ENTRY(_start)
SECTIONS
{
. = 0x7C00;
.text : { *(.text) }
.sig : AT(0x7DFE)
{
SHORT(0xaa55);
}
}

Related

Linking a compiled assembly and C file with ld

I have compiled these programs:
BITS 16
extern _main
start:
mov ax, 07C0h
add ax, 288
mov ss, ax
mov sp, 4096
mov ax, 07C0h
mov ds, ax
mov si, text_string
call print_string
jmp $
text_string db 'Calling Main Script'
call _main
print_string:
mov ah, 0Eh
.repeat:
lodsb
cmp al, 0
je .done
int 10h
jmp .repeat
.done:
ret
times 510-($-$$) db 0
dw 0xAA55
and this as a test just to try linking them
int main()
{
return 0;
}
both compile completely fine on their own using:
gcc -Wall -m32 main.c
nasm -f elf bootloader.asm
however I cannot link them using:
ld bootloader.o main.o -lc -I /lib/Id-linux.so.2
and I get this error:
ld: i386 architecture of input file `bootloader.o' is incompatible with i386:x86-64 output
ld: i386 architecture of input file `main.o' is incompatible with i386:x86-64 output
ld: warning: cannot find entry symbol _start; defaulting to 0000000000401000
ld: bootloader.o: file class ELFCLASS32 incompatible with ELFCLASS64
ld: final link failed: file in wrong format
Any help would be great thanks
By default GCC links dynamically against libc, so if you want to link manually with ld, make sure your ELF executable is static by passing the -static flag:
gcc -o <filename> <filename>.c -static -Wall -m32
then link with
ld -m elf_i386 -dynamic-linker /lib/ld-linux.so.2 -o <filename> -lc <filename>.o
I guess that, since an assembler like NASM produces stand-alone code (with no libc dependency), you can build a dynamically linked ELF executable against libc directly by passing the -dynamic-linker flag to ld.
For example:
x86
nasm -f elf32 -o <filename>.o <filename>.asm
ld -m elf_i386 -dynamic-linker /lib/ld-linux.so.2 -o <filename> -lc <filename>.o
x86_64
nasm -f elf64 -o <filename>.o <filename>.asm
ld -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o <filename> -lc <filename>.o
In case you just want to do some simple assembly programming on your PC, don't actually need 16bit code, and don't want to dive into bootloaders and OS development, you can get started much more easily by writing 32bit (IA32) or 64bit (AMD64) application code. Instead of BIOS interrupts, you'd use (Linux) system calls.
An example "hello world" for i386 would be:
.section .text._start
.global _start
.type _start, %function
_start:
mov $4, %eax
mov $1, %ebx
mov $message, %ecx
mov $14, %edx
int $0x80
mov $1, %eax
xor %ebx, %ebx
int $0x80
.section .rodata.message
.type message, %object
message:
.ascii "Hello, World!\n"
Assemble, link and execute via
as --32 test32.S -o test32.o && ld -m elf_i386 test32.o -o test32 && ./test32
The same thing for AMD64:
.section .text._start
.global _start
.type _start, %function
_start:
mov $1, %rax
mov $1, %rdi
mov $message, %rsi
mov $14, %rdx
syscall
mov $0x3c, %rax
xor %rdi, %rdi
syscall
.section .rodata.message
.type message, %object
message:
.ascii "Hello, World!\n"
Assemble, link and execute via
as --64 test64.S -o test64.o && ld -m elf_x86_64 test64.o -o test64 && ./test64
Just for fun, the same thing for ARM (32bit):
.syntax unified
.arch armv6
.arm
.section .text._start
.global _start
.type _start, %function
_start:
movs r7, #4
movs r0, #1
ldr r1, =#message
movs r2, #14
svc #0
movs r7, #1
movs r0, #0
svc #0
.ltorg
.section .rodata.message
.type message, %object
message:
.ascii "Hello, World!\n"
Assemble, link and execute via (e.g. on a Raspberry PI or Beaglebone):
as testarm.S -o testarm.o && ld testarm.o -o testarm && ./testarm

Failed to pass constant string as parameter in C function

I copied the bootasm.S from https://github.com/jeffallen/xv6/blob/master/bootasm.S,
#include "asm.h"
# Start the first CPU: switch to 32-bit protected mode, jump into C.
# The BIOS loads this code from the first sector of the hard disk into
# memory at physical address 0x7c00 and starts executing in real mode
# with %cs=0 %ip=7c00.
#define SEG_KCODE 1 // kernel code
#define SEG_KDATA 2 // kernel data+stack
#define CR0_PE 1 // protected mode enable bit
.code16 # Assemble for 16-bit mode
.globl start
start:
cli # BIOS enabled interrupts; disable
# Set up the important data segment registers (DS, ES, SS).
xorw %ax,%ax # Segment number zero
movw %ax,%ds # -> Data Segment
movw %ax,%es # -> Extra Segment
movw %ax,%ss # -> Stack Segment
# Physical address line A20 is tied to zero so that the first PCs
# with 2 MB would run software that assumed 1 MB. Undo that.
seta20.1:
inb $0x64,%al # Wait for not busy
testb $0x2,%al
jnz seta20.1
movb $0xd1,%al # 0xd1 -> port 0x64
outb %al,$0x64
seta20.2:
inb $0x64,%al # Wait for not busy
testb $0x2,%al
jnz seta20.2
movb $0xdf,%al # 0xdf -> port 0x60
outb %al,$0x60
# Switch from real to protected mode. Use a bootstrap GDT that makes
# virtual addresses map directly to physical addresses so that the
# effective memory map doesn't change during the transition.
lgdt gdtdesc
movl %cr0, %eax
orl $CR0_PE, %eax
movl %eax, %cr0
# Complete transition to 32-bit protected mode by using long jmp
# to reload %cs and %eip. The segment registers are set up with no
# translation, so that the mapping is still the identity mapping.
ljmp $(SEG_KCODE<<3), $start32
.code32 # Tell assembler to generate 32-bit code now.
start32:
# Set up the protected-mode data segment registers
movw $(SEG_KDATA<<3), %ax # Our data segment selector
movw %ax, %ds # -> DS: Data Segment
movw %ax, %es # -> ES: Extra Segment
movw %ax, %ss # -> SS: Stack Segment
xor %eax, %eax # Zero segments not ready for use
movw %ax, %fs # -> FS
movw %ax, %gs # -> GS
## sti TaoWang: It should NOT call STI here, since NO IDT is ready.
# Set up the stack pointer and call into C.
movl $start, %esp
call bootmain
spin:
jmp spin
# Bootstrap GDT
.p2align 2 # force 4 byte alignment
gdt:
SEG_NULLASM # null seg
SEG_ASM(STA_X|STA_R, 0x0, 0xffffffff) # code seg
SEG_ASM(STA_W, 0x0, 0xffffffff) # data seg
gdtdesc:
.word (gdtdesc - gdt - 1) # sizeof(gdt) - 1
.long gdt # address gdt
.fill 510-(.-start)
.word 0xaa55
and change the bootmain.c as follows,
#include "types.h"
char serial_buffer[256];
static void my_memcpy(void *dst, void *src, u32 length)
{
u32 i = 0;
for (i = 0; i < length; i ++) {
((char *)dst)[i] = ((char *)src)[i];
}
if (serial_buffer[0] == 'A') {
asm ("cli\nhlt\n");
} else {
asm ("vmcall");
}
}
int bootmain(void)
{
my_memcpy(serial_buffer, "Abcedife", 8);
return 0;
}
void handle_page_fault(void)
{
return;
}
After the code is built through the Makefile (I listed below), the code to load the output binary is here,
unsigned char tempbuf[0x400];
void file_load(char *vmfname)
{
int vmfd = -1;
size_t cnt = 0, offset = 0;
vmfd = open( vmfname, O_RDWR );
if (vmfd < 0) {
exit(2);
}
do {
cnt = read(vmfd, tempbuf, sizeof(tempbuf));
// initialize the virtual-machine registers
memcpy((void *)(CODE_START + offset), tempbuf, cnt);
offset += cnt;
} while (cnt > 0);
close(vmfd);
printf("Loading %ld bytes of VM to run\n", offset);
}
To my surprise, the while loop does NOT execute at all.
Here is my linker.ld, and I run them in Linux 4.4.0.
ENTRY(start);
SECTIONS
{
. = 0x7C00;
.text : AT(0x7C00)
{
_text = .;
*(.text);
_text_end = .;
}
.data :
{
_data = .;
*(.bss);
*(.bss*);
*(.data);
*(.rodata*);
*(COMMON)
_data_end = .;
}
PROVIDE(data = .);
/* The data segment */
.data : {
*(.data)
}
PROVIDE(edata = .);
.bss : {
*(.bss)
}
PROVIDE(end = .);
/DISCARD/ : {
*(.eh_frame .note.GNU-stack)
}
}
The Makefile,
all: test
OBJDUMP=objdump
OBJCOPY=objcopy
CFLAGS = -fno-pic -static -fno-builtin -fno-strict-aliasing -Wall -MD -ggdb -m32 -Werror -fno-omit-frame-pointer
CFLAGS += $(shell $(CC) -fno-stack-protector -E -x c /dev/null >/dev/null 2>&1 && echo -fno-stack-protector)
ASFLAGS = -m32 -gdwarf-2 -Wa,-divide
LDFLAGS += -m $(shell $(LD) -V | grep elf_i386 2>/dev/null)
guest: test_app.c
$(CC) -g2 -Wall -Wextra -Werror $^ -o $@
$(CC) $(CFLAGS) -fno-pic -nostdinc -I. -c bootasm.S
$(CC) $(CFLAGS) -fno-pic -I. -c bootmain.c
$(LD) $(LDFLAGS) -N -e start -Tlinker.ld -o bootblock.o bootasm.o bootmain.o
$(OBJDUMP) -S bootblock.o > bootblock.asm
$(OBJCOPY) -S -O binary -j .text bootblock.o bootblock.bin
clean:
rm -f *.o
rm -f *.d
rm -f test
rm -f *.bin
rm -f bootblock.asm
I don't know why the constant string fails to be passed as the parameter, or why its content is all zeros.
If I use an array of char and pass the array name as the parameter, myfputs(chararray), it works.
I solved this by following Michael's answer and adding -j .data to the objcopy command in the Makefile, so that the data section is included in the final binary.
With that change in the Makefile, the code works as expected.
Here is the command line for building the final binary.
guest: test_app.c
$(CC) -g2 -Wall -Wextra -Werror $^ -o $@
$(CC) $(CFLAGS) -nostdinc -I. -c bootasm.S
$(CC) $(CFLAGS) -I. -c bootmain.c
$(LD) $(LDFLAGS) -N -e start -Tlinker.ld -o bootblock.o bootasm.o bootmain.o
$(OBJDUMP) -S bootblock.o > bootblock.asm
$(OBJCOPY) -S -O binary -j .text -j .data -j .bss bootblock.o bootblock.bin

Loading elf-i386 from my boot loader

I am working on an operating system project, and so far I have my bootloader running. I can load a binary file using a BIOS interrupt, but I am unable to load and call a C function from a file in ELF format:
Here is my C program that I want to finally execute:
//build :: cc -m32 -nostdlib -nostdinc -fno-builtin -fno-stack-protector -c -o kmain.o kmain.c
void kmain(){
int a = 5;
for(;;);
}
Here is assembly code to call kmain()
; build :: nasm -f elf loader.asm
[BITS 32]
[GLOBAL start]
[EXTERN kmain]
section .text
start:
mov eax, 0
call kmain
This is my linker script
ENTRY(start)
and this is how I am linking everything together
ld -m elf_i386 -T link.ld -o kernel loader.o kmain.o
Now, to call start from my bootloader, I am using the e_entry field of the ELF header (24 bytes from the start of the file):
xor edx, edx
mov edx, 24
add edx, IMAGE_PMODE_BASE
add ebx, dword[edx]
add ebx, IMAGE_PMODE_BASE
call ebx
where IMAGE_PMODE_BASE is the address at which the ELF file is loaded in memory.
My question is: is this the correct way of loading and calling a C function from a file in ELF format?
Thank you for reading, please help.

Using custom main loader with GCC

I wrote the following loader:
GLOBAL _start
EXTERN main
section .text
_start:
xor ebp, ebp ; ebp = 0
pop esi ; esi = argc
mov ecx, esp ; ecx = argv
and esp, 0xFFFF ; align esp
push ecx ; load argv
push esi ; load argc
call main ; call main
push eax ; exit with main's ret value
mov ebx,0
int 80h
And a short main function. Now I'm trying to compile and link these files using gcc, but using the commands
nasm -f elf32 loader.asm
gcc -c -m32 main.c
gcc -m32 main.o loader.o -o main.out
results in a multiple definition of _start error. I imagine this is because gcc is trying to link its own _start. How can I prevent this from happening?
You haven't told GCC to not link to the standard startup code, so GCC links to it.
To tell GCC to not link in _start, pass in the -nostartfiles flag to GCC when linking.
Note that the standard libraries (stdlib, stdio, etc) will still be linked in, unless you also use the -nodefaultlibs flag. The -nostdlib flag combines the two.

Assembly function call from c

I cannot combine my kernel_entry.asm and main.c. My main.c calls an asm function Sum. Both nasm and gcc compile their respective files, but the linker gives an error.
Kernel_entry.asm:
[bits 32]
[extern _start]
[global _Sum]
....
_Sum:
push ebp
mov ebp, esp
mov eax, [ebp+8]
mov ecx, [ebp+12]
add eax, ecx
pop ebp
ret
main.c:
....
extern int Sum();
void start() {
....
int x = Sum(4, 5);
....
}
To compile the source files, I use the following commands:
nasm kernel_entry.asm -f win32 -o kernel_entry.o
gcc -ffreestanding -c main.c -o main.o
....
ld -T NUL -o kernel.tmp -Ttext 0x1000 kernel_entry.o main.o mem.o port_in_out.o screen.o idt.o
The linker gives the following error: main.o:main.c:(.text+0xa82): undefined reference to 'Sum'. I tried everything but couldn't find a solution. When I remove the asm function call from main.c, it works.
The TL;DR version of the answer is that nasm's -f win32 generates an object file that is not compatible with the GNU toolchain on Windows; you need to use -f elf if you want to link using ld. This is described in NASM's documentation under sections 7.5 and 7.9.
The hint for me was that running nm kernel_entry.o produced:
00000000 a .absolut
00000000 t .text
00000001 a #feat.00
U _start
U _Sum
which shows Sum as an undefined symbol. After assembling as ELF, I got:
U _start
00000000 T _Sum
indicating Sum as a recognised symbol in the text section.
