I am writing some firmware code for an ARM Cortex-M0 microcontroller (specifically, the STM32F072B as part of the STM32 Discovery dev board).
My linker script does nothing special: it fills out the vector table and then includes all the text and data sections from my code:
OUTPUT_FORMAT("elf32-littlearm")
MEMORY {
ROM (rx) : ORIGIN = 0x00000000, LENGTH = 16K
FLASH (r) : ORIGIN = 0x08000000, LENGTH = 64K
RAM (rw) : ORIGIN = 0x20000000, LENGTH = 16K
}
ENTRY(_start)
PROVIDE(__stack_top = ORIGIN(RAM) + LENGTH(RAM));
SECTIONS {
.vector_table : {
LONG(__stack_top); /* 00 */
LONG(_start); /* 04 */
LONG(dummy_isr); /* 08 */
LONG(dummy_isr); /* 0C */
LONG(dummy_isr); /* 10 */
LONG(dummy_isr); /* 14 */
LONG(dummy_isr); /* 18 */
LONG(dummy_isr); /* 1C */
LONG(dummy_isr); /* 20 */
LONG(dummy_isr); /* 24 */
LONG(dummy_isr); /* 28 */
LONG(dummy_isr); /* 2C */
LONG(dummy_isr); /* 30 */
LONG(dummy_isr); /* 34 */
LONG(dummy_isr); /* 38 */
LONG(dummy_isr); /* 3C */
LONG(dummy_isr); /* 40 */
LONG(dummy_isr); /* 44 */
LONG(dummy_isr); /* 48 */
LONG(dummy_isr); /* 4C */
LONG(dummy_isr); /* 50 */
LONG(dummy_isr); /* 54 */
LONG(dummy_isr); /* 58 */
LONG(dummy_isr); /* 5C */
LONG(dummy_isr); /* 60 */
LONG(dummy_isr); /* 64 */
LONG(dummy_isr); /* 68 */
LONG(dummy_isr); /* 6C */
LONG(dummy_isr); /* 70 */
LONG(dummy_isr); /* 74 */
LONG(dummy_isr); /* 78 */
LONG(dummy_isr); /* 7C */
LONG(dummy_isr); /* 80 */
LONG(dummy_isr); /* 84 */
LONG(dummy_isr); /* 88 */
LONG(dummy_isr); /* 8C */
LONG(dummy_isr); /* 90 */
LONG(dummy_isr); /* 94 */
LONG(dummy_isr); /* 98 */
LONG(dummy_isr); /* 9C */
LONG(dummy_isr); /* A0 */
LONG(dummy_isr); /* A4 */
LONG(dummy_isr); /* A8 */
LONG(dummy_isr); /* AC */
LONG(dummy_isr); /* B0 */
LONG(dummy_isr); /* B4 */
LONG(dummy_isr); /* B8 */
LONG(dummy_isr); /* BC */
} > ROM AT > FLASH
.text : {
*(.text*)
} > ROM AT > FLASH
.rodata : {
*(.rodata*)
*(.data.rel.ro)
} > FLASH
.bss (NOLOAD) : {
*(.bss*)
*(COMMON)
} > RAM
.data : {
*(.data*)
} > RAM
.ARM.exidx : {
*(.ARM.exidx)
} > FLASH
}
When I build and link an ELF file and dump the symbols, I notice that the addresses that end up in the .vector_table section, as well as the ELF entry point, are all off by one:
[shell]$ llvm-objdump --syms zig-cache/bin/main-flash
zig-cache/bin/main-flash: file format elf32-littlearm
SYMBOL TABLE:
00000000 l df *ABS* 00000000 main-flash
0000013c l .text 00000000 $d.1
000000c0 l .text 00000000 $t.0
000000c4 g F .text 00000088 _start
000000c0 g F .text 00000002 dummy_isr
20004000 g *ABS* 00000000 __stack_top
[shell]$ llvm-objdump --full-contents --section=.vector_table zig-cache/bin/main-flash
zig-cache/bin/main-flash: file format elf32-littlearm
Contents of section .vector_table:
0000 00400020 c5000000 c1000000 c1000000 .@. ............
0010 c1000000 c1000000 c1000000 c1000000 ................
0020 c1000000 c1000000 c1000000 c1000000 ................
0030 c1000000 c1000000 c1000000 c1000000 ................
0040 c1000000 c1000000 c1000000 c1000000 ................
0050 c1000000 c1000000 c1000000 c1000000 ................
0060 c1000000 c1000000 c1000000 c1000000 ................
0070 c1000000 c1000000 c1000000 c1000000 ................
0080 c1000000 c1000000 c1000000 c1000000 ................
0090 c1000000 c1000000 c1000000 c1000000 ................
00a0 c1000000 c1000000 c1000000 c1000000 ................
00b0 c1000000 c1000000 c1000000 c1000000 ................
[shell]$ readelf -h zig-cache/bin/main-flash
ELF Header:
...
Entry point address: 0xc5
The symbol table shows _start at 0xC4, while the ELF entry point, which the linker script defines to be _start, is set to 0xC5. Similarly, the address of dummy_isr written into the vector table is off by one (the dummy_isr symbol is defined as 0xC0, while the linker writes 0xC1 into the vector table). The disassembly of .text confirms that dummy_isr and _start begin at 0xC0 and 0xC4, respectively, so the address that the linker is writing looks wrong:
[shell]$ llvm-objdump --disassemble --section=.text zig-cache/bin/main-flash
zig-cache/bin/main-flash: file format elf32-littlearm
Disassembly of section .text:
000000c0 <dummy_isr>:
c0: fe e7 b #-4 <dummy_isr>
c2: c0 46 mov r8, r8
000000c4 <_start>:
c4: 82 b0 sub sp, #8
c6: 01 23 movs r3, #1
c8: d8 04 lsls r0, r3, #19
ca: 1c 49 ldr r1, [pc, #112]
...
0xC1 and 0xC5 are not even the addresses of valid instructions; each falls in the middle of an instruction. What could cause this discrepancy?
This is called an "interworking address".
The least significant bit of the address indicates whether the target instruction is ARM (0) or Thumb (1). It is not part of the fetch address: the address actually fetched always has the LSB cleared to zero, so 0xC5 still means "the code at 0xC4".
Since this platform only works in Thumb mode, all vector addresses and addresses used with the BX and BLX instruction must be odd (the X means (ex)change instruction set).
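As a rough illustration of the rule above, here is a minimal C sketch that splits a vector entry into its Thumb bit and the address that is really fetched. The function names targets_thumb and fetch_address are made up for this example; they are not part of any toolchain or CMSIS API.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when bit 0 of a vector entry or BX/BLX target selects Thumb state. */
bool targets_thumb(uint32_t entry)
{
    return (entry & UINT32_C(1)) != 0;
}

/* Address the core fetches from: the entry with bit 0 cleared. */
uint32_t fetch_address(uint32_t entry)
{
    return entry & ~UINT32_C(1);
}
```

Applied to the values in the question, the entry 0xC5 selects Thumb state and fetches from 0xC4 (_start), and 0xC1 fetches from 0xC0 (dummy_isr), which matches the symbol table.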
I'm practicing reverse engineering C object files. Suppose I have an object file of the C program:
#include <stdio.h>
#include <string.h>
int main (int argc, char ** argv) {
char * input = argv[1];
int result = strcmp(input, "text_to_compare");
if (result == 0) {
printf("%s\n", "text matches");
}
else {
printf("%s\n", "text doeesn't match");
}
return 0;
}
How would I go about finding "text_to_compare" from the object file given it was compiled with a -g flag and an x86-64 architecture?
Running strings on a binary file will print all sequences of four or more printable characters in the file. For a simple file this might be sufficient, but for a larger file you can end up with a lot of false positives. For example, compiling your code with gcc and running strings on the resulting binary returns 295 results.
We can start by using the objdump command to disassemble the code in your sample file:
$ objdump --disassemble=main a.out
a.out: file format elf64-x86-64
Disassembly of section .init:
Disassembly of section .plt:
Disassembly of section .text:
0000000000401136 <main>:
401136: 55 push %rbp
401137: 48 89 e5 mov %rsp,%rbp
40113a: 48 83 ec 20 sub $0x20,%rsp
40113e: 89 7d ec mov %edi,-0x14(%rbp)
401141: 48 89 75 e0 mov %rsi,-0x20(%rbp)
401145: 48 8b 45 e0 mov -0x20(%rbp),%rax
401149: 48 8b 40 08 mov 0x8(%rax),%rax
40114d: 48 89 45 f8 mov %rax,-0x8(%rbp)
401151: 48 8b 45 f8 mov -0x8(%rbp),%rax
401155: be 10 20 40 00 mov $0x402010,%esi
40115a: 48 89 c7 mov %rax,%rdi
40115d: e8 de fe ff ff call 401040 <strcmp@plt>
401162: 89 45 f4 mov %eax,-0xc(%rbp)
401165: 83 7d f4 00 cmpl $0x0,-0xc(%rbp)
401169: 75 0c jne 401177 <main+0x41>
40116b: bf 20 20 40 00 mov $0x402020,%edi
401170: e8 bb fe ff ff call 401030 <puts@plt>
401175: eb 0a jmp 401181 <main+0x4b>
401177: bf 2d 20 40 00 mov $0x40202d,%edi
40117c: e8 af fe ff ff call 401030 <puts@plt>
401181: b8 00 00 00 00 mov $0x0,%eax
401186: c9 leave
401187: c3 ret
Disassembly of section .fini:
Looking at the disassembly, we can see a call to strcmp at offset 40115d:
40115d: e8 de fe ff ff call 401040 <strcmp@plt>
If we look a couple of lines before that, we can see an instruction that loads an address outside of this section (0x402010):
401155: be 10 20 40 00 mov $0x402010,%esi
If we look at the output of objdump -h a.out, we see that this address falls in the .rodata section (we're looking for sections for which the given address is in the block of memory starting at the address in the VMA column):
$ objdump -h a.out
Idx Name Size VMA LMA File off Algn
[...]
15 .rodata 00000041 0000000000402000 0000000000402000 00002000 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA
[...]
We can extract the data in that section using the objcopy command:
$ objcopy -j .rodata -O binary a.out >(xxd -o 0x402000)
00402000: 0100 0200 0000 0000 0000 0000 0000 0000 ................
00402010: 7465 7874 5f74 6f5f 636f 6d70 6172 6500 text_to_compare.
00402020: 7465 7874 206d 6174 6368 6573 0074 6578 text matches.tex
00402030: 7420 646f 6565 736e 2774 206d 6174 6368 t doeesn't match
00402040: 00 .
And we can see that the string at address 0x402010 is text_to_compare.
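The section lookup above can also be done by hand. The following is a minimal sketch, assuming you have copied the Size, VMA and "File off" columns from objdump -h; the struct and function names are invented for this example.

```c
#include <stdint.h>

/* One row of `objdump -h`: where a section sits in memory and in the file. */
struct section_info {
    uint64_t vma;      /* VMA column */
    uint64_t size;     /* Size column */
    uint64_t file_off; /* "File off" column */
};

/* Returns 1 if addr lies inside [vma, vma + size), else 0. */
int section_contains(const struct section_info *s, uint64_t addr)
{
    return addr >= s->vma && addr - s->vma < s->size;
}

/* File offset of addr; only meaningful when section_contains() returned 1. */
uint64_t file_offset_of(const struct section_info *s, uint64_t addr)
{
    return s->file_off + (addr - s->vma);
}
```

With the .rodata row shown above (VMA 0x402000, size 0x41, file offset 0x2000), the address 0x402010 is inside the section and maps to file offset 0x2010, which is where the string starts in a.out.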
So I have a simple linker script for my STM32F7 MCU:
MEMORY{
ROM_AXIM (rx) : ORIGIN = 0x08000000, LENGTH = 1M
ROM_ITCM (rx) : ORIGIN = 0x00200000, LENGTH = 1M
RAM_ITCM (rwx): ORIGIN = 0x00000000, LENGTH = 16K
RAM_DTCM (rwx): ORIGIN = 0x20000000, LENGTH = 64K
SRAM (rwx): ORIGIN = 0x20010000, LENGTH = 240K
SRAM2 (rwx): ORIGIN = 0x2004C000, LENGTH = 16K
}
_estack = LENGTH(RAM_DTCM) + ORIGIN(RAM_DTCM);
SECTIONS{
.isr_vector : {
KEEP(*(.isr_vector))
} /* Placed at 0x0 */
.text : {
. = ALIGN(4);
*(.text)
} >ROM_ITCM
.data : {
. = ALIGN(4);
_sdata = .;
*(.data)
. = ALIGN(4);
_edata = .;
} >SRAM2 AT>ROM_AXIM
.bss : {
. = ALIGN(4);
_sbss = .;
*(.bss)
. = ALIGN(4);
_ebss = .;
} >SRAM2
}
The idea is to place the text section in ROM_ITCM, because instruction fetches over that bus go through the ART accelerator. The problem is that ROM_AXIM and ROM_ITCM are the same flash storage. How do I tell the linker that they are physically the same storage, accessed over separate buses? It should link as if there were two separate regions, but the text section should actually follow .isr_vector immediately in flash, with that offset taken into account in the run-time addresses.
For example, here is my bin file that will go to flash:
00000000 00 00 01 20 01 00 20 00 00 00 00 00 00 00 00 00 |... .. .........|
00000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00200000 00 20 70 47 |. pG|
00200004
As you can see, a lot of flash is wasted, and it will also try to write this bin beyond the flash boundary.
VS:
00000000 00 00 01 20 09 00 00 08 00 20 70 47
0000000c
This hexdump is what I am looking for, but as you can see, Reset_Handler has an address on the AXIM bus. What I want is to use the linker script above and get an output like this:
00000000 00 00 01 20 09 00 20 00 00 20 70 47
0000000c
The difference here is that it will use 0x00200008 to look for my reset handler.
What I have tried so far:
.text : {
. = ALIGN(4);
*(.text)
} >ROM_ITCM AT>ROM_AXIM
This almost works, but the problem is that it gives this output:
00000000 00 00 01 20 01 00 20 00 00 20 70 47 |... .. .. pG|
0000000c
which will fetch instructions starting at 0x00200000, so the first entry of the vector table (the stack pointer) gets executed as an instruction.
I managed to solve the problem by consulting the GNU linker manual. What I did was specify the run-time address of the section explicitly, like this:
.text (0x00200000 + SIZEOF(.isr_vector)): {
. = ALIGN(4);
*(.text)
} AT>ROM_AXIM
What it does is:
.text (0x00200000 + SIZEOF(.isr_vector)) specifies the run-time address, offset by the size of the vector table, so my pointers are now resolved correctly.
AT>ROM_AXIM places the code in flash right after the vector table, which is what produced the offset in the first place; the explicit address above accounts for it.
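The address arithmetic behind this fix can be sketched in C. This is only an illustration under the assumption, taken from the MEMORY block above, that the same flash is visible at 0x08000000 (AXIM) and 0x00200000 (ITCM); the macro and function names are made up here.

```c
#include <stdint.h>

#define FLASH_AXIM_BASE UINT32_C(0x08000000) /* ORIGIN(ROM_AXIM) in the script */
#define FLASH_ITCM_BASE UINT32_C(0x00200000) /* ORIGIN(ROM_ITCM) in the script */

/* Translate an address seen through the AXIM alias to the ITCM alias. */
uint32_t axim_to_itcm(uint32_t axim_addr)
{
    return axim_addr - FLASH_AXIM_BASE + FLASH_ITCM_BASE;
}

/* Run-time address of .text when it follows a vector table of the given size. */
uint32_t text_run_address(uint32_t isr_vector_size)
{
    return FLASH_ITCM_BASE + isr_vector_size;
}
```

For the 8-byte vector table in the hexdumps, text_run_address(8) gives 0x00200008, and the AXIM reset vector 0x08000009 corresponds to 0x00200009 on the ITCM side, which is the "09 00 20 00" word in the desired output (bit 0 is the Thumb bit).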
I'm playing with the tail call feature of BPF, and it seems that even simple code doesn't get loaded:
struct bpf_map_def SEC("maps") jmp_table = {
.type = BPF_MAP_TYPE_PROG_ARRAY,
.key_size = sizeof(u32),
.value_size = sizeof(u32),
.max_entries = 8,
};
SEC("sockops")
int bpf1(struct bpf_sock_ops *sk_ops)
{
return 1;
}
SEC("sockops")
int bpf2(struct bpf_sock_ops *sk_ops)
{
return 1;
}
SEC("sockops")
int bpf3(struct bpf_sock_ops *sk_ops)
{
return 1;
}
SEC("sockops")
int bpf_main(struct bpf_sock_ops *sk_ops)
{
__u32 port = bpf_ntohl(sk_ops->remote_port);
switch (port) {
case 5000:
bpf_tail_call(sk_ops, &jmp_table, 1);
break;
case 6000:
bpf_tail_call(sk_ops, &jmp_table, 2);
break;
case 7000:
bpf_tail_call(sk_ops, &jmp_table, 3);
break;
}
sk_ops->reply = 0;
return 1;
}
char _license[] SEC("license") = "GPL";
u32 _version SEC("version") = LINUX_VERSION_CODE;
So I compiled it with llvm-3.8 and loaded it with bpftool:
$ sudo ./bpftool prog load bpf_main.o /sys/fs/bpf/p1
libbpf: load bpf program failed: Invalid argument
libbpf: -- BEGIN DUMP LOG ---
libbpf:
unreachable insn 2
libbpf: -- END LOG --
libbpf: failed to load program 'sockops'
libbpf: failed to load object 'bpf_main.o'
Error: failed to load program
So man 2 bpf mentions that:
EINVAL For BPF_PROG_LOAD, indicates an attempt to load an invalid program. eBPF programs can be deemed invalid due to unrecognized instructions, the use of reserved fields, jumps out of range, infinite loops or calls of unknown functions.
I don't see what is wrong with this tiny, simple program. llvm-objdump also fails:
$ llvm-objdump-3.8 -arch-name=bpf -disassemble ./tcp_metrics_kern.o
./tcp_metrics_kern.o: file format ELF64-unknown
LLVM ERROR: error: no disassembler for target bpfel-unknown-unknown
UPDATE 1
Following Qeole's advice I upgraded to clang-5.0, rebuilt my program and now it complains differently:
$ sudo ./bpftool prog load bpf_main.o /sys/fs/bpf/p1
libbpf: relocation failed: no 10 section
Error: failed to load program
Now I can investigate ELF sections:
$ llvm-objdump-5.0 -disassemble -source ./bpf_main.o
./bpf_main.o: file format ELF64-BPF
Disassembly of section sockops:
bpf1:
0: b7 00 00 00 01 00 00 00 r0 = 1
1: 95 00 00 00 00 00 00 00 exit
bpf2:
2: b7 00 00 00 01 00 00 00 r0 = 1
3: 95 00 00 00 00 00 00 00 exit
bpf3:
4: b7 00 00 00 01 00 00 00 r0 = 1
5: 95 00 00 00 00 00 00 00 exit
bpf_main:
...
Here are available sections:
$ llvm-objdump-5.0 -section-headers ./bpf_main.o
./bpf_main.o: file format ELF64-BPF
Sections:
Idx Name Size Address Type
0 00000000 0000000000000000
1 .strtab 000000a5 0000000000000000
2 .text 00000000 0000000000000000 TEXT DATA
3 sockops 000001f8 0000000000000000 TEXT DATA
4 .relsockops 00000030 0000000000000000
5 maps 0000001c 0000000000000000 DATA
6 .rodata.str1.16 00000021 0000000000000000 DATA
7 .rodata.str1.1 0000000e 0000000000000000 DATA
8 license 00000004 0000000000000000 DATA
9 version 00000004 0000000000000000 DATA
10 .eh_frame 00000090 0000000000000000 DATA
11 .rel.eh_frame 00000040 0000000000000000
12 .symtab 00000138 0000000000000000
It looks like bpftool can't find section .eh_frame?
UPDATE 2
I continue experimenting :-) First of all, I updated libbpf to the latest commit, d77be68955475fc2321e73fe006240248f2f8fef, which fixes a string comparison. Then I rebuilt the program with -fno-asynchronous-unwind-tables, so the .eh_frame section is no longer emitted, and I also gave the programs unique section names, e.g. sockops0, sockops1, etc. Now bpftool prog load .. succeeds, but bpftool prog show dumps only a single program, the one that comes first, which in my case is bpf1().
At the moment I can say that bpf_object__load_progs() reports obj->nr_programs as 4, this makes sense for my example.
This might seem a strange question, but I'm generating a binary file and need to put some data in the header.
I'm using gcc and a fairly standard Cortex M4 bare-bones linker script.
Instead of putting the ISR vector first in the binary, I'm putting my own header. The binary will be copied to a pre-determined memory location (0x20008000, to which it has been linked) and run from there.
My startup.s contains this:
.section .isr_vector,"a",%progbits
.type g_pfnVectors, %object
.size g_pfnVectors, .-g_pfnVectors
g_pfnVectors:
.word 0xDADAC0DE /* magic number */
.word .isr_vector /* link base address */
.word Reset_Handler /* code entry point */
.word _end /* stack start */
.word _estack /* stack end */
.word ProgramVector /* pointer to shared memory block */
This all works fine, with the exception of ProgramVector. I define it in my main.c as follows:
typedef struct {
uint8_t checksum;
int16_t* audio_input;
int16_t* audio_output;
...
} SharedMemory;
SharedMemory ProgramVector;
I would expect the binary to include the address to ProgramVector. If I compare the output to what I see in the linker map file, everything matches (Reset_Handler, _end, _estack) but not ProgramVector.
My binary file app.bin:
00000000 de c0 da da 00 80 00 20 0d 9a 00 20 d8 ce 00 20 |....... ... ... |
00000010 00 c0 01 20 60 ce 00 20 60 ce 00 20 4f 57 4c 20 |... `.. `.. OWL |
00000020 50 72 6f 67 72 61 6d 00 10 b5 05 4c 23 78 33 b9 |Program....L#x3.|
From this we can determine that ProgramVector is 0x2000ce60, while my map file says:
0x2001c000 _estack = 0x2001c000
...
.bss 0x2000ae28 0x38 Build/main.o
0x2000ae28 ProgramVector
...
0x2000ced8 PROVIDE (_end, .)
Now you might say that ProgramVector is not a pointer, which is true enough. But I would expect the linker to output the address where it is placed. If I make it a SharedMemory* pointer, the output has the correct address of the pointer. The problem is I need the address of the struct, and I need it before the program has initialised.
I have tried this with a variety of compiler and linker flags, to no avail. Current compilation looks like this:
arm-none-eabi-gcc -Wl,--gc-sections -TSource/flash.ld -mcpu=cortex-m4 -mthumb -mfloat-abi=hard -mfpu=fpv4-sp-d16 -o Build/app.elf Build/sta./Build/libnosys_gnu.o -lm
arm-none-eabi-objcopy -O binary Build/solo.elf Build/solo.bin
I'm probably missing something obvious but going mad trying to figure out what. Any help, pointers or advice appreciated!
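For reference, the header words in the hexdump above can be decoded mechanically. Below is a minimal sketch that reads them as 32-bit little-endian values; the array app_bin_header simply repeats the first 24 bytes of the dump, and both it and read_le32 are names invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* First 24 bytes of app.bin, copied from the hexdump. */
const uint8_t app_bin_header[24] = {
    0xde, 0xc0, 0xda, 0xda,  0x00, 0x80, 0x00, 0x20,
    0x0d, 0x9a, 0x00, 0x20,  0xd8, 0xce, 0x00, 0x20,
    0x00, 0xc0, 0x01, 0x20,  0x60, 0xce, 0x00, 0x20,
};

/* Reads the index-th 32-bit little-endian word from a byte buffer. */
uint32_t read_le32(const uint8_t *buf, size_t index)
{
    const uint8_t *p = buf + 4 * index;
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Read this way, word 0 is the magic number 0xDADAC0DE, word 1 is the link base 0x20008000, word 3 is _end (0x2000CED8), word 4 is _estack (0x2001C000), and word 5, the ProgramVector slot, is 0x2000CE60 rather than the 0x2000AE28 listed in the map file.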
I have a simple code that I am trying to compile with lm32-rtems4.11-gcc.
I have the code, the compile command and the lst below. When I compile I see a bunch of code added on the top instead of the startup code that I want in there. The code I want the processor to start with after reset is at location 3f4 instead of 0. What I wanted help on is to figure out how the rest of the code got in and find a way to remove it or move all that code to addresses after my code. I appreciate the help.
Thanks
The code:
//FILE: crt.S
.globl _start
.text
_start:
xor r0, r0, r0
mvhi sp, hi(_fstack)
ori sp, sp, lo(_fstack)
mv fp,r0
mvhi r1, hi(_fbss)
ori r1, r1, lo(_fbss)
mvhi r2, hi(_ebss)
ori r2, r2, lo(_ebss)
1:
bge r1, r2, 2f
sw (r1+0), r0
addi r1, r1, 4
bi 1b
2:
calli main
mvhi r1, 0xdead
ori r2, r0, 0xbeef
sw (r1+0), r2
//FILE: hello_world.c
void putc(char c)
{
char *tx = (char*)0xff000000;
*tx = c;
}
void puts(char *s)
{
while (*s) putc(*s++);
}
void main(void)
{
puts("Hello World\n");
}
//FILE: linker.ld
OUTPUT_FORMAT("elf32-lm32")
ENTRY(_start)
__DYNAMIC = 0;
MEMORY {
pmem : ORIGIN = 0x00000000, LENGTH = 0x8000
dmem : ORIGIN = 0x00008000, LENGTH = 0x8000
}
SECTIONS
{
.text :
{
_ftext = .;
*(.text .stub .text.* .gnu.linkonce.t.*)
_etext = .;
} > pmem
.rodata :
{
. = ALIGN(4);
_frodata = .;
*(.rodata .rodata.* .gnu.linkonce.r.*)
*(.rodata1)
_erodata = .;
} > dmem
.data :
{
. = ALIGN(4);
_fdata = .;
*(.data .data.* .gnu.linkonce.d.*)
*(.data1)
_gp = ALIGN(16);
*(.sdata .sdata.* .gnu.linkonce.s.*)
_edata = .;
} > dmem
.bss :
{
. = ALIGN(4);
_fbss = .;
*(.dynsbss)
*(.sbss .sbss.* .gnu.linkonce.sb.*)
*(.scommon)
*(.dynbss)
*(.bss .bss.* .gnu.linkonce.b.*)
*(COMMON)
. = ALIGN(4);
_ebss = .;
_end = .;
} > dmem
}
The compile command
lm32-rtems4.11-gcc -Tlinker.ld -fno-builtin -o hello_world.elf crt.S hello_world.c
lm32-rtems4.11-objdump -DS hello_world.elf > hello_world.lst
The lst file
00000000 <rtems_provides_crt0>:
#include <signal.h> /* sigset_t */
#include <time.h> /* struct timespec */
#include <unistd.h> /* isatty */
void rtems_provides_crt0( void ) {} /* dummy symbol so file always has one */
0: c3 a0 00 00 ret
00000004 <rtems_stub_malloc>:
#define RTEMS_STUB(ret, func, body) \
ret rtems_stub_##func body; \
ret func body
/* RTEMS provides some of its own routines including a Malloc family */
RTEMS_STUB(void *,malloc(size_t s), { return 0; })
4: 34 01 00 00 mvi r1,0
8: c3 a0 00 00 ret
0000000c <malloc>:
c: 34 01 00 00 mvi r1,0
10: c3 a0 00 00 ret
.
.
.
//omitting other such unrelated code that was inserted into the code and going to the
//code at 3f4 that is the code I wanted at 0
000003f0 <__assert_func>:
3f0: c3 a0 00 00 ret
000003f4 <_start>:
3f4: 98 00 00 00 xor r0,r0,r0
3f8: 78 1c 00 00 mvhi sp,0x0
3fc: 3b 9c ff fc ori sp,sp,0xfffc
400: b8 00 d8 00 mv fp,r0
404: 78 01 00 00 mvhi r1,0x0
408: 38 21 84 48 ori r1,r1,0x8448
40c: 78 02 00 00 mvhi r2,0x0
410: 38 42 84 48 ori r2,r2,0x8448
414: 4c 22 00 04 bge r1,r2,424 <_start+0x30>
418: 58 20 00 00 sw (r1+0),r0
41c: 34 21 00 04 addi r1,r1,4
420: e3 ff ff fd bi 414 <_start+0x20>
424: f8 00 00 28 calli 4c4 <main>
428: 78 01 de ad mvhi r1,0xdead
42c: 38 02 be ef mvu r2,0xbeef
430: 58 22 00 00 sw (r1+0),r2
.
.
.
As far as the .elf object you have generated is concerned, execution starts from 0x3f4, not from location 0. That's a result of your linker script specifying the entry point as the _start symbol with ENTRY(_start). Whatever parses the .elf object should jump to that location when transferring execution to the program.
Now, perhaps an .elf object is not what you want to end up with - if the result isn't to be loaded by something which knows how to parse an .elf object, then you may need some other format, such as a flat binary image.
It's quite common when using a gcc elf toolchain with a small embedded chip to turn the .elf object into a flat binary using a command along the lines of
toolchain-prefix-objcopy -O binary something.elf something.bin
It's also possible you may need to create some sort of stub to jump to the _start label, and adjust your linker script to make sure that stub is the first thing in the image.
More generally though, you can probably find a working example for this toolchain and either this processor or a comparable one. Setting up embedded build systems from scratch is a bit tricky, so don't do it the hard way if there's any chance of finding an example to follow.
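To make the point about the entry point concrete: a loader does not assume address 0, it reads the e_entry field from the ELF header. The sketch below shows that lookup for an ELF32 header, using the field offsets from the ELF specification; the function name elf32_entry and the macro names are made up for this example, and no validation of the magic bytes is attempted.

```c
#include <stdint.h>

#define ELF_EI_DATA        5   /* index of the byte-order flag in e_ident */
#define ELF_DATA_MSB       2   /* e_ident[EI_DATA] value for big-endian */
#define ELF32_ENTRY_OFFSET 24  /* e_entry follows e_ident, e_type, e_machine, e_version */

/* Returns e_entry from an ELF32 header of at least 28 bytes. */
uint32_t elf32_entry(const uint8_t *hdr)
{
    const uint8_t *p = hdr + ELF32_ENTRY_OFFSET;

    if (hdr[ELF_EI_DATA] == ELF_DATA_MSB) {
        /* Big-endian object: most significant byte first. */
        return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
             | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    }
    /* Otherwise treat the header as little-endian. */
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```

For the hello_world.elf in the question this field would hold 0x3f4, the address of _start. A flat binary produced by objcopy has no such header, which is why the start code then has to sit at the address the processor begins executing from.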
So I could not figure out why the toolchain does not place the _start label at 0 when I expected linker.ld to tell it to do so. But I did figure out a workaround.
I created a section name for the startup code, and then created a region in memory starting at 0 which I reserved only for this startup code. That seemed to do the trick. I ran the code and got a hello world :) . All the changes I made are marked below with the comments //Change 1, //Change 2 and //Change 3.
//FILE: crt.S
.section .init// Change 1
.globl _start
.text
_start:
xor r0, r0, r0
mvhi sp, hi(_fstack)
ori sp, sp, lo(_fstack)
mv fp,r0
mvhi r1, hi(_fbss)
ori r1, r1, lo(_fbss)
.
.
//linker.ld
OUTPUT_FORMAT("elf32-lm32")
ENTRY(_start)
__DYNAMIC = 0;
MEMORY {
init : ORIGIN = 0x00000000, LENGTH = 0x40 //Change 2
pmem : ORIGIN = 0x00000040, LENGTH = 0x8000
dmem : ORIGIN = 0x00008000, LENGTH = 0x8000
}
SECTIONS
{
.init : {*(.init)}>init //Change 3
.text :
{
_ftext = .;
*(.text .stub .text.* .gnu.linkonce.t.*)
_etext = .;
} > pmem