When I specify the output format to be i386, my executable gets a SIGSEGV. However, when I use the -m elf_i386 option, it works. Checking the man page, these two are different things, since OUTPUT_FORMAT is equivalent to the --oformat option, not to -m.
So, what are the differences between the two, and which should I use in which cases?
Example code:
File hello.c:
int a = 1;
int b;
void _start() {
    /* exit system call */
    asm("movl $1,%eax;"
        "xorl %ebx,%ebx;"
        "int $0x80"
    );
}
File script.lds (OUTPUT_FORMAT and OUTPUT_ARCH seem to do nothing to help my program run):
/* OUTPUT_FORMAT("elf32-i386"); */
/* OUTPUT_ARCH(i386); */
OUTPUT(hello);
ENTRY(_start);
SECTIONS
{
    .text 0x10000:
    {
        *(.text)
    }
    .data 0x8000000:
    {
        *(.data)
    }
    .bss :
    {
        *(.bss)
    }
}
Commands executed:
gcc -m32 -nostdlib -g -c hello.c -o hello.o
ld -m elf_i386 -T script.lds hello.o
The difference really is that emulation means way more than just OUTPUT_ARCH and OUTPUT_FORMAT. Some of the details are almost obvious, like the difference in default linker scripts that can be seen with the --verbose option; some are described in this document; but most of the answers can only be found in the sources: compare, for example, the emulation script for elf_i386 with the emulation script for elf_x86_64. The difference there doesn't look that big, but it isn't the only one, and what actually bites you in your particular case can't even be seen with a diff between the generated (at ld build time) ld/eelf_i386.c and ld/eelf_x86_64.c files, because it boils down to a constant that comes from the bfd library, and that constant also depends on the emulation.
So, let's drill down a little bit and see what happens. Everywhere below, by script.lds I mean your script with OUTPUT_ARCH and OUTPUT_FORMAT uncommented.
Now, let's take a look at the differences in the results first:
$ ld -T script.lds hello.o
$ LC_ALL=C objdump -p hello
hello: file format elf32-i386
Program Header:
LOAD off 0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**21
filesz 0x00010048 memsz 0x00010048 flags r-x
LOAD off 0x00200000 vaddr 0x08000000 paddr 0x08000000 align 2**21
filesz 0x00000004 memsz 0x00000008 flags rw-
STACK off 0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**4
filesz 0x00000000 memsz 0x00000000 flags rw-
$ ld -m elf_i386 -T script.lds hello.o
$ LC_ALL=C objdump -p hello
hello: file format elf32-i386
Program Header:
LOAD off 0x00001000 vaddr 0x00010000 paddr 0x00010000 align 2**12
filesz 0x00000048 memsz 0x00000048 flags r-x
LOAD off 0x00002000 vaddr 0x08000000 paddr 0x08000000 align 2**12
filesz 0x00000004 memsz 0x00000008 flags rw-
STACK off 0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**4
filesz 0x00000000 memsz 0x00000000 flags rw-
Notice that the "bad" binary has a PT_LOAD segment with a virtual address of zero and an alignment of 0x00200000. Virtual address 0 doesn't sound right, but let's see why it really fails. Debugging this is real fun: if you try to use gdb, you get this:
(gdb) run
Starting program: /somewhere/hello
During startup program terminated with signal SIGSEGV, Segmentation fault.
(gdb) bt
No stack.
(gdb) info registers
The program has no registers now.
So the program never actually starts running. Let's look at strace then:
$ strace ./hello
execve("./hello", ["./hello"], [/* 108 vars */]) = -1 EPERM (Operation not permitted)
--- SIGSEGV {si_signo=SIGSEGV, si_code=SI_KERNEL, si_addr=0} ---
+++ killed by SIGSEGV +++
We see that execve() returns EPERM. What kind of permission could fail? Well, it is exactly because of that virtual address of zero: the kernel tries to load our ELF, tries to map the file at virtual address 0, and fails, because around Linux 2.6.23 a security feature (vm.mmap_min_addr) was introduced that forbids doing that. But it can be configured, so after a simple
$ echo 0 > /proc/sys/vm/mmap_min_addr
the "bad" binary suddenly starts working.
But we're not here to make something work (yay!), we're here to look at the differences in ld behaviour. What also differs between our "bad" and "good" binaries is the alignment of the loadable segments. And if you think about it for a while, you'll see that ld's behaviour is actually absolutely correct: when it has an alignment constraint of 0x1000 it uses a virtual address of 0x10000 for the segment start, which is correct for that alignment; but when it has an alignment constraint of 0x200000, given that we have instructed it to put our .text at address 0x10000, it has no other choice but to use a base virtual address of zero!
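A rough back-of-the-envelope check (a simplification of what ld actually computes, but it matches the two program headers above): the first loadable segment effectively starts at the .text address rounded down to the maximum page size:
0x10000 rounded down to 0x1000   (4 KiB pages) = 0x10000
0x10000 rounded down to 0x200000 (2 MiB pages) = 0x0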
So where does this alignment requirement come from? Here we return to our emulation stuff, because the default alignment for both elf_i386 and elf_x86_64 is the maximum page size (obtained from bfd via bfd_emul_get_maxpagesize()), but that page size is different for these two architectures.
You can actually build your binary without the elf_i386 emulation, but in order to do that you need to specify the maximum page size via a parameter, like:
$ ld -T script.lds -z max-page-size=0x1000 hello.o
The resulting binary will not only work without mmap_min_addr tweaks, it will also be bit-for-bit identical to the one built with the proper elf_i386 emulation.
Getting back to the original questions — the difference is huge and subtle in its details. You definitely want to use the right emulation when you build your software. 99.99% of the time your OUTPUT_FORMAT is going to be something very similar to your emulation parameter.
But. Well. There are some cases. Things you normally don't do, but can do if you're careful and there is a need to. For example:
$ head -n 1 script.lds
OUTPUT_FORMAT("srec");
$ ld -T script.lds hello.o
$ file hello
hello: Motorola S-Record; binary data in text format
Exactly the case where your emulation is one thing and OUTPUT_FORMAT really is about the output format that you need for some (strange) reason.
But don't try that at home, please, use proper emulations and forget about all this nightmare.
Related
I am using qemu-arm and the ARM Workbench IDE to run/profile an ARM binary which was built with armcc/armlink (an .axf file, a program written in C). This works fine with Cortex-A9 and ARM926/ARM5TE. However, whatever I try, it doesn't work when the binary is built for Cortex-M4. Both the simulator and qemu-arm hang when M4 is selected as the CPU.
I know that this processor requires some additional startup code, but I could not find any comprehensive tutorial on how to get it running. Does anyone know how to do this? I have a quite big project with one main function, but it would already help if a "hello world" or some simple program which takes arguments would run.
Here is the command line I am using with Cortex-A9:
qemu-system-arm -machine versatileab -cpu cortex-a9 -nographic -monitor null -semihosting -append 'some program arguments' -kernel program.axf
I do not know how to do it with the versatilepb; it did not "just work". But this does work:
flash.s
.thumb
.thumb_func
.global _start
_start:
stacktop: .word 0x20001000
.word reset
.word hang
.thumb_func
reset:
bl notmain
b hang
.thumb_func
hang: b .
.thumb_func
.globl PUT32
PUT32:
str r1,[r0]
bx lr
notmain.c
void PUT32 ( unsigned int, unsigned int );
#define UART0BASE 0x4000C000
int notmain ( void )
{
    unsigned int rx;
    for(rx=0;rx<8;rx++)
    {
        PUT32(UART0BASE+0x00,0x30+(rx&7));
    }
    return(0);
}
flash.ld
ENTRY(_start)
MEMORY
{
rom : ORIGIN = 0x00000000, LENGTH = 0x1000
ram : ORIGIN = 0x20000000, LENGTH = 0x1000
}
SECTIONS
{
.text : { *(.text*) } > rom
.rodata : { *(.rodata*) } > rom
.bss : { *(.bss*) } > ram
}
(I am told the entry point being a thumb function address is critical YMMV)
arm-none-eabi-as --warn --fatal-warnings -mcpu=cortex-m3 flash.s -o flash.o
arm-none-eabi-gcc -Wall -O2 -ffreestanding -mcpu=cortex-m3 -mthumb -c notmain.c -o notmain.o
arm-none-eabi-ld -nostdlib -nostartfiles -T flash.ld flash.o notmain.o -o notmain.elf
arm-none-eabi-objdump -D notmain.elf > notmain.list
arm-none-eabi-objcopy -O binary notmain.elf notmain.bin
check the vector table, etc.
00000000 <_start>:
0: 20001000
4: 0000000d
8: 00000013
0000000c <reset>:
c: f000 f804 bl 18 <notmain>
10: e7ff b.n 12 <hang>
00000012 <hang>:
12: e7fe b.n 12 <hang>
Looks good.
And run it
qemu-system-arm -M lm3s811evb -m 8K -nographic -kernel notmain.bin
01234567
Then ctrl-a then x to exit
QEMU: Terminated
-cpu cortex-m4 works as well, as one would expect. One would have to try to find things that differ between the M3 and M4 that might show up in a sim like this and go from there.
After Luminary Micro (acquired by TI a while ago now) I do not think anyone else has put the effort in for a machine. But as already discussed in at least one question on this site, you can run the cores (an exercise for the reader).
For versatilepb
int notmain ( void )
{
    unsigned int ra;
    for(ra=0;;ra++)
    {
        ra&=7;
        PUT32(0x101f1000,0x30+ra);
    }
    return(0);
}
qemu-system-arm -machine versatileab -cpu cortex-m4 -nographic -monitor null -kernel notmain.elf
qemu-system-arm: This board cannot be used with Cortex-M CPUs
You can't arbitrarily plug different CPU types into an Arm board model. If you try it then the resulting system may work by luck, or may crash, or have odd behaviour; in some cases the -cpu option will just be ignored. This is because the CPU integration with the board matters: things like interrupt controllers are part of the board, not the CPU, but not all CPUs will work with all interrupt controllers. Often QEMU is not as good as it could be about detecting and reporting errors for user options that aren't valid.
In this case you're probably using an older QEMU: newer ones will correctly report:
qemu-system-arm: This board cannot be used with Cortex-M CPUs
if you try to use '-machine versatilepb' with '-cpu cortex-m4'. Older ones would either crash or just misbehave.
Generally the best thing is to use the CPU type that the board has by default (ie don't specify a -cpu option), for every board type except the "virt" board. If you want to write code for a Cortex-M4, you should look for a board type that has a Cortex-M4. The mps2-an386 is probably a good option. (If your QEMU doesn't have that board type, upgrade to a newer one: there have been a lot of M-profile emulation bug fixes anyway that you'll want to have.)
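For example, adapting the command line from the question to that board could look roughly like this (a hedged sketch: the machine name comes from the recommendation above, and whether the semihosted argument passing behaves identically for your .axf is an assumption to verify):
qemu-system-arm -machine mps2-an386 -nographic -monitor null -semihosting -append 'some program arguments' -kernel program.axf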
I am trying to compile the following simple MBR:
.code16
.globl _start
.text
_start:
end:
jmp end
# Don't bother with 0xAA55 yet
I run the following commands:
> as --32 -o boot.o boot.s
> ld -m elf_i386 boot.o --oformat=binary -o mbr -Ttext 0x7c00
However, I get a binary file of more than 129MB, which is strange to me. Thus,
I wanted to know: what is going on in that build process? Thank you very much.
Running objdump over boot.o gives me:
> objdump -s boot.o
boot.o:     file format elf32-i386
Contents of section .text:
0000 ebfe ..
Contents of section .note.gnu.property:
0000 04000000 18000000 05000000 474e5500 ............GNU.
0010 020001c0 04000000 00000000 010001c0 ................
0020 04000000 01000000
Manually removing the .note.gnu.property section before calling ld seems to solve the problem. However, I don't know why this section appears by default... Running the following build commands seems to solve the problem too:
> as --32 -o boot.o boot.s -mx86-used-note=no
> ld -m elf_i386 boot.o --oformat=binary -o mbr -Ttext 0x7c00
ld links all your sections into the flat binary output unless you tell it not to (with a linker script for example).
The extra bytes are from the .note.gnu.property section which the assembler (as) adds; it can indicate stuff like the x86 ISA version used (e.g. AVX2+FMA+BMI2, Haswell feature level, i.e. x86-64-v3). You don't want that in your flat binary, especially not at the default high address far from where you told ld to put your .text section with -Ttext; since the output is a flat binary, that results in a huge file with zeros padding the gap.
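For reference, "telling it not to" with a linker script could look roughly like this minimal sketch (the script name mbr.ld is made up; you would pass it with -T mbr.ld together with --oformat=binary instead of using -Ttext):
/* mbr.ld - keep .text at 0x7c00 and throw away the GNU property notes */
SECTIONS
{
    . = 0x7c00;
    .text : { *(.text) }
    /DISCARD/ : { *(.note*) }
}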
Using as -mx86-used-note=no will omit that section from the .o in the first place, leaving only the sections you define in your asm source. From the GAS manual's i386 options
-mx86-used-note=no
-mx86-used-note=yes
These options control whether the assembler should generate GNU_PROPERTY_X86_ISA_1_USED and GNU_PROPERTY_X86_FEATURE_2_USED GNU
property notes. The default can be controlled by the
--enable-x86-used-note configure option.
Using the -mx86-used-note=no flag with as will remove the note section.
Check here: https://sourceware.org/binutils/docs/as/i386_002dOptions.html
-mx86-used-note=no
-mx86-used-note=yes
These options control whether the assembler should generate GNU_PROPERTY_X86_ISA_1_USED and GNU_PROPERTY_X86_FEATURE_2_USED GNU
property notes. The default can be controlled by the
--enable-x86-used-note configure option.
I am doing cross-compile debugging.
My build server CPU is amd64. My device CPU is MIPS.
When I try to debug the ELF file compiled by myself, gdb can only show ld.so.1:
(gdb) info sharedlibrary
From To Syms Read Shared Object Library
0x7704f9c0 0x7706c490 Yes (*) /lib/ld.so.1
(*): Shared library is missing debugging information.
(gdb) q
I checked the /proc/xxxx/maps file. It showed that the shared libraries are loaded.
root#TRA:/proc/13679# cat maps
......
76549000-76d48000 rwxp 00000000 00:00 0 [stack:13682]
76d48000-76d4a000 r-xp 00000000 00:0c 5268 /usr/lib/strongswan/plugins/libstrongswan-addrblock.so
76d4a000-76d59000 ---p 00002000 00:0c 5268 /usr/lib/strongswan/plugins/libstrongswan-addrblock.so
......
If I debug a file which is installed from the Debian package server, then GDB can show all the shared libraries:
(gdb) info sharedlibrary
From To Syms Read Shared Object Library
0x77341bc0 0x77342c80 Yes (*) /lib/mips-linux-gnu/libdl.so.2
0x771d77e0 0x772ff6f0 Yes (*) /lib/mips-linux-gnu/libc.so.6
0x773549c0 0x77371490 Yes (*) /lib/ld.so.1
(*): Shared library is missing debugging information.
(gdb)
GDB version is:
GNU gdb (Debian 7.7.1+dfsg-5) 7.7.1
My question is:
Why can't the GDB command 'info sharedlibrary' show all the libraries? How can I fix it?
(EDIT)
(Does every executable file need the library ld.so? It is missing.)
The output of the command "mips-linux-gnu-readelf -d src/charon/.libs/charon":
Dynamic section at offset 0x1fc contains 33 entries:
Tag Type Name/Value
0x00000001 (NEEDED) Shared library: [libstrongswan.so.0]
0x00000001 (NEEDED) Shared library: [libhydra.so.0]
0x00000001 (NEEDED) Shared library: [libcharon.so.0]
0x00000001 (NEEDED) Shared library: [libm.so.6]
0x00000001 (NEEDED) Shared library: [libpthread.so.0]
0x00000001 (NEEDED) Shared library: [libdl.so.2]
0x00000001 (NEEDED) Shared library: [libc.so.6]
0x0000001d (RUNPATH) Library runpath: [/usr/lib/strongswan]
0x0000000c (INIT) 0xd00
0x0000000d (FINI) 0x2eb0
0x00000004 (HASH) 0x32c
0x00000005 (STRTAB) 0x904
0x00000006 (SYMTAB) 0x4d4
0x0000000a (STRSZ) 787 (bytes)
0x0000000b (SYMENT) 16 (bytes)
0x70000035 (MIPS_RLD_MAP_REL) 0x134dc
0x00000015 (DEBUG) 0x0
0x00000003 (PLTGOT) 0x13760
0x00000011 (REL) 0xcf0
0x00000012 (RELSZ) 16 (bytes)
0x00000013 (RELENT) 8 (bytes)
0x70000001 (MIPS_RLD_VERSION) 1
0x70000005 (MIPS_FLAGS) NOTPOT
0x70000006 (MIPS_BASE_ADDRESS) 0x0
0x7000000a (MIPS_LOCAL_GOTNO) 18
0x70000011 (MIPS_SYMTABNO) 67
0x70000012 (MIPS_UNREFEXTNO) 37
0x70000013 (MIPS_GOTSYM) 0x11
0x6ffffffb (FLAGS_1) Flags: PIE
0x6ffffffe (VERNEED) 0xca0
0x6fffffff (VERNEEDNUM) 2
0x6ffffff0 (VERSYM) 0xc18
0x00000000 (NULL) 0x0
EDIT
Debugging GDB:
The gdb query ‘qXfer:libraries-svr4:read’ returned an empty library list.
Breakpoint 7, svr4_current_sos_via_xfer_libraries (list=0x7fff8be59ad0, annex=<optimized out>)
at /gdb/gdb-7.11.1/gdb/solib-svr4.c:1301
1301        result = svr4_parse_libraries (svr4_library_document, list);
1: svr4_library_document = 0x15cd9c0 "<library-list-svr4 version=\"1.0\"/>"
(gdb)
For Debian packages which are not compiled by me, the gdb query ‘qXfer:libraries-svr4:read’ returns the full shared library list.
How does gdbserver construct the reply of this query ‘qXfer:libraries-svr4:read’?
EDIT
One more clue:
The packages installed from the Debian Jessie distribution are not PIE code.
The code I compiled is PIE code.
root#TRA:/proc/14956# readelf -r /usr/lib/strongswan/charon
Relocation section '.rel.dyn' at offset 0xcf0 contains 2 entries:
Offset Info Type Sym.Value Sym. Name
00000000 00000000 R_MIPS_NONE
00013870 00000003 R_MIPS_REL32
root#TRA:/proc/14956# readelf -r /usr/bin/id
There are no relocations in this file.
root#TRA:/proc/14956#
EDIT
After debugging gdbserver, I found one strange thing.
The DT_DEBUG entry of the running process is 0. After the loader relocates the code, DT_DEBUG should not be 0, should it? Does the system not support PIE code? I am using a Debian Jessie MIPS system.
gdbserver source code:
if (dyn->d_tag == DT_DEBUG && map == -1)
map = dyn->d_un.d_val;
gdbserver dbg print
(gdb) p *dyn
$19 = {d_tag = 21, d_un = {d_val = 0, d_ptr = 0}}
(gdb)
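For reference, you can walk a process's dynamic array the same way the quoted gdbserver loop does. Here is a minimal sketch that prints the current process's own DT_DEBUG on glibc (it follows the comment in <link.h>); note that on MIPS the .dynamic section is read-only, so the dynamic loader is expected to publish the link map through DT_MIPS_RLD_MAP or DT_MIPS_RLD_MAP_REL instead of patching DT_DEBUG, which is one reason a gdbserver that only looks at DT_DEBUG can end up with 0:
/* dtdebug.c - sketch: scan our own dynamic entries like the gdbserver loop above.
 * Build with something like: gcc -ggdb dtdebug.c -o dtdebug */
#include <stdio.h>
#include <link.h>   /* declares extern ElfW(Dyn) _DYNAMIC[] and struct r_debug */

int main(void)
{
    ElfW(Dyn) *dyn;
    for (dyn = _DYNAMIC; dyn->d_tag != DT_NULL; dyn++)
        if (dyn->d_tag == DT_DEBUG)   /* d_tag 21, as in the print above */
            printf("DT_DEBUG = %#lx\n", (unsigned long)dyn->d_un.d_val);
    return 0;
}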
EDIT
I got some information from this link:
https://sourceware.org/ml/binutils/2015-06/msg00166.html
I installed gdbserver from the Debian Jessie MIPS package server, but it seems it does not support PIE. Where can I get a MIPS gdbserver which supports PIE?
Or how can I stop the gcc compiler from generating PIE code?
I tried these flags (-fno-pie -fPIC) in the cross-compile, but it still generates PIE code:
libtool: link: mips-linux-gnu-gcc -mfp32 -fno-pie -fPIC
-ggdb -O0 -Wall -Wno-format -Wno-format-security
-Wno-pointer-sign -I/cross-mips/usr/include -I/cross-mips/usr/include/libnl3
-I/cross-mips/usr/include/mips-linux-gnu
-I/work/strongswan/src/util
-include /work/strongswan/config.h
-o .libs/charon charon.o -L/cross-mips/lib/mips-linux-gnu
-L/cross-mips -L/cross-mips/usr/lib/mips-linux-gnu
../../src/libstrongswan/.libs/libstrongswan.so
-lm -lpthread -ldl -Wl,-rpath -Wl,/usr/lib/strongswan
Check the generated code:
mips-linux-gnu-readelf -r src/charon/.libs/charon
Relocation section '.rel.dyn' at offset 0xcf0 contains 2 entries:
Offset Info Type Sym.Value Sym. Name
00000000 00000000 R_MIPS_NONE
00013870 00000003 R_MIPS_REL32
Solution
Unfortunately the reason is that my compiler gcc-6 is broken. I used 'gcc version 6.3.0 20170516 (Debian 6.3.0-18)'. It is configured with '--enable-default-pie', and I could not find a way to disable PIE, and this PIE breaks static library links. I have to change my compiler to gcc-5.
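(A note for later readers, as a hedged sketch only: with a gcc configured with --enable-default-pie, the usual way to get a non-PIE executable is to add -no-pie to the link step in addition to -fno-pie on the compile step; keeping -fPIC on the compile line, as in the libtool command above, still produces position-independent objects. Whether that combination worked with this particular gcc-6/libtool setup is not confirmed by this thread.)
mips-linux-gnu-gcc -fno-pie -c charon.c -o charon.o
mips-linux-gnu-gcc -no-pie -o charon charon.o ...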
From the info you provided, it seems that there are two likely causes:
Either you fully strip your binary, and gdbserver requires some symbol, or
You are building a PIE binary, and gdbserver on your system doesn't support such binaries.
(It's also possible that it's the combination of 1 and 2 that causes the problem.)
Since you know that the distribution binaries work, your best bet is probably to understand the differences between them and your binary, and to minimize those differences until gdbserver starts working.
Is it possible to create a basic bare-metal Assembly bootup/startup program using only GNU LD command-line options in lieu of a customary -T scriptfile for a Cortex-M4 target?
I have reviewed the GNU LD documentation and searched various locations including this site; however, I have not found any information suggesting that the exclusive use of command-line options for the GNU linker is possible or not possible.
My attempt to manage the object file layout without a customary vendor-provided *.ld script file is purely academic. This is not homework. I'm not requesting any help with writing the startup assembly code; I'm merely looking for a definitive answer or direction to further resources.
$ arm-none-eabi-ld bootup.o -o bootup @bootup.ld.cli.file
Sample bootup.ld.cli.file content
--entry 0x0
--Ttext=0x0
--section-start .isr_vector=0x0
--section-start _start=0x4
--section-start .MyCode=0x8c
--Tdata=0x20000000
--Tbss=0x20000000
-M=bootup.map
--print-gc-sections
You have your answer right there: -Ttext=number, -Tdata=number and so on are not GNU linker script items, they are GNU ld command-line items. Note the at sign on your command line (ld reads extra command-line options from the named file).
A GNU linker script looks more like this (although most are significantly more complicated, even if they don't need to be):
MEMORY
{
rom : ORIGIN = 0x08000000, LENGTH = 0x1000
ram : ORIGIN = 0x20000000, LENGTH = 0x1000
}
SECTIONS
{
.text : { *(.text*) } > rom
.rodata : { *(.rodata*) } > rom
.bss : { *(.bss*) } > ram
}
Note that the GNU linker is a bit funny when you use the -Ttext=address approach: sometimes it will insert gaps. You might have a few Kbytes of program, and instead of just placing it linearly at that address like it should, it will put some, then pad some dead space, then put some more. I never figured out why, but for extremely limited targets the linker script (vs the command line), all other factors held constant, does not put the gap in the output.
EDIT:
so.s
.cpu cortex-m0
.thumb
.thumb_func
.global _start
_start:
stacktop: .word 0x20001000
.word reset
.word hang
.word hang
.word hang
.word hang
.thumb_func
reset:
b hang
.thumb_func
hang: b .
flash.s
.cpu cortex-m0
.thumb
.thumb_func
.global _start
_start:
stacktop: .word 0x20001000
.word reset
.word hang
.word hang
.word hang
.word hang
.word hang
.thumb_func
reset:
bl notmain
b hang
.thumb_func
hang: b .
.thumb_func
.globl dummy
dummy:
bx lr
flash.ld
MEMORY
{
rom : ORIGIN = 0x08000000, LENGTH = 0x1000
ram : ORIGIN = 0x20000000, LENGTH = 0x1000
}
SECTIONS
{
.text : { *(.text*) } > rom
.rodata : { *(.rodata*) } > rom
.bss : { *(.bss*) } > ram
}
blinker02.c
void dummy ( unsigned int );
int notmain ( void )
{
    unsigned int ra;
    for(ra=0;ra<100;ra++) dummy(ra);
    return(0);
}
Makefile
ARMGNU = arm-none-eabi
AOPS = --warn -mcpu=cortex-m0
COPS = -Wall -O2 -nostdlib -nostartfiles -ffreestanding -mcpu=cortex-m0

all : blinker02.bin sols.bin socl.bin

clean:
	rm -f *.bin
	rm -f *.o
	rm -f *.elf
	rm -f *.list

so.o : so.s
	$(ARMGNU)-as $(AOPS) so.s -o so.o

flash.o : flash.s
	$(ARMGNU)-as $(AOPS) flash.s -o flash.o

blinker02.o : blinker02.c
	$(ARMGNU)-gcc $(COPS) -mthumb -c blinker02.c -o blinker02.o

blinker02.bin : flash.ld flash.o blinker02.o
	$(ARMGNU)-ld -o blinker02.elf -T flash.ld flash.o blinker02.o
	$(ARMGNU)-objdump -D blinker02.elf > blinker02.list
	$(ARMGNU)-objcopy blinker02.elf blinker02.bin -O binary

sols.bin : so.o
	$(ARMGNU)-ld -o sols.elf -T flash.ld so.o
	$(ARMGNU)-objdump -D sols.elf > sols.list
	$(ARMGNU)-objcopy sols.elf sols.bin -O binary

socl.bin : so.o
	$(ARMGNU)-ld -o socl.elf -Ttext=0x08000000 -Tbss=0x20000000 so.o
	$(ARMGNU)-objdump -D socl.elf > socl.list
	$(ARMGNU)-objcopy socl.elf socl.bin -O binary
The only difference between the command-line (socl) and linker-script (sols) list files is the name:
diff sols.list socl.list
2c2
< sols.elf: file format elf32-littlearm
---
> socl.elf: file format elf32-littlearm
I'm not going to bother demonstrating the difference you may see down the road.
For assembly only, you don't need to worry about the no-start-files and other command-line options (on gcc); with C objects you do. By not allowing the linker to use the as-built/configured toolchain's (or let's say C library's) bootstrap code, you have to provide one. If you don't complicate the linker script to the point that specific object files are called out, then the ordering of objects on the command line matters: if you swap flash.o and blinker02.o on the ld command line in the makefile, the binary won't work. You can set entry points all you want, but those are strictly for the loader; if this is bare metal, which it appears to be, then the entry point is useless. The hardware boots how it boots: in this case, with a cortex-m, address zero holds the value to load into the stack pointer and address four holds the address of the reset vector (with the lsbit set, since this is a thumb-only machine; let the tools do that for you by using the GNU-assembler-specific .thumb_func to indicate that the next label is a branch destination address).
I sprinkled cortex-m0 about, one, because that is where I took this code from and, two, because the original armv4t and armv5t, or as called out in the newer ARM docs "all thumb variants", is the most portable ARM instruction set across the ARM cores. With your cortex-m4 you can get rid of that, or perhaps make it -m3 or -m4 to pull in the armv7-m thumb2 extensions.
So the short answer is:
arm-none-eabi-ld -o so.elf -Ttext=0x08000000 -Tbss=0x20000000 so.o
is more than adequate for making working binaries, ASSUMING you don't need .data.
.data requires a lot more stuff: a linker script, a more complicated bootstrap, etc. That, or you do a copy-and-jump thing: compile the REAL program to be run in sram only (different entry point, full-sized arm style, but at the ram base address), then write an ad-hoc tool to take that binary and turn it into, say, .word 0xabcdef entries in a program that copies the whole REAL program from flash to ram and then branches to it. That copy-and-jump program is now flash-only, with no .data nor .bss really needed, and can use the command line; so can the REAL ram-only program. And I probably lost you already on that one.
Likewise, using the command line you cannot, or should not, assume that .bss is zeroed; your bootstrap has to do that too. Now if you have .bss and no .data, then sure, you could blindly zero all of the ram on boot before you branch to your C program's entry point (I use notmain() both because at least one old compiler added unnecessary garbage to the binary if it saw a main() function, and to emphasize the point that normally there is nothing magic about the function named main()).
Linker scripts are toolchain specific, so there is no reason to expect GNU linker scripts to port to Keil or to ARM's own tools (yes, I know ARM owns Keil now; I was referring to RVCT or whatever it is called these days), etc. So that is the first .data/.bss problem. Ideally you want your tools to do the work: they know how big .data and .bss are, so just let them tell you. How you let them tell you is by crafting the linker script right (at least with ld), and that is tricky, but it creates variables, if you will, that can define things like the start address for .bss, the end address for .bss, maybe even some math to subtract them and get the length; likewise for .data. Then in the bootstrap assembly language you can zero out the .bss memory using start address and length, and/or start address and end address. For .data you need two addresses, where you put it in flash (more linker script foo) and where it wants to go in ram, plus the length; then the bootstrap copies it.
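As a sketch of what that bootstrap side can look like in C (the symbol names __data_load__, __data_start__, __data_end__, __bss_start__ and __bss_end__ are assumptions here; they only exist if your linker script defines them, and real bootstraps typically do this in assembly before any .data/.bss is relied upon):
/* hypothetical symbols provided by the linker script */
extern unsigned int __data_load__[];   /* where .data sits in flash (load address) */
extern unsigned int __data_start__[];  /* where .data lives in ram (run address)   */
extern unsigned int __data_end__[];
extern unsigned int __bss_start__[];
extern unsigned int __bss_end__[];

void crt_init ( void )
{
    unsigned int *src = __data_load__;
    unsigned int *dst = __data_start__;
    while (dst < __data_end__) *dst++ = *src++;           /* copy .data flash -> ram */
    for (dst = __bss_start__; dst < __bss_end__; dst++)   /* zero .bss */
        *dst = 0;
}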
So basically, if you write this code:
unsigned int x=5;
unsigned int y;
and you use the command line as your linker script, there is no reason whatsoever to expect x to be 5 or y to be 0 when the first C function that uses those variables is entered. If you assume that x will be 5, then your program will fail.
if you do this instead
unsigned int x;
unsigned int y;
void myfun ( void )
{
    x=5;
    y=0;
}
then those assignments are instructions in .text and not values in .data, so it will always work: command line or not, simple linker script or complicated, etc.
Is there a way to make the stack of a C program executable through compilation?
I did
$ gcc -o convert -g convert.c
and then run
$ readelf -l convert
to check if the stack is executable, but the output was:
GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW 0x4
The correct way to make the stack executable doesn't require that stack canaries be disabled, unlike what the accepted answer suggests.
Here's the correct way:
gcc -z execstack ...
What this does: the -z option of gcc is passed directly on to the linker [source]:
keyword
-z is passed directly on to the linker along with the keyword keyword. See the section in the documentation of your linker for permitted
values and their meanings.
From man ld [source]:
execstack
Marks the object as requiring executable stack.
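A quick way to check that the option took effect (convert.c is carried over from the question; exact readelf formatting varies, but the GNU_STACK flags should now read RWE instead of RW):
$ gcc -z execstack -o convert -g convert.c
$ readelf -l convert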
-fno-stack-protector should do the trick for you.