ld ignores size of nobits input section - linker

When working on a small 32-bit kernel for the x86 architecture I discovered something strange with how ld handles nobits sections.
In my kernel I define a .bootstrap_stack section which holds a temporary stack for the initialisation part of the system. I also hold symbols for the beginning and end of the stack. This input section is redirected to the .bss output section. Each output section of my kernel has a symbol for the beginning and end of the section.
The problem is that in the final executable the symbol for the end of the stack lies after the end of the .bss section. In the examples below, the symbols stack_top and _kernel_ebss (and _kernel_end) have the same value, which isn't what I wanted.
I expected _kernel_ebss to equal stack_bottom.
However once I rename .bootstrap_stack to .bss this does not happen. Removing nobits also works, but the resulting binary is considerably larger.
Here are the stripped files that reproduce my issue:
boot.s
section .bootstrap_stack, nobits ; this does not work
;section .bootstrap_stack ; this works
;section .bss ; this also works
stack_top:
resb 8096
stack_bottom:
section .text
global _start
_start:
hlt
jmp _start
linker.ld
ENTRY(_start)
SECTIONS
{
. = 0xC0100000;
_kernel_start = .;
.text ALIGN(4K) : AT(ADDR(.text) - 0xC0000000)
{
_kernel_text = .;
*(.multiboot)
*(.text)
_kernel_etext = .;
}
.bss ALIGN(4K) : AT(ADDR(.bss) - 0xC0000000)
{
_kernel_bss = .;
*(COMMON)
*(.bss)
*(.bootstrap_stack)
_kernel_ebss = .;
}
_kernel_end = .;
}
Here are the symbols:
$ objdump -t kernel | sort
00000000 l df *ABS* 00000000 boot.s
c0100000 g .text 00000000 _kernel_start
c0100000 g .text 00000000 _kernel_text
c0100000 g .text 00000000 _start
c0100000 l d .text 00000000 .text
c0100003 g .text 00000000 _kernel_etext
c0101000 g .text 00000000 _kernel_bss
c0101000 g .text 00000000 _kernel_ebss
c0101000 g .text 00000000 _kernel_end
c0101000 l .bootstrap_stack, 00000000 stack_top
c0101000 l d .bootstrap_stack, 00000000 .bootstrap_stack,
c0102fa0 l .bootstrap_stack, 00000000 stack_bottom
By renaming .bootstrap_stack to .bss I get what I expected.
00000000 l df *ABS* 00000000 boot.s
c0100000 g .text 00000000 _kernel_start
c0100000 g .text 00000000 _kernel_text
c0100000 g .text 00000000 _start
c0100000 l d .text 00000000 .text
c0100003 g .text 00000000 _kernel_etext
c0101000 g .bss 00000000 _kernel_bss
c0101000 l .bss 00000000 stack_top
c0101000 l d .bss 00000000 .bss
c0102fa0 g .bss 00000000 _kernel_ebss
c0102fa0 g .bss 00000000 _kernel_end
c0102fa0 l .bss 00000000 stack_bottom
My question is whether this is expected behaviour of ld. If yes, what is the problem with my example, because as far as I understand .bss is also a nobits section, but it produces the expected result?

Okay I figured it out.
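Apparently you're not supposed to put a comma right after the name of the section. NASM takes the comma as part of the name, and objdump shows the section as ".bootstrap_stack," which confirms the mistake. A section with that name does not match *(.bootstrap_stack) in the linker script, so ld never places it inside the .bss output section.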
So
section .bootstrap_stack, nobits
should be
section .bootstrap_stack nobits

Related

Unexpected value for the linker script variables

I am trying to write my own linker script.
The current version is here:
MEMORY
{
ROM (rx) : ORIGIN = 0x00000000, LENGTH = 0x00004000
RAM (rwx) : ORIGIN = 0x00004000, LENGTH = 0x00004000
}
STACK_SIZE = 0x3000;
BOOT_PC = 0x1000;
/* Section Definitions */
SECTIONS
{
/* Code and constants */
.text :
{
*(.rodata*);
KEEP(*(.vectors .vectors.*));
. = BOOT_PC;
KEEP(*start.o(.text*));
*(.text*);
_etext = . ;
_idata = . ;
} > ROM
/* Uninitialized data */
.bss (NOLOAD) :
{
_sbss = . ;
*(.bss*);
*(COMMON);
_ebss = . ;
} > RAM
/* Initialized data */
.data : AT(_idata)
{
_sdata = . ;
*(.data*);
_edata = . ;
} > RAM
/* Stack */
.stack (NOLOAD):
{
. = ALIGN(8);
. = . + STACK_SIZE;
. = ALIGN(8);
_stack = . ;
} > RAM
}
In my C program, I have global variables which are supposed to go in .bss (foo) and .data(init, a1 and a2) sections:
int foo;
int init = 4;
int a1 = 4;
int a2 = 4;
When I use objdump, I have the following result:
elf/noste.elf: file format elf32-littleriscv
SYMBOL TABLE:
00000000 l d .text 00000000 .text
00004000 l d .sbss 00000000 .sbss
00004004 l d .sdata 00000000 .sdata
00004010 l d .stack 00000000 .stack
00000000 l d .comment 00000000 .comment
00000000 l d .riscv.attributes 00000000 .riscv.attributes
00000000 l df *ABS* 00000000 start.o
0000103c l .text 00000000 _end_trigger
00001050 l .text 00000000 _end_loop
00000000 l df *ABS* 00000000 main.c
00000000 l df *ABS* 00000000 reset.c
0000105c g F .text 0000008c reset_handler
00003000 g *ABS* 00000000 STACK_SIZE
000010e8 g .text 00000000 _etext
00004000 g .text 00000000 _sbss
00004008 g O .sdata 00000004 a1
00004004 g .sdata 00000000 _sdata
00004000 g .text 00000000 _ebss
000010e8 g .text 00000000 _idata
00001000 g .text 00000000 _start
0000400c g O .sdata 00000004 init
00001054 g F .text 00000008 main
00004004 g O .sdata 00000004 a2
00004000 g O .sbss 00000004 foo
00004004 g .sdata 00000000 _edata
0000103c g .text 00000000 _end
00007010 g .stack 00000000 _stack
00001000 g *ABS* 00000000 BOOT_PC
As expected, the different C variables are placed in the .sbss and .sdata sections.
However, _ebss and _edata are not incremented and have the same values as _sbss and _sdata.
Instead of _ebss = 00004000 and _edata = 00004004, I expected _ebss = 00004004 and _edata = 00004010.
Any idea what is causing this?
Thanks for the help.

"kernel must be loaded first"

I wrote a mini boot loader and a simple kernel that prints a string. I followed this playlist step by step (the first 3 videos, to be precise). When I boot my virtual machine with my ISO, I get these messages:
"error: no multiboot header found."
"error: you need to load the kernel first."
I tried modifying some sections of the assembly code in the boot file, but without success.
Here is the code:
boot.s
.set MAGIC, 0x1badb002
.set FLAGS, (1<<0 | 1<<1)
.set CHECKSUM, -(MAGIC + FLAGS)
.section .multiboot
.long MAGIC
.long FLAGS
.long CHECKSUM
.section .text
.extern kernel_main
.extern call_constructors
.global loader
loader:
mov $kernel_stack, %esp
call call_constructors
push %eax
push %ebx
call kernel_main
_stop:
cli
hlt
jmp _stop
.section .bss
.space 2*1024*1024 ;#2 MiB
kernel_stack:
kernel.c
#include <sys/types.h>
void printf(char * str)
{
uint16_t * VideoMemory = (uint16_t *)0xb8000;
for(int32_t i = 0; str[i] != '\0'; i++)
VideoMemory[i] = (VideoMemory[i] & 0xFF00) | str[i];
}
typedef void (*constructor)();
extern "C" constructor start_ctors;
extern "C" constructor end_ctors;
extern "C" void call_constructors()
{
for(constructor* i = &start_ctors; i != &end_ctors; i++)
(*i)();
}
extern "C" void kernel_main(const void * multiboot_structure, uint32_t magic_number)
{
printf("Denos - Version: 0.0.1a");
for(;;);
}
NOTE: sys/types.h comes from my own library, whose include path is passed to gcc as an argument.
linker.ld
ENTRY(loader)
OUTPUT_FORMAT(elf32-i386)
OUTPUT_ARCH(i386:i386)
SECTIONS
{
. = 0x0100000;
.text :
{
*(.multiboot)
*(.text*)
*(.rodata)
}
.data :
{
start_ctors = .;
KEEP(*( .init_array ));
KEEP(*(SORT_BY_INIT_PRIORITY( .init_array.* )));
end_ctors = .;
*(.data)
}
.bss :
{
*(.bss)
}
/DISCARD/ :
{
*(.fini_array*)
*(.comment)
}
}
Makefile
GPPPARAMS = -m32 -fno-use-cxa-atexit -nostdlib -fno-builtin -fno-rtti -fno-exceptions -fno-leading-underscore -I ../include/
ASPARAMS = --32
objects = boot.o kernel.o
run: denos.iso
(killall VirtualBox && sleep 1) || true
VirtualBox --startvm 'denos' &
%.o: %.c
gcc $(GPPPARAMS) -o $@ -c $<
%.o: %.s
as $(ASPARAMS) -o $@ $<
kernel.bin : linker.ld $(objects)
ld $(LDPARAMS) -T $< -o $@ $(objects)
install: kernel.bin
sudo cp $< ./boot/kernel.bin
denos.iso: kernel.bin
mkdir iso
mkdir iso/boot
mkdir iso/boot/grub
cp kernel.bin iso/boot/kernel.bin
echo 'set timeout=0' > iso/boot/grub/grub.cfg
echo 'set default=0' >> iso/boot/grub/grub.cfg
echo '' >> iso/boot/grub/grub.cfg
echo 'menuentry "Denos" {' >> iso/boot/grub/grub.cfg
echo ' multiboot /boot/kernel.bin' >> iso/boot/grub/grub.cfg
echo ' boot' >> iso/boot/grub/grub.cfg
echo '}' >> iso/boot/grub/grub.cfg
grub-mkrescue --output=denos.iso iso
rm -rf iso
mv -f denos.iso /home/data/libvirt_iso/
objdump -x kernel.bin (requested)
kernel.bin: file format elf32-i386
kernel.bin
architecture: i386, flags 0x00000112:
EXEC_P, HAS_SYMS, D_PAGED
start address 0x0010000c
Program Header:
LOAD off 0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**21
filesz 0x001001ac memsz 0x003001ac flags rwx
STACK off 0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**4
filesz 0x00000000 memsz 0x00000000 flags rwx
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 00000100 00100000 00100000 00100000 2**0
CONTENTS, ALLOC, LOAD, READONLY, CODE
1 .eh_frame 000000a0 00100100 00100100 00100100 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
2 .got.plt 0000000c 001001a0 001001a0 001001a0 2**2
CONTENTS, ALLOC, LOAD, DATA
3 .bss 00200000 001001ac 001001ac 001001ac 2**0
ALLOC
SYMBOL TABLE:
00100000 l d .text 00000000 .text
00100100 l d .eh_frame 00000000 .eh_frame
001001a0 l d .got.plt 00000000 .got.plt
001001ac l d .bss 00000000 .bss
00000000 l df *ABS* 00000000 boot.o
1badb002 l *ABS* 00000000 MAGIC
00000003 l *ABS* 00000000 FLAGS
e4524ffb l *ABS* 00000000 CHECKSUM
003001ac l .bss 00000000 kernel_stack
0010001d l .text 00000000 _stop
00000000 l df *ABS* 00000000 kernel.c
00000000 l df *ABS* 00000000
001001a0 l O .got.plt 00000000 _GLOBAL_OFFSET_TABLE_
001001a0 g .got.plt 00000000 start_ctors
001000e0 g F .text 00000000 .hidden __x86.get_pc_thunk.ax
001000c2 g F .text 0000001e kernel_main
001000e4 g F .text 00000000 .hidden __x86.get_pc_thunk.bx
001001a0 g .got.plt 00000000 end_ctors
00100088 g F .text 0000003a call_constructors
00100021 g F .text 00000067 _Z6printfPc
0010000c g .text 00000000 loader
I'm so sorry for wasting everyone's time. I checked the Makefile and found that I was missing 'LDPARAMS = -melf_i386'. Now it boots and prints the string.
Thanks anyway.
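A note on why the loader reported "no multiboot header found": the Multiboot specification requires the header to lie completely within the first 8192 bytes of the image and to be 32-bit aligned, and the objdump output above shows .text at file offset 0x00100000, far beyond that window. The following is only a rough sketch of such a check, not GRUB's actual code. The names find_multiboot_header and read_le32 are made up for illustration, and only the three mandatory header fields are examined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MULTIBOOT_MAGIC        0x1BADB002u
#define MULTIBOOT_SEARCH_BYTES 8192u

/* Read a little-endian 32-bit word from an arbitrary byte buffer. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper: return the byte offset of the first valid
 * Multiboot 1 header in the image, or -1 if the three mandatory fields
 * do not fit inside the first 8192 bytes at a 4-byte aligned offset.
 * A header is valid when magic + flags + checksum == 0 (mod 2^32).
 */
long find_multiboot_header(const unsigned char *image, size_t len)
{
    assert(image != NULL || len == 0);
    size_t limit = len < MULTIBOOT_SEARCH_BYTES ? len : MULTIBOOT_SEARCH_BYTES;

    for (size_t off = 0; off + 12 <= limit; off += 4) {
        uint32_t magic = read_le32(image + off);
        uint32_t flags = read_le32(image + off + 4);
        uint32_t checksum = read_le32(image + off + 8);
        if (magic == MULTIBOOT_MAGIC &&
            (uint32_t)(magic + flags + checksum) == 0)
            return (long)off;
    }
    return -1;
}
```

With MAGIC = 0x1badb002 and FLAGS = 3, the checksum that makes the sum wrap to zero is 0xe4524ffb, which matches the CHECKSUM symbol in the dump. A header placed at offset 0x100000 of the file would make this function return -1.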

Object file created by objcopy is not compatible

I have created an object file from a binary file using objcopy as below:
objcopy -I binary -O elf32-little --rename-section .data=.text file.bin file.o
In one of the linker script sections I have included the following to place that file into that section:
file.o (.text)
But I get the following error:
skipping incompatible file.o when searching for file.o
error: ld returned 1 exit status
I am developing for an ARM microcontroller, so I believe the file format "elf32-little" is correct.
Any help is much appreciated.
#####################################################################
UPDATE FOLLOWING THE INCBIN path:
I have tried a new approach and, although I have made some progress, I am still not quite there.
This is my assembly file:
.section .text.audio_binary
.global audio_start
audio_start:
.incbin "AudioData.bin"
.global audio_start
audio_end:
.byte 0
.global audio_size
audio_size:
.int audio_start - audio_start
This is the object file I get:
raw_audio_binary.o: file format elf32-little
SYMBOL TABLE:
00000000 l d .text 00000000 .text
00000000 l d .data 00000000 .data
00000000 l d .bss 00000000 .bss
00000000 l d .text.audio_binary 00000000 .text.audio_binary
00069a78 l .text.audio_binary 00000000 audio_end
00000000 l .text.audio_binary 00000000 $d
00000000 l d .ARM.attributes 00000000 .ARM.attributes
00000000 g .text.audio_binary 00000000 audio_start
00069a79 g .text.audio_binary 00000000 audio_size
And this is the section I have in my linker script:
.text_Flash3 : ALIGN(4)
{
FILL(0xff)
*(.text.$Flash3*)
*(.text.$AUDIO*) *(.rodata.$Flash3*)
*(.text.audio_binary*) /* audio binary */
*(.rodata.$AUDIO*) } > AUDIO
For some reason the linker does NOT place the data in this section (or in any).
Any ideas what is wrong?
I apologise in advance if something is very wrong here, I am new to linker scripts so still understanding them...
If you have a sufficiently recent version of GAS, you can use this to create an object file from a binary input file using the .incbin directive:
.section .rodata
.globl input_wav
input_wav:
.incbin "input.wav"
.globl input_wav_size
input_wav_size:
.long . - input_wav

Huge Binary size while ld Linking

I have a linker script that links code for imx6q(cortex-A9):
OUTPUT_FORMAT("elf32-littlearm", "elf32-bigarm", "elf32-littlearm")
OUTPUT_ARCH(arm)
ENTRY(Reset_Handler)
/* SEARCH_DIR(.) */
GROUP(libgcc.a libc.a)
/* INPUT (crtbegin.o crti.o crtend.o crtn.o) */
MEMORY {
/* IROM (rwx) : ORIGIN = 0x00000000, LENGTH = 96K */
IRAM (rwx) : ORIGIN = 0x00900000, LENGTH = 256K
IRAM_MMU (rwx): ORIGIN = 0x00938000, LENGTH = 24K
IRAM_FREE(rwx): ORIGIN = 0x00907000, LENGTH = 196K
DDR (rwx) : ORIGIN = 0x10000000, LENGTH = 1024M
}
/* PROVIDE(__cs3_heap_start = _end); */
SECTIONS {
.vector (ORIGIN(IRAM) + LENGTH(IRAM) - 144 ):ALIGN (32) {
__ram_vectors_start = . ;
. += 72 ;
__ram_vectors_end = . ;
. = ALIGN (4);
} >IRAM
. = ORIGIN(DDR);
.text(.) :ALIGN(8) {
*(.entry)
*(.text)
/* __init_array_start = .; */
/* __init_array_end = .; */
. = ALIGN (4);
__text_end__ = .;
} >DDR
.data :ALIGN(8) {
*(.data .data.*)
__data_end__ = .;
}
.bss(__data_end__) : {
. = ALIGN (4);
__bss_start__ = .;
*(.shbss)
*(.bss .bss.* .gnu.linkonce.b.*)
*(COMMON)
__bss_end__ = .;
}
/* . += 10K; */
/* . += 5K; */
top_of_stacks = .;
. = ALIGN (4);
. += 8;
free_memory_start = .;
.mmu_page_table : {
__mmu_page_table_base__ = .;
. = ALIGN (16K);
. += 16K;
} >IRAM_MMU
_end = .;
__end = _end;
PROVIDE(end = .);
}
When I build, the binary size is just 6 KB, but I cannot add any initialized variable. When I add one, the binary size jumps to ~246 MB. Why is that? I tried linking the data segment at the location following the text section, both by specifying the exact location and by providing >DDR for the data segment. Although this seems to reduce the binary size back to 6 KB, the binary fails to boot. How can I keep my code in DDR and the data, bss, stack and heap in the internal RAM, with a small binary size?
I read in another thread ("linker script wastes my memory") that "using MEMORY tag in linker script should solve the problem of memory waste". How can this be done?
Please ask if anything else is needed. I don't have any experience with linker scripts. Please help.
The readelf --sections output of the binary with no initialized data is as follows:
There are 19 section headers, starting at offset 0xd804:
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .vector NOBITS 0093ff80 007f80 000048 00 WA 0 0 32
[ 2] .text PROGBITS 10000000 008000 0016fc 00 AX 0 0 8
[ 3] .text.vectors PROGBITS 100016fc 0096fc 000048 00 AX 0 0 4
[ 4] .text.proc PROGBITS 10001744 009744 000034 00 AX 0 0 4
[ 5] .bss NOBITS 0093ffc8 007fc8 000294 00 WA 0 0 4
[ 6] .mmu_page_table NOBITS 00938000 008000 004000 00 WA 0 0 1
[ 7] .comment PROGBITS 00000000 009778 00001f 01 MS 0 0 1
[ 8] .ARM.attributes ARM_ATTRIBUTES 00000000 009797 00003d 00 0 0 1
[ 9] .debug_aranges PROGBITS 00000000 0097d8 000108 00 0 0 8
[10] .debug_info PROGBITS 00000000 0098e0 0018a7 00 0 0 1
[11] .debug_abbrev PROGBITS 00000000 00b187 00056f 00 0 0 1
[12] .debug_line PROGBITS 00000000 00b6f6 00080e 00 0 0 1
[13] .debug_frame PROGBITS 00000000 00bf04 000430 00 0 0 4
[14] .debug_str PROGBITS 00000000 00c334 0013dd 01 MS 0 0 1
[15] .debug_ranges PROGBITS 00000000 00d718 000020 00 0 0 8
[16] .shstrtab STRTAB 00000000 00d738 0000cb 00 0 0 1
[17] .symtab SYMTAB 00000000 00dafc 000740 10 18 60 4
[18] .strtab STRTAB 00000000 00e23c 000511 00 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings)
I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
O (extra OS processing required) o (OS specific), p (processor specific)
And the readelf --sections output of the binary with initialized data is:
There are 20 section headers, starting at offset 0xd82c:
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .vector NOBITS 0093ff80 007f80 000048 00 WA 0 0 32
[ 2] .text PROGBITS 10000000 008000 0016fc 00 AX 0 0 8
[ 3] .text.vectors PROGBITS 100016fc 0096fc 000048 00 AX 0 0 4
[ 4] .text.proc PROGBITS 10001744 009744 000034 00 AX 0 0 4
[ 5] .data PROGBITS 0093ffc8 007fc8 000004 00 WA 0 0 8
[ 6] .bss NOBITS 0093ffcc 007fcc 000294 00 WA 0 0 4
[ 7] .mmu_page_table NOBITS 00938000 008000 004000 00 WA 0 0 1
[ 8] .comment PROGBITS 00000000 009778 00001f 01 MS 0 0 1
[ 9] .ARM.attributes ARM_ATTRIBUTES 00000000 009797 00003d 00 0 0 1
[10] .debug_aranges PROGBITS 00000000 0097d8 000108 00 0 0 8
[11] .debug_info PROGBITS 00000000 0098e0 0018b6 00 0 0 1
[12] .debug_abbrev PROGBITS 00000000 00b196 000580 00 0 0 1
[13] .debug_line PROGBITS 00000000 00b716 00080e 00 0 0 1
[14] .debug_frame PROGBITS 00000000 00bf24 000430 00 0 0 4
[15] .debug_str PROGBITS 00000000 00c354 0013dd 01 MS 0 0 1
[16] .debug_ranges PROGBITS 00000000 00d738 000020 00 0 0 8
[17] .shstrtab STRTAB 00000000 00d758 0000d1 00 0 0 1
[18] .symtab SYMTAB 00000000 00db4c 000770 10 19 62 4
[19] .strtab STRTAB 00000000 00e2bc 000513 00 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings)
I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
O (extra OS processing required) o (OS specific), p (processor specific)
Hope this is enough...!!!
Note: I am using arm-none-eabi-gcc for linking.
If you are not experienced with linker scripts then either use one that just works or make or borrow a simpler one. Here is a simple one, and this should demonstrate what is most likely going on.
MEMORY
{
bob : ORIGIN = 0x00001000, LENGTH = 0x100
ted : ORIGIN = 0x00002000, LENGTH = 0x100
alice : ORIGIN = 0x00003000, LENGTH = 0x100
}
SECTIONS
{
.text : { *(.text*) } > bob
.data : { *(.data*) } > ted
.bss : { *(.bss*) } > alice
}
First program
.text
.globl _start
_start:
mov r0,r1
mov r1,r2
b .
This is not meant to be a real program; it just creates some bytes in a section.
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .text PROGBITS 00001000 001000 00000c 00 AX 0 0 4
12 bytes in .text which is at address 0x1000 in memory which is exactly what we told it to do.
If I use objcopy a.elf -O binary a.bin I get a 12-byte file, as expected. The "binary" file format is a memory image, starting with the first address that has some content in the address space and ending with the last byte of content in the address space. So instead of 0x1000+12 bytes, the binary is 12 bytes, and the user has to know that it needs to be loaded at 0x1000.
So change this up a little:
.text
.globl _start
_start:
mov r0,r1
mov r1,r2
b .
.data
some_data: .word 0x12345678
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .text PROGBITS 00001000 001000 00000c 00 AX 0 0 4
[ 2] .data PROGBITS 00002000 002000 000004 00 WA 0 0 1
Now we have 12 bytes at 0x1000 and 4 bytes at 0x2000, so -O binary has to give us one memory image from the first defined byte to the last, which would be 0x1000+4 bytes.
Sure enough 4100 bytes that is exactly what it did.
.text
.globl _start
_start:
mov r0,r1
mov r1,r2
b .
.data
some_data: .word 0x12345678
.bss
some_more_data: .word 0
which gives
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .text PROGBITS 00001000 001000 00000c 00 AX 0 0 4
[ 2] .data PROGBITS 00002000 002000 000004 00 WA 0 0 1
[ 3] .bss NOBITS 00003000 003000 000004 00 WA 0 0 1
I still got only a 4100-byte file, which is not surprising: it is assumed that the bootstrap is going to zero .bss, so adding it didn't grow the "binary" file.
There is an intimate relationship, a system-level design, between the linker script and the bootstrap. For what it appears you are trying to do (just RAM, no ROM) you can probably get away with a much simpler linker script, on par with the one I have, but if you care about .bss being zeroed then there are some tricks you can use:
MEMORY
{
ram : ORIGIN = 0x00001000, LENGTH = 0x3000
}
SECTIONS
{
.text : { *(.text*) } > ram
.bss : { *(.bss*) } > ram
.data : { *(.data*) } > ram
}
Make sure there is at least one .data item and your "binary" will have the complete image with .bss already zeroed; the bootstrap simply needs to set the stack pointer(s) and jump to main (if this is for C).
Anyway, hopefully you can see that the jump from 12 bytes to 4100 bytes was caused by adding a .data element: the "binary" format has to pad the file so that it is a memory image from the lowest address with data to the highest address with data (from 0x1000 to 0x2000+sizeof(.data)-1 in this case). Change the linker script, the 0x1000 and the 0x2000, and this all changes. Swap them, putting .text at 0x2000 and .data at 0x1000, and the "binary" file has to be 0x2000-0x1000+sizeof(.text) rather than 0x2000-0x1000+sizeof(.data), or 0x100C bytes instead of 0x1004. Go back to the first linker script and put .data at 0x20000000, and the "binary" will be 0x20000000-0x1000+sizeof(.data) bytes, because that is how much information, including padding, is required to make a memory image in a single file.
It is most likely that is what is going on. As demonstrated here the file size went from 12 bytes to 4100 by simply adding one word of data.
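The size arithmetic above can be sketched in a few lines of C. This is only a model of the behaviour described here, not objcopy's implementation. The struct section type and flat_image_size are names invented for the example, and it assumes each section's load address equals the address readelf prints.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One allocated section, as read off a readelf --sections listing. */
struct section {
    uint32_t addr;     /* load address (assumed equal to Addr) */
    uint32_t size;     /* size in bytes */
    int      has_bits; /* 1 for PROGBITS, 0 for NOBITS such as .bss */
};

/*
 * Hypothetical helper: estimate the size of the flat "binary" image,
 * i.e. the span from the lowest address holding file content to the end
 * of the highest such section. NOBITS sections contribute nothing.
 * Returns 0 when no section has content.
 */
uint64_t flat_image_size(const struct section *s, size_t count)
{
    assert(s != NULL || count == 0);
    uint64_t lo = UINT64_MAX, hi = 0;

    for (size_t i = 0; i < count; i++) {
        if (!s[i].has_bits || s[i].size == 0)
            continue;
        uint64_t start = s[i].addr;
        uint64_t end = start + s[i].size;
        if (start < lo) lo = start;
        if (end > hi) hi = end;
    }
    return hi > lo ? hi - lo : 0;
}
```

Fed the three sections of the last example (.text at 0x1000, .data at 0x2000, .bss at 0x3000) it gives 4100. Fed the PROGBITS sections from the question's second readelf listing (.data at 0x0093ffc8, .text at 0x10000000, and so on up to the end of .text.proc at 0x10001778) it gives 258742192 bytes, roughly 246 MiB, which is in line with the size reported in the question.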
EDIT.
Well, if you NOLOAD the data then your initialized variable won't be initialized, it is that simple.
unsigned int x = 5;
will not be a 5 if you discard (NOLOAD) .data.
As has been stated and stated again, you can have the data put in the .text section and then use more linker script foo to have the bootstrap find that data.
MEMORY
{
bob : ORIGIN = 0x00001000, LENGTH = 0x100
ted : ORIGIN = 0x00002000, LENGTH = 0x100
alice : ORIGIN = 0x00003000, LENGTH = 0x100
}
SECTIONS
{
.text : { *(.text*) } > bob
.data : { *(.data*) } > ted AT > bob
.bss : { *(.bss*) } > alice AT > bob
}
This creates a 16 byte "binary" file. the 12 bytes of instruction and the 4 bytes of .data. But you don't know where the data is unless you do some hardcoding which is a bad idea. This is where things like bss_start and bss_end are found in your linker script.
something like this
MEMORY
{
bob : ORIGIN = 0x00001000, LENGTH = 0x100
ted : ORIGIN = 0x00002000, LENGTH = 0x100
alice : ORIGIN = 0x00003000, LENGTH = 0x100
}
SECTIONS
{
.text : { *(.text*) } > bob
.data : {
__data_start__ = .;
*(.data*)
} > ted AT > bob
__data_end__ = .;
__data_size__ = __data_end__ - __data_start__;
.bss : { *(.bss*) } > alice AT > bob
}
.text
.globl _start
_start:
mov r0,r1
mov r1,r2
b .
hello:
.word __data_start__
.word __data_end__
.word __data_size__
.data
some_data: .word 0x12345678
which gives us.
Disassembly of section .text:
00001000 <_start>:
1000: e1a00001 mov r0, r1
1004: e1a01002 mov r1, r2
1008: eafffffe b 1008 <_start+0x8>
0000100c <hello>:
100c: 00002000 andeq r2, r0, r0
1010: 00002004 andeq r2, r0, r4
1014: 00000004 andeq r0, r0, r4
Disassembly of section .data:
00002000 <__data_start__>:
2000: 12345678 eorsne r5, r4, #120, 12 ; 0x7800000
The toolchain/linker creates the names defined in the linker script and fills their values into your code when it resolves those externals. Your bootstrap then needs to use those variables, and more that I didn't include here, such as where to find the .data in the .text. You know from the above that there are 4 bytes and that they need to land at 0x2000, but where in the 0x1000 .text area are those 4 bytes found? More linker script foo. Also, note that GNU linker scripts are very sensitive as to where you define those variables: before or after the squiggly brackets can have different results.
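For what it is worth, the bootstrap side of that contract usually boils down to one copy loop and one clear loop. Below is a minimal sketch in C, assuming word-aligned sections whose sizes are multiples of 4. The function name init_data_and_bss and its parameters are made up for illustration; a real bootstrap would pass in addresses and sizes taken from linker-script symbols such as __data_start__ and __data_size__, plus a load-address symbol that the scripts above do not define.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical bootstrap helper: copy the initial values of .data from
 * where they are stored (load address, e.g. inside the ROM image) to
 * where the program expects them (run address), then clear .bss.
 * Sizes are in bytes and assumed to be multiples of 4.
 */
void init_data_and_bss(const uint32_t *data_load, uint32_t *data_start,
                       size_t data_size, uint32_t *bss_start, size_t bss_size)
{
    assert(data_size % 4 == 0 && bss_size % 4 == 0);

    /* Copy .data word by word from its load address to its run address. */
    for (size_t i = 0; i < data_size / 4; i++)
        data_start[i] = data_load[i];

    /* Zero .bss so uninitialized globals start at 0, as C expects. */
    for (size_t i = 0; i < bss_size / 4; i++)
        bss_start[i] = 0;
}
```

On a bare-metal target this has to run before anything touches a global variable, which is why it often ends up written in assembly right after the stack pointer is set. The C version is just easier to read.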
This is why I mentioned it appeared you were using ram. If this is a rom based target and you want .data and zeroed .bss then you pretty much have to put the .data and the size and location of .bss in the flash/rom area and the bootstrap has to copy and zero. Alternatively, you can choose not to use .data nor .bss
unsigned int x=5;
unsigned int y;
instead
unsigned int x;
unsigned int y;
...
x=5;
y=0;
Yes, it is not as efficient binary size-wise, but linker scripts are very much toolchain dependent, and with gnu for example over time the linker script language/rules change, what worked on a prior major version of gnu ld doesn't necessarily work on the current or next, I have had to re-architect my minimal linker script over the years as a result.
As demonstrated here, you can use your command line tools to experiment with settings and locations and see what the toolchain has produced.
Bottom line: it sounds like you added some information in .data, but then state you want to NOLOAD it, which basically means that .data isn't there/used and your variables are not initialized correctly. So why bother changing the code to cause all this to happen only to have it not work anyway? Either have .data and use it right, with the right bootstrap and linker script pair, or if it is RAM only just pack it all up into the same RAM space, or don't use the "binary" format you are using; use elf or ihex or srec or other.
Another trick depending on your system is to build the binary for ram, all packed up, then have another program that wraps around that binary runs from rom and copies to ram and jumps. Take the 16 byte program above, write another that includes those 16 bytes from that build, and copies them to 0x1000 and then branches to 0x1000. Depending on the system and the flash/rom technology and interface you may wish to do this anyway, the system from my day job uses a spi flash to boot, which are known to have read-disturb problems and are...spi... So the fastest, cleanest, most reliable solution, is to do the copy jump before doing anything else. Making the linker script much easier as a free side effect.

Difference between OUTPUT_ARCH(arm) and OUTPUT_ARCH(armv4) in linker script

Among other things I am trying to understand the difference between OUTPUT_ARCH(arm) and OUTPUT_ARCH(armv4).
Assume we have the following files (I have used the linker script example from here as a basis):
main.c:
int main(void)
{
test_1();
test_2();
return 0;
}
main.lds:
OUTPUT_ARCH(arm)
SECTIONS
{
. = 0x10000;
.text : { *(.text) }
. = 0x8000000;
.data : { *(.data) }
.bss : { *(.bss) }
}
test_1.c:
void test_1(void)
{
return;
}
test_2.c:
void test_2(void)
{
return;
}
If we compile it and dump its contents, we get the following:
c:\SysGCC\arm-elf\bin>arm-elf-gcc.exe test_1.c -c
c:\SysGCC\arm-elf\bin>arm-elf-gcc.exe test_2.c -c
c:\SysGCC\arm-elf\bin>arm-elf-objdump.exe -x test_1.o
test_1.o: file format elf32-littlearm
test_1.o
architecture: arm, flags 0x00000010:
HAS_SYMS
start address 0x00000000
private flags = 200: [APCS-32] [FPA float format] [software FP]
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 00000014 00000000 00000000 00000034 2**2
CONTENTS, ALLOC, LOAD, READONLY, CODE
1 .data 00000000 00000000 00000000 00000048 2**0
CONTENTS, ALLOC, LOAD, DATA
2 .bss 00000000 00000000 00000000 00000048 2**0
ALLOC
3 .comment 00000012 00000000 00000000 00000048 2**0
CONTENTS, READONLY
4 .ARM.attributes 00000010 00000000 00000000 0000005a 2**0
CONTENTS, READONLY
SYMBOL TABLE:
00000000 l df *ABS* 00000000 test_1.c
00000000 l d .text 00000000 .text
00000000 l d .data 00000000 .data
00000000 l d .bss 00000000 .bss
00000000 l d .comment 00000000 .comment
00000000 l d .ARM.attributes 00000000 .ARM.attributes
00000000 g F .text 00000014 test_1
c:\SysGCC\arm-elf\bin>arm-elf-gcc.exe -static -nostartfiles -T main.lds -o main.elf test_1.o test_2.o
c:\SysGCC\arm-elf\bin>arm-elf-objdump.exe -x main.elf
main.elf: file format elf32-littlearm
main.elf
architecture: arm, flags 0x00000112:
EXEC_P, HAS_SYMS, D_PAGED
start address 0x00010000
Program Header:
LOAD off 0x00008000 vaddr 0x00010000 paddr 0x00010000 align 2**15
filesz 0x00000028 memsz 0x00000028 flags r-x
private flags = 200: [APCS-32] [FPA float format] [software FP]
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 00000028 00010000 00010000 00008000 2**2
CONTENTS, ALLOC, LOAD, READONLY, CODE
1 .comment 00000011 00000000 00000000 00008028 2**0
CONTENTS, READONLY
2 .ARM.attributes 00000010 00000000 00000000 00008039 2**0
CONTENTS, READONLY
SYMBOL TABLE:
00010000 l d .text 00000000 .text
00000000 l d .comment 00000000 .comment
00000000 l d .ARM.attributes 00000000 .ARM.attributes
00000000 l df *ABS* 00000000 test_1.c
00000000 l df *ABS* 00000000 test_2.c
00010014 g F .text 00000014 test_2
00010000 g F .text 00000014 test_1
But if I change OUTPUT_ARCH(arm) to OUTPUT_ARCH(armv4), I get an error from the linker:
c:\SysGCC\arm-elf\bin>arm-elf-gcc.exe -static -nostartfiles -T main.lds -o main.elf test_1.o test_2.o
c:/sysgcc/arm-elf/bin/../lib/gcc/arm-elf/4.6.3/../../../../arm-elf/bin/ld.exe: error: test_1.o uses software FP, whereas main.elf uses hardware FP
c:/sysgcc/arm-elf/bin/../lib/gcc/arm-elf/4.6.3/../../../../arm-elf/bin/ld.exe: failed to merge target specific data of file test_1.o
c:/sysgcc/arm-elf/bin/../lib/gcc/arm-elf/4.6.3/../../../../arm-elf/bin/ld.exe: error: test_2.o uses software FP, whereas main.elf uses hardware FP
c:/sysgcc/arm-elf/bin/../lib/gcc/arm-elf/4.6.3/../../../../arm-elf/bin/ld.exe: failed to merge target specific data of file test_2.o
collect2: ld returned 1 exit status
It can be fixed by specifying the -mfloat-abi=hard option. In this case the private flags differ from the previous output:
c:\SysGCC\arm-elf\bin>arm-elf-gcc.exe -mfloat-abi=hard test_1.c -c
c:\SysGCC\arm-elf\bin>arm-elf-gcc.exe -mfloat-abi=hard test_2.c -c
c:\SysGCC\arm-elf\bin>arm-elf-objdump.exe -x test_1.o
test_1.o: file format elf32-littlearm
test_1.o
architecture: arm, flags 0x00000010:
HAS_SYMS
start address 0x00000000
private flags = 0: [APCS-32] [FPA float format]
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 00000014 00000000 00000000 00000034 2**2
CONTENTS, ALLOC, LOAD, READONLY, CODE
1 .data 00000000 00000000 00000000 00000048 2**0
CONTENTS, ALLOC, LOAD, DATA
2 .bss 00000000 00000000 00000000 00000048 2**0
ALLOC
3 .comment 00000012 00000000 00000000 00000048 2**0
CONTENTS, READONLY
4 .ARM.attributes 00000010 00000000 00000000 0000005a 2**0
CONTENTS, READONLY
SYMBOL TABLE:
00000000 l df *ABS* 00000000 test_1.c
00000000 l d .text 00000000 .text
00000000 l d .data 00000000 .data
00000000 l d .bss 00000000 .bss
00000000 l d .comment 00000000 .comment
00000000 l d .ARM.attributes 00000000 .ARM.attributes
00000000 g F .text 00000014 test_1
c:\SysGCC\arm-elf\bin>arm-elf-gcc.exe -static -nostartfiles -T main.lds -o main.elf test_1.o test_2.o
c:\SysGCC\arm-elf\bin>arm-elf-objdump.exe -x main.elf
main.elf: file format elf32-littlearm
main.elf
architecture: arm, flags 0x00000112:
EXEC_P, HAS_SYMS, D_PAGED
start address 0x00010000
Program Header:
LOAD off 0x00008000 vaddr 0x00010000 paddr 0x00010000 align 2**15
filesz 0x00000028 memsz 0x00000028 flags r-x
private flags = 0: [APCS-32] [FPA float format]
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 00000028 00010000 00010000 00008000 2**2
CONTENTS, ALLOC, LOAD, READONLY, CODE
1 .comment 00000011 00000000 00000000 00008028 2**0
CONTENTS, READONLY
2 .ARM.attributes 00000010 00000000 00000000 00008039 2**0
CONTENTS, READONLY
SYMBOL TABLE:
00010000 l d .text 00000000 .text
00000000 l d .comment 00000000 .comment
00000000 l d .ARM.attributes 00000000 .ARM.attributes
00000000 l df *ABS* 00000000 test_1.c
00000000 l df *ABS* 00000000 test_2.c
00010014 g F .text 00000014 test_2
00010000 g F .text 00000014 test_1
Does it mean that OUTPUT_ARCH(armv4) causes the linker to generate output solely for hard float?
In general, what is the difference between OUTPUT_ARCH(arm) and OUTPUT_ARCH(armv4)?
According to the ld manual, OUTPUT_ARCH() specifies a particular output machine architecture.
The argument is one of the names used by the BFD library.
But I have found no clear information about the BFD library beyond general descriptions.
I use the arm-elf toolchain from here (Binutils 2.22, GCC 4.6.3, Newlib 1.20.0, GDB 7.4).
Thank you in advance for help.
UPDATE 1:
This update is a reply to the comment below.
Compiler -v output from the old toolchain we use now:
Using built-in specs.
Target: arm-elf
Configured with: ../gcc-4.4.1/configure --target=arm-elf --host=i686-pc-mingw32 --with-cpu=xscale --without-stabs -nfp --prefix=/c/cross-gcc/4.4.1 --disable-nls --disable-shared --disable-__cxa_atexit --enable-threads --with-gnu-gcc --with-gnu-ld --with-gnu-as --with-dwarf2 --enable-languages=c,c++ --enable-interwork --disable-multilib --with-gmp=/c/cross-gcc/4.4.1 --with-mpfr=/c/cross-gcc/4.4.1 --with-newlib --with-headers=../../newlib-1.17.0/newlib-1.17.0/newlib/libc/include --disable-libssp --disable-libstdcxx-pch --disable-libmudflap --disable-libgomp -v
Thread model: single
gcc version 4.4.1 (GCC)
Compiler -v output from newer toolchain I used in the examples (SysGCC arm-elf):
Using built-in specs.
COLLECT_GCC=arm-elf-gcc.exe
COLLECT_LTO_WRAPPER=c:/sysgcc/arm-elf/bin/../libexec/gcc/arm-elf/4.6.3/lto-wrapper.exe
Target: arm-elf
Configured with: ../gcc-4.6.3/configure --target arm-elf --enable-win32-registry=SysGCC-arm-elf-4.6.3 --prefix /c/gnu/auto/bu-2.22+gcc-4.6.3+gmp-4.2.4+mpfr-2.4.1+mpc-0.8+newlib-1.20.0-arm-elf/ --enable-languages=c,c++ --disable-nls --with-newlib --with-headers=../newlib-1.20.0/newlib/libc/include --enable-interwork --enable-multilib --with-float=soft
Thread model: single
gcc version 4.6.3 (GCC)
There is no difference between the linker output for OUTPUT_ARCH(arm) and OUTPUT_ARCH(armv4) with the old compiler. I should have checked that first.
It seems that this answers the question.
My goal is to use the combination -mfpu=vfpv3 -mfloat-abi=hard, but according to the Debian documentation and the GCC 4.4.7 manual this combination is not supported by GCC 4.4.
In fact, if I try to compile with -mfpu=vfpv3 -mfloat-abi=hard using the old compiler, it returns an error:
sorry, unimplemented: -mfloat-abi=hard and VFP
It is still possible to use -mfpu=vfpv3 -mfloat-abi=softfp with the old compiler, but according to this comparison it adds significant overhead for small routines.
