I'm really confused by scatter files and the steps needed to execute from RAM (for a bootloader).
From my understanding, the startup.S file and sysinit need to execute from Flash, and during that time the vector table needs to be copied over to RAM before jumping to main?
I also don't really understand the purpose of the scatter file: if I am copying the vectors and code to RAM before jumping, why do I even need it?
Here's my (non-working) scatter file:
LR_IROM1 0x14000000 0x00400000 { ; load region size_region
ER_IROM1 0x14000000 0x00400000 { ; load address = execution address
startup.o (RESET, +FIRST)
* (InRoot$$Sections)
}
RW_IRAM1 0x10000000 0x00020000 { ; RW data
*.o
}
RW_IRAM2 0x20000000 0x00010000 {
* (+RO,+RW,+ZI)
}
}
Here is one solution for the bootloader's RAM-based linker script, assuming you are using the GNU linker. There is more than one way to do this.
MEMORY
{
ram : ORIGIN = 0x20000000, LENGTH = 0x2000
}
SECTIONS
{
.text : { *(.text*) } > ram
.rodata : { *(.rodata*) } > ram
.bss : { *(.bss*) } > ram
.data : { *(.data*) } > ram
}
If your code requires .bss to be zeroed, you can add more code to the linker script and more code to the bootstrap, but GNU will do this for you if you use the above and guarantee there is at least one byte of .data somewhere (it will pad .bss with zeros to get the .data items in the right relative place when doing the objcopy to a binary). It is your choice how to solve that one, though. If you don't need .bss zeroed, then swap .data and .bss to make the binary smaller. You are either copying zeros in a very, very efficient loop, or writing zeros in a loop that may be just as efficient if you work out the alignments in the linker script.
The copy-and-jump side would only need something like this:
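If you go the route of zeroing .bss in the bootstrap instead, the loop could look roughly like the sketch below. This is only an illustration in C: the function name zero_words is made up, and on a real target the two pointers would come from linker-script symbols (hypothetical names such as bss_start and bss_end) that the script above does not yet define.

```c
#include <assert.h>
#include <stdint.h>

/* Zero every 32-bit word in [start, end).
 * Assumption: both pointers are 4-byte aligned, which the linker script
 * would have to guarantee with ALIGN(4) around .bss. */
void zero_words(uint32_t *start, uint32_t *end)
{
    assert(start <= end);
    while (start < end) {
        *start++ = 0;
    }
}
```

Writing whole words is what makes the loop cheap, and it is also why the alignment has to be worked out in the linker script first.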
MEMORY
{
rom : ORIGIN = 0x00000000, LENGTH = 0x2000
}
SECTIONS
{
.text : { *(.text*) } > rom
.rodata : { *(.rodata*) } > rom
}
At worst (for a simple assembly copy and jump) you may be able to get rid of the .rodata line.
The above are for GNU ld, assuming that is what you are using. Note that the memory names don't have any meaning; you can instead do this:
MEMORY
{
bob : ORIGIN = 0x00000000, LENGTH = 0x2000
}
SECTIONS
{
.text : { *(.text*) } > bob
.rodata : { *(.rodata*) } > bob
}
or
MEMORY
{
joe : ORIGIN = 0x00000000, LENGTH = 0x2000
}
SECTIONS
{
.text : { *(.text*) } > joe
.rodata : { *(.rodata*) } > joe
}
or
MEMORY
{
pizza : ORIGIN = 0x00000000, LENGTH = 0x2000
}
SECTIONS
{
.text : { *(.text*) } > pizza
.rodata : { *(.rodata*) } > pizza
}
or
MEMORY
{
thehut : ORIGIN = 0x08000000, LENGTH = 0x2000
pizza : ORIGIN = 0x20000000, LENGTH = 0x2000
}
SECTIONS
{
.text : { *(.text*) } > thehut
.rodata : { *(.rodata*) } > thehut
.bss : { *(.bss*) } > pizza
.data : { *(.data*) } > pizza AT > thehut
}
You can attack things on the MEMORY side or on the SECTIONS side with GNU ld. If you feel the need to have two .text sections, perhaps two .data and two .bss, etc. (one set for the copy/jump portion of the bootloader and one for the bootloader itself, with one linker script and one link for the whole bootloader), you can use the "this AT that" construct or take the approach you are taking. But as well as the bootstrap for each being intimately connected to the linker script, you also have to use toolchain-specific solutions to make the bootloader in RAM fit into the proper .text/.bss, etc., either by overriding the section names with something else (.my_bl_text...), by calling out object file names in the linker script, or by some other solution.
Unfortunately, the GNU linker script language has many features, and at the same time the documentation is more of a reference that assumes you already know the language. It is difficult to see how folks who have written elaborate linker scripts figured that out from the existing GNU documentation, never mind writing your first one or modifying someone else's.
I recommend two programs: the RAM program, and the copy/jump program that contains the RAM program as data.
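As a rough sketch of the two-program approach, the copy/jump program can treat the RAM program as a plain array of words and copy it to its link address. The name copy_ram_program is made up for illustration, and the sketch assumes the RAM program's first word is its first instruction, which matches the simple RAM linker script above but is not guaranteed in general.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy `words` 32-bit words of the embedded RAM program to `dst` and
 * return `dst`, which is also the entry point under the assumption
 * stated above. */
uint32_t *copy_ram_program(uint32_t *dst, const uint32_t *src, size_t words)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < words; i++) {
        dst[i] = src[i];
    }
    return dst;
}

/* On the target, the copy/jump program would then do something like:
 *
 *   uint32_t *entry = copy_ram_program((uint32_t *)0x20000000, ram_image, n);
 *   ((void (*)(void))((uintptr_t)entry | 1u))();   // "| 1u" only on Thumb cores
 *
 * where ram_image and n are hypothetical names for the objcopy'd binary
 * and its length in words.
 */
```

Keeping the RAM program as opaque data means each program gets its own trivial linker script, and neither has to know the other's section layout.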
Related
I have a long linker script but at the end of it, I do the following:
.heap (COPY): {
__heap_start = .;
KEEP(*(.heap*))
__heap_end = .;
} > RAM
.stack (COPY): {
KEEP(*(.stack*))
} > RAM
__stack_start = ORIGIN(RAM) + LENGTH(RAM);
__stack_end = __stack_start - SIZEOF(.stack);
PROVIDE(_stack = __stack_start);
ASSERT(__stack_end >= __heap_end, "Stack overflow in region RAM")
This way I can figure out the bare minimum RAM I need for the code to fit. It works fine in ARM and when I look at the linker map, I see .stack and .heap not empty. However, when I do the same for RISC-V, the regions are empty. Is there a reason?
The code I am using for both is similar. I am using arm-none-eabi-gcc and riscv32-unknown-elf-gcc.
I have noticed that in the TI examples for the CC3200 (ARMv7E-M / ARM Cortex-M4), startup_gcc.c copies the data section within the application image to a different location. The application image itself is copied from flash to SRAM by the CC3200's internal bootloader.
The application image is then run from SRAM.
So in my opinion this is a total waste of memory, as it copies the data section to another place in SRAM. Am I missing something? Would removing the copy code from ResetISR and altering the linker file work fine, so that the data is simply used in place within the application image in SRAM?
ResetISR:
uint32_t *pui32Src, *pui32Dest;
pui32Src = &__init_data;
for(pui32Dest = &_data; pui32Dest < &_edata; )
{
*pui32Dest++ = *pui32Src++;
}
Linker:
.text :
{
_text = .;
KEEP(*(.intvecs))
*(.bss.gpCtlTbl)
*(.text*)
*(.ARM.extab* .gnu.linkonce.armextab.*)
. = ALIGN(8);
_etext = .;
} > SRAM
.rodata :
{
*(.rodata*)
} > SRAM
.ARM : {
__exidx_start = .;
*(.ARM.exidx*)
__exidx_end = .;
} > SRAM
__init_data = .;
.data : AT(__init_data)
{
_data = .;
*(.data*)
. = ALIGN (8);
_edata = .;
} > SRAM
Edited linker script without the copy:
.data :
{
_data = .;
*(.data*)
. = ALIGN (8);
_edata = .;
} > SRAM
This kind of thing is normal when you are loading to ROM. I would expect __init_data to point to an address in ROM, in which case the copy loads it from there to RAM.
In your case it appears that everything is already in SRAM, so there is no need to do a copy of the initialized data.
The only question is how the internal bootloader knows how big the image is and how much to copy. As long as it includes the data section in its image size, you should be fine to remove both the copy loop and the : AT(__init_data).
It should be easy to test: just define a static int x = 42; and then if (x == 42) { led(on); } or similar.
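The test suggested above could be written along these lines. This is a sketch only: data_section_ok and require_data_section are made-up names, and led() stands for whatever board-specific indicator you have.

```c
#include <assert.h>
#include <stdbool.h>

/* An initialized static lands in .data. volatile keeps the compiler from
 * folding the comparison below into a constant, so the value really is
 * read from memory at run time. */
static volatile int x = 42;

/* True if .data holds its initial value when this runs. */
bool data_section_ok(void)
{
    return x == 42;
}

/* Hosted-build variant: stop immediately if .data was not initialized.
 * On the target you would more likely do
 *   if (data_section_ok()) { led(on); }
 * with a board-specific led(). */
void require_data_section(void)
{
    assert(data_section_ok());
}
```

Without volatile an optimizing compiler may decide x is always 42 and remove the check, which would make the test pass even when the data section was never loaded.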
I am not aware of the capabilities of the particular processor you are using, but on x86, for example, doing this allows the image to be loaded read-only. The data is then copied to pages that can be written to (on x86 in particular, copy-on-write is generally used for these pages, so that multiple processes can initialize .data from the same memory and pages that are never changed are never copied).
In order to not need this step the image would need to be written with the various sections padded to page-alignment but people generally prefer that the image be as small as possible while containing all needed information.
I have a bootloader program "boot.asm" which must contain a special word at an offset of 510 bytes from the beginning. I also have a kernel source written in C, "kernel.c".
My plan is to call the Kernel (which will be in the second sector of hd) by loading second sector of hard disk from the bootloader program and put it at location 0x8000 in memory.
Now I am compiling both the source files into ELF object files (separately) into "boot.o" and "kernel.o" and then linking them through a linker and outputting a raw binary file "kernel.bin" .
I want to put my bootloader code starting at 0x7c00 and then at the 0x7dfe location I have to put the special word. Then right at 0x8000 I have to place my kernel code. i.e I want to put respective sections of both the object files at different locations.
This is my failed attempt.
ENTRY(boot)
OUTPUT_FORMAT("binary")
SECTIONS{
. = 0x7c00;
.text :
{
*(.boot)
}
.sig : AT(0x7dfe){
SHORT(0xaa55);
}
. = 0x8000;
.text :
{
kernel.o(.text)
}
.rodata :
{
kernel.o(.rodata)
}
.data :
{
kernel.o(.data)
}
.bss :
{
kernel.o(.bss)
}
}
What I have understood is that an executable cannot have a section more than once.
I have limited knowledge about low level programming.
How do I solve this problem?
Thank you.
You need to fix two things: don't split the .text output section, and use AT() to place the kernel immediately after the boot sector in the output binary while keeping its address at 0x8000. For example, a linker script something like this should work:
ENTRY(boot)
OUTPUT_FORMAT("binary")
SECTIONS {
. = 0x7c00;
.boot :
{
*(.boot)
}
. = 0x7dfe;
.sig : {
SHORT(0xaa55);
}
. = 0x8000;
.kernel : AT(0x7e00) /* place immediately after the boot sector */
{
*(.text)
*(.rodata)
*(.data)
_bss_start = .;
*(.bss)
*(COMMON)
_bss_end = .;
}
kernel_sectors = (SIZEOF(.kernel) + 511) / 512;
/DISCARD/ : {
*(.eh_frame)
}
}
I've added some stuff to handle sections you'll see in GCC-compiled object files. The _bss_start and _bss_end symbols can be used to zero out the .bss section, and per Michael Petch's suggestion the kernel_sectors symbol is set to the length of the kernel in 512-byte sectors.
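The kernel_sectors expression is an ordinary round-up division. Written out in C (the function name sectors_for is made up for illustration), it looks like this:

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512u

/* Number of 512-byte sectors needed to hold `bytes` bytes, rounding up.
 * Same arithmetic as (SIZEOF(.kernel) + 511) / 512 in the linker script. */
uint32_t sectors_for(uint32_t bytes)
{
    /* Guard against wrap-around in the addition below. */
    assert(bytes <= UINT32_MAX - (SECTOR_SIZE - 1u));
    return (bytes + SECTOR_SIZE - 1u) / SECTOR_SIZE;
}
```

Adding 511 before dividing is what makes a kernel of 513 bytes count as two sectors rather than one, so the bootloader never reads short.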
I am using Atmel AVR-GCC to compile a project based on Atmel's Zigbee Bitcloud, on the ATmega256RFR2 (256K flash, 32K RAM).
After adding more code, I seem to be approaching the memory limit.
I figured out that if the linker adds too much to the data section, the program behaves unpredictably. The problem is that the linker does not help me find this point, so I am struggling to find a stable solution.
I am using the following linker file provided by Atmel:
OUTPUT_FORMAT("elf32-avr")
OUTPUT_ARCH(avr:6)
MEMORY
{
text (rx) : ORIGIN = 0x00000000, LENGTH = 256K
boot (rx) : ORIGIN = 0x0003F000, LENGTH = 4K
access (rx) : ORIGIN = 0x0003FFF0, LENGTH = 16
data (rw!x) : ORIGIN = 0x00800200, LENGTH = 32K - 500 /* leave 500 bytes for stack */
eeprom (rw!x) : ORIGIN = 0x00810000, LENGTH = 8K
}
SECTIONS
{
.text :
{
PROVIDE(__text_start = .);
*(.vectors)
KEEP(*(.vectors))
. = ALIGN(0x400);
/* PDS NV memory section */
PROVIDE(__d_nv_mem_start = .);
. = ALIGN(0x4400);
PROVIDE(__d_nv_mem_end = .);
/* Non-volatile file system PDS_FF section */
PROVIDE(__pds_ff_start = .);
KEEP(*(.pds_ff))
PROVIDE(__pds_ff_end = .);
/* Non-volatile file system PDS_FD section */
PROVIDE(__pds_fd_start = .);
KEEP(*(.pds_fd))
PROVIDE(__pds_fd_end = .);
*(.progmem.gcc*)
*(.progmem*)
. = ALIGN(2);
*(.trampolines*)
*(.jumptables*)
*(.lowtext*)
*(.init0)
KEEP (*(.init0))
*(.init1)
KEEP (*(.init1))
*(.init2)
KEEP (*(.init2))
*(.init3)
KEEP (*(.init3))
*(.init4)
KEEP (*(.init4))
*(.init5)
KEEP (*(.init5))
*(.init6)
KEEP (*(.init6))
*(.init7)
KEEP (*(.init7))
*(.init8)
KEEP (*(.init8))
*(.text.main)
KEEP (*(.text*main))
*(.text)
*(.text.*)
PROVIDE(__text_end = .);
} > text
.data : AT (ADDR(.text) + SIZEOF(.text))
{
PROVIDE(__data_start = .);
*(.data*)
*(.rodata*)
*(.gnu.linkonce.d*)
. = ALIGN(2);
PROVIDE(__data_end = .);
} > data
.bss __data_end :
{
PROVIDE(__bss_start = .);
*(.bss*)
*(COMMON)
PROVIDE(__bss_end = .);
} > data
.noinit __bss_end :
{
*(.noinit*)
PROVIDE(__heap_start = .);
} > data
__stack_start = .;
__data_load_start = LOADADDR(.data);
__data_load_end = __data_load_start + SIZEOF(.data);
.access_section :
{
KEEP(*(.access_section*))
*(.access_section*)
} > access
.boot_section :
{
*(.boot_section*)
} > boot
.eeprom :
{
FILL(0xff)
BYTE(0xff)
. = . + LENGTH(eeprom)-1;
} > eeprom
/DISCARD/ :
{
*(.init9)
*(.fini9)
}
}
I managed to figure out at which amount of data the code is definitely not working any more and until which amount I do not have an obvious malfunction.
The program is working for a size output of:
text data bss dec hex filename
210260 10914 25427 246601 3c349 (TOTALS)
avr-gcc-size -A:
section size addr
.data 2722 8389120
.text 209468 0
.bss 25426 8391842
.noinit 1 8417268
.access_section 4 262128
.boot_section 798 258048
.eeprom 8192 8454144
.debug_info 538541 0
.debug_abbrev 46706 0
.debug_loc 73227 0
.debug_aranges 5704 0
.debug_ranges 6032 0
.debug_line 108276 0
.debug_str 89073 0
.comment 92 0
.debug_frame 14252 0
Total 1128514
I have obvious malfunction at a size of:
210260 10918 25427 246605 3c34d (TOTALS)
Increasing only the text, but not the data, does not lead to any malfunction, though:
210270 10914 25427 246611 3c353 (TOTALS)
Does anyone have an idea why the program fails at this point? And how can I predict the limit in the future, or make the linker give me a warning when this might happen?
I do not get any linker error message or warning. The program just crashes at this point.
Everything in the .data section takes up Flash and RAM. The part in Flash is used to initialize the part in RAM. You're probably running out of RAM. So my suggestion is to mark as much as possible as const. Doing so moves that data into the .text segment, where it occupies just Flash and leaves RAM for better things.
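One caveat for this particular script: it collects *(.rodata*) into the .data output section, so const on its own may still end up in RAM here. On AVR the usual way to keep a table in Flash is avr-libc's PROGMEM together with pgm_read_byte, and the script does place *(.progmem*) in .text. A minimal sketch follows; the table name and contents are made up, and the #else branch is only a host fallback so the example compiles off-target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef __AVR__
#include <avr/pgmspace.h>   /* PROGMEM, pgm_read_byte */
#else
/* Host fallback (assumption for illustration): plain memory access. */
#define PROGMEM
#define pgm_read_byte(addr) (*(const uint8_t *)(addr))
#endif

/* Hypothetical lookup table kept in Flash instead of RAM. */
static const uint8_t lookup_table[4] PROGMEM = { 0x00, 0x07, 0x0e, 0x09 };

uint8_t table_lookup(size_t i)
{
    assert(i < sizeof lookup_table);
    return pgm_read_byte(&lookup_table[i]);
}
```

The cost is that every access has to go through pgm_read_byte (or a sibling), so this tends to pay off for large, rarely changing tables and strings.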
There are some serious misconceptions going on here.
0x00800200, LENGTH = 32K - 500: 200h is not the same as 500, but the same as 512. Also, 0x00810000 - 0x00800000 is not 32 KiB but 64 KiB. There are lots of such errors all over; whoever set up this linker file didn't quite know what they were doing and didn't know hexadecimal numbers.
"The program is working for a size output of..." 10914 + 25427 = 36341 bytes. How can it work fine? You just said that you have 32 KiB of physical RAM available on the chip, and you also reserve 512 bytes for the stack. It does not work fine; it might seem to work for now, by pure chance.
If you think that your program can work fine when you allocate more memory than what is physically available, there is no hope for you to ever recover this program. Memory cannot be allocated out of thin air. Similarly, you cannot have RW sections that are larger than 256k added together unless there's some special boot area ROM on this chip.
The reason why you don't get any linker warnings might be that you have told the linker that you have 64 KiB available, while the physical chip only has 32 KiB.
Your bss+data sections (both probably go to the data region) exceed your data region by a few kB.
Probably, due to some random behavior, you write over your stack at some point, which crashes your program.
The linker should warn you if a section does not fit its region.
I think the only way to be sure no issues will occur is to extend the data region (if your board has more RAM), or to decrease the size of your initialized + uninitialized data.
Or maybe some of your initialized data goes to the eeprom region, and only after you add a few bytes do you overflow data. To be sure, use avr-something-size -A yourexecutable, which should show more detail.
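Working from the -A listing already posted in the question, the per-section arithmetic can be checked as in the sketch below. The sizes are copied from that listing; the interpretation is an assumption, namely that the "data" column of the totals appears to include .eeprom, since 2722 + 8192 = 10914. The function names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Sizes copied from the avr-gcc-size -A output in the question. */
enum {
    DATA_SIZE          = 2722,
    BSS_SIZE           = 25426,
    NOINIT_SIZE        = 1,
    EEPROM_SIZE        = 8192,
    DATA_REGION_LENGTH = 32 * 1024 - 500   /* LENGTH of "data" in the script */
};

/* Bytes of the "data" memory region taken by .data, .bss and .noinit. */
int32_t ram_used(void)
{
    return DATA_SIZE + BSS_SIZE + NOINIT_SIZE;
}

/* Bytes of the region left over for heap and stack. */
int32_t ram_left(void)
{
    return DATA_REGION_LENGTH - ram_used();
}

/* What the "data" column of the totals seems to add up (assumption). */
int32_t totals_data_column(void)
{
    return DATA_SIZE + EEPROM_SIZE;
}

/* Host-side sanity check that the sections fit the region at all. */
void check_fits(void)
{
    assert(ram_left() >= 0);
}
```

On these numbers the sections in the data region total 28149 bytes, which leaves 4119 bytes of the region (plus the 500 bytes held back) for heap and stack. If that reading is right, the linker sees no overflow, and a crash after adding a few bytes would point at the stack growing down into .bss rather than at the sections themselves.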
I'm writing a bare metal ARM boot loader and am trying to use some internal SRAM as a scratch pad to communicate to the application code. For my needs I don't need to initialise or zero the memory. Using this script I can place my desired variables in the memory just fine.
/**
* Linker script for secondary bootloader.
*
* Allocates the first 1Mb of DRAM for its use.
* Scratchpad in internal SRAM.
*/
MEMORY
{
SRAM : o = 0x402F0400, l = 0x0000FC00 /* 63kB available internal SRAM */
DDR0 : o = 0x80000000, l = 1M /* 1Mb external DDR Bank 0 */
}
OUTPUT_FORMAT("elf32-littlearm", "elf32-littlearm", "elf32-littlearm")
OUTPUT_ARCH(arm)
SECTIONS
{
.startcode :
{
__AppBase = .;
. = ALIGN(4);
*init.o (.text)
} >DDR0
.text :
{
. = ALIGN(4);
*(.text*)
*(.rodata*)
} >DDR0
.data :
{
. = ALIGN(4);
*(.data*)
} >DDR0
.bss :
{
. = ALIGN(4);
_bss_start = .;
*(.bss*)
*(COMMON)
_bss_end = .;
} >DDR0
.stack :
{
. = ALIGN(4);
__StackLimit = . ;
*(.stack*)
. = __AppBase + 1M;
__StackTop = .;
} >DDR0
_stack = __StackTop;
.internal_ram :
{
. = ALIGN(4);
*(.internal_ram*)
} >SRAM
}
When using objcopy to create the raw binary, I'm getting huge files. I'm assuming this is because the first bytes of the raw binary are actually the internal memory, with megabytes of padding up to the start of the .text section. objdump -h shows the internal_ram section marked with the CONTENTS, LOAD, and DATA flags even though the variables placed there are not initialised.
I can clean this up in objcopy using --remove-section=.internal_ram, but it seems there should be a way to get the linker to recognise that the data is not initialised.
Is there a way to mark the section appropriately?
The correct section declaration is:
.internal_ram (NOLOAD) :
{
. = ALIGN(4);
*(.internal_ram*)
} >SRAM
The NOLOAD section attribute is documented but speaks in terms of program loaders handling the section at load time. At first this doesn't seem to apply to bare metal images but, for that purpose, objcopy acts like a program loader and honors the flag settings in the object file, omitting the section from the raw image.
The other answer mentions this as well - the key is to make the section NOLOAD so that the data remains uninitialized.
The "(NOLOAD)" directive will mark a section to not be loaded at run time. The linker will process the section normally, but will mark it so that a program loader will not load it into memory.
A quote from Ashley Duncan that you might find useful:
NOLOAD is useful in embedded projects for making sure a block of RAM is not initialised or zeroed. For example if you want the contents of that RAM to not lose its values during a software reset (e.g. if you want to set a variable with the reason you are resetting). Another useful application is to pass information from a boot loader to application without the application startup code overwriting the values of that memory area. Of course in this case both the boot loader and application linker files need to declare the exact same memory area location and size.
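Because a NOLOAD area is never initialised, whoever reads it has to be able to tell valid contents from power-on garbage. A common way is a magic word plus a simple check value, as in the sketch below. Everything here is illustrative: the struct layout, the names, and the magic value are made up, and the section attribute is only applied on ARM builds so the example also compiles on a host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__arm__)
/* Matches the *(.internal_ram*) input pattern in the script above. */
#define SCRATCHPAD __attribute__((section(".internal_ram")))
#else
#define SCRATCHPAD   /* host build: ordinary storage */
#endif

#define HANDOFF_MAGIC 0xB007C0DEu   /* hypothetical marker value */

struct handoff {
    uint32_t magic;
    uint32_t reset_reason;
    uint32_t check;      /* magic ^ reset_reason */
};

SCRATCHPAD struct handoff g_handoff;

/* Bootloader side: record why we are handing over. */
void handoff_write(uint32_t reason)
{
    g_handoff.magic = HANDOFF_MAGIC;
    g_handoff.reset_reason = reason;
    g_handoff.check = HANDOFF_MAGIC ^ reason;
}

/* Application side: returns false if the area holds garbage. */
bool handoff_read(uint32_t *reason)
{
    assert(reason != NULL);
    if (g_handoff.magic != HANDOFF_MAGIC ||
        g_handoff.check != (g_handoff.magic ^ g_handoff.reset_reason)) {
        return false;
    }
    *reason = g_handoff.reset_reason;
    return true;
}
```

Both images would need the same struct definition and the same section placement in their linker scripts, which is the point made in the quote above.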