Relocation of data from flash to RAM during boot phase

Relocation of data from flash to RAM during boot phase - c

I'm currently trying to solve a problem which requires moving data from flash to RAM during the booting phase. Right now everything is only being simulated using a microcontroller architecture which is based on the open-source PULPissimo. For simulation I use QuestaSim by Mentor Graphics. Toolchain is GNU.
Unfortunately I have pretty much zero experience on how to relocate data during the boot phase so I've read some posts and tutorials on this topic but I'm still confused about quite a few thing.
The situation is as follows: I set my boot mode to boot from flash which in this case means that the code will already reside pre-loaded inside the flash memory. The code is just a simply hello world or any other program really. When I simulate everything is compiled and the modules are loaded. After the boot phase the output "hello world" is displayed and the simulation is done. This means everything works as intended which is obviously a good sign and a good starting point.
Side note: As far as i know the PULPissimo architecture does not support direct boot from flash at the moment so the data from flash has to be moved to RAM (which they call L2) and executed.
From what I understand there are multiple things involved in the booting process. Please correct me if anything in the next paragraph is wrong:
First: The code that will be executed. It's written in C and has to be translated into a language which the architecture understands. This should be done automatically and reside in the flash memory pre boot phase. Considering that the code is actually being executed as mentioned above there is not much confusion here.
Second: The bootloader. This is also written in C. It is also translated and will be burned into ROM later on so changing this wouldn't make much sense. It loads the data which is neccessary for booting. It can also differentiate if you want to boot from flash or JTAG.
Third: The main startup file crt0.S. This is one of the things that confuse me, especially what it exactly does and what the difference between the bootloader and the main startup file is. Wikipedia (yes i know...) defines it as: "crt0 (also known as c0) is a set of execution startup routines linked into a C program that performs any initialization work required before calling the program's main function." So does that mean that it has noting to do with the boot phase but instead kind of "initializes" and/or loads only the code that I want to execute?
Fourth: The linker script link.ld. Even tho this is the part I read the most about, there are still quite a lot of questions. From what I understand the linker script contains information on where to relocate data. The data that is to be relocated is the data of the code i want to execute(?). It consists of different parts explained here.
.text program code;
.rodata read-only data;
.data read-write initialized data;
.bss read-write zero initialized data.
Sometimes I see more than those sections, not just text, rodata, data, bss. But how does the linker script know what the "text" is and what the "data" is and so on?
I know that's quite a lot and probably pretty basic stuff for a lot of you but I'm genuinely confused.
What I am trying to accomplish is relocating data from flash to RAM during the boot phase. Not only the code that I want to execute but more data that is also located in the flash memory. Consider the following simple scenario: I want to run a hello world C program. I want to boot from flash. Up to this point nothing special and everything works fine. Now after the data of the code I also load more data into flash, let's say 256 bytes of A (hex) so I can check my memory in QuestaSim by looking for AAAAAAAA sections. I also want to say where I want that data to be loaded during boot phase, for example 0x1C002000. I tried playing around with the crt0.S and the linker.ld files but with no success. The only time it actually worked was when I modified the bootloader.c file but I have to assume that this is already burned into ROM and i can't do any modifications on it. To be honest I'm not even sure if what I'm trying to do is even possible without any changes to the bootloader.c.
Thank you for your time.
Update
So I was playing around a bit and tried to create a simple example to understand what's happening and what manipulations or relocations I can do.
First I created a C file which basically contains only data.
Lets call it my_test_data.c
int normal_arr[] = {0x55555555, 0x55555555, 0x55555555, 0x55555555, 0x55555555, 0x55555555, 0x55555555, 0x55555555};
int attribute_arr[] __attribute__ ((section(".my_test_section"))) = {0x66666666, 0x66666666, 0x66666666, 0x66666666, 0x66666666, 0x66666666, 0x66666666, 0x66666666};
static int static_arr[] = {0x77777777, 0x77777777, 0x77777777, 0x77777777, 0x77777777, 0x77777777, 0x77777777, 0x77777777};
int normal_var = 0xCCCCCCCC;
static int static_var = 0xDDDDDDDD;
int result_var;
Then I created the object file. I looked into it via objdump and could see my section my_test_section :
4 .my_test_section 00000020 00000000 00000000 00000054 2**2
After that I tried to modify my linker script so that this section would be loaded to an address that I specified. These are the lines I added in the linker script (probably more than needed). It is not the whole linker script!:
CUT01 : ORIGIN = 0x1c020000, LENGTH = 0x1000
.my_test_section : {
. = ALIGN(4);
KEEP(*(.my_test_section))
_smytest = .;
*(.my_test_section)
*(.my_test_section.*)
_endmytest = .;
} > CUT01
I wanted to see what data from my_test_data.c gets moved and where it gets moved. Remember that my goal is to have the data inside the RAM (Addr.: 0x1c020000) after booting (or during booting however you prefer). Unfortunately only:
int normal_arr[] = {0x55555555, 0x55555555, 0x55555555, 0x55555555, 0x55555555, 0x55555555, 0x55555555, 0x55555555};
gets moved into ROM (Addr.: 0x1A000000) as it seems to be part of the .text section (iirc) which is already being handled by the linker script:
.text : {
. = ALIGN(4);
KEEP(*(.vectors))
_stext = .;
*(.text)
*(.text.*)
_etext = .;
*(.lit)
( ... more entries ...)
_endtext = .;
} > ROM
What also confuses me is the fact that I can add this line in the above .text section:
*(.my_test_section)
and then the data from the attribute_arr will be located in ROM but if I try to move it to the address I added (CUT01) nothing will ever end up there.
I also generated the map file which also lists my_test_section. This is an excerpt from the map file (don't mind the locations of where the output files are on my machine).
.my_test_section
0x000000001c020000 0x3c
0x000000001c020000 _mts_start = .
*(.text)
*(.text.*)
*(.comment)
.comment 0x000000001c020000 0x1a /.../bootloader.o
0x1b (size before relaxing)
.comment 0x000000001c02001a 0x1b /.../my_test_data.o
*(.comment.*)
*(.rodata)
*(.rodata.*)
*(.data)
*(.data.*)
*(.my_test_section)
*fill* 0x000000001c02001a 0x2
.my_test_section
0x000000001c02001c 0x20 /.../my_test_data.o
0x000000001c02001c attribute_arr
*(.my_test_section.*)
*(.bss)
*(.bss.*)
*(.sbss)
*(.sbss.*)
0x000000001c02003c . = ALIGN (0x4)
0x000000001c02003c _mts_end = .
OUTPUT(/.../bootloader elf32-littleriscv)
I will continue to try to get this to work but right now I'm kind of confused as to why it seems like my_test_section gets recognized but not moved to the location which I specified. This makes me wonder if I made a mistake (or several mistakes) in the linker script or if one of the other files (bootloader.c or crt0.S) might be the reason.

There is a lot being asked here. I'm going to take a stab at answering part of the questions. you ask:
But how does the linker script know what the "text" is and what the
"data" is and so on?
The additional, custom, sections, and the predefined sections, are handled differently.
Custom sections usually require the related variables to have the section specified with a pragma.
The standard sections are defined by their type:
text: this is the code. that should be clear; the instructions to the computer of what to do, not the data
rodata: const data -- such as literal strings (eg. "This is a literal string" in the code. A good compiler/linker should put variables defined as 'const' (not const parameters) in the rodata section as well.
bss: static or global variables which are not initialized when declared:
int global_var_not_a_good_idea; // not in a function; local variables are different
static int anUninitializedArray[10];
data: static or global variables which are initialized when declared
int initializedGlobalVarStillNotRecommended = 10;
static int initializedArray[] = { 1, 2, 3, 4, 5, 6};
This data should be copied to RAM when the program loads.
EDIT:
Somewhere in your startup code should be a reset handler. This function will be called on processor reset. It should be the function that copies data to RAM, possibly clears the zero segment, initializes the C library, etc. When finished with initializations, it should call main();
Here is an example (in this case, from generated or example code for the Atmel SAMG55 processor, but the idea should be the same) of relocating data to RAM.
In the linker script memory space definitions (I'm going to leave out the real numbers):
ram (rwx) : ORIGIN = 0x########, LENGTH = 0x########
in the linker script section definitions:
.relocate : AT (_etext)
{
. = ALIGN(4);
_srelocate = .;
(.ramfunc .ramfunc.);
(.data .data.);
. = ALIGN(4);
_erelocate = .;
} > ram
note that _etext is the end of the previous section
_srelocate and _erelocate are used in the startup code to relocate, I believe, everything in .data (and, apparently, .ramfunc as well) in all the files:
/* Initialize the relocate segment */
pSrc = &_etext;
pDest = &_srelocate;
if (pSrc != pDest) {
for (; pDest < &_erelocate;) {
*pDest++ = *pSrc++;
}
}
This is a pretty standard example. If you search in your project for where main() is called, you should find something similar.
If all you want to do is relocate the entire .data section to the address you are specifying in RAM, you should need only to change the definition of the location of the RAM section, not define your own. You only need to define your own section if you want to move specific variables to a different location
I am not familiar with the platform on which you are working, but there should be either a C or assembly file with the startup code that runs before crt0. This will set up the stack, heap, and interrupt vectors. In some implementations, this code also copies the .data section to RAM, and may be set up to copy everything from the beginning of the data section until the beginning of the .bss section, to RAM. If your platform is set up in this way, if you locate your section between .data and .bss, it should be copied with no other changes from you (see here, for example).
If, however, you want to copy the data to a different location, you will probably have to add code to copy it, either in the loader code or at the very beginning of main, using the symbols you defined for the beginning and ending of the section.
Since you mention, though, that it is read-only data, I would recommend leaving it in read-only memory if you can.

Related

Adding NOLOAD section before FLASH changes ELF base address

So I am trying to add a reserved section of flash at an address in between my bootloader and main code that is in its own sector (I am using an STM32F4). When I use the section in code, the elf base address changes and my debugger freaks out, however the hex (obviously) works. When opening the elf in my debugger it looks up the base address, which is incorrectly, at 0x8000000. Since FLASH is at 0x800C000 (and isr_vector is loaded there) the debugger just can't start code.
So, my question is, why does adding this section cause the elf to rebase the address? I use another codebase where this was implemented by another person for the STM32F0 (the same way) and doesn't have this issue. I thought the NOLOAD tag was suppose to tell the compiler not to load that flash section and therefore it would not affect the elf program headers?
Below is an example of how I am setting this up:
Code
const myStruct var __attribute__((section(".rsv_flash"), used, aligned(4));
Linker
MEMORY
{
FLASH (rx): ORIGIN=0x800C000 LENGTH= 2M - 32K - 16K
RSV_FLASH (r): ORIGIN=0x8008000 LENGTH= 64
}
SECTIONS {
/* There are other Sections in here */
.rsv_flash (NOLOAD) :
{
__RSV_FLASH_START=.;
. = ALIGN(4);
KEEP(*(.rsv_flash))
. = ALIGN(4);
__RSV_FLASH_END=.;
} >RSV_FLASH

get GNU `ld` to link against static functions in another binary without including them

I have a microcontroller, where i put a big program in ROM, which is supposed to, at a certain point, fetch a payload into RAM and execute it, which then again is supposed to call back into functions in ROM.
The first half (calling into RAM) seemed simple enough:
/* ROM linker script */
MEMORY
{
rom (rx) : ORIGIN = 0x08000000, LENGTH = 64K /* this includes .text */
ram (rwx) : ORIGIN = 0x20000000, LENGTH = 10K /* was 20K, cut in half to make space for payload, this includes .data, .bss, etc */
plram (rwx) : ORIGIN = 0x20002800, LENGTH = 10K /* space for payload */
}
SECTIONS{
.ramPayloadBlock : {
__RAM_PAYLOAD_START = .;
KEEP(*(.ramPayload))
} > plram
}
// ROM code
extern int __RAM_PAYLOAD_START;
void __attribute__((section (".ramPayload"))) (*ramPayload)(void) = (void(*)(void))&__RAM_PAYLOAD_START; // this works as long as the payload's entrypoint is actually at the start of .ramPayload
ramPayload();
but now, when actually linking the code for ramPayload I need to somehow tell the linker to look into the ROM's .map file or .elf binary or whatever to look up the addresses of ROM functions i want to call back into from RAM.
and after hours of looking through the ld docs, i have literally no idea how to do that, apart from writing a shell script trying to parse the messy map file and generating a custom header full of function pointers each time, thus reinventing the wheellinker.

ok so i got it to work finally.
the magic linker parameter is -R rom_binary.elf (which reads symbols but doesn't include anything from that file).
that and loads of -E to make it "export all dynamic symbols" (even though nothing is linked dynamically, which of course isn't documented behaviour so yay)
then just make the linker script for the ram binary locate everything in ram, your entrypoint first (so it neatly gets put at the section start for you to refer to from the ROM code).

Compiling and linking position independent code (PIC) for ARM M4

I'm working on making a bootloader and giving it the ability to update itself. This process involves copying the binary to a new location, jumping to it, and using it to flash the new bootloader in the original location. This is all being developed for an M4 processor in Eclipse, using the ARM GCC toolchain.
To do this, I've gathered that I need to compile as Position Independent Code (PIC).
I've searched around and found this excellent article, so when I added "-fPIC" to the ARM GCC compiler call I expected to see linker errors about GOT and PLT being missing
https://eli.thegreenplace.net/2011/11/03/position-independent-code-pic-in-shared-libraries/
In my linker script, I added these location to the .data section as follows:
.data : AT(__DATA_ROM)
{
. = ALIGN(4);
__DATA_RAM = .;
__data_start__ = .; /* Create a global symbol at data start. */
*(.got*) /* .got and .plt for position independent code */
*(.data) /* .data sections */
*(.data*) /* .data* sections */
KEEP(*(.jcr*))
. = ALIGN(4);
__data_end__ = .; /* Define a global symbol at data end. */
} > m_data
However, this code fails to copy-up from ROM to RAM.
My next thought was that perhaps my linker needed to be aware it was linking a PIC executable. To find out, I added "--pic-executable" to the LD linker script call. However, the linker now generated sections for "interp", "dyn", "rel.dyn" and "hash". I tried throwing these into the data section as well, but got the following errors:
gcc-arm-none-eabi-4_9/bin/../lib/gcc/arm-none-eabi/4.9.3/../../../../arm-none-eabi/bin/ld.exe:
could not find output section .hash
gcc-arm-none-eabi-4_9/bin/../lib/gcc/arm-none-eabi/4.9.3/../../../../arm-none-eabi/bin/ld.exe:
final link failed: Nonrepresentable section on output
I assume this means the compiler didn't actually fill the ".hash" section with anything, so the link failed.
Am I going about this correctly? Is there anything else I need to add to get the compiler to do? Any help would be greatly appreciated.

Recently I researched extensively Cortex-M4 bootloading a PIC firmware image.
There are couple of things needed:
A very simple bootloader. It only needs to read 2 4-byte words from the flash location of the firmware image. First is stack pointer address and second is Reset_Handler address. Bootloader needs to jump to Reset_Handler. But as the firmware is relocated further in the flash, the Reset_Handler is actually a bit further. See this picture for clarification:
So, bootloader adds the correct offset before jumping to Reset_Handler of the firmware image. No other patching is done, but bootloader (in my solution at least) stores location and offset of the firmware image, calculates checksum and passes this information in registers for the firmware image to use.
Then we need modifications in firmware linker file to force the ISR vector table in the beginning of RAM. For Cortex-M4 and VTOR (Vector Offset Table Register) the ISR vector table needs to be aligned to 512 boundary. In the beginning of RAM it is there naturally. In linker script we also dedicate a position in RAM for Global Offset Table (GOT) for easier manipulation. Address ranges should be exported via linker script symbols also.
We need also C compiler options. Basically these:
-fpic
-mpic-register=r9
-msingle-pic-base
-mno-pic-data-is-text-relative
They go to C compiler only! This creates the Global Offset Table (GOT) accounting.
Finally we need dedicated and tedious assembly bootstrap routines in the firmware image project. These routines perform the normal startup task of setting up the C runtime environment, but they additionally also go and read ISR and GOT from flash and copy them to RAM. In RAM, the addresses of those tables pointing to flash are offsetted by the amount the firmware image is running apart from bootloader. Example pictures:
I spent last 6 months researching this subject and I have written an in-depth article about it here:
https://techblog.paalijarvi.fi/2022/01/16/portable-position-independent-code-pic-bootloader-and-firmware-for-arm-cortex-m0-and-cortex-m4/
By providing the link I hope I can contribute to StackOverflow which I used as a starting point for my research. I hope this helps some people in learning the intrinsics.

Booting a code and relocating involves many careful steps and initialization of configuration of RAM, SPI and other necessary peripherals.
I know U-Boot does the sequence you are trying to achieve. So a better starting is to walk through u-boot documentation and sources in, machine specific folders for the processor or board of interest.

For what it's worth, neither I nor NXP's technical support team were able to get their S32DS IDE to compile truly position independent code.
To this day, we have two bootloaders - one compiled for location A, the other is an intermediate for location B. Both of which are required for an update.

TI does this with the Tivaware bootloader. You can do it with linker gnu ld trickery:
.text 0x20000000 : AT (0x00000000)
{
_text = .;
KEEP(*(.isr_vector))
*(.text*)
*(.rodata*)
_etext = .;
}
Startup code that copies this code from Flash at 0x0 to SRAM at 0x2000_0000 is left as an exercise to the reader.

ARM execute code from embedded RAM

I can easily place some of my code RO sections at specific execution regions at specific addresses (which might be RAM addresses). There will be no problem in my program integrity because of proper linking.
The problem is that those RO sections placed at RAM addresses will not appear on RAM after power off/power on. They will be missing. Am I right?
Of course I can load them in place with bootloader, but it is not the case now.
My question is: is there any trusted default method to solve this problem? Maybe some attributes etc. For example, maybe there is method of copying RO sections (like RW) at startup by C library?
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka11494.html
As this post suggest, it is incorrect. As I mentioned before, after restart RAM will not contain any RO data.

You already have this problem in an embedded system with the .data area. .bss just gets zeroed in the bootstrap, but .data needs to be in non volatile storage so it is there when you power up, but it needs to live in ram. The typical solution is to mark it as such in the linker script as I normally run from ram but want to be stored in the binary in rom.
with gnu tools you do the something at something thing
MEMORY
{
bob : ORIGIN = 0x8000, LENGTH = 0x1000
ted : ORIGIN = 0xA000, LENGTH = 0x1000
}
SECTIONS
{
.text : { *(.text*) } > bob
__data_rom_start__ = .;
.data : {
__data_start__ = .;
*(.data*)
} > ted AT > bob
__data_end__ = .;
__data_size__ = __data_end__ - __data_start__;
.bss : {
__bss_start__ = .;
*(.bss*)
} > bob
__bss_end__ = .;
__bss_size__ = __bss_end__ - __bss_start__;
}
Linker scripts or other ways of controlling the linker are very specific to the toolchain, I wouldnt automatically expect arms tools to use the same solution as gnus tools or other tools. They might to keep the rest of us sane, but it is not something a standards body manages.
then you have to match your bootstrap code to the linker script scheme and copy the data over.
if you have sections of code you want to move and not just data or instead of data you would use the exact same scheme, add linker script things to mark that blob of code as want to live here when run but want to live there in the binary image. and your bootstrap or some code has to copy that fraction of the program to ram before it is used.

It is (amongst other tasks) the responsibility of the C runtime start-up code (that which runs before main() is executed) to copy RAM executable code from the ROM image to RAM. In some cases the ROM image may be compressed so that the start-up code must also perform decompression.
Your tool-chain may already provide suitable start-up code, or you may have to modify the existing runtime start-up code to support this.

align all object files in data/sbss section in linker script

EDIT: Solved - the linker script property "SUBALIGN(32)" applied to the static data sections does exactly what I required, forcing each object file linked to be aligned to a 32byte boundary, with padding automatically inserted.
__bss_start = .;
.bss :
SUBALIGN(32)
{
*(.dynbss)
*(.bss .bss.* .gnu.linkonce.b.*)
*(COMMON)
SORT(CONSTRUCTORS)
. = ALIGN(32);
} = 0
. = ALIGN(32);
I am building a multiprogram benchmark on a cache-incoherent architecture, comprised of multiple instances of the EEMBC suite renamed and linked together.
The problem is that the libraries are not cache line aligned in the writable data segments, and I am getting data corruption here (evidenced by cache line thrashing in a coherent simulation).
For example cache line at 0x7500 is being shared between the cores operating on Viterb0 and Viterb1, the map output indicates that this is where library 0 is running into the cache line that library1 starts in:
...
.bss 0x000068e8 0xc24 ../EEMBClib/libmark_0.a(renamed_renamed_viterb00_viterb00.o)
.bss 0x0000750c 0x4 ../EEMBClib/libmark_1.a(renamed_renamed_viterb00_bmark_lite.o)
...
I need to align every object file linked in the various data segments to 32byte boundaries, I only know how to align the whole section, the current .bss sections is:
__bss_start = .;
.bss :
{
*(.dynbss)
*(.bss .bss.* .gnu.linkonce.b.*)
*(COMMON)
SORT(CONSTRUCTORS)
. = ALIGN(32);
} = 0
. = ALIGN(32);
Any help would be greatly appreciated here, rebuilding the libraries with padding isn't really an option I want to consider yet as I would like this more robust solution for future linking purposes on this platform.

The solution is the linker script property "SUBALIGN(32)". When applied to the static data sections this does exactly what I required, forcing each object file linked to be aligned to a 32byte boundary, with padding automatically inserted.
__bss_start = .;
.bss :
SUBALIGN(32)
{
*(.bss .bss.* .gnu.linkonce.b.*)
} = 0
. = ALIGN(32);
gives the fixed result
.bss 0x00006940 0xc24 ../EEMBClib/libmark_0.a(renamed_renamed_viterb00_viterb00.o)
fill 0x00007564 0x1c 00000000
.bss 0x00007580 0x4 ../EEMBClib/libmark_1.a(renamed_renamed_viterb00_bmark_lite.o)
instead of
.bss 0x000068e8 0xc24 ../EEMBClib/libmark_0.a(renamed_renamed_viterb00_viterb00.o)
.bss 0x0000750c 0x4 ../EEMBClib/libmark_1.a(renamed_renamed_viterb00_bmark_lite.o)

(Apologies that this is at least currently more a collection of thoughts than a concrete answer, but it's going to be a bit long to post in comments)
Probably the first thing that would be worth doing is to come up with some verification routine that parses objdump/readelf output to verify if your alignment requirement has been met, and put this into your build process as a check. If you can't do it at compile time, at least do it as a run time check.
Then some paths of achieving the alignment could be investigated.
Assume for a minute that a custom section is created and all data with this requirement is placed there with pragmas in the source code. Something to look into would then be if the linker is willing to honor the section alignment setting given in the occurrence of that section in each object file. You could for example hexedit one of the objects to increase that alignment and use your dump processor to see what happens. If this works out, great - it seems like the proper way to handle the task, and hopefully there's a reasonable way to specify the alignment size requirement for that section which will end up in the object files.
Another idea would be to attempt some sort of scripted allocation adjustment. For example, use objcopy to join all the applicable sections into one file, while stripping them out of the others. Analyze the file and figure out what allocations you want, then use objcopy or a custom elf modification program to set that. Maybe you could even make this modification to the fully linked result, at least if you have your linker script put the special section at the end, so that you don't have to move other allocations out of its way when you grow it to achieve internal alignment.
If you don't want to get into modifying elf's, another approach for doing your own auxiallary linking with a script could be to calculate the size of each object's data in the special section, then automatically generate an additional object file that simply pads that section out to the next alignment boundary. Your link stage would then specify objects in a list: program1.o padding1.o program2.o padding2.o
Or you could have each program put its special data in its own uniquely named linker section. Dump out the sizes of all of these, figure out where you want them to be, and then have the script create a customized linker script which explicitly puts the named sections in the just determined places.

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight