Unexpected linker section output location

I'm trying to use the ld command on Linux on an assembly file for a kernel. For it to boot with GRUB, the kernel needs to be loaded at or above the 1 MiB mark, so my linker script places the text section at address 0x00100000.
Here's the linker script I'm using:
SECTIONS {
    .text 0x00100000 : {
        *(.text)
    }
    textEnd = .;
    .data : {
        *(.data)
        *(.rodata)
    }
    dataEnd = .;
    .bss : {
        *(.common)
        *(.bss)
    }
    bssEnd = .;
}
My question is about the output file. When I look at the binary, the .text section starts at file offset 0x1000. When I change the text address in the script to something lower than 0x1000, such as 0x500, the text starts at that offset. But whenever I go above 0x1000, the offset wraps (0x2500 puts the text at offset 0x500).
When I specify that the text should be at 0x100000, shouldn't it be there in the output file? Or is there another part of the binary that specifies there's more moving to do? I'm asking because there's a problem booting my kernel, but for now I'm simply trying to understand the linker output.

You are referring to two different address spaces. The addresses you see within the linked file (such as 0x1000 and 0x500) are just file offsets. The addresses specified in the linker script, such as 0x00100000, refer to computer memory (i.e. RAM).
In the case of the linker script, the linker is being told that the .text section of the binary/executable file should be loaded at the 1MiB point in RAM (i.e. 0x00100000). This has less to do with the layout of the file output by the linker and more to do with how the file is to be loaded when executed.
The section locations in the actual file are a matter of alignment: your linker appears to be aligning the first section to a 4096-byte boundary. If, for example, each section were less than 4096 bytes in size and each were placed at a 4096-byte boundary, their respective offsets in the file would be 0x1000, 0x2000, 0x3000, etc. By default this alignment also holds once the file is loaded into RAM, so the previous example would yield sections located at 0x00100000, 0x00101000, 0x00102000, etc.
It also appears that when you change the load address to a small enough number, the linker automatically changes the alignment. The ALIGN function can be used if you want to specify the alignment manually.
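If the on-disk padding matters (for example when the ELF is later flattened into a raw image), the alignment can be stated rather than left to the default. A sketch of the original script with an explicit request; the 4096 here is an assumption matching the behavior observed above:

```ld
SECTIONS {
    /* Keep the VMA at 1 MiB but pin the section alignment explicitly. */
    .text 0x00100000 : ALIGN(4096) {
        *(.text)
    }
}
```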
For a short & sweet explanation of the linker (describing all of the above in more detail) I recommend:
http://www.math.utah.edu/docs/info/ld_3.html
or
http://sourceware.org/binutils/docs-2.15/ld/Scripts.html

STM32 - Extremely large .bin file

Good day!
I work in a safety critical field and I'm having trouble with building in CubeIDE.
Backstory
Being safety critical, my company requires a redundant CRC over our code section of flash. We set aside a small portion of flash for non-CRC data (the numerical value of our code section's CRC, firmware revision, etc.), then have a post-build batch file that calls a Python script to calculate the initial CRC of our code section, which is then stored in the non-CRC section. At run time, the CRC is occasionally recomputed from flash and compared to the initial value.
The Problem
I'm porting the source code from a similar product over to the new version of the firmware for a different product (moving from an 8-bit to a 32-bit STM32 uC). The ported code compiles fine and flashes/runs on the uC, but the .bin file that the CRC value is calculated from is massive at nearly 400 MB. The .bin file for the similar product's firmware is 32 KB. In the large .bin file, there are about 399 MB of 0xFF's from the end of "flash memory" to the end of the file. As such, a calculation of the CRC value would not be valid.
What I've Tried
I'm still a baby developer, so I don't have much experience in troubleshooting these things. Looking at the .elf files for both projects, I noticed that the project with the massive .bin file has a fundamental difference from the other .elf file. The VMA and LMA for the .data section are identical in the .elf file for the new project and different for the similar project (with a VMA in the RAM section and an LMA in the FLASH section).
I've done research into LD (the GNU linker script language) and understand that the NOLOAD flag is supposed to cause a section not to be loaded into memory, but putting that flag on the .data section has no effect. The linker files for both projects are identical for the .data section, so I'm not sure why their addressing would differ.
Here is the code for the .data section from the linker file:
/* Initialized data sections into RAM memory */
.data :
{
    . = ALIGN(4);
    _sdata = .;    /* create a global symbol at data start */
    *(.data)       /* .data sections */
    *(.data*)      /* .data* sections */
    . = ALIGN(4);
    _edata = .;    /* define a global symbol at data end */
} > RAM AT > CODE_FLASH
*RAM is at 0x20000000 and CODE_FLASH is at 0x08000000 (approximately; it's just after a small VTABLE and the non-CRC section of flash).
What I believe is happening is that, even though there's nothing to put in the .data section (its SIZE is 0), because its LMA is in RAM, the .bin file is filled with 0xFF's up to that point, then nothing else. I just don't know how to fix it.
I've attached a (Notepad++ diff'd) screenshot of the sections from objdump.
Any assistance would be appreciated!

Adding NOLOAD section before FLASH changes ELF base address

So I am trying to add a reserved section of flash at an address between my bootloader and main code, in its own sector (I am using an STM32F4). When I use the section in code, the ELF base address changes and my debugger freaks out; the hex (obviously) works. When opening the ELF, my debugger looks up the base address, which is, incorrectly, 0x8000000. Since FLASH is at 0x800C000 (and .isr_vector is loaded there), the debugger just can't start the code.
So, my question is: why does adding this section cause the ELF to rebase the address? I use another codebase where this was implemented by another person for the STM32F0 (in the same way) and it doesn't have this issue. I thought the NOLOAD tag was supposed to tell the linker not to load that flash section, and that it therefore would not affect the ELF program headers?
Below is an example of how I am setting this up:
Code
const myStruct var __attribute__((section(".rsv_flash"), used, aligned(4)));
Linker
MEMORY
{
    FLASH     (rx) : ORIGIN = 0x800C000, LENGTH = 2M - 32K - 16K
    RSV_FLASH (r)  : ORIGIN = 0x8008000, LENGTH = 64
}
SECTIONS {
    /* There are other Sections in here */
    .rsv_flash (NOLOAD) :
    {
        __RSV_FLASH_START = .;
        . = ALIGN(4);
        KEEP(*(.rsv_flash))
        . = ALIGN(4);
        __RSV_FLASH_END = .;
    } > RSV_FLASH
}

What does > region1 AT > region2 mean in an LD linker script?

I'm trying to understand a third party linker script.
At the beginning of the script it defines two memory regions (using MEMORY {...}) called iram and dram.
Then there are a few sections defined that have the following syntax:
.data : {
    ...
} > dram AT > iram
I know that > dram at the end means to place that section (.data in this case) in the dram region. However, I don't understand what the "AT > iram" part means.
The dram part of the .data definition in your example specifies the virtual memory address (VMA) of the .data section, whereas the iram part specifies the load memory address (LMA).
The VMA is the address the section will have when the program is run. The LMA is the address of the section when the program is being loaded. As an example, this can be used to keep the initial values of global variables in non-volatile memory, from which they are copied to RAM during program load.
More information can also be found in the manual for the GNU linker ld: https://sourceware.org/binutils/docs/ld/Output-Section-Attributes.html#Output-Section-Attributes

align all object files in data/sbss section in linker script

EDIT: Solved - the linker script property SUBALIGN(32), applied to the static data sections, does exactly what I required: it forces each linked object file's contribution to be aligned to a 32-byte boundary, with padding inserted automatically.
__bss_start = .;
.bss :
SUBALIGN(32)
{
    *(.dynbss)
    *(.bss .bss.* .gnu.linkonce.b.*)
    *(COMMON)
    SORT(CONSTRUCTORS)
    . = ALIGN(32);
} = 0
. = ALIGN(32);
I am building a multiprogram benchmark on a cache-incoherent architecture, composed of multiple instances of the EEMBC suite renamed and linked together.
The problem is that the libraries are not cache-line aligned in the writable data segments, and I am getting data corruption (evidenced by cache-line thrashing in a coherent simulation).
For example, the cache line at 0x7500 is being shared between the cores operating on Viterb0 and Viterb1; the map output indicates that this is where library 0 runs into the cache line that library 1 starts in:
...
.bss 0x000068e8 0xc24 ../EEMBClib/libmark_0.a(renamed_renamed_viterb00_viterb00.o)
.bss 0x0000750c 0x4 ../EEMBClib/libmark_1.a(renamed_renamed_viterb00_bmark_lite.o)
...
I need to align every object file linked into the various data segments to a 32-byte boundary, but I only know how to align the whole section. The current .bss section is:
__bss_start = .;
.bss :
{
    *(.dynbss)
    *(.bss .bss.* .gnu.linkonce.b.*)
    *(COMMON)
    SORT(CONSTRUCTORS)
    . = ALIGN(32);
} = 0
. = ALIGN(32);
Any help would be greatly appreciated. Rebuilding the libraries with padding isn't an option I want to consider yet, as I would like a more robust solution for future linking on this platform.
The solution is the linker script property SUBALIGN(32). When applied to the static data sections, this does exactly what I required: it forces each linked object file's contribution to be aligned to a 32-byte boundary, with padding inserted automatically.
__bss_start = .;
.bss :
SUBALIGN(32)
{
    *(.bss .bss.* .gnu.linkonce.b.*)
} = 0
. = ALIGN(32);
gives the fixed result
.bss 0x00006940 0xc24 ../EEMBClib/libmark_0.a(renamed_renamed_viterb00_viterb00.o)
fill 0x00007564 0x1c 00000000
.bss 0x00007580 0x4 ../EEMBClib/libmark_1.a(renamed_renamed_viterb00_bmark_lite.o)
instead of
.bss 0x000068e8 0xc24 ../EEMBClib/libmark_0.a(renamed_renamed_viterb00_viterb00.o)
.bss 0x0000750c 0x4 ../EEMBClib/libmark_1.a(renamed_renamed_viterb00_bmark_lite.o)
(Apologies that this is, at least currently, more a collection of thoughts than a concrete answer, but it would be a bit long to post in comments.)
Probably the first thing worth doing is to write a verification routine that parses objdump/readelf output to check whether your alignment requirement has been met, and put it into your build process. If you can't do it at build time, at least do it as a run-time check.
Then some paths of achieving the alignment could be investigated.
Assume for a minute that a custom section is created and all data with this requirement is placed there via pragmas in the source code. Something to look into would then be whether the linker is willing to honor the section alignment recorded for that section in each object file. You could, for example, hex-edit one of the objects to increase that alignment and use your dump processor to see what happens. If this works out, great: it seems like the proper way to handle the task, and hopefully there's a reasonable way to get the required alignment into the section headers of the object files.
Another idea would be to attempt some sort of scripted allocation adjustment. For example, use objcopy to join all the applicable sections into one file, while stripping them out of the others. Analyze the file and figure out what allocations you want, then use objcopy or a custom elf modification program to set that. Maybe you could even make this modification to the fully linked result, at least if you have your linker script put the special section at the end, so that you don't have to move other allocations out of its way when you grow it to achieve internal alignment.
If you don't want to get into modifying ELFs, another approach for doing your own auxiliary linking with a script could be to calculate the size of each object's data in the special section, then automatically generate an additional object file that simply pads that section out to the next alignment boundary. Your link stage would then specify objects in a list: program1.o padding1.o program2.o padding2.o
Or you could have each program put its special data in its own uniquely named linker section. Dump out the sizes of all of these, figure out where you want them to be, and then have the script create a customized linker script which explicitly puts the named sections in the just determined places.

Loading HEX data into memory

I am compiling bare-metal software (no OS) for the BeagleBoard (ARM Cortex-A8) with CodeSourcery's GCC ARM EABI compiler. This compiles to a binary or image file that I can load with the U-Boot bootloader.
The question is: can I load hex data into memory dynamically at run time (so that I can load other image files into memory)? I can use objcopy to generate a hex dump of the software. Could I use this information and load it at the appropriate address? Would all the addresses of the .text, .data, and .bss sections be loaded correctly, as stated in the linker script?
The hexdata output generated by
$(OBJCOPY) build/$(EXE).elf -O binary build/$(EXE).bin
od -t x4 build/$(EXE).bin > build/$(EXE).bin.hex
looks like this:
0000000 e321f0d3 e3a00000 e59f1078 e59f2078
0000020 e4810004 e1510002 3afffffc e59f006c
0000040 e3c0001f e321f0d2 e1a0d000 e2400a01
0000060 e321f0d1 e1a0d000 e2400a01 e321f0d7
... and so on.
Is it as simple as loading 16 bytes per line (each line holds four 32-bit words; the od offsets are octal) into the desired memory address, after which everything would work by just branching the PC to the correct address? Did I forget something?
When you use -O binary you pretty much give up your .text, .data, and .bss control. For example, if you have one word 0x12345678 at address 0x10000000 (call that .text) and one word of .data, 0xAABBCCDD, at 0x20000000, then -O binary will give you a file 0x10000004 bytes long that starts with 0x12345678, ends with 0xAABBCCDD, and has 0x0FFFFFFC bytes of zeros in between. Try to dump that into a chip and you might wipe out your bootloader (U-Boot, etc.) or trash a bunch of registers, not to mention dealing with potentially huge files and an eternity to transfer to the board, depending on how you intend to do that.
What you can do instead, which is typical with ROM-based bootloaders, is (using the GNU tools):
MEMORY
{
    bob : ORIGIN = 0x10000000, LENGTH = 16K
    ted : ORIGIN = 0x20000000, LENGTH = 16K
}
SECTIONS
{
    .text : { *(.text*) } > bob
    .bss  : { *(.bss*) }  > ted AT > bob
    .data : { *(.data*) } > ted AT > bob
}
The code (.text) will be linked as if .bss and .data were at their proper places in memory, 0x20000000, but the bytes themselves (in an ELF loader's view or in the -O binary output) are tacked onto the end of .text. Normally you use more linker-script magic to determine where the linker put them. On boot, your .text code should first zero .bss and copy .data from the .text space to the .data space, and then you can run normally.
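The "linker-script magic" mentioned here usually amounts to exporting the load address alongside the run-time bounds, so the startup code knows where to copy from. A sketch against the script above; the symbol names are illustrative, not part of the original:

```ld
SECTIONS
{
    .data : {
        _sdata = .;     /* run-time (VMA) start of .data */
        *(.data*)
        _edata = .;     /* run-time end of .data */
    } > ted AT > bob
    _sidata = LOADADDR(.data);  /* where the bytes actually sit, in .text space */
}
```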
U-Boot can probably handle formats other than .bin, yes? It is also quite easy to write an ELF tool that extracts the different parts of the binaries and makes your own .bins, without using objcopy. It is also quite easy to write code that never relies on .bss being zeroed and has no .data, solving all of these problems.
If you can write to arbitrary addresses without an OS getting in the way, there's no point in using some random hex dump format. Just load the binary data directly to the desired address; converting on the fly from hex to binary buys you nothing. You can load binary data to any address using plain read() or fread(), of course.
If you're loading full-blown ELF files (or similar), you of course need to implement whatever tasks that particular format expects from the object loader, such as allocating memory for BSS data, possibly resolving any unresolved addresses in the code (jumps and such), and so on.
Yes, it is possible to write to memory (on an embedded system) during run-time.
Many bootloaders copy data from a read-only memory (e.g. Flash), into writeable memory (SRAM) then transfer execution to that address.
I've worked on other systems that can download a program from a port (USB, SD Card) into writeable memory then transfer execution to that location.
I've written functions that download data from a serial port and programmed it into a Flash Memory (and EEPROM) device.
For memory-to-memory copies, use memcpy or write your own, using pointers assigned a physical address.
For copying data from a port to memory, figure out how to read the device (such as a UART), then copy the data from its register into the desired location, via pointers.
Example:
#define UART_RECEIVE_REGISTER_ADDR (0x2000)
//...
volatile uint8_t * p_uart_receive_reg = (volatile uint8_t *) UART_RECEIVE_REGISTER_ADDR;
// my_memory_location is assumed to point at the destination byte.
*my_memory_location = *p_uart_receive_reg; // Read the device and store into memory.
Also, search Stack Overflow for "embedded C write to memory"
