Why does my system's ld script use an expression like "dot = dot"?

When dumping my system's linker script with ld -verbose, I notice that it uses:
.data1 : { *(.data1) }
_edata = .; PROVIDE (edata = .);
. = .;
__bss_start = .;
.bss :
Why does it assign the current address to the current address?

. = .; serves as a barrier for orphan section placement.
https://sourceware.org/binutils/docs/ld/Location-Counter.html says "... Instead, it assumes that all assignments or other statements belong to the previous output section, except for the special case of an assignment to ."
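After the linker picks the best-matching output section description for an orphan section, it skips over any following symbol assignments that do not assign to ., treating them as part of that output section. An assignment to . ends this search, so with . = .; in place the linker stops there and inserts the orphan section immediately before the . = .;. In the script above this means that __bss_start, which follows the barrier, is still evaluated right at the start of .bss.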
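To see what an orphan section looks like from the C side, here is a minimal sketch, assuming GCC or Clang on a GNU/Linux host with the default linker script. The section name orphan_demo and all variable and function names are hypothetical. Since no output section statement in the default script mentions orphan_demo, the linker has to choose a place for it on its own, which is the kind of placement decision the . = .; barrier influences. For orphan sections whose name is a valid C identifier, GNU ld also provides __start_ and __stop_ symbols, which the sketch uses to find the section at run time.

```c
#include <stddef.h>

/* Two variables placed in a section that the default linker script does not
   mention, so the linker treats "orphan_demo" as an orphan section.
   The section name is a made-up example. */
__attribute__((used, section("orphan_demo"))) static int demo_a = 1;
__attribute__((used, section("orphan_demo"))) static int demo_b = 2;

/* Provided by GNU ld for orphan sections whose name is a valid C identifier. */
extern int __start_orphan_demo[];
extern int __stop_orphan_demo[];

/* Number of int-sized entries the linker collected into the section. */
size_t orphan_demo_count(void)
{
    return (size_t)(__stop_orphan_demo - __start_orphan_demo);
}

/* Sum of all entries, walking the section from start to stop. */
int orphan_demo_sum(void)
{
    int sum = 0;
    for (int *p = __start_orphan_demo; p < __stop_orphan_demo; ++p)
        sum += *p;
    return sum;
}
```

Running ld with -M (or passing -Wl,-M through gcc) prints a link map that shows where the orphan section ended up relative to .data and .bss.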

Related

GCC Linker - Locate a section/constant at a specific address within the .text section

I would like to locate a 32bit constant value at a specific address (0x080017FC) within the .text (code) section.
To be honest, when it comes to modifying the linker script to this extent I'm naïve and feel like I do not have a clue what to do.
I've modified my linker script to contain this new section (.systemid) within the .text section.
.text :
{
. = ALIGN(4);
KEEP(*(.systemid))
*(.text) /* .text sections (code) */
*(.text*) /* .text* sections (code) */
*(.glue_7) /* glue arm to thumb code */
*(.glue_7t) /* glue thumb to arm code */
*(.eh_frame)
KEEP (*(.init))
KEEP (*(.fini))
. = ALIGN(4);
_etext = .; /* define a global symbols at end of code */
} >FLASH
To ensure it does not get optimized away, I used KEEP.
I then declared my constant in the new section (.systemid). This is where I start to wonder what am I supposed to do. If .systemid was a section on its own, I would have declared the constant as follows:
const uint32_t __attribute__((used, section (".systemid"))) SYSTEM_ID_U32 = 0x11223344;
But since this is a section within a section, should it not be?:
uint32_t __attribute__((used, section (".text.systemid"))) SYSTEM_ID_U32 = 0x11223344;
So the linker will locate the constant at the beginning of the .text section (0x000001A0). Great, it is inside the text section but not at the correct address. I would like to locate the constant at 0x080017FC.
To try and achieve this, I pass the following to the linker:
-Wl,--section-start=.text.systemid=0x080017FC
Again, I'm not sure if it should be .systemid or .text.systemid.
Either way, it does not locate the constant at 0x080017FC.
How do I get my constant to be located at 0x080017FC within the .text (code) section without any overlap errors?

It will not work this way. I am not aware of any way to place a section at a particular address, when it is part of another output section, without running into problems with the linker. The linker is quite a simple program and will not rearrange the surrounding code to keep your address free.
I use two methods:
Place this ID at the end of the FLASH. You can't do this at the beginning, as the vector table is there.
const uint32_t __attribute__((used, section (".systemid"))) SYSTEM_ID_U32 = 0x11223344;
Place it after all other sections in FLASH (it can be the last section definition):
.systemid :
{
. = ORIGIN(FLASH) + LENGTH(FLASH) - 4;
KEEP(*(.systemid))
} >FLASH
or
.systemid ORIGIN(FLASH) + LENGTH(FLASH) - 4:
{
KEEP(*(.systemid))
} >FLASH
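The arithmetic behind ORIGIN(FLASH) + LENGTH(FLASH) - 4 can be checked on a host machine. The following is a minimal sketch, not part of the original answer: the function names are hypothetical, the flash origin and length in the usage note are assumed example values, and the image is assumed to be little-endian (as on typical Cortex-M parts).

```c
#include <stddef.h>
#include <stdint.h>

/* Address of a 4-byte ID stored in the last word of a memory region,
   mirroring ORIGIN(FLASH) + LENGTH(FLASH) - 4 from the linker script. */
uint32_t system_id_address(uint32_t origin, uint32_t length)
{
    return origin + length - 4u;
}

/* Read the ID back from a raw flash image of the given length.
   Assumes a little-endian image; length must be at least 4. */
uint32_t system_id_from_image(const uint8_t *image, size_t length)
{
    const uint8_t *p = image + length - 4;
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

For example, with an assumed origin of 0x08000000 and an assumed length of 0x00040000, system_id_address gives 0x0803FFFC, and a host-side tool could call system_id_from_image on the dumped binary to confirm that the word there is 0x11223344.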

Use of PROVIDE keyword in linker script of ARM processor

What is the difference between the statements at line-4 and line-5?
And what problems arise when defining a symbol with a leading underscore?
code:
SECTIONS
{
.text :
{
*(.text)
_etext = . ; /* line-4 */
PROVIDE(etext = .); /* line-5 */
}
}

Splitting embedded program in multiple parts in memory

I am working on an embedded system (Stellaris Launchpad) and writing a simple OS (as a hobby project). The used toolchain is gcc-none-eabi.
My next step is to get used to the MPU to allow the kernel to prevent user programs from altering specific data. I have a bunch of C files and I split them into two parts: kernel and other.
I have the following linker script to start out with:
MEMORY
{
FLASH (rx) : ORIGIN = 0x00000000, LENGTH = 0x00040000
SRAM (rwx) : ORIGIN = 0x20000000, LENGTH = 0x00008000
}
SECTIONS
{
.text :
{
_text = .;
KEEP(*(.isr_vector))
*(.text*)
*(.rodata*)
_etext = .;
} > FLASH
.data : /*AT(ADDR(.text) + SIZEOF(.text))*/ /*contains initialized data*/
{
_data = .;
*(vtable)
*(.data*)
_edata = .;
} > SRAM AT > FLASH
.bss : AT (ADDR(.data) + SIZEOF(.data)) /*contains uninitialized data (should be set to all zeros)*/
{
_bss = .;
*(.bss*)
*(COMMON)
_ebss = .;
_start_heap = .;
} > SRAM
_stack_top = ORIGIN(SRAM) + LENGTH(SRAM) - 1; /*The starting point of the stack, at the very bottom of the RAM*/
}
And after reading up on linker scripts I know that I can replace the stars with filenames, and thus start splitting the flash in multiple parts. I would for example create a .kernel.bss section and put all of the kernel object files instead of the stars in that section.
My only problem left is that the kernel is not one file, it is a whole lot of files. And files might be added, removed etc. So how do I do this? How do I change my linker script so that a dynamic first group of files is mapped to the first place and a dynamic second group of files is mapped to a second place?

Did you know that you can specify which files are used as input for a section?
We use this for separating kernel and application code into fast internal flash, and slower external flash memory, like so:
.kernel_text :
{
build/kernel/*.o (.text*) /*text section from files in build/kernel*/
} > INT_FLASH
.app_text :
{
build/app/*.o(.text*)
} > EXT_FLASH
Section 4.6.4, which describes input sections in more detail, might be helpful:
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/4/html/Using_ld_the_GNU_Linker/sections.html

I found a solution, although it feels a bit hacky. It does work, though:
I found out that a linker script can refer to .a files, i.e. static libraries created with ar. So let's say you have a bunch of .o files that together form the kernel: a.o, b.o, c.o. Run ar rcs kernel.a a.o b.o c.o. kernel.a is now your kernel, which you want to store separately in memory.
The next thing you need to know is that a * pattern in a linker script only picks up input sections that no earlier pattern has already claimed. So we can create the following linker script:
MEMORY
{
FLASH (rx) : ORIGIN = 0x00000000, LENGTH = 0x00040000
SRAM (rwx) : ORIGIN = 0x20000000, LENGTH = 0x00008000
}
SECTIONS
{
.kernel.text :
{
_kernel_text = .;
KEEP(kernel.a(.isr_vector))
KEEP(kernel.a(_sbrk))
kernel.a(.text*)
kernel.a(.rodata*)
_kernel_etext = .;
_kernel_flash_data = ALIGN(0x4);
} > FLASH
.kernel.data : /*AT(ADDR(.text) + SIZEOF(.text))*/ /*contains initialized data*/
{
_kernel_data = .;
kernel.a(vtable)
kernel.a(.data*)
_kernel_edata = .;
} > SRAM AT > FLASH
.kernel.bss :
{
_kernel_bss = .;
kernel.a(.bss*)
kernel.a(COMMON)
_kernel_ebss = .;
} > SRAM
.text : /*AT (ADDR(.core.text) + SIZEOF(.core.text) + SIZEOF(.core.data))*/
{
_text = .;
*(.text*)
*(.rodata*)
_etext = .;
_flash_data = ALIGN(0x4);
} > FLASH
.data :
{
_data = .;
*(vtable)
*(.data*)
_edata = .;
} > SRAM AT > FLASH
.bss : AT (ADDR(.data) + SIZEOF(.data)) /*contains uninitialized data (should be set to all zeros)*/
{
_bss = .;
*(.bss*)
*(COMMON)
_ebss = .;
_start_heap = .;
} > SRAM
}
This works but will probably lead to a new problem: the linker treats libraries as, well, libraries. So if they contain the program start (as in my case), the linker does not pull it in on its own; it only searches the library for symbols referred to by the actual .o files. The solution I found for this is to add the -u <name> flag to the linker invocation. This flag enters the symbol as undefined, so the linker will pull in the archive member that defines it, plus everything that member needs.
My invocation, for reference's sake:
arm-none-eabi-ld -Tlinker_script.ld -nostdlib --entry ResetISR
--gc-sections -u _sbrk -u .isr_vector
-L./lib//hardfp
-L/home/me/gcc-arm-none-eabi/gcc-arm-none-eabi-4_9-2015q1/arm-none-eabi/lib/armv7e-m/fpu
-L/home/me/gcc-arm-none-eabi/gcc-arm-none-eabi-4_9-2015q1/lib/gcc/arm-none-eabi/4.9.3/armv7e-m/fpu
-Lrelease/
-o release/os
./user/obj/release/ledsDance.c.o ./user/obj/release/main.c.o ./validation/obj/release/val_floattest.c.o ./validation/obj/release/val_genTest.c.o ./validation/obj/release/val_gpiotest.c.o ./validation/obj/release/val_iotest.c.o ./validation/obj/release/val_proctest.c.o ./validation/obj/release/val_schedTest.c.o release/kernel.a release/core.a
-ldriver-cm4f
-luartstdio
-lm
-lc
-lgcc

Why should "data = .;" be repeated three times in a linker script?

I saw this link script in
http://www.jamesmolloy.co.uk/tutorial_html/1.-Environment%20setup.html
SECTIONS
{
.text 0x100000 :
{
code = .; _code = .; __code = .; // What is this line for?
*(.text)
. = ALIGN(4096);
}
.data :
{
data = .; _data = .; __data = .;
*(.data)
*(.rodata)
. = ALIGN(4096);
}
.bss :
{
bss = .; _bss = .; __bss = .;
*(.bss)
. = ALIGN(4096);
}
end = .; _end = .; __end = .;
}
You can see that code, _code, __code and the following ones all appear in the same style. What are they for? Why should they be written in such a way?

The syntax <symbol> = . simply defines a symbol at the current address.
You can use this symbol like this:
#include <iostream>

extern int __code; // defined by the linker script; only its address is meaningful

void foo()
{
std::cout << "Address of __code: " << &__code << std::endl;
}
_code and __code typically hold the start address of the text section. They are used by the startup code of the system you compile for.
Defining symbols without a leading underscore is not so common, I believe, because such names can conflict with ordinary definitions in your code. But this is only a convention. Technically you can define whatever you want and need. The rules are the same as for all other symbols in your project: never define a symbol twice :-)
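On a GNU/Linux host the same mechanism can be tried without writing a linker script, because the default script already defines etext, edata and end (see man 3 end). What follows is a minimal sketch under that assumption; the helper function names are hypothetical. As with __code above, only the address of each symbol is meaningful, never its contents.

```c
#include <stdint.h>

/* Defined by the default linker script, not by any C file.
   Declared as char so that nothing suggests they hold a useful value. */
extern char etext, edata, end;

/* First address past the end of the program text. */
uintptr_t text_end_address(void) { return (uintptr_t)&etext; }

/* First address past the end of the initialized data. */
uintptr_t data_end_address(void) { return (uintptr_t)&edata; }

/* First address past the end of the uninitialized data (.bss). */
uintptr_t bss_end_address(void) { return (uintptr_t)&end; }

/* Distance in bytes between edata and end, i.e. roughly the size of
   the region that startup code has to zero. */
uintptr_t zeroed_region_size(void)
{
    return bss_end_address() - data_end_address();
}
```

Startup code on embedded targets does the same thing with the symbols from its own linker script, for example zeroing everything between _bss and _ebss before calling main.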

Meaning of arm loader script

OUTPUT_FORMAT("elf32-littlearm", "elf32-littlearm", "elf32-littlearm")
OUTPUT_ARCH(arm)
ENTRY(_ram_entry)
SECTIONS
{
. = 0xA0008000;
. = ALIGN(4);
.text : { *(.text) }
. = ALIGN(4);
.rodata : { *(.rodata) }
. = ALIGN(4);
.data : { *(.data) }
. = ALIGN(4);
.got : { *(.got) }
. = ALIGN(4);
.bss : { *(.bss) }
}
I get the output_format, output_arch, entry... maybe meaning that the output will be as elf32-littlearm and so on.
But the Sections part is what I don't get.
this '. =' is the start.
and '. = ALIGN(4)' and .text : { *(.text) } ....
can anybody help me on this T_T
thanks for reading.

Actually, this linker description language is defined in the documentation for ld last I checked. It's not as bad as it looks. Basically, the '.' operator refers to the "current location pointer". So, the line
. = 0xA0008000
says move the location pointer to that value. The next entry, .text, basically says place all text objects into a .text section in the final ELF file starting at the location pointer (which is also adjusted to have a 4-byte (32-bit) alignment.) Note that the first use of the ALIGN is probably redundant since 0xA0008000 is already 32-bit aligned!
The next sections simply instruct the linker to emit the collection of all .rodata, .data, .got and .bss sections from all input objects into their final respective sections of the ELF binary in order, starting at 32-bit aligned addresses.
So the final ELF binary that the linker produces will have those five sections respectively and sequentially. You can see the structure of that final ELF binary using the readelf utility. It's quite useful and helps to make sense of all of this stuff. Usually there is a cross-version of readelf, something like arm-linux-gnueabi-readelf, or whatever prefix was used to generate the compiler/linker you are using. Start with readelf -S to get a summary of the sections that your ELF file contains. Then you can explore from there. Happy reading!
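The effect of the repeated . = ALIGN(4); lines can be sketched in a few lines of C. This is only a simplified model with hypothetical function names: it rounds the location counter up before each section and ignores the alignment requirements that real input sections carry themselves. The section sizes in the usage note are made-up example values.

```c
#include <stddef.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (align must be a power of two).
   This is the value ALIGN(align) yields for the location counter. */
uint32_t align_up(uint32_t addr, uint32_t align)
{
    return (addr + align - 1u) & ~(align - 1u);
}

/* Compute the start address of each of `count` consecutive sections,
   where every section is preceded by ". = ALIGN(4);". */
void layout_sections(uint32_t origin, const uint32_t *sizes,
                     uint32_t *starts, size_t count)
{
    uint32_t dot = origin;          /* the location counter "." */
    for (size_t i = 0; i < count; ++i) {
        dot = align_up(dot, 4u);    /* . = ALIGN(4); */
        starts[i] = dot;            /* the section begins here */
        dot += sizes[i];            /* "." advances past the section contents */
    }
}
```

With an origin of 0xA0008000 and example sizes of 10, 3 and 8 bytes, the sections start at 0xA0008000, 0xA000800C and 0xA0008010. The first align_up call leaves 0xA0008000 unchanged, which is why the first ALIGN in the script is redundant.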

. = 0xA0008000;
I think, but I'm not 100% sure, that this is the address at which the first section is placed. Where execution starts is set separately by ENTRY(_ram_entry).
. = ALIGN(4);
aligns the location counter, and therefore the section that follows, to a 4-byte boundary.
.text, .data, .rodata, .got and .bss are the sections of the program. text is for instructions, data for initialized data, rodata for read-only data and bss for uninitialized data. got is the global offset table.
.text : { *(.text) }
This collects the .text input sections (the instructions) from all input files into the output .text section. The similar commands do the same for data and the global offset table.
