What is the effect of a `section` command with an empty list of input sections in a GNU linker script?

In an LD linker script I have the following fragment in the SECTIONS section:
. = (__BUFFER_LOCATION_);
BUFFER . : { } > EXTERNAL_MEM
where __BUFFER_LOCATION_ is defined to some address and EXTERNAL_MEM is defined in the MEMORY section.
In the C program, I have a global buffer declared as:
char outbuf[4096] __attribute__((section("BUFFER")));
It can be seen that the linker script does not mention any input section named BUFFER, but the output section is named as such.
When I build the program, I see that the linker places the buffer at the intended address (__BUFFER_LOCATION_), although the input section is not listed in the linker script. When I remove the attribute from the source, the buffer is placed at a completely different address.
So, I assume that by default, an output-section-command of type "input section description" adds the output section's name to the input sections list implicitly, unless defined somewhere else. However, reading the manual, I could not find a description of such behaviour.
Did I miss something, or is it an "undocumented feature"?

Yes, an output section will automatically match input sections with the same name, unless a different output section mentions them explicitly.
This is documented under Orphan Sections (emphasis mine):
Orphan sections are sections present in the input files which are not explicitly placed into the output file by the linker script. The
linker will still copy these sections into the output file by either
finding, or creating a suitable output section in which to place the
orphaned input section.
If the name of an orphaned input section exactly matches the name of
an existing output section, then the orphaned input section will be
placed at the end of that output section.
If there is no output section with a matching name then new output
sections will be created...
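The orphan handling can also be observed on a hosted GNU/Linux toolchain. What follows is a minimal sketch, assuming GCC and GNU ld with the default linker script, which has no BUFFER output section, so ld creates one for the orphan. It relies on the `__start_BUFFER` and `__stop_BUFFER` symbols that, according to the same manual section, ld provides for orphan sections whose names are valid C identifiers. The helper function names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Same declaration as in the question: outbuf is emitted into an input
   section named BUFFER, which no linker script statement mentions. */
char outbuf[4096] __attribute__((section("BUFFER")));

/* Assumption: the linker (GNU ld or a compatible one) provides these two
   symbols because "BUFFER" is representable as a C identifier. */
extern char __start_BUFFER[];
extern char __stop_BUFFER[];

/* Hypothetical helper: returns 1 if outbuf lies entirely inside the
   address range [__start_BUFFER, __stop_BUFFER). */
int outbuf_is_in_buffer_section(void)
{
    uintptr_t begin = (uintptr_t)__start_BUFFER;
    uintptr_t end = (uintptr_t)__stop_BUFFER;
    uintptr_t buf = (uintptr_t)outbuf;

    return buf >= begin && buf + sizeof outbuf <= end;
}

/* Hypothetical helper: size in bytes of the BUFFER output section. */
size_t buffer_section_size(void)
{
    return (size_t)(__stop_BUFFER - __start_BUFFER);
}
```

Calling `outbuf_is_in_buffer_section()` from `main` should report that the buffer sits inside the BUFFER output section, even though nothing in the script asked for it. In the question's script the only difference is that the BUFFER output section already exists, so the orphan is appended to it at the address the script assigned.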

Related

Linux kernel section names

Why do some Linux kernel section names use a single . and others use two?
For example, .data..page_aligned and .data..init_task vs. .data.unlikely and .data.once.
Does the number of . have a specific meaning?
Thanks
I have read the ld and kernel documentation and have not found any information.
The section name .data..init_task consists of three parts:
Prefix: .data
Delimiter: .
Name of the subsection: .init_task
Since the name of the subsection starts with a dot, the section name contains a sequence of two dots.
The section name .data.unlikely consists of three parts too, but the name of its subsection, unlikely, does not start with a dot.
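The decomposition described above can be spelled out in a few lines of C. This is only an illustration of the naming scheme, not something ld or the kernel does; the function names `prefix_length` and `subsection_of` are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Length of the prefix of a section name, e.g. 5 for ".data" in
   ".data..init_task". Returns 0 if the name does not start with a dot. */
size_t prefix_length(const char *section_name)
{
    if (section_name[0] != '.')
        return 0;
    /* Count the leading dot plus everything up to the next dot. */
    return 1 + strcspn(section_name + 1, ".");
}

/* Pointer to the subsection name, i.e. the text after the delimiter dot
   that follows the prefix. Returns NULL if there is no delimiter. */
const char *subsection_of(const char *section_name)
{
    const char *delimiter;

    if (section_name[0] != '.')
        return NULL;
    delimiter = strchr(section_name + 1, '.');
    return delimiter ? delimiter + 1 : NULL;
}
```

With these helpers, `.data..init_task` splits into the prefix `.data` and the subsection `.init_task`, while `.data.unlikely` splits into `.data` and `unlikely`. The double dot appears only because the subsection name itself begins with a dot.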

Get section of variable in linker script

I want to get the section (or at least its address) of a symbol in the object file from within a linker script. I looked through the list of builtin functions, but there is nothing like that. Can a linker script tell us which section a variable is located in?

Why do STM32 gcc linker scripts automatically discard all input sections from these standard libraries: libc.a, libm.a, libgcc.a?

From the bottom of any linker script generated by STM32CubeMX:
/* Remove information from the standard libraries */
/DISCARD/ :
{
libc.a ( * )
libm.a ( * )
libgcc.a ( * )
}
From the GNU Binutils ld (linker script) manual, 3.6.7 Output Section Discarding:
The special output section name ‘/DISCARD/’ may be used to discard input sections. Any input sections which are assigned to an output section named ‘/DISCARD/’ are not included in the output file.
What do these 3 input object files contain, and why do we discard everything (all input sections) from them?
Other STM32 linker script topics of interest:
Is accessing the "value" of a linker script variable undefined behavior in C?
How to get value of variable defined in ld linker script from C
It looks like, in this example, /DISCARD/ removes only the sections that are not explicitly placed earlier in the script. Since *(.text), *(.data), *(.bss), *(.init_array), etc. have already been matched earlier in the script, those sections end up in the ELF. But libc, libm, or libgcc may contain sections that the firmware does not need (e.g. .foo, .bar, .debug ...), so /DISCARD/ wipes those out, but NOT all sections!

How do I specify manual relocation for GCC code?

I am in a situation in an embedded system (an xtensa processor) where I need to manually override a symbol, but the symbol happens to be in the middle of another symbol. When I try using -Wl,--wrap=symbol it won't work, since the symbol isn't its own thing.
What I need to do is specify (preferably in a GCC .S, though .c is okay) where the code will end up. Though the actual symbol will be located somewhere random by the compiler, I will be memcpying the code into the correct place.
40101388 <replacement_user_vect>:
40101388: 13d100 wsr.excsave1 a0
4010138b: 002020 esync
4010138e: 011fc5 call0 4010258c <_UserExceptionVector_1>
My problem is that GCC creates the assembly with relative jumps, assuming the code will be located where it is in flash, while the eventual location will be fixed in an interrupt vector. How do I tell GCC / GNU as "put the code wherever you feel like, but trust me, it will actually execute from {here}"?
Though my code is at 0x40101388 (GCC decided), it will eventually reside at and execute from 0x40100050. How do I trick GCC by telling it "put the code HERE" but pretend it's located "HERE"?
EDIT: I was able to get around this, as it turns out, the function I needed to modify was held in the linker script, individually. I was able to just switch it out in the linker script. Though I still would love to know the answer, I now have a work-around.
In the linker script each output section has two associated addresses: VMA and LMA -- the address for which the code is linked and the address where the code will be loaded.
Put the code that needs to be relocated into a separate section, add an output section with the desired VMA and LMA to your linker script, and put an input section description matching the name of the code section inside it.
E.g. the following C code
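#include <string.h> /* memcpy */
/* In a function definition, GCC requires the attribute before the declarator. */
__attribute__((section(".relocatable1.text"))) void f(void)
{
...
}
/* Symbols defined by the linker script fragment that accompanies this code. */
extern char _relocatable1_lma[];
extern char _relocatable1_vma_start[];
extern char _relocatable1_vma_end[];
/* Copy the code from its load address (LMA) to the address it was linked for (VMA). */
void relocatable1_copy(void)
{
    memcpy(_relocatable1_vma_start, _relocatable1_lma,
           _relocatable1_vma_end - _relocatable1_vma_start);
}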
Together with the following piece of ld script, with VMA substituted with the desired target code location
SECTIONS {
...
.some_section : { ... }
.relocatable1 VMA : AT(LOADADDR(.some_section) + SIZEOF(.some_section)) {
_relocatable1_vma_start = . ;
*(.relocatable1.literal .relocatable1.text) ;
_relocatable1_vma_end = . ;
}
_relocatable1_lma = LOADADDR(.relocatable1) ;
...
}
should do what you want.

C extra # lines after preprocessing part [duplicate]

I was inspecting the preprocessed output generated by GCC, and I see a lot of these in the .i file that I generated using the -save-temps flag:
# 8 "/usr/include/i386-linux-gnu/gnu/stubs.h" 2 3 4
What do the numbers before and after the absolute path of stubs.h mean? It seems to be some kind of debugging information that is inserted by the preprocessor and allows the compiler to issue error messages referring to this information. These lines do not affect the program itself, but what specifically is each number for?
According to the documentation, the number before the filename is the line number. The numbers after the filename are flags with the following meanings:
1 indicates the start of a new file.
2 indicates returning to a file (after having included another file).
3 indicates that the following text comes from a system header file, so certain warnings should be suppressed.
4 indicates that the following text should be treated as being wrapped in an implicit extern "C" block.
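To make the layout of such a line concrete, here is a minimal sketch of a parser for it. The `struct linemarker` type and both function names are hypothetical, and the sketch assumes a filename without escaped quote characters and at most four flags.

```c
#include <stdio.h>
#include <string.h>

#define LINEMARKER_MAX_FLAGS 4

/* Hypothetical container for the fields of one linemarker line. */
struct linemarker {
    unsigned long line;               /* number before the filename */
    char file[256];                   /* filename without the quotes */
    int flags[LINEMARKER_MAX_FLAGS];  /* numbers after the filename */
    int flag_count;
};

/* Parses a line of the form: # linenum "filename" flags...
   Returns 1 on success and 0 if the text is not a linemarker. */
int parse_linemarker(const char *text, struct linemarker *out)
{
    int consumed = 0;
    int flag, n;
    const char *p;

    memset(out, 0, sizeof *out);
    /* %n is only stored if the closing quote was matched as well. */
    if (sscanf(text, "# %lu \"%255[^\"]\"%n",
               &out->line, out->file, &consumed) != 2 || consumed == 0)
        return 0;

    /* Collect the optional flags that follow the filename. */
    p = text + consumed;
    while (out->flag_count < LINEMARKER_MAX_FLAGS &&
           sscanf(p, " %d%n", &flag, &n) == 1) {
        out->flags[out->flag_count++] = flag;
        p += n;
    }
    return 1;
}

/* Returns 1 if the given flag (1, 2, 3 or 4) is present. */
int linemarker_has_flag(const struct linemarker *m, int flag)
{
    int i;

    for (i = 0; i < m->flag_count; i++)
        if (m->flags[i] == flag)
            return 1;
    return 0;
}
```

For the line quoted in the question, this yields line number 8, the path of stubs.h as the filename, and the flags 2, 3 and 4: the preprocessor is returning to that file, it is a system header, and its contents are to be treated as wrapped in extern "C".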
