Predefined ELF Code Sections - linker

What are the predefined code sections that can be referenced in an ELF linker command file? In addition to any others that may be available, I am specifically wondering about these:
.text
.rodata
.sdata
.sbss
.bss
.data
Finding documentation has proven most difficult. If anyone can also tell me what the acronym ELF stands for in this context, that would be a plus. Thanks.

Not sure what you mean about not finding documentation. ELF stands for Executable and Linkable Format, and Wikipedia has a large collection of links about it. One of the links there describes the ELF sections you are interested in (plus lots of other stuff). Another link here describes additional ELF special sections (.sbss/.sdata).
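As a rough summary of what those documents say, here is a minimal sketch of the conventional section type and flags for each name in the question. The table and the describe helper are illustrative names, not part of any tool. The .sdata/.sbss rows follow the usual small-data convention, and a processor supplement may add its own flags on top of these.

```python
# Conventional type/flag assignments for the sections asked about.
SHT_PROGBITS = 1   # section holds bytes stored in the file
SHT_NOBITS = 8     # section occupies memory but no file space

SHF_WRITE = 0x1
SHF_ALLOC = 0x2
SHF_EXECINSTR = 0x4

SPECIAL_SECTIONS = {
    ".text":   (SHT_PROGBITS, SHF_ALLOC | SHF_EXECINSTR),  # code
    ".rodata": (SHT_PROGBITS, SHF_ALLOC),                  # read-only data
    ".data":   (SHT_PROGBITS, SHF_ALLOC | SHF_WRITE),      # initialized data
    ".bss":    (SHT_NOBITS,   SHF_ALLOC | SHF_WRITE),      # uninitialized data
    ".sdata":  (SHT_PROGBITS, SHF_ALLOC | SHF_WRITE),      # small initialized data
    ".sbss":   (SHT_NOBITS,   SHF_ALLOC | SHF_WRITE),      # small uninitialized data
}

def describe(name):
    """Summarize permissions and file backing for one section name."""
    sh_type, flags = SPECIAL_SECTIONS[name]
    perms = "r"
    perms += "w" if flags & SHF_WRITE else "-"
    perms += "x" if flags & SHF_EXECINSTR else "-"
    stored = "in file" if sh_type == SHT_PROGBITS else "zero-filled at load"
    return f"{name}: {perms}, {stored}"

for section in SPECIAL_SECTIONS:
    print(describe(section))
```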

Related

gcc: how to produce ELF where file size equals mem size for all LOAD segments without custom linker script?

I have to produce an ELF binary with gcc from a Hello World program written in C, where the mem size equals the file size in all LOAD segments of the ELF file. In my experience, I can achieve this by moving .bss into .data in a custom linker script. But in my case, I want to achieve this without a custom linker script.
Is there a way I can force all LOAD-segments to have the same file size as mem size with an option for GCC?
Background: I'm working on enabling Linux binaries on a custom OS. The ELF loader so far is pretty basic, and testing/developing will be much simpler if I can just map the ELF as it is (as long as all LOAD segments are page-aligned).
For completeness, I provide the solution that includes a dedicated linker script. The relevant excerpt is the following:
.data ALIGN(4K) :
{
*(.data .data.*)
/* Putting .bss into the .data segment simplifies loading an ELF file especially in kernel
scenarios. Some basic ELF loaders in OS dev space require MEMSIZE==FILESIZE for each
LOAD segment. The zeroed memory will land "as is" in the ELF and increase its size.
I'm not sure why but "*(COMMON)" must be specified as well so that the .bss section
actually lands in .data. But the GNU ld doc also does it like this:
https://sourceware.org/binutils/docs/ld/Input-Section-Common.html */
*(COMMON)
*(.bss .bss.*)
} : rw
It is important that the output section is not called ".bss" and that the section contains more than just ".bss". Otherwise, the linker applies the "FILESIZE != MEMSIZE" optimization, and the ELF loader then needs to provide zeroed memory.
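To verify that a given build actually meets the requirement, the program headers can be checked directly. The following is a minimal sketch that assumes a little-endian ELF64 image held in memory; the function names load_segments and mismatched_segments are made up for illustration. It only checks the result and does not change how the file is linked.

```python
import struct

PT_LOAD = 1
EHDR64 = struct.Struct("<16sHHIQQQIHHHHHH")   # Elf64_Ehdr, little-endian
PHDR64 = struct.Struct("<IIQQQQQQ")           # Elf64_Phdr, little-endian

def load_segments(image):
    """Yield (p_offset, p_filesz, p_memsz) for each PT_LOAD program header."""
    ehdr = EHDR64.unpack_from(image, 0)
    ident, e_phoff, e_phentsize, e_phnum = ehdr[0], ehdr[5], ehdr[9], ehdr[10]
    # Assumption: only ELFCLASS64 (2) with ELFDATA2LSB (1) is handled here.
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("expected a little-endian ELF64 image")
    for i in range(e_phnum):
        fields = PHDR64.unpack_from(image, e_phoff + i * e_phentsize)
        p_type, p_offset, p_filesz, p_memsz = fields[0], fields[2], fields[5], fields[6]
        if p_type == PT_LOAD:
            yield p_offset, p_filesz, p_memsz

def mismatched_segments(image):
    """Return the LOAD segments whose file size differs from their memory size."""
    return [seg for seg in load_segments(image) if seg[1] != seg[2]]
```

An empty list from mismatched_segments(open(path, "rb").read()) would mean every LOAD segment can be mapped as it is stored in the file.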

Linker (ld) ELF Questions

I have an issue with an ELF file generated by the GNU linker ld.
The result is that the data section (.data) gets corrupted when the executable is loaded into memory. The corruption to the .data section occurs when the loader performs the relocation on the .eh_frame section using the relocation data (.rela.eh_frame).
What happens is that this relocation causes seven writes that are beyond the .eh_frame section and over-write the correct contents of the .data section which is adjacent to the top of the .eh_frame section.
After some investigation, I believe the loader is behaving correctly, but the ELF file it has been given contains an error.
But I could be wrong and wanted to check what I've found so far.
Using readelf on the ELF file, it can be seen that seven of the entries in the .rela.eh_frame section contain offsets that are outside (above) the range given by readelf for the .eh_frame section, i.e. the seven offsets in .rela.eh_frame are greater than the length given for .eh_frame. When these seven offsets are applied in the relocation, they corrupt the .data section.
So my questions are:
(1) Is my deduction correct that relocation offsets should not be greater than the length of the section to which they apply, and that the generated ELF file is therefore in error?
(2) What are people's opinions on the best way of proceeding to diagnose the cause of the incorrect ELF file? Are there any options to ld that will help, or any options that will remove/fix the .eh_frame and its relocation counterpart .rela.eh_frame?
(3) How would I discover what linker script is being used when the ELF file is generated?
(4) Is there a specific forum where I might find a whole pile of linker experts who would be able to help? I appreciate this is a highly technical question and that many people may not have a clue what I'm talking about!
Thanks for any help!
The .eh_frame section is not supposed to have any run-time relocations. All offsets are fixed when the link editor is run (because the object layout is completely known at this point) and the ET_EXEC or ET_DYN object is created. Only ET_REL objects have relocations in that section, and those are never seen by the dynamic linker. So something odd must be going on.
You can ask such questions on the binutils list or the libc-help list (if you use the GNU toolchain).
EDIT It seems that you are using a toolchain configured for ZCX exceptions with a target which expects SJLJ exceptions. AdaCore has some documentation about this:
GNAT User's Guide Supplement for Cross Platforms 19.0w documentation » VxWorks Topics
Zero Cost Exceptions on PowerPC Targets
It doesn't quite say how to switch to the SJLJ-based VxWorks 5 toolchain. It is definitely not a matter of using the correct linker script. The choice of exception handling style affects code generation, too.
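For the bounds check described in the question, one detail matters when comparing numbers from readelf: in an ET_REL object r_offset is a byte offset from the start of the section, while in an ET_EXEC or ET_DYN object it is a virtual address. A minimal sketch of the comparison follows, with out_of_range as a hypothetical helper name; it ignores the width of the relocated field.

```python
ET_REL = 1  # relocatable object file

def out_of_range(e_type, sec_addr, sec_size, r_offsets):
    """Return the r_offset values that fall outside the target section."""
    # ET_REL: r_offset is section-relative. Otherwise it is a virtual address,
    # so it must be compared against the section's sh_addr range.
    base = 0 if e_type == ET_REL else sec_addr
    return [off for off in r_offsets if not (base <= off < base + sec_size)]
```

Comparing a virtual address against a section length would flag entries that are in fact valid, so it is worth confirming which file type the readelf output came from before concluding that the file is wrong.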

ELF second load segment address of .data + .bss

In this case, is it right that the addresses are:
.data starts at 0x08048054 and runs up to 0x08048054+0x0000e
.bss starts at 0x08048054+0x0000e and runs up to 0x08048054+0x00016
or am I missing something? Please clarify it for me.
EDIT
I used this command to get the information as in the image:
readelf -l filename
Ok, so where do I begin... Yes both .data and .bss are in that region in memory. The problem is that there is no way to figure out what order they are in.
We can assume that the default order is followed and make an educated guess but I don't like that.
Through the lengthy comment thread under the question you mentioned something interesting, that wasn't evident in your question.
the executable isn't dynamically linked, as the file command says: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically linked, stripped. In this case, there's no linker script, is there? – The Mask
In this case the library contains the symbol table with all of the symbol offsets. This table includes section information. It will be processed by the linker when you compile your application. At that point it is your linker script that controls the order in which the .data and .bss sections are output.
If it is the default linker script, look it up. If it is custom, you should have access to it and can read it. If unsure elaborate here and we'll try and help :)
I myself have asked a question that is unrelated but offers example code of a linker script and some C code. In that linker script the .bss segment came after the .data segment.
You are looking at the program header information, whereas the section headers are probably what you need. There may be many sections contained within a program header and you cannot precisely infer the sizes and alignment requirements of the various sections.
To see the section headers, use:
readelf -S filename
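What the program header alone does tell you is where the file-backed part of the segment ends and the zero-filled part begins. A minimal sketch of that arithmetic, assuming the segment in the question has p_vaddr 0x08048054, p_filesz 0xe and p_memsz 0x16 (values inferred from the question text, since the image is not available):

```python
def split_load_segment(p_vaddr, p_filesz, p_memsz):
    """Return (file_backed, zero_filled) half-open address ranges."""
    file_end = p_vaddr + p_filesz
    mem_end = p_vaddr + p_memsz
    return (p_vaddr, file_end), (file_end, mem_end)

# Assumed values, taken from the numbers quoted in the question.
file_backed, zero_filled = split_load_segment(0x08048054, 0x0E, 0x16)
print([hex(a) for a in file_backed])
print([hex(a) for a in zero_filled])
```

The zero-filled tail is where a NOBITS section such as .bss would normally sit, but padding and alignment can blur the boundary, which is why the section headers are the reliable source for exact section bounds.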

Why does objdump not show .bss, .shstrtab, .symtab and .strtab sections?

I'm currently doing my own objdump implementation in C.
For my -s option, I have to show the full contents of the sections of an ELF file.
It mostly works, but I'm showing more sections than the "real" objdump.
Specifically, the real objdump does not output the .bss, .shstrtab, .symtab and .strtab sections.
I'm looking around the sh_flags value on the Shdr struct but I can't find any logic...
Why does objdump -s <ELF file> not show these sections?
Why does objdump -s not show these sections?
Objdump is based on libbfd, which abstracts away many complexities of ELF, and was written when objects tended to only have three sections.
As such, objdump is quite deficient. In addition to not showing you (some) existing sections, it may also "synthesize" sections that don't exist at all, and do other weird tricks. This is more of a libbfd fault -- its abstraction layer simply doesn't tell objdump about the "missing" sections.
TL;DR: don't use objdump. Use readelf instead.
Try using sh_size and sh_type, instead of sh_flags.
Quoting from the ELF specification:
sh_size This member gives the section’s size in bytes. Unless the
section type is SHT_NOBITS, the section occupies sh_size bytes in the
file. A section of type SHT_NOBITS may have a non-zero size, but it
occupies no space in the file
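Applied to a dump loop, the quoted rule might look like the sketch below. The helper names bytes_in_file and dumpable are hypothetical, and sections are assumed to be (name, sh_type, sh_size) tuples already read from the section header table.

```python
SHT_NULL = 0    # inactive section header entry
SHT_NOBITS = 8  # occupies memory only, no bytes in the file

def bytes_in_file(sh_type, sh_size):
    """Number of file bytes a section occupies, per the quoted sh_size rule."""
    return 0 if sh_type in (SHT_NULL, SHT_NOBITS) else sh_size

def dumpable(sections):
    """Names of sections that have file contents to hex-dump."""
    return [name for name, sh_type, sh_size in sections
            if bytes_in_file(sh_type, sh_size) > 0]
```

This accounts for .bss only. The .symtab, .strtab and .shstrtab sections do occupy file bytes, so their absence from objdump output is better explained by the libbfd abstraction mentioned in the other answer than by sh_size.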

Linker scripts: strategies for debugging?

I'm trying to debug a linker problem that I have, when writing a kernel.
The issue is that I have a variable SCAN_CODE_MAPPING that I'm not able to use -- it appears to be empty or something. I can fix this by changing the way I link my program, but I don't know why.
When I look inside the generated binary file using objdump, the data for the variable is definitely there, so there's just something broken with the reference to it.
Here's a gist with both of the linker scripts and the part of the symbol table that's different between the two files.
What confuses me is that both of the symbol tables have all the same symbols, they're all the same length, and they appear to contain the right data. The only difference that I can see is that they're not in the same order.
So far I've tried:
inspecting the SCAN_CODE_MAPPING memory location to make sure it has the data I expect and hasn't been zeroed out
checking that all the symbols are the same
checking that all the symbol contents are the same length
looking at .data.rel.ro.local to make sure it has the address of the data
One possible clue is this warning:
warning: uninitialized space declared in non-BSS section `.text': zeroing
which I get in both the broken and the correct case.
What should I try next?
The problem here turned out to be that I was writing an OS, and only 12k of it was being loaded instead of the whole thing. So the linker script was actually working fine.
The main tools I used to understand binaries were:
nm
objdump
readelf
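A quick way to catch the "only part of the image was loaded" case is to compare the number of bytes the loader reads with the end of the last LOAD segment in the file. This is a minimal sketch with hypothetical helper names; it assumes the loader copies a flat prefix of the file, and the (p_offset, p_filesz) pairs would come from readelf -l.

```python
def bytes_to_load(load_segments):
    """Smallest file prefix that contains every LOAD segment's file data."""
    return max((off + filesz for off, filesz in load_segments), default=0)

def is_fully_loaded(load_segments, loaded_bytes):
    """True if a loader reading loaded_bytes from offset 0 covers all segments."""
    return loaded_bytes >= bytes_to_load(load_segments)
```

With made-up segments such as [(0x1000, 0x2000), (0x3000, 0x1800)], a loader that reads 12 * 1024 bytes falls short of the 0x4800 bytes required, which would leave later data unloaded even though the linker output is correct.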
You can get a ton more information using "readelf".
In particular, take a look at the program headers:
readelf -l program
Your BSS section is quite different from the standard one, which is probably causing the warning. Here's what the default looks like on my system:
.bss :
{
*(.dynbss)
*(.bss .bss.* .gnu.linkonce.b.*)
*(COMMON)
/* Align here to ensure that the .bss section occupies space up to
_end. Align after .bss to ensure correct alignment even if the
.bss section disappears because there are no input sections.
FIXME: Why do we need it? When there is no .bss section, we don't
pad the .data section. */
. = ALIGN(. != 0 ? 64 / 8 : 1);
}
If an input section doesn't match anything in your linker script, the linker still has to place it somewhere. Make sure you're covering all the input sections.
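One way to look for uncovered input sections is to compare the section names found in the object files (for example from readelf -S) with the patterns in the script. The sketch below approximates ld's section-name wildcards with Python's fnmatch; orphan_sections is a hypothetical helper, and it ignores the file-name part of ld patterns as well as keywords such as KEEP or EXCLUDE_FILE.

```python
from fnmatch import fnmatchcase

def orphan_sections(input_sections, script_patterns):
    """Return input section names that no script pattern matches."""
    return [name for name in input_sections
            if not any(fnmatchcase(name, pat) for pat in script_patterns)]
```

Any name this returns is one the linker would have to place on its own, so those are good candidates to add to the script explicitly.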
Note that there is a difference between sections and segments. Sections are used by the linker, but the only thing the program loader looks at are the segments. The text segment includes the text section, but it also includes other sections. Sections that go into the same segment must be adjacent. So order does matter.
The rodata section usually goes after the text section. These are both read-only during execution and will show up once in your program headers as a LOAD entry with read & execute permissions. That LOAD entry is the text segment.
The bss section usually goes after the data section. These are both writable during execution and will show up once in your program headers as a LOAD entry with read & write permissions. That LOAD entry is the data segment.
If you change the order, it affects how the linker generates the program headers. The program headers, rather than the section headers, are used when loading your program prior to executing it. Make sure to check the program headers when using a custom linker script.
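To see how a section order ends up in the program headers, it can help to compute which sections each LOAD segment covers, similar in spirit to the section-to-segment listing that readelf -l prints. This is a simplified sketch based only on virtual addresses; sections_in_segment is a hypothetical name, a segment is (p_vaddr, p_memsz), and sections are (name, sh_addr, sh_size) tuples.

```python
def sections_in_segment(segment, sections):
    """Return names of sections lying inside the segment, in address order."""
    seg_start, seg_memsz = segment
    seg_end = seg_start + seg_memsz
    inside = [(addr, name) for name, addr, size in sections
              if size > 0 and seg_start <= addr and addr + size <= seg_end]
    return [name for addr, name in sorted(inside)]
```

If a section that should be loaded does not appear under any LOAD segment, or appears under one with the wrong permissions, the linker script's ordering is the first thing to check.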
If you can give more specifics about what your actual symptoms are, then it'll be easier to help.

Resources