GRUB Multiboot header not found - linker

After reading this question and its primary answer, I ran readelf on my kernel and noticed that my .text section was at 0x00101000 and not 0x00100000. I also noticed a section above it, shown as .note.gnu.build-i, sitting in the place where the .text section is supposed to be. Is there a way I could make my .text section be in the correct place? I have already used align 4 to set it to 1M.

The issue is that LD (whether invoked directly or via GCC) automatically places a notes section (if one was generated) in the first 4k. If you link the final kernel with GCC, pass it the -Wl,--build-id=none option. If you use LD directly to link the final binary, pass it --build-id=none.
If you are writing a multiboot ELF object, the existence of this extra section can push the multiboot header beyond the 8k position in the file. This accounts for the fact that the ELF headers usually take up the first 4k of the file. Add the 4k for .note.gnu.build-id, and the .multiboot section is now beyond the 8k mark of the physical file. A multiboot loader like GRUB will then conclude that your ELF executable doesn't have a multiboot header, since it only searches the first 8k of the file.
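To check whether a built kernel image would pass that search, the rule can be sketched in a few lines of Python. This is only an illustration of the Multiboot 1 constraints (magic 0x1BADB002, 32-bit alignment, header fully inside the first 8192 bytes, magic + flags + checksum equal to zero modulo 2**32), not GRUB's actual code; the function name find_multiboot_header is made up for this example.

```python
import struct

MULTIBOOT_MAGIC = 0x1BADB002   # Multiboot 1 header magic
SEARCH_LIMIT = 8192            # the header must lie entirely within the first 8 KiB


def find_multiboot_header(image: bytes):
    """Return the file offset of a valid Multiboot 1 header, or None.

    Hypothetical helper: it mirrors the rules of the Multiboot 1
    specification, not the implementation of any particular loader.
    """
    limit = min(len(image), SEARCH_LIMIT)
    # The header is 32-bit aligned, so only offsets divisible by 4 are candidates.
    # Stopping at limit - 11 keeps the 12 required bytes inside the search window.
    for offset in range(0, limit - 11, 4):
        magic, flags, checksum = struct.unpack_from("<III", image, offset)
        # magic + flags + checksum must be zero modulo 2**32.
        if magic == MULTIBOOT_MAGIC and (magic + flags + checksum) & 0xFFFFFFFF == 0:
            return offset
    return None
```

Calling find_multiboot_header(open("kernel.elf", "rb").read()) (kernel.elf being a placeholder name) returns None when the header has been pushed past the 8k mark, which is the situation described above.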

Your linker script (given that it is the same as in the other question) is the problem: By telling it to 4k-align the sections and putting multiboot in a separate section, you allocate 4k for it, so .text starts at an offset of 1M + 4k, which causes your problem. Change it to the following:
SECTIONS
{
. = 1M;
.text ALIGN(4K) :
{
*(.multiboot)
*(.text)
}
[snip]

Related

gcc: how to produce ELF where file size equals mem size for all LOAD segments without custom linker script?

I have to produce an ELF binary with gcc from a Hello World program written in C, where the mem size equals the file size in all LOAD segments of the ELF file. In my experience, I can achieve this by moving .bss into .data in a custom linker script. But in my case, I want to achieve this without a custom linker script.
Is there a way I can force all LOAD-segments to have the same file size as mem size with an option for GCC?
Background: I'm working on enabling Linux binaries on a custom OS. The ELF loader is pretty basic so far, and testing/developing will be much simpler if I can just map the ELF as it is (as long as all LOAD segments are page-aligned).
For completeness, I provide the solution that includes a dedicated linker script. The relevant excerpt is the following:
.data ALIGN(4K) :
{
*(.data .data.*)
/* Putting .bss into the .data segment simplifies loading an ELF file especially in kernel
scenarios. Some basic ELF loaders in OS dev space require MEMSIZE==FILESIZE for each
LOAD segment. The zeroed memory will land "as is" in the ELF and increase its size.
I'm not sure why but "*(COMMON)" must be specified as well so that the .bss section
actually lands in .data. But the GNU ld doc also does it like this:
https://sourceware.org/binutils/docs/ld/Input-Section-Common.html */
*(COMMON)
*(.bss .bss.*)
} : rw
It is important that the output section is not called ".bss" and that the section contains more than just ".bss". Otherwise, the linker applies the "FILESIZE != MEMSIZE" optimization, and the ELF loader needs to provide zeroed memory.
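Whether a given binary actually satisfies MEMSIZE == FILESIZE can be verified by reading the program headers directly. The following is a minimal sketch, assuming a well-formed ELF32 or ELF64 file; the function names load_segments and segments_needing_zero_fill are invented for this example, and the field offsets come from the ELF specification.

```python
import struct

PT_LOAD = 1  # program header type of loadable segments


def load_segments(elf: bytes):
    """Yield (p_vaddr, p_filesz, p_memsz) for every PT_LOAD program header."""
    if elf[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is64 = elf[4] == 2                   # EI_CLASS: 1 = ELF32, 2 = ELF64
    end = "<" if elf[5] == 1 else ">"    # EI_DATA: 1 = little-endian, 2 = big-endian
    if is64:
        (e_phoff,) = struct.unpack_from(end + "Q", elf, 32)
        e_phentsize, e_phnum = struct.unpack_from(end + "HH", elf, 54)
    else:
        (e_phoff,) = struct.unpack_from(end + "I", elf, 28)
        e_phentsize, e_phnum = struct.unpack_from(end + "HH", elf, 42)
    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        if is64:
            # Elf64_Phdr: type, flags, offset, vaddr, paddr, filesz, memsz, ...
            p_type, _, _, p_vaddr, _, p_filesz, p_memsz = struct.unpack_from(
                end + "IIQQQQQ", elf, off)
        else:
            # Elf32_Phdr: type, offset, vaddr, paddr, filesz, memsz, ...
            p_type, _, p_vaddr, _, p_filesz, p_memsz = struct.unpack_from(
                end + "IIIIII", elf, off)
        if p_type == PT_LOAD:
            yield p_vaddr, p_filesz, p_memsz


def segments_needing_zero_fill(elf: bytes):
    """Return the LOAD segments whose memory size differs from their file size."""
    return [seg for seg in load_segments(elf) if seg[1] != seg[2]]
```

An empty result from segments_needing_zero_fill means every LOAD segment can be mapped as it is; any entry in the list is a segment for which the loader would still have to provide zeroed memory.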

ELF second load segment address of .data + .bss

In this case, is it right that:
.data starts at 0x08048054 and runs up to 0x08048054+0x0000e
.bss starts at 0x08048054+0x0000e and runs up to 0x08048054+0x00016
or am I missing something? Please clarify it for me.
EDIT
I used this command to get the information as in the image:
readelf -l filename
Ok, so where do I begin... Yes both .data and .bss are in that region in memory. The problem is that there is no way to figure out what order they are in.
We can assume that the default order is followed and make an educated guess but I don't like that.
Through the lengthy comment thread under the question you mentioned something interesting, that wasn't evident in your question.
the executable isn't dynamically linked as file command says: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically linked, stripped in this case, there's no a linker script, isn't? – The Mask
In this case the library contains the symbol table with all of the symbol offsets. This table includes section information. It will be processed by the linker when you compile your application. At that point it is your linker script that controls the order in which the .data and .bss sections are output.
If it is the default linker script, look it up. If it is custom, you should have access to it and can read it. If unsure elaborate here and we'll try and help :)
I myself have asked a question that is unrelated but offers example code of a linker script and some C code. In that linker script the .bss section came after the .data section.
You are looking at the program header information, whereas the section headers are probably what you need. There may be many sections contained within a program header and you cannot precisely infer the sizes and alignment requirements of the various sections.
To see the section headers, use:
readelf -S filename
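What the program header alone does tell you is where the file-backed part of the segment ends and the zero-filled part begins. Here is a small sketch of that arithmetic, assuming that the 0x0000e and 0x00016 values in the question are the segment's file size and memory size (the readelf output itself is not reproduced here); segment_ranges is a name made up for the example.

```python
def segment_ranges(p_vaddr: int, p_filesz: int, p_memsz: int):
    """Split a LOAD segment into its file-backed and zero-filled address ranges.

    Each range is a half-open (start, end) pair of virtual addresses.
    """
    file_end = p_vaddr + p_filesz
    mem_end = p_vaddr + p_memsz
    return (p_vaddr, file_end), (file_end, mem_end)


# Values quoted in the question, interpreted as VirtAddr, FileSiz and MemSiz.
backed, zeroed = segment_ranges(0x08048054, 0x0E, 0x16)
print(f"file-backed: {backed[0]:#010x}..{backed[1]:#010x}")
print(f"zero-filled: {zeroed[0]:#010x}..{zeroed[1]:#010x}")
```

With the default layout the file-backed range usually holds .data and the zero-filled range holds .bss, but that mapping is a guess until the section headers confirm it.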

Why does objdump not show .bss, .shstratab, .symtab and .strtab sections?

I'm currently doing my own objdump implementation in C.
For my -s option, I have to show the full contents of the sections of an ELF file.
I'm doing it well, but I'm showing more sections than the "real" objdump.
In fact, it does not output the .bss, .shstrtab, .symtab and .strtab sections.
I'm looking around the sh_flags value on the Shdr struct but I can't find any logic...
Why does objdump -s <ELF file> not show these sections?
Why does objdump -s not show these sections?
Objdump is based on libbfd, which abstracts away many complexities of ELF, and was written when objects tended to only have three sections.
As such, objdump is quite deficient. In addition to not showing you (some) existing sections, it may also "synthesize" sections that don't exist at all, and do other weird tricks. This is more of a libbfd fault -- its abstraction layer simply doesn't tell objdump about the "missing" sections.
TL;DR: don't use objdump. Use readelf instead.
Try using sh_size and sh_type, instead of sh_flags.
Quoting from the ELF specification
sh_size This member gives the section’s size in bytes. Unless the
section type is SHT_NOBITS, the section occupies sh_size bytes in the
file. A section of type SHT_NOBITS may have a non-zero size, but it
occupies no space in the file
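The quoted rule translates into a simple filter. The sketch below is written in Python for brevity and only covers the SHT_NOBITS case, so it accounts for .bss being skipped; it makes no claim about why objdump omits .symtab, .strtab and .shstrtab. The has_file_contents helper and the sample section list are hypothetical.

```python
SHT_NOBITS = 8  # section type for sections that occupy no space in the file


def has_file_contents(sh_type: int, sh_size: int) -> bool:
    """True if the section stores at least one byte in the file."""
    return sh_type != SHT_NOBITS and sh_size > 0


# Hypothetical (name, sh_type, sh_size) tuples standing in for parsed Shdr entries.
sections = [(".text", 1, 0x120), (".data", 1, 0x10), (".bss", SHT_NOBITS, 0x40)]
dumpable = [name for name, sh_type, sh_size in sections
            if has_file_contents(sh_type, sh_size)]
```

In a C implementation the same test would be applied to the sh_type and sh_size members of each Shdr before dumping its contents.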

Why Executable and Linkable Format(ELF) File contains set of sections?

These days I'm studying the file handling system calls in Linux.
Along the way I learned that ELF, the Executable and Linkable Format, contains a set of sections.
Those are .bss, .data, .rodata, .text, .comment, and unknown.
I referred to Wikipedia and this website to study.
So I have the questions below:
Why does an ELF file use a set of sections?
What is the task of each of the sections above?
What is the feasibility of using this set of sections?
A good reference for the ELF file format is the Object Files chapter of the System V ABI. In particular, special sections describes the uses of most of the sections you're likely to encounter.
Why does an ELF file use a set of sections?
An object file contains lots of different classes of data, and it makes sense to group similar data into sections, especially since some sections' contents can be read directly into a process's image when the OS execs the ELF file.
.bss contains uninitialized data, such as int a; declared at global level in a C program. Actually, it contains nothing except the size that needs to be allocated when the ELF file is loaded into a process, because all variables in bss are initialized to 0.
.data contains initialized data, such as int a = 1000; declared at global level in a C program.
.rodata contains read-only data, such as character string literals and global level variables declared as const in C. When the OS execs the ELF file, it will load this section into an area of memory that is read-only.
.text contains executable instructions. When the OS execs the ELF file, it will load this section into an area of memory that is read-only. Sometimes .text and .rodata wind up being loaded into the same area of a process's memory.
.comment typically contains the name and version of the compiler(s) used to generate the file.
Not all of the sections described in the documentation may be present in all ELF files; in particular, running the strip command on the ELF file will remove the .symtab and .debug sections.

Linker scripts: strategies for debugging?

I'm trying to debug a linker problem that I have, when writing a kernel.
The issue is that I have a variable SCAN_CODE_MAPPING that I'm not able to use -- it appears to be empty or something. I can fix this by changing the way I link my program, but I don't know why.
When I look inside the generated binary file using objdump, the data for the variable is definitely there, so there's just something broken with the reference to it.
Here's a gist with both of the linker scripts and the part of the symbol table that's different between the two files.
What confuses me is that both of the symbol tables have all the same symbols, they're all the same length, and they appear to contain the right data. The only difference that I can see is that they're not in the same order.
So far I've tried
inspecting the SCAN_CODE_MAPPING memory location to make sure it has the data I expect and hasn't been zeroed out
checking that all the symbols are the same
checking that all the symbol contents are the same length
looking at .data.rel.ro.local to make sure it has the address of the data
One possible clue is this warning:
warning: uninitialized space declared in non-BSS section `.text': zeroing
which I get in both the broken and the correct case.
What should I try next?
The problem here turned out to be that I was writing an OS, and only 12k of it was being loaded instead of the whole thing. So the linker script was actually working fine.
The main tools I used to understand binaries were:
nm
objdump
readelf
You can get a ton more information using "readelf".
In particular, take a look at the program headers:
readelf -l program
Your BSS section is quite different from the standard one, which is probably causing the warning. Here's what the default looks like on my system:
.bss :
{
*(.dynbss)
*(.bss .bss.* .gnu.linkonce.b.*)
*(COMMON)
/* Align here to ensure that the .bss section occupies space up to
_end. Align after .bss to ensure correct alignment even if the
.bss section disappears because there are no input sections.
FIXME: Why do we need it? When there is no .bss section, we don't
pad the .data section. */
. = ALIGN(. != 0 ? 64 / 8 : 1);
}
If an input section doesn't match anything in your linker script, the linker still has to place it somewhere. Make sure you're covering all the input sections.
Note that there is a difference between sections and segments. Sections are used by the linker, but the only thing the program loader looks at are the segments. The text segment includes the text section, but it also includes other sections. Sections that go into the same segment must be adjacent. So order does matter.
The rodata section usually goes after the text section. These are both read-only during execution and will show up once in your program headers as a LOAD entry with read & execute permissions. That LOAD entry is the text segment.
The bss section usually goes after the data section. These are both writable during execution and will show up once in your program headers as a LOAD entry with read & write permissions. That LOAD entry is the data segment.
If you change the order, it affects how the linker generates the program headers. The program headers, rather than the section headers, are used when loading your program prior to executing it. Make sure to check the program headers when using a custom linker script.
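When checking the program headers, the permission bits of each LOAD entry are what distinguish the text segment from the data segment. A minimal sketch of decoding them, using the PF_X, PF_W and PF_R bit values from the ELF specification; the describe_flags helper and its "r-x" style output are choices made for this example, not the format of any particular tool.

```python
PF_X, PF_W, PF_R = 0x1, 0x2, 0x4  # p_flags permission bits from the ELF specification


def describe_flags(p_flags: int) -> str:
    """Render p_flags as a three-character string such as 'r-x' or 'rw-'."""
    return "".join(
        letter if p_flags & bit else "-"
        for bit, letter in ((PF_R, "r"), (PF_W, "w"), (PF_X, "x"))
    )
```

A text segment would typically come out as "r-x" and a data segment as "rw-"; a LOAD entry that is both writable and executable after a linker script change is a sign that sections were merged into a segment they do not belong in.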
If you can give more specifics about what your actual symptoms are, then it'll be easier to help.

Resources