Difference between Program header and Section Header in ELF - c

Q1 What is the difference between Program header and Section Header in ELF?
Q1.1 What is the difference between segment and a section?
I believe pheaders point to sections only.
Q2. What is the difference between File Header and Program Header?
As per the GNU ld linker script documentation, Using ld: The GNU Linker:
PHDRS
{
name type [ FILEHDR ] [ PHDRS ] [ AT ( address ) ]
[ FLAGS ( flags ) ] ;
}
You may use the FILEHDR and PHDRS keywords after the program header type to further
describe the contents of the segment. The FILEHDR keyword means that the segment should include
the ELF file header. The PHDRS keyword means that the segment should include the ELF program
headers themselves.
This is a bit confusing.

The Executable & Linkable Format Wikipedia page has a nice picture explaining ELF and the difference between its program header table and its section header table. See also elf(5).
The program header table [usually near the start of the file] defines segments: the regions projected into the virtual address space of the process running that ELF executable at execve(2) time (the executable point of view). The section header table [usually at the end of the file] defines sections (the linkable point of view, for ld(1) etc.). A section that is needed at run time lies inside a segment and is mapped into memory; other sections (for example .symtab or .comment) belong to no segment and are not visible at execution time. The ELF file header tells where the program header table and the section header table are.
Also use objdump(1) and readelf(1) to explore several ELF files (executables, shared objects, linkable objects) existing on your Linux system.
Levine's Linkers & Loaders book has a chapter explaining that in detail.
And Drepper's paper How to Write Shared Libraries also has some good explanation.

Q1 What is the difference between the Program header and the Section Header in ELF?
A program header describes a segment or other information that the system needs to prepare the program for execution.
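A section is a named, typed chunk of the file that can hold many kinds of content (code, data, symbol tables, relocation entries, and so on); a section header describes one section. Look here for details (search for Elf64_Shdr)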
A section (not its header) may lie inside a segment; the section header table itself is normally not part of any segment.
Q1.1 What is the difference between a segment and a section?
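A segment consists of zero or more sections, though this fact is transparent to the program header: a program header records only the segment's offset, address, and size, not which sections fall inside it.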
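The section/segment relationship above can be sketched in a few lines of Python. This is only an illustration: the function name sections_in_segment and all addresses and sizes below are made up, and the check is a simplified version of what readelf -l reports under "Section to Segment mapping". It assumes a section belongs to a segment when it has the SHF_ALLOC flag and its address range falls inside the segment's range.

```python
SHF_ALLOC = 0x2  # section occupies memory during process execution

def sections_in_segment(segment, sections):
    """Names of allocated sections whose address range lies inside the segment."""
    start = segment["p_vaddr"]
    end = start + segment["p_memsz"]
    return [
        s["name"]
        for s in sections
        if s["sh_flags"] & SHF_ALLOC
        and start <= s["sh_addr"]
        and s["sh_addr"] + s["sh_size"] <= end
    ]

# Hypothetical values, chosen only for illustration
text_segment = {"p_vaddr": 0x400000, "p_memsz": 0x2000}
sections = [
    {"name": ".text", "sh_flags": 0x6, "sh_addr": 0x401000, "sh_size": 0x500},
    {"name": ".rodata", "sh_flags": 0x2, "sh_addr": 0x401800, "sh_size": 0x100},
    {"name": ".comment", "sh_flags": 0x30, "sh_addr": 0, "sh_size": 0x2B},
]
mapped = sections_in_segment(text_segment, sections)
print(mapped)
```

Here .comment is not allocated, so it ends up in no segment at all, while .text and .rodata share one.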
Q2. What is the difference between File Header and Program Header?
The ELF file header appears at the start of every ELF file (see /usr/include/elf.h). Among other things, it records the file offset of the program header table and the number of program headers in the file.
The ELF file always starts with the ELF file header, which references the program headers. An executable or shared object needs at least one program header; a relocatable object file (.o) normally has none.
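As a rough illustration of what the file header holds, here is a minimal Python sketch that decodes it with the struct module. It assumes a 64-bit little-endian file; the field names follow Elf64_Ehdr in elf.h, while parse_elf64_header and the hand-built sample header are hypothetical and exist only so the example runs without a real file.

```python
import struct
from collections import namedtuple

# Layout of Elf64_Ehdr for a little-endian file (64 bytes in total)
EHDR_FMT = "<16sHHIQQQIHHHHHH"
ElfHeader = namedtuple(
    "ElfHeader",
    "e_ident e_type e_machine e_version e_entry e_phoff e_shoff "
    "e_flags e_ehsize e_phentsize e_phnum e_shentsize e_shnum e_shstrndx",
)

def parse_elf64_header(data):
    """Decode the ELF file header found at offset 0 of the image."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:  # ELFCLASS64, ELFDATA2LSB
        raise ValueError("this sketch handles only little-endian ELF64")
    return ElfHeader._make(struct.unpack_from(EHDR_FMT, data, 0))

# Hypothetical header: one program header at offset 64, no section headers
ident = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
sample = struct.pack(
    EHDR_FMT, ident, 2, 62, 1, 0x401000, 64, 0, 0, 64, 56, 1, 64, 0, 0
)
hdr = parse_elf64_header(sample)
print(hdr.e_phoff, hdr.e_phentsize, hdr.e_phnum)
```

On a real system the same function could be tried on the bytes of an existing binary, for example the result of open("/bin/ls", "rb").read(), assuming that file is a little-endian ELF64 executable.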

Related

Elf file format, what's the order?

Please help me fix my confusion:
In an ELF64 file, do we first have the ELF header, then immediately all the program headers, and then all the section headers?
Then why, for example, does the ELF header contain the offset of the first program header if it always starts 64 bytes after the start of the file? That seems to be redundant information.
If nothing else, it allows the header's size to be made larger in future versions of the format without losing backwards compatibility.
In an ELF64 file, do we first have the ELF header, then immediately all the program headers, and then all the section headers?
No, the program header, the section headers and the sections are wherever the headers say they are. There's no requirement for them to be immediately after one another or in any particular order.
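To make this concrete, here is a small Python sketch that finds the program headers only through e_phoff, e_phentsize and e_phnum. The image is built by hand and deliberately places the program header table at offset 128 rather than 64; read_program_headers and all the values are hypothetical, and little-endian ELF64 is assumed.

```python
import struct

EHDR_FMT = "<16sHHIQQQIHHHHHH"  # Elf64_Ehdr, little-endian
PHDR_FMT = "<IIQQQQQQ"          # Elf64_Phdr, little-endian

def read_program_headers(data):
    """Return (p_type, p_offset, p_vaddr, p_filesz, p_memsz) for each entry."""
    fields = struct.unpack_from(EHDR_FMT, data, 0)
    e_phoff, e_phentsize, e_phnum = fields[5], fields[9], fields[10]
    result = []
    for i in range(e_phnum):
        (p_type, p_flags, p_offset, p_vaddr,
         p_paddr, p_filesz, p_memsz, p_align) = struct.unpack_from(
            PHDR_FMT, data, e_phoff + i * e_phentsize)
        result.append((p_type, p_offset, p_vaddr, p_filesz, p_memsz))
    return result

# Hypothetical image: header, 64 bytes of padding, then one PT_LOAD entry
ident = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
ehdr = struct.pack(
    EHDR_FMT, ident, 2, 62, 1, 0x400000, 128, 0, 0, 64, 56, 1, 64, 0, 0
)
phdr = struct.pack(PHDR_FMT, 1, 5, 0, 0x400000, 0x400000, 184, 184, 0x1000)
image = ehdr + bytes(64) + phdr
segments = read_program_headers(image)
print(segments)
```

A reader that hard-coded offset 64 would decode the padding here instead of the real entry, which is why the header carries the offset explicitly.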

Difference between the ELF vs MAP file

The linker can output both the ELF and the MAP file. These files are especially relevant in the embedded systems world because the ELF file is usually used to read out the addresses of variables or functions. Additionally, the ELF file is used by different embedded measurement or analysis tools.
When I open a MAP file, then within it, I can see for each global variable and every external function the following information: allocated address, symbolic name, allocated bytes, memory unit, and memory section.
On the other hand, once I open the ELF file, it is a binary file and not human readable. However, some tools I use are able to read it out and interpret it. Those tools can interpret the ELF file, and obtain the information about the symbolic name of a variable/function and its address or even show a function prototype.
From my understanding, the ELF and MAP files contain basically the same information; it is just that the former is a binary file and the latter is a text file.
So what are the actual differences between these two files from the content perspective?
Thank you in advance!
The primary output of the linker (i.e. its main purpose) is the fully linked executable code. That is the ELF (Executable and Linkable Format) file. An ELF file may, as you have observed, contain symbols; these are used for debugging. It may also contain metadata that associates the machine code with the source code from which it was generated. But the bulk of its content (and the part that is not optional) is the executable machine code and data objects that are your application.
The MAP file is an optional, purely informational, human-readable output that describes the location and size of the code and data objects in your application. The MAP file includes a summary that shows the total size and memory usage of your code.
In an embedded cross-development environment, the symbol information in the ELF file is used when the code is loaded into a source-level symbolic debugger. The debugger takes the binary code/data segments in the ELF file and loads them onto the target (typically using JTAG or another debug/programming hardware tool), and loads the symbols and source-level debug metadata into itself. Then, while the real machine code executes on the target, that execution is reflected in the debugger in the original source code, where you can view, step through, and set breakpoints in the code at the source level.
In short, the ELF file is your program. The MAP file is, as its name suggests, a map of your executable - it tells you where things are in the executable.

How to create an ELF executable from process memory image

First of all, English is not my native language. Please excuse any mistakes.
As stated above, I want to create an ELF executable from a process memory image. So far I have successfully extracted the ELF header, the program headers, and the list of Elf64_Dyn structures that resides in the dynamic segment. I have also restored the GOT. However, I can't figure out how to reconstruct the section headers.
The problem is that when an ELF executable is loaded into memory, the section headers are not loaded. Using the list of Elf64_Dyn structures inside the dynamic segment, we can get the addresses of the .rela* sections, the GOT, the string table, and so on. However, it doesn't contain addresses for sections like .text and .data. To reconstruct the section headers we need each section's offset and address, but there seems to be no way to get this information.
How can I reconstruct section headers properly?
Thanks for your consideration.
How can I reconstruct section headers properly?
You can't, but you don't have to -- sections (and section headers) are not used at runtime (at least not by the dynamic loader).
You can also run strip --strip-all a.out to remove them from a "normal" ELF binary, which will continue to run just fine.

Why Executable and Linkable Format(ELF) File contains set of sections?

These days I'm studying file-handling system calls in Linux.
I have learned that an ELF (Executable and Linkable Format) file contains a set of sections.
Those are .bss, .data, .rodata, .text, .comment, and unknown.
I used Wikipedia and this website to study.
So I have the questions below:
Why does an ELF file use a set of sections?
What is the task of each of the above sections?
What is the benefit of using a set of sections?
A good reference for the ELF file format is the Object Files chapter of the System V ABI. In particular, special sections describes the uses of most of the sections you're likely to encounter.
Why does an ELF file use a set of sections?
An object file contains lots of different classes of data, and it makes sense to group similar data into sections, especially since some sections' contents can be read directly into a process's image when the OS execs the ELF file.
.bss contains uninitialized data, such as int a; declared at global level in a C program. Actually, it contains nothing except the size that needs to be allocated when the ELF file is loaded into a process, because all variables in bss are initialized to 0.
.data contains initialized data, such as int a = 1000; declared at global level in a C program.
.rodata contains read-only data, such as character string literals and global level variables declared as const in C. When the OS execs the ELF file, it will load this section into an area of memory that is read-only.
.text contains executable instructions. When the OS execs the ELF file, it will load this section into an area of memory that is read-only. Sometimes .text and .rodata wind up being loaded into the same area of a process's memory.
.comment typically contains the name and version of the compiler(s) used to generate the file.
Not all of the sections described in the documentation may be present in all ELF files; in particular, running the strip command on the ELF file will remove the .symtab and .debug sections.
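The point about .bss occupying no space in the file can be sketched as follows. When a loadable segment is mapped, only p_filesz bytes come from the file and the remainder up to p_memsz is zero-filled; that zero-filled tail is where .bss lives. The function load_segment and the sample bytes are hypothetical and only model the idea; a real loader works with mmap and page alignment.

```python
def load_segment(file_bytes, p_offset, p_filesz, p_memsz):
    """Model the in-memory contents of one loadable segment."""
    if p_memsz < p_filesz:
        raise ValueError("p_memsz must not be smaller than p_filesz")
    image = bytearray(file_bytes[p_offset:p_offset + p_filesz])
    image.extend(bytes(p_memsz - p_filesz))  # zero fill, e.g. for .bss
    return bytes(image)

# Hypothetical file: 8 bytes of padding, then the 4 bytes of "int a = 1000;"
initialized = (1000).to_bytes(4, "little")
file_bytes = bytes(8) + initialized

# 4 bytes come from the file (.data); 8 more are zero-filled (.bss)
memory = load_segment(file_bytes, 8, 4, 12)
print(len(memory), memory[:4] == initialized, memory[4:] == bytes(8))
```

The file stores only the initialized 4 bytes, yet the segment is 12 bytes long once loaded.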

ELF Parser to separate data of program

Using ELF Parser, how can I separate the address and data part of a program?
I think you are speaking about the symbol table in the ELF file. The symbol table is a separate section in the ELF file that gives the addresses of all data and functions.
To find it, read each entry of the section header table located at file offset e_shoff (from the ELF header). For each section header, check the type of the section. If the type is 2 (SHT_SYMTAB as per the ELF specification), read sh_size bytes starting at file offset sh_offset to get the symbol table.
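The walk described above could look roughly like this in Python. It is a sketch under several assumptions: little-endian ELF64, a single SHT_SYMTAB section, and symbol names resolved through the string table named by sh_link. The function read_symbols and the tiny hand-built image (two symbols, main and counter) are hypothetical.

```python
import struct

EHDR_FMT = "<16sHHIQQQIHHHHHH"  # Elf64_Ehdr
SHDR_FMT = "<IIQQQQIIQQ"        # Elf64_Shdr
SYM_FMT = "<IBBHQQ"             # Elf64_Sym
SHT_SYMTAB = 2

def read_symbols(data):
    """Return [(name, st_value, st_size)] from the first SHT_SYMTAB section."""
    eh = struct.unpack_from(EHDR_FMT, data, 0)
    e_shoff, e_shentsize, e_shnum = eh[6], eh[11], eh[12]
    shdrs = [struct.unpack_from(SHDR_FMT, data, e_shoff + i * e_shentsize)
             for i in range(e_shnum)]
    for sh in shdrs:
        sh_type, sh_offset, sh_size, sh_link = sh[1], sh[4], sh[5], sh[6]
        if sh_type != SHT_SYMTAB:
            continue
        str_off = shdrs[sh_link][4]  # sh_link is the index of the string table
        symbols = []
        step = struct.calcsize(SYM_FMT)
        for pos in range(sh_offset, sh_offset + sh_size, step):
            st_name, _info, _other, _shndx, st_value, st_size = \
                struct.unpack_from(SYM_FMT, data, pos)
            end = data.index(b"\0", str_off + st_name)
            name = data[str_off + st_name:end].decode()
            symbols.append((name, st_value, st_size))
        return symbols
    return []

# Hypothetical image: header | .symtab | .strtab | three section headers
strtab = b"\0main\0counter\0"
symtab = (bytes(24)                                           # null symbol
          + struct.pack(SYM_FMT, 1, 0x12, 0, 1, 0x401000, 42)  # main
          + struct.pack(SYM_FMT, 6, 0x11, 0, 2, 0x404000, 4))  # counter
sym_off = 64
str_off = sym_off + len(symtab)
sh_off = str_off + len(strtab)
shdrs = (bytes(64)
         + struct.pack(SHDR_FMT, 0, 2, 0, 0, sym_off, len(symtab), 2, 1, 8, 24)
         + struct.pack(SHDR_FMT, 0, 3, 0, 0, str_off, len(strtab), 0, 0, 1, 0))
ident = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
ehdr = struct.pack(EHDR_FMT, ident, 1, 62, 1, 0, 0, sh_off, 0, 64, 0, 0, 64, 3, 0)
image = ehdr + symtab + strtab + shdrs

symbols = read_symbols(image)
print(symbols)
```

The first entry of any symbol table is the reserved null symbol, so it shows up with an empty name; the st_value field of the remaining entries is the address the question asks about.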
Find the elf specification here
Please look at ELFIO library. It contains WriteObj and Writer examples. By using the library, you will be able to create/modify ELF binary files (including particular sections filtering).
