Elf file format, what's the order? - c

Please help me fix my confusion:
In an ELF64 file, do we first have the ELF header, then immediately all program headers, and then all section headers?
If so, why does the ELF header contain an offset to where the first program header starts, if it's always 64 bytes after the start of the file? That seems to be redundant information.

If nothing else, it allows the header's size to be made larger in future versions of the format without losing backwards compatibility.
In an ELF64 file, do we first have the ELF header, then immediately all program headers, and then all section headers?
No, the program headers, the section headers and the sections are wherever the headers say they are. There's no requirement for them to be immediately after one another or in any particular order.

Related

How to create an ELF executable from process memory image

First of all, English is not my native language. Please excuse any mistakes.
As stated above, I want to create an ELF executable from a process memory image. So far, I have successfully extracted the ELF header, the program headers and the list of Elf64_Dyn structures that resides in the dynamic segment. I also restored the GOT. However, I can't figure out how to reconstruct the section headers.
The problem is that when an ELF executable is loaded into memory, the section headers are not loaded. Using the list of Elf64_Dyn structures inside the dynamic segment, we can get the addresses of the .rela* sections, the GOT, the string table, and so on. However, it doesn't contain addresses for sections like .text and .data. To reconstruct the section headers we need each section's offset and address, but there seems to be no way to get this information.
How can I reconstruct section headers properly?
Thanks for your consideration.
How can I reconstruct section headers properly?
You can't, but you don't have to -- sections (and section headers) are not used at runtime (at least not by the dynamic loader).
You can also run strip --strip-all a.out to remove them from a "normal" ELF binary, which will continue to run just fine.
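To see that the program headers, unlike the section headers, are available at runtime, a process can walk its own program header table. This is a minimal sketch assuming Linux with glibc, where getauxval() reports AT_PHDR, AT_PHENT and AT_PHNUM; the function name count_load_segments is made up for the example.

```c
#include <assert.h>
#include <link.h>      /* ElfW() macro; also pulls in <elf.h> */
#include <stddef.h>
#include <sys/auxv.h>  /* getauxval() */

/* Count the PT_LOAD entries in the running program's own program header
 * table, located through the auxiliary vector the kernel passes in. */
size_t count_load_segments(void)
{
    const unsigned char *table = (const unsigned char *)getauxval(AT_PHDR);
    size_t entsize = getauxval(AT_PHENT);
    size_t count = getauxval(AT_PHNUM);
    size_t loads = 0;

    assert(table != NULL && entsize >= sizeof(ElfW(Phdr)));
    for (size_t i = 0; i < count; i++) {
        const ElfW(Phdr) *ph = (const ElfW(Phdr) *)(table + i * entsize);
        if (ph->p_type == PT_LOAD)
            loads++;
    }
    return loads;
}
```

There is no corresponding auxiliary vector entry for the section header table, which matches the point that nothing at runtime depends on it.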

How can I read a custom section within the loader?

I'm trying to embed information (a simple integer) inside the executable object file (Elf) of a process in Linux.
I've accomplished that by writing the integer value as binary inside a file, and then by copying the binary file content using the objcopy command.
objcopy --add-section .customsection=binaryfile processElfFile newProcessElfFile
In this way, inside newProcessElfFile I have a perfectly working copy of the process with a new section containing the integer value, and I can see the section by using
readelf -e newProcessElfFile
I have also verified the section value being correct by using some C code on top of the Libelf library. Everything works fine.
Now, I want to read the integer value contained in the new section and perform a printk when the elf file is loaded to be executed.
In order to achieve this, I need to modify the loader code, which is kernel side.
The problem now is that:
I cannot write code inside the kernel which uses the libelf library, so I cannot access directly the section value as I do with my user-side program.
The ELF kernel loader, in the function load_elf_binary() inside linux-VERSION/fs/binfmt_elf.c, doesn't read ELF sections; it accesses the ELF program headers, which point to segments rather than to individual sections.
In order to solve the problem I guess I need to link my custom section within a segment such that I can access it.
So I have 2 related questions:
Can I somehow read directly my custom section within the loader?
If not, how can I make a segment cover the custom section, so that I can access it through the ELF file's program headers?
Another possibility may be to add the integer value to the already existing .rodata section, but unfortunately I don't know how to do that, or how to access that section in the kernel.
The ELF header (Elf32_Ehdr or Elf64_Ehdr) contains information pointing to the section header table (members e_shoff, e_shnum, e_shentsize). Together with the section string table index (e_shstrndx), this information can be used to read the section headers and eventually locate the data you are interested in.
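The walk described above can be sketched in user space as follows. This is only an illustration of the lookup logic: it assumes an ELF64 file in the host's byte order and a Linux <elf.h>, it ignores the extended numbering used by files with very many sections, and kernel code would have to replace the stdio calls with the kernel's own file-reading helpers. The names read_at and find_section64 are made up for the example.

```c
#include <assert.h>
#include <elf.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read `size` bytes at file offset `off`; returns 0 on success. */
static int read_at(FILE *f, long off, void *buf, size_t size)
{
    if (fseek(f, off, SEEK_SET) != 0)
        return -1;
    return fread(buf, 1, size, f) == size ? 0 : -1;
}

/* Find a section by name in an ELF64 file. On success, copies its header
 * into *out and returns 0; returns -1 if not found or on any error. */
int find_section64(const char *path, const char *name, Elf64_Shdr *out)
{
    Elf64_Ehdr eh;
    Elf64_Shdr strtab_hdr, sh;
    char *names = NULL;
    int rc = -1;
    FILE *f;

    assert(path != NULL && name != NULL && out != NULL);
    f = fopen(path, "rb");
    if (!f)
        return -1;

    /* Step 1: ELF header, with basic sanity checks. */
    if (read_at(f, 0, &eh, sizeof eh) != 0 ||
        memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64 ||
        eh.e_shoff == 0 || eh.e_shentsize != sizeof(Elf64_Shdr) ||
        eh.e_shstrndx >= eh.e_shnum)
        goto done;

    /* Step 2: header of the section name string table (index e_shstrndx). */
    if (read_at(f, (long)(eh.e_shoff + (uint64_t)eh.e_shstrndx * eh.e_shentsize),
                &strtab_hdr, sizeof strtab_hdr) != 0)
        goto done;

    /* Step 3: the string table contents, NUL-terminated defensively. */
    names = malloc(strtab_hdr.sh_size + 1);
    if (!names ||
        read_at(f, (long)strtab_hdr.sh_offset, names, strtab_hdr.sh_size) != 0)
        goto done;
    names[strtab_hdr.sh_size] = '\0';

    /* Step 4: scan every section header and compare names. */
    for (unsigned i = 0; i < eh.e_shnum; i++) {
        if (read_at(f, (long)(eh.e_shoff + (uint64_t)i * eh.e_shentsize),
                    &sh, sizeof sh) != 0)
            goto done;
        if (sh.sh_name < strtab_hdr.sh_size &&
            strcmp(names + sh.sh_name, name) == 0) {
            *out = sh;
            rc = 0;
            break;
        }
    }
done:
    free(names);
    fclose(f);
    return rc;
}
```

For the file from the question, a call such as find_section64("newProcessElfFile", ".customsection", &sh) would fill sh, and the integer would then sit at file offset sh.sh_offset.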

Difference between Program header and Section Header in ELF

Q1 What is the difference between Program header and Section Header in ELF?
Q1.1 What is the difference between segment and a section?
I believe pheaders point to sections only.
Q2. What is the difference between File Header and Program Header?
As per the GNU ld linker script documentation, Using ld: The GNU Linker:
PHDRS
{
name type [ FILEHDR ] [ PHDRS ] [ AT ( address ) ]
[ FLAGS ( flags ) ] ;
}
You may use the FILEHDR and PHDRS keywords after the program header type to further
describe the contents of the segment. The FILEHDR keyword means that the segment should include
the ELF file header. The PHDRS keyword means that the segment should include the ELF program
headers themselves.
This is a bit confusing.
The Executable & Linkable Format wikipage has a nice picture explaining ELF, and the difference between its program header & sections header. See also elf(5)
The [initial] program header table defines segments (in the address space of a process running that ELF executable) mapped into virtual memory (the executable point of view) at execve(2) time. The [final] section header table defines sections (the linkable point of view, for ld(1) etc.). A section usually lies within a segment, but it may or may not be visible (i.e. mapped into memory) at execution time. The ELF file header tells where the program header table and the section header table are.
Use also objdump(1) and readelf(1) to explore several ELF files (executables, shared objects, linkable objects) existing on your Linux system.
Levine's Linkers & Loaders book has a chapter explaining that in detail.
And Drepper's paper How to Write Shared Libraries also has some good explanation.
Q1 What is the difference between the Program header and the Section Header in ELF?
A program header describes a segment or other information that the system needs to prepare the program for execution.
A section is an interface that can represent a lot of things. Look here for details (search for Elf64_Shdr)
The section header table itself is usually not part of any loadable segment.
Q1.1 What is the difference between a segment and a section?
A segment consists of zero or more sections, though this fact is transparent to the program header.
Q2. What is the difference between File Header and Program Header?
The ELF file header appears at the start of every ELF file (see /usr/include/elf.h). It also records the number of program headers in the file.
The ELF file always starts with the ELF file header, which references the program headers. An executable or shared object needs at least one program header; relocatable object files (.o) have none.

How to identify whether a file is elf file using C language function?

In my program, I want to identify whether a file is of ELF (Executable and Linkable Format) type. How can I identify whether a file is an ELF file using a C function?
If the only thing you want to do is test whether the file is ELF or not, then read the first 16 bytes to check the file identity. The first four bytes will equal {0x7f, 'E', 'L', 'F'}. The remaining bytes can vary, but checking them will help you be even more certain that the file is elf. See the elf(3) man page for more detail.
That man page gives enough info for parsing ELF files in general, but if you want to do more than just check the format, then you should probably use a library. See both the Elf Toolchain and the Binary File Descriptor Library.
Update: Yet another alternative is libmagic(3) which will read the ELF header for you. It is probably overkill if you are only interested in ELF, but libmagic also knows about just about every file format worth knowing about.
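The magic-number check from the first suggestion can be sketched in plain C without any ELF library. The function names has_elf_magic and is_elf_file are made up for this example, and only the first four identification bytes are checked.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* 1 if buf begins with the ELF magic bytes 0x7f 'E' 'L' 'F', else 0. */
int has_elf_magic(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };
    assert(buf != NULL);
    return len >= sizeof magic && memcmp(buf, magic, sizeof magic) == 0;
}

/* Returns 1 if the file looks like ELF, 0 if it does not,
 * and -1 if the file cannot be opened. */
int is_elf_file(const char *path)
{
    unsigned char ident[16];   /* the identification block is 16 bytes */
    size_t n;
    FILE *f;

    assert(path != NULL);
    f = fopen(path, "rb");
    if (!f)
        return -1;
    n = fread(ident, 1, sizeof ident, f);
    fclose(f);
    return has_elf_magic(ident, n);
}
```

Checking further identification bytes (class, data encoding, version) would raise confidence, as the answer notes.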

What is the difference between ELF files and bin files?

The final images produced by compilers include both a bin file and an ELF (Executable and Linkable Format) file. What is the difference between the two, and in particular, what is the ELF file for?
A bin file is a pure binary file with no memory fix-ups or relocations; more than likely it has explicit instructions to be loaded at a specific memory address. Whereas....
ELF (Executable and Linkable Format) files contain symbol look-up and relocation tables, that is, they can be loaded at any memory address by the kernel, and all symbols used are automatically adjusted to the offset from the memory address where the file was loaded. ELF files usually have a number of sections, such as 'data', 'text' and 'bss', to name but a few. It is within those sections that the run-time can work out how to adjust the symbols' memory references dynamically.
A bin file is just the bits and bytes that go into the rom or a particular address from which you will run the program. You can take this data and load it directly as is, you need to know what the base address is though as that is normally not in there.
An ELF file contains the bin information, but it is surrounded by lots of other information: possibly debug info and symbols, and it can distinguish code from data within the binary. It allows for more than one chunk of binary data (when you dump one of these to a bin you get one big bin file with fill data to pad it to the next block). It tells you how much binary you have and how much bss data is there that wants to be initialised to zeros (GNU tools have problems creating bin files correctly).
The ELF file format is a standard; ARM publishes its enhancements/variations on the standard. I recommend everyone write an ELF parsing program to understand what is in there. Don't bother with a library; it is quite simple to just use the information and structures in the spec. This helps to overcome GNU problems in general when creating .bin files, as well as to debug linker scripts and other things that can mess up your bin or ELF output.
some resources:
ELF for the ARM architecture
http://infocenter.arm.com/help/topic/com.arm.doc.ihi0044d/IHI0044D_aaelf.pdf
ELF from wiki
http://en.wikipedia.org/wiki/Executable_and_Linkable_Format
ELF format is generally the default output of compiling.
if you use GNU tool chains, you can translate it to binary format by using objcopy, such as:
arm-elf-objcopy -O binary [elf-input-file] [binary-output-file]
or by using the fromelf utility (built into most IDEs, such as ADS):
fromelf -bin -o [binary-output-file] [elf-input-file]
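To give a feel for what such a flattening involves, the sketch below computes how large a flat image would be if the loadable segments were laid out at their physical addresses with fill bytes between them. This is a conceptual illustration only, not the actual algorithm of objcopy or fromelf (objcopy, for instance, works from sections and their load addresses); it assumes a Linux <elf.h>, and the name flat_image_span is made up.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Compute the base address and total size of a flat image covering every
 * PT_LOAD segment that has bytes stored in the file. Gaps between segments
 * would have to be padded; memory beyond p_filesz (such as .bss) is not
 * stored at all. Returns 0 on success, -1 if nothing is loadable. */
int flat_image_span(const Elf64_Phdr *ph, size_t n,
                    uint64_t *base, uint64_t *size)
{
    uint64_t lo = UINT64_MAX, hi = 0;

    assert(ph != NULL && base != NULL && size != NULL);
    for (size_t i = 0; i < n; i++) {
        if (ph[i].p_type != PT_LOAD || ph[i].p_filesz == 0)
            continue;
        if (ph[i].p_paddr < lo)
            lo = ph[i].p_paddr;
        if (ph[i].p_paddr + ph[i].p_filesz > hi)
            hi = ph[i].p_paddr + ph[i].p_filesz;
    }
    if (lo == UINT64_MAX)
        return -1;
    *base = lo;
    *size = hi - lo;
    return 0;
}
```

Note that the base address is computed here from the ELF headers; the resulting bin file itself would not record it, which is why the base address has to be known separately when loading a bin.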
bin is the final way that the memory looks before the CPU starts executing it.
ELF is a structured container around that: the same bytes cut up into pieces, with headers and metadata added, which the CPU/MCU thus can't run directly.
The (dynamic) linker first has to sufficiently reverse that (and thus modify offsets back to the correct positions).
But there is no linker/OS on the MCU, hence you have to flash the bin instead.
Moreover, Ahmed Gamal is correct.
Compiling and linking are separate stages; the whole process is called "building", hence the GNU Compiler Collection has separate executables:
One for the compiler (which technically outputs assembly), another one for the assembler (which outputs object code in the ELF format),
then one for the linker (which combines several object files into a single ELF file), and finally, at runtime, there is the dynamic linker,
which effectively turns an elf into a bin, but purely in memory, for the CPU to run.
Note that it is common to refer to the whole process as "compiling" (as in GCC's name itself), but that then causes confusion when the specifics are discussed,
such as in this case, and Ahmed was clarifying.
It's a common problem due to the inexact nature of human language itself.
To avoid confusion, GCC outputs object code (after internally using the assembler) using the ELF format.
The linker simply takes several of them (with an .o extension) and produces a single combined result (such as "a.out").
But all of them, even ".so" are ELF.
It is like several Word documents, each ending in ".chapter", all being combined into a final ".book",
where all files technically use the same standard/format and hence could have had ".docx" as the extension.
The bin is then kind of like converting the book into a ".txt" file while adding as much whitespace as necessary to be equivalent to the size of the final book (printed on a single spool),
with places for all the pictures to be overlaid.
I just want to correct a point here. The ELF file is produced by the linker, not the compiler.
The compiler's job ends after producing the object files (*.o) from the source code files. The linker links all the .o files together and produces the ELF.

Resources