ELF Parser to separate data of program

Using ELF Parser, how can I separate the address and data part of a program?

I think you are asking about the symbol table in the ELF file. The symbol table is a separate section of the ELF file that lists the symbols (data objects and functions) together with their addresses.
To find it, read the section header table, which starts at the file offset given by e_shoff in the ELF header and holds e_shnum entries. For each section header, check the type of the section. If the type is 2 (SHT_SYMTAB in the ELF specification), the symbol table is the sh_size bytes that start at file offset sh_offset.
See the ELF specification for the details.
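
The steps above can be sketched in C as follows. This is only a minimal sketch under some assumptions: the whole file is already in memory, it is a 64-bit ELF in the host's byte order, and <elf.h> is available (as on Linux). The function name find_symtab is hypothetical, not part of any library.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: scan the section header table of a 64-bit ELF image
 * held in memory and copy out the first header whose type is SHT_SYMTAB.
 * Returns 0 on success, -1 if the image is malformed or has no symbol table. */
int find_symtab(const unsigned char *image, size_t size, Elf64_Shdr *out)
{
    Elf64_Ehdr eh;

    if (size < sizeof eh)
        return -1;
    memcpy(&eh, image, sizeof eh);          /* memcpy avoids alignment issues */
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64 ||
        eh.e_shentsize != sizeof(Elf64_Shdr))
        return -1;

    for (size_t i = 0; i < eh.e_shnum; i++) {
        /* Section header i lives at e_shoff + i * e_shentsize. */
        uint64_t off = eh.e_shoff + (uint64_t)i * eh.e_shentsize;
        Elf64_Shdr sh;

        if (off > size || size - off < sizeof sh)
            return -1;
        memcpy(&sh, image + off, sizeof sh);
        if (sh.sh_type == SHT_SYMTAB) {
            *out = sh;
            return 0;
        }
    }
    return -1;
}
```

On success, the symbols are the sh_size bytes at file offset sh_offset, that is sh_size / sizeof(Elf64_Sym) entries of type Elf64_Sym. Each entry's st_value holds the address and st_name is an offset into the string table selected by sh_link.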

Take a look at the ELFIO library. It comes with WriteObj and Writer examples. With the library you can create and modify ELF binary files, including filtering out particular sections.

Related

Difference between the ELF vs MAP file

The linker can output both the ELF and the MAP file. These files are especially relevant in the embedded systems world because the ELF file is usually used to read out the addresses of variables or functions. Additionally, the ELF file is used by different embedded measurement or analysis tools.
When I open a MAP file, then within it, I can see for each global variable and every external function the following information: allocated address, symbolic name, allocated bytes, memory unit, and memory section.
On the other hand, once I open the ELF file, it is a binary file and not human readable. However, some tools I use are able to read it out and interpret it. Those tools can interpret the ELF file, and obtain the information about the symbolic name of a variable/function and its address or even show a function prototype.
From my understanding, the ELF and MAP files contain basically the same information; the first is a binary file and the second is a text file.
So what are the actual differences between these two files from the content perspective?
Thank you in advance!

The primary output of the linker (its main purpose) is the fully linked executable code. That is the ELF (Executable and Linkable Format) file. As you have observed, an ELF file may contain symbols, which are used for debugging. It may also contain metadata that associates the machine code with the source code from which it was generated. But the bulk of its content, and the part that is not optional, is the executable machine code and data objects that make up your application.
The MAP file is an optional, purely informational, human-readable output that records the location and size of the code and data objects in your application. It also includes a summary that shows the total size and memory usage of your code.
In an embedded cross-development environment, the symbol information in the ELF file is used when the code is loaded into a source-level symbolic debugger. The debugger takes the binary code and data segments in the ELF file and loads them onto the target (typically through JTAG or another debug/programming hardware tool). It loads the symbols and the source-level debug metadata into the debugger itself. Then, while the real machine code executes on the target, that execution is reflected in the debugger against the original source code, where you can view, step through and set breakpoints at the source level.
In short, the ELF file is your program. The MAP file is, as its name suggests, a map of your executable - it tells you where things are in the executable.

How to create an ELF executable from process memory image

First of all, English is not my native language. Please excuse any mistakes.
As stated above, I want to create an ELF executable from a process memory image. So far I have successfully extracted the ELF header, the program headers and the list of Elf64_Dyn structures that reside in the dynamic segment. I have also restored the GOT. However, I can't figure out how to reconstruct the section headers.
The problem is that when an ELF executable is loaded into memory, the section headers are not loaded. From the list of Elf64_Dyn structures in the dynamic segment we can get the addresses of the .rela* sections, the GOT, the string table, and so on. However, it doesn't contain addresses for sections like .text and .data. To reconstruct the section headers we need each section's offset and address, but there seems to be no way to get this information.
How can I reconstruct section headers properly?
Thanks for your consideration.

How can I reconstruct section headers properly?
You can't, but you don't have to -- sections (and section headers) are not used at runtime (at least not by the dynamic loader).
You can also run strip --strip-all a.out to remove them from a "normal" ELF binary, which will continue to run just fine.

How can I read a custom section within the loader?

I'm trying to embed information (a simple integer) inside the executable object file (Elf) of a process in Linux.
I've accomplished that by writing the integer value as binary inside a file, and then by copying the binary file content using the objcopy command.
objcopy --add-section .customsection=binaryfile processElfFile newProcessElfFile
In this way, inside newProcessElfFile I have a perfectly working copy of the process with a new section containing the integer value, and I can see the section by using
readelf -e newProcessElfFile
I have also verified the section value being correct by using some C code on top of the Libelf library. Everything works fine.
Now, I want to read the integer value contained in the new section and perform a printk when the elf file is loaded to be executed.
In order to achieve this, I need to modify the loader code, which is kernel side.
The problem now is that:
I cannot write code inside the kernel which uses the libelf library, so I cannot access directly the section value as I do with my user-side program.
The ELF kernel loader, in the function load_elf_binary() in linux-VERSION/fs/binfmt_elf.c, doesn't read ELF sections; it accesses the ELF program headers, which point to segments, not to individual sections.
In order to solve the problem I guess I need to link my custom section within a segment such that I can access it.
So I have 2 related questions:
Can I somehow read directly my custom section within the loader?
If not, how can I make a segment cover the custom section, so that I can access it through the ELF file's program headers?
Another possibility may be to add the integer value as an element of the already existent .rodata section, but I unfortunately don't know how to perform it and again how to access that section in the kernel.

The ELF header (Elf32_Ehdr or Elf64_Ehdr) contains the information needed to locate the section header table (members e_shoff, e_shentsize and e_shnum). Together with the section name string table index (e_shstrndx), this information can be used to read the section headers and eventually locate the data you are interested in.
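
As an illustration of that lookup, here is a minimal user-space sketch in C. It assumes a 64-bit ELF in host byte order that has been read completely into a buffer. The names read_shdr and find_section are hypothetical. Inside the kernel loader the same header fields would have to be read from the file with the kernel's own file-reading facilities instead of from a buffer, which this sketch does not attempt.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy section header number idx out of the image. */
int read_shdr(const unsigned char *image, size_t size,
              const Elf64_Ehdr *eh, size_t idx, Elf64_Shdr *out)
{
    uint64_t off = eh->e_shoff + (uint64_t)idx * eh->e_shentsize;

    if (idx >= eh->e_shnum || off > size || size - off < sizeof *out)
        return -1;
    memcpy(out, image + off, sizeof *out);
    return 0;
}

/* Hypothetical helper: find a section by name, e.g. ".customsection".
 * Returns 0 and fills *out on success, -1 otherwise. */
int find_section(const unsigned char *image, size_t size,
                 const char *name, Elf64_Shdr *out)
{
    Elf64_Ehdr eh;
    Elf64_Shdr strtab;
    size_t len = strlen(name);

    if (size < sizeof eh)
        return -1;
    memcpy(&eh, image, sizeof eh);
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64 ||
        eh.e_shentsize != sizeof(Elf64_Shdr))
        return -1;

    /* e_shstrndx selects the section that stores all section names. */
    if (read_shdr(image, size, &eh, eh.e_shstrndx, &strtab) != 0 ||
        strtab.sh_offset > size || size - strtab.sh_offset < strtab.sh_size)
        return -1;
    const char *names = (const char *)image + strtab.sh_offset;

    for (size_t i = 0; i < eh.e_shnum; i++) {
        Elf64_Shdr sh;

        if (read_shdr(image, size, &eh, i, &sh) != 0)
            return -1;
        /* sh_name is a byte offset into the section name string table. */
        if (sh.sh_name >= strtab.sh_size)
            continue;
        if (len < strtab.sh_size - sh.sh_name &&
            memcmp(names + sh.sh_name, name, len + 1) == 0) {
            *out = sh;
            return 0;
        }
    }
    return -1;
}
```

Once the header is found, the section's contents are the sh_size bytes at file offset sh_offset, so the embedded integer could be copied out from image + sh_offset after checking that range against the buffer size.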

Difference between Program header and Section Header in ELF

Q1 What is the difference between Program header and Section Header in ELF?
Q1.1 What is the difference between segment and a section?
I believe pheaders point to sections only.
Q2. What is the difference between File Header and Program Header?
As per the GNU ld linker script documentation, Using ld: The GNU Linker:
PHDRS
{
name type [ FILEHDR ] [ PHDRS ] [ AT ( address ) ]
[ FLAGS ( flags ) ] ;
}
You may use the FILEHDR and PHDRS keywords, which appear after the program header type, to further
describe the contents of the segment. The FILEHDR keyword means that the segment should include
the ELF file header. The PHDRS keyword means that the segment should include the ELF program
headers themselves.
This is a bit confusing.

The Wikipedia page on the Executable and Linkable Format has a nice picture explaining ELF and the difference between its program header table and its section header table. See also elf(5).
The [initial] program header table defines segments: the regions that are mapped into the virtual address space of the process at execve(2) time (the execution view). The [final] section header table defines sections (the linking view, used by ld(1) and similar tools). A section that is mapped into memory at execution time belongs to a segment; sections that are not mapped need not belong to any segment. The ELF file header tells where the program header table and the section header table are.
Also use objdump(1) and readelf(1) to explore the ELF files (executables, shared objects, linkable objects) that exist on your Linux system.
Levine's Linkers & Loaders book has a chapter explaining this in detail.
Drepper's paper How to Write Shared Libraries also has a good explanation.

Q1 What is the difference between the Program header and the Section Header in ELF?
A program header describes a segment or other information that the system needs to prepare the program for execution.
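A section is a named, typed region of the file (code, data, a symbol table, relocations, and so on). Each section is described by one section header (search for Elf64_Shdr for details).
The section header table itself is usually not part of any loadable segment.
Q1.1 What is the difference between a segment and a section?
A segment typically contains zero or more sections, but the program header does not record which ones; it only describes a range of file offsets and addresses.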
Q2. What is the difference between File Header and Program Header?
The ELF file header appears at the start of every ELF file (see /usr/include/elf.h). It records, among other things, the number of program headers in the file (e_phnum) and where the program header table is located (e_phoff).
An ELF file always starts with the ELF file header, which in turn references the program headers. An executable or shared object needs at least one program header; a relocatable object file has none.
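
To make the segment/section relationship from Q1.1 concrete, the following C sketch tests whether a section's bytes fall inside a segment. It is a deliberate simplification (tools such as readelf apply more rules when they print their section-to-segment mapping), and the function name section_in_segment is hypothetical.

```c
#include <elf.h>
#include <stdbool.h>

/* Simplified check: does the section's byte range fall inside the segment?
 * Sections without SHF_ALLOC are never mapped at run time. Sections of type
 * SHT_NOBITS (such as .bss) occupy no file bytes, so they are compared by
 * virtual address against p_memsz; all others by file offset against
 * p_filesz. */
bool section_in_segment(const Elf64_Shdr *sh, const Elf64_Phdr *ph)
{
    if (!(sh->sh_flags & SHF_ALLOC))
        return false;
    if (sh->sh_type == SHT_NOBITS)
        return sh->sh_addr >= ph->p_vaddr &&
               sh->sh_addr - ph->p_vaddr + sh->sh_size <= ph->p_memsz;
    return sh->sh_offset >= ph->p_offset &&
           sh->sh_offset - ph->p_offset + sh->sh_size <= ph->p_filesz;
}
```

Running this check for every pair of section header and PT_LOAD program header shows which sections each segment covers. Sections such as .symtab, which are not allocated, end up in no segment at all.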

Modify ELF file

I have an ELF executable and I would like to know how I can modify its .rodata section.
Also, more generally, how can I modify an ELF executable?

You can use any hex editor to do that, if you know precisely which part of the ELF file you need to modify.
If you want to parse ELF files and apply more complex logic, you should write some code that opens the file or, better, mmaps it. You can then read the ELF header, which gives basic information about the file and points to the other important places in it. I suggest reading the ELF manual page and <elf.h>.
If you are using Linux, you can see where the sections lie in the file and in memory using readelf or objdump.
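
The mmap approach mentioned above might look like the following C sketch for Linux. The helper name map_elf is hypothetical. It assumes the file is at least as large as a 64-bit ELF header and only checks the magic bytes; it does not validate anything else.

```c
#include <elf.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map an ELF file read/write so that stores to the
 * returned buffer are written back to the file (MAP_SHARED).
 * Returns NULL on any error; *size receives the file size on success. */
unsigned char *map_elf(const char *path, size_t *size)
{
    struct stat st;
    int fd = open(path, O_RDWR);

    if (fd < 0)
        return NULL;
    if (fstat(fd, &st) != 0 || (size_t)st.st_size < sizeof(Elf64_Ehdr)) {
        close(fd);
        return NULL;
    }
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    close(fd);                       /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return NULL;
    if (memcmp(p, ELFMAG, SELFMAG) != 0) {   /* not an ELF file */
        munmap(p, (size_t)st.st_size);
        return NULL;
    }
    *size = (size_t)st.st_size;
    return p;
}
```

With the file mapped, patching a byte of .rodata amounts to a store at the section's file offset (the Offset column that readelf -S prints) followed by munmap. Working on a copy of the executable is advisable, since every store changes the file on disk.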
