I'm on Ubuntu 20.04, gcc 9.3.0, ld 2.34.
I have a simple hello-world program that does not use glibc or any other library and only makes a write syscall. Despite this, the binary is roughly 8 KB. I don't understand why it is that large rather than, say, 1 KB.
C Program:
int
x64_syscall_write(int fd, char const *data, unsigned long int data_size)
{
int result = 0;
__asm__ __volatile__("syscall"
: "=a" (result)
: "a" (1), "D" (fd),
"S" (data), "d" (data_size)
: "r11", "rcx", "memory");
return result;
}
__asm__(".global entry_point\n"
"entry_point:\n"
"xor rbp, rbp\n"
"pop rdi\n"
"mov rsi, rsp\n"
"and rsp, 0xfffffffffffffff0\n"
"call main\n"
"mov rdi, rax\n"
"mov rax, 60\n"
"syscall\n"
"ret");
int
main(int argc, char *argv[])
{
x64_syscall_write(1, "hello\n", 6);
return 0;
}
Built with:
gcc -ffreestanding -static -nostdlib -no-pie -masm=intel \
-fno-unwind-tables -fno-asynchronous-unwind-tables \
-Wl,--gc-sections -fdata-sections -Os \
hello.c -c -o hello.o
# NOTE: I know more could be done here to shave
# off a few more bytes, but I feel this is the bulk of it.
ld -e entry_point hello.o -o hello
hello.o is 1.7Kb.
hello is 8.4Kb.
readelf -Wl hello
Elf file type is EXEC (Executable file)
Entry point 0x40101c
There are 6 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x0001b0 0x0001b0 R 0x1000
LOAD 0x001000 0x0000000000401000 0x0000000000401000 0x000045 0x000045 R E 0x1000
LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x000007 0x000007 R 0x1000
NOTE 0x000190 0x0000000000400190 0x0000000000400190 0x000020 0x000020 R 0x8
GNU_PROPERTY 0x000190 0x0000000000400190 0x0000000000400190 0x000020 0x000020 R 0x8
GNU_STACK 0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW 0x10
Section to Segment mapping:
Segment Sections...
00 .note.gnu.property
01 .text
02 .rodata
03 .note.gnu.property
04 .note.gnu.property
05
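Here you can see that the linker created 3 LOAD segments: one for the ELF header and other metadata, one for .text and one for .rodata. Each of them starts at a page-aligned file offset (0x0, 0x1000, 0x2000), so the file has to be at least 0x2007 bytes long even though only about half a kilobyte is actually loaded.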
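To see where the 8 KB comes from, it may help to compute the smallest file that can hold those LOAD segments. The sketch below hard-codes the offsets and sizes from the readelf output above; the helper name min_file_size is made up for illustration.

```python
# (offset, filesz) of each LOAD segment, copied from the readelf output above.
LOAD_SEGMENTS = [
    (0x000000, 0x0001B0),  # ELF header, program headers, .note.gnu.property
    (0x001000, 0x000045),  # .text
    (0x002000, 0x000007),  # .rodata
]

def min_file_size(segments):
    """Return the smallest file size that can contain every segment."""
    return max(offset + filesz for offset, filesz in segments)

# Bytes that are really needed at run time versus bytes the layout forces.
payload = sum(filesz for _, filesz in LOAD_SEGMENTS)
print(f"bytes actually loaded: {payload}")
print(f"minimum file size:     {min_file_size(LOAD_SEGMENTS)}")
```

The gap between the two numbers is padding up to the next page boundary. Whatever the file holds beyond the last segment is presumably section headers and the symbol and string tables.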
Linking with -z noseparate-code results in a much smaller binary (smaller even than hello.o):
ls -l hello*
-rwxr-xr-x 1 user user 1384 Apr 26 22:24 hello
-rw-r--r-- 1 user user 603 Apr 26 22:22 hello.c
-rw-r--r-- 1 user user 1680 Apr 26 22:22 hello.o
readelf -Wl hello
Elf file type is EXEC (Executable file)
Entry point 0x40015c
There are 4 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x00018c 0x00018c R E 0x1000
NOTE 0x000120 0x0000000000400120 0x0000000000400120 0x000020 0x000020 R 0x8
GNU_PROPERTY 0x000120 0x0000000000400120 0x0000000000400120 0x000020 0x000020 R 0x8
GNU_STACK 0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW 0x10
Section to Segment mapping:
Segment Sections...
00 .note.gnu.property .text .rodata
01 .note.gnu.property
02 .note.gnu.property
03
You can shrink this further by removing the .note.GNU-stack and .note.gnu.property sections:
objcopy -R .note.* hello.o hello1.o
ld -e entry_point hello1.o -o hello1 -z noseparate-code
ls -l hello1*
-rwxr-xr-x 1 user user 1072 Apr 26 22:38 hello1
-rw-r--r-- 1 user user 1440 Apr 26 22:37 hello1.o
readelf -Wl hello1
Elf file type is EXEC (Executable file)
Entry point 0x400094
There is 1 program header, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x0000c4 0x0000c4 R E 0x1000
Section to Segment mapping:
Segment Sections...
00 .text .rodata
Related
I wrote this simple C program and ran it with ASLR disabled:
#include <stdio.h>
#include <stdlib.h>
int a = 10;
int b = 20;
int main(int argc, char *argv[])
{
printf("%lx\n", (unsigned long)&a);
printf("%lx\n", (unsigned long)&b);
return 0;
}
Every time I execute it, the result is the same:
555555558018
55555555801c
Because of that, I think the data section should start somewhere near 0x555555558018.
However, when I list the segments of my binary I see the following:
Elf file type is DYN (Position-Independent Executable file)
Entry point 0x1050
There are 13 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
PHDR 0x0000000000000040 0x0000000000000040 0x0000000000000040
0x00000000000002d8 0x00000000000002d8 R 0x8
INTERP 0x0000000000000318 0x0000000000000318 0x0000000000000318
0x000000000000001c 0x000000000000001c R 0x1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000618 0x0000000000000618 R 0x1000
LOAD 0x0000000000001000 0x0000000000001000 0x0000000000001000
0x0000000000000195 0x0000000000000195 R E 0x1000
LOAD 0x0000000000002000 0x0000000000002000 0x0000000000002000
0x00000000000000e4 0x00000000000000e4 R 0x1000
LOAD 0x0000000000002dd0 0x0000000000003dd0 0x0000000000003dd0
0x0000000000000250 0x0000000000000258 RW 0x1000
DYNAMIC 0x0000000000002de0 0x0000000000003de0 0x0000000000003de0
0x00000000000001e0 0x00000000000001e0 RW 0x8
NOTE 0x0000000000000338 0x0000000000000338 0x0000000000000338
0x0000000000000020 0x0000000000000020 R 0x8
NOTE 0x0000000000000358 0x0000000000000358 0x0000000000000358
0x0000000000000044 0x0000000000000044 R 0x4
GNU_PROPERTY 0x0000000000000338 0x0000000000000338 0x0000000000000338
0x0000000000000020 0x0000000000000020 R 0x8
GNU_EH_FRAME 0x000000000000200c 0x000000000000200c 0x000000000000200c
0x000000000000002c 0x000000000000002c R 0x4
GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 RW 0x10
GNU_RELRO 0x0000000000002dd0 0x0000000000003dd0 0x0000000000003dd0
0x0000000000000230 0x0000000000000230 R 0x1
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .note.gnu.property .note.gnu.build-id .note.ABI-tag .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt
03 .init .plt .plt.got .text .fini
04 .rodata .eh_frame_hdr .eh_frame
05 .init_array .fini_array .dynamic .got .got.plt .data .bss
06 .dynamic
07 .note.gnu.property
08 .note.gnu.build-id .note.ABI-tag
09 .note.gnu.property
10 .eh_frame_hdr
11
12 .init_array .fini_array .dynamic .got
There is no such address. I think there may be a difference between C pointers (which I observe to be 48 bits wide) and the virtual addresses of the segments (which are 64 bits wide). Where do the C pointers actually point?
I have a few hypotheses. I once read that C pointers are actually offsets into their segments (I am not sure this is true). The other possibility I can think of is that C pointers are logical addresses, while the segments' virtual addresses refer to the linear address space. See the difference below:
[Figure: Memory in x86]
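One way to test the offset hypothesis is to subtract a load base from the printed addresses and compare the result with the LOAD segments. The sketch below assumes the base is 0x555555554000, which is where Linux on x86-64 usually places a position-independent executable when ASLR is disabled; that value is an assumption, not something shown in the output above, and find_segment is a made-up helper.

```python
ASSUMED_BASE = 0x555555554000  # assumption: typical x86-64 Linux PIE base without ASLR

# (vaddr, memsz, flags) of the LOAD segments from the readelf output above.
LOAD_SEGMENTS = [
    (0x0000, 0x0618, "R"),
    (0x1000, 0x0195, "R E"),
    (0x2000, 0x00E4, "R"),
    (0x3DD0, 0x0258, "RW"),
]

def find_segment(runtime_addr, base=ASSUMED_BASE):
    """Return (link-time vaddr, flags) of the LOAD segment covering runtime_addr, or None."""
    vaddr = runtime_addr - base
    for start, memsz, flags in LOAD_SEGMENTS:
        if start <= vaddr < start + memsz:
            return vaddr, flags
    return None

for name, addr in (("a", 0x555555558018), ("b", 0x55555555801C)):
    print(name, find_segment(addr))
```

Under that assumption both variables map back to link-time addresses 0x4018 and 0x401c, which fall inside the RW segment that holds .data.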
I'm trying to build a xv6-like system and I'm copying xv6's code below:
if (ph.p_vaddr % PGSIZE) {
cprintf("exec: addr not page aligned.\n");
goto bad;
}
This is the part where the ELF file is loaded into memory.
It checks every PT_LOAD segment's vaddr and makes sure it is page-aligned before loading the segment into memory.
But the code confuses me, because this is what readelf shows for the ELF file I want to load:
Elf file type is EXEC (Executable file)
Entry point 0x400260
There are 4 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x0021f0 0x0021f0 R E 0x1000
LOAD 0x002eb8 0x0000000000403eb8 0x0000000000403eb8 0x000268 0x0008e8 RW 0x1000
GNU_STACK 0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW 0x10
GNU_RELRO 0x002eb8 0x0000000000403eb8 0x0000000000403eb8 0x000148 0x000148 R 0x1
Section to Segment mapping:
Segment Sections...
00 .init .text .fini .rodata .eh_frame
01 .init_array .fini_array .data.rel.ro .got .got.plt .data .bss
02
03 .init_array .fini_array .data.rel.ro .got .got.plt
The vaddr is not always page-aligned, so is there something wrong with the way I compile, or is the code itself wrong?
is it something wrong with my way of compiling or the code itself is wrong?
The code is wrong. Only the difference .p_vaddr - .p_offset must be a multiple of the page size (in other words, .p_vaddr and .p_offset must be congruent modulo the page size); .p_vaddr alone does not have to be page-aligned.
This is because the segment is loaded with mmap, and mmap requires a page-aligned file offset. To mmap the second segment at .p_vaddr, the loader rounds both .p_vaddr and .p_offset down to a page boundary and maps a few extra bytes in front of the segment.
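The rounding described above can be sketched as follows, using the second LOAD segment from the question. This is a minimal illustration assuming a 4 KiB page; mmap_params is a hypothetical helper, not part of xv6 or any real loader.

```python
PAGE_SIZE = 0x1000  # assumption: 4 KiB pages

def mmap_params(p_vaddr, p_offset, p_filesz, page=PAGE_SIZE):
    """Round a segment down to page boundaries, as a loader using mmap would have to."""
    if (p_vaddr - p_offset) % page != 0:
        raise ValueError("p_vaddr and p_offset are not congruent modulo the page size")
    slack = p_vaddr % page  # extra bytes mapped in front of the segment
    return {
        "map_addr": p_vaddr - slack,      # page-aligned address to map at
        "map_offset": p_offset - slack,   # page-aligned offset in the file
        "map_length": p_filesz + slack,   # file bytes covered by the mapping
    }

# Second LOAD segment from the readelf output in the question.
params = mmap_params(0x403EB8, 0x002EB8, 0x000268)
print({name: hex(value) for name, value in params.items()})
```

A stricter check in the style of the xv6 snippet would then test (ph.p_vaddr - ph.p_offset) % PGSIZE instead of ph.p_vaddr % PGSIZE, provided the loader also performs this rounding.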
I am developing a freestanding application for an ARM Cortex-M microcontroller and while researching the structure of an S-Record file I found that I have some kind of misunderstanding in how the addresses are represented in the S-Record format.
I have a variable defined in my source code like so:
uint32_t g_ip_address = IP_ADDRESS(10, 1, 0, 56); // in LE: 0x3800010A
When I run objdump I see that the variable ends up in the .data section at address 0x1ffe01c4:
$ arm-none-eabi-objdump -t application.elf | grep g_ip_address
1ffe01c4 g O .data 00000004 g_ip_address
This makes sense, given that the memory section of my linker script looks like this and .data is going to RAM:
MEMORY
{
FLASH (rx) : ORIGIN = 0x00000000, LENGTH = 0x0200000 /* 2M */
RAM (rwx) : ORIGIN = 0x1FFE0000, LENGTH = 0x00A0000 /* 640K */
}
However, when I look through the srec file, the address of the record is not in the 0x1FFE0000 RAM region. It is 0x0005F570, which seems to put it in FLASH (spaces added for clarity).
S315 0005F570 00000000 3800010A 000010180000000014
Is there an implicit offset encoded in a different record entry? How does objcopy get this new address? Is this value being encoded into a function in some way (some pre-main initialization of variables, perhaps)?
Ultimately, my goal is to be able to parse the srec file and patch the IP address value to create a new srec file. Is the idiomatic way of doing something like this simply to create a struct that hardcodes some leading magic number sequence that can be detected in the file?
flash.s
.cpu cortex-m0
.thumb
.word 0x00002000
.word reset
.thumb_func
reset:
b reset
.data
.word 0x11223344
.bss
.word 0x00000000
.word 0x00000000
flash.ld
MEMORY
{
rom : ORIGIN = 0x08000000, LENGTH = 0x1000
ram : ORIGIN = 0x20000000, LENGTH = 0x1000
}
SECTIONS
{
.text : { *(.text*) } > rom
.bss : { *(.bss*) } > ram AT > rom
.data : { *(.data*) } > ram AT > rom
}
Build it:
arm-none-eabi-as --warn --fatal-warnings -mcpu=cortex-m0 flash.s -o flash.o
arm-none-eabi-ld -nostdlib -nostartfiles -T flash.ld flash.o -o so.elf
arm-none-eabi-objdump -D so.elf > so.list
arm-none-eabi-objcopy --srec-forceS3 so.elf -O srec so.srec
arm-none-eabi-objcopy -O binary so.elf so.bin
cat so.list
08000000 <reset-0x8>:
8000000: 00002000 andeq r2, r0, r0
8000004: 08000009 stmdaeq r0, {r0, r3}
08000008 <reset>:
8000008: e7fe b.n 8000008 <reset>
Disassembly of section .bss:
20000000 <.bss>:
...
Disassembly of section .data:
20000008 <.data>:
20000008: 11223344 ; <UNDEFINED> instruction: 0x11223344
cat so.srec
S00A0000736F2E7372656338
S30F080000000020000009000008FEE7D2
S3090800000A443322113A
S70508000000F2
arm-none-eabi-readelf -l so.elf
Elf file type is EXEC (Executable file)
Entry point 0x8000000
There are 3 program headers, starting at offset 52
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
LOAD 0x000094 0x08000000 0x08000000 0x0000a 0x0000a R E 0x2
LOAD 0x000000 0x20000000 0x0800000a 0x00000 0x00008 RW 0x1
LOAD 0x00009e 0x20000008 0x0800000a 0x00004 0x00004 RW 0x1
Section to Segment mapping:
Segment Sections...
00 .text
01 .bss
02 .data
hexdump -C so.bin
00000000 00 20 00 00 09 00 00 08 fe e7 44 33 22 11 |. ........D3".|
0000000e
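Normally .bss is not placed in the image like this; you would extend the linker script with start and end symbols so that the bootstrap code can zero that range. For .data, the standard binutils tools show clearly what is going on: the section is linked to run at 0x20000008 in RAM (VirtAddr), but it is stored at 0x0800000a in flash (PhysAddr, the load address), and the S-record and binary outputs use the load address. The startup code is expected to copy it from flash to RAM before main runs.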
You have not provided enough of your code (and linker script), nor a minimal example that demonstrates the problem, so this is about as far as this can go.
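For the patching goal, a rough sketch of reading one S-record and rebuilding it with new data might look like the following. It uses the .data record from so.srec above; parse_srec and build_s3 are hypothetical helper names, and the replacement bytes are arbitrary.

```python
def parse_srec(line):
    """Split one S-record into (type, address, data) after verifying its checksum."""
    line = line.strip()
    if len(line) < 4 or line[0] != "S":
        raise ValueError("not an S-record")
    rtype = int(line[1])
    # Address width in bytes for each record type.
    addr_len = {0: 2, 1: 2, 2: 3, 3: 4, 5: 2, 6: 3, 7: 4, 8: 3, 9: 2}[rtype]
    raw = bytes.fromhex(line[2:])  # count, address, data, checksum
    count, body, checksum = raw[0], raw[1:-1], raw[-1]
    if count != len(raw) - 1:
        raise ValueError("byte count mismatch")
    # Checksum is the one's complement of the low byte of count + address + data.
    if (~(count + sum(body))) & 0xFF != checksum:
        raise ValueError("bad checksum")
    address = int.from_bytes(body[:addr_len], "big")
    return rtype, address, body[addr_len:]

def build_s3(address, data):
    """Build an S3 (32-bit address) data record with a fresh checksum."""
    body = address.to_bytes(4, "big") + bytes(data)
    count = len(body) + 1  # +1 for the checksum byte
    checksum = (~(count + sum(body))) & 0xFF
    return "S3" + (bytes([count]) + body + bytes([checksum])).hex().upper()

rtype, address, data = parse_srec("S3090800000A443322113A")
print(rtype, hex(address), data.hex())

# Replace the four data bytes with an arbitrary new value and re-emit the record.
patched = build_s3(address, bytes.fromhex("78563412"))
print(patched)
```

Note that the record address is the load address in flash (0x0800000a here), so a patcher would have to locate the variable by its load address rather than by the RAM address that objdump -t reports.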
There is a remote 64-bit *nix server that can compile user-provided code (which should be written in Rust, but I don't think that matters since it uses LLVM). I don't know which compiler/linker flags it uses, but the compiled ELF executable looks weird: it has 4 LOAD segments:
$ readelf -e executable
...
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
...
LOAD 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000004138 0x0000000000004138 R 0x1000
LOAD 0x0000000000005000 0x0000000000005000 0x0000000000005000
0x00000000000305e9 0x00000000000305e9 R E 0x1000
LOAD 0x0000000000036000 0x0000000000036000 0x0000000000036000
0x000000000000d808 0x000000000000d808 R 0x1000
LOAD 0x0000000000043da0 0x0000000000044da0 0x0000000000044da0
0x0000000000002290 0x00000000000024a0 RW 0x1000
...
On my own system all executables that I was looking at only have 2 LOAD segments:
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
...
LOAD 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x00000000003000c0 0x00000000003000c0 R E 0x200000
LOAD 0x00000000003002b0 0x00000000005002b0 0x00000000005002b0
0x00000000000776c8 0x000000000009b200 RW 0x200000
...
What are the circumstances (compiler/linker versions, flags etc) under which a compiler might build an ELF with 4 LOAD segments?
What is the point of having 4 LOAD segments? I imagine that having a segment with read but not execute permission might help against certain exploits, but why have two such segments?
A typical BFD-ld or Gold linked Linux executable has 2 loadable segments, with the ELF header merged with .text and .rodata into the first RE segment, and .data, .bss and other writable sections merged into the second RW segment.
Here is the typical section to segment mapping:
$ echo "int foo; int main() { return 0;}" | clang -xc - -o a.out-gold -fuse-ld=gold
$ readelf -Wl a.out-gold
Elf file type is EXEC (Executable file)
Entry point 0x400420
There are 9 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000040 0x0000000000400040 0x0000000000400040 0x0001f8 0x0001f8 R 0x8
INTERP 0x000238 0x0000000000400238 0x0000000000400238 0x00001c 0x00001c R 0x1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x0006b0 0x0006b0 R E 0x1000
LOAD 0x000e18 0x0000000000401e18 0x0000000000401e18 0x0001f8 0x000200 RW 0x1000
DYNAMIC 0x000e28 0x0000000000401e28 0x0000000000401e28 0x0001b0 0x0001b0 RW 0x8
NOTE 0x000254 0x0000000000400254 0x0000000000400254 0x000020 0x000020 R 0x4
GNU_EH_FRAME 0x00067c 0x000000000040067c 0x000000000040067c 0x000034 0x000034 R 0x4
GNU_STACK 0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW 0x10
GNU_RELRO 0x000e18 0x0000000000401e18 0x0000000000401e18 0x0001e8 0x0001e8 RW 0x8
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .note.ABI-tag .dynsym .dynstr .gnu.hash .hash .gnu.version .gnu.version_r .rela.dyn .init .text .fini .rodata .eh_frame .eh_frame_hdr
03 .fini_array .init_array .dynamic .got .got.plt .data .bss
04 .dynamic
05 .note.ABI-tag
06 .eh_frame_hdr
07
08 .fini_array .init_array .dynamic .got .got.plt
This minimizes the number of mmaps that the kernel must perform to load such an executable, but at a security cost: the data in .rodata shouldn't be executable, but is (because it's merged with .text, which must be executable). This may significantly increase the attack surface for someone trying to hijack a process.
Newer Linux systems, in particular those using LLD to link binaries, prioritize security over speed, and put the ELF header and .rodata into the first R-only segment, resulting in 3 load segments and improved security. Here is a typical mapping:
$ echo "int foo; int main() { return 0;}" | clang -xc - -o a.out-lld -fuse-ld=lld
$ readelf -Wl a.out-lld
Elf file type is EXEC (Executable file)
Entry point 0x201000
There are 10 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000040 0x0000000000200040 0x0000000000200040 0x000230 0x000230 R 0x8
INTERP 0x000270 0x0000000000200270 0x0000000000200270 0x00001c 0x00001c R 0x1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x000000 0x0000000000200000 0x0000000000200000 0x000558 0x000558 R 0x1000
LOAD 0x001000 0x0000000000201000 0x0000000000201000 0x000185 0x000185 R E 0x1000
LOAD 0x002000 0x0000000000202000 0x0000000000202000 0x001170 0x002005 RW 0x1000
DYNAMIC 0x003010 0x0000000000203010 0x0000000000203010 0x000150 0x000150 RW 0x8
GNU_RELRO 0x003000 0x0000000000203000 0x0000000000203000 0x000170 0x001000 R 0x1
GNU_EH_FRAME 0x000440 0x0000000000200440 0x0000000000200440 0x000034 0x000034 R 0x1
GNU_STACK 0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW 0
NOTE 0x00028c 0x000000000020028c 0x000000000020028c 0x000020 0x000020 R 0x4
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .note.ABI-tag .rodata .dynsym .gnu.version .gnu.version_r .gnu.hash .hash .dynstr .rela.dyn .eh_frame_hdr .eh_frame
03 .text .init .fini
04 .data .tm_clone_table .fini_array .init_array .dynamic .got .bss
05 .dynamic
06 .fini_array .init_array .dynamic .got
07 .eh_frame_hdr
08
09 .note.ABI-tag
Not to be left behind, the newer BFD-ld (my version is 2.31.1) also makes ELF header and .rodata read-only, but fails to merge two R-only segments into one, resulting in 4 loadable segments:
$ echo "int foo; int main() { return 0;}" | clang -xc - -o a.out-bfd -fuse-ld=bfd
$ readelf -Wl a.out-bfd
Elf file type is EXEC (Executable file)
Entry point 0x401020
There are 11 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000040 0x0000000000400040 0x0000000000400040 0x000268 0x000268 R 0x8
INTERP 0x0002a8 0x00000000004002a8 0x00000000004002a8 0x00001c 0x00001c R 0x1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x0003f8 0x0003f8 R 0x1000
LOAD 0x001000 0x0000000000401000 0x0000000000401000 0x00018d 0x00018d R E 0x1000
LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x000110 0x000110 R 0x1000
LOAD 0x002e40 0x0000000000403e40 0x0000000000403e40 0x0001e8 0x0001f0 RW 0x1000
DYNAMIC 0x002e50 0x0000000000403e50 0x0000000000403e50 0x0001a0 0x0001a0 RW 0x8
NOTE 0x0002c4 0x00000000004002c4 0x00000000004002c4 0x000020 0x000020 R 0x4
GNU_EH_FRAME 0x002004 0x0000000000402004 0x0000000000402004 0x000034 0x000034 R 0x4
GNU_STACK 0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW 0x10
GNU_RELRO 0x002e40 0x0000000000403e40 0x0000000000403e40 0x0001c0 0x0001c0 R 0x1
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .note.ABI-tag .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn
03 .init .text .fini
04 .rodata .eh_frame_hdr .eh_frame
05 .init_array .fini_array .dynamic .got .got.plt .data .bss
06 .dynamic
07 .note.ABI-tag
08 .eh_frame_hdr
09
10 .init_array .fini_array .dynamic .got
Finally, some of these choices are affected by the --[no-]rosegment linker option (or -Wl,-z,noseparate-code for BFD ld).
How is it possible to extract loadable program headers individually from ELF files?
By examining a binary using readelf one can get output similar to:
$ readelf -l helloworld
Elf file type is EXEC (Executable file)
Entry point 0x400440
There are 9 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
PHDR 0x0000000000000040 0x0000000000400040 0x0000000000400040
0x00000000000001f8 0x00000000000001f8 R E 8
INTERP 0x0000000000000238 0x0000000000400238 0x0000000000400238
0x000000000000001c 0x000000000000001c R 1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x000000000000070c 0x000000000000070c R E 200000
LOAD 0x0000000000000e10 0x0000000000600e10 0x0000000000600e10
0x0000000000000230 0x0000000000000238 RW 200000
DYNAMIC 0x0000000000000e28 0x0000000000600e28 0x0000000000600e28
0x00000000000001d0 0x00000000000001d0 RW 8
NOTE 0x0000000000000254 0x0000000000400254 0x0000000000400254
0x0000000000000044 0x0000000000000044 R 4
GNU_EH_FRAME 0x00000000000005e4 0x00000000004005e4 0x00000000004005e4
0x0000000000000034 0x0000000000000034 R 4
GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 RW 10
GNU_RELRO 0x0000000000000e10 0x0000000000600e10 0x0000000000600e10
0x00000000000001f0 0x00000000000001f0 R 1
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .text .fini .rodata .eh_frame_hdr .eh_frame
03 .init_array .fini_array .jcr .dynamic .got .got.plt .data .bss
04 .dynamic
05 .note.ABI-tag .note.gnu.build-id
06 .eh_frame_hdr
07
08 .init_array .fini_array .jcr .dynamic .got
This question explains how (and where) loadable headers are mapped into memory, but it does not say where in the binary the sections are read from (at which offset and with what size).
Is it determined by the current program header's fields p_offset and p_filesz?
struct Proghdr {
uint32_t p_type;
uint32_t p_offset;
uint32_t p_va;
uint32_t p_pa;
uint32_t p_filesz;
uint32_t p_memsz;
uint32_t p_flags;
uint32_t p_align;
};
struct Elf *elf_header = ...
struct Proghdr *ph, *eph;
if (elf_header->e_magic != ELF_MAGIC)
goto bad;
/* the program header table starts e_phoff bytes into the file */
ph = (struct Proghdr *) ((uint8_t *) elf_header + elf_header->e_phoff);
eph = ph + elf_header->e_phnum;
for (; ph < eph; ph++)
if (ph->p_type == PT_LOAD)
/* read_pload(dst address in memory, how many bytes to read, offset in the file) */
read_pload(ph->p_pa, ph->p_memsz, ph->p_offset);
Is it determined by the current program header's fields p_offset and p_filesz?
Yes, exactly.
Get the file offset of the program header table from e_phoff, the number of headers from e_phnum, and the size of each header from e_phentsize; all three fields are in the ELF file header. Every header is exactly e_phentsize bytes long, so header i starts at e_phoff + i * e_phentsize, for a total of e_phnum headers.
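As a rough illustration of that walk, the sketch below pulls the file bytes of every PT_LOAD segment out of an in-memory image. It assumes a little-endian ELF64 file, where p_flags comes second in the program header (unlike the 32-bit struct in the question); load_segments is a hypothetical name, and the tiny synthetic image exists only so the example runs without an external file.

```python
import struct

PT_LOAD = 1
EHDR64 = struct.Struct("<16sHHIQQQIHHHHHH")  # Elf64_Ehdr, little-endian
PHDR64 = struct.Struct("<IIQQQQQQ")          # Elf64_Phdr, little-endian

def load_segments(image):
    """Yield (p_vaddr, file bytes, p_memsz) for every PT_LOAD entry in an ELF64 LE image."""
    fields = EHDR64.unpack_from(image, 0)
    ident, e_phoff = fields[0], fields[5]
    e_phentsize, e_phnum = fields[9], fields[10]
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("expected a little-endian ELF64 file")
    for i in range(e_phnum):
        # Headers are e_phentsize bytes apart, starting at e_phoff.
        (p_type, p_flags, p_offset, p_vaddr,
         p_paddr, p_filesz, p_memsz, p_align) = PHDR64.unpack_from(
            image, e_phoff + i * e_phentsize)
        if p_type == PT_LOAD:
            # The segment's file contents are p_filesz bytes at p_offset.
            yield p_vaddr, image[p_offset:p_offset + p_filesz], p_memsz

# Synthetic image: ELF header, one PT_LOAD header, then 4 payload bytes.
ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)
payload = b"\xde\xad\xbe\xef"
ehdr = EHDR64.pack(ident, 2, 62, 1, 0x400078, 64, 0, 0, 64, 56, 1, 0, 0, 0)
phdr = PHDR64.pack(PT_LOAD, 5, 120, 0x400078, 0x400078, len(payload), 8, 0x1000)
image = ehdr + phdr + payload

for vaddr, data, memsz in load_segments(image):
    print(hex(vaddr), data.hex(), memsz)
```

For a real binary, the image would be the bytes read from the file. When p_memsz is larger than p_filesz, a loader is expected to zero-fill the remaining bytes (this is how .bss gets its storage).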