I need to determine the VMAs for loadable segments of ELF executables. VMAs can be printed from /proc/pid/maps. The relationship between VMAs shown by maps with loadable segments is also clear to me. Each segment consists of one or more VMAs. what is the method used by kernel to form VMAs from ELF segments: whteher it takes into consideration only permissions/flags or something else is also required? As per my understanding, a segment with flags Read, Execute(code) will go in separate VMA having same permission. While next segment with permissions Read, Write(data) should go in an other VMA. But this is not case with second loadable segment, it is usually splitted in two or more VMAs: some with read and write while other with read only. So My assumption that flags are the only culprit for VMA generation seems wrong. I need help to understand this relationship between segments and VMAs.
What I want to do is to programmatically determine the VMAs for loadable segments of ELF with out loading it in memory. So any pointer/help in this direction is the main objective of this post.
A VMA is a homogeneous region of virtual memory with:
the same permissions (PROT_EXEC, etc.);
the same type (MAP_SHARED/MAP_PRIVATE);
the same backing file (if any);
a consistent offset within the file.
For example, if you have a VMA which is RW and you mprotect PROT_READ (you remove the permission to write) a part in the middle of the VMA, the kernel will split the VMA in three VMAs (the first one being RW, the second R and the last RW).
Let's look at a typical VMA from an executable:
$ cat /proc/$$/maps
00400000-004f2000 r-xp 00000000 08:01 524453 /bin/bash
006f1000-006f2000 r--p 000f1000 08:01 524453 /bin/bash
006f2000-006fb000 rw-p 000f2000 08:01 524453 /bin/bash
006fb000-00702000 rw-p 00000000 00:00 0
[...]
The first VMA is the text segment. The second, third and fourth VMAs are the data segment.
Anonymous mapping for .bss
At the beginning of the process, you will have something like this:
$ cat /proc/$$/maps
00400000-004f2000 r-xp 00000000 08:01 524453 /bin/bash
006f1000-006fb000 rw-p 000f1000 08:01 524453 /bin/bash
006fb000-00702000 rw-p 00000000 00:00 0
[...]
006f1000-006fb000 is the part of the text segment which comes from the executable file.
006fb000-00702000 is not present in the executable file because it is initially filled with zeroes. The non-initialized variables of the process are all grouped together (in the .bss segment) and are not represented in the executable file in order to save space (1).
This come from the PT_LOAD entries of the program header table of the executable file (readelf -l) which describe the segments to map into memory:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
[...]
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x00000000000f1a74 0x00000000000f1a74 R E 200000
LOAD 0x00000000000f1de0 0x00000000006f1de0 0x00000000006f1de0
0x0000000000009068 0x000000000000f298 RW 200000
[...]
If you look at the corresponding PT_LOAD entry, you will notice that a part of the the segment is not represented in the file (because the file size is smaller than the memory size).
The part of the data segment which is not in the executable file is initialized with zeros: the dynamic linker uses a MAP_ANONYMOUS mapping for this part of the data segment. This is why is appears as a separate VMA (it does not have the same backing file).
Relocation protection (PT_GNU_RELRO)
When the dynamic, linker has finished doing the relocations (2), it might mark some part of the data segment (the .got section among others) as read-only in order to avoid GOT-poisoning attacks or bugs. The section of the data segment which should be protected after the relocations in described by the PT_GNU_RELRO entry of the program header table: the dynamic linker mprotect(addr, len, PROT_READ) the given region after finishing the relocations (3). This mprotect call splits the second VMA in two VMAs (the first one R and the second one RW).
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
[...]
GNU_RELRO 0x00000000000f1de0 0x00000000006f1de0 0x00000000006f1de0
0x0000000000000220 0x0000000000000220 R
[...]
Summary
The VMAs
00400000-004f2000 r-xp 00000000 08:01 524453 /bin/bash
006f1000-006f2000 r--p 000f1000 08:01 524453 /bin/bash
006f2000-006fb000 rw-p 000f2000 08:01 524453 /bin/bash
006fb000-00702000 rw-p 00000000 00:00 0
are derived from the VirtAddr, MemSiz and Flags fields of the PT_LOAD and PT_GNU_RELRO entries:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
[...]
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x00000000000f1a74 0x00000000000f1a74 R E 200000
LOAD 0x00000000000f1de0 0x00000000006f1de0 0x00000000006f1de0
0x0000000000009068 0x000000000000f298 RW 200000
[...]
GNU_RELRO 0x00000000000f1de0 0x00000000006f1de0 0x00000000006f1de0
0x0000000000000220 0x0000000000000220 R
[...]
First all PT_LOAD entries are processes. Each of them triggers the creation of one VMA by using a mmap(). In addition, if MemSiz > FileSiz, it might create an additional anonymous VMA.
Then all (well there is only once in pratice) PT_GNU_RELRO are processes. Each of them triggers a mprotect() call which might split an existing VMA into different VMAs.
In order to do what you want, the correct way is probably to simulate the mmap and mprotect calls:
// Virtual Memory Area:
struct Vma {
std::uint64_t addr, length;
std::string file_name;
int prot;
int flags;
std::uint64_t offset;
};
// Virtual Address Space:
class Vas {
private:
std::list<Vma> vmas_;
public:
Vma& mmap(
std::uint64_t addr, std::uint64_t length, int prot,
int flags, int fd, off_t offset);
int mprotect(std::uint64_t addr, std::uint64_t len, int prot);
std::list<Vma> const& vmas() const { return vmas_; }
};
for (Elf32_Phdr const& h : phdrs)
if (h.p_type == PT_LOAD) {
vas.mmap(...);
if (anon_size)
vas.mmap(...);
}
for (Elf32_Phdr const& h : phdrs)
if (h.p_type == PT_GNU_RELRO)
vas.mprotect(...);
Some examples of computations
The addresses are slightly different because the VMAs are page-aligned (3) (using 4Kio = 0x1000 pages for x86 and x86_64):
The first VMA is describes by the first PT_LOAD entry:
vma[0].start = page_floor(load[0].virt_addr)
= 0x400000
vma[0].end = page_ceil(load[1].virt_addr + load[1].phys_size)
= page_ceil(0x400000 + 0xf1a74)
= page_ceil(0x4f1a74)
= 0x4f2000
The next VMA is the part of the data segment which as been protected and is described by PT_GNU_RELRO:
vma[1].start = page_floor(relro[0].virt_addr)
= page_floor(0xf1de0)
= 0x6f1000
vma[1].end = page_ceil(relro[0].virt_addr + relo[0].mem_size)
= page_ceil(0x6f1de0 + 0x220)
= page_ceil(0x6f2000)
= 0x6f2000
[...]
Correspondence with the sections
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 0] NULL 0000000000000000 00000000
0000000000000000 0000000000000000 0 0 0
[ 1] .interp PROGBITS 0000000000400238 00000238
000000000000001c 0000000000000000 A 0 0 1
[ 2] .note.ABI-tag NOTE 0000000000400254 00000254
0000000000000020 0000000000000000 A 0 0 4
[ 3] .note.gnu.build-i NOTE 0000000000400274 00000274
0000000000000024 0000000000000000 A 0 0 4
[ 4] .gnu.hash GNU_HASH 0000000000400298 00000298
0000000000004894 0000000000000000 A 5 0 8
[ 5] .dynsym DYNSYM 0000000000404b30 00004b30
000000000000d6c8 0000000000000018 A 6 1 8
[ 6] .dynstr STRTAB 00000000004121f8 000121f8
0000000000008c25 0000000000000000 A 0 0 1
[ 7] .gnu.version VERSYM 000000000041ae1e 0001ae1e
00000000000011e6 0000000000000002 A 5 0 2
[ 8] .gnu.version_r VERNEED 000000000041c008 0001c008
00000000000000b0 0000000000000000 A 6 2 8
[ 9] .rela.dyn RELA 000000000041c0b8 0001c0b8
00000000000000c0 0000000000000018 A 5 0 8
[10] .rela.plt RELA 000000000041c178 0001c178
00000000000013f8 0000000000000018 AI 5 12 8
[11] .init PROGBITS 000000000041d570 0001d570
000000000000001a 0000000000000000 AX 0 0 4
[12] .plt PROGBITS 000000000041d590 0001d590
0000000000000d60 0000000000000010 AX 0 0 16
[13] .text PROGBITS 000000000041e2f0 0001e2f0
0000000000099c42 0000000000000000 AX 0 0 16
[14] .fini PROGBITS 00000000004b7f34 000b7f34
0000000000000009 0000000000000000 AX 0 0 4
[15] .rodata PROGBITS 00000000004b7f40 000b7f40
000000000001ebb0 0000000000000000 A 0 0 64
[16] .eh_frame_hdr PROGBITS 00000000004d6af0 000d6af0
000000000000407c 0000000000000000 A 0 0 4
[17] .eh_frame PROGBITS 00000000004dab70 000dab70
0000000000016f04 0000000000000000 A 0 0 8
[18] .init_array INIT_ARRAY 00000000006f1de0 000f1de0
0000000000000008 0000000000000000 WA 0 0 8
[19] .fini_array FINI_ARRAY 00000000006f1de8 000f1de8
0000000000000008 0000000000000000 WA 0 0 8
[20] .jcr PROGBITS 00000000006f1df0 000f1df0
0000000000000008 0000000000000000 WA 0 0 8
[21] .dynamic DYNAMIC 00000000006f1df8 000f1df8
0000000000000200 0000000000000010 WA 6 0 8
[22] .got PROGBITS 00000000006f1ff8 000f1ff8
0000000000000008 0000000000000008 WA 0 0 8
[23] .got.plt PROGBITS 00000000006f2000 000f2000
00000000000006c0 0000000000000008 WA 0 0 8
[24] .data PROGBITS 00000000006f26c0 000f26c0
0000000000008788 0000000000000000 WA 0 0 64
[25] .bss NOBITS 00000000006fae80 000fae48
00000000000061f8 0000000000000000 WA 0 0 64
[26] .shstrtab STRTAB 0000000000000000 000fae48
00000000000000ef 0000000000000000 0 0 1
It you compare the Address of the sections (readelf -S) with the ranges of the VMAs, you find the mappings:
00400000-004f2000 r-xp /bin/bash : .interp, .note.ABI-tag, .note.gnu.build-id, .gnu.hash, .dynsym, .dynstr, .gnu.version, .gnu.version_r, .rela.dyn, .rela.plt, .init, .plt, .text, .fini, .rodata.eh_frame_hdr, .eh_frame
006f1000-006f2000 r--p /bin/bash : .init_array, .fini_array, .jcr, .dynamic, .got
006f2000-006fb000 rw-p /bin/bash : .got.plt, .data, beginning of .bss
006fb000-00702000 rw-p - : rest of .bss
Notes
(1): In fact, its more complicated: a part of the .bss section might be represented in the executable file for page alignment reasons.
(2): In fact, when it has finished doing the non-lazy relocations.
(3): MMU operations are using the page-granularity so the memory ranges of mmap(), mprotect(), munmap() calls are extended to cover full-pages.
Related
As the man page says:
-ffunction-sections
-fdata-sections
Place each function or data item into its own section in the
output file if the target supports arbitrary sections. The
name of the function or the name of the data item determines
the section's name in the output file.
And after compiling this code:
...
int bss_var_1 = 0;
int bss_var_2;
int bss_var_3;
int data_var_1 = 90;
int data_var_2 = 47;
int data_var_3[128] = {212};
int foo() {
printf("hello, foo()\n");
}
int func() {
printf("hello, func()\n");
}
int main(void) {
...
}
I got main.o in my folder, then I listed all its sections, it did place each function and data into its own section, but why do developers need these two options? (for example, any special usage to get their work done)
$ readelf build/main.o -S
There are 34 section headers, starting at offset 0xeb0:
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .text PROGBITS 00000000 000034 000000 00 AX 0 0 2
[ 2] .data PROGBITS 00000000 000034 000000 00 WA 0 0 1
[ 3] .bss NOBITS 00000000 000034 000000 00 WA 0 0 1
[ 4] .bss.bss_var_1 NOBITS 00000000 000034 000004 00 WA 0 0 4
[ 5] .bss.bss_var_2 NOBITS 00000000 000034 000004 00 WA 0 0 4
[ 6] .bss.bss_var_3 NOBITS 00000000 000034 000004 00 WA 0 0 4
[ 7] .data.data_var_1 PROGBITS 00000000 000034 000004 00 WA 0 0 4
[ 8] .data.data_var_2 PROGBITS 00000000 000038 000004 00 WA 0 0 4
[ 9] .data.data_var_3 PROGBITS 00000000 00003c 000200 00 WA 0 0 4
[10] .rodata PROGBITS 00000000 00023c 000047 00 A 0 0 4
[11] .text.foo PROGBITS 00000000 000284 000014 00 AX 0 0 4
[12] .rel.text.foo REL 00000000 000b78 000010 08 I 31 11 4
[13] .text.func PROGBITS 00000000 000298 000014 00 AX 0 0 4
[14] .rel.text.func REL 00000000 000b88 000010 08 I 31 13 4
[15] .text.main PROGBITS 00000000 0002ac 000028 00 AX 0 0 4
[16] .rel.text.main REL 00000000 000b98 000020 08 I 31 15 4
...
This allows linker to remove the unused sections [source]
The operation of eliminating the unused code and data from the final
executable is directly performed by the linker.
In order to do this, it has to work with objects compiled with the
following options: -ffunction-sections -fdata-sections.
These options are usable with C and Ada files. They will place
respectively each function or data in a separate section in the
resulting object file.
Once the objects and static libraries are created with these options,
the linker can perform the dead code elimination. You can do this by
setting the -Wl,--gc-sections option to gcc command or in the -largs
section of gnatmake. This will perform a garbage collection of code
and data never referenced.
Additionally, usage guidelines are provided in the documentation of the flags linked to by Haris:
Together with a linker garbage collection (linker --gc-sections
option) these options may lead to smaller statically-linked
executables (after stripping).
On ELF/DWARF systems these options do not degenerate the quality of
the debug information. There could be issues with other object
files/debug info formats.
Only use these options when there are significant benefits from doing
so. When you specify these options, the assembler and linker create
larger object and executable files and are also slower. These options
affect code generation. They prevent optimizations by the compiler and
assembler using relative locations inside a translation unit since the
locations are unknown until link time. An example of such an
optimization is relaxing calls to short call instructions.
I build OpenSSL-1.0.2n with -g 386 shared option (to work with basic assembly version) to generate shared library libcrypto.so.1.0.0.
Inside crypto/aes folder, aes-x86_64.s is generated and it has different global functions/labels.
The total numbers of lines in aes-x86_64.s is 2535 and various labels are present at different place (or line number in .s file).
328 .globl AES_encrypt
.type AES_encrypt,#function
.align 16
.globl asm_AES_encrypt
.hidden asm_AES_encrypt
asm_AES_encrypt:
334 AES_encrypt:
775 .globl AES_decrypt
.type AES_decrypt,#function
.align 16
.globl asm_AES_decrypt
.hidden asm_AES_decrypt
asm_AES_decrypt:
781 AES_decrypt:
844 .globl private_AES_set_encrypt_key
.type private_AES_set_encrypt_key,#function
.align 16
847 private_AES_set_encrypt_key:
1105 .globl private_AES_set_decrypt_key
.type private_AES_set_decrypt_key,#function
.align 16
1108 private_AES_set_decrypt_key:
1292 .globl AES_cbc_encrypt
.type AES_cbc_encrypt,#function
.align 16
.globl asm_AES_cbc_encrypt
.hidden asm_AES_cbc_encrypt
asm_AES_cbc_encrypt:
1299 AES_cbc_encrypt:
1750 .LAES_Te:
.long 0xa56363c6,0xa56363c6
.long 0x847c7cf8,0x847c7cf8
.long 0x997777ee,0x997777ee
.long 0x8d7b7bf6,0x8d7b7bf6
.long 0x0df2f2ff,0x0df2f2ff
.long 0xbd6b6bd6,0xbd6b6bd6
....
....
2140 .LAES_Td:
.long 0x50a7f451,0x50a7f451
.long 0x5365417e,0x5365417e
.long 0xc3a4171a,0xc3a4171a
.long 0x965e273a,0x965e273a
.long 0xcb6bab3b,0xcb6bab3b
AES_cbc_encrypt is global function declared at line number 776 and label AES_cbc_encrypt is at line number 781.
local label .LAES_Te and .LAES_Td are at line number 1750 and 2140 respectively where long data are stored.
I am able to access global label AES_cbc_encrypt of assembly file from another C program by linking with shared library.
//test_glob.c
#include <stdlib.h>
extern void* AES_cbc_encrypt() ;
int main()
{
long *p;
int i;
p=(long *)(&AES_cbc_encrypt);
for(i=0;i<768;i++)
{
printf("p+%d %p %x\n",i, p+i,*(p+i));
}
}
gcc test_glob.c -lcryto
./a.out
This gives some random output and later segmentation fault.
There must be a way to find the offset of this data section (local label .LAES_Te and .LAES_Td) from global label AES_cbc_encrypt
so that the data can be used in encryption/decryption.
I have following questions.
1. How to find the offset from global label AES_cbc_encrypt to local label .LAES_Te and .LAES_Td so that based on
that offset I can access data from another C program ?
2. Is there any other way to access those data of assembly file from C program ?
3. Is there any way to find the location in memory where those data is loaded and access those memory location to access data ?
I am using gcc-5.4 Linux Ubuntu 16.04 . Any help or link will be highly appreciated. Thanks in advance.
EDIT 1:
readelf -a aes-x86_64.o produces following output.
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: REL (Relocatable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x0
Start of program headers: 0 (bytes into file)
Start of section headers: 14672 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 0 (bytes)
Number of program headers: 0
Size of section headers: 64 (bytes)
Number of section headers: 16
Section header string table index: 13
Section Headers:
[Nr] Name Type Address Offset Size EntSize Flags Link Info Align
[ 0] NULL 0000000000000000 00000000 0000000000000000 0000000000000000 0 0 0
[ 1] .text PROGBITS 0000000000000000 00000040 0000000000002e40 0000000000000000 AX 0 0 64
[ 2] .rela.text RELA 0000000000000000 00003808 0000000000000018 0000000000000018 I 14 1 8
[ 3] .data PROGBITS 0000000000000000 00002e80 0000000000000000 0000000000000000 WA 0 0 1
[ 4] .bss NOBITS 0000000000000000 00002e80 0000000000000000 0000000000000000 WA 0 0 1
[ 5] .note.GNU-stack PROGBITS 0000000000000000 00002e80 0000000000000000 0000000000000000 0 0 1
[ 6] .debug_line PROGBITS 0000000000000000 00002e80 00000000000005a4 0000000000000000 0 0 1
[ 7] .rela.debug_line RELA 0000000000000000 00003820 0000000000000018 0000000000000018 I 14 6 8
[ 8] .debug_info PROGBITS 0000000000000000 00003424 0000000000000071 0000000000000000 0 0 1
[ 9] .rela.debug_info RELA 0000000000000000 00003838 0000000000000060 0000000000000018 I 14 8 8
[10] .debug_abbrev PROGBITS 0000000000000000 00003495 0000000000000014 0000000000000000 0 0 1
[11] .debug_aranges PROGBITS 0000000000000000 000034b0 0000000000000030 0000000000000000 0 0 16
[12] .rela.debug_arang RELA 0000000000000000 00003898 0000000000000030 0000000000000018 I 14 11 8
[13] .shstrtab STRTAB 0000000000000000 000038c8 0000000000000085 0000000000000000 0 0 1
[14] .symtab SYMTAB 0000000000000000 000034e0 0000000000000228 0000000000000018 15 14 8
[15] .strtab STRTAB 0000000000000000 00003708 00000000000000fb 0000000000000000 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), l (large)
I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
O (extra OS processing required) o (OS specific), p (processor specific)
There are no section groups in this file.
There are no program headers in this file.
Relocation section '.rela.text' at offset 0x3808 contains 1 entries:
Offset Info Type Sym. Value Sym. Name + Addend
000000000fc0 001600000002 R_X86_64_PC32 0000000000000000 OPENSSL_ia32cap_P - 4
Relocation section '.rela.debug_line' at offset 0x3820 contains 1 entries:
Offset Info Type Sym. Value Sym. Name + Addend
000000000030 000100000001 R_X86_64_64 0000000000000000 .text + 0
Relocation section '.rela.debug_info' at offset 0x3838 contains 4 entries:
Offset Info Type Sym. Value Sym. Name + Addend
000000000006 000a0000000a R_X86_64_32 0000000000000000 .debug_abbrev + 0
00000000000c 000b0000000a R_X86_64_32 0000000000000000 .debug_line + 0
000000000010 000100000001 R_X86_64_64 0000000000000000 .text + 0
000000000018 000100000001 R_X86_64_64 0000000000000000 .text + 2e40
Relocation section '.rela.debug_aranges' at offset 0x3898 contains 2 entries:
Offset Info Type Sym. Value Sym. Name + Addend
000000000006 00090000000a R_X86_64_32 0000000000000000 .debug_info + 0
000000000010 000100000001 R_X86_64_64 0000000000000000 .text + 0
The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported.
Symbol table '.symtab' contains 23 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 SECTION LOCAL DEFAULT 1
2: 0000000000000000 0 SECTION LOCAL DEFAULT 3
3: 0000000000000000 0 SECTION LOCAL DEFAULT 4
4: 0000000000000000 483 FUNC LOCAL DEFAULT 1 _x86_64_AES_encrypt
5: 00000000000001f0 609 FUNC LOCAL DEFAULT 1 _x86_64_AES_encrypt_compa
6: 0000000000000520 465 FUNC LOCAL DEFAULT 1 _x86_64_AES_decrypt
7: 0000000000000700 737 FUNC LOCAL DEFAULT 1 _x86_64_AES_decrypt_compa
8: 0000000000000ae0 649 FUNC LOCAL DEFAULT 1 _x86_64_AES_set_encrypt_k
9: 0000000000000000 0 SECTION LOCAL DEFAULT 8
10: 0000000000000000 0 SECTION LOCAL DEFAULT 10
11: 0000000000000000 0 SECTION LOCAL DEFAULT 6
12: 0000000000000000 0 SECTION LOCAL DEFAULT 11
13: 0000000000000000 0 SECTION LOCAL DEFAULT 5
14: 0000000000000460 177 FUNC GLOBAL DEFAULT 1 AES_encrypt
15: 0000000000000460 0 NOTYPE GLOBAL HIDDEN 1 asm_AES_encrypt
16: 00000000000009f0 184 FUNC GLOBAL DEFAULT 1 AES_decrypt
17: 00000000000009f0 0 NOTYPE GLOBAL HIDDEN 1 asm_AES_decrypt
18: 0000000000000ab0 35 FUNC GLOBAL DEFAULT 1 private_AES_set_encrypt_k
19: 0000000000000d70 541 FUNC GLOBAL DEFAULT 1 private_AES_set_decrypt_k
20: 0000000000000f90 1411 FUNC GLOBAL DEFAULT 1 AES_cbc_encrypt
21: 0000000000000f90 0 NOTYPE GLOBAL HIDDEN 1 asm_AES_cbc_encrypt
22: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND OPENSSL_ia32cap_P
No version information found in this file.
EDIT 2:
nm aes-x86_64.o produces following output.
0000000000000f90 T AES_cbc_encrypt
00000000000009f0 T AES_decrypt
0000000000000460 T AES_encrypt
0000000000000f90 T asm_AES_cbc_encrypt
00000000000009f0 T asm_AES_decrypt
0000000000000460 T asm_AES_encrypt
U OPENSSL_ia32cap_P
0000000000000d70 T private_AES_set_decrypt_key
0000000000000ab0 T private_AES_set_encrypt_key
0000000000000520 t _x86_64_AES_decrypt
0000000000000700 t _x86_64_AES_decrypt_compact
0000000000000000 t _x86_64_AES_encrypt
00000000000001f0 t _x86_64_AES_encrypt_compact
0000000000000ae0 t _x86_64_AES_set_encrypt_key
Edit 3:
nm -a gives following output
0000000000000f90 T AES_cbc_encrypt
00000000000009f0 T AES_decrypt
0000000000000460 T AES_encrypt
0000000000000f90 T asm_AES_cbc_encrypt
00000000000009f0 T asm_AES_decrypt
0000000000000460 T asm_AES_encrypt
0000000000000000 b .bss
0000000000000000 d .data
0000000000000000 N .debug_abbrev
0000000000000000 N .debug_aranges
0000000000000000 N .debug_info
0000000000000000 N .debug_line
0000000000000000 n .note.GNU-stack
U OPENSSL_ia32cap_P
0000000000000d70 T private_AES_set_decrypt_key
0000000000000ab0 T private_AES_set_encrypt_key
0000000000000000 t .text
0000000000000520 t _x86_64_AES_decrypt
0000000000000700 t _x86_64_AES_decrypt_compact
0000000000000000 t _x86_64_AES_encrypt
00000000000001f0 t _x86_64_AES_encrypt_compact
0000000000000ae0 t _x86_64_AES_set_encrypt_key
If you hard-code an offset based on this version of the library, it could break with a different version that has any changes in aes-x86_64.s.
So you should add a .globl foo and foo: label to the .s at the position of the data you want to access, and declare it in C as extern uint32_t foo[].
Then the normal code-gen mechanisms for accessing static data from a shared library will kick in. (i.e. load the address from the GOT if necessary).
Also, unless you compile with -fno-plt, &AES_cbc_encrypt will be the address of the PLT stub / wrapper, not the actual function in the library.
If you only need it to work with a specific build of the library:
Then yes I think with -fno-plt, taking the address of a function in the library will compile/assemble to a load from the GOT, so you get the actual address after dynamic linking. -fno-plt is essential for this to work.
It might be fairly far away if it's in another section (.rodata instead of .text probably) so your simple scan of 768 * 4 bytes may not find the table, though.
A better way to find the offset from a symbol you can use & on in C:
Use a debugger: single-step into a function that uses the data, and find what address it's loading from (gdb's built-in disassembly should work).
Or disassemble the binary and look at the little-endian rel32 offset in a RIP-relative load or LEA of the table address. (That offset won't be fixed-up at run-time). Look at the asm source to find an instruction that references the hidden symbol you want, then find that instruction in the disassembly.
That will give you the distance in bytes from the end of that instruction to the table. You can probably see the distance from that instruction to a symbol you can take the address of in C (like you're doing with the function pointer). Also, the disassembler will fill in absolute addresses (relative to some arbitrary base) for load addresses, and for symbols / instructions, so you can subtract those.
I'm trying to understand where exactly does the executable assembly of a program end up, when a program is loaded/running. I found two resources talking about this, but they are somewhat difficult to read:
Understanding ELF using readelf and objdump Linux article (code formatting is messed up)
Michael Guyver, Some Assembly Required*: Relocations, Relocations (lots of assembly which I'm not exactly proficient in)
So, here's a brief example; I'm interested where does the executable section of the tail program end up. Basically, objdump tells me this:
$ objdump -dj .text /usr/bin/tail | head -10
/usr/bin/tail: file format elf32-i386
Disassembly of section .text:
08049100 <.text>:
8049100: 31 ed xor %ebp,%ebp
8049102: 5e pop %esi
8049103: 89 e1 mov %esp,%ecx
...
I'm assuming I'd see calls to tail's 'main()' be made here, had symbols not been stripped. Anyways, the start of the executable section is, according to this, 0x08049100; I'm interested in where it ends up eventually.
Then, I run tail in the background, getting its pid:
$ /usr/bin/tail -f & echo $!
28803
... and I inspect its /proc/pid/maps:
$ cat /proc/28803/maps
00547000-006a8000 r-xp 00000000 08:05 3506 /lib/i386-linux-gnu/libc-2.13.so
...
008c6000-008c7000 r-xp 00000000 00:00 0 [vdso]
08048000-08054000 r-xp 00000000 08:05 131469 /usr/bin/tail
08054000-08055000 r--p 0000b000 08:05 131469 /usr/bin/tail
08055000-08056000 rw-p 0000c000 08:05 131469 /usr/bin/tail
08af1000-08b12000 rw-p 00000000 00:00 0 [heap]
b76de000-b78de000 r--p 00000000 08:05 139793 /usr/lib/locale/locale-archive
...
bf845000-bf866000 rw-p 00000000 00:00 0 [stack]
Now I have tail three times - but the executable segment r-xp (which is the .text?) is apparently at 0x08048000 (an address that apparently was standardized back with SYSV for x86; also see Anatomy of a Program in Memory : Gustavo Duarte for an image)
Using the gnuplot script below, I arrived at this image:
First (topmost) plot shows "File offset" of sections from objdump (starts from 0x0); middle plot shows "VMA" (virtual memory address) of sections from objdump and bottom plot shows layout from /proc/pid/maps - both of these starting from 0x08048000; all three plots show the same range.
Comparing topmost and middle plot, it seems that the sections are more-less translated "as is" from the executable file to the VMA addresses (apart from the end); such that the whole executable file (not just .text section) starts from 0x08048000.
But comparing middle and bottom plot, it seems that when a program is running in memory, then only .text is "pushed back" to 0x08048000 - and not only that, it now appears larger!
The only explanation I have so far, is what I read somewhere (but lost the link): that an image in memory would have to have allocated a whole number of pages (of size e.g. 4096 bytes), and start from a page boundary. The whole number of pages explains the larger size - but, given that all these are virtual addresses, why the need to "snap" them to a page boundary (could one not, just as well, map the virtual address as is to a physical page boundary?)
So - could someone provide an explanation so as to why /proc/pid/maps sees the .text section in a different virtual address region from objdump?
mem.gp gnuplot script:
#!/usr/bin/env gnuplot
set term wxt size 800,500
exec = "/usr/bin/tail" ;
# cannot do - apparently gnuplot waits for children to exit, so locks here:
#runcmd = "bash -c '" . exec . " -f & echo $!'"
#print runcmd
#pid = system(runcmd) ;
#print runcmd, "pid", pid
# run tail -f & echo $! in another shell; then enter pid here:
pid = 28803
# $1 Idx $2 Name $3 Size $4 VMA $5 LMA $6 File off
cmdvma = "<objdump -h ".exec." | awk '$1 ~ \"^[0-9]+$\" && $2 !~ \".gnu_debuglink\" {print $1, $2, \"0X\"$3, \"0X\"$4;}'" ;
cmdfo = "<objdump -h ".exec." | awk '$1 ~ \"^[0-9]+$\" && $2 !~ \".gnu_debuglink\" {print $1, $2, \"0X\"$3, \"0X\"$6;}'" ;
cmdmaps = "<cat /proc/".pid."/maps | awk '{split($1,a,\"-\");b1=strtonum(\"0x\"a[1]);b2=strtonum(\"0x\"a[2]);printf(\"%d \\\"%s\\\" 0x%08X 0x%08X\\n\", NR,$6,b2-b1,b1);}'"
print cmdvma
print cmdfo
print cmdmaps
set format x "0x%08X" # "%016X";
set xtics rotate by -45 font ",7";
unset ytics
unset colorbox
set cbrange [0:25]
set yrange [0.5:1.5]
set macros
set multiplot layout 3,1 columnsfirst
# 0x08056000-0x08048000 = 0xe000
set xrange [0:0xe000]
set tmargin at screen 1
set bmargin at screen 0.667+0.1
plot \
cmdfo using 4:(1+$0*0.01):4:($4+$3):0 with xerrorbars lc palette t "File off", \
cmdfo using 4:(1):2 with labels font ",6" left rotate by -45 t ""
set xrange [0x08048000:0x08056000]
set tmargin at screen 0.667
set bmargin at screen 0.333+0.1
plot \
cmdvma using 4:(1+$0*0.01):4:($4+$3):0 with xerrorbars lc palette t "VMA", \
cmdvma using 4:(1):2 with labels font ",6" left rotate by -45 t ""
set tmargin at screen 0.333
set bmargin at screen 0+0.1
plot \
cmdmaps using 4:(1+$0*0.01):4:($4+$3):0 with xerrorbars lc palette t "/proc/pid/maps" , \
cmdmaps using 4:(1):2 with labels font ",6" left rotate by -45 t ""
unset multiplot
#system("killall -9 " . pid) ;
The short answer is that loadable segments get mapped into memory based on the ELF program headers with type PT_LOAD.
PT_LOAD - The array element specifies a loadable segment, described by
p_filesz and p_memsz. The bytes from the file are mapped to the
beginning of the memory segment. If the segment's memory size
(p_memsz) is larger than the file size (p_filesz), the ``extra'' bytes
are defined to hold the value 0 and to follow the segment's
initialized area. The file size may not be larger than the memory
size. Loadable segment entries in the program header table appear in
ascending order, sorted on the p_vaddr member.
For example, on my CentOS 6.4:
objdump -x `which tail`
Program Header:
LOAD off 0x00000000 vaddr 0x08048000 paddr 0x08048000 align 2**12
filesz 0x0000e4d4 memsz 0x0000e4d4 flags r-x
LOAD off 0x0000e4d4 vaddr 0x080574d4 paddr 0x080574d4 align 2**12
filesz 0x000003b8 memsz 0x0000054c flags rw-
And from /proc/pid/maps:
cat /proc/2671/maps | grep `which tail`
08048000-08057000 r-xp 00000000 fd:00 133669 /usr/bin/tail
08057000-08058000 rw-p 0000e000 fd:00 133669 /usr/bin/tail
You will notice there is a difference between what maps and objdump says for the load address for subsequent sections, but that has to do with the loader accounting how much memory the section takes up as well as the alignment field. The first loadable segment is mapped in at 0x08048000 with a size of 0x0000e4d4, so you'd expect it to go from 0x08048000 to 0x080564d4, but the alignment says to align on 2^12 byte pages. If you do the math you end up at 0x8057000, matching /proc/pid/maps. So the second segment is mapped in at 0x8057000 and has a size of 0x0000054c (ending at 0x805754c), which is aligned to 0x8058000, matching /proc/pid/maps.
Thanks to the comment from #KerrekSB, I reread Understanding ELF using readelf and objdump - Linux article, and I think I sort of got it now (although it would be nice for someone else to confirm if its right).
Basically, the mistake is that the region 08048000-08054000 r-xp 00000000 08:05 131469 /usr/bin/tail from /proc/pid/maps does not start with .text section; and the missing link for knowing this is Program Header Table (PHT), as reported by readelf. Here is what it says for my tail:
$ readelf -l /usr/bin/tail
Elf file type is EXEC (Executable file)
Entry point 0x8049100
There are 9 program headers, starting at offset 52
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
[00] PHDR 0x000034 0x08048034 0x08048034 0x00120 0x00120 R E 0x4
[01] INTERP 0x000154 0x08048154 0x08048154 0x00013 0x00013 R 0x1
[Requesting program interpreter: /lib/ld-linux.so.2]
[02] LOAD 0x000000 0x08048000 0x08048000 0x0b9e8 0x0b9e8 R E 0x1000
[03] LOAD 0x00bf10 0x08054f10 0x08054f10 0x00220 0x003f0 RW 0x1000
[04] DYNAMIC 0x00bf24 0x08054f24 0x08054f24 0x000c8 0x000c8 RW 0x4
[05] NOTE 0x000168 0x08048168 0x08048168 0x00044 0x00044 R 0x4
[06] GNU_EH_FRAME 0x00b918 0x08053918 0x08053918 0x00024 0x00024 R 0x4
[07] GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW 0x4
[08] GNU_RELRO 0x00bf10 0x08054f10 0x08054f10 0x000f0 0x000f0 R 0x1
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .eh_frame_hdr .eh_frame
03 .ctors .dtors .jcr .dynamic .got .got.plt .data .bss
04 .dynamic
05 .note.ABI-tag .note.gnu.build-id
06 .eh_frame_hdr
07
08 .ctors .dtors .jcr .dynamic .got
I've added the [0x] line numbering in the "Program Headers:" section manually; otherwise it's hard to link it to Section to Segment mapping: below. Here also note: "Segment has many types, ... LOAD: The segment's content is loaded from the executable file. "Offset" denotes the offset of the file where the kernel should start reading the file's content. "FileSiz" tells us how many bytes must be read from the file. (Understanding ELF...)"
So, objdump tells us:
08049100 <.text>:
... that .text section starts at 0x08049100.
Then, readelf tells us:
[02] LOAD 0x000000 0x08048000 0x08048000 0x0b9e8 0x0b9e8 R E 0x1000
... that header/segment [02] is loaded from the executable file at offset zero into 0x08048000; and that this is marked R E - read and execute region of memory.
Further, readelf tells us:
02 .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .eh_frame_hdr .eh_frame
... meaning that the header/segment [02] contains many sections - among them, also the .text; this now matches with the objdump view that .text starts higher than 0x08048000.
Finally, /proc/pid/maps of the running program tells us:
08048000-08054000 r-xp 00000000 08:05 131469 /usr/bin/tail
... that the executable (r-xp) "section" of the executable file is loaded at 0x08048000 - and now it is easy to see that this "section", as I called it, is called wrong - it is not a section (as per objdump nomenclature); but it is actually a "header/segment", as readelf sees it (in particular, the header/segment [02] we saw earlier).
Well, hopefully I got this right ( and hopefully someone can confirm if I did so or not :) )
I'm compiling a c file foo.c:
#include <stdlib.h>
extern void *memcpy_optimized(void* __restrict, void* __restrict, size_t);
void foo() {
[blah blah blah]
memcpy_optimized((void *)a, (void *)b, 123);
}
then I have the assembly file memcpy_optimized.S:
.text
.fpu neon
.global memcpy_optimized
.type memcpy_optimized, %function
.align 4
memcpy_optimized:
.fnstart
mov ip, r0
cmp r2, #16
blt 4f # Have less than 16 bytes to copy
# First ensure 16 byte alignment for the destination buffer
tst r0, #0xF
beq 2f
tst r0, #1
ldrneb r3, [r1], #1
[blah blah blah]
.fnend
Both files compile fine with: gcc $< -o $# -c
but when I link the application with both resulting objects, I get the following error:
foo.c:(.text+0x380): undefined reference to `memcpy_optimized(void*, void *, unsigned int)'
Any idea what I'm doing wrong?
readelf -a obj/memcpy_optimized.o
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: REL (Relocatable file)
Machine: ARM
Version: 0x1
Entry point address: 0x0
Start of program headers: 0 (bytes into file)
Start of section headers: 436 (bytes into file)
Flags: 0x5000000, Version5 EABI
Size of this header: 52 (bytes)
Size of program headers: 0 (bytes)
Number of program headers: 0
Size of section headers: 40 (bytes)
Number of section headers: 11
Section header string table index: 8
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .text PROGBITS 00000000 000040 0000f0 00 AX 0 0 16
[ 2] .data PROGBITS 00000000 000130 000000 00 WA 0 0 1
[ 3] .bss NOBITS 00000000 000130 000000 00 WA 0 0 1
[ 4] .ARM.extab PROGBITS 00000000 000130 000000 00 A 0 0 1
[ 5] .ARM.exidx ARM_EXIDX 00000000 000130 000008 00 AL 1 0 4
[ 6] .rel.ARM.exidx REL 00000000 00044c 000010 08 9 5 4
[ 7] .ARM.attributes ARM_ATTRIBUTES 00000000 000138 000023 00 0 0 1
[ 8] .shstrtab STRTAB 00000000 00015b 000056 00 0 0 1
[ 9] .symtab SYMTAB 00000000 00036c 0000b0 10 10 9 4
[10] .strtab STRTAB 00000000 00041c 00002f 00 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings)
I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
O (extra OS processing required) o (OS specific), p (processor specific)
There are no section groups in this file.
There are no program headers in this file.
Relocation section '.rel.ARM.exidx' at offset 0x44c contains 2 entries:
Offset Info Type Sym.Value Sym. Name
00000000 0000012a R_ARM_PREL31 00000000 .text
00000000 00000a00 R_ARM_NONE 00000000 __aeabi_unwind_cpp_pr0
Unwind table index '.ARM.exidx' at offset 0x130 contains 1 entries:
0x0 <memcpy_optimized>: 0x80b0b0b0
Compact model 0
0xb0 finish
0xb0 finish
0xb0 finish
Symbol table '.symtab' contains 11 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 00000000 0 NOTYPE LOCAL DEFAULT UND
1: 00000000 0 SECTION LOCAL DEFAULT 1
2: 00000000 0 SECTION LOCAL DEFAULT 2
3: 00000000 0 SECTION LOCAL DEFAULT 3
4: 00000000 0 NOTYPE LOCAL DEFAULT 1 $a
5: 00000000 0 SECTION LOCAL DEFAULT 4
6: 00000000 0 SECTION LOCAL DEFAULT 5
7: 00000000 0 NOTYPE LOCAL DEFAULT 5 $d
8: 00000000 0 SECTION LOCAL DEFAULT 7
9: 00000000 0 FUNC GLOBAL DEFAULT 1 memcpy_optimized
10: 00000000 0 NOTYPE GLOBAL DEFAULT UND __aeabi_unwind_cpp_pr0
No version information found in this file.
Attribute Section: aeabi
File Attributes
Tag_CPU_name: "7-A"
Tag_CPU_arch: v7
Tag_CPU_arch_profile: Application
Tag_ARM_ISA_use: Yes
Tag_THUMB_ISA_use: Thumb-2
Tag_FP_arch: VFPv3
Tag_Advanced_SIMD_arch: NEONv1
Tag_DIV_use: Not allowed
It seems to me that you compiled your foo.c as C++, hence the linking error. What made me say that is that the linker reported the full prototype of the missing function. C functions do not have their full prototype as their symbol (just the name of function), however the C++ mangled names represent the full prototype of the function.
In many Unix and GCC C implementations, names in C are decorated with an initial underscore in object code. So, to call memcpy_optimized in C, you must use the name _memcpy_optimized in assembly.
I have a tool emitting an ELF, which as far as I can tell is compliant to the spec. Readelf output looks fine, but objdump refuses to disassemble anything.
I have simplified the input to a single global var, and "int main(void) { return 0;}" to aid debugging - the tiny section sizes are correct.
In particular, objdump seems unable to find the sections table:
$ arm-none-linux-gnueabi-readelf -S davidm.elf
There are 4 section headers, starting at offset 0x74:
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .text NULL ff000000 000034 00001c 00 AX 0 0 4
[ 2] .data NULL ff00001c 000050 000004 00 WA 0 0 4
[ 3] .shstrtab NULL 00000000 000114 000017 00 0 0 0
$ arm-none-linux-gnueabi-objdump -h davidm.elf
davidm.elf: file format elf32-littlearm
Sections:
Idx Name Size VMA LMA File off Algn
I also have another ELF, built from the exact same objects, only produced with regular toolchain use:
$ objdump -h kernel.elf
kernel.elf: file format elf32-littlearm
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 0000001c ff000000 ff000000 00008000 2**2
CONTENTS, ALLOC, LOAD, READONLY, CODE
1 .data 00000004 ff00001c ff00001c 0000801c 2**2
CONTENTS, ALLOC, LOAD, DATA
Even after I stripped .comment and .ARM.attributes sections (incase objdump requires them) from the 'known good' kernel.elf, it still happily lists the sections there, but not in my tool's davidm.elf.
I have confirmed the contents of the sections are identical between the two with readelf -x.
The only thing I can image is that the ELF file layout is different and breaks some expectations of BFD, which could explain why readelf (and my tool) processes it just fine but objdump has troubles.
Full readelf:
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: ARM
Version: 0x1
Entry point address: 0xff000000
Start of program headers: 84 (bytes into file)
Start of section headers: 116 (bytes into file)
Flags: 0x5000002, has entry point, Version5 EABI
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 1
Size of section headers: 40 (bytes)
Number of section headers: 4
Section header string table index: 3
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .text NULL ff000000 000034 00001c 00 AX 0 0 4
[ 2] .data NULL ff00001c 000050 000004 00 WA 0 0 4
[ 3] .shstrtab NULL 00000000 000114 000017 00 0 0 0
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings)
I (info), L (link order), G (group), x (unknown)
O (extra OS processing required) o (OS specific), p (processor specific)
There are no section groups in this file.
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
LOAD 0x000034 0xff000000 0xff000000 0x00020 0x00020 RWE 0x8000
Section to Segment mapping:
Segment Sections...
00 .text .data
There is no dynamic section in this file.
There are no relocations in this file.
There are no unwind sections in this file.
No version information found in this file.
Could the aggressive packing of the on-disk layout be causing troubles? Am I in violation of some bytestream alignment restrictions BFD expects, documented or otherwise?
Lastly - this file is not intended to be mmap'd into an address space, a loader will memcpy segment data into the desired location, so there is no requirement to play mmap-friendly file-alignment tricks. Keeping the ELF small is more important.
Cheers,
DavidM
EDIT: I was asked to upload the file, and/or provide 'objdump -x'. So I've done both:
davidm.elf
$ objdump -x davidm.elf
davidm.elf: file format elf32-littlearm
davidm.elf
architecture: arm, flags 0x00000002:
EXEC_P
start address 0xff000000
Program Header:
LOAD off 0x00000034 vaddr 0xff000000 paddr 0xff000000 align 2**15
filesz 0x00000020 memsz 0x00000020 flags rwx
private flags = 5000002: [Version5 EABI] [has entry point]
Sections:
Idx Name Size VMA LMA File off Algn
SYMBOL TABLE:
no symbols
OK - finally figured it out.
After building and annotating/debugging libbfd (function elf_object_p()) in the context of a little test app, I found why it was not matching on any of BFD supported targets.
I had bad sh_type flags for the section headers: NULL. Emitting STRTAB or PROGBITS (and eventually NOBITS when I get that far) as appropriate and objdump happily walks my image.
Not really surprising, in retrospect - I'm more annoyed I didn't catch this in comparing readelf outputs than anything else :(
Thanks for the help all :)