bash: ./<filename> No such file or directory - linker

I'm trying to run a binary, but when I run the file I get the following error.
pegasus@pegasus:~/Documents/Courses/heaplab-main/house_of_force$ ./house_of_force
bash: ./house_of_force: No such file or directory
pegasus@pegasus:~/Documents/Courses/heaplab-main/house_of_force$ ldd ./house_of_force
linux-vdso.so.1 (0x00007fff7c6da000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f3f879bd000)
../.glibc/glibc_2.28_no-tcache/ld.so.2 => /lib64/ld-linux-x86-64.so.2 (0x00007f3f87bf9000)
pegasus@pegasus:~/Documents/Courses/heaplab-main/house_of_force$ file ./house_of_force
./house_of_force: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter ../.glibc/glibc_2.28_no-tcache/ld.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=278a2aec8b352ea120c49321ed3254eb15ca8ef5, with debug_info, not stripped
pegasus@pegasus:~/Documents/Courses/heaplab-main/house_of_force$ readelf -l house_of_force
Elf file type is EXEC (Executable file)
Entry point 0x400730
There are 9 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
PHDR 0x0000000000000040 0x0000000000400040 0x0000000000400040
0x00000000000001f8 0x00000000000001f8 R 0x8
INTERP 0x0000000000000238 0x0000000000400238 0x0000000000400238
0x0000000000000027 0x0000000000000027 R 0x1
[Requesting program interpreter: ../.glibc/glibc_2.28_no-tcache/ld.so.2]
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x0000000000000d88 0x0000000000000d88 R E 0x200000
LOAD 0x0000000000001d70 0x0000000000601d70 0x0000000000601d70
0x00000000000002c0 0x00000000000002c8 RW 0x200000
DYNAMIC 0x0000000000001d80 0x0000000000601d80 0x0000000000601d80
0x0000000000000200 0x0000000000000200 RW 0x8
NOTE 0x0000000000000260 0x0000000000400260 0x0000000000400260
0x0000000000000044 0x0000000000000044 R 0x4
GNU_EH_FRAME 0x0000000000000c04 0x0000000000400c04 0x0000000000400c04
0x000000000000004c 0x000000000000004c R 0x4
GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 RW 0x10
GNU_RELRO 0x0000000000001d70 0x0000000000601d70 0x0000000000601d70
0x0000000000000290 0x0000000000000290 R 0x1
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .plt.got .text .fini .rodata .eh_frame_hdr .eh_frame
03 .init_array .fini_array .dynamic .got .data .bss
04 .dynamic
05 .note.ABI-tag .note.gnu.build-id
06 .eh_frame_hdr
07
08 .init_array .fini_array .dynamic .got
My System Details:
pegasus@pegasus:~/Documents/Courses/heaplab-main/house_of_force$ uname -a
Linux pegasus 5.15.0-58-generic #64-Ubuntu SMP Thu Jan 5 11:43:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
pegasus@pegasus:~/Documents/Courses/heaplab-main/house_of_force$ lsb_release -a
LSB Version: core-11.1.0ubuntu4-noarch:printing-11.1.0ubuntu4-noarch:security-11.1.0ubuntu4-noarch
Distributor ID: Ubuntu
Description: Ubuntu 22.04.1 LTS
Release: 22.04
Codename: jammy
I've already run chmod a+x house_of_force.
I suspect that some shared object or the interpreter is broken,
and I'm unable to link them properly.
How can I run the file properly using ./house_of_force?

The problem is that your binary is linked in a very weird way.
In particular, its interpreter is set to ../.glibc/glibc_2.28_no-tcache/ld.so.2, and this binary will run only when invoked from a directory in which ../.glibc/glibc_2.28_no-tcache/ld.so.2 exists.
Invoking this binary from any other directory will fail with ENOENT.
It is unlikely that that's what you want this binary to do. You'll need to fix your link line -- usually the interpreter is set to the absolute path to ld.so.
P.S. You probably want to link with this custom GLIBC build in order to solve some problem. But it's unlikely that linking with custom GLIBC is the right solution to whatever that problem is. See http://xyproblem.info.
Update:
It was a file given in linux heap exploitation course.
You should have explained this in your question.
Like the answer says, this binary will only run in a directory in which ../.glibc exists. If you have .glibc/ directory (containing glibc_2.28_no-tcache/ld.so.2), then do this:
cd .glibc
mkdir foo
mv /path/to/house_of_force foo
cd foo
./house_of_force
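To check this before moving files around, here is a minimal sketch in Python. It assumes a little-endian ELF64 file, like the one in the question. The function names read_interp and interp_resolves are made up for illustration and are not part of any tool. The first function pulls the PT_INTERP string out of the program headers, and the second tests whether that path exists when looked up from a given working directory, which is the lookup that fails with ENOENT above.

```python
import os
import struct

PT_INTERP = 3  # program header type that names the dynamic loader


def read_interp(elf_bytes):
    """Return the PT_INTERP path of a little-endian ELF64 image, or None."""
    if elf_bytes[:4] != b"\x7fELF" or elf_bytes[4] != 2 or elf_bytes[5] != 1:
        raise ValueError("not a little-endian ELF64 image")
    # ELF64 header: e_phoff at 0x20, e_phentsize at 0x36, e_phnum at 0x38.
    (e_phoff,) = struct.unpack_from("<Q", elf_bytes, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf_bytes, 0x36)
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        # Elf64_Phdr: p_type, p_flags, p_offset, then p_filesz at +0x20.
        p_type, _p_flags, p_offset = struct.unpack_from("<IIQ", elf_bytes, base)
        (p_filesz,) = struct.unpack_from("<Q", elf_bytes, base + 0x20)
        if p_type == PT_INTERP:
            raw = elf_bytes[p_offset:p_offset + p_filesz]
            return raw.rstrip(b"\0").decode()
    return None


def interp_resolves(interp, cwd):
    """Check whether interp exists when looked up from cwd.

    A relative PT_INTERP is resolved like any other relative path: against
    the current directory of the process that calls exec, not against the
    directory that holds the binary.
    """
    return os.path.exists(os.path.join(cwd, interp))
```

To use it, read the binary with open(path, "rb").read(), pass the bytes to read_interp, and pass the result together with os.getcwd() to interp_resolves. If that returns False, running ./house_of_force from that directory should fail in the way shown in the question.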

Related

What's the meaning of 2.2.5 in memcpy@GLIBC_2.2.5? [duplicate]

Currently I'm in a directory which has a file libshared-object.so (name changed for generality).
When I run
$ objdump -p libshared-object.so
I receive the following output:
libshared-object.so: file format elf64-x86-64
Program Header:
LOAD off 0x0000000000000000 vaddr 0x0000000000000000 paddr 0x0000000000000000 align 2**21
filesz 0x00000000000828ee memsz 0x00000000000828ee flags r-x
LOAD off 0x0000000000083768 vaddr 0x0000000000283768 paddr 0x0000000000283768 align 2**21
filesz 0x00000000000048e0 memsz 0x0000000000004af0 flags rw-
DYNAMIC off 0x0000000000084af0 vaddr 0x0000000000284af0 paddr 0x0000000000284af0 align 2**3
filesz 0x00000000000002a0 memsz 0x00000000000002a0 flags rw-
NOTE off 0x00000000000001c8 vaddr 0x00000000000001c8 paddr 0x00000000000001c8 align 2**2
filesz 0x0000000000000024 memsz 0x0000000000000024 flags r--
EH_FRAME off 0x0000000000072c6c vaddr 0x0000000000072c6c paddr 0x0000000000072c6c align 2**2
filesz 0x0000000000002ed4 memsz 0x0000000000002ed4 flags r--
STACK off 0x0000000000000000 vaddr 0x0000000000000000 paddr 0x0000000000000000 align 2**4
filesz 0x0000000000000000 memsz 0x0000000000000000 flags rw-
RELRO off 0x0000000000083768 vaddr 0x0000000000283768 paddr 0x0000000000283768 align 2**0
filesz 0x0000000000001898 memsz 0x0000000000001898 flags r--
Dynamic Section:
NEEDED libQt5Widgets.so.5
NEEDED libQt5Compositor.so.5
NEEDED libQt5Quick.so.5
NEEDED libQt5Qml.so.5
NEEDED libQt5Network.so.5
NEEDED libQt5Gui.so.5
NEEDED libQt5Core.so.5
NEEDED libGL.so.1
NEEDED libpthread.so.0
NEEDED libstdc++.so.6
NEEDED libm.so.6
NEEDED libgcc_s.so.1
NEEDED libc.so.6
SONAME libshared-object.so.1
RPATH /opt/qt5/lib
INIT 0x000000000003fc68
FINI 0x000000000006c234
INIT_ARRAY 0x0000000000283768
INIT_ARRAYSZ 0x00000000000000e8
FINI_ARRAY 0x0000000000283850
FINI_ARRAYSZ 0x0000000000000008
GNU_HASH 0x00000000000001f0
STRTAB 0x00000000000101e8
SYMTAB 0x00000000000036d8
STRSZ 0x0000000000022072
SYMENT 0x0000000000000018
PLTGOT 0x0000000000285000
PLTRELSZ 0x0000000000008df0
PLTREL 0x0000000000000007
JMPREL 0x0000000000036e78
RELA 0x0000000000033458
RELASZ 0x0000000000003a20
RELAENT 0x0000000000000018
VERNEED 0x0000000000033348
VERNEEDNUM 0x0000000000000006
VERSYM 0x000000000003225a
RELACOUNT 0x0000000000000052
Version References:
required from libm.so.6:
0x09691a75 0x00 09 GLIBC_2.2.5
required from libgcc_s.so.1:
0x0b792650 0x00 08 GCC_3.0
required from libc.so.6:
0x06969194 0x00 10 GLIBC_2.14
0x09691a75 0x00 07 GLIBC_2.2.5
required from libQt5Core.so.5:
0x00058a25 0x00 06 Qt_5
required from libQt5Gui.so.5:
0x0dcbd2c9 0x00 12 Qt_5_PRIVATE_API
0x00058a25 0x00 03 Qt_5
required from libstdc++.so.6:
0x0bafd178 0x00 11 CXXABI_1.3.8
0x056bafd3 0x00 05 CXXABI_1.3
0x0297f871 0x00 04 GLIBCXX_3.4.21
0x08922974 0x00 02 GLIBCXX_3.4
Of particular interest is the very last of this information, the Version References:
Version References:
required from libm.so.6:
0x09691a75 0x00 09 GLIBC_2.2.5
required from libgcc_s.so.1:
0x0b792650 0x00 08 GCC_3.0
required from libc.so.6:
0x06969194 0x00 10 GLIBC_2.14
0x09691a75 0x00 07 GLIBC_2.2.5
required from libQt5Core.so.5:
0x00058a25 0x00 06 Qt_5
required from libQt5Gui.so.5:
0x0dcbd2c9 0x00 12 Qt_5_PRIVATE_API
0x00058a25 0x00 03 Qt_5
required from libstdc++.so.6:
0x0bafd178 0x00 11 CXXABI_1.3.8
0x056bafd3 0x00 05 CXXABI_1.3
0x0297f871 0x00 04 GLIBCXX_3.4.21
0x08922974 0x00 02 GLIBCXX_3.4
Question: Where do these version references come from? Take, for example, the line required from libQt5Gui.so.5: .. Qt_5 and Qt_5_PRIVATE_API.
Are references to version Qt_5 and Qt_5_PRIVATE_API coming from the C code that generated libQt5Gui.so.5? Or from some linker option passed to gcc or ld? Or from something else?
Or from something else?
From something else.
When you build a shared library (say libfoo.so), you can (though don't have to) supply a linker version script giving certain symbols a version tag.
When you later link an executable or a shared library (say libbar.so) against libfoo.so, iff you use a versioned symbol, the version tag of that symbol is recorded in libbar.so (that is what you observed in your question).
This setup allows libfoo.so to change its symbols in ABI-incompatible way, and still support old client programs that were linked against the old symbols.
For example, libc.so.6 on x86_64 has the following versions of memcpy:
0000000000091620 g iD .text 000000000000003d GLIBC_2.14 memcpy
000000000008c420 g iD .text 0000000000000047 (GLIBC_2.2.5) memcpy
Programs that were linked against glibc-2.13 or older will use the GLIBC_2.2.5 version, programs that were linked against glibc-2.14 or newer will use the GLIBC_2.14 version.
If you try to run a program linked against glibc-2.14 on a system with glibc-2.13, you will get an error (missing symbol version), similar to this.
Before the introduction of symbol versioning, changing the ABI of an existing symbol required that you ship an entirely separate library. This is called external library versioning. You can read more about it here.
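As a small illustration of how the two memcpy entries differ, here is a sketch in Python that parses lines in the shape shown above and prints them in the usual name@VERSION notation. It assumes the listing comes from objdump -T, where, as far as I know, a version in parentheses marks a non-default version that only already-linked binaries can bind to, while the unparenthesized one is the default that new links pick up (conventionally written with @@). The helper name parse_dynsym_line is made up for this example.

```python
def parse_dynsym_line(line):
    """Split one dynamic-symbol line into (name, version, is_default)."""
    fields = line.split()
    name, version = fields[-1], fields[-2]
    # Assumption: parentheses around the version mark a non-default version.
    hidden = version.startswith("(") and version.endswith(")")
    return name, version.strip("()"), not hidden


lines = [
    "0000000000091620 g iD .text 000000000000003d GLIBC_2.14 memcpy",
    "000000000008c420 g iD .text 0000000000000047 (GLIBC_2.2.5) memcpy",
]

for line in lines:
    name, version, is_default = parse_dynsym_line(line)
    # "@@" is the conventional spelling for the default version of a symbol.
    separator = "@@" if is_default else "@"
    print(f"{name}{separator}{version}")
# prints:
# memcpy@@GLIBC_2.14
# memcpy@GLIBC_2.2.5
```

A program linked today records a reference to memcpy with GLIBC_2.14, which is why that tag shows up under Version References in the library it was linked into.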

Why does a simple C program consume a lot of disk space? [duplicate]

I am on a quest to understand low-level computing. I have noticed my compiled binaries are a lot bigger than I think they should be. So I tried to build the smallest possible C program without any stdlib code as follows:
void _start()
{
    while(1) {};
}
gcc -nostdlib -o minimal minimal.c
When I disassemble the binary, it shows me exactly what I expect, namely this exact code in three lines of assembly.
$ objdump -d minimal
minimal: file format elf64-x86-64
Disassembly of section .text:
0000000000001000 <_start>:
1000: 55 push %rbp
1001: 48 89 e5 mov %rsp,%rbp
1004: eb fe jmp 1004 <_start+0x4>
But my actual executable is still 13856 Bytes in size. What is it, that makes this so large? What else is in that file? Does the OS need more than these 6 Bytes of machine code?
Edit #1:
The output of size is:
$ size -A minimal
minimal :
section size addr
.interp 28 680
.note.gnu.build-id 36 708
.gnu.hash 28 744
.dynsym 24 776
.dynstr 1 800
.text 6 4096
.eh_frame_hdr 20 8192
.eh_frame 52 8216
.dynamic 208 16176
.comment 18 0
Total 421
Modern compilers and linkers aren't really optimized for producing ultra-small code on full-scale platforms. Not because the job is difficult, but because there's usually no need to. It isn't necessarily that the compiler or linker adds additional code (although it might), but rather that it won't try hard to pack your data and code into the smallest possible space.
In your case, I note that you're using dynamic linking, even though nothing is actually linked. Using "-static" will shave off about 8kB. "-s" (strip) will get rid of a bit more.
I don't know if it's even possible with gcc to make a truly minimal ELF executable. In your case, that ought to be about 400 bytes, nearly all of which will be the various ELF headers, section table, etc.
I don't know if I'm allowed to link my own website (I'm sure somebody will put me right if not), but I have an article on producing a tiny ELF executable by building it from scratch in binary:
http://kevinboone.me/elfdemo.html
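To put rough numbers on this, the sketch below (Python, using only the figures quoted in the question) adds up the section sizes from the size -A output and compares the total with the 13856-byte file. Whatever is left over is not section contents. It would be the ELF header, the program and section header tables, sections that size -A does not list, and alignment padding. The breakdown of that remainder is not measured here, so treat it as an assumption.

```python
# Section sizes in bytes, copied from the `size -A minimal` output in the question.
sections = {
    ".interp": 28,
    ".note.gnu.build-id": 36,
    ".gnu.hash": 28,
    ".dynsym": 24,
    ".dynstr": 1,
    ".text": 6,
    ".eh_frame_hdr": 20,
    ".eh_frame": 52,
    ".dynamic": 208,
    ".comment": 18,
}
FILE_SIZE = 13856  # size on disk reported in the question

content = sum(sections.values())
overhead = FILE_SIZE - content

print(f"section contents: {content} bytes")
print(f"everything else:  {overhead} bytes")
print(f"share of the file that is actual code: {sections['.text'] / FILE_SIZE:.3%}")
```

The section contents add up to 421 bytes, matching the Total line, so 13435 bytes of the file are something other than section data. In the same output, .text sits at address 4096 and .eh_frame_hdr at 8192, both on 4096-byte boundaries, which is consistent with padding between the pieces making up much of the gap.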
There are many different executable file formats: .com, .exe, .elf, .coff, a.out, etc. Ideally they contain the machine code and other sections (.text (code), .data, .bss, .rodata and possibly others; the names depend on the toolchain), plus debugging information. Notice how your disassembly showed the label _start? That is a string, one among others, stored along with the information needed to connect it to an address for debugging. The output of objdump also showed that you are using an ELF file. You can easily look up the file format and write your own program to parse the file, or use readelf and other tools to see what is in there (at a high level, not raw).
On an operating system, in general (not always, but think PC), programs are loaded into RAM and then run, so first and foremost you want a file format that the operating system supports. There is no reason for an OS to support more than one format, but it might. It depends on the OS/system design, but the OS may be designed not only to load the code but also to load/initialize the data (.data, .bss). When booting, say, an MCU, you need to embed the data in the binary blob, and the application itself copies the data from flash to RAM. In an OS that isn't necessarily required, but for the OS to do the loading you need a file format that can distinguish the sections, their target locations, and their sizes. That means extra bytes in the file to describe all this.
A binary normally also includes bootstrap code that runs before the C-generated code is entered. What that code is depends on the system and on the C library (several C libraries can be used on one computer, and the bootstrap is in general specific to the C library, not to the target, the operating system, or the compiler). So some percentage of the file is bootstrap code, and when your main program is very tiny, a lot of the file size is overhead.
You can, for example, use strip to make the file smaller by getting rid of symbols and other non-essential items. The file size should shrink, but the objdump disassembly will then have no labels. For x86, a variable-length instruction set that is difficult to disassemble at best, this makes things much harder: the output may not reflect the actual instructions with or without labels, but without labels the GNU disassembler cannot resynchronize at them, which can make the output worse.
If you use clang 10.0 and lld 10.0 and strip out unnecessary sections you can get the size of a 64-bit statically linked executable to under 800 bytes.
$ cat minimal.c
void _start(void)
{
    int i = 0;
    while (i < 11) {
        i++;
    }
    asm( "int $0x80" :: "a"(1), "b"(i) );
}
$ clang -static -nostdlib -flto -fuse-ld=lld -o minimal minimal.c
$ ls -l minimal
-rwxrwxr-x 1 fpm fpm 1376 Sep 4 17:38 minimal
$ readelf --string-dump .comment minimal
String dump of section '.comment':
[ 0] Linker: LLD 10.0.0
[ 13] clang version 10.0.0 (Fedora 10.0.0-2.fc32)
$ readelf -W --section-headers minimal
There are 9 section headers, starting at offset 0x320:
Section Headers:
[Nr] Name Type Address Off Size ES Flg Lk Inf Al
[ 0] NULL 0000000000000000 000000 000000 00 0 0 0
[ 1] .note.gnu.build-id NOTE 0000000000200190 000190 000018 00 A 0 0 4
[ 2] .eh_frame_hdr PROGBITS 00000000002001a8 0001a8 000014 00 A 0 0 4
[ 3] .eh_frame PROGBITS 00000000002001c0 0001c0 00003c 00 A 0 0 8
[ 4] .text PROGBITS 0000000000201200 000200 00002a 00 AX 0 0 16
[ 5] .comment PROGBITS 0000000000000000 00022a 000040 01 MS 0 0 1
[ 6] .symtab SYMTAB 0000000000000000 000270 000048 18 8 2 8
[ 7] .shstrtab STRTAB 0000000000000000 0002b8 000055 00 0 0 1
[ 8] .strtab STRTAB 0000000000000000 00030d 000012 00 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
L (link order), O (extra OS processing required), G (group), T (TLS),
C (compressed), x (unknown), o (OS specific), E (exclude),
l (large), p (processor specific)
$ strip -R .eh_frame_hdr -R .eh_frame minimal
$ strip -R .comment -R .note.gnu.build-id minimal
strip: minimal: warning: empty loadable segment detected at vaddr=0x200000, is this intentional?
$ readelf -W --section-headers minimal
There are 3 section headers, starting at offset 0x240:
Section Headers:
[Nr] Name Type Address Off Size ES Flg Lk Inf Al
[ 0] NULL 0000000000000000 000000 000000 00 0 0 0
[ 1] .text PROGBITS 0000000000201200 000200 00002a 00 AX 0 0 16
[ 2] .shstrtab STRTAB 0000000000000000 00022a 000011 00 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
L (link order), O (extra OS processing required), G (group), T (TLS),
C (compressed), x (unknown), o (OS specific), E (exclude),
l (large), p (processor specific)
$ ll minimal
-rwxrwxr-x 1 fpm fpm 768 Sep 4 17:45 minimal

How to reverse a stripped ELF compiled -static?

I'm reversing an x64 ELF file which was compiled with -static, and it seems to be stripped.
So there are no symbols when I run it in GDB.
Can I recover the symbols, or do something to make it easier?
objdump -tT 6dd
6dd: file format elf64-x86-64
objdump: 6dd: not a dynamic object
SYMBOL TABLE:
no symbols
DYNAMIC SYMBOL TABLE:
no symbols
Entry point: 0x401058
0x0000000000400190 - 0x00000000004001b0 is .note.ABI-tag
0x00000000004001b0 - 0x00000000004001d4 is .note.gnu.build-id
0x00000000004001d8 - 0x00000000004002f8 is .rela.plt
0x00000000004002f8 - 0x0000000000400310 is .init
0x0000000000400310 - 0x00000000004003d0 is .plt
0x00000000004003d0 - 0x00000000004f1b78 is .text
0x00000000004f1b80 - 0x00000000004f339c is __libc_freeres_fn
0x00000000004f33a0 - 0x00000000004f3448 is __libc_thread_freeres_fn
0x00000000004f3448 - 0x00000000004f3456 is .fini
0x00000000004f3460 - 0x0000000000511424 is .rodata
0x0000000000511428 - 0x0000000000511430 is __libc_atexit
0x0000000000511430 - 0x0000000000511488 is __libc_subfreeres
0x0000000000511488 - 0x0000000000511490 is __libc_thread_subfreeres
0x0000000000511490 - 0x0000000000527174 is .eh_frame
0x0000000000527174 - 0x00000000005272f6 is .gcc_except_table
0x0000000000727ef0 - 0x0000000000727f10 is .tdata
0x0000000000727f10 - 0x0000000000727f48 is .tbss
0x0000000000727f10 - 0x0000000000727f18 is .init_array
0x0000000000727f18 - 0x0000000000727f20 is .fini_array
0x0000000000727f20 - 0x0000000000727f30 is .ctors
0x0000000000727f30 - 0x0000000000727f40 is .dtors
0x0000000000727f40 - 0x0000000000727f48 is .jcr
0x0000000000727f50 - 0x0000000000727fd0 is .data.rel.ro
0x0000000000727fd0 - 0x0000000000727fe0 is .got
0x0000000000727fe8 - 0x0000000000728060 is .got.plt
0x0000000000728060 - 0x0000000000729890 is .data
0x00000000007298a0 - 0x000000000072cbe8 is .bss
0x000000000072cbf0 - 0x000000000072cc38 is __libc_freeres_ptrs
Short answer: No.
You need to get hold of the source and recompile to generate a debugging binary.

Virtual memory addresses of objdump vs /proc/pid/maps?

I'm trying to understand where exactly does the executable assembly of a program end up, when a program is loaded/running. I found two resources talking about this, but they are somewhat difficult to read:
Understanding ELF using readelf and objdump Linux article (code formatting is messed up)
Michael Guyver, Some Assembly Required*: Relocations, Relocations (lots of assembly which I'm not exactly proficient in)
So, here's a brief example; I'm interested where does the executable section of the tail program end up. Basically, objdump tells me this:
$ objdump -dj .text /usr/bin/tail | head -10
/usr/bin/tail: file format elf32-i386
Disassembly of section .text:
08049100 <.text>:
8049100: 31 ed xor %ebp,%ebp
8049102: 5e pop %esi
8049103: 89 e1 mov %esp,%ecx
...
I'm assuming I'd see calls to tail's 'main()' be made here, had symbols not been stripped. Anyways, the start of the executable section is, according to this, 0x08049100; I'm interested in where it ends up eventually.
Then, I run tail in the background, getting its pid:
$ /usr/bin/tail -f & echo $!
28803
... and I inspect its /proc/pid/maps:
$ cat /proc/28803/maps
00547000-006a8000 r-xp 00000000 08:05 3506 /lib/i386-linux-gnu/libc-2.13.so
...
008c6000-008c7000 r-xp 00000000 00:00 0 [vdso]
08048000-08054000 r-xp 00000000 08:05 131469 /usr/bin/tail
08054000-08055000 r--p 0000b000 08:05 131469 /usr/bin/tail
08055000-08056000 rw-p 0000c000 08:05 131469 /usr/bin/tail
08af1000-08b12000 rw-p 00000000 00:00 0 [heap]
b76de000-b78de000 r--p 00000000 08:05 139793 /usr/lib/locale/locale-archive
...
bf845000-bf866000 rw-p 00000000 00:00 0 [stack]
Now I have tail three times - but the executable segment r-xp (which is the .text?) is apparently at 0x08048000 (an address that apparently was standardized back with SYSV for x86; also see Anatomy of a Program in Memory : Gustavo Duarte for an image)
Using the gnuplot script below, I arrived at this image:
First (topmost) plot shows "File offset" of sections from objdump (starts from 0x0); middle plot shows "VMA" (virtual memory address) of sections from objdump and bottom plot shows layout from /proc/pid/maps - both of these starting from 0x08048000; all three plots show the same range.
Comparing the topmost and middle plots, it seems that the sections are translated more or less "as is" from the executable file to the VMA addresses (apart from the end), such that the whole executable file (not just the .text section) starts from 0x08048000.
But comparing middle and bottom plot, it seems that when a program is running in memory, then only .text is "pushed back" to 0x08048000 - and not only that, it now appears larger!
The only explanation I have so far, is what I read somewhere (but lost the link): that an image in memory would have to have allocated a whole number of pages (of size e.g. 4096 bytes), and start from a page boundary. The whole number of pages explains the larger size - but, given that all these are virtual addresses, why the need to "snap" them to a page boundary (could one not, just as well, map the virtual address as is to a physical page boundary?)
So - could someone provide an explanation so as to why /proc/pid/maps sees the .text section in a different virtual address region from objdump?
mem.gp gnuplot script:
#!/usr/bin/env gnuplot
set term wxt size 800,500
exec = "/usr/bin/tail" ;
# cannot do - apparently gnuplot waits for children to exit, so locks here:
#runcmd = "bash -c '" . exec . " -f & echo $!'"
#print runcmd
#pid = system(runcmd) ;
#print runcmd, "pid", pid
# run tail -f & echo $! in another shell; then enter pid here:
pid = 28803
# $1 Idx $2 Name $3 Size $4 VMA $5 LMA $6 File off
cmdvma = "<objdump -h ".exec." | awk '$1 ~ \"^[0-9]+$\" && $2 !~ \".gnu_debuglink\" {print $1, $2, \"0X\"$3, \"0X\"$4;}'" ;
cmdfo = "<objdump -h ".exec." | awk '$1 ~ \"^[0-9]+$\" && $2 !~ \".gnu_debuglink\" {print $1, $2, \"0X\"$3, \"0X\"$6;}'" ;
cmdmaps = "<cat /proc/".pid."/maps | awk '{split($1,a,\"-\");b1=strtonum(\"0x\"a[1]);b2=strtonum(\"0x\"a[2]);printf(\"%d \\\"%s\\\" 0x%08X 0x%08X\\n\", NR,$6,b2-b1,b1);}'"
print cmdvma
print cmdfo
print cmdmaps
set format x "0x%08X" # "%016X";
set xtics rotate by -45 font ",7";
unset ytics
unset colorbox
set cbrange [0:25]
set yrange [0.5:1.5]
set macros
set multiplot layout 3,1 columnsfirst
# 0x08056000-0x08048000 = 0xe000
set xrange [0:0xe000]
set tmargin at screen 1
set bmargin at screen 0.667+0.1
plot \
cmdfo using 4:(1+$0*0.01):4:($4+$3):0 with xerrorbars lc palette t "File off", \
cmdfo using 4:(1):2 with labels font ",6" left rotate by -45 t ""
set xrange [0x08048000:0x08056000]
set tmargin at screen 0.667
set bmargin at screen 0.333+0.1
plot \
cmdvma using 4:(1+$0*0.01):4:($4+$3):0 with xerrorbars lc palette t "VMA", \
cmdvma using 4:(1):2 with labels font ",6" left rotate by -45 t ""
set tmargin at screen 0.333
set bmargin at screen 0+0.1
plot \
cmdmaps using 4:(1+$0*0.01):4:($4+$3):0 with xerrorbars lc palette t "/proc/pid/maps" , \
cmdmaps using 4:(1):2 with labels font ",6" left rotate by -45 t ""
unset multiplot
#system("killall -9 " . pid) ;
The short answer is that loadable segments get mapped into memory based on the ELF program headers with type PT_LOAD.
PT_LOAD - The array element specifies a loadable segment, described by
p_filesz and p_memsz. The bytes from the file are mapped to the
beginning of the memory segment. If the segment's memory size
(p_memsz) is larger than the file size (p_filesz), the ``extra'' bytes
are defined to hold the value 0 and to follow the segment's
initialized area. The file size may not be larger than the memory
size. Loadable segment entries in the program header table appear in
ascending order, sorted on the p_vaddr member.
For example, on my CentOS 6.4:
objdump -x `which tail`
Program Header:
LOAD off 0x00000000 vaddr 0x08048000 paddr 0x08048000 align 2**12
filesz 0x0000e4d4 memsz 0x0000e4d4 flags r-x
LOAD off 0x0000e4d4 vaddr 0x080574d4 paddr 0x080574d4 align 2**12
filesz 0x000003b8 memsz 0x0000054c flags rw-
And from /proc/pid/maps:
cat /proc/2671/maps | grep `which tail`
08048000-08057000 r-xp 00000000 fd:00 133669 /usr/bin/tail
08057000-08058000 rw-p 0000e000 fd:00 133669 /usr/bin/tail
You will notice that the addresses in maps differ from the load addresses objdump reports for the segments. That comes from the loader accounting for how much memory each segment takes up as well as for the alignment field. The first loadable segment is mapped in at 0x08048000 with a size of 0x0000e4d4, so you'd expect it to go from 0x08048000 to 0x080564d4, but the alignment says to align on 2^12 byte pages. Rounding the end up to the next page boundary gives 0x8057000, matching /proc/pid/maps. The second segment has vaddr 0x080574d4, which rounds down to the page boundary 0x8057000, and a memory size of 0x0000054c, so it ends at 0x8057a20, which rounds up to 0x8058000, again matching /proc/pid/maps.
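The page arithmetic can be checked with a few lines of Python. This is only a sketch of the rounding, assuming 4 KiB pages (the align 2**12 in the program headers). It ignores details such as a later mprotect splitting a mapping, and the function names are made up for the example. The start address and file offset are rounded down to a page boundary, and the end address is rounded up.

```python
PAGE = 0x1000  # assumption: 4 KiB pages, matching "align 2**12"


def page_down(value):
    """Round down to the start of the containing page."""
    return value & ~(PAGE - 1)


def page_up(value):
    """Round up to the next page boundary (unchanged if already aligned)."""
    return (value + PAGE - 1) & ~(PAGE - 1)


def expected_mapping(offset, vaddr, memsz):
    """Return (start, end, file offset) as /proc/pid/maps would show them."""
    return page_down(vaddr), page_up(vaddr + memsz), page_down(offset)


# (file offset, vaddr, memsz) of the two LOAD headers from the objdump output.
segments = [
    (0x00000000, 0x08048000, 0x0000E4D4),
    (0x0000E4D4, 0x080574D4, 0x0000054C),
]

for offset, vaddr, memsz in segments:
    start, end, file_off = expected_mapping(offset, vaddr, memsz)
    print(f"{start:08x}-{end:08x} {file_off:08x}")
# prints:
# 08048000-08057000 00000000
# 08057000-08058000 0000e000
```

Both lines agree with the address ranges and the offset column of the /proc/2671/maps output above.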
Thanks to the comment from @KerrekSB, I reread Understanding ELF using readelf and objdump - Linux article, and I think I sort of got it now (although it would be nice for someone else to confirm if it's right).
Basically, the mistake is that the region 08048000-08054000 r-xp 00000000 08:05 131469 /usr/bin/tail from /proc/pid/maps does not start with .text section; and the missing link for knowing this is Program Header Table (PHT), as reported by readelf. Here is what it says for my tail:
$ readelf -l /usr/bin/tail
Elf file type is EXEC (Executable file)
Entry point 0x8049100
There are 9 program headers, starting at offset 52
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
[00] PHDR 0x000034 0x08048034 0x08048034 0x00120 0x00120 R E 0x4
[01] INTERP 0x000154 0x08048154 0x08048154 0x00013 0x00013 R 0x1
[Requesting program interpreter: /lib/ld-linux.so.2]
[02] LOAD 0x000000 0x08048000 0x08048000 0x0b9e8 0x0b9e8 R E 0x1000
[03] LOAD 0x00bf10 0x08054f10 0x08054f10 0x00220 0x003f0 RW 0x1000
[04] DYNAMIC 0x00bf24 0x08054f24 0x08054f24 0x000c8 0x000c8 RW 0x4
[05] NOTE 0x000168 0x08048168 0x08048168 0x00044 0x00044 R 0x4
[06] GNU_EH_FRAME 0x00b918 0x08053918 0x08053918 0x00024 0x00024 R 0x4
[07] GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW 0x4
[08] GNU_RELRO 0x00bf10 0x08054f10 0x08054f10 0x000f0 0x000f0 R 0x1
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .eh_frame_hdr .eh_frame
03 .ctors .dtors .jcr .dynamic .got .got.plt .data .bss
04 .dynamic
05 .note.ABI-tag .note.gnu.build-id
06 .eh_frame_hdr
07
08 .ctors .dtors .jcr .dynamic .got
I've added the [0x] line numbering in the "Program Headers:" section manually; otherwise it's hard to link it to Section to Segment mapping: below. Here also note: "Segment has many types, ... LOAD: The segment's content is loaded from the executable file. "Offset" denotes the offset of the file where the kernel should start reading the file's content. "FileSiz" tells us how many bytes must be read from the file. (Understanding ELF...)"
So, objdump tells us:
08049100 <.text>:
... that .text section starts at 0x08049100.
Then, readelf tells us:
[02] LOAD 0x000000 0x08048000 0x08048000 0x0b9e8 0x0b9e8 R E 0x1000
... that header/segment [02] is loaded from the executable file at offset zero into 0x08048000; and that this is marked R E - read and execute region of memory.
Further, readelf tells us:
02 .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .eh_frame_hdr .eh_frame
... meaning that the header/segment [02] contains many sections - among them, also the .text; this now matches with the objdump view that .text starts higher than 0x08048000.
Finally, /proc/pid/maps of the running program tells us:
08048000-08054000 r-xp 00000000 08:05 131469 /usr/bin/tail
... that the executable (r-xp) "section" of the executable file is loaded at 0x08048000 - and now it is easy to see that this "section", as I called it, is called wrong - it is not a section (as per objdump nomenclature); but it is actually a "header/segment", as readelf sees it (in particular, the header/segment [02] we saw earlier).
Well, hopefully I got this right ( and hopefully someone can confirm if I did so or not :) )
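The containment argument can be spelled out as a quick check. The Python sketch below uses only the two LOAD entries from the readelf output above. The Load tuple and the segment_for function are hypothetical names for this example. It looks up which segment holds the .text start address reported by objdump.

```python
from typing import NamedTuple, Optional


class Load(NamedTuple):
    index: int   # the manually added [0x] number from the readelf listing
    vaddr: int
    memsz: int
    flags: str


# LOAD entries [02] and [03] copied from `readelf -l /usr/bin/tail` above.
loads = [
    Load(2, 0x08048000, 0x0B9E8, "R E"),
    Load(3, 0x08054F10, 0x003F0, "RW"),
]


def segment_for(addr: int) -> Optional[Load]:
    """Return the LOAD segment whose address range contains addr, if any."""
    for seg in loads:
        if seg.vaddr <= addr < seg.vaddr + seg.memsz:
            return seg
    return None


text_start = 0x08049100  # start of .text according to objdump
seg = segment_for(text_start)
print(f".text is in segment [{seg.index:02d}] ({seg.flags}), "
      f"{text_start - seg.vaddr:#x} bytes past its start")
# prints: .text is in segment [02] (R E), 0x1100 bytes past its start
```

So the r-xp mapping starting at 0x08048000 is segment [02], and .text begins 0x1100 bytes into it, after .interp, the notes, the dynamic symbol tables and the other sections listed before it.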

objdump won't show my ELF sections

I have a tool emitting an ELF, which as far as I can tell is compliant with the spec. Readelf output looks fine, but objdump refuses to disassemble anything.
I have simplified the input to a single global var, and "int main(void) { return 0;}" to aid debugging - the tiny section sizes are correct.
In particular, objdump seems unable to find the sections table:
$ arm-none-linux-gnueabi-readelf -S davidm.elf
There are 4 section headers, starting at offset 0x74:
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .text NULL ff000000 000034 00001c 00 AX 0 0 4
[ 2] .data NULL ff00001c 000050 000004 00 WA 0 0 4
[ 3] .shstrtab NULL 00000000 000114 000017 00 0 0 0
$ arm-none-linux-gnueabi-objdump -h davidm.elf
davidm.elf: file format elf32-littlearm
Sections:
Idx Name Size VMA LMA File off Algn
I also have another ELF, built from the exact same objects, only produced with regular toolchain use:
$ objdump -h kernel.elf
kernel.elf: file format elf32-littlearm
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 0000001c ff000000 ff000000 00008000 2**2
CONTENTS, ALLOC, LOAD, READONLY, CODE
1 .data 00000004 ff00001c ff00001c 0000801c 2**2
CONTENTS, ALLOC, LOAD, DATA
Even after I stripped .comment and .ARM.attributes sections (in case objdump requires them) from the 'known good' kernel.elf, it still happily lists the sections there, but not in my tool's davidm.elf.
I have confirmed the contents of the sections are identical between the two with readelf -x.
The only thing I can imagine is that the ELF file layout is different and breaks some expectations of BFD, which could explain why readelf (and my tool) process it just fine but objdump has trouble.
Full readelf:
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: ARM
Version: 0x1
Entry point address: 0xff000000
Start of program headers: 84 (bytes into file)
Start of section headers: 116 (bytes into file)
Flags: 0x5000002, has entry point, Version5 EABI
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 1
Size of section headers: 40 (bytes)
Number of section headers: 4
Section header string table index: 3
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .text NULL ff000000 000034 00001c 00 AX 0 0 4
[ 2] .data NULL ff00001c 000050 000004 00 WA 0 0 4
[ 3] .shstrtab NULL 00000000 000114 000017 00 0 0 0
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings)
I (info), L (link order), G (group), x (unknown)
O (extra OS processing required) o (OS specific), p (processor specific)
There are no section groups in this file.
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
LOAD 0x000034 0xff000000 0xff000000 0x00020 0x00020 RWE 0x8000
Section to Segment mapping:
Segment Sections...
00 .text .data
There is no dynamic section in this file.
There are no relocations in this file.
There are no unwind sections in this file.
No version information found in this file.
Could the aggressive packing of the on-disk layout be causing troubles? Am I in violation of some bytestream alignment restrictions BFD expects, documented or otherwise?
Lastly - this file is not intended to be mmap'd into an address space, a loader will memcpy segment data into the desired location, so there is no requirement to play mmap-friendly file-alignment tricks. Keeping the ELF small is more important.
Cheers,
DavidM
EDIT: I was asked to upload the file, and/or provide 'objdump -x'. So I've done both:
davidm.elf
$ objdump -x davidm.elf
davidm.elf: file format elf32-littlearm
davidm.elf
architecture: arm, flags 0x00000002:
EXEC_P
start address 0xff000000
Program Header:
LOAD off 0x00000034 vaddr 0xff000000 paddr 0xff000000 align 2**15
filesz 0x00000020 memsz 0x00000020 flags rwx
private flags = 5000002: [Version5 EABI] [has entry point]
Sections:
Idx Name Size VMA LMA File off Algn
SYMBOL TABLE:
no symbols
OK - finally figured it out.
After building and annotating/debugging libbfd (function elf_object_p()) in the context of a little test app, I found why it was not matching on any of BFD supported targets.
I had bad sh_type values in the section headers: NULL. Once I emit STRTAB or PROGBITS (and eventually NOBITS when I get that far) as appropriate, objdump happily walks my image.
Not really surprising, in retrospect - I'm more annoyed I didn't catch this in comparing readelf outputs than anything else :(
Thanks for the help all :)
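For anyone hitting the same thing, a check along these lines would have flagged it. This is a sketch in Python, not the tool from the question. It builds a raw ELF32 section header table with the addresses, offsets and sizes from the readelf output above, and the helpers make_shdr, build_table, section_types and untyped_sections are made-up names. It lists every entry after the mandatory null entry 0 whose sh_type is still SHT_NULL.

```python
import struct

# Elf32_Shdr: ten little-endian 32-bit words, 40 bytes per entry.
SHDR32 = struct.Struct("<10I")
SHT_NAMES = {0: "NULL", 1: "PROGBITS", 2: "SYMTAB", 3: "STRTAB", 8: "NOBITS"}
SHF_WRITE, SHF_ALLOC, SHF_EXECINSTR = 0x1, 0x2, 0x4


def make_shdr(sh_type, flags=0, addr=0, offset=0, size=0, align=0):
    """Pack one Elf32_Shdr; sh_name, sh_link, sh_info, sh_entsize stay 0."""
    return SHDR32.pack(0, sh_type, flags, addr, offset, size, 0, 0, align, 0)


def build_table(text_type, data_type, strtab_type):
    """Section header table shaped like davidm.elf, with selectable sh_type."""
    return b"".join([
        make_shdr(0),  # index 0 is the reserved null entry
        make_shdr(text_type, SHF_ALLOC | SHF_EXECINSTR, 0xFF000000, 0x34, 0x1C, 4),
        make_shdr(data_type, SHF_WRITE | SHF_ALLOC, 0xFF00001C, 0x50, 0x04, 4),
        make_shdr(strtab_type, offset=0x114, size=0x17),
    ])


def section_types(table):
    """Return the sh_type name of every entry in a raw section header table."""
    return [SHT_NAMES.get(f[1], hex(f[1])) for f in SHDR32.iter_unpack(table)]


def untyped_sections(table):
    """Indices after the null entry whose sh_type is still SHT_NULL."""
    return [i for i, t in enumerate(section_types(table)) if i > 0 and t == "NULL"]


broken = build_table(0, 0, 0)  # what the question's readelf -S output shows
fixed = build_table(1, 1, 3)   # PROGBITS, PROGBITS, STRTAB

print(section_types(broken), untyped_sections(broken))
print(section_types(fixed), untyped_sections(fixed))
```

In the broken table, entries 1 to 3 are reported as untyped, which corresponds to the NULL in the Type column of the readelf output. In the fixed table the list is empty.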

Resources