LLVM (arm-none-eabi target) is producing an ARM.exidx section for C based code(?) - arm

Compiling a simple HelloWorld.c using Clang/LLVM (arm-none-eabi target) produces a relocation section '.rel.ARM.exidx' but using arm-gcc does not. These LLVM produced unwind table entries are correctly tagged as canunwind. But why are they even produced at all as they are not needed and just cause bloat as you get an entry for every C function in your AXF?
readelf edxidx from HelloWorld.o
Relocation section '.rel.ARM.exidx' at offset 0x580 contains 2 entries:
Offset Info Type Sym.Value Sym. Name
00000000 00000b2a R_ARM_PREL31 00000000 .text
00000008 00000b2a R_ARM_PREL31 00000000 .text
Unwind table index '.ARM.exidx' at offset 0xcc contains 2 entries:
0x0 <print_uart0>: 0x1 [cantunwind]
0x54 <c_entry>: 0x1 [cantunwind]
In testing Clang defaults: If I pass "-funwind-tables" to Clang to force unwinding for even C functions, I get what I would expect had I been writing .cpp functions and "-fno-unwind-tables" results in the same as above.
Relocation section '.rel.ARM.exidx' at offset 0x5a4 contains 4 entries:
Offset Info Type Sym.Value Sym. Name
00000000 00000b2a R_ARM_PREL31 00000000 .text
00000000 00001600 R_ARM_NONE 00000000 __aeabi_unwind_cpp_pr0
00000008 00000b2a R_ARM_PREL31 00000000 .text
00000008 00001600 R_ARM_NONE 00000000 __aeabi_unwind_cpp_pr0
Unwind table index '.ARM.exidx' at offset 0xcc contains 2 entries:
0x0 <print_uart0>: 0x8001b0b0
Compact model index: 0
0x01 vsp = vsp + 8
0xb0 finish
0xb0 finish
0x54 <c_entry>: 0x809b8480
Compact model index: 0
0x9b vsp = r11
0x84 0x80 pop {r11, r14}
1) Is there anyway to turn off the .ARM.exidx section when only using C functions as they will always be flagged as "cantunwind".
2) Anyway to strip this section during linking? (gc-section will not workof course since these table entries reference in-use functions)
3) Why does arm-gcc not create this section (well, it does if you are using new lib, nano, etc... but I use and link no std libs)

I'll answer (2), since that's what I did. Add to your linker script:
/DISCARD/ :
{
*(.ARM.exidx)
}

Related

What's the meaning of 2.2.5 in memcpy#GLIBC_2.2.5? [duplicate]

Currently I'm in a directory which has a file libshared-object.so (name changed for generality).
When I run
$ objdump -p libshared-object.so
I receive the following output:
libshared-object.so: file format elf64-x86-64
Program Header:
LOAD off 0x0000000000000000 vaddr 0x0000000000000000 paddr 0x0000000000000000 align 2**21
filesz 0x00000000000828ee memsz 0x00000000000828ee flags r-x
LOAD off 0x0000000000083768 vaddr 0x0000000000283768 paddr 0x0000000000283768 align 2**21
filesz 0x00000000000048e0 memsz 0x0000000000004af0 flags rw-
DYNAMIC off 0x0000000000084af0 vaddr 0x0000000000284af0 paddr 0x0000000000284af0 align 2**3
filesz 0x00000000000002a0 memsz 0x00000000000002a0 flags rw-
NOTE off 0x00000000000001c8 vaddr 0x00000000000001c8 paddr 0x00000000000001c8 align 2**2
filesz 0x0000000000000024 memsz 0x0000000000000024 flags r--
EH_FRAME off 0x0000000000072c6c vaddr 0x0000000000072c6c paddr 0x0000000000072c6c align 2**2
filesz 0x0000000000002ed4 memsz 0x0000000000002ed4 flags r--
STACK off 0x0000000000000000 vaddr 0x0000000000000000 paddr 0x0000000000000000 align 2**4
filesz 0x0000000000000000 memsz 0x0000000000000000 flags rw-
RELRO off 0x0000000000083768 vaddr 0x0000000000283768 paddr 0x0000000000283768 align 2**0
filesz 0x0000000000001898 memsz 0x0000000000001898 flags r--
Dynamic Section:
NEEDED libQt5Widgets.so.5
NEEDED libQt5Compositor.so.5
NEEDED libQt5Quick.so.5
NEEDED libQt5Qml.so.5
NEEDED libQt5Network.so.5
NEEDED libQt5Gui.so.5
NEEDED libQt5Core.so.5
NEEDED libGL.so.1
NEEDED libpthread.so.0
NEEDED libstdc++.so.6
NEEDED libm.so.6
NEEDED libgcc_s.so.1
NEEDED libc.so.6
SONAME libshared-object.so.1
RPATH /opt/qt5/lib
INIT 0x000000000003fc68
FINI 0x000000000006c234
INIT_ARRAY 0x0000000000283768
INIT_ARRAYSZ 0x00000000000000e8
FINI_ARRAY 0x0000000000283850
FINI_ARRAYSZ 0x0000000000000008
GNU_HASH 0x00000000000001f0
STRTAB 0x00000000000101e8
SYMTAB 0x00000000000036d8
STRSZ 0x0000000000022072
SYMENT 0x0000000000000018
PLTGOT 0x0000000000285000
PLTRELSZ 0x0000000000008df0
PLTREL 0x0000000000000007
JMPREL 0x0000000000036e78
RELA 0x0000000000033458
RELASZ 0x0000000000003a20
RELAENT 0x0000000000000018
VERNEED 0x0000000000033348
VERNEEDNUM 0x0000000000000006
VERSYM 0x000000000003225a
RELACOUNT 0x0000000000000052
Version References:
required from libm.so.6:
0x09691a75 0x00 09 GLIBC_2.2.5
required from libgcc_s.so.1:
0x0b792650 0x00 08 GCC_3.0
required from libc.so.6:
0x06969194 0x00 10 GLIBC_2.14
0x09691a75 0x00 07 GLIBC_2.2.5
required from libQt5Core.so.5:
0x00058a25 0x00 06 Qt_5
required from libQt5Gui.so.5:
0x0dcbd2c9 0x00 12 Qt_5_PRIVATE_API
0x00058a25 0x00 03 Qt_5
required from libstdc++.so.6:
0x0bafd178 0x00 11 CXXABI_1.3.8
0x056bafd3 0x00 05 CXXABI_1.3
0x0297f871 0x00 04 GLIBCXX_3.4.21
0x08922974 0x00 02 GLIBCXX_3.4
Of particular interest is the very last of this information, the Version References:
Version References:
required from libm.so.6:
0x09691a75 0x00 09 GLIBC_2.2.5
required from libgcc_s.so.1:
0x0b792650 0x00 08 GCC_3.0
required from libc.so.6:
0x06969194 0x00 10 GLIBC_2.14
0x09691a75 0x00 07 GLIBC_2.2.5
required from libQt5Core.so.5:
0x00058a25 0x00 06 Qt_5
required from libQt5Gui.so.5:
0x0dcbd2c9 0x00 12 Qt_5_PRIVATE_API
0x00058a25 0x00 03 Qt_5
required from libstdc++.so.6:
0x0bafd178 0x00 11 CXXABI_1.3.8
0x056bafd3 0x00 05 CXXABI_1.3
0x0297f871 0x00 04 GLIBCXX_3.4.21
0x08922974 0x00 02 GLIBCXX_3.4
Question: Where do these version references come from? Take, for example, the line required from libQt5Gui.so.5: .. Qt_5 and Qt_5_PRIVATE_API.
Are references to version Qt_5 and Qt_5_PRIVATE_API coming from the C code that generated libQt5Gui.so.5? Or from some linker option passed to to gcc or ld? Or from something else?
Or from something else?
From something else.
When you build a shared library (say libfoo.so), you can (though don't have to) supply a linker version script giving certain symbols a version tag.
When you later link an executable or a shared library (say libbar.so) against libfoo.so, iff you use a versioned symbol, the version tag of that symbol is recorded in libbar.so (that is what you observed in your question).
This setup allows libfoo.so to change its symbols in ABI-incompatible way, and still support old client programs that were linked against the old symbols.
For example, libc.so.6 on x86_64 has the following versions of memcpy:
0000000000091620 g iD .text 000000000000003d GLIBC_2.14 memcpy
000000000008c420 g iD .text 0000000000000047 (GLIBC_2.2.5) memcpy
Programs that were linked against glibc-2.13 or older will use the GLIBC_2.2.5 version, programs that were linked against glibc-2.14 or newer will use the GLIBC_2.14 version.
If you try to run a program linked against glibc-2.14 on a system with glibc-2.13, you will get an error (missing symbol version), similar to this.
Before the introduction of symbol versioning, changing the ABI of an existing symbol required that you ship an entirely separate library. This is called external library versioning. You can read more about it here.

What are these sections in a Linker's map file?

My COSMIC-C linker generates a map file for my STM8S microcontroller project, which despite having a few familiar sections, is a little inexpressive.
here is the map file output and a few modules :
--------
Segments
--------
start 00008080 end 00008084 length 4 segment .const
start 00008087 end 00008298 length 529 segment .text
start 00004000 end 00004000 length 0 segment .eeprom
start 00000000 end 00000000 length 0 segment .bsct
start 00000000 end 0000000a length 10 segment .ubsct
start 0000000a end 0000000a length 0 segment .bit
start 0000000a end 0000000a length 0 segment .share
start 00000100 end 00000100 length 0 segment .data
start 00000100 end 00000100 length 0 segment .bss
start 00000000 end 000003be length 958 segment .info.
start 00008000 end 00008080 length 128 segment .const
start 00008084 end 00008087 length 3 segment .init
-------
Modules
-------
D:\Program Files (x86)\COSMIC\FSE_Compilers\CXSTM8\lib\crtsi0.sm8:
start 00008087 end 000080d7 length 80 section .text
start 00000100 end 00000100 length 0 section .bss
start 00000000 end 00000000 length 0 section .ubsct
start 00000000 end 00000034 length 52 section .info.
Release\clockcontrol.o:
start 000080d7 end 000080f7 length 32 section .text
start 00000034 end 000000c1 length 141 section .info.
Release\main.o:
start 000080f7 end 00008145 length 78 section .text
start 000000c1 end 00000146 length 133 section .info.
start 00008080 end 00008084 length 4 section .const
I know about the .text and .data. and I can assume the bsct and ubsct are data and bss (despite having a .data and .bss already); also about .eeprom and .const which may represent their obvious memory sections. but :
what are .info, .bit, .share, .init ?
is my assumption about .bsct and .ubsct correct ? if no, what are these sections and if yes, why we have both .bsct/.ubsct and .data/.bss ?
why do we have two .const? (they are consecutive)
despite being defined, none of the items mentioned in question 1 appear in any modules which are my codes. are they just standards ?
do these sections follow a naming convention or they're just something out of COSMIC ? I mean if they are standard or not.
with many thanx.
Update:
the COSMIC linker documentation has many descriptions on how to make a linker script, but doesn't have a predefined table. it seems the script is edited by STVD (the IDE). despite that, there is an example (not very related to my question) which may help:

Empty space between .text and .fini data segments?

I compiled a simple C program (gcc -o file file.cpp) and obtained the following output on running objdump -h file,
12 .text 00000172 0000000000400400 0000000000400400 00000400 2**4
CONTENTS, ALLOC, LOAD, READONLY, CODE
13 .fini 00000009 0000000000400574 0000000000400574 00000574 2**2
CONTENTS, ALLOC, LOAD, READONLY, CODE
I have a quick question here.
Why is there a gap of 2 bytes after the .text section? 0x400400 + 0x172 = 0x400572, but the .fini section starts from 0x400574? Has this got something to do with alignment? I noticed similar gaps between some other sections as well.
The last column of the output from objdump -h file is the alignment of the section. The alignment of .fini is 4 (2**2 is 2 to the power of 2), which is why it starts at 0x400574 instead of 0x400572.
When linking against glibc for x86-64, the alignment of 4 for the .fini section is specified in crti.o:
.section .fini,"ax",#progbits
.p2align 2
.globl _fini
.type _fini, #function
_fini:

Virtual memory addresses of objdump vs /proc/pid/maps?

I'm trying to understand where exactly does the executable assembly of a program end up, when a program is loaded/running. I found two resources talking about this, but they are somewhat difficult to read:
Understanding ELF using readelf and objdump Linux article (code formatting is messed up)
Michael Guyver, Some Assembly Required*: Relocations, Relocations (lots of assembly which I'm not exactly proficient in)
So, here's a brief example; I'm interested where does the executable section of the tail program end up. Basically, objdump tells me this:
$ objdump -dj .text /usr/bin/tail | head -10
/usr/bin/tail: file format elf32-i386
Disassembly of section .text:
08049100 <.text>:
8049100: 31 ed xor %ebp,%ebp
8049102: 5e pop %esi
8049103: 89 e1 mov %esp,%ecx
...
I'm assuming I'd see calls to tail's 'main()' be made here, had symbols not been stripped. Anyways, the start of the executable section is, according to this, 0x08049100; I'm interested in where it ends up eventually.
Then, I run tail in the background, getting its pid:
$ /usr/bin/tail -f & echo $!
28803
... and I inspect its /proc/pid/maps:
$ cat /proc/28803/maps
00547000-006a8000 r-xp 00000000 08:05 3506 /lib/i386-linux-gnu/libc-2.13.so
...
008c6000-008c7000 r-xp 00000000 00:00 0 [vdso]
08048000-08054000 r-xp 00000000 08:05 131469 /usr/bin/tail
08054000-08055000 r--p 0000b000 08:05 131469 /usr/bin/tail
08055000-08056000 rw-p 0000c000 08:05 131469 /usr/bin/tail
08af1000-08b12000 rw-p 00000000 00:00 0 [heap]
b76de000-b78de000 r--p 00000000 08:05 139793 /usr/lib/locale/locale-archive
...
bf845000-bf866000 rw-p 00000000 00:00 0 [stack]
Now I have tail three times - but the executable segment r-xp (which is the .text?) is apparently at 0x08048000 (an address that apparently was standardized back with SYSV for x86; also see Anatomy of a Program in Memory : Gustavo Duarte for an image)
Using the gnuplot script below, I arrived at this image:
First (topmost) plot shows "File offset" of sections from objdump (starts from 0x0); middle plot shows "VMA" (virtual memory address) of sections from objdump and bottom plot shows layout from /proc/pid/maps - both of these starting from 0x08048000; all three plots show the same range.
Comparing topmost and middle plot, it seems that the sections are more-less translated "as is" from the executable file to the VMA addresses (apart from the end); such that the whole executable file (not just .text section) starts from 0x08048000.
But comparing middle and bottom plot, it seems that when a program is running in memory, then only .text is "pushed back" to 0x08048000 - and not only that, it now appears larger!
The only explanation I have so far, is what I read somewhere (but lost the link): that an image in memory would have to have allocated a whole number of pages (of size e.g. 4096 bytes), and start from a page boundary. The whole number of pages explains the larger size - but, given that all these are virtual addresses, why the need to "snap" them to a page boundary (could one not, just as well, map the virtual address as is to a physical page boundary?)
So - could someone provide an explanation so as to why /proc/pid/maps sees the .text section in a different virtual address region from objdump?
mem.gp gnuplot script:
#!/usr/bin/env gnuplot
set term wxt size 800,500
exec = "/usr/bin/tail" ;
# cannot do - apparently gnuplot waits for children to exit, so locks here:
#runcmd = "bash -c '" . exec . " -f & echo $!'"
#print runcmd
#pid = system(runcmd) ;
#print runcmd, "pid", pid
# run tail -f & echo $! in another shell; then enter pid here:
pid = 28803
# $1 Idx $2 Name $3 Size $4 VMA $5 LMA $6 File off
cmdvma = "<objdump -h ".exec." | awk '$1 ~ \"^[0-9]+$\" && $2 !~ \".gnu_debuglink\" {print $1, $2, \"0X\"$3, \"0X\"$4;}'" ;
cmdfo = "<objdump -h ".exec." | awk '$1 ~ \"^[0-9]+$\" && $2 !~ \".gnu_debuglink\" {print $1, $2, \"0X\"$3, \"0X\"$6;}'" ;
cmdmaps = "<cat /proc/".pid."/maps | awk '{split($1,a,\"-\");b1=strtonum(\"0x\"a[1]);b2=strtonum(\"0x\"a[2]);printf(\"%d \\\"%s\\\" 0x%08X 0x%08X\\n\", NR,$6,b2-b1,b1);}'"
print cmdvma
print cmdfo
print cmdmaps
set format x "0x%08X" # "%016X";
set xtics rotate by -45 font ",7";
unset ytics
unset colorbox
set cbrange [0:25]
set yrange [0.5:1.5]
set macros
set multiplot layout 3,1 columnsfirst
# 0x08056000-0x08048000 = 0xe000
set xrange [0:0xe000]
set tmargin at screen 1
set bmargin at screen 0.667+0.1
plot \
cmdfo using 4:(1+$0*0.01):4:($4+$3):0 with xerrorbars lc palette t "File off", \
cmdfo using 4:(1):2 with labels font ",6" left rotate by -45 t ""
set xrange [0x08048000:0x08056000]
set tmargin at screen 0.667
set bmargin at screen 0.333+0.1
plot \
cmdvma using 4:(1+$0*0.01):4:($4+$3):0 with xerrorbars lc palette t "VMA", \
cmdvma using 4:(1):2 with labels font ",6" left rotate by -45 t ""
set tmargin at screen 0.333
set bmargin at screen 0+0.1
plot \
cmdmaps using 4:(1+$0*0.01):4:($4+$3):0 with xerrorbars lc palette t "/proc/pid/maps" , \
cmdmaps using 4:(1):2 with labels font ",6" left rotate by -45 t ""
unset multiplot
#system("killall -9 " . pid) ;
The short answer is that loadable segments get mapped into memory based on the ELF program headers with type PT_LOAD.
PT_LOAD - The array element specifies a loadable segment, described by
p_filesz and p_memsz. The bytes from the file are mapped to the
beginning of the memory segment. If the segment's memory size
(p_memsz) is larger than the file size (p_filesz), the ``extra'' bytes
are defined to hold the value 0 and to follow the segment's
initialized area. The file size may not be larger than the memory
size. Loadable segment entries in the program header table appear in
ascending order, sorted on the p_vaddr member.
For example, on my CentOS 6.4:
objdump -x `which tail`
Program Header:
LOAD off 0x00000000 vaddr 0x08048000 paddr 0x08048000 align 2**12
filesz 0x0000e4d4 memsz 0x0000e4d4 flags r-x
LOAD off 0x0000e4d4 vaddr 0x080574d4 paddr 0x080574d4 align 2**12
filesz 0x000003b8 memsz 0x0000054c flags rw-
And from /proc/pid/maps:
cat /proc/2671/maps | grep `which tail`
08048000-08057000 r-xp 00000000 fd:00 133669 /usr/bin/tail
08057000-08058000 rw-p 0000e000 fd:00 133669 /usr/bin/tail
You will notice there is a difference between what maps and objdump says for the load address for subsequent sections, but that has to do with the loader accounting how much memory the section takes up as well as the alignment field. The first loadable segment is mapped in at 0x08048000 with a size of 0x0000e4d4, so you'd expect it to go from 0x08048000 to 0x080564d4, but the alignment says to align on 2^12 byte pages. If you do the math you end up at 0x8057000, matching /proc/pid/maps. So the second segment is mapped in at 0x8057000 and has a size of 0x0000054c (ending at 0x805754c), which is aligned to 0x8058000, matching /proc/pid/maps.
Thanks to the comment from #KerrekSB, I reread Understanding ELF using readelf and objdump - Linux article, and I think I sort of got it now (although it would be nice for someone else to confirm if its right).
Basically, the mistake is that the region 08048000-08054000 r-xp 00000000 08:05 131469 /usr/bin/tail from /proc/pid/maps does not start with .text section; and the missing link for knowing this is Program Header Table (PHT), as reported by readelf. Here is what it says for my tail:
$ readelf -l /usr/bin/tail
Elf file type is EXEC (Executable file)
Entry point 0x8049100
There are 9 program headers, starting at offset 52
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
[00] PHDR 0x000034 0x08048034 0x08048034 0x00120 0x00120 R E 0x4
[01] INTERP 0x000154 0x08048154 0x08048154 0x00013 0x00013 R 0x1
[Requesting program interpreter: /lib/ld-linux.so.2]
[02] LOAD 0x000000 0x08048000 0x08048000 0x0b9e8 0x0b9e8 R E 0x1000
[03] LOAD 0x00bf10 0x08054f10 0x08054f10 0x00220 0x003f0 RW 0x1000
[04] DYNAMIC 0x00bf24 0x08054f24 0x08054f24 0x000c8 0x000c8 RW 0x4
[05] NOTE 0x000168 0x08048168 0x08048168 0x00044 0x00044 R 0x4
[06] GNU_EH_FRAME 0x00b918 0x08053918 0x08053918 0x00024 0x00024 R 0x4
[07] GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW 0x4
[08] GNU_RELRO 0x00bf10 0x08054f10 0x08054f10 0x000f0 0x000f0 R 0x1
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .eh_frame_hdr .eh_frame
03 .ctors .dtors .jcr .dynamic .got .got.plt .data .bss
04 .dynamic
05 .note.ABI-tag .note.gnu.build-id
06 .eh_frame_hdr
07
08 .ctors .dtors .jcr .dynamic .got
I've added the [0x] line numbering in the "Program Headers:" section manually; otherwise it's hard to link it to Section to Segment mapping: below. Here also note: "Segment has many types, ... LOAD: The segment's content is loaded from the executable file. "Offset" denotes the offset of the file where the kernel should start reading the file's content. "FileSiz" tells us how many bytes must be read from the file. (Understanding ELF...)"
So, objdump tells us:
08049100 <.text>:
... that .text section starts at 0x08049100.
Then, readelf tells us:
[02] LOAD 0x000000 0x08048000 0x08048000 0x0b9e8 0x0b9e8 R E 0x1000
... that header/segment [02] is loaded from the executable file at offset zero into 0x08048000; and that this is marked R E - read and execute region of memory.
Further, readelf tells us:
02 .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .eh_frame_hdr .eh_frame
... meaning that the header/segment [02] contains many sections - among them, also the .text; this now matches with the objdump view that .text starts higher than 0x08048000.
Finally, /proc/pid/maps of the running program tells us:
08048000-08054000 r-xp 00000000 08:05 131469 /usr/bin/tail
... that the executable (r-xp) "section" of the executable file is loaded at 0x08048000 - and now it is easy to see that this "section", as I called it, is called wrong - it is not a section (as per objdump nomenclature); but it is actually a "header/segment", as readelf sees it (in particular, the header/segment [02] we saw earlier).
Well, hopefully I got this right ( and hopefully someone can confirm if I did so or not :) )

objdump won't show my ELF sections

I have a tool emitting an ELF, which as far as I can tell is compliant to the spec. Readelf output looks fine, but objdump refuses to disassemble anything.
I have simplified the input to a single global var, and "int main(void) { return 0;}" to aid debugging - the tiny section sizes are correct.
In particular, objdump seems unable to find the sections table:
$ arm-none-linux-gnueabi-readelf -S davidm.elf
There are 4 section headers, starting at offset 0x74:
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .text NULL ff000000 000034 00001c 00 AX 0 0 4
[ 2] .data NULL ff00001c 000050 000004 00 WA 0 0 4
[ 3] .shstrtab NULL 00000000 000114 000017 00 0 0 0
$ arm-none-linux-gnueabi-objdump -h davidm.elf
davidm.elf: file format elf32-littlearm
Sections:
Idx Name Size VMA LMA File off Algn
I also have another ELF, built from the exact same objects, only produced with regular toolchain use:
$ objdump -h kernel.elf
kernel.elf: file format elf32-littlearm
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 0000001c ff000000 ff000000 00008000 2**2
CONTENTS, ALLOC, LOAD, READONLY, CODE
1 .data 00000004 ff00001c ff00001c 0000801c 2**2
CONTENTS, ALLOC, LOAD, DATA
Even after I stripped .comment and .ARM.attributes sections (incase objdump requires them) from the 'known good' kernel.elf, it still happily lists the sections there, but not in my tool's davidm.elf.
I have confirmed the contents of the sections are identical between the two with readelf -x.
The only thing I can image is that the ELF file layout is different and breaks some expectations of BFD, which could explain why readelf (and my tool) processes it just fine but objdump has troubles.
Full readelf:
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: ARM
Version: 0x1
Entry point address: 0xff000000
Start of program headers: 84 (bytes into file)
Start of section headers: 116 (bytes into file)
Flags: 0x5000002, has entry point, Version5 EABI
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 1
Size of section headers: 40 (bytes)
Number of section headers: 4
Section header string table index: 3
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .text NULL ff000000 000034 00001c 00 AX 0 0 4
[ 2] .data NULL ff00001c 000050 000004 00 WA 0 0 4
[ 3] .shstrtab NULL 00000000 000114 000017 00 0 0 0
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings)
I (info), L (link order), G (group), x (unknown)
O (extra OS processing required) o (OS specific), p (processor specific)
There are no section groups in this file.
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
LOAD 0x000034 0xff000000 0xff000000 0x00020 0x00020 RWE 0x8000
Section to Segment mapping:
Segment Sections...
00 .text .data
There is no dynamic section in this file.
There are no relocations in this file.
There are no unwind sections in this file.
No version information found in this file.
Could the aggressive packing of the on-disk layout be causing troubles? Am I in violation of some bytestream alignment restrictions BFD expects, documented or otherwise?
Lastly - this file is not intended to be mmap'd into an address space, a loader will memcpy segment data into the desired location, so there is no requirement to play mmap-friendly file-alignment tricks. Keeping the ELF small is more important.
Cheers,
DavidM
EDIT: I was asked to upload the file, and/or provide 'objdump -x'. So I've done both:
davidm.elf
$ objdump -x davidm.elf
davidm.elf: file format elf32-littlearm
davidm.elf
architecture: arm, flags 0x00000002:
EXEC_P
start address 0xff000000
Program Header:
LOAD off 0x00000034 vaddr 0xff000000 paddr 0xff000000 align 2**15
filesz 0x00000020 memsz 0x00000020 flags rwx
private flags = 5000002: [Version5 EABI] [has entry point]
Sections:
Idx Name Size VMA LMA File off Algn
SYMBOL TABLE:
no symbols
OK - finally figured it out.
After building and annotating/debugging libbfd (function elf_object_p()) in the context of a little test app, I found why it was not matching on any of BFD supported targets.
I had bad sh_type flags for the section headers: NULL. Emitting STRTAB or PROGBITS (and eventually NOBITS when I get that far) as appropriate and objdump happily walks my image.
Not really surprising, in retrospect - I'm more annoyed I didn't catch this in comparing readelf outputs than anything else :(
Thanks for the help all :)

Resources