I'm trying to use this answer by writing a custom target .json file with "linker-flavor":"gcc". My full target .json file is:
{
"llvm-target": "avr-atmel-none",
"cpu": "atmega328p",
"target-endian": "little",
"target-pointer-width": "16",
"os": "none",
"target-env": "gnu",
"target-vendor": "unknown",
"arch": "avr",
"data-layout": "e-p:16:16:16-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-n8",
"executables": true,
"linker": "avr-gcc",
"linker-flavor": "gcc",
"pre-link-args": {
"gcc": ["-Os -mmcu=atmega328p"]
},
"exe-suffix": ".elf",
"post-link-args": {
"gcc": ["-Wl,--gc-sections"]
},
"no-default-libraries": false
}
Running cargo build with this finishes without any error messages:
$ cargo build --release -v
Compiling core v0.1.0 (https://github.com/gergoerdi/rust-avr-libcore-mini?rev=adda44aa91ac517aab6915447592ee4cad26564c#adda44aa)
Running `rustc --crate-name core /home/cactus/.cargo/git/checkouts/rust-avr-libcore-mini-37e279d93a70b45a/adda44a/src/lib.rs --crate-type lib --emit=dep-info,link -C opt-level=3 -C metadata=655bb622dd229da9 -C extra-filename=-655bb622dd229da9 --out-dir /home/cactus/prog/rust/avr/chip8-avr/target/avr-atmega328p/release/deps --target avr-atmega328p -L dependency=/home/cactus/prog/rust/avr/chip8-avr/target/avr-atmega328p/release/deps -L dependency=/home/cactus/prog/rust/avr/chip8-avr/target/release/deps --cap-lints allow`
Compiling chip8-engine v0.1.0 (https://github.com/gergoerdi/rust-avr-chip8-engine?rev=c6f88737bae4dae0bd6c5c2bbc73737e6dfadfcd#c6f88737)
Running `rustc --crate-name chip8_engine /home/cactus/.cargo/git/checkouts/rust-avr-chip8-engine-4bce60f3f178d33a/c6f8873/src/lib.rs --crate-type lib --emit=dep-info,link -C opt-level=3 -C metadata=2197ff1f15f697c9 -C extra-filename=-2197ff1f15f697c9 --out-dir /home/cactus/prog/rust/avr/chip8-avr/target/avr-atmega328p/release/deps --target avr-atmega328p -L dependency=/home/cactus/prog/rust/avr/chip8-avr/target/avr-atmega328p/release/deps -L dependency=/home/cactus/prog/rust/avr/chip8-avr/target/release/deps --extern core=/home/cactus/prog/rust/avr/chip8-avr/target/avr-atmega328p/release/deps/libcore-655bb622dd229da9.rlib --cap-lints allow`
Compiling chip8-avr v0.1.0 (file:///home/cactus/prog/rust/avr/chip8-avr)
Running `rustc --crate-name chip8_avr src/main.rs --crate-type bin --emit=dep-info,link -C opt-level=3 -C metadata=014a8fed19cbc611 -C extra-filename=-014a8fed19cbc611 --out-dir /home/cactus/prog/rust/avr/chip8-avr/target/avr-atmega328p/release/deps --target avr-atmega328p -L dependency=/home/cactus/prog/rust/avr/chip8-avr/target/avr-atmega328p/release/deps -L dependency=/home/cactus/prog/rust/avr/chip8-avr/target/release/deps --extern chip8_engine=/home/cactus/prog/rust/avr/chip8-avr/target/avr-atmega328p/release/deps/libchip8_engine-2197ff1f15f697c9.rlib --extern core=/home/cactus/prog/rust/avr/chip8-avr/target/avr-atmega328p/release/deps/libcore-655bb622dd229da9.rlib`
Finished release [optimized] target(s) in 15.99 secs
However, the resulting ELF file's .text section is empty:
$ avr-objdump -h target/avr-atmega328p/release/chip8-avr.elf
target/avr-atmega328p/release/chip8-avr.elf: file format elf32-avr
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 00000000 00000000 00000000 00000074 2**1
CONTENTS, ALLOC, LOAD, READONLY, CODE
1 .data 00000000 00800060 00000000 00000074 2**0
CONTENTS, ALLOC, LOAD, DATA
2 .stab 0000012c 00000000 00000000 00000074 2**2
CONTENTS, READONLY, DEBUGGING
3 .stabstr 0000005d 00000000 00000000 000001a0 2**0
CONTENTS, READONLY, DEBUGGING
4 .comment 00000011 00000000 00000000 000001fd 2**0
CONTENTS, READONLY
So to figure out what's going on, I replaced my avr-gcc
with a small shell script that logs its arguments before passing them to
the real avr-gcc executable.
This shows me that rustc/cargo is trying to run the following
command line to do the linking:
/usr/bin/avr-gcc -Os -mmcu=atmega328p \
-L /home/cactus/prog/rust/rust-avr/build/build/x86_64-unknown-linux-gnu/stage1/lib/rustlib/avr-atmega328p/lib \
/home/cactus/prog/rust/avr/chip8-avr/target/avr-atmega328p/release/deps/chip8_avr-014a8fed19cbc611.0.o \
-o /home/cactus/prog/rust/avr/chip8-avr/target/avr-atmega328p/release/deps/chip8_avr-014a8fed19cbc611.elf \
-Wl,--gc-sections \
-L /home/cactus/prog/rust/avr/chip8-avr/target/avr-atmega328p/release/deps -L /home/cactus/prog/rust/avr/chip8-avr/target/release/deps -L /home/cactus/prog/rust/rust-avr/build/build/x86_64-unknown-linux-gnu/stage1/lib/rustlib/avr-atmega328p/lib \
-Wl,-Bstatic /home/cactus/prog/rust/avr/chip8-avr/target/avr-atmega328p/release/deps/libchip8_engine-2197ff1f15f697c9.rlib \
/home/cactus/prog/rust/avr/chip8-avr/target/avr-atmega328p/release/deps/libcore-655bb622dd229da9.rlib \
-Wl,-Bdynamic -Wl,--gc-sections
If I run the exact same command manually, with the exact same environment variables, I get a good ELF file with
the right contents (note that its .text section is not empty):
$ /usr/bin/avr-gcc -Os -mmcu=atmega328p -L /home/cactus/prog/rust/rust-avr/build/build/x86_64-unknown-linux-gnu/stage1/lib/rustlib/avr-atmega328p/lib /home/cactus/prog/rust/avr/chip8-avr/target/avr-atmega328p/release/deps/chip8_avr-014a8fed19cbc611.0.o -o /home/cactus/prog/rust/avr/chip8-avr/target/avr-atmega328p/release/deps/chip8_avr-014a8fed19cbc611.elf -Wl,--gc-sections -L /home/cactus/prog/rust/avr/chip8-avr/target/avr-atmega328p/release/deps -L /home/cactus/prog/rust/avr/chip8-avr/target/release/deps -L /home/cactus/prog/rust/rust-avr/build/build/x86_64-unknown-linux-gnu/stage1/lib/rustlib/avr-atmega328p/lib -Wl,-Bstatic /home/cactus/prog/rust/avr/chip8-avr/target/avr-atmega328p/release/deps/libchip8_engine-2197ff1f15f697c9.rlib /home/cactus/prog/rust/avr/chip8-avr/target/avr-atmega328p/release/deps/libcore-655bb622dd229da9.rlib -Wl,-Bdynamic -Wl,--gc-sections
$ avr-objdump -h target/avr-atmega328p/release/deps/chip8_avr-014a8fed19cbc611.elf
target/avr-atmega328p/release/deps/chip8_avr-014a8fed19cbc611.elf: file format elf32-avr
Sections:
Idx Name Size VMA LMA File off Algn
0 .data 0000020e 00800100 00001a56 00001af0 2**4
CONTENTS, ALLOC, LOAD, DATA
1 .text 00001a56 00000000 00000000 00000094 2**1
CONTENTS, ALLOC, LOAD, READONLY, CODE
2 .bss 000001fa 0080030e 0080030e 00001cfe 2**0
ALLOC
3 .stab 000007ec 00000000 00000000 00001d00 2**2
CONTENTS, READONLY, DEBUGGING
4 .stabstr 000000b0 00000000 00000000 000024ec 2**0
CONTENTS, READONLY, DEBUGGING
5 .comment 00000011 00000000 00000000 0000259c 2**0
CONTENTS, READONLY
So why does cargo silently produce a nonsensical empty ELF file, if
running the (supposedly) same command from the shell results in a valid ELF file?
This is caused by a bug in the target .json file; specifically, this part:
"pre-link-args": {
"gcc": ["-Os -mmcu=atmega328p"]
},
The arguments are passed directly as argv to the linker, so multiple arguments need to be split into multiple elements of the array here:
"pre-link-args": {
"gcc": ["-Os", "-mmcu=atmega328p"]
},
The reason this problem didn't show up when using the special logging version of avr-gcc is that the log only contained all the arguments together, so there was no difference between the two representations.
As for avr-gcc '-Os -mmcu=atmega328p' creating an empty .elf file, it seems that is simply a side-effect of not specifying any (valid) -mmcu argument.
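To make the argv difference concrete, here is a small Python sketch. It is only an illustration: build_argv, naive_log and unambiguous_log are hypothetical names, main.o is a made-up input, and the real rustc linker invocation has many more arguments. It shows that the two JSON forms produce different argument vectors, that a log which joins arguments with spaces cannot tell them apart, and that quoting each element makes the boundaries visible.

```python
import shlex

def build_argv(linker, pre_link_args, rest):
    """Each JSON array element becomes exactly one argv entry; nothing is re-split."""
    return [linker, *pre_link_args, *rest]

def naive_log(argv):
    """What a wrapper that prints all arguments joined by spaces would record."""
    return " ".join(argv)

def unambiguous_log(argv):
    """Quote every element so that argument boundaries stay visible in the log."""
    return " ".join(shlex.quote(a) for a in argv)

# "main.o" is a placeholder input file, not taken from the build above.
broken = build_argv("avr-gcc", ["-Os -mmcu=atmega328p"], ["main.o"])
fixed = build_argv("avr-gcc", ["-Os", "-mmcu=atmega328p"], ["main.o"])

print(len(broken), len(fixed))                 # different argument counts
print(naive_log(broken) == naive_log(fixed))   # the naive log hides the difference
print(unambiguous_log(broken))
print(unambiguous_log(fixed))
```

A logging wrapper that prints one argument per line (or quotes each one, as above) would have exposed the problem immediately.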
Can anyone explain why gcc is adding the .comment and .note.gnu.property sections to the object code, and how I can tell gcc not to add them? Is that possible?
Thanks.
why gcc is adding the .comment and .note.gnu.property sections
You can examine the contents of the .comment section with e.g. readelf -x.comment:
echo "int foo() { return 42; }" | gcc -xc - -c -o foo.o
readelf -x.comment foo.o
Hex dump of section '.comment':
0x00000000 00474343 3a202844 65626961 6e203131 .GCC: (Debian 11
0x00000010 2e322e30 2d313029 2031312e 322e3000 .2.0-10) 11.2.0.
Obviously the "why" is to make it easier to understand what compiler produced the object file.
I don't believe there is a GCC flag to suppress this, but objcopy --remove-section .comment foo.o bar.o will get rid of it.
The .note.gnu.property can be removed in similar fashion.
Here is a discussion of what it may contain.
Thanks, marked it. With objcopy -O binary -R .note -R .comment -R .note.gnu.property foo (ELF format to raw binary), the size went from 129M to 4K.
I have a number of object files in ELF format, with the usual .text and other common sections, and I was wondering if GNU ld or gold could be used to link a number of ELF object files into an ELF executable, even if the architecture (an 8-bit micro with a proprietary toolchain) is not known beforehand by the linker. In essence I'm asking if the linking process is, to some extent, platform independent once you have all the required object files, or if on the contrary I will need to roll my own linker at some point.
No, it won't work.
A major thing the linker has to do is to handle relocations. Relocations are arch-specific:
int f(){return 42;}
$ gcc -c foo.c -o foo && readelf -r foo
Relocation section '.rela.eh_frame' at offset 0x198 contains 1 entry:
Offset Info Type Sym. Value Sym. Name + Addend
000000000020 000200000002 R_X86_64_PC32 0000000000000000 .text + 0
$ gcc -m32 -c foo.c -o foo && readelf -r foo
Relocation section '.rel.text' at offset 0x1d0 contains 2 entries:
Offset Info Type Sym.Value Sym. Name
00000004 00000b02 R_386_PC32 00000000 __x86.get_pc_thunk.ax
00000009 00000c0a R_386_GOTPC 00000000 _GLOBAL_OFFSET_TABLE_
Relocation section '.rel.eh_frame' at offset 0x1e0 contains 2 entries:
Offset Info Type Sym.Value Sym. Name
00000020 00000202 R_386_PC32 00000000 .text
00000040 00000502 R_386_PC32 00000000 .text.__x86.get_pc_thu
$ clang -target arm-linux-gnueabi -c foo.c -o foo && readelf -r foo
Relocation section '.rel.ARM.exidx' at offset 0x104 contains 1 entry:
Offset Info Type Sym.Value Sym. Name
00000000 0000032a R_ARM_PREL31 00000000 .text
Moreover the linker script which says how the ELF file should be generated (page size, start address, etc.) is arch-specific:
ld -m elf_x86_64 --verbose
ld -m elf_i386 --verbose
arm-linux-gnueabi-ld --verbose
If you're not compiling to a static executable, the linker has to generate PLT entries as well, which are native code (and thus arch-specific).
Some architectures have arch-specific sections as well (e.g. .ARM.extab, .ARM.exidx).
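As a rough illustration of where the architecture is recorded, the sketch below reads the e_machine field from an ELF header in Python. The field offsets and the four listed constants come from the ELF specification; the function name elf_machine and the small lookup table are my own, and the table is deliberately incomplete. The relocation types shown above (R_X86_64_*, R_386_*, R_ARM_*) only have a defined meaning for the machine named in this field.

```python
import struct

# A few e_machine values from the ELF specification (not an exhaustive list).
MACHINES = {3: "EM_386", 40: "EM_ARM", 62: "EM_X86_64", 83: "EM_AVR"}

def elf_machine(header: bytes) -> str:
    """Return the name of the e_machine value stored in an ELF header."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[EI_DATA] (byte 5): 1 = little-endian, 2 = big-endian
    endian = "<" if header[5] == 1 else ">"
    # e_type and e_machine are the two 16-bit fields right after the 16-byte e_ident
    _e_type, e_machine = struct.unpack_from(endian + "HH", header, 16)
    return MACHINES.get(e_machine, f"unknown ({e_machine})")

# Usage: pass the first 20 bytes (or more) of an object file, e.g.
#   with open("foo", "rb") as f:
#       print(elf_machine(f.read(20)))
```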
I'm trying to protect my application against buffer overflow exploits. Among other things, I'm using non-executable stacks and link my binaries with the noexecstack flag (by passing -Wl,-z,noexecstack to gcc).
Everything seems fine - readelf confirms that PT_GNU_STACK specifies correct permissions:
$ readelf -l target | grep -A1 GNU_STACK
GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 RW 10
So does execstack:
$ execstack -q target
- target
There's only one problem. All my stacks are actually executable:
root#170ubuntu16p04-64smp-1:~# cat /proc/12878/task/*/maps | grep stack
7ffcac654000-7ffcac675000 rwxp 00000000 00:00 0 [stack]
7fe540e66000-7fe541666000 rwxp 00000000 00:00 0 [stack]
7fe540665000-7fe540e65000 rwxp 00000000 00:00 0 [stack]
7fe53b800000-7fe53c000000 rwxp 00000000 00:00 0 [stack]
I've trapped allocate_stack calls and examined protection flags. In theory, they should be initialized according to PT_GNU_STACK. But in my case, it seems like PT_GNU_STACK was ignored and _dl_stack_flags was initialized with default permissions.
Does anyone know what could have caused this? Everything seems correct, but stacks are still executable.
I'm using gcc 4.8.3 / glibc 2.11.
Olaf and Employed Russian pushed me in the right direction. A third-party shared object was poisoning my stacks.
But it wasn't linked to my main executable directly. Neither ldd nor lddtree showed any libraries with RWE stacks, so I decided to dig deeper and wrote a script that checks all shared objects currently mapped into the process's memory:
#!/bin/bash
if [ -z "$1" ]; then
echo "Usage: $0 <target>"
exit 1;
fi
kav_pid=`pidof $1`
for so in `cat /proc/$kav_pid/task/*/maps | awk '/.so$/ {print $6}' | sort | uniq`; do
stack_perms=`readelf -Wl $so | awk '/GNU_STACK/ {print $7}'`
if [ -z "$stack_perms" ]; then
echo "$so doesn't have PT_GNU_STACK"
elif [ "$stack_perms" != "RW" ]; then
echo "$so has unexpected permissions: $stack_perms"
fi
done
And it worked! I found a library with RWE permissions:
$ ./find_execstack.sh target
/target/dir/lib64/lib3rdparty.so has unexpected permissions: RWE
To make sure that it was this library that poisoned my stacks, I opened my application with gdb and set a breakpoint in dlopen. And Bingo! Here are the permissions before dlopening lib3rdparty.so:
7ffffffde000-7ffffffff000 rw-p 00000000 00:00 0 [stack]
And here they are right after the dlopen:
7ffffffde000-7ffffffff000 rwxp 00000000 00:00 0 [stack]
As it turned out, lib3rdparty.so was built using a different toolchain and that went unnoticed until now.
Olaf, Employed Russian, thank you!
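For anyone who wants to automate the before/after comparison of the maps entries, here is a minimal Python sketch. The function name executable_stacks is made up for this example; it only assumes the usual whitespace-separated layout of /proc/<pid>/maps lines, with the permissions in the second field and the name in the sixth.

```python
def executable_stacks(maps_text):
    """Return the address ranges of stack mappings that have the x permission."""
    hits = []
    for line in maps_text.splitlines():
        fields = line.split()
        # fields: address range, perms, offset, dev, inode, [name]
        if len(fields) >= 6 and fields[5].startswith("[stack") and "x" in fields[1]:
            hits.append(fields[0])
    return hits

before = "7ffffffde000-7ffffffff000 rw-p 00000000 00:00 0 [stack]"
after = "7ffffffde000-7ffffffff000 rwxp 00000000 00:00 0 [stack]"
print(executable_stacks(before))
print(executable_stacks(after))
```

Feeding it the contents of /proc/<pid>/task/*/maps before and after each dlopen call narrows down which library flips the permissions.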
what could have caused this?
In addition to the main executable's PT_GNU_STACK having correct permissions, you also need to have PT_GNU_STACK with correct permissions in every directly-linked shared library.
If any one of these libraries does not have PT_GNU_STACK at all, or has one with executable permissions, it will "poison" all of your stacks with executable permission.
So run
for j in $(ldd target | grep -o '=> .* ' | sed -e 's/=> //' -e '/^ *$/d' ); do
out=$(readelf -Wl $j | grep STACK)
[[ -z "$out" ]] && echo "missing GNU_STACK in $j"
echo $out | grep -q RWE && echo "executable GNU_STACK in $j"
done
and you will likely see at least one library with missing or executable stack.
P.S. I see that Olaf has already (partially) suggested this.
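The same check can be sketched without readelf by reading the program headers directly. This is a simplified illustration, not a replacement for readelf: it only handles little-endian ELF64 files, the function name gnu_stack_status is hypothetical, and the offsets and constants (PT_GNU_STACK = 0x6474e551, PF_X = 1) are taken from the ELF64 layout and the GNU extensions to it.

```python
import struct

PT_GNU_STACK = 0x6474E551  # GNU program header type for stack permissions
PF_X = 0x1                 # "executable" bit in p_flags

def gnu_stack_status(elf: bytes) -> str:
    """Classify a little-endian ELF64 image as 'missing', 'executable' or 'ok'."""
    if elf[:4] != b"\x7fELF" or elf[4] != 2 or elf[5] != 1:
        raise ValueError("this sketch only handles little-endian ELF64")
    (e_phoff,) = struct.unpack_from("<Q", elf, 32)              # program header table offset
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf, 54)   # entry size and count
    for i in range(e_phnum):
        # In ELF64, p_type and p_flags are the first two 32-bit fields of each entry.
        p_type, p_flags = struct.unpack_from("<II", elf, e_phoff + i * e_phentsize)
        if p_type == PT_GNU_STACK:
            return "executable" if p_flags & PF_X else "ok"
    return "missing"

# Usage: gnu_stack_status(open("/path/to/lib.so", "rb").read())
```

Both "missing" and "executable" correspond to the two cases the loop above reports.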
Using commands like BYTE or LONG, it is possible to include explicit bytes of data in an output section from a linker script. The linked page also describes that those commands can be used to output the value of symbols.
I would have expected that if you perform partial linking (i.e., using the -r option of ld), relocation records would be emitted for the symbols that are output in this way. However, it seems that the linker just outputs the currently known value¹ of the symbol.
Here is a MWE to clarify what I mean.
test.c:
int foo = 1, bar = 2;
test.ld:
SECTIONS {
.data : {
*(.data)
LONG(foo)
LONG(bar)
}
}
Then run the following:
$ gcc -c test.c
$ ld -T test.ld -r -o test.elf test.o
$ readelf -r test.elf
There are no relocations in this file.
$ readelf -x .data test.elf
Hex dump of section '.data':
0x00000000 01000000 02000000 00000000 04000000 ................
As you can see, no relocations are created and the values that are output are the currently known values of foo and bar.
Could this be a bug? If not, is there any way to force the linker to output relocation records for symbols added to an output section?
¹ I'm not sure if this is the correct term. What I mean is the value that you see when you run readelf -s on the input object file.
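To make the hex dump easier to read, this short Python sketch decodes the 16 bytes of .data shown above. It assumes a little-endian target with 4-byte int (as on x86-64); the variable names are only for illustration. The first two words are the initial values of foo and bar copied in by *(.data), and the last two are what LONG(foo) and LONG(bar) emitted, which look like the offsets of the two symbols within .data rather than relocated addresses.

```python
import struct

# Bytes copied from the readelf -x .data output above.
dump = "01000000 02000000 00000000 04000000"
raw = bytes.fromhex(dump.replace(" ", ""))

# Four little-endian 32-bit words (assumption: 4-byte int, little-endian target).
foo_init, bar_init, long_foo, long_bar = struct.unpack("<4I", raw)

print(foo_init, bar_init)   # 1 2  (initial values of foo and bar)
print(long_foo, long_bar)   # 0 4  (values written by LONG(foo) and LONG(bar))
```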
I've recently compiled a simple hello world C program under Debian Linux using gcc:
gcc -mtune=native -march=native -m32 -s -Wunused -O2 -o hello hello.c
The file size was 2980 bytes. I opened it in a hex editor and I saw the following lines:
GCC: (Debian 4.4.5-8) 4.4.5 GCC: (Debian 4.4.5-10) 4.4.5 .shstrtab .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .text .fini .rodata .eh_frame .ctors .dtors .jcr .dynamic .got .got.plt data.data .bss .comment
Are they really needed? Is there no way to reduce the executable size?
Use -Qn to avoid that.
aa$ touch hello.c
aa$ gcc -c hello.c
aa$ objdump -s hello.o
hello.o: file format elf32-i386
Contents of section .comment:
0000 00474343 3a202844 65626961 6e20342e .GCC: (Debian 4.
0010 372e322d 35292034 2e372e32 00 7.2-5) 4.7.2.
aa$ gcc -Qn -c hello.c
aa$ objdump -s hello.o
hello.o: file format elf32-i386
aa$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/i486-linux-gnu/4.7/lto-wrapper
Target: i486-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 4.7.2-5' --with-bugurl=file:///usr/share/doc/gcc-4.7/README.Bugs --enable-languages=c,c++,go,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.7 --enable-shared --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.7 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --enable-plugin --enable-objc-gc --enable-targets=all --with-arch-32=i586 --with-tune=generic --enable-checking=release --build=i486-linux-gnu --host=i486-linux-gnu --target=i486-linux-gnu
Thread model: posix
gcc version 4.7.2 (Debian 4.7.2-5)
aa$
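If you want to see what those bytes actually say, the following Python sketch decodes the objdump -s hex dump of .comment shown above. The helper name comment_strings is made up for this example; the only assumption is that the section holds NUL-separated ASCII strings, which is what the dump suggests.

```python
def comment_strings(raw: bytes):
    """Split a .comment section into its non-empty NUL-separated strings."""
    return [s.decode("ascii") for s in raw.split(b"\x00") if s]

# Hex words copied from the objdump -s output above.
hex_words = "00474343 3a202844 65626961 6e20342e 372e322d 35292034 2e372e32 00"
raw = bytes.fromhex(hex_words.replace(" ", ""))

print(len(raw))              # 29 bytes in total
print(comment_strings(raw))  # ['GCC: (Debian 4.7.2-5) 4.7.2']
```

So for this object file the section costs 29 bytes; a linked executable can carry one such string per distinct compiler version that contributed objects.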
That's in a comment section in the ELF binary. You can strip it out:
$ gcc -m32 -O2 -s -o t t.c
$ ls -l t
-rwxr-xr-x 1 me users 5488 Jun 7 11:58 t
$ readelf -p .comment t
String dump of section '.comment':
[ 0] GCC: (Gentoo 4.5.1-r1 p1.4, pie-0.4.5) 4.5.1
[ 2d] GCC: (Gentoo 4.5.2 p1.1, pie-0.4.5) 4.5.2
$ strip -R .comment t
$ readelf -p .comment t
readelf: Warning: Section '.comment' was not dumped because it does not exist!
$ ls -l t
-rwxr-xr-x 1 me users 5352 Jun 7 11:58 t
The gains are tiny though, not sure it's worth it.
This is in a comment section which isn't loaded in memory (and note that ELF files usually use padding so that memory mapping them will keep a correct alignment). If you want to get rid of such unneeded sections, see the various objcopy options, for example:
objcopy --remove-section .comment a.o b.o
I had the same issue myself, but using MinGW's GCC implementation - stripping the executable and passing the -Qn option did nothing, and I couldn't remove the ".comment" section as there wasn't one.
In order to stop the compiler including this information, regardless of which sections are in your executable, you can pass the -fno-ident parameter to the compiler and linker:
Without the parameter (strings -a [filename]):
!This program cannot be run in DOS mode.
.text
0`.rdata
0#.idata
GCC: (tdm64-2) 4.8.1
With the parameter:
!This program cannot be run in DOS mode.
.text
0`.idata
It appears that you'd be able to 'just' strip that if you don't want it; see this page for a nice run-down.
http://timelessname.com/elfbin/
Note that the page (of course) also resorts to using assembly, which you may not want to do, but the general gist applies.
You can tell the linker which sections to include in your output with a linker script. You can see which sections are included in the file using the objdump command. As you've noticed, there's a good bit of 'junk' in an ELF file: junk, that is, until you wish you had it.
Note, though, that the size of an ELF executable file is not indicative of the memory footprint of the image in memory. A lot of the 'junk' isn't in the memory image; the program can call sbrk and/or mmap to acquire more memory; and the ELF file takes no account of stack usage, so essentially all of your automatic variables are unaccounted for. These are only three examples; others abound.