Why is gcc adding .comment and .note.gnu.property sections? - c

Can anyone explain why gcc adds the .comment and .note.gnu.property sections to the object code, and how I can tell gcc not to add them? Is that possible?
Thanks.

why gcc is adding the .comment and .note.gnu.property sections
You can examine the contents of the .comment section with e.g. readelf -x.comment:
echo "int foo() { return 42; }" | gcc -xc - -c -o foo.o
readelf -x.comment foo.o
Hex dump of section '.comment':
0x00000000 00474343 3a202844 65626961 6e203131 .GCC: (Debian 11
0x00000010 2e322e30 2d313029 2031312e 322e3000 .2.0-10) 11.2.0.
Obviously the "why" is to make it easier to tell which compiler produced the object file.
GCC's -fno-ident (or -Qn) option stops the compiler from emitting this section, and objcopy --remove-section .comment foo.o bar.o will get rid of it in an existing object file.
The .note.gnu.property can be removed in similar fashion.
Here is a discussion of what it may contain.
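To make the hex dump above easier to read: the .comment payload is simply a run of NUL-terminated strings. The following is a minimal sketch in C, not code from GCC or binutils. The function name comment_string and the idea of handing it the raw section bytes in a buffer are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the n-th (0-based) non-empty string stored in a .comment payload,
 * or NULL if there is no such string. `data` and `size` are assumed to be
 * the raw section bytes, e.g. what `readelf -x.comment` dumps. */
const char *comment_string(const unsigned char *data, size_t size, size_t n)
{
    size_t i = 0;

    assert(data != NULL || size == 0);
    while (i < size) {
        /* find the terminating NUL of the string that starts at offset i */
        const unsigned char *end = memchr(data + i, '\0', size - i);
        if (end == NULL)
            return NULL; /* unterminated tail: ignore it */

        size_t len = (size_t)(end - (data + i));
        if (len > 0) {
            if (n == 0)
                return (const char *)(data + i);
            n--;
        }
        i += len + 1; /* skip the string and its NUL */
    }
    return NULL;
}
```

Fed the 32 bytes from the dump above (a leading NUL, the ident string, a trailing NUL), comment_string(data, 32, 0) points at "GCC: (Debian 11.2.0-10) 11.2.0" and index 1 yields NULL.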

Thanks, marked it. objcopy -O binary -R .note -R .comment -R .note.gnu.property foo converts the ELF file to a raw binary; the size went from 129M to 4K.

Related

Why there are no .rel.dyn/.got.plt section in dynamic ELF files?

I have code like this
// test_printf.c
#include <stdio.h>
int f(){
    printf("aaa %d\n", 1);
    return 0;
}
And I compile it with the following command
gcc -shared -fPIC -m32 -g -c -o test_printf.so test_printf.c
I think that if I run readelf -S test_printf.so, I will see .rel.dyn and .rel.plt, because these sections play the same role that .rel.data and .rel.text play in statically linked programs.
For example, in my program printf is an external symbol referenced by test_printf.so. So when I look at test_printf.so's relocation table, there should be an entry named printf. I checked, and the entry exists.
Since printf is an external symbol, its location has to be determined at runtime. We therefore need a .got.plt section for printf, and that section should be present in both a dynamically linked executable and a dynamic library (test_printf.so).
However, when I run readelf -S, there is no .got.plt section, and this confuses me. Is this section not necessary in a dynamic library (test_printf.so)? I don't think that is possible. Suppose test_printf.so is finally linked with an executable a; how can a know where the .got.plt entry for printf is? Is this .got.plt finally generated in a?
I also have a second question: are .rel.dyn and .rel.plt always present if .got and .got.plt are present?
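.rel.dyn and .got.plt can be present in a shared library; whether they are depends on what the functions in the library reference. This is explained in https://www.technovelty.org/linux/plt-and-got-the-key-to-code-sharing-and-dynamic-libraries.html
Note also that the command in the question passes -c, so gcc stops before linking and test_printf.so is a relocatable object rather than a shared library. Sections such as .got.plt and .rel.plt are created by the linker, so they can only appear once the library is actually linked (without -c).
.rela.dyn is present if an extern symbol is referenced: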
$ cat test.c
extern int foo;
int function(void) {
return foo;
}
$ gcc -shared -fPIC -o libtest.so test.c
Also see https://eli.thegreenplace.net/2011/08/25/load-time-relocation-of-shared-libraries/ and
https://eli.thegreenplace.net/2011/11/03/position-independent-code-pic-in-shared-libraries
https://blog.ramdoot.in/how-can-i-link-a-static-library-to-a-dynamic-library-e1f25c8095ef (linking static libraries into a dynamic library)

Can I use a generic ELF linker for an unknown (proprietary) architecture ELF objects?

I have a number of object files in ELF format, with the usual .text and other common sections, and I was wondering if GNU ld or gold could be used to link a number of ELF object files into an ELF executable, even if the architecture (an 8-bit micro with a proprietary toolchain) is not known beforehand by the linker. In essence I'm asking if the linking process is, to some extent, platform independent once you have all the required object files, or if on the contrary I will need to roll my own linker at some point.
No, it won't work.
A major thing the linker has to do is to handle relocations. Relocations are arch-specific:
int f(){return 42;}
$ gcc -c foo.c -o foo && readelf -r foo
Relocation section '.rela.eh_frame' at offset 0x198 contains 1 entry:
Offset Info Type Sym. Value Sym. Name + Addend
000000000020 000200000002 R_X86_64_PC32 0000000000000000 .text + 0
$ gcc -m32 -c foo.c -o foo && readelf -r foo
Relocation section '.rel.text' at offset 0x1d0 contains 2 entries:
Offset Info Type Sym.Value Sym. Name
00000004 00000b02 R_386_PC32 00000000 __x86.get_pc_thunk.ax
00000009 00000c0a R_386_GOTPC 00000000 _GLOBAL_OFFSET_TABLE_
Relocation section '.rel.eh_frame' at offset 0x1e0 contains 2 entries:
Offset Info Type Sym.Value Sym. Name
00000020 00000202 R_386_PC32 00000000 .text
00000040 00000502 R_386_PC32 00000000 .text.__x86.get_pc_thu
$ clang -target arm-linux-gnueabi -c foo.c -o foo && readelf -r foo
Relocation section '.rel.ARM.exidx' at offset 0x104 contains 1 entry:
Offset Info Type Sym.Value Sym. Name
00000000 0000032a R_ARM_PREL31 00000000 .text
Moreover the linker script which says how the ELF file should be generated (page size, start address, etc.) is arch-specific:
ld -m elf_x86_64 --verbose
ld -m elf_i386 --verbose
arm-linux-gnueabi-ld --verbose
If you're not compiling to a static executable, the linker has to generate PLT entries as well, which are native code (and thus arch-specific).
Some architectures have arch-specific sections as well (e.g. .ARM.extab, .ARM.exidx).
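The Info column in the readelf output above already hints at this: even the way the symbol index and relocation type are packed differs between ELF32 and ELF64, and the type number only has a meaning relative to the machine. Here is a minimal sketch in C, assuming hypothetical helper names (split_info, reloc_name, the MACH_* constants); the numeric values are taken from the ELF psABI documents, and the table is limited to the relocations that appear in the dumps above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* e_machine values for the three targets shown above (names are ours). */
enum { MACH_386 = 3, MACH_ARM = 40, MACH_X86_64 = 62 };

struct reloc_info { uint32_t sym; uint32_t type; };

/* Split r_info into symbol index and relocation type.
 * ELF64 packs (sym << 32) | type, ELF32 packs (sym << 8) | type. */
struct reloc_info split_info(uint64_t info, int is_elf64)
{
    struct reloc_info r;

    if (is_elf64) {
        r.sym = (uint32_t)(info >> 32);
        r.type = (uint32_t)(info & 0xffffffffu);
    } else {
        assert(info <= UINT32_MAX); /* a 32-bit r_info must fit in 32 bits */
        r.sym = (uint32_t)(info >> 8);
        r.type = (uint32_t)(info & 0xff);
    }
    return r;
}

/* Name a relocation type; the same number means different things on
 * different machines. Returns NULL for anything not in this tiny table. */
const char *reloc_name(unsigned machine, uint32_t type)
{
    switch (machine) {
    case MACH_X86_64:
        return type == 2 ? "R_X86_64_PC32" : NULL;
    case MACH_386:
        if (type == 2) return "R_386_PC32";
        if (type == 10) return "R_386_GOTPC";
        return NULL;
    case MACH_ARM:
        return type == 42 ? "R_ARM_PREL31" : NULL;
    default:
        return NULL; /* unknown machine: a generic linker is stuck here */
    }
}
```

For example, split_info(0x000200000002, 1) gives symbol 2 and type 2 (R_X86_64_PC32), while split_info(0x00000c0a, 0) gives symbol 12 and type 10 (R_386_GOTPC). A linker that does not know the machine falls into the default branch and has no rule for patching the code.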

Output relocatable section data from linker script

Using commands like BYTE or LONG, it is possible to include explicit bytes of data in an output section from a linker script. The linked page also describes that those commands can be used to output the value of symbols.
I would have expected that if you perform partial linking (i.e., using the -r option of ld), relocation records would be emitted for the symbols that are output in this way. However, it seems that the linker just outputs the currently known value [1] of the symbol.
Here is a MWE to clarify what I mean.
test.c:
int foo = 1, bar = 2;
test.ld:
SECTIONS {
.data : {
*(.data)
LONG(foo)
LONG(bar)
}
}
Then run the following:
$ gcc -c test.c
$ ld -T test.ld -r -o test.elf test.o
$ readelf -r test.elf
There are no relocations in this file.
$ readelf -x .data test.elf
Hex dump of section '.data':
0x00000000 01000000 02000000 00000000 04000000 ................
As you can see, no relocations are created, and the values that are output are the currently known values of foo and bar.
Could this be a bug? If not, is there any way to force the linker to output relocation records for symbols added to an output section?
[1] I'm not sure if this is the correct term. What I mean is the value that you see when you run readelf -s on the input object file.
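To read the hex dump above word by word, here is a minimal sketch in C that decodes little-endian 32-bit values from a byte buffer. The helper name read_le32 is hypothetical, and the interpretation in the note below is an assumption based on the question's description, not on ld documentation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit word at `offset` in a section dump. */
uint32_t read_le32(const unsigned char *buf, size_t size, size_t offset)
{
    assert(buf != NULL && offset + 4 <= size); /* stay inside the dump */
    return (uint32_t)buf[offset]
         | (uint32_t)buf[offset + 1] << 8
         | (uint32_t)buf[offset + 2] << 16
         | (uint32_t)buf[offset + 3] << 24;
}
```

Applied to the 16 bytes of .data shown above, the words at offsets 0 and 4 are 1 and 2 (the initial values of foo and bar), and the words at offsets 8 and 12 are 0 and 4. The latter two presumably are the section-relative symbol values of foo and bar, which is consistent with the linker emitting the value that readelf -s reports rather than a relocation.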

How to make objdump show assembly of sections only appeared in source code?

I would like to produce assembly like the one in the answer to the question "Using GCC to produce readable assembly?"
for some simple test code, test.c:
#include <stdio.h>
int main(void){
    int i;
    for(i=0;i<10;i++){
        printf("%d\n",i);
    }
    return 0;
}
gcc command : gcc -g test.c -o test.o
objdump command: objdump -d -M intel -S test.o
But what I get is a disassembly that starts with the .init section
080482bc<_init>: and ends with the .fini section 080484cc<_fini>
which I do not want to be shown.
Why is this happening, and how can I avoid showing sections that are not in the source file?
Right now you're creating an executable file and not an object file. The executable file of course contains a lot of extra sections.
If you want to create an object file, use the -c flag to GCC.
You can specify sections using -j option.
So objdump -d executable -j .text -j .plt will only show disassembly from .text and .plt sections.

Why does the compiler version appear in my ELF executable?

I've recently compiled a simple hello world C program under Debian Linux using gcc:
gcc -mtune=native -march=native -m32 -s -Wunused -O2 -o hello hello.c
The file size was 2980 bytes. I opened it in a hex editor and i saw the following lines:
GCC: (Debian 4.4.5-8) 4.4.5 GCC: (Debian 4.4.5-10) 4.4.5 .shstrtab .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .text .fini .rodata .eh_frame .ctors .dtors .jcr .dynamic .got .got.plt data.data .bss .comment
Are they really needed? No way to reduce executable size?
Use -Qn to avoid that.
aa$ touch hello.c
aa$ gcc -c hello.c
aa$ objdump -s hello.o
hello.o: file format elf32-i386
Contents of section .comment:
0000 00474343 3a202844 65626961 6e20342e .GCC: (Debian 4.
0010 372e322d 35292034 2e372e32 00 7.2-5) 4.7.2.
aa$ gcc -Qn -c hello.c
aa$ objdump -s hello.o
hello.o: file format elf32-i386
aa$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/i486-linux-gnu/4.7/lto-wrapper
Target: i486-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 4.7.2-5' --with-bugurl=file:///usr/share/doc/gcc-4.7/README.Bugs --enable-languages=c,c++,go,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.7 --enable-shared --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.7 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --enable-plugin --enable-objc-gc --enable-targets=all --with-arch-32=i586 --with-tune=generic --enable-checking=release --build=i486-linux-gnu --host=i486-linux-gnu --target=i486-linux-gnu
Thread model: posix
gcc version 4.7.2 (Debian 4.7.2-5)
aa$
That's in a comment section in the ELF binary. You can strip it out:
$ gcc -m32 -O2 -s -o t t.c
$ ls -l t
-rwxr-xr-x 1 me users 5488 Jun 7 11:58 t
$ readelf -p .comment t
String dump of section '.comment':
[ 0] GCC: (Gentoo 4.5.1-r1 p1.4, pie-0.4.5) 4.5.1
[ 2d] GCC: (Gentoo 4.5.2 p1.1, pie-0.4.5) 4.5.2
$ strip -R .comment t
$ readelf -p .comment t
readelf: Warning: Section '.comment' was not dumped because it does not exist!
$ ls -l t
-rwxr-xr-x 1 me users 5352 Jun 7 11:58 t
The gains are tiny though, not sure it's worth it.
This is in a comment section which isn't loaded in memory (and note that ELF files usually use padding so that memory mapping them will keep a correct alignment). If you want to get rid of such unneeded sections, see the various objcopy options and find out:
objcopy --remove-section .comment a.o b.o
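Whether a section ends up in the memory image is decided by the SHF_ALLOC bit in its sh_flags. As a rough illustration, here is a minimal sketch in C; the struct, the function names and the FLAG_* constants are hypothetical stand-ins for the real definitions in elf.h, and the section sizes in the usage note are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* sh_flags bits from the ELF specification (local names, same values). */
enum {
    FLAG_WRITE = 0x1,
    FLAG_ALLOC = 0x2,      /* section occupies memory at run time */
    FLAG_EXECINSTR = 0x4,
    FLAG_MERGE = 0x10,
    FLAG_STRINGS = 0x20
};

/* Simplified stand-in for a section header entry. */
struct section {
    const char *name;
    uint64_t flags;
    uint64_t size;
};

/* A section is part of the loaded image only if SHF_ALLOC is set. */
int section_is_loaded(const struct section *s)
{
    assert(s != NULL);
    return (s->flags & FLAG_ALLOC) != 0;
}

/* Sum the sizes of the sections that would be loaded into memory. */
uint64_t loaded_bytes(const struct section *secs, size_t count)
{
    uint64_t total = 0;

    assert(secs != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        if (section_is_loaded(&secs[i]))
            total += secs[i].size;
    return total;
}
```

With .text (flags AX, 100 bytes), .data (flags WA, 8 bytes) and .comment (flags MS, 45 bytes), loaded_bytes returns 108: the .comment bytes take up space in the file but never reach memory.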
I had the same issue myself, but using MinGW's GCC implementation - stripping the executable and passing the -Qn option did nothing, and I couldn't remove the ".comment" section as there wasn't one.
In order to stop the compiler including this information, regardless of which sections are in your executable, you can pass the -fno-ident parameter to the compiler and linker:
Without the parameter (strings -a [filename]):
!This program cannot be run in DOS mode.
.text
0`.rdata
0#.idata
GCC: (tdm64-2) 4.8.1
With the parameter:
!This program cannot be run in DOS mode.
.text
0`.idata
It appears that you'd be able to 'just' strip that if you don't want it; See this page for a nice run-down.
http://timelessname.com/elfbin/
Note that the page (of course) also resorts to using assembly, which you may not want to do, but the general gist applies
You can tell the linker which sections to include in your output with a linker script. You can see which sections are included in the file using the objdump command. As you've noticed, there's a good bit of 'junk' in an ELF file: junk, that is, until you wish you had it.
Note, though, that the size of an ELF executable file is not indicative of the memory footprint of the image once it is loaded. A lot of the 'junk' isn't in the memory image, the image can call sbrk and/or mmap to acquire more memory, and the ELF file takes no account of stack usage, so essentially all of your automatic variables are unaccounted for. These are only three examples; others abound.
