GNU LD filling unused space - linker

I'm trying to understand the behaviour of the GNU linker and how sections are treated.
I'm editing the stm32_flash.ld file in this stm32 project.
When I modify the linker script to put the following as the first section:
.my_test :
{
. = ALIGN(4);
KEEP(*(.my_test))
LONG(0xdeadbeef);
. = ALIGN(4);
} >FLASH
I can see the built binary has the 0xdeadbeef as the first bytes, as I would expect.
$ od -An -tx1 -w1 -v build/program.bin | head
ef
be
ad
de
00
a0
00
20
31
5e
However, if I use the following as the first section:
.my_test :
{
. = ALIGN(8);
KEEP(*(.my_test))
FILL(0xDEADBEEF)
. = 0x8000;
} > FLASH
Then it looks like the linker completely skips this section:
$ od -An -tx1 -w1 -v build/program.bin | head
00
a0
00
20
2d
de
00
08
c1
d9
But I would expect the first 0x8000 bytes to be filled with 0xdeadbeef. Why is the linker ignoring my section?

Does any of your code "emit" to section .my_test?
With your first example, there was a command to "emit" four bytes into that section with the LONG command.
With your second example, there is no explicit "emit" command. FILL only takes effect when it knows how much to fill: nothing emitted, nothing filled.
You either need to place some code or data into your .my_test section using __attribute__((__section__(".my_test"))), or simply edit your second example to use a LONG again:
.my_test :
{
. = ALIGN(8);
KEEP(*(.my_test))
FILL(0xDEADBEEF)
. = 0x7FFC; /* Note shorter to accommodate following LONG */
LONG(0xBEEFDEAD) /* Deliberately reversed to demonstrate the result */
} > FLASH
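If the section should be populated from C rather than with LONG, a minimal sketch of the attribute approach might look like this. The variable name `my_test_marker` is hypothetical, and `used` is added on the assumption that nothing else references the object, so that the compiler does not discard it.

```c
#include <stdint.h>

/* Hypothetical marker word placed in the .my_test input section.
 * "used" keeps the compiler from dropping the unreferenced object;
 * KEEP(*(.my_test)) in the linker script keeps the linker from
 * garbage-collecting it. */
__attribute__((used, section(".my_test")))
const uint32_t my_test_marker = 0xDEADBEEFu;
```

With at least one input object contributing to .my_test, the output section is no longer empty, so the FILL pattern should be able to pad the gap up to the `. = 0x8000` assignment.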

Related

Can't strip away attributes and some symbols from elf

I am designing a risc-v processor and am using gcc to write some test programs for it.
I see these symbols in the elf file which don't seem to be really needed for the program execution, but I can't seem to be able to strip them away.
Here is my simple program:
// file: loop_c.c
int a = 0;
void _start() {
for (int i = 0; i < 10; ++i) {
a += 20;
}
}
I am compiling this into elf as follows:
riscv32-unknown-elf-gcc -static -nostdlib -T riscv32i.ld loop_c.c -o loop_c.elf
When I look into the hex contents of loop_c.elf, I see the following bits which I can't seem to be able to remove:
strip removes a few of them, but not all. I used the following command:
riscv32-unknown-elf-strip --strip-unneeded loop_c.elf
Is there any way to remove these bits completely and just set them to 0?
EDIT: Compiler version:
riscv32-unknown-elf-gcc (GCC) 11.1.0
EDIT2: Is there a name for the parts highlighted in the image above?
EDIT3: Okay, made some progress. The names of the sections that strip doesn't remove automatically are (in the second image above):
.shstrtab
.riscv.attributes
.sbss
I could remove the second and third ones with the following command:
riscv32-unknown-elf-strip -R .riscv.attributes loop_c.elf
riscv32-unknown-elf-strip -R .sbss loop_c.elf
But searching online, it seems that it's very hard or impossible to remove .shstrtab. I'm not sure why, but it seems that it's necessary for some reason.
EDIT4: My linker script. This is very bare bones for my CPU design. Obviously nothing close to what is used in the real world:
OUTPUT_FORMAT("elf32-littleriscv", "elf32-littleriscv", "elf32-littleriscv")
ENTRY(_start)
MEMORY
{
INST (rx) : ORIGIN = 0x1000, LENGTH = 0x1000 /* 4096 bytes or 1024 instructions max, 1 instruction = 4 bytes. */
DATA (rwx) : ORIGIN = 0x2000, LENGTH = 0x1000 /* 4096 bytes or 1024 words of data. 1 word = 4 bytes. */
}
SECTIONS
{
.text :
{
*(.text)
}> INST
.data :
{
*(.data)
}> DATA
}

Link an ELF binary with a c program

Given only access to a standalone ELF program I want to be able to call a function within the program from my own program.
Let's say the below code is main.c
#include <stdio.h>
extern int mystery(int a,int b);
int main() {
int a = 0;
int b = 1;
printf("mystery(a,b) = %d\n",mystery(a,b));
return 0;
}
The function mystery exists in some elf file not_my_program.
What I'm trying to do is something along the lines of
gcc main.c not_my_program
However this gives me an undefined reference error to mystery. I've looked for methods on forums and found that converting this elf file into a shared object file is not possible. I've also looked into compiling main.c into a relocatable object file with
gcc -c main.c
and then using ld to link the elf with main.o but I could not figure out how to do it. The elf is 32 bit but I've omitted the -m32 flag. If the flag is different for ld please let me know. Any help would be very much appreciated.
edit:
output of readelf -h not_my_program
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: DYN (Shared object file)
Machine: Intel 80386
Version: 0x1
Entry point address: 0x10e0
Start of program headers: 52 (bytes into file)
Start of section headers: 15116 (bytes into file)
Flags: 0x0
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 11
Size of section headers: 40 (bytes)
Number of section headers: 30
Section header string table index: 29
This hacky way worked with a very simple case.
[ aquila ~ ] $ cat 1.c
int func (int a) { return a * (a-1) ; }
int main(int argc) { return func (argc) ; }
[ aquila ~ ] $ cc 1.c
[ aquila ~ ] $ ./a.out ; echo $?
0
[ aquila ~ ] $ readelf -s a.out | grep func
43: 0000000000400487 19 FUNC GLOBAL DEFAULT 11 func
[ aquila ~ ] $ cat 2.c
#include <stdlib.h>
static __attribute__((constructor)) void main() {
int (*func)() = (int (*)())0x0000000000400487;
exit(func(3));
}
[ aquila ~ ] $ cc -fPIC -shared 2.c -o a.so
[ aquila ~ ] $ LD_PRELOAD=./a.so ./a.out ; echo $?
6
The caller in 2.c is made into a constructor with an exit so that the main program's main() is not called, in an attempt to limit the execution of the code other than the caller and func() itself. The return value being 6 instead of 0 shows both that the call worked and that the main program's main() did not get called.
Given only access to a standalone ELF program I want to be able to call a function within the program from my own program
It sounds like you have an XY problem.
While what you desire is technically possible, the difficulty of doing this is approximately 1000x of what you have tried so far. If you are not prepared to spend a month or two getting this working, you should look for other solutions.
Effectively you would have to write a custom ELF loader to load not_my_program into memory and initialize it, but then call mystery instead of main in it.
Note also that mystery may depend on global data, and that data may be initialized in main, so there is no guarantee that mystery will work at all when called before main.
P.S. Would it be sufficient to call mystery from a debugger? That can be achieved in under 30 seconds.
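The initialization caveat can be illustrated with a small sketch. Everything here is hypothetical: the real body of `mystery` is unknown, and `setup` merely stands in for whatever initialization the original program's main might perform.

```c
/* Global state that the function depends on; zero until setup() runs. */
static int scale;

/* Stand-in for initialization that the original main() might do. */
static void setup(void)
{
    scale = 3;
}

/* Hypothetical body: the result depends on the global above. */
static int mystery(int a, int b)
{
    return (a + b) * scale;
}
```

Calling `mystery(0, 1)` before `setup()` yields 0 here, while the same call afterwards yields 3, so a function invoked without the program's normal startup path may silently return different results.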

Some STM32 binaries cannot run because the first 8 bytes (isr_vector and entry point) of the binary are incorrect, and I don't know how to solve it

When I compile STM32 projects (using CMake), some large projects cannot run, but small ones can (with exactly the same configuration).
After a day of debugging, I believe the problem is in the first two words of the binary (the isr_vector address and the entry point address are incorrect).
Description:
At first I had one bootloader (11k binary) and one project (130k); both run perfectly when built with Keil 5 (Windows). I am now moving to Linux: I rebuilt the project with CMake, replaced the startup .s/.asm file with one for GNU C, and gave gcc an ldscript (sections and memory). Now the bootloader can run from 0x8000000, but the project cannot run from either 0x8000000 or 0x80080000 (I did change the ISR, ldscript and J-Link script to match the address).
Problem:
When compiling the bootloader (or anything that can run), the first two words of the binary look like:
00 00 01 20 0d 1c 00 08 //0x20010000 & 0x08001c0d
which means isr_vector is placed in RAM and the program's main is at 0x08001c0d. When debugging, the program starts running at 0x08001c0d.
But when I compile the project, the first words look like:
5c 84 02 08 00 00 00 20 //0x0800845c & 0x20000000
And when I debug using Ozone, the program starts running at 0x20000000 and nothing behaves sensibly.
Configure:
Toolchain File:
set(CMAKE_SYSTEM_NAME Generic)
set(CMAKE_SYSTEM_PROCESSOR arm)
SET(CMAKE_SYSTEM_VERSION 1)
set(CMAKE_CROSSCOMPILING 1)
set(CMAKE_C_COMPILER_WORKS 1)
set(CMAKE_CXX_COMPILER "arm-none-eabi-g++")
set(CMAKE_C_COMPILER "arm-none-eabi-gcc")
set(CMAKE_ASM_COMPILER "arm-none-eabi-gcc")
add_definitions(-DCMAKE_BUILD_TYPE=Release)
set(COMMON_FLAGS "-DSTM32F10X_HD -DUSE_STDPERIPH_DRIVER -DHSE_VALUE=8000000 --specs=nosys.specs -mfloat-abi=soft -MMD -mcpu=cortex-m3 -mthumb -mthumb-interwork -Wall")
Ldscript: the same for both (I just want to run without the bootloader first, at 0x8000000)
MEMORY
{
RAM (xrw) : ORIGIN = 0x20000000, LENGTH = 64K
FLASH (rx) : ORIGIN = 0x08000000, LENGTH = 512K
CCMRAM (xrw) : ORIGIN = 0x00000000, LENGTH = 0
}
__stack = ORIGIN(RAM) + LENGTH(RAM);
_estack = __stack; /* STM specific definition */
__Main_Stack_Size = 1024 ;
PROVIDE ( _Main_Stack_Size = __Main_Stack_Size ) ;
__Main_Stack_Limit = __stack - __Main_Stack_Size ;
PROVIDE ( _Main_Stack_Limit = __Main_Stack_Limit ) ;
_Minimum_Stack_Size = 256 ;
PROVIDE ( _Heap_Begin = _end_noinit ) ;
PROVIDE ( _Heap_Limit = __stack - __Main_Stack_Size ) ;
ENTRY(_start)
SECTIONS
{
.isr_vector : ALIGN(4)
{
FILL(0xFF)
__vectors_start = ABSOLUTE(.) ;
__vectors_start__ = ABSOLUTE(.) ; /* STM specific definition */
KEEP(*(.isr_vector)) /* Interrupt vectors */
KEEP(*(.cfmconfig)) /* Freescale configuration words */
*(.after_vectors .after_vectors.*) /* Startup code and ISR */
} >FLASH
.inits : ALIGN(4)
{ .........bulabulabula}.....
The point is that I do place the isr_vector at the beginning of flash, but the binary says otherwise.
Environment:
System : Ubuntu14.04.1(4.10.0)
Toolchain : arm-none-eabi-gcc 6.3.1
Debugger : Jlink 6.14b + Jlink Debugger 9.20
IDE : Clion 2017
Other : Cmake 3.8, uCos II,my ldscript, OZone.
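For reference, on a Cortex-M3 the first word of the image is the initial stack pointer and the second is the reset handler address (with the Thumb bit set), so 0x20010000 in the working binary is the top of the 64K RAM rather than a vector table location. The following is a minimal sketch, not part of the original project, of a host-side check on those first 8 bytes. The helper names `read_le32` and `vector_header_plausible` are hypothetical, and the memory ranges are copied from the MEMORY block above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Memory map copied from the MEMORY block in the linker script above. */
#define RAM_ORIGIN   0x20000000u
#define RAM_LENGTH   (64u * 1024u)
#define FLASH_ORIGIN 0x08000000u
#define FLASH_LENGTH (512u * 1024u)

/* Assemble a little-endian 32-bit word from four bytes. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: returns true when the first 8 bytes of an image
 * look like a Cortex-M vector table start, i.e. word 0 is a stack
 * pointer inside RAM (the top of RAM is allowed) and word 1 is a
 * Thumb address inside flash. */
static bool vector_header_plausible(const uint8_t image[8])
{
    uint32_t initial_sp = read_le32(image);
    uint32_t reset_vec  = read_le32(image + 4);

    bool sp_in_ram = initial_sp > RAM_ORIGIN &&
                     initial_sp <= RAM_ORIGIN + RAM_LENGTH;
    bool reset_in_flash = reset_vec >= FLASH_ORIGIN &&
                          reset_vec < FLASH_ORIGIN + FLASH_LENGTH;
    bool thumb_bit = (reset_vec & 1u) != 0;

    return sp_in_ram && reset_in_flash && thumb_bit;
}
```

With the bytes from the question, the bootloader header (00 00 01 20 0d 1c 00 08) passes this check and the project header (5c 84 02 08 00 00 00 20) fails it, since its first word is a flash address rather than a RAM address. That would be consistent with something other than the vector table ending up first in the image.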

How to place the kernel_entry function at the beginning of the text section

My kernel's text section starts at address 0x80100000 and the kernel_entry function is at address 0x80585f70. I want kernel_entry to be placed at the beginning of the text section, at address 0x80100000.
Starting address of text section
$ head -n 10 ../../../System.map
80100000 A _text
80100400 T __kernel_entry
80100400 T _stext
Entry point address in initrd image
$ readelf -h vmlinuz-initrd
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: MIPS R3000
Version: 0x1
Entry point address: 0x80585f70
Start of program headers: 52 (bytes into file)
Start of section headers: 63759852 (bytes into file)
Flags: 0x50001001, noreorder, o32, mips32
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 2
Size of section headers: 40 (bytes)
Number of section headers: 25
Section header string table index: 22
$
I tried to solve the problem using linux linker script(vmlinux.lds) by adding head.o module at the beginning of text section.
vmlinux.lds linux linker script
SECTIONS
{
. = 0x80100000;
/* read-only */
_text = .; /* Text and read-only data */
.text : {
. = ALIGN(8); head.o(.ref.text)
. = ALIGN(8); *(.text.hot) *(.text) *(.ref.text) *(.devinit.text) *(.devexit.text) *(.cpuinit.text) *(.cpuexit.text) *(.text.unlikely)
. = ALIGN(8); __sched_text_start = .; *(.sched.text) __sched_text_end = .;
. = ALIGN(8); __lock_text_start = .; *(.spinlock.text) __lock_text_end = .;
. = ALIGN(8); __kprobes_text_start = .; *(.kprobes.text) __kprobes_text_end = .;
kernel_entry function format .ref.text
$ objdump -t head.o
head.o: file format elf32-little
SYMBOL TABLE:
00000000 l d .text 00000000 .text
00000000 l d .data 00000000 .data
00000000 l d .bss 00000000 .bss
00000000 l d .ref.text 00000000 .ref.text
00000000 l d .cpuinit.text 00000000 .cpuinit.text
00000000 l d .reginfo 00000000 .reginfo
00000000 l d .pdr 00000000 .pdr
00000400 g O .text 00000000 _stext
00000400 g F .text 00000000 __kernel_entry
00000000 g F .ref.text 000000c8 kernel_entry
But I am unable to change this, because vmlinux.lds is generated automatically from the vmlinux.lds.S file.
I tried putting the same line, ". = ALIGN(8); head.o(.ref.text)",
into vmlinux.lds.S, but head.o is not found while building the kernel.
How can I resolve this problem?
First of all, if you insert a symbol into a section, then all subsequent symbols in the section are shifted. If some code relies on the assumption that there is a certain object located at .text + hardcoded offset then the code will be broken.
If you decided to do that anyway, then you need to be able to edit the vmlinux.lds.S file or at least patch the generated vmlinux.lds before the linking stage. The vmlinux.lds.S is essentially a linker script + a bunch of C macros, so the syntax is pretty similar.
The basic idea is to put the kernel_entry() to a separate section called .kernel_entry and add a record to the vmlinux.lds.S that puts .kernel_entry to the .text before all other sections. The separate section is needed to ensure there is only one symbol in the section.
How to do that:
Put the kernel_entry() function to its own section. You may do that in the source code with a GCC attribute like this:
__attribute__((section(".kernel_entry")))
void kernel_entry(void)
{
/* function body */
}
Put the .kernel_entry section at the beginning of the .text section in vmlinux.lds.S.
Each architecture has its own vmlinux.lds.S, so the exact contents of the file may be different, but in general, you may find a .text section defined in vmlinux.lds.S like this:
.text : {
TEXT_TEXT
SCHED_TEXT
LOCK_TEXT
KPROBES_TEXT
IRQENTRY_TEXT
*(.text.*)
} :text = 0
The TEXT_TEXT, SCHED_TEXT and so on are actually C macros that expand to some linker script commands. You may use the regular syntax and add *(.kernel_entry) to the beginning of the section, like this:
.text : {
*(.kernel_entry)
TEXT_TEXT
SCHED_TEXT
LOCK_TEXT
KPROBES_TEXT
IRQENTRY_TEXT
*(.text.*)
} :text = 0
That is it. The linker will put .kernel_entry (containing kernel_entry()) at the beginning of .text, which seems to be what you want.
Good luck!

How to convert from binary to relocatable object file and back?

I wish to inject an object file into an existing binary. The method I am attempting is:
Convert a compiled binary into a relocatable object file.
Use gcc/ld to link the relocatable object file with the object file to be embedded.
Given the source:
#include <stdlib.h>
#include <stdio.h>
int main(void)
{
puts("main");
return EXIT_SUCCESS;
}
I compile this to host with the following:
gcc -Wall host.c -o host
I do the conversion to relocatable object file with:
objcopy -B i386 -I binary -O elf64-x86-64 host host.o
I then attempt a link with:
gcc host.o -o host
Ideally, this would relink the relocatable object file back to a binary. This would also give a chance to link in any extra object files. Unfortunately the command gives the following error:
/usr/lib/gcc/x86_64-linux-gnu/4.6.1/../../../x86_64-linux-gnu/crt1.o: In function `_start':
(.text+0x20): undefined reference to `main'
collect2: ld returned 1 exit status
My question is why is this error appearing and how would I go about properly relinking?
Something I tried was to link in another object file at this point which contained a dummy main (because I figured I could manually patch up the entry point later anyway), but what happened was that the new binary seemed to relocate the old code in a weird way with the symbol table completely messed up.
Extra Information
readelf on the binary yields the following:
mike#mike-ubuntu:~/Desktop/inject-obj$ readelf -h host
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x400410
Start of program headers: 64 (bytes into file)
Start of section headers: 4424 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 9
Size of section headers: 64 (bytes)
Number of section headers: 30
Section header string table index: 27
And on the relocatable object file:
mike#mike-ubuntu:~/Desktop/inject-obj$ readelf -h host.o
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: REL (Relocatable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x0
Start of program headers: 0 (bytes into file)
Start of section headers: 8480 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 0 (bytes)
Number of program headers: 0
Size of section headers: 64 (bytes)
Number of section headers: 5
Section header string table index: 2
Rationale
For those interested, the rationale can be found here.
An executable file that is not PIE is impossible to make relocatable. Relocations have already been performed and the record of those relocations was thrown away. That is, relocating it would require finding all addresses of objects or functions inside the code and data of the binary, but it's impossible to determine whether a sequence of bytes is an address or some other sort of data or code.
There should be a way to do what you originally wanted to do (adding in new code), but the approach you're taking is doomed to failure.
