Can't strip away attributes and some symbols from elf - c

I am designing a RISC-V processor and am using gcc to write some test programs for it.
I see some symbols in the ELF file that don't seem to be needed for program execution, but I can't strip them away.
Here is my simple program:
// file: loop_c.c
int a = 0;
void _start() {
    for (int i = 0; i < 10; ++i) {
        a += 20;
    }
}
I am compiling this into elf as follows:
riscv32-unknown-elf-gcc -static -nostdlib -T riscv32i.ld loop_c.c -o loop_c.elf
When I look at the hex contents of loop_c.elf, I see the following bits, which I can't remove:
strip removes a few of them, but not all. I used the following command:
riscv32-unknown-elf-strip --strip-unneeded loop_c.elf
Is there any way to remove these bits completely and just set them to 0?
EDIT: Compiler version:
riscv32-unknown-elf-gcc (GCC) 11.1.0
EDIT2: Is there a name for the parts highlighted in the image above?
EDIT3: Okay, made some progress. The names of the sections that strip doesn't remove automatically are (in the second image above):
.shstrtab
.riscv.attributes
.sbss
I could remove the second and third ones with the following commands:
riscv32-unknown-elf-strip -R .riscv.attributes loop_c.elf
riscv32-unknown-elf-strip -R .sbss loop_c.elf
But searching online, it seems that it's very hard or impossible to remove .shstrtab. I'm not sure why, but it seems that it's necessary for some reason.
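One likely reason .shstrtab stays: in the ELF format, a section header does not store its name directly. It stores an offset (sh_name) into the section-name string table, and the ELF header field e_shstrndx says which section holds that table. As long as any section headers remain, tools need .shstrtab to name them. The following is a minimal sketch of that lookup for a little-endian ELF32 image held in memory; the helper names are made up for illustration and there are no bounds checks.

```c
#include <stdint.h>

/* Standard ELF32 field offsets; only the fields this sketch needs. */
enum {
    EH_SHOFF     = 32, /* e_shoff: file offset of the section header table */
    EH_SHENTSIZE = 46, /* e_shentsize: size of one section header */
    EH_SHSTRNDX  = 50, /* e_shstrndx: index of the .shstrtab header */
    SH_NAME      = 0,  /* sh_name: offset of the name inside .shstrtab */
    SH_OFFSET    = 16  /* sh_offset: file offset of the section contents */
};

/* Little-endian readers, so the sketch does not depend on host byte order. */
static uint16_t rd16(const unsigned char *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const unsigned char *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: returns the name of section `index`.
   Assumes `elf` points at a complete, well-formed ELF32 image. */
const char *section_name(const unsigned char *elf, unsigned index) {
    uint32_t shoff = rd32(elf + EH_SHOFF);
    uint16_t shentsize = rd16(elf + EH_SHENTSIZE);
    uint16_t shstrndx = rd16(elf + EH_SHSTRNDX);

    /* Header of the string table section, then header of the wanted section. */
    const unsigned char *strhdr = elf + shoff + (uint32_t)shstrndx * shentsize;
    const unsigned char *hdr = elf + shoff + (uint32_t)index * shentsize;

    /* Name = start of .shstrtab contents + sh_name of the wanted section. */
    return (const char *)elf + rd32(strhdr + SH_OFFSET) + rd32(hdr + SH_NAME);
}
```

Without the string table, the sh_name offsets in every remaining section header would point at nothing, which is presumably why strip keeps it whenever section headers exist.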
EDIT4: My linker script. This is very bare bones for my CPU design. Obviously nothing close to what is used in the real world:
OUTPUT_FORMAT("elf32-littleriscv", "elf32-littleriscv", "elf32-littleriscv")
ENTRY(_start)
MEMORY
{
    INST (rx) : ORIGIN = 0x1000, LENGTH = 0x1000 /* 4096 bytes or 1024 instructions max, 1 instruction = 4 bytes. */
    DATA (rwx) : ORIGIN = 0x2000, LENGTH = 0x1000 /* 4096 bytes or 1024 words of data. 1 word = 4 bytes. */
}
SECTIONS
{
    .text :
    {
        *(.text)
    }> INST
    .data :
    {
        *(.data)
    }> DATA
}

Related

In arm assembly, how can I create an array then increment each element by 10, for example?

I would like to modify and complete an example found in my textbook (Harris-Harris). How can I make a program that declares an array of 5 elements for example and then increments each element by 10? This program must also print the elements of the array.
I've searched some resources and found that there are various ways to create an array in ARM assembly. The examples I found, however, use directives that I don't understand (for example .word or .skip) and that are not explained in my textbook.
Hope this helps. This example uses the stack to allocate the array.
See assembly memory allocation directives for more help on allocating global variables. Moving the variable scores into global scope of the file test.c below can reveal the assembly code for that sort of solution.
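For comparison, the global-scope variant mentioned above could look roughly like this. It is only a sketch: the function name add_ten_to_scores and the SCORE_COUNT macro are made up for illustration. A zero-initialized file-scope array is typically placed in .bss (or COMMON with older GCC defaults) instead of on the stack.

```c
#include <stddef.h>

#define SCORE_COUNT 5

/* File-scope array: implicitly zero-initialized, so no stack space is
   reserved for it inside any function. */
int scores[SCORE_COUNT];

/* Hypothetical helper (not from the article): add 10 to every element. */
void add_ten_to_scores(void) {
    for (size_t i = 0; i < SCORE_COUNT; i++) {
        scores[i] += 10;
    }
}
```

Compiling such a file with -S should show the array as a named symbol in the data sections rather than as an offset from sp.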
What follows comes from a quick walk-through of "Hello world for bare metal ARM using QEMU".
These steps add 10 to each of 5 array integers and print their values. The code is in C, but instructions are provided on how to display it as assembly so you can study how the array is allocated and used.
Here is the expected output:
SCORE 10
SCORE 10
SCORE 10
SCORE 10
SCORE 10
ctrl-a x
QEMU: Terminated
These are the commands. Note that startup.s is assembly code, test.c is C code like the left side of your post, and test.ld is the linker script used to build test.elf, from which objcopy creates the test.bin image that QEMU executes. Also note that test.elf carries debug symbols for use in gdb.
arm-none-eabi-as -mcpu=arm926ej-s -g startup.s -o startup.o
arm-none-eabi-gcc -c -mcpu=arm926ej-s -g test.c -o test.o
arm-none-eabi-ld -T test.ld test.o startup.o -o test.elf
arm-none-eabi-objcopy -O binary test.elf test.bin
qemu-system-arm -M versatilepb -m 128M -nographic -kernel test.bin
To examine the assembly code in gdb, see the commands below.
Notice that the array scores in this code is on the stack. Read about local stack variables for ARM assembly here, and examine the instruction below at 0x10188, where sp is decremented by 24 (20 bytes for scores and 4 for i).
qemu-system-arm -M versatilepb -m 128M -nographic -kernel test.bin -s -S
arm-none-eabi-gdb test.elf -ex "target remote:1234"
(gdb) b c_entry
(gdb) cont
(gdb) x/20i c_entry
0x10180 <c_entry>: push {r11, lr}
0x10184 <c_entry+4>: add r11, sp, #4
0x10188 <c_entry+8>: sub sp, sp, #24
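The 24 in sub sp, sp, #24 can be reproduced with sizeof. This is a small sketch, assuming a 4-byte int as on 32-bit ARM; the function name is made up for illustration.

```c
#include <stddef.h>

/* Bytes needed for the locals of c_entry: int scores[5] plus int i.
   Assumes sizeof(int) == 4, which holds on 32-bit ARM. */
size_t c_entry_locals_size(void) {
    return sizeof(int[5]) + sizeof(int);
}
```

With 4-byte ints this is 20 + 4 = 24, matching the immediate in the disassembly.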
I use a MacBook and install the tools like this; they end up in /usr/local/bin:
brew install --cask gcc-arm-embedded
brew install qemu
The C code used is similar to what you posted and combines the article snippet with a function to print an integer. References are inline in the code.
// test.c
// https://balau82.wordpress.com/2010/02/28/hello-world-for-bare-metal-arm-using-qemu/
volatile unsigned int *const UART0DR = (unsigned int *)0x101f1000;
void print_uart0(const char *s) {
    while (*s != '\0') { /* Loop until end of string */
        *UART0DR = (unsigned int)(*s); /* Transmit char */
        s++; /* Next char */
    }
}
// https://www.geeksforgeeks.org/c-program-to-print-all-digits-of-a-given-number/
void print_int(int N) {
#define MAX 12 // Sign + up to 10 digits + terminating NUL for a 32-bit int
    char arr[MAX]; // To store the digits of the number N
    int i = MAX - 1;
    int minus = ( N < 0 );
    unsigned int u = minus ? 0u - (unsigned int)N : (unsigned int)N; // Magnitude of N
    arr[i--] = 0;
    do { // Runs at least once, so 0 prints as "0"
        arr[i--] = (char)(u % 10) + '0'; // Put the last digit in arr[]
        u = u / 10; // Drop the last digit
    } while (u != 0);
    arr[i] = ( minus ) ? '-' : ' ';
    print_uart0(arr + i );
}
void c_entry() {
    int i;
    int scores[5] = {0, 0, 0, 0, 0};
    for (i = 0; i < 5; i++) {
        scores[i] += 10;
        print_uart0("SCORE ");
        print_int( scores[i] );
        print_uart0("\n");
    }
}
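The digit-extraction loop in print_int is easier to check on a host if it writes into a buffer instead of the UART. The following is a sketch under that assumption; format_int is a hypothetical name, and this variant also covers 0 and negative values.

```c
#include <stddef.h>

/* Hypothetical host-side variant of print_int: writes the decimal text of n
   into out (which must hold at least 12 bytes) and returns out. */
char *format_int(int n, char *out) {
    char tmp[12];                 /* sign + 10 digits + NUL for a 32-bit int */
    size_t i = sizeof tmp - 1;
    /* Work on the magnitude as unsigned so INT_MIN does not overflow. */
    unsigned int u = (n < 0) ? 0u - (unsigned int)n : (unsigned int)n;

    tmp[i] = '\0';
    do {                          /* do/while so that 0 yields "0" */
        tmp[--i] = (char)('0' + u % 10);
        u /= 10;
    } while (u != 0);
    if (n < 0) {
        tmp[--i] = '-';
    }

    size_t k = 0;                 /* copy the used tail of tmp into out */
    while ((out[k] = tmp[i + k]) != '\0') {
        k++;
    }
    return out;
}
```

On the target, the returned string could then be handed to print_uart0 unchanged.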
Using the same startup.s assembly from article:
.global _Reset
_Reset:
    LDR sp, =stack_top
    BL c_entry
    B .
Using the same linker script from article:
ENTRY(_Reset)
SECTIONS
{
    . = 0x10000;
    .startup . : { startup.o(.text) }
    .text : { *(.text) }
    .data : { *(.data) }
    .bss : { *(.bss COMMON) }
    . = ALIGN(8);
    . = . + 0x1000; /* 4kB of stack memory */
    stack_top = .;
}
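The last three statements of the linker script are plain address arithmetic, which can be mirrored in C. This is only an illustration: the function name is made up, and the real end-of-.bss address depends on the linked program.

```c
#include <stdint.h>

/* Mirrors the end of the linker script: round the location counter up to a
   multiple of 8, then reserve 0x1000 bytes; the result is stack_top. */
uint32_t stack_top_after(uint32_t end_of_bss) {
    uint32_t aligned = (end_of_bss + 7u) & ~(uint32_t)7u; /* . = ALIGN(8);    */
    return aligned + 0x1000u;                              /* . = . + 0x1000;  */
}
```

For a hypothetical .bss ending at 0x10203, the counter is first rounded up to 0x10208 and stack_top becomes 0x11208. The stack then grows downward from that address into the reserved 4 kB.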

Is it possible to make a hardcoding with the help of the command objcopy

I'm working on Linux and have just learned about the objcopy command. I found the corresponding command on my x86_64 PC: x86_64-linux-gnu-objcopy.
With its help, I can convert a file into an obj file: x86_64-linux-gnu-objcopy -I binary -O elf64-x86-64 custom.config custom.config.o
The file custom.config is a human-readable file. It contains two lines:
name titi
password 123
Now I can execute objdump -x -s custom.config.o to check its information.
custom.config.o: file format elf64-little
custom.config.o
architecture: UNKNOWN!, flags 0x00000010:
HAS_SYMS
start address 0x0000000000000000
Sections:
Idx Name Size VMA LMA File off Algn
0 .data 00000017 0000000000000000 0000000000000000 00000040 2**0
CONTENTS, ALLOC, LOAD, DATA
SYMBOL TABLE:
0000000000000000 l d .data 0000000000000000 .data
0000000000000000 g .data 0000000000000000 _binary_custom_config_start
0000000000000017 g .data 0000000000000000 _binary_custom_config_end
0000000000000017 g *ABS* 0000000000000000 _binary_custom_config_size
Contents of section .data:
0000 6e616d65 20746974 690a7061 7373776f name titi.passwo
0010 72642031 32330a rd 123.
As we all know, we can open, read, or write a file such as custom.config in any C/C++ project. Now I'm wondering whether it's possible to use the object file custom.config.o directly in a C/C++ project. For example, is it possible to read the content of custom.config.o without calling I/O functions such as open, read, or write? If so, I think this could serve as a kind of hardcoding that avoids the I/O calls.
Although I tried this on Win10 with MinGW (MinGW-W64 project, GCC 8.1.0), it should work for you with only minor adaptations.
As you can see from the information objdump gave you, the file's contents are placed in the .data section, which is the usual section for non-constant variables.
And some symbols were defined for it. You can declare these symbols in your C source.
The absolute value _binary_custom_config_size is special, because it is marked *ABS*. Currently I know no other way to obtain its value than to declare a variable of any type and take its address.
This is my show_config.c:
#include <stdio.h>
#include <string.h>
extern const char _binary_custom_config_start[];
extern const char _binary_custom_config_size;
int main(void) {
    size_t size = (size_t)&_binary_custom_config_size;
    char config[size + 1];
    strncpy(config, _binary_custom_config_start, size);
    config[size] = '\0';
    printf("config = \"%s\"\n", config);
    return 0;
}
Because the "binary" file (actually a text file) has no final '\0' character, you need to append one to get a correctly terminated C string.
You could as well declare _binary_custom_config_end and use it to calculate the size, or as a limit.
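The start/end variant can be sketched as a small helper that takes the two bounds as ordinary pointers, so it can be compiled without the object file. The helper name blob_to_string is made up; in the real program the arguments would be _binary_custom_config_start and _binary_custom_config_end, both declared as extern const char arrays.

```c
#include <stddef.h>
#include <string.h>

/* Copies the bytes in [start, end) into out and NUL-terminates the result.
   Returns the number of bytes copied, truncated to cap - 1 if needed. */
size_t blob_to_string(const char *start, const char *end, char *out, size_t cap) {
    size_t size = (size_t)(end - start); /* size from the two symbols */
    if (cap == 0) {
        return 0;
    }
    if (size > cap - 1) {
        size = cap - 1;                  /* leave room for the terminator */
    }
    memcpy(out, start, size);
    out[size] = '\0';
    return size;
}
```

A call would then look like blob_to_string(_binary_custom_config_start, _binary_custom_config_end, buf, sizeof buf), which avoids the address-of trick needed for the *ABS* size symbol.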
Building everything goes like this (I used the -g option to debug):
$ objcopy -I binary -O elf64-x86-64 -B i386 custom.config custom.config.o
$ gcc -Wall -Wextra -pedantic -g show_config.c custom.config.o -o show_config
And the output shows the success:
$ show_config.exe
config = "name titi
password 123"
If you need the file's contents in another section, add the option that renames the section to the objcopy call. Add any flags you need; the example uses .rodata, which is used for read-only data:
--rename-section .data=.rodata,alloc,load,readonly,data,contents

Ambiguous behaviour of .bss segment in C program

I wrote the simple C program (test.c) below:-
#include<stdio.h>
int main()
{
    return 0;
}
and executed the following to understand size changes in the .bss segment.
gcc test.c -o test
size test
The output came out as:-
text data bss dec hex filename
1115 552 8 1675 68b test
I didn't declare anything globally or of static scope. So please explain why the bss segment size is of 8 bytes.
I made the following change:-
#include<stdio.h>
int x; //declared global variable
int main()
{
    return 0;
}
But to my surprise, the output was same as previous:-
text data bss dec hex filename
1115 552 8 1675 68b test
Please explain.
I then initialized the global:-
#include<stdio.h>
int x=67; //initialized global variable
int main()
{
    return 0;
}
The data segment size increased as expected, but I didn't expect the size of the bss segment to shrink to 4 (as opposed to 8 when nothing was declared). Please explain.
text data bss dec hex filename
1115 556 4 1675 68b test
I also tried the commands objdump and nm, and they too showed the variable x occupying .bss (in the 2nd case). However, the size command shows no change in the bss size.
I followed the procedure according to:
http://codingfox.com/10-7-memory-segments-code-data-bss/
where the outputs are coming perfectly as expected.
When you compile a simple main program, you are also linking startup code.
This code is responsible, among other things, for initializing .bss.
That startup code is what "uses" the 8 bytes you are seeing in the .bss section.
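To make "initializing .bss" concrete, here is a sketch of the clearing loop that bare-metal startup code typically runs before main. It is an illustration only, not the actual glibc startup code: on a hosted Linux system the zero-filled pages come from the loader, and the bounds here are plain parameters rather than linker-provided symbols.

```c
/* Sketch of a .bss clearing step. On a real bare-metal target the two
   bounds would come from symbols defined in the linker script; here they
   are ordinary pointers so the function can run anywhere. */
void zero_bss(unsigned char *bss_start, unsigned char *bss_end) {
    for (unsigned char *p = bss_start; p < bss_end; p++) {
        *p = 0;
    }
}
```

Any static state that the startup files keep for themselves also lives in .bss, which is how a program with no globals of its own can still report a non-zero bss size.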
You can strip that code using -nostartfiles gcc option:
-nostartfiles
Do not use the standard system startup files when linking. The standard system libraries are used normally, unless -nostdlib or -nodefaultlibs is used
To make a test use the following code
#include<stdio.h>
int _start()
{
    return 0;
}
and compile it with
gcc -nostartfiles test.c
You'll see .bss set to 0
text data bss dec hex filename
206 224 0 430 1ae test
Your first two snippets are identical since you aren't using the variable x.
Try this
#include<stdio.h>
volatile int x;
int main()
{
    x = 1;
    return 0;
}
and you should see a change in .bss size.
Please note that those 4/8 bytes are something inside the start-up code. What it is and why it varies in size isn't possible to tell without digging into all the details of mentioned start-up code.

How to insert data into compiled binary for MCU

I am trying to insert an MD5 hash of part of my binary into the binary, to keep track of the MCU FW version.
I have approached it like this:
In the linker script I have split the flash into two regions:
MEMORY
{
    FLASH0 (rx) : ORIGIN = 0x8000000, LENGTH = 64K - 16
    FLASH1 (r) : ORIGIN = 0x800FFF0, LENGTH = 16
    RAM (xrw) : ORIGIN = 0x20000000, LENGTH = 8K
}
Then I have specified an output section like so:
.fw_version :
{
KEEP(*(.fw_version))
} >FLASH1
Next I have my firmware_version.c file containing only:
#define FW_VERSION_SIZE 16
const unsigned char FW_VERSION[FW_VERSION_SIZE]
__attribute__((section(".fw_version"), used)) = {0};
Then, after the binary is compiled and objcopy has been used to create a .bin file, I have a 65536-byte file. I split that file at 65520 bytes, compute an MD5 checksum of the first part, and insert it into the second part (16 bytes). Lastly I do cat parta partb > final.bin.
When I examine this binary with hexdump I can see that the MD5 checksum is indeed at the end.
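The split-and-concatenate step amounts to overwriting the last 16 bytes of the flat image. A minimal sketch of that patching step follows; the function name patch_digest is made up, and the digest itself is assumed to be computed elsewhere (for example by an external MD5 tool).

```c
#include <stddef.h>
#include <string.h>

#define FW_VERSION_SIZE 16

/* Overwrites the last FW_VERSION_SIZE bytes of a flat image with a digest
   computed elsewhere. Returns the offset written to, or (size_t)-1 if the
   image is too small to hold the digest. */
size_t patch_digest(unsigned char *image, size_t image_len,
                    const unsigned char digest[FW_VERSION_SIZE]) {
    if (image_len < FW_VERSION_SIZE) {
        return (size_t)-1;
    }
    size_t offset = image_len - FW_VERSION_SIZE;
    memcpy(image + offset, digest, FW_VERSION_SIZE);
    return offset;
}
```

For the 65536-byte image the offset is 65520 (0xFFF0), which corresponds to the FLASH1 origin 0x800FFF0 once the flash base 0x8000000 is added.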
Using objdump -h I get:
...
8 .fw_version 00000010 0800fff0 0800fff0 00017ff0 2**2
...
and objdump -t gives:
...
0800fff0 g O .fw_version 00000010 FW_VERSION
...
I thought that this meant that I could just use FW_VERSION[i] to get part i of the md5 checksum from within the mcu fw but when I examine the memory in gdb I get that it's all zeroed out like it was never changed.
What am I missing here?
[edit] The device is an STM32F030C8T6 (ARM Cortex-M0) programmed through gdb.
As I commented under the question, I found that one reason it did not work was that I was manipulating the .bin file but loading the .elf file when programming with gdb.
It should (could) have worked if I had used a programmer or bootloader to download the .bin file to the target.
I found a better (I think) way of doing it though.
1. Compile all the sources in the project to .o files.
2. cat *.o > /tmp/tmp.something_unique (I used $(shell mktemp) in the Makefile).
3. openssl dgst -md5 -binary /tmp/tmp.something_unique > version_file
4. objcopy -I binary -O elf32-littlearm -B arm version_file v_file.o
5. The linker script has a section .fw_version : { KEEP(v_file.o(.data)) } >FLASH1
6. Link the application.
7. In the application, get the address and size of the version number by doing extern unsigned char _binary_version_file_start; extern unsigned char _binary_version_file_size; uint8_t *FW_VERSION = &_binary_version_file_start; const size_t FW_VERSION_SIZE = (size_t) &_binary_version_file_size;. Note that the uses of & are correct.
This will result in the checksum being taken over all the objects that are compiled from source and then this checksum is linked into the binary that is flashed in the target.
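Once the application has a pointer to the linked-in digest, it might render it as a hex string for a version report. The helper below is a hypothetical sketch of that step; it takes the pointer and length as parameters so it does not depend on the linker symbols.

```c
#include <stddef.h>
#include <stdint.h>

/* Writes 2 * len lowercase hex digits plus a NUL into out.
   out must have room for at least 2 * len + 1 bytes. */
void version_to_hex(const uint8_t *version, size_t len, char *out) {
    static const char digits[] = "0123456789abcdef";
    for (size_t i = 0; i < len; i++) {
        out[2 * i] = digits[version[i] >> 4];       /* high nibble */
        out[2 * i + 1] = digits[version[i] & 0x0F]; /* low nibble */
    }
    out[2 * len] = '\0';
}
```

In the firmware this would be called with the pointer and size obtained from the _binary_version_file_start and _binary_version_file_size symbols and a 33-byte buffer for a 16-byte MD5 digest.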

huge binary files with objcopy

I'm having problems when I define global variables in a basic C program for an ARM9 processor. I'm using the EABI GNU compiler and the binary generated from a 12KB ELF file is 4GB! I assume the issue is with my scatter file but I'm having trouble getting my head around it.
I have 256KB of ROM (base address 0xFFFF0000) and 32KB of RAM (base 0x01000000).
SECTIONS {
    . = 0xFFFF0000;
    .text : {
        * (vectors);
        * (.text);
    }
    .rodata : { *(.rodata) }
    . = 0x01000000;
    sbss = .;
    .data : { *(.data) }
    .bss : { *(.bss) }
    ebss = .;
    bssSize = ebss - sbss;
}
And my program is as follows:
int a=10;
int main() {
    int b=5;
    b = (a>b)? a : b;
    return b;
}
If I declare a as a local variable, i.e. there is no .data section, then everything works fine. Any help greatly appreciated.
--16th March 2011--
Can anyone help with this? I'm getting nowhere and have read the manuals, forums, etc.
My boot code, compile command, and objcopy commands are pasted below:
.section "vectors"
reset: b start
undef: b undef
swi: b swi
pabt: b pabt
dabt: b dabt
nop
irq: b irq
fiq: b fiq
.text
start:
ldr sp, =0x01006000
bl main
stop: b stop
arm-none-eabi-gcc -mcpu=arm926ej-s -Wall -nostartfiles main.c boot.s -o main.elf -T scatter_file
arm-none-eabi-objcopy ./main.elf --output-target=binary ./main.bin
arm-none-eabi-objdump ./main.elf --disassemble-all > ./main.dis
I found the problem. The objcopy command will try to create the entire address space described in the linker script, from the lowest address to the highest including everything in between. You can tell it to just generate the ROM code as follows:
objcopy ./main.elf -j ROM --output-target=binary ./main.bin
I also changed the linker script slightly
MEMORY {
    ram(WXAIL) : ORIGIN = 0x01000000, LENGTH = 32K
    rom(RX) : ORIGIN = 0xFFFF0000, LENGTH = 32K
}
SECTIONS {
    ROM : {
        *(vectors);
        *(.text);
        *(.rodata);
    } > rom
    RAM : {
        *(.data);
        *(.bss);
    } > ram
}
You are creating a file that starts at address 0x01000000 and extends at least up to address 0xFFFF0000. No wonder it is nearly 4GB. What would you like instead? Try the -R option to remove the data sections if you don't want them (which is probably the case if you are preparing a ROM initialization file).
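The "nearly 4GB" figure follows from simple address arithmetic: a flat binary covers everything from the lowest load address to the end of the highest loaded section, with the gap filled in. The sketch below assumes that model; the function name and the 0x100-byte ROM size used in the note are made up for illustration.

```c
#include <stdint.h>

/* Size of a flat binary that spans from the lowest load address to the end
   of the highest loaded section. 64-bit math avoids 32-bit wraparound. */
uint64_t flat_image_size(uint32_t lowest_addr, uint32_t highest_addr,
                         uint32_t highest_size) {
    return ((uint64_t)highest_addr + highest_size) - lowest_addr;
}
```

With .data at 0x01000000 and, say, 0x100 bytes of ROM contents at 0xFFFF0000, the result is 0xFEFF0100 bytes, which is just under 4 GiB even though only a few hundred bytes carry real content.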
Adding the (NOLOAD) argument worked for me. E.g.
MEMORY {
    ram(WXAIL) : ORIGIN = 0x01000000, LENGTH = 32K
    rom(RX) : ORIGIN = 0xFFFF0000, LENGTH = 32K
}
SECTIONS {
    ROM : {
        *(vectors);
        *(.text);
        *(.rodata);
    } > rom
    RAM (NOLOAD) : {
        *(.data);
        *(.bss);
    } > ram
}
