Loading HEX data into memory - c

I am compiling baremetal software (no OS) for the Beagleboard (ARM Cortex A8) with Codesourcerys GCC arm EABI compiler. Now this compiles to a binary or image file that I can load up with the U-Boot bootloader.
The question is, Can I load hexdata into memory dynamically in runtime (So that I can load other image files into memory)? I can use gcc objcopy to generate a hexdump of the software. Could I use this information and load it into the appropriate address? Would all the addresses of the .text .data .bss sections be loaded correctly as stated in the linker script?
The hexdata output generated by
$(OBJCOPY) build/$(EXE).elf -O binary build/$(EXE).bin
od -t x4 build/$(EXE).bin > build/$(EXE).bin.hex
look like this:
0000000 e321f0d3 e3a00000 e59f1078 e59f2078
0000020 e4810004 e1510002 3afffffc e59f006c
0000040 e3c0001f e321f0d2 e1a0d000 e2400a01
0000060 e321f0d1 e1a0d000 e2400a01 e321f0d7
... and so on.
Is it as simple as to just load 20 bytes for each line into the desired memory address and everything would work by just branching the PC into the correct address? Did I forget something?

when you use -O binary you pretty much give up your .text, .data. .bss control. For example if you have one word 0x12345678 at address 0x10000000 call that .text, and one word of .data at 0x20000000, 0xAABBCCDD, and you use -O binary you will get a 0x10000004 byte length file which starts with the 0x12345678 and ends with 0xAABBCCDD and has 0x0FFFFFFC bytes of zeros. try to dump that into a chip and you might wipe out your bootloader (uboot, etc) or trash a bunch of registers, etc. not to mention dealing with potentially huge files and an eternity to transfer to the board depending on how you intend to do that.
What you can do which is typical with rom based bootloaders, is if using gcc tools
MEMORY
{
bob : ORIGIN = 0x10000000, LENGTH = 16K
ted : ORIGIN = 0x20000000, LENGTH = 16K
}
SECTIONS
{
.text : { *(.text*) } > bob
.bss : { *(.bss*) } > ted AT > bob
.data : { *(.data*) } > ted AT > bob
}
The code (.text) will be linked as if the .bss and .data are at their proper places in memory , 0x20000000, but the bytes are loaded by the executable (an elf loader or -O binary, etc) tacked onto the end of .text. Normally you use more linkerscript magic to determine where the linker did this. On boot, your .text code should first zero the .bss and copy the .data from the .text space to the .data space and then you can run normally.
uboot can probably handle formats other than .bin yes? It is also quite easy to write an elf tool that extracts the different parts of binaries and makes your own .bins, not using objcopy. It is also quite easy to write code that never relies on .bss being zero nor has a .data. solving all of these problems.

If you can write to random addresses without an OS getting in the way, there's no point in using some random hex dump format. Just load the binary data directly to the desired address. Converting on the fly from hex to binary to store in memory buys you nothing. You can load binary data to any address using plain read() or fread(), of course.
If you're loading full-blown ELF files (or similar), you of course need to implement whatever tasks that particular format expects from the object loader, such as allocating memory for BSS data, possibly resolving any unresolved addresses in the code (jumps and such), and so on.

Yes, it is possible to write to memory (on an embedded system) during run-time.
Many bootloaders copy data from a read-only memory (e.g. Flash), into writeable memory (SRAM) then transfer execution to that address.
I've worked on other systems that can download a program from a port (USB, SD Card) into writeable memory then transfer execution to that location.
I've written functions that download data from a serial port and programmed it into a Flash Memory (and EEPROM) device.
For memory to memory copies, use either memcpy or write your own, use pointers that are assigned a physical address.
For copying data from a port to memory, figure out how to get data from a device (such as a UART) then copy the data from its register into your desired location, via pointers.
Example:
#define UART_RECEIVE_REGISTER_ADDR (0x2000)
//...
volatile uint8_t * p_uart_receive_reg = (uint8_t*) UART_RECEIVE_REGISTER_ADDR;
*my_memory_location = *p_uart_receive_reg; // Read device and put into memory.
Also, search Stack Overflow for "embedded C write to memory"

Related

How can i extract constants' addresses,added by compiler optimization, from ELF file?

I'm writing some code size analysis tool for my C program, using the output ELF file.
I'm using readelf -debug-dump=info to generate Dwarf format file.
I've noticed that My compiler is adding as a part of the optimization new consts, that are not in Dwarf file, to the .rodata section.
So .rodata section size includes their sizes but i don't have their sizes in Dwarf.
Here is an example fro map file:
*(.rodata)
.rodata 0x10010000 0xc0 /<.o file0 path>
0x10010000 const1
0x10010040 const2
.rodata 0x100100c0 0xa /<.o file1 path>
fill 0x100100ca 0x6
.rodata 0x100100d0 0x6c /<.o file2 path>
0x100100d0 const3
0x100100e0 const4
0x10010100 const5
0x10010120 const6
fill 0x1001013c 0x4
In file1 above, although i didn't declare on const variable - the compiler does, this const is taking space in .rodata yet there is no symbol/name for it.
Here is the code inside some function that generates it:
uint8 arr[3][2] = {{146,179},
{133, 166},
{108, 141}} ;
So compiler add some consts values to optimize the load to the array.
How can i extract theses hidden additions from data sections?
I want to be able to fully characterize my code - How much space is used in each file, etc...
I am guessing here - it will be linker dependent, but when you have code such as:
uint8 arr[3][2] = {{146,179},
{133, 166},
{108, 141}} ;
arr at run-time exists in r/w memory, but its initialiser will be located in R/O memory to be copied to the R/W memory when the array is initialised. The linker need only provide the address, because the size will be known locally as a compile-time constant embedded as a literal in the initializing code. Consequently the size information does not appear in the map, because the linker discards that information.
Length is however implicit by the address of adjacent objects for filled space. So for example:
The size of const1 for example is equal to const2 - const1 and for const6 it is 0x1001013c - const6.
It is all rather academic however - you have precise control over this in terms of the size of your constant initialisers. They are not magically created data unrelated to your code, and I am not convinced that thy are a product of optimization as you suggest. The non-zero initialisers must exist regardless of optimisation options, and in any case optimisation primarily affects the size and/or speed of code (.text) rather then data. The impact on data sizes is likely to relate only to padding and alignment and in debug builds possibly "guard-space" for overrun detection.
However there is no need at all for you to guess. You can determine how this data is used by inspecting the disassembly or observing its execution (at the instruction level) in a debugger - to see exactly where initialised variables are copying the data from. You could even place an read-access break-point at these addresses and you will determine directly what code is utilizing them.
to get the size of elf file in details use
"You can use nm and size to get the size of functions and ELF sections.
To get the size of the functions (and objects with static storage duration):
$ nm --print-size --size-sort --radix=d tst.o
The second column shows the size in decimal of function and objects.
To get the size of the sections:
$ size -A -d tst.o
The second column shows the size in decimal of the sections."
Tool to analyze size of ELF sections and symbol

RAM & ROM memory segments

There are different memory segments such as .bss, .text, .data,.rodata,....
I've failed to know which of them locates in RAM and which of them locates in FLASH memory, many sources have mentioned them in both sections of (RAM & ROM) memories.
Please provide a fair explanation of the memory segments of RAM and flash.
ATMEL studio compiler
ATMEGA 32 platform
Hopefully you understand the typical uses of those section names. .text being code, .rodata read only data, .data being non-zero read/write data (global variables for example that have been initialized at compile time), .bss read/write data assumed to be zero, uninitialized. (global variables that were not initialized).
so .text and .rodata are read only so they can be in flash or ram and be used there. .data and .bss are read/write so they need to be USED in ram, but in order to put that information in ram it has to be in a non-volatile place when the power is off, then copied over to ram. So in a microcontroller the .data information will live in flash and the bootstrap code needs to copy that data to its home in ram where the code expects to find it. For .bss you dont need all those zeros you just need the starting address and number of bytes and the bootstrap can zero that memory.
so all of them can/do live in both. but the typical use case is the read only ones are USED in flash, and the read/write USED in ram.
They are located wherever your project's linker script defines them to be located.
Some targets locate and execute code in ROM, while others may copy code from ROM to RAM on start-up and execute from RAM - usually for performance reasons on faster processors. As such .text and .rodata may be located in R/W or R/O memory. However .bss and .data cannot by definition be located in R/O memory.
ROM cannot be written to, but RAM can be written to.
ROM holds the (BIOS) Basic Input / Output System, but RAM holds the programs running and the data used.
ROM is much smaller than RAM.
ROM is non-volatile (permanent), but RAM is volatile.

RAM usage AT32UC3B0512

I'm searching for a way to see the RAM usage of my application running on an at32uc3b0512.
arv32-size.exe foo.elf tells me:
text data bss dec hex filename
263498 11780 86524 361802 5854a foo.elf
According to 'google', RAM usage is .data + .bss. But .data + .bss is already (11780+86524)/1024 = 96kb, which would mean that my RAM is full (at32uc3b0512 -> 96kb SRAM). But the application works as desired. Am I wrong???
The chip you are using has 96kB of RAM and that is also the sum of your .bss and .data sections. This does not mean that all of your RAM is being used up, rather it is merely showing how the RAM is being allocated.
The program on MCU is usually located in FLASH
this is not true if you have some OS present
and load program to memory on runtime from somewhere like SD card
not all MCU's can do that
I suspect that is not your case
The program Flash is 512 KByte big (I guess from your IC's number)
The SDRAM is used for C engine/OS,stack and heap
your chip has 96 KByte
the C engine is something like OS handling
dynamic allocations,heap,stack,subroutine calls
and including RTL used during compilation
and of coarse dummy Interrupt sub routines for unused interrupts...
When you compile program to ELF/HEX what ever
the compiler/linker tells you only
how big the program code and data is (located in program FLASH memory)
how big static variables you have
the rest is unknown until runtime itself
So if you need to know how big chunk of memory you take
then you need to extract it from runtime
by some RTL call to get memory status
or by estimating it yourself based on knowledge of
what your program does
how much of dynamic memory is used
heap/stack trashing/usage
recursions level, etc...
Or you can try to increasingly allocate memory until you hit out of memory
and count how big chunk you allocated altogether
then release it of coarse
the used memory is then ~ 96KB - altogether_allocated_memory
(+/-) granularity ...

Why is the entry point address in my executable 0x8048330? (0x330 being the offset of .text section)

I wrote a small program to add two integers and on using readelf -a executable_name it showed the entry point address in elf header as:
Entry point address: 0x8048330
How does my executable know this address beforehand even before loader loads it in memory?
elf_format.pdf says this member gives the virtual address to which the system first transfers control, thus starting the process. Can anyone please explain what is the meaning of this statement and what is the meaning of virtual address here?
Also let me know, from where the executable file gets the value of 0x8048330 as entry point address. Just for cross check I compiled another program and for that also, the entry point address remains the same value 0x8048330 (offset of .text section being 0x330 in both the cases).
For first question:
the entry point you saw, 0x8048330, is a virtual memory address (in the opposite, is physical memory).
This means your executive doesn't have to know what physical address to map. (after it loads with a loader)
It doesn't even have the access to the physical memory.
To the process of your program, your .text section always starts from 0x8048330, your system (OS and hardware) will then map it (the virtual address) to the physical memory at run-time.
mapping and managing physical memory is a lot of things, you can check on Google for more information.
For the second question
I'm not sure which part confused you so I'll try to cover them all:
Could more than one program have same entry point?
Yes, there could be another program with the same entry point 0x8048330. because this address is virtual, the programs will be mapped to different physical memory at run-time when you try to run them at the same time.
Does the entry always 0x8048330?
Well, Linux executives are start from 0x8048000, but the offset of .text section is related to other sections length. So no, it could be 0x8048034 or anything else.
Why it always start from 0x8048000?
I think it's kind of history thing, the designer of Linux picked this one for some unknown or even random reason. you can refer this thread to see what under that area.
The entry address is set by the link editor, at the time when it creates the
executable. The loader maps the program file at the address(es) specified
by the ELF headers before transferring control to the entry address.
To use a concrete example, consider the following:
% file a.out
a.out: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically linked, \
for GNU/Linux 2.6.15, not stripped
% readelf -e a.out
... snip ...
Elf file type is EXEC (Executable file)
Entry point 0x8048170
There are 6 program headers, starting at offset 52
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
LOAD 0x000000 0x08048000 0x08048000 0x7cca6 0x7cca6 R E 0x1000
LOAD 0x07cf98 0x080c5f98 0x080c5f98 0x00788 0x022fc RW 0x1000
... snip ...
The first program header specifies that the contents of the file at
file offset 0 should be mapped to virtual address 0x08048000. The
file and memory sizes for this segment are 0x7cca6 bytes. This
segment is to be mapped in readable and executable but not writable
(it contains the program's code).
The entry point address specified in the ELF header is 0x8048170, which
falls inside the region containing program code.
The book "Linkers and Loaders" by John Levine is a good resource to consult on matters related to link editors and loaders.
About the virtual address question:
Normal userland applications work with virtual addresses which means they don't access directly the memory space. The OS (with the help of some microprocessor's special functions) maps this virtual addresses to physical addresses.
This way, the OS prevents applications from reading/writing into other applications memory or OS reserved memory. Also, this allows the paging of memory (use hard disk as memory) in a transparent way for the application.

where should the .bss section of ELF file take in memory?

It is known that .bss section was not stored in the disk, but the .bss section in memory should be initialized to zero. but where should it take in the memory? Is there any information displayed in the ELF header or the Is the .bss section likely to appear next to the data section, or something else??
The BSS is between the data and the heap, as detailed in this marvelous article.
You can find out the size of each section using size:
cnicutar#lemon:~$ size try
text data bss dec hex filename
1108 496 16 1620 654 try
To know where the bss segment will be in memory, it is sufficient to run readelf -S program, and check the Addr column on the .bss row.
In most cases, you will also see that the initialized data section (.data) comes immediately before. That is, you will see that Addr+Size of the .data section matches the starting address of the .bss section.
However, that is not always necessarily the case. These are historical conventions, and the ELF specification (to be read alongside the platform specific supplement, for instance Chapter 5 in the one covering 32-bit x86 machines) allows for much more sophisticated configurations, and not all of them are supported by Linux.
For instance, the section may not be called .bss at all. The only 2 properties that make a BSS section such are:
The section is marked with SHT_NOBITS (that is, it takes space in memory but none on the storage) which shows up as NOBITS in readelf's output.
It maps to a loadable (PT_LOAD), readable (PF_R), and writeable (PF_W) segment. Such a segment is also shorter on storage than it is in memory (p_filesz < p_memsz).
You can have multiple BSS sections: PowerPC executables may have .sbss and .sbss2 for uninitialized data variables.
Finally, the BSS section is not necessarily adjacent to the data section or the heap. If you check the Linux kernel (more in particular the load_elf_binary function) you can see that the BSS sections (or more precisely, the segment it maps to) may even be interleaved with code and initialized data. The Linux kernel manages to sort that out.

Resources