What is the role of .s files in a C project? - c

I am working with an ARM Cortex M3 chip (STM32F2) and ST provides a "standard peripheral library". It has some useful .c and .h files. It also has .s files.
What is the purpose of these .s files in the context of a C project? How do I get my compiler/linker/? to take them into account?

The .s extension is the convention used by GNU and many other tool-chains for assembler files.
Last I looked the STM32 Standard Peripheral Library itself contains no assembler files, however the CMSIS library contains start-up code for various STM32 parts, for example startup_stm32f2xx.s is start-up code for all STM32F2xx series devices. There are different implementations for different tool-chains; you need to build and link the file associated with your specific part and tool-chain. If you are using an example project that builds and runs or an IDE that creates part-specific projects for you, this will probably already have been done - if you have code that runs it certainly has.
How you build and link the code will depend on what tool-chain you are using. Most IDE-based tools will automatically recognise the extension and invoke the assembler to generate an object file that will be linked like any other. The exact content of the start-up file differs slightly between tool-chain versions, but it primarily creates the C runtime environment (stack and heap), initialises the processor, defines an initial interrupt/exception vector table, initialises static data and jumps to main().
The core of the file for the Keil/ARM RealView version for example looks like this:
; Reset handler
Reset_Handler PROC
EXPORT Reset_Handler [WEAK]
IMPORT SystemInit
IMPORT __main
LDR R0, =SystemInit
BLX R0
LDR R0, =__main
BX R0
ENDP
Reset_Handler is the address that the Program Counter (PC) register will be set to after a processor reset.
SystemInit is an external C code function that does the bulk of the initialisation - this may need customisation for your hardware. Cortex-M is unusual in that it can start running C code immediately after reset because the vector table includes both the reset address and the initial stack pointer address, which is automatically loaded to the SP register on reset. As a result you do not need much assembler knowledge to get one running.
__main() is the compiler-supplied entry point for your C code. It is not the main() function you write, but it performs initialisation for the standard library, static data and the heap before calling your main() function.
The GCC version is somewhat more involved since it does much of the work done by __main() in the Keil/ARM RealView version, but essentially it performs the same function.
Note that in the CMSIS SystemInit() is defined in system_stm32f2xx.c, and may need customisation for your board (correct crystal frequency, PLL setup, external SRAM configuration etc.). Because this is C code, and well commented, you will probably be more comfortable with it.
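The reset sequence described above can be modelled on a host machine. The following is only a sketch under stated assumptions, not ST's or Keil's code: every `model_*` name is hypothetical, the two-entry struct stands in for the start of the vector table, and `model_reset()` imitates in software what the core does in hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Host-side model of a Cortex-M reset; all model_* names are hypothetical. */
typedef void (*handler_t)(void);

static int trace[4];       /* records the order in which the steps ran */
static int trace_len = 0;

static void record(int step) { if (trace_len < 4) trace[trace_len++] = step; }

/* Stand-ins for SystemInit(), the library entry point and the user's main(). */
static void model_SystemInit(void) { record(1); }
static void model_user_main(void)  { record(3); }
static void model_library_entry(void) {
    record(2);             /* runtime set-up (static data, heap) would go here */
    model_user_main();
}

/* Mirrors the assembler Reset_Handler: call SystemInit, then the entry point. */
static void model_Reset_Handler(void) {
    model_SystemInit();
    model_library_entry();
}

/* First two vector table entries: initial stack pointer, then reset vector. */
struct vector_table {
    uintptr_t initial_sp;
    handler_t reset;
};

static uint32_t model_stack[64];

/* What the core does on reset: load SP, then fetch and run the reset vector. */
static uintptr_t model_reset(const struct vector_table *vt) {
    uintptr_t sp = vt->initial_sp;
    vt->reset();
    return sp;
}

/* Returns 1 when SP came from the table and the steps ran in order 1, 2, 3. */
int run_reset_model(void) {
    struct vector_table vt = { (uintptr_t)&model_stack[64], model_Reset_Handler };
    trace_len = 0;
    uintptr_t sp = model_reset(&vt);
    return sp == (uintptr_t)&model_stack[64] && trace_len == 3 &&
           trace[0] == 1 && trace[1] == 2 && trace[2] == 3;
}
```

On real hardware none of this runs as C: the table is placed by the linker and the core reads it directly, which is why so little assembler is needed.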

They usually contain assembly code. The assembler turns them into object files which are later linked by the linker with the main stuff. But I imagine it does depend on the compiler, toolchain etc.

The .s files usually contain the vector tables. A vector table defines what the system should do when an interrupt occurs. This table (code) is placed at a memory address that you define in the linker file. For example, it determines where the processor should begin, and what code it should run, every time a reset occurs. Similarly, there are other handlers (interrupt vectors). On STM32, the controller usually just loops in place inside particular handlers.
The example below illustrates this; see the linked page for a detailed explanation. Note that it uses the classic ARM layout, in which each vector is a branch instruction, whereas on a Cortex-M the table holds handler addresses instead:
.section INTERRUPT_VECTOR, "x"
.global _Reset
_Reset:
B Reset_Handler /* Reset */
B . /* Undefined */
B . /* SWI */
B . /* Prefetch Abort */
B . /* Data Abort */
B . /* reserved */
B . /* IRQ */
B . /* FIQ */
Reset_Handler:
LDR sp, =stack_top
BL c_entry
B .
This assembly code is later converted to an object file and linked with your .c files, using the .ld linker script, to create a .elf or .bin file.

You've probably got a Keil-based development environment for your ST kit. Depending on the version of your compiler, the project file should have different sections for C, C++, and assembler code. In your IDE, open your project and look for "Project Properties" or something like it.
You can import and export symbols to and from the assembler code so that it and the C/C++ code will link. With Keil it all integrates reasonably well.
The EXPORT directive tells the assembler to make the specified symbol public so that your C/C++ code can link to it.
The IMPORT directive tells the assembler that the specified symbol is defined elsewhere and will be resolved at link time.

Related

GNU gcc ld: constant CRC value for special sections with references to external code

I am working on a project which contains a subset of code which needs to be validated from flash at runtime with a CRC value. This code is placed into its own section of flash using the linker and during build the CRC value is calculated and injected into the appropriate area of memory. Then at runtime the flash is read and CRCed and compared to the stored value. This is all working correctly and as intended.
The code which is placed into this special section of flash is considered critical which is why it needs to be verified periodically as correct at runtime. The CRC value is also supposed to be used to validate that no changes were made to the critical section from version to version. This is what is not working as expected.
When the changes to non-critical sections of code are made (for example things placed into the normal .text region of flash) there are small differences in the critical code. Upon examining the changes it appears that most, perhaps all, of the changes are due to external function/variable references which are not in the critical code section of flash. This of course makes sense because the linker will be inserting the calls to other functions wherever they might get placed in flash which of course can change.
Is it possible to force the linker to make references to external functions/variables static in this section of flash? I was thinking this could be accomplished with some kind of lookup table which contained virtual memory/function addresses and then actual memory/function addresses and the critical code section would only reference the virtual addresses?
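For readers unfamiliar with the runtime check described in this question, here is a minimal sketch. It assumes a standard bitwise CRC-32 (reflected polynomial 0xEDB88320), which the question does not specify; the function names are hypothetical, and on target the pointer and length would come from linker-defined symbols rather than a test buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320); the choice of CRC is an
 * assumption made for illustration only. */
uint32_t crc32_region(const uint8_t *start, size_t len) {
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; ++i) {
        crc ^= start[i];
        for (int bit = 0; bit < 8; ++bit) {
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Returns 1 when the region still matches the CRC stored at build time. */
int region_is_intact(const uint8_t *start, size_t len, uint32_t stored_crc) {
    return crc32_region(start, len) == stored_crc;
}
```

Because the CRC covers every byte of the region, any call or address operand that the linker resolves differently between builds changes the result, which is the effect the question observes.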
You don't say what CPU you are using, but suppose you want to call routine do_stuff from your critical section, with signature int do_stuff(int a, int b). Then you need a header in your critical section:
int tramp_do_stuff(int a, int b);
and an assembly file with a trampoline for each function in your normal section:
.org 7ffff00H ;; however you specify a fixed address in your asm
_tramp_do_stuff: ;; this address is fixed
JMP do_stuff ;; This address gets set by linker
_tramp_next_trampoline:
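The same indirection can be sketched in C with a table of function pointers instead of assembly jump instructions. This is a different technique from the trampoline above, shown only to illustrate the idea: the critical code refers to a fixed slot, and only the slot's contents change between builds. The slot enum, `bind_do_stuff()` and the two `do_stuff_v*` routines are hypothetical names, and fixing the table's address (a linker script matter) is not shown.

```c
#include <assert.h>

/* Slot numbers are part of the critical code's contract and never change. */
enum { SLOT_DO_STUFF = 0, SLOT_COUNT };

typedef int (*do_stuff_fn)(int a, int b);

/* On target this table would sit at a fixed address outside the critical
 * section; only its contents change when the non-critical code moves. */
static do_stuff_fn trampoline_table[SLOT_COUNT];

/* Two hypothetical builds of the non-critical routine. */
static int do_stuff_v1(int a, int b) { return a + b; }
static int do_stuff_v2(int a, int b) { return a * b; }

/* Stands in for the linker or loader filling the slot. */
void bind_do_stuff(int version) {
    trampoline_table[SLOT_DO_STUFF] = (version == 1) ? do_stuff_v1 : do_stuff_v2;
}

/* Critical code: references only the table slot, never do_stuff's address. */
int critical_compute(int a, int b) {
    return trampoline_table[SLOT_DO_STUFF](a, b);
}
```

The cost is one extra indirect call per use, and the table itself must then be protected or verified separately, since it is no longer covered by the critical section's CRC.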

Compiling PowerPC binary with gcc and restrict useable registers

I have a PowerPC device running a software and I'd like to modify this software by inserting some own code parts.
I can easily write my own assembler code, put it somewhere in an unused region in RAM, replace any instruction in the "official" code by b 0x80001234 where 0x80001234 is the RAM address where my own code extension is loaded.
However, when I compile a C code with powerpc-eabi-gcc, gcc assumes it compiles a complete program and not only "code parts" to be inserted into a running program.
This leads to a problem: The main program uses some of the CPUs registers to store data, and when I just copy my extension into it, it will mess with the previous contents.
For example, if the main program I want to insert code into uses register 5 and register 8 in that code block, the program will crash if my own code writes to r5 or r8. Then I need to convert the compiled binary back to assembler code, edit the appropriate registers to use registers other than r5 and r8 and then compile that ASM source again.
What I'm now searching for is an option to the ppc-gcc which tells it "never ever use the PPC registers r5 and r8 while creating the bytecode".
Is this possible or do I need to continue crawling through the ASM code on my own replacing all the "used" registers with other registers?
You should think of another approach to solve this problem.
There is a gcc extension to reserve a register as a global variable:
register int *foo asm ("r12");
Please note that if you use this extension, your program no longer conforms to the ABI of the operating system you are working on. This means that you cannot call any library functions without risking overwritten variables or program crashes.

ARM + gcc: don't use one big .rodata section

I want to compile a program with gcc with link time optimization for an ARM processor. When I compile without LTO, the system gets compiled. When I enable LTO (with -flto), I get the following assembler error:
Error: invalid literal constant: pool needs to be closer
Looking around the web I found out that this has something to do with the constants in my system, which are placed in a special section called .rodata, which is called a constant pool and is placed right after the .text section in my system. It seems that when compiling with LTO because of inlining and other optimizations this .rodata section gets too far away from the instructions, so that the addressing of the constants is not possible anymore. Is it possible to place the constants right after the function that uses them? Or is it possible to use another addressing mode so the .rodata section can still be addressed? Thanks.
This is an assembler message, not a linker message, so this happens before sections are generated.
The assembler has a pseudo instruction for loading constants into registers:
ldr r0, =0x12345678
this is expanded into
ldr r0, constant_12345678 @ PC-relative load from the constant pool
...
bx lr
constant_12345678:
.word 0x12345678
The constant pool usually follows the return instruction. With function inlining, the function can get long enough that the return instruction is too far away; unfortunately, the compiler has no idea of the distance between memory addresses, and the assembler has no idea of control flow other than "flow does not pass beyond the return instruction, so it is safe to emit the constant pool here".
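To make the distance limit concrete, here is a small sketch of the range check the assembler effectively performs. It assumes ARM-state encoding, where a literal load carries a 12-bit offset that is added to or subtracted from the PC, and the PC reads as the instruction address plus 8; Thumb encodings have different limits. The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* ARM-state LDR (literal): 12-bit offset, added to or subtracted from PC. */
#define ARM_LDR_LITERAL_MAX_OFFSET 4095

/* Returns 1 when a literal at literal_addr is reachable from the load at
 * instr_addr. In ARM state the PC reads as the instruction address plus 8. */
int literal_in_range(uint32_t instr_addr, uint32_t literal_addr) {
    int64_t pc = (int64_t)instr_addr + 8;
    int64_t offset = (int64_t)literal_addr - pc;
    return offset >= -ARM_LDR_LITERAL_MAX_OFFSET &&
           offset <= ARM_LDR_LITERAL_MAX_OFFSET;
}
```

A function that grows past roughly 4 KB between a load and the pool after its return instruction fails this check, which matches the "pool needs to be closer" error.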
Unfortunately, there is no good solution at the moment.
You could try an asm block containing
b 1f
.ltorg
1:
This will force-emit the constant pool at this point, at the cost of an extra branch instruction.
It may be possible to instruct the assembler to omit the branch if the constant pool is empty, but I cannot test that at the moment, so this is probably not valid:
.if (2f - 1f)
b 2f
.endif
1:
.ltorg
2:
"This is an assembler message, not a linker message, so this happens before sections are generated" - I am not sure, but I think it is a little bit more complicated with LTO. Compiling (including assembling) the individual C files with LTO enabled works fine and does not cause any problems. The problem occurs when I try to link them together with LTO enabled. I don't know exactly how LTO is done, but apparently it also includes calling the assembler again, and then I get this error message. When linking without LTO, everything is fine, and when I look at the disassembly I can see that my constants are not placed after a function. Instead, all constants are placed in the .rodata section. With LTO enabled, because of inlining, my functions probably get too large to reach the constant pool...

ARM Cortex M-3 GCC/newlib initialization

I've just started to delve into the world of ARM Cortex-M microcontrollers, and I've decided not to use an existing development board or easy-to-use IDE, but to get right into the bare metal of these things, so I've got myself an STM32F103 soldered onto a prototyping board and am now trying to get things working with the gcc-arm-embedded Toolchain from Launchpad. After a hard time of reading manuals about linker scripts and the like, I have now written my own linker script and startup code that basically does nothing but copy the .data section from ROM to RAM, zero out .bss, then call SystemInit() from ST's Standard Peripheral Library to do the basic uC initialization and finally calling main().
Now, from the few tutorials I found about Cortex M-3 development, I saw that they use the -nostartfiles flag to the linker, but now I'm wondering: Do I have to initialize newlib by myself in that case? Or should I rather use the default start files from GCC/newlib and drop -nostartfiles? But in that case, I'd still have to do some initialization, like copying .data to RAM and setting up the vector table, which requires a custom linker script. So where do I do that?
And I do not even want to start thinking about C++!
So, what is the recommended way of initializing such a Cortex-M3 based microcontroller and its libc (not counting peripheral stuff)?
Thanks in advance!
As far as I know, you don't need to call any stdlib initialization function for a bare C app. But you do for a C++ app, because there are static initializers, the vtable for RTTI and so on to be initialized. newlib itself contains stdlib functions such as mem*, *printf and so on, tailored for an MCU with a small ROM, as far as I know.
But there's often nothing to be actively initialized. If a standard library function does have global data, it hopefully declares it as variables that are stored in the .data section; __errno is a candidate for this. But you can't be sure what your newlib implementation does, because it is up to its developers to decide how they design the internal workflow of their library.
Take a look at the code snippet below. This is a startup routine (reset handler) written in C. ST delivers its startup file as an assembler file (*.s), but you could also do it in C. NXP, on the other hand, generates its projects with a .c startup file.
The function call below the comment for C++ could be omitted, if your app is only a C app. Symbols for _data and _idata are generated by linker (defined in a linker script).
__set_PSP((uint32_t)&_vStackTop); // set stack pointer
SCB->VTOR = (uint32_t)&VectorTable; // set the pointer to the vector table
pDest = &_data;
pSrc = &_idata;
// fill .data section
for ( ; pDest < &_edata; )
{
*pDest = *pSrc;
++pSrc;
++pDest;
}
// fill .bss section
for (pDest = &_bss; pDest < &_ebss; ++pDest)
{
*pDest = 0;
}
//
// Call C++ library initialization, if your app is an C++ app
//
__libc_init_array();
main(); // enter main
for(;;) // you shouldn't land here at anytime
{
}
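The three steps in that handler can be tried out on a host machine. The sketch below is an illustration only: plain arrays stand in for the regions that linker symbols such as _data, _idata and _bss would delimit, and `run_init_array()` is a hypothetical stand-in for roughly what __libc_init_array() does, namely walking a table of initializer functions.

```c
#include <assert.h>
#include <stdint.h>

/* Arrays standing in for the regions a linker script would define. */
const uint32_t flash_idata[4] = { 1, 2, 3, 4 }; /* load image of .data */
uint32_t ram_data[4];                           /* run-time .data */
uint32_t ram_bss[4] = { 9, 9, 9, 9 };           /* pretend RAM holds junk */

typedef void (*init_fn)(void);
static int ctor_calls = 0;
static void example_ctor(void) { ++ctor_calls; } /* hypothetical constructor */
static init_fn init_array[] = { example_ctor };

/* Copy initial values of .data from its load address to its run address. */
void copy_words(uint32_t *dst, const uint32_t *dst_end, const uint32_t *src) {
    while (dst < dst_end) *dst++ = *src++;
}

/* Clear .bss so that zero-initialized statics really start at zero. */
void zero_words(uint32_t *dst, const uint32_t *dst_end) {
    while (dst < dst_end) *dst++ = 0;
}

/* Call every initializer in the table, in order. */
void run_init_array(init_fn *begin, init_fn *end) {
    for (init_fn *f = begin; f < end; ++f) (*f)();
}

/* Runs the three steps and returns how many initializers were called. */
int model_startup(void) {
    copy_words(ram_data, ram_data + 4, flash_idata);
    zero_words(ram_bss, ram_bss + 4);
    run_init_array(init_array, init_array + 1);
    return ctor_calls;
}
```

On target, the array bounds are replaced by addresses of linker-script symbols, and the code must not rely on any initialized static before the copy has run.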

Position independent code, shared libraries and code veneers - getting them to work together

I'm developing for an embedded platform and I'm having a hard time working out how to link shared libraries dynamically. I'm using the bFLT file format and I don't have control over where the executable and shared library is loaded.
My loader correctly loads the shared library and executable into memory and modifies the executable's GOT at run time to link to the shared library.
I can successfully take the address of the function and I know it's correct from disassembling the code at that location. However, if I try to call the function, the whole thing crashes.
Turns out GCC adds a 'code veneer' when calling shared library functions and takes a detour when the function is called and doesn't actually branch to the address of the function. The address that the code veneer branches to isn't relocated properly because it doesn't show up in the list of relocations in the executable binary.
The disassembly of the veneer looks like this:
000008d0 <__library_call_veneer>:
8d0: e51ff004 ldr pc, [pc, #-4] ; 8d4 <__library_call_veneer+0x4>
8d4: 03000320 .word 0x03000320 ; This address isn't correctly relocated!
If I take the address of the function and put it into a function pointer (therefore, bypassing the 'code veneer') and call it, the shared library works perfectly.
So for example:
#define DIRECT_LIB_CALL(x, args...) do { \
typeof(x) * volatile tmp = x; \
tmp(args); \
} while (0)
DIRECT_LIB_CALL(library_call); /* works */
library_call(); /* crashes */
Is there a way to either tell GCC not to produce a code veneer and to branch directly to the address located in the GOT, or somehow make the address that the code veneer branches to show up in the list of relocations to perform?
I found a workaround to this problem. It's not the best or cleanest method but it does the job in my case.
I took advantage of the --wrap option of my linker, which redirects symbols to __wrap_symbol. With this, I set up an awk script that automatically generates ASM files that load a properly relocated address into the pc. Any library calls are redirected to this code. Basically, what I did was make my own code veneers. Since the generated code veneer was no longer referenced, it simply got optimized away.
Additionally, I had to place my veneers in the .data section, since anything in the .text section was not relocated correctly. Since the platform I'm working on doesn't differentiate much between code and data, this hacky workaround works.
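The shape of such a hand-made veneer can be sketched in C rather than generated assembly. This is not the author's script output: all names are hypothetical, the loader's fix-up is imitated by an ordinary assignment, and the wrapper would have to be named __wrap_library_call for the linker's --wrap=library_call redirection to pick it up.

```c
#include <assert.h>

typedef int (*library_call_fn)(void);

/* Slot kept in writable data so that a loader can patch (relocate) it. */
static library_call_fn library_call_slot;

/* Stand-in for the function that lives in the shared library. */
static int library_call_impl(void) { return 42; }

/* Stand-in for the loader fixing up the slot once load addresses are known. */
void loader_relocate(void) { library_call_slot = library_call_impl; }

/* Hand-written veneer: jumps through the relocated slot instead of through
 * an address baked into the code. */
int wrap_library_call(void) { return library_call_slot(); }
```

The key point is the same as in the workaround: the target address lives in a place the loader does relocate, so the call no longer depends on the unrelocated word inside the generated veneer.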
Here's a link to the project I'm working on where you can look up the specifics.
