STM32 linker script initialization sections: are they needed when using C?

As far as I know, sections like .init, .preinit_array, .init_array, .fini, .fini_array... found in STM32CubeIDE linker scripts are used in C++ to call the constructors of static objects that must run before main (and the fini versions for the destructors).
My assumption is that these sections are used by functions called implicitly by the compiler and the C/C++ runtime libraries, but if your firmware is written in C, are all these sections really needed? What does the compiler do behind the scenes?

You can live without many of them.
Other than C++, some of them may initialize things required by the standard library. If you only call pure functions from the standard library and you only have code in C or assembly then you could try taking them out.
If you are trying to do this as a learning exercise, then take them out and see what stops working. Also search on Google; plenty of sites explain this sort of thing at a length that can't be reproduced here.
If you are just trying to get your project finished, then leave them alone. They only add a tiny amount to your program size and it isn't worth your time to fight with them.
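To give a rough idea of what happens behind the scenes: in newlib-based toolchains the startup code typically calls __libc_init_array before main, and that function walks the tables of function pointers that the linker script gathers in .preinit_array and .init_array. The sketch below imitates that loop on an ordinary array. The names run_init_array, init_fn, fake_init_array and the two setup functions are made up for illustration; real startup code iterates between linker-provided symbols such as __init_array_start and __init_array_end rather than over a local table.

```c
#include <stddef.h>

typedef void (*init_fn)(void);

static int clock_ready;   /* hypothetical state touched by the initializers */
static int uart_ready;

static void clock_setup(void) { clock_ready = 1; }
static void uart_setup(void)  { uart_ready = clock_ready; } /* relies on running second */

/* Stand-in for the table the linker would build from .init_array entries. */
static init_fn const fake_init_array[] = { clock_setup, uart_setup };

/* Call every entry in [start, end) in order and return how many were called. */
size_t run_init_array(const init_fn *start, const init_fn *end)
{
    size_t count = (size_t)(end - start);
    for (size_t i = 0; i < count; i++)
        start[i]();
    return count;
}
```

If nothing in the program or its libraries contributes an entry, the table is empty and the loop does nothing, which is why removing the sections from a pure C project often appears to work.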

"are used in C++ for calling the static objects' constructors that need to be executed before main (and the fini versions for the destructors)."
That is not 100% true. CubeIDE uses a GCC-based ARM toolchain, which has extensions that may use some of those sections. For example, you can use attributes to make functions that are executed before main and/or called after main returns.
void __attribute__((constructor)) called_before_main(void)
{
    /* some code */
}
void __attribute__((destructor)) called_after_main(void)
{
    /* some code */
}
Even if you do not use any of those, external libraries may use them. And even if you do not use external libraries, keeping those sections does not hurt, as they will be discarded if they are empty.
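A small way to observe the effect in plain C, assuming GCC or Clang: the constructor below sets a flag, and any code running from main onward can see that it already ran. The names early_init, early_init_ran and early_init_has_run are hypothetical. On an STM32 project this only works if the startup code walks .init_array (for example through __libc_init_array) and the linker script keeps that section.

```c
/* Set by the constructor; stays 0 if the runtime never walks .init_array. */
static int early_init_ran;

/* GCC/Clang extension: the compiler emits a pointer to this function
 * into .init_array, so the startup code calls it before main(). */
__attribute__((constructor))
static void early_init(void)
{
    early_init_ran = 1;
}

/* Returns 1 when the constructor has already executed. */
int early_init_has_run(void)
{
    return early_init_ran;
}
```

Calling early_init_has_run() at the top of main is a quick check of whether a trimmed-down linker script and startup file still support constructors.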


Automatic compile-time mechanism for calling initialization code of modules in C

We are in the process of modularizing our code for an embedded device, trying to get from $%#!$ spaghetti code to something we can actually maintain. There are several versions of this device, which mostly differ in the number of peripherals they have: some have an SD card, some have Ethernet, and so on. Those peripherals need initialization code.
What I'm looking for is a way to execute the specific initialization code of each component just by putting the .h/.c files into the project (or not). In C++ right now I would be tempted to put a global object into each component that registers the necessary functions/methods during its initialization. Is there something similar for plain C code, preferably some pre-processor/compile-time magic?
It should look something like this
main.c
int main(void) {
    initAllComponents();
}
sdio.c
MAGIC_REGISTER_COMPONENT(sdio_init)
STATUS_T sdio_init() {
    ...
}
ethernet.c
MAGIC_REGISTER_COMPONENT(ethernet_init)
STATUS_T ethernet_init() {
    ...
}
and just by putting the sdio.c/ethernet.c (and/or .h) into the project initAllComponents() would call the respective *_init() functions.
TIA
There is nothing in plain C that does this. There are however compiler extensions and linker magic that you can do.
GCC (and some others that try to be compatible) has __attribute__((constructor)) that you can use on functions. This requires support from the runtime (libc or ld.so or equivalent).
For the linker magic (that I personally prefer since it gives you more control on when things actually happen) look at this question and my answer. I have a functional implementation of it on github. This also requires gcc and a linker that creates the right symbols for us, but doesn't require any support from the runtime.
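A minimal sketch of the linker approach, not the implementation referred to above: each component drops a function pointer into a dedicated section, and GNU ld on ELF targets automatically defines __start_<section> and __stop_<section> symbols for any section whose name is a valid C identifier. The macro REGISTER_COMPONENT, the section name component_init and the int return type standing in for STATUS_T are assumptions made for this example. With a custom embedded linker script the section would also need a KEEP() entry so that garbage collection does not drop it.

```c
typedef int (*component_init_fn)(void);

/* Place a pointer to fn in the "component_init" section.
 * "used" stops the compiler from discarding the otherwise unreferenced entry. */
#define REGISTER_COMPONENT(fn) \
    static component_init_fn const component_entry_##fn \
        __attribute__((used, section("component_init"))) = fn

/* Defined by GNU ld: bounds of the "component_init" section. */
extern component_init_fn const __start_component_init[];
extern component_init_fn const __stop_component_init[];

/* In a real project these would live in sdio.c and ethernet.c. */
static int sdio_ready;
static int ethernet_ready;

static int sdio_init(void)     { sdio_ready = 1; return 0; }
REGISTER_COMPONENT(sdio_init);

static int ethernet_init(void) { ethernet_ready = 1; return 0; }
REGISTER_COMPONENT(ethernet_init);

/* Call every registered initializer; returns how many were called. */
int init_all_components(void)
{
    int count = 0;
    for (const component_init_fn *p = __start_component_init;
         p < __stop_component_init; ++p) {
        (*p)();
        ++count;
    }
    return count;
}
```

Unlike the constructor attribute, nothing runs until init_all_components() is called, so main decides when initialization happens. The order of entries inside the section depends on link order and should not be relied on.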

CPU dependent code: how to avoid function pointers?

I have performance critical code written for multiple CPUs. I detect CPU at run-time and based on that I use appropriate function for the detected CPU. So, now I have to use function pointers and call functions using these function pointers:
void do_something_neon(void);
void do_something_armv6(void);
void (*do_something)(void);
if (cpu == NEON) {
    do_something = do_something_neon;
} else {
    do_something = do_something_armv6;
}
// Use function pointer:
do_something();
...
Not that it matters, but I'll mention that I have optimized functions for different CPUs: armv6 and armv7 with NEON support. The problem is that using function pointers in many places makes the code slower, and I'd like to avoid that.
Basically, at load time linker resolves relocs and patches code with function addresses. Is there a way to control better that behavior?
Personally, I'd propose two different ways to avoid function pointers. The first is to create two separate .so (or .dll) files for the CPU-dependent functions, place them in different folders, and based on the detected CPU add one of these folders to the search path (or LD_LIBRARY_PATH). Then load the main code, and the dynamic linker will pick up the required library from the search path. The other way is to compile two separate copies of the library :)
The drawback of the first method is that it forces me to have at least 3 shared objects (dll's): two for the cpu dependent functions and one for the main code that uses them. I need 3 because I have to be able to do CPU detection before loading code that uses these cpu dependent functions. The good part about the first method is that the app won't need to load multiple copies of the same code for multiple CPUs, it will load only the copy that will be used. The drawback of the second method is quite obvious, no need to talk about it.
I'd like to know if there is a way to do that without using shared objects and manually loading them at runtime. One way would be some hackery that involves patching code at run-time (it's probably too complicated to get it done properly). Is there a better way to control relocations at load time? Maybe place CPU-dependent functions in different sections and then somehow specify which section has priority? I think the Mac's Mach-O format has something like that.
ELF-only (for arm target) solution is enough for me, I don't really care for PE (dll's).
thanks
You may want to lookup the GNU dynamic linker extension STT_GNU_IFUNC. From Drepper's blog when it was added:
Therefore I’ve designed an ELF extension which allows to make the decision about which implementation to use once per process run. It is implemented using a new ELF symbol type (STT_GNU_IFUNC). Whenever the a symbol lookup resolves to a symbol with this type the dynamic linker does not immediately return the found value. Instead it is interpreting the value as a function pointer to a function that takes no argument and returns the real function pointer to use. The code called can be under control of the implementer and can choose, based on whatever information the implementer wants to use, which of the two or more implementations to use.
Source: http://udrepper.livejournal.com/20948.html
Nonetheless, as others have said, I think you're mistaken about the performance impact of indirect calls. All code in shared libraries will be called via a (hidden) function pointer in the GOT and a PLT entry that loads/calls that function pointer.
For the best performance you need to minimize the number of indirect calls (through pointers) per second and allow the compiler to optimize your code better (DLLs hamper this because there must be a clear boundary between a DLL and the main executable and there's no optimization across this boundary).
I'd suggest doing these:
moving as much of the main executable's code that frequently calls DLL functions into the DLL. That'll minimize the number of indirect calls per second and allow for better optimization at compile time too.
moving almost all your code into separate CPU-specific DLLs and leaving to main() only the job of loading the proper DLL OR making CPU-specific executables w/o DLLs.
Here's the exact answer that I was looking for.
GCC's __attribute__((ifunc("resolver")))
It requires fairly recent binutils.
There's a good article that describes this extension: Gnu support for CPU dispatching - sort of...
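A minimal sketch of the attribute, assuming GCC on an ELF target whose C library and binutils support STT_GNU_IFUNC (glibc does). The dynamic linker calls the resolver once when the symbol is bound, and every later call to do_something goes to whichever implementation it returned. The function cpu_has_neon is a hypothetical stub: real CPU detection is left out, and a resolver should stay simple because it runs very early during program startup.

```c
/* Two implementations of the same operation; the return values just
 * identify which one ran. */
static int do_something_neon(void)  { return 2; }
static int do_something_armv6(void) { return 1; }

/* Hypothetical stub: a real version would query the CPU features. */
static int cpu_has_neon(void) { return 0; }

typedef int (*do_something_fn)(void);

/* Resolver: runs once at symbol resolution time and picks the implementation. */
static do_something_fn resolve_do_something(void)
{
    return cpu_has_neon() ? do_something_neon : do_something_armv6;
}

/* Callers use do_something() like any ordinary function. */
int do_something(void) __attribute__((ifunc("resolve_do_something")));
```

The call sites contain no explicit function pointer; the indirection that remains is the same PLT/GOT hop that any call into a shared library already pays.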
Lazy loading ELF symbols from shared libraries is described in section 1.5.5 of Ulrich Drepper's DSO How To (updated 2011-12-10). For ARM it is described in section 3.1.3 of ELF for ARM.
EDIT: With the STT_GNU_IFUNC extension mentioned by R. I forgot that was an extension. GNU Binutils supports that for ARM, apparently since March 2011, according to changelog.
If you want to call functions without the indirection of the PLT, I suggest function pointers or per-arch shared libraries inside which function calls don't go through PLTs (beware: calling an exported function is through the PLT).
I wouldn't patch the code at runtime. I mean, you can. You can add a build step: after compilation disassemble your binaries, find all offsets of calls to functions that have multi-arch alternatives, build table of patch locations, link that into your code. In main, remap the text segment writeable, patch the offsets according to the table you prepared, map it back to read-only, flush the instruction cache, and proceed. I'm sure it will work. How much performance do you expect to gain by this approach? I think loading different shared libraries at runtime is easier. And function pointers are easier still.

How do inline functions expose internal data structures?

I often hear that "inline functions in C expose internal data structures" and that this is one of the reasons some people do not like them.
Can someone please explain, how?
Thanks in advance.
Let's say I have a program code.c and a function func(). I can 1) make func() inline, which will expose whatever I do with my data structures in code.c, or 2) put func() in a library and provide that as a shared lib (which is not readable, I guess ?? :p). Is this a correct analysis?
Since you put inline function definitions in a header file (unless used in a single cpp file), which would need to be included by consumers then I guess you are exposing the inner workings of your code.
But, since the alternative is usually macros, I doubt that is a good reason against them.
It would certainly be more transparent compared to something compiled into a library or object module. That's because you can see the source code, and therefore write code which manipulates the data structures any way you want.
However, for non-inline functions for which you have the source, I am at a loss as to how they could be any more protected.
There are software corporations which jealously guard their software source code, and only release object modules to be linked with, or shared libraries, or (dread!) .DLLs.
Inline methods expand all method calls in place. So instead of having foo() be a JMP or CALL instruction it just copies the actual instructions of foo() where it was called. If this contains critical data then that would become exposed although inline functions are typically used for short one to two line methods or larger expressions.
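The difference can be sketched in C as follows. With an opaque type the public header only needs a forward declaration, so callers never see the fields; with an inline accessor the header has to contain both the struct layout and the function body. All names here (counter, open_counter, the scale field) are invented for illustration, and the three parts are shown in one block although they would normally be split between a header and a .c file.

```c
#include <stdlib.h>

/* Style 1: opaque type. A public header would carry only these lines. */
struct counter;
struct counter *counter_new(int start);
int counter_get(const struct counter *c);
void counter_free(struct counter *c);

/* Style 2: inline accessor. The header must now carry the full layout,
 * so every caller can see (and poke at) raw and scale. */
struct open_counter {
    int raw;
    int scale;
};

static inline int open_counter_get(const struct open_counter *c)
{
    return c->raw * c->scale;
}

/* Private .c file for style 1: the only place the layout is known. */
struct counter {
    int raw;
    int scale;
};

struct counter *counter_new(int start)
{
    struct counter *c = malloc(sizeof *c);
    if (c != NULL) {
        c->raw = start;
        c->scale = 10;
    }
    return c;
}

int counter_get(const struct counter *c)
{
    return c->raw * c->scale;
}

void counter_free(struct counter *c)
{
    free(c);
}
```

Both styles compute the same value; the cost of the opaque one is a real function call, and the cost of the inline one is that the layout becomes part of the public interface.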

crt0.o and crt1.o -- What's the difference?

Recently I've been trying to debug some low-level work and I could not find the crt0.S for the compiler (avr-gcc) but I did find a crt1.S (and the same with the corresponding .o files).
What is the difference between these two files? Is crt1 something completely different or what? They both seem to have to do with something for 'bootstrapping' (setting up stack frame and such), but why the distinction?
Both crt0 and crt1 do basically the same thing: whatever is needed before calling main() (initializing the stack, setting up IRQs, etc.). You should link with one or the other, but not both. They are not really libraries but startup object files, typically assembled from hand-written assembly code.
As far as I understand, crt comes in two "flavors"
crt1 is used on systems that support constructors and destructors (functions called before and after main and exit). In this case main is treated like a normal function call.
crt0 is used on systems that do not support constructors/destructors.

Is there a compiler feature to inject custom function entry and exit code?

Currently coding on Windows with VS2005 (but wouldn't mind knowing if there are options for other compilers and platforms. I'm most interested in OSX as an alternative platform.) I have a C (no C++) program and I'd like to do the following...
Given a function, say...
int MyFunction(int myparam)
{
// Entry point.
...
// Exit point.
return 1;
}
I'd like to put a snippet of code at the entry point and at the exit point. BUT, I'd rather not have to modify the 100's of functions that are already out there. Is there a way to define function entry and exit code that the compiler will inject for all my functions without having to modify them all?
Most solutions I've found or tried will require me to edit every single function, which is a lot of work. I figure someone else must have hit something like this already and solved it. I can't be unique in this request I suspect.
It's Microsoft-specific, but you can hook into the _penter and _pexit functions to do something when entering and exiting a function. You'll have to compile your project with the /Gh and /GH flags, which make the compiler emit calls to those hooks.
There's a little bit of a tutorial here, and you can find a few more results on how to use them on Google. Also, this blog post goes into some detail on the assembly that you need to do to avoid messing up the stack on entry and exit.
GCC has the -finstrument-functions flag which allows you to define two functions that will be called at the beginning and end of each function call:
void __cyg_profile_func_enter(void *this_fn, void *call_site);
void __cyg_profile_func_exit(void *this_fn, void *call_site);
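A minimal sketch of the two hooks, assuming GCC or Clang and a build with -finstrument-functions. The counters enter_count and exit_count are hypothetical additions for this example. The hooks themselves are marked no_instrument_function; otherwise they would be instrumented too and recurse into themselves.

```c
#include <stdio.h>

static int enter_count;   /* hypothetical counters, handy for quick checks */
static int exit_count;

/* Called on entry to every instrumented function. */
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void)call_site;
    enter_count++;
    fprintf(stderr, "enter %p\n", this_fn);
}

/* Called just before every instrumented function returns. */
__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)call_site;
    exit_count++;
    fprintf(stderr, "exit  %p\n", this_fn);
}
```

The addresses printed are raw function addresses; a tool such as addr2line can map them back to names when the binary has debug information.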
You're looking for something called aspect oriented programming or AOP.
This isn't something that's supported natively in C (or C++). There are some library based implementations listed on the linked page for C (though I don't know how mature / useful these are)
OpenWatcom C and C++ compilers have -ee and -ep parameters for that:
-ee call epilogue hook routine
-ep[=<num>] call prologue hook routine with <num> stack bytes available
They will cause the compiler to emit calls to __EPI and __PRO user-defined hook routines.
There is also
-en emit routine names in the code segment
that will emit the function name into the object code as a string of characters just before the function prologue sequence is generated. May be useful for the __PRO routine.
More information on these and other compiler options can be found in the C/C++ user guide, available among other manuals at http://openwatcom.org/index.php/Manuals
