If a function f() is called and implemented in the same c file (module) - who resolves this call? The compiler or the linker?
I think it's technically implementation-dependent, but typically references within the same file will be resolved by the compiler. There's no point deferring it until link time since the compiler knows which function is being called, and the compiler may be able to generate the code for the function call more efficiently if it doesn't have to leave a place for the linker to fill in an address. (For example, it may be able to use a relative jump instruction with a 16-bit offset for a call to a nearby function, instead of an absolute jump with a 32-bit or 64-bit address embedded in the code.)
This may change if the called function is declared as a weak symbol: in that case, although the function is defined in the current translation unit, that definition may be overridden by one from another module at link time, so the compiler has to treat it as a call to a function in another module.
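For illustration, here is a minimal GCC-style sketch of that situation (the function names are made up):

__attribute__((weak)) void f(void)
{
    /* default implementation; a strong definition in another object file overrides it at link time */
}

void g(void)
{
    f();   /* cannot be bound to the local definition at compile time */
}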
It depends on the symbol's linkage. If f is an internal function, i.e. one declared/defined with static, then f is resolved at compile time (by the compiler). If f is a weak symbol, then it is resolved at dynamic link time (by the dynamic loader). If f is a strong symbol, then it is resolved at compile time (by the compiler).
In particular, when the program is compiled with optimization, f may be inlined directly into the caller's body, which is done by the compiler.
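A minimal sketch of the static case (illustrative names):

static int f(int x)
{
    return x * 2;
}

int g(int x)
{
    return f(x) + 1;   /* resolved by the compiler; with optimization f() is typically inlined */
}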
I was wondering if there is a way to remove ALL the unused functions listed in the map file for an embedded project developed in C and using the IAR Embedded Workbench for ARM IDE, which uses its own compiler and linker:
IAR C/C++ Compiler for ARM 8.30
IAR ELF Linker for ARM 8.30
IAR Assembler for ARM 8.30
I have noticed that not all the functions listed in the map file are actually used at run time. Is there any optimization that removes all unused functions?
For example, a third-party library is used in the project and FuncA() is part of it. Inside it there might be a switch statement, and for every case a different static function is called, let's say FuncA1(), FuncA2(), ... FuncAn(). We would enter each case based on the code and usage of FuncA(), so it is obvious that not all of the FuncA1(), FuncA2(), ... FuncAn() functions would be called in the project; however, all of them are listed in the map file.
Is it possible to remove such functions from the map file? If yes how?
Removal of unused functions with external linkage is necessarily a job for the linker rather than the compiler. However, a linker is not required to support that, and any support is toolchain-dependent and may require specific link-time optimisation switches to be applied.
Unused functions with static linkage could be removed by the compiler.
We could enter each case based on the code and the function that calls FuncA(), so it is obvious that not all of the FuncA1(), FuncA2(), ... FuncAn() functions would be called
If the functions FuncAx() have static linkage but are explicitly referenced in the function FuncA(), which has external linkage, then neither the compiler nor the linker should be able to remove them: the compiler has no a priori knowledge of how FuncA() will be called, and the linker has no reference to the functions with static linkage, nor necessarily any understanding of the language semantics that would make it apparent that the switch cases in question are never invoked.
It is possible, I suppose, that a sophisticated toolchain with a C-language-aware linker and link-time whole-program optimisation might remove dead code more aggressively, but that is certainly toolchain-specific. Most linkers are source-language agnostic and merely resolve symbols in the object code, and in some cases remove code to which no reference has been made.
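As a point of comparison (not IAR-specific), here is how the same idea is commonly expressed with GCC and GNU ld; IAR has its own switches for this, so treat the flags below as an illustrative sketch only:

/* lib.c - build with, for example:
 *   gcc -c -ffunction-sections -fdata-sections lib.c main.c
 *   gcc -Wl,--gc-sections -o app main.o lib.o
 * With each function placed in its own section, the linker can discard sections
 * that are never referenced. */
static void helper_unused(void) { }   /* static and unused: the compiler itself can usually drop it */
void api_unused(void) { }             /* external linkage: needs --gc-sections (or LTO) to be removed */
void api_used(void) { }               /* assumed to be referenced from main.c, so it is kept */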
The following is how a function call (the first time it is made) would be resolved in position-independent code (PIC):
1. Jump to the PLT entry of our symbol.
2. Jump to the GOT entry of our symbol.
3. Jump back to the PLT entry and push an offset on the stack. The offset is actually an Elf_Rel structure describing how to patch the symbol.
4. Jump to the PLT stub entry.
5. Push a pointer to a link_map structure so the linker can find which library the symbol belongs to.
6. Call the resolver routine.
7. Patch the GOT entry.
This is different from how a data reference is made, which just goes through the GOT.
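A small sketch that lets you observe both mechanisms (file and symbol names are made up):

/* tiny.c - build and inspect with, for example:
 *   gcc -shared -fPIC -o libtiny.so tiny.c
 *   objdump -d -j .plt libtiny.so    # PLT stubs used for the function call
 *   objdump -R libtiny.so            # dynamic relocations; the data reference goes via the GOT
 */
extern int shared_counter;           /* data reference: resolved through the GOT at load time */
extern void external_hook(void);     /* function reference: routed through a PLT stub */

void tick(void)
{
    shared_counter++;                /* GOT-relative access */
    external_hook();                 /* call through the PLT (lazily bound by default) */
}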
So, why is there this difference? Why 2 different approaches?
why is there this difference? Why 2 different approaches?
What you described is lazy relocation.
You don't have to use it, and will not use it if e.g. LD_BIND_NOW=1 is set in the environment.
It's an optimization: it allows you to reduce the amount of work that the dynamic linker has to perform, when a particular program invocation does not exercise many possible program execution paths.
Imagine a program that can call foo(), bar() or baz(), depending on arguments, and which calls exactly one of the routines in any given execution.
If you didn't use lazy relocation, the dynamic loader would have to resolve all three routines at program startup. Lazy relocation allows the dynamic loader to perform only the one relocation that is actually required in a given execution (for the one function that is actually called), and at exactly the right time (when the function is first called).
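A sketch of that scenario (illustrative names), assuming foo(), bar() and baz() live in a shared library the program links against:

#include <string.h>

extern void foo(void);
extern void bar(void);
extern void baz(void);

int main(int argc, char **argv)
{
    /* Exactly one of the three is called per run, so with lazy binding the dynamic
     * loader resolves exactly one PLT/GOT entry, at the first call. Running with
     * LD_BIND_NOW=1 forces all of them to be resolved at startup instead. */
    if (argc > 1 && strcmp(argv[1], "foo") == 0) foo();
    else if (argc > 1 && strcmp(argv[1], "bar") == 0) bar();
    else baz();
    return 0;
}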
Now, why can't variables also be resolved that way?
Because there is no convenient way for the dynamic loader to know when to perform that relocation.
Suppose the globals are a, b and c, and that foo() references a and b, bar() references b and c, and baz() references a and c. In theory the dynamic loader could scan bodies of foo, bar and baz, and build a map of "if calling foo, then also resolve globals a and b", etc. But it's much simpler and faster to just resolve all references to globals at startup.
Programming language: C
At our work, we have a project with a header file, say header1.h. This file contains some functions which are declared with external scope (via extern) and also defined as inline in the same header file (header1.h).
Now this file is included at several places in different C files.
Based on my past experience with GCC, my understanding is that this will produce multiple-definition errors, and that is what I expect. But at our work we do not get these errors. The only difference is that we are using a different compiler driver.
From my past experience, my best guess is that the symbols are generated as weak symbols at compile time and the linker is using that information to choose one of them.
Could functions defined as inline result in weak symbols? Is that possible, or might there be some other reason?
Also, if inline can result in the creation of weak symbols, is there a feature to turn that on or off?
If a function is inline, the entire function body will be copied in every time the function is used (instead of the normal assembler call/return semantics).
(Modern compilers use inline as a hint, and the actual result might just be a static function, with a unique copy in every compiled file in which it was used.)
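One way to investigate the situation described in the question is to inspect the object files. A hedged sketch (the header contents and names are assumptions about the setup described):

/* header1.h - an extern declaration combined with an inline definition */
extern int add(int a, int b);
inline int add(int a, int b) { return a + b; }

/* After compiling a file that includes this header, run:
 *   nm file1.o | grep add
 * A 'W' (or 'w') in nm's symbol-type column marks a weak symbol. Whether the
 * definition is emitted as a weak symbol, a strong symbol, or not at all depends
 * on the compiler and on the inline dialect in effect (C99 inline vs. GNU89
 * inline), which is why the link behaviour differs between toolchains. */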
I've seen code like below in a project:
extern void my_main(void) __attribute__ ((__noreturn__, asection(".main","f=ax")));
What does this do?
The project does not have a direct main() function in it. Does the above code indicate to the compiler that my_main() should be treated as main()?
Also, what does the .main memory section indicate?
What the above declaration basically does is declare an extern function called my_main() with no arguments.
The __attribute__ section is a GNU/LLVM attribute syntax. Attributes are basically pragmas that describe some non-standard or extended feature of the function in question - in this case, my_main().
There are two attributes applied to my_main().
__noreturn__ (search for noreturn) indicates that the function will never return.
This is different from returning void - in void-type functions, calls to the function still return at some point, even without a value. This means execution will jump/return back to the caller.
In noreturn (a.k.a. _Noreturn or __noreturn__) functions, this indicates that, among other things, calls to this function shouldn't add the return address to the stack, as the function itself will either exit before execution returns or will long-jump to another point in execution.
It is also used in places where adding the return address to the stack will disrupt the stack in a way that interferes with the called function (though this is rare and I've only ever seen it used for this reason once).
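A minimal sketch of a noreturn function using the standard C11 spelling (the function name is made up); __attribute__((noreturn)) is the GCC/Clang equivalent used in the question:

#include <stdlib.h>

_Noreturn void fatal_error(const char *msg)
{
    (void)msg;   /* a real implementation would log the message somewhere */
    exit(1);     /* control never returns to the caller */
}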
The second attribute, asection(".main","f=ax"), is a little more obscure. I can't find specific documentation for it, but it seems fairly straightforward.
What it appears to be doing is specifying a linker section, together with flags ("f=ax") that look like assembler section flags marking the section as allocatable and executable, though I could be wrong.
When you write native code, all functionality is placed into appropriate sections of the target binary format (e.g. ELF, Mach-O, PE, etc.) The most common sections are .text, .rodata, and .data.
However, when invoking ld, the GNU linker, you can provide a linker script that specifies exactly how you want the target binary to be constructed.
This includes sections, sizes, and even the object files you want to use to make the file, specifying where they should go and their size limits.
One common misconception is that you never use ld directly. This isn't the case; when you run gcc, g++, or the clang family of compilers without the -c flag, you implicitly invoke ld with a default linker script used to link your binaries.
Linker scripts are important especially for embedded hardware where ROM must be built to memory specification.
So, back to your line of code: it places my_main() into an arbitrary section called .main. That's all it does. Somewhere in your project there is a linker script that specifies how .main is used and where it goes.
I would imagine the goal of this code was to place my_main() at an exact address in the target binary/executable, so whatever is using it knows the exact location of that function (asection(".main")) and can use it as an entry point (__noreturn__).
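For comparison, here is a sketch of how the same intent is usually spelled with the standard GCC/Clang attributes (the asection()/"f=ax" form in the question looks like a vendor-specific variant); where the .main section actually ends up is then decided by a matching entry in the linker script:

extern void my_main(void) __attribute__((noreturn, section(".main")));

void my_main(void)
{
    for (;;) {
        /* main loop; never returns, which is what the noreturn attribute promises */
    }
}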
What I do
When writing shared libraries for Linux, I tend to pay attention to relocations, symbol visibility, GOT/PLT etc.
When applicable, I try to avoid going through PLT stubs when functions from the same library call each other. For example, let's say a shared object provides two public functions, foo() and bar() (either of which can be called by the user). The bar() function, however, also calls foo(). So what I do in this case is this:
Define _foo() and _bar() functions that have private visibility.
Define foo() and bar() as weak aliases for _foo() and _bar() respectively.
That way, the code in the shared object never goes through the weak symbols. It only invokes the local functions, directly. For example, when _bar() is invoked, it calls _foo() directly.
But users are not aware of the _* functions and always use the corresponding weak aliases.
How I do it
In Linux, this is achieved by using the following construct:
extern __typeof (_NAME) NAME __attribute__((weak, alias("_NAME")));
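Spelled out for the foo()/bar() example above (a sketch, assuming GCC on Linux):

__attribute__((visibility("hidden"))) void _foo(void) { /* real implementation */ }
__attribute__((visibility("hidden"))) void _bar(void) { _foo(); /* direct call, no PLT */ }

/* Public, overridable names; external users call these. */
extern __typeof(_foo) foo __attribute__((weak, alias("_foo")));
extern __typeof(_bar) bar __attribute__((weak, alias("_bar")));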
The problem
Unfortunately, this does not work on OS X. I have no deep knowledge of OS X or its binary formats, so I poked around a bit and found a few examples of weak functions (like this one), but those don't quite do the same thing: you can have a weak symbol, but not a weak symbol that is an alias for a DSO-local function.
Possible solution...
For now, I have simply disabled this feature (which is implemented using macros), so that all symbols are global and have default visibility. The only way I can think of to achieve the same effect is to give all the _foo functions private visibility and have the corresponding foo functions, with default visibility, call their "hidden" counterparts.
A better way?
That, however, requires a good chunk of code to be changed. Therefore I would prefer not to go there unless there is really no other way.
So what is the closest OS X alternative, or the easiest way to get the same semantics/behavior?
On OS X, calls made within a library are direct by default and do not go through the dyld stub. Evidence of this is that if you want to be able to inject alternative functions to service a call, you need to mark the symbols as interposable to force indirect access and route the call through the dyld stubs. Otherwise, by default, local calls are direct and do not incur the overhead of going through dyld.
Thus, your optimization on Linux is already the default behavior and the alias is not needed.
Still, if you want to do this just to keep your cross-platform code simpler, you can still create the aliases. You just need to use "weak_import" or "weak" (if you want coalescing) as the attribute name.
extern typeof (_NAME) NAME __attribute__((weak_import, alias("_NAME")));
Apple reference on Weak Linking: Marking Symbols for Weak Linking
Apple reference on Mach-O runtime binding : Scope and Treatment of Symbol Definitions