Understanding Linker Version Scripts - c

I am trying to understand how a version script can be used to resolve a duplicate symbol problem.
I have created a main_demo.c file with the following content:
#include<stdio.h>
extern void fun1_std(void);
extern void fun1_linux(void);
int main ()
{
fun1_linux(); //defined in libfun1_linux.c
fun1_std(); //defined in libfun1_std.c
return 0;
}
Content of libfun1_std.c
#include<stdio.h>
extern void fun (void);
void fun1_std()
{
fun();
}
Content of libfun1_linux.c
#include<stdio.h>
extern void fun (void);
void fun1_linux()
{
fun();
}
Now, I have created a shared object for each of the above files (both of which call fun()), named libfun1_std.so and libfun1_linux.so respectively. The function "fun()" is defined in two further shared objects that look like the following:
//library name ==> libfun2_linux.so
#include <stdio.h>
void fun()
{
printf("In fun of libfun2_linux\n");
//Similarly libfun2_std.so is printing "In fun of libfun2_std"
}
My expectation is that the symbol "fun" used in libfun1_linux.so is resolved by libfun2_linux.so and, similarly, that "fun" used in libfun1_std.so is resolved by libfun2_std.so.
As we can see, if we create an executable by linking the main program with all the above shared libraries, the process ends up with a duplicate symbol in its address space, i.e. fun(). So the output looks like this:
In fun of libfun2_std
In fun of libfun2_std
To resolve this I have created a version script that will rename the symbol.
Content of version script:
LIBFUN2_STD_1.0 {
global: *; /* i.e., every symbol is global */
};
and compiled libfun2_std.so with this version script.
Let's see the symbol again:-
$ readelf -a libfun2_std.so | grep fun
13: 0000000000000735 18 FUNC GLOBAL DEFAULT 12 fun@@LIBFUN2_STD_1.0
35: 0000000000000000 0 FILE LOCAL DEFAULT ABS libfun2_std.c
48: 0000000000000735 18 FUNC GLOBAL DEFAULT 12 fun
000000: Rev: 1 Flags: BASE Index: 1 Cnt: 1 Name: libfun2_std.so
Also the content of caller library
nm libfun1_std.so | grep fun
0000000000000705 T fun1_std
U fun@@LIBFUN2_STD_1.0
So now it seems that we have resolved the duplicate symbol. Let's create the main executable again
gcc -Wall -o main_demo main_demo.c -lfun1_std -lfun2_std -lfun1_linux -lfun2_linux
But during execution I got the following result again:
In fun of libfun2_std
In fun of libfun2_std
Let's see the readelf output for libfun2_linux.so
$ readelf -a libfun2_linux.so | grep fun
13: 00000000000006b5 18 FUNC GLOBAL DEFAULT 11 fun
34: 0000000000000000 0 FILE LOCAL DEFAULT ABS libfun2_linux.c
47: 00000000000006b5 18 FUNC GLOBAL DEFAULT 11 fun
Question:-
Why is having a version script for only one library not sufficient, given that the symbols are now different? If we create two version scripts (a second one for libfun2_linux.so), it works fine.
How do version scripts work? (I read some links explaining that they work by creating trees, but I did not fully understand them.) Please suggest a link where this is clearly explained.
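As a rough mental model (a simplified sketch, not glibc's actual lookup code), the dynamic loader searches the loaded objects in order and takes the first definition whose name matches; a version on the reference only filters out definitions that carry a different version. An unversioned reference, like the one libfun1_linux.so still makes, accepts the first `fun` it meets. The names `struct export`, `struct library` and `resolve` are hypothetical, invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One exported symbol; version == NULL means the symbol is unversioned. */
struct export {
    const char *name;
    const char *version;
};

/* A loaded shared object and its exports (hypothetical model). */
struct library {
    const char *soname;
    const struct export *exports;
    size_t count;
};

/*
 * Return the soname of the first library, in search order, whose export
 * satisfies the reference (name, want_version). want_version == NULL
 * models an unversioned reference. Returns NULL if nothing matches.
 */
const char *resolve(const struct library *libs, size_t nlibs,
                    const char *name, const char *want_version)
{
    assert(libs != NULL && name != NULL);
    for (size_t i = 0; i < nlibs; i++) {
        for (size_t j = 0; j < libs[i].count; j++) {
            const struct export *e = &libs[i].exports[j];
            if (strcmp(e->name, name) != 0)
                continue;
            /* If either side is unversioned, the name alone decides. */
            if (want_version == NULL || e->version == NULL)
                return libs[i].soname;
            /* Both sides versioned: the versions must agree. */
            if (strcmp(e->version, want_version) == 0)
                return libs[i].soname;
        }
    }
    return NULL;
}
```

With libfun2_std.so (exporting fun with version LIBFUN2_STD_1.0) searched before libfun2_linux.so (exporting an unversioned fun), `resolve(libs, 2, "fun", NULL)` yields libfun2_std.so, which matches the observed output. Once libfun2_linux.so also carries a version (say LIBFUN2_LINUX_1.0, a name assumed here) and libfun1_linux.so is relinked against it, the reference becomes versioned and skips the std definition.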

Related

How to verify external symbols in an .h file to the .c file?

In C it is an idiomatic pattern to have your .h file contain declarations of the externally visible symbols in the corresponding .c file. The purpose of this is to support a kind of "module & interface" thinking, e.g. enabling a cleaner structure.
In a big legacy C system I'm working on it is not uncommon that functions are declared in the wrong header files probably after moving a function to another module, since it still compiles, links and runs, but that makes the modules less explicit in their interfaces and indicates wrong dependencies.
Is there a way to verify / confirm / guarantee that the .h file has all the external symbols from .c and no external symbols that are not there?
E.g. if I have the following files
module.c
int func1(void) {}
bool func2(int c) {}
static int func3(void) {}
module.h
extern int func1(void);
extern bool func4(char *v);
I want to be pointed to the fact that func4 is not an external visible symbol in module.c and that func2 is missing.
Modern compilers give some assistance insofar as they can detect a missing declaration for something you actually referenced, but they do not care which file the declaration comes from.
What are my options, other than going over each pair manually, to obtain this information?
I want to be pointed to the fact that func4 is not an external visible symbol in module.c and that func2 is missing.
On a POSIX-ish Linux with bash, diff and ctags, and given the really simple example input files, you could do this:
$ #recreate input
$ cat <<EOF >module.c
int func1(void) {}
bool func2(int c) {}
static int func3(void) {}
EOF
$ cat <<EOF >module.h
extern int func1(void);
extern bool func4(char *v);
EOF
$ # helper function for extracting only non-static function declarations
$ f() { ctags -x --c-kinds=fp "$@" | grep -v static | cut -d' ' -f1; }
$ # simply a diff
$ diff <(f module.c) <(f module.h)
2,3c2
< func2
---
> func4
$ diff <(f module.c) <(f module.h) |
> grep '^<\|^>' |
> sed -E 's/> (.*)/I would like to point the fact that \1 is not externally visible symbol/; s/< (.*)/\1 is missing/'
func2 is missing
I would like to point the fact that func4 is not externally visible symbol
This will break if, for example, the static keyword is not on the same line as the function identifier, because ctags will then not include it in its output. So the real job here is getting the list of externally visible function declarations. This is not an easy task, and writing such a tool is left to others :)
This check is of limited value because, if you call a function that is not defined, the linker will complain anyway.
It is more important to have prototypes for all functions, as the compiler has to know how to call them; when they are missing, compilers emit warnings.
A note: you do not need the keyword extern, as functions have external linkage by default.
This is the time to shine for some of my favorite compiler warning flags:
CFLAGS += -Wmissing-prototypes \
-Wstrict-prototypes \
-Wmissing-declarations \
-Wold-style-declaration \
-Wold-style-definition \
-Wredundant-decls
This at least ensures, that all the source files containing implementations of a function that is not static also have a previous external declaration & prototype of said function, ie. in your example:
module.c:4:6: warning: no previous prototype for ‘func2’ [-Wmissing-prototypes]
4 | bool func2(int c) { return c == 0; }
| ^~~~~
If we'd provide just a forward declaration that doesn't constitute a full prototype we'd still get:
In file included from module.c:1:
module.h:7:1: warning: function declaration isn’t a prototype [-Wstrict-prototypes]
7 | extern bool func2();
| ^~~~~~
module.c:4:6: warning: no previous prototype for ‘func2’ [-Wmissing-prototypes]
4 | bool func2(int c) { return c == 0;}
| ^~~~~
Only providing a full prototype will fix that warning. However, there's no way to make sure that all declared functions are actually also implemented. One could go about this using linker module definition files, a script using nm(1) or a simple "example" or unit test program, that includes every header file and tries to call all functions.
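To make the effect of those flags concrete, here is a minimal single-file sketch of the header and source pair from the question, with the header part corrected so that every non-static definition is preceded by a full prototype. The function bodies are made up for illustration (the question leaves them empty); in a real project the first part would live in module.h and be included by module.c.

```c
#include <stdbool.h>

/* ---- what module.h would contain: full prototypes only ---- */
int func1(void);
bool func2(int c);

/* ---- what module.c would contain ---- */

/* Internal helper: static, so it needs no declaration in the header. */
static int func3(void)
{
    return 3;
}

/* Preceded by a prototype above, so -Wmissing-prototypes stays quiet. */
int func1(void)
{
    return func3() + 1;
}

bool func2(int c)
{
    return c == 0;
}
```

Removing the `bool func2(int c);` line from the header part should bring back the "no previous prototype" warning shown above when compiling with -Wmissing-prototypes.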
To list the differences between the exported symbols in a .c module in C and the corresponding .h file you can use chcheck. Just give the module name on the command line
python3 chcheck.py <module>
and it will list what externally visible functions are defined in the .c module but not exposed in the .h header file, and if there are any functions in the header module that are not defined in the corresponding .c file.
It only checks for function declarations/definitions at this point.
Disclaimer: I wrote this to solve my own problem. It's built in Python on top of @eliben's excellent pycparser.
Output for the example in the question is
Externally visible definitions in 'module.c' that are not in 'module.h':
func2
Declarations in 'module.h' that have no externally visible definition in 'module.c':
func4

Why size of extern variables is 0 in ELF?

Extern declarations of variables in a c program result in size 0 in ELF. Why isn't actual size stored in ELF when known? For cases like incomplete arrays I understand there is no size information but for other cases it should be possible to store size.
I tried some simple code and verified that the size emitted in the ELF file is zero.
// file1.c
extern int var;
int main()
{
var = 2;
}
// file2.c
long long int var = 8;
gcc -c file1.c
readelf -s file1.o
...
9: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND var
...
gcc -c file2.c
readelf -s file2.o
...
7: 0000000000000000 8 OBJECT GLOBAL DEFAULT 2 var
...
If the size of var were stored as 4 in file1.o, the linker could detect the potential size mismatch when linking with file2.o.
So why isn't the size emitted, given that it could help catch subtle issues like this?
In file1, var is just a placeholder. It will not occupy any memory. The extern keyword indicates to the compiler and the linker that the variable var is stored elsewhere.
It would be wrong to have two storage locations for the single variable var as you have suggested.
It is a quirk of the C language that you can define a different type for extern variable and a different type for the underlying global variable as you have done in your example. This is one of the reasons that we have static analysis tools.
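Since the ELF symbol table will not catch this, a common defence is at the source level: put the extern declaration in a header that the defining file also includes, so the compiler sees both and rejects a type mismatch. The sketch below folds that hypothetical header (called var.h here) and both source files into one translation unit; `set_var` is an invented helper, and the `_Static_assert` is an optional extra guard that requires C11.

```c
/* ---- var.h (hypothetical shared header): the single declaration ---- */
extern long long int var;

/* ---- file2.c: includes var.h, so definition and declaration must agree ---- */
long long int var = 8;

/* Optional C11 guard: compilation fails if var stops being 8 bytes wide. */
_Static_assert(sizeof(var) == 8, "var is expected to be 8 bytes");

/* ---- file1.c: also includes var.h instead of re-declaring var itself ---- */
void set_var(long long int value)
{
    var = value;
}
```

Had the header said `extern int var;`, the definition in the same translation unit would be rejected by the compiler as a conflicting type, which is the check the linker cannot perform.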

Location of global variables with DWARF (and relocation)

When dynamically linking a binary with libraries, relocation information is used to bind the variables/functions of the different ELF objects. However DWARF is not affected by relocation: how is a debugger supposed to resolve global variables?
Let's say I have liba.so (a.c) defining a global variable (using GNU/Linux with GCC or Clang):
#include <stdio.h>
int foo = 10;
void test(void) {
printf("&foo=%p\n", &foo);
}
and a program b linked against liba.so (b.c):
#include <stdio.h>
extern int foo;
void test(void); /* defined in liba.so */
int main(int argc, char** argv) {
test();
printf("&foo=%p\n", &foo);
return 0;
}
I expect that "foo" will be instantiated in liba.so
but in fact it is instantiated in both liba.so and b:
$ ./b
&foo=0x600c68 # <- b .bss
&foo=0x600c68 # <- b .bss
The foo variable which is used (both by b and by liba.so) is in the .bss of b
and not in liba.so:
[...]
0x0000000000600c68 - 0x0000000000600c70 is .bss
[...]
0x00007ffff7dda9c8 - 0x00007ffff7dda9d4 is .data in /home/foo/bar/liba.so
0x00007ffff7dda9d4 - 0x00007ffff7dda9d8 is .bss in /home/foo/bar/liba.so
The foo variable is instantiated twice:
once in liba.so (this instance is not used when linked with program b)
once in b (this instance is used instead of the other in b).
(I don't really understand why the variable is instantiated in the executable.)
There is only a declaration in b (as expected) in the DWARF information:
$ readelf -wi b
[...]
<1><ca>: Abbrev Number: 9 (DW_TAG_variable)
<cb> DW_AT_name : foo
<cf> DW_AT_decl_file : 1
<d0> DW_AT_decl_line : 3
<d1> DW_AT_type : <0x57>
<d5> DW_AT_external : 1
<d5> DW_AT_declaration : 1
[...]
and a location is found in liba.so:
$ readelf -wi liba.so
[...]
<1><90>: Abbrev Number: 5 (DW_TAG_variable)
<91> DW_AT_name : foo
<95> DW_AT_decl_file : 1
<96> DW_AT_decl_line : 3
<97> DW_AT_type : <0x57>
<9b> DW_AT_external : 1
<9b> DW_AT_location : 9 byte block: 3 d0 9 20 0 0 0 0 0 (DW_OP_addr: 2009d0)
[...]
This address is the location of the (unused) instance of foo in liba.so (.data).
I end up with 2 instances of the foo global variable (one in liba.so and one in b);
only the first one can be seen with DWARF;
only the second one is used.
How is the debugger supposed to resolve the foo global variable?
I don't really understand why the variable is instantiated in the executable.
You can find the answer here.
How is the debugger supposed to resolve the foo global variable
The debugger reads symbol tables (in addition to debug info), and foo does get defined in both the main executable b, and in liba.so:
nm b | grep foo
0000000000600c68 B foo
(I read the Oracle doc provided by @Employed Russian.)
The global variable is re-instantiated for non-PIC code so that the variable can be accessed in a non-PIC way without patching the non-PIC code:
a copy of the variable is made for non-PIC code;
the variable is instantiated in the executable;
a copy relocation is used to copy the data from the source shared object at dynamic linking time;
the instance in the shared object is not used (after the copy relocation has been applied).
Copy relocation instructions:
$ readelf -r b
Relocation section '.rela.dyn' at offset 0x638 contains 2 entries:
Offset Info Type Sym. Value Sym. Name + Addend
000000600c58 000300000006 R_X86_64_GLOB_DAT 0000000000000000 __gmon_start__ + 0
000000600ca8 001200000005 R_X86_64_COPY 0000000000600ca8 foo + 0
For functions, the GOT+PLT technique is used the same way they are used in PIC code.

Replacing static function in kernel module

Folks,
I'm trying to hack a kernel module by modifying its symbol. The basic idea is to replace the original function with new function by overwriting its address in the symtab. However, I found when declaring the function as static, the hacking fails. But it works with non-static function. My example code is below:
filename: orig.c
int fun(void) {
printk(KERN_ALERT "calling fun!\n");
return 0;
}
int evil(void) {
printk(KERN_ALERT "===== EVIL ====\n");
return 0;
}
static int init(void) {
printk(KERN_ALERT "Init Original!");
fun();
return 0;
}
void clean(void) {
printk(KERN_ALERT "Exit Original!");
return;
}
module_init(init);
module_exit(clean);
Then I followed styx's article to replace the original function "fun" in the symtab so that it calls the function "evil": http://www.phrack.org/issues.html?issue=68&id=11
>objdump -t orig.ko
...
000000000000001b g F .text 000000000000001b evil
0000000000000056 g F .text 0000000000000019 cleanup_module
0000000000000036 g F .text 0000000000000020 init_module
0000000000000000 g F .text 000000000000001b fun
...
By executing the elfchger
>./elfchger -s fun -v 1b orig.ko
[+] Opening orig.ko file...
[+] Reading Elf header...
>> Done!
[+] Finding ".symtab" section...
>> Found at 0xc630
[+] Finding ".strtab" section...
>> Found at 0xc670
[+] Getting symbol' infos:
>> Symbol found at 0x159f8
>> Index in symbol table: 0x1d
[+] Replacing 0x00000000 with 0x0000001b... done!
I can successfully change fun's symbol table entry to equal that of evil and, after inserting the module, see the effect:
000000000000001b g F .text 000000000000001b evil
...
000000000000001b g F .text 000000000000001b fun
> insmod ./orig.ko
> dmesg
[ 7687.797211] Init Original!
[ 7687.797215] ===== EVIL ====
This works fine. However, when I change the declaration of fun to "static int fun(void)" and follow the same steps as above, evil does not get called. Could anyone give me some suggestions?
Thanks,
William
Short version: Declaring a function as 'static' makes it local and prevents the symbol from being exported. Thus, the call is linked statically, and the dynamic linker does not affect the call in any way at load time.
Long Version
Declaring a symbol as 'static' prevents the compiler from exporting the symbol, making it local instead of global. You can verify this by looking for the (missing) 'g' in your objdump output, or at the lower-case 't' (instead of 'T') in the output of 'nm'. The compiler might also inline the local function, in which case the symbol table wouldn't contain it at all.
Local symbols have to be unique only for the translation unit in which they are defined. If your module consisted of multiple translation units, you could have a static fun() in each of them. An nm or objdump of the finished .ko may then contain multiple local symbols called fun.
This also implies that local symbols are valid only in their respective translation unit and can be referred to (in your case: called) only from inside this unit. Otherwise, the linker would not know which one you mean. Thus, the call to static fun() is already linked at compile time, before the module is loaded.
At load time, the dynamic linker won't tamper with the local symbol fun or references (in particular: calls) to it, since:
its local linkage is already done
there are potentially more symbols named 'fun' throughout, and the dynamic linker would not be able to tell which one you meant

Override a function call in C

I want to override certain function calls to various APIs for the sake of logging the calls, but I also might want to manipulate data before it is sent to the actual function.
For example, say I use a function called getObjectName thousands of times in my source code. I want to temporarily override this function sometimes because I want to change the behaviour of this function to see the different result.
I create a new source file like this:
#include <apiheader.h>
const char *getObjectName (object *anObject)
{
if (anObject == NULL)
return "(null)";
else
return "name should be here";
}
I compile all my other source as I normally would, but I link it against this function first before linking with the API's library. This works fine except I can obviously not call the real function inside my overriding function.
Is there an easier way to "override" a function without getting linking/compiling errors/warnings? Ideally I want to be able to override the function by just compiling and linking an extra file or two rather than fiddle around with linking options or altering the actual source code of my program.
With gcc, under Linux you can use the --wrap linker flag like this:
gcc program.c -Wl,-wrap,getObjectName -o program
and define your function as:
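const char *__real_getObjectName (object *anObject); // resolved by the linker to the real function
const char *__wrap_getObjectName (object *anObject)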
{
if (anObject == NULL)
return "(null)";
else
return __real_getObjectName( anObject ); // call the real function
}
This will ensure that all calls to getObjectName() are rerouted to your wrapper function (at link time). This very useful flag is however absent in gcc under Mac OS X.
Remember to declare the wrapper function with extern "C" if you're compiling with g++ though.
If it's only for your source that you want to capture/modify the calls, the simplest solution is to put together a header file (intercept.h) with:
#ifdef INTERCEPT
#define getObjectName(x) myGetObjectName(x)
#endif
Then you implement the function as follows (in intercept.c which doesn't include intercept.h):
const char *myGetObjectName (object *anObject) {
if (anObject == NULL) return "(null)";
return getObjectName(anObject);
}
Then make sure each source file where you want to intercept the call has the following at the top:
#include "intercept.h"
When you compile with "-DINTERCEPT", all files will call your function rather than the real one, whereas your function will still call the real one.
Compiling without the "-DINTERCEPT" will prevent interception from occurring.
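Put together in a single file, the idea looks roughly like this. The `object` type and the body of `getObjectName` are stand-ins invented for the sketch (the real ones come from the API's header and library), and `describe` is a hypothetical caller. The macro is defined after the wrapper so that the wrapper itself still reaches the real function, which is the role that keeping intercept.h out of intercept.c plays above.

```c
#include <stddef.h>

/* Stand-in for the API's type and function (assumed, not the real API). */
typedef struct object {
    const char *name;
} object;

const char *getObjectName(object *anObject)
{
    return anObject->name;
}

/* intercept.c: the wrapper; the macro is not visible here yet. */
const char *myGetObjectName(object *anObject)
{
    if (anObject == NULL)
        return "(null)";
    return getObjectName(anObject); /* the real function */
}

/* intercept.h: from here on, calls are rerouted to the wrapper. */
#define getObjectName(x) myGetObjectName(x)

/* A caller compiled with the macro in effect (hypothetical). */
const char *describe(object *anObject)
{
    return getObjectName(anObject); /* expands to myGetObjectName(...) */
}
```

Because this is pure text substitution, it only affects source files that see the macro; calls made from inside an already compiled library are untouched.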
It's a bit trickier if you want to intercept all calls (not just those from your source) - this can generally be done with dynamic loading and resolution of the real function (with dlopen- and dlsym-type calls) but I don't think it's necessary in your case.
You can override a function using LD_PRELOAD trick - see man ld.so. You compile shared lib with your function and start the binary (you even don't need to modify the binary!) like LD_PRELOAD=mylib.so myprog.
In the body of your function (in shared lib) you write like this:
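#define _GNU_SOURCE /* needed for RTLD_NEXT on glibc */
#include <dlfcn.h>
#include <stdio.h>
const char *getObjectName (object *anObject) {
static const char * (*func)(object *);
if(!func)
func = (const char *(*)(object *)) dlsym(RTLD_NEXT, "getObjectName"); // look up the next definition after this library
printf("Overridden!\n");
return(func(anObject)); // call original function
}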
You can override any function from shared library, even from stdlib, without modifying/recompiling the program, so you could do the trick on programs you don't have a source for. Isn't it nice?
If you use GCC, you can make your function weak. Those can be overridden by non-weak functions:
test.c:
#include <stdio.h>
__attribute__((weak)) void test(void) {
printf("not overridden!\n");
}
int main() {
test();
}
What does it do?
$ gcc test.c
$ ./a.out
not overridden!
test1.c:
#include <stdio.h>
void test(void) {
printf("overridden!\n");
}
What does it do?
$ gcc test1.c test.c
$ ./a.out
overridden!
Sadly, that won't work for other compilers. But you can have the weak declarations that contain overridable functions in their own file, placing just an include into the API implementation files if you are compiling using GCC:
weakdecls.h:
__attribute__((weak)) void test(void);
... other weak function declarations ...
functions.c:
/* for GCC, these will become weak definitions */
#ifdef __GNUC__
#include "weakdecls.h"
#endif
void test(void) {
...
}
... other functions ...
Downside of this is that it does not work entirely without doing something to the api files (needing those three lines and the weakdecls). But once you did that change, functions can be overridden easily by writing a global definition in one file and linking that in.
You can define a function pointer as a global variable. The callers syntax would not change. When your program starts, it could check if some command-line flag or environment variable is set to enable logging, then save the function pointer's original value and replace it with your logging function. You would not need a special "logging enabled" build. Users could enable logging "in the field".
You will need to be able to modify the callers' source code, but not the callee (so this would work when calling third-party libraries).
foo.h:
typedef const char* (*GetObjectNameFuncPtr)(object *anObject);
extern GetObjectNameFuncPtr GetObjectName;
foo.cpp:
const char* GetObjectName_real(object *anObject)
{
return "object name";
}
const char* GetObjectName_logging(object *anObject)
{
if (anObject == NULL)
return "(null)";
else
return GetObjectName_real(anObject);
}
GetObjectNameFuncPtr GetObjectName = GetObjectName_real;
int main()
{
GetObjectName(NULL); // calls GetObjectName_real();
if (isLoggingEnabled)
GetObjectName = GetObjectName_logging;
GetObjectName(NULL); // calls GetObjectName_logging();
}
Building on @Johannes Schaub's answer with a solution suitable for code you don't own.
Alias the function you want to override to a weakly-defined function, and then reimplement it yourself.
override.h
#define foo(x) __attribute__((weak))foo(x)
foo.c
int foo(void) { return 1234; }
override.c
int foo(void) { return 5678; }
Use pattern-specific variable values in your Makefile to add the compiler flag -include override.h.
%foo.o: ALL_CFLAGS += -include override.h
Aside: Perhaps you could also use -D 'foo(x) __attribute__((weak))foo(x)' to define your macros.
Compile and link the file with your reimplementation (override.c).
This allows you to override a single function from any source file, without having to modify the code.
The downside is that you must use a separate header file for each file you want to override.
There's also a tricky method of doing it in the linker involving two stub libraries.
Library #1 is linked against the host library and exposes the symbol being redefined under another name.
Library #2 is linked against library #1, intercepting the call and calling the redefined version in library #1.
Be very careful with link orders here or it won't work.
Below are my experiments. There are 4 conclusions in the body and in the end.
Short Version
Generally speaking, to successfully override a function, you have to consider:
weak attribute
translation unit arrangement
Long Version
I have these source files.
.
├── decl.h
├── func3.c
├── main.c
├── Makefile1
├── Makefile2
├── override.c
├── test_target.c
└── weak_decl.h
main.c
#include <stdio.h>
void main (void)
{
func1();
}
test_target.c
#include <stdio.h>
void func3(void);
void func2 (void)
{
printf("in original func2()\n");
}
void func1 (void)
{
printf("in original func1()\n");
func2();
func3();
}
func3.c
#include <stdio.h>
void func3 (void)
{
printf("in original func3()\n");
}
decl.h
void func1 (void);
void func2 (void);
void func3 (void);
weak_decl.h
void func1 (void);
__attribute__((weak))
void func2 (void);
__attribute__((weak))
void func3 (void);
override.c
#include <stdio.h>
void func2 (void)
{
printf("in mock func2()\n");
}
void func3 (void)
{
printf("in mock func3()\n");
}
Makefile1:
ALL:
rm -f *.o *.a
gcc -c override.c -o override.o
gcc -c func3.c -o func3.o
gcc -c test_target.c -o test_target_weak.o -include weak_decl.h
ar cr all_weak.a test_target_weak.o func3.o
gcc main.c all_weak.a override.o -o main -include decl.h
Makefile2:
ALL:
rm -f *.o *.a
gcc -c override.c -o override.o
gcc -c func3.c -o func3.o
gcc -c test_target.c -o test_target_strong.o -include decl.h # HERE -include differs!!
ar cr all_strong.a test_target_strong.o func3.o
gcc main.c all_strong.a override.o -o main -include decl.h
Output for Makefile1 result:
in original func1()
in mock func2()
in mock func3()
Output for Makefile2:
rm *.o *.a
gcc -c override.c -o override.o
gcc -c func3.c -o func3.o
gcc -c test_target.c -o test_target_strong.o -include decl.h # -include differs!!
ar cr all_strong.a test_target_strong.o func3.o
gcc main.c all_strong.a override.o -o main -include decl.h
override.o: In function `func2':
override.c:(.text+0x0): multiple definition of `func2' <===== HERE!!!
all_strong.a(test_target_strong.o):test_target.c:(.text+0x0): first defined here
override.o: In function `func3':
override.c:(.text+0x13): multiple definition of `func3' <===== HERE!!!
all_strong.a(func3.o):func3.c:(.text+0x0): first defined here
collect2: error: ld returned 1 exit status
Makefile4:2: recipe for target 'ALL' failed
make: *** [ALL] Error 1
The symbol table:
all_weak.a:
test_target_weak.o:
0000000000000013 T func1 <=== 13 is the offset of func1 in test_target_weak.o, see below disassembly
0000000000000000 W func2 <=== func2 is [W]eak symbol with default value assigned
w func3 <=== func3 is [w]eak symbol without default value
U _GLOBAL_OFFSET_TABLE_
U puts
func3.o:
0000000000000000 T func3 <==== func3 is a strong symbol
U _GLOBAL_OFFSET_TABLE_
U puts
all_strong.a:
test_target_strong.o:
0000000000000013 T func1
0000000000000000 T func2 <=== func2 is strong symbol
U func3 <=== func3 is undefined symbol, there's no address value on the left-most column because func3 is not defined in test_target_strong.c
U _GLOBAL_OFFSET_TABLE_
U puts
func3.o:
0000000000000000 T func3 <=== func3 is strong symbol
U _GLOBAL_OFFSET_TABLE_
U puts
In both cases, the override.o symbols:
0000000000000000 T func2 <=== func2 is strong symbol
0000000000000013 T func3 <=== func3 is strong symbol
U _GLOBAL_OFFSET_TABLE_
U puts
disassembly:
test_target_weak.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <func2>: <===== HERE func2 offset is 0
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: 48 8d 3d 00 00 00 00 lea 0x0(%rip),%rdi # b <func2+0xb>
b: e8 00 00 00 00 callq 10 <func2+0x10>
10: 90 nop
11: 5d pop %rbp
12: c3 retq
0000000000000013 <func1>: <====== HERE func1 offset is 13
13: 55 push %rbp
14: 48 89 e5 mov %rsp,%rbp
17: 48 8d 3d 00 00 00 00 lea 0x0(%rip),%rdi # 1e <func1+0xb>
1e: e8 00 00 00 00 callq 23 <func1+0x10>
23: e8 00 00 00 00 callq 28 <func1+0x15>
28: e8 00 00 00 00 callq 2d <func1+0x1a>
2d: 90 nop
2e: 5d pop %rbp
2f: c3 retq
So the conclusion is:
A function defined in a .o file can override the same function defined in a .a file. In Makefile1 above, func2() and func3() in override.o override their counterparts in all_weak.a. I tried with both as .o files, but it doesn't work.
For GCC, You don't need to split the functions into separate .o files as said in here for Visual Studio toolchain. We can see in above example, both func2() (in the same file as func1()) and func3() (in a separate file) can be overridden.
To override a function, when compiling its consumer's translation unit, you need to specify that function as weak. That will record that function as weak in the consumer.o. In above example, when compiling the test_target.c, which consumes func2() and func3(), you need to add -include weak_decl.h, which declares func2() and func3() as weak. The func2() is also defined in test_target.c but it's OK.
Some further experiment
Still with the above source files. But change the override.c a bit:
override.c
#include <stdio.h>
void func2 (void)
{
printf("in mock func2()\n");
}
// void func3 (void)
// {
// printf("in mock func3()\n");
// }
Here I removed the override version of func3(). I did this because I want to fall back to the original func3() implementation in the func3.c.
I still use Makefile1 to build. The build is OK. But a runtime error happens as below:
xxx@xxx-host:~/source/override$ ./main
in original func1()
in mock func2()
Segmentation fault (core dumped)
So I checked the symbols of the final main:
0000000000000696 T func1
00000000000006b3 T func2
w func3
So we can see that func3 has no valid address. That's why the segmentation fault happens.
So why? Didn't I add the func3.o into the all_weak.a archive file?
ar cr all_weak.a func3.o test_target_weak.o
I tried the same thing with func2, removing the func2 implementation from override.c. But this time there's no segmentation fault.
override.c
#include <stdio.h>
// void func2 (void)
// {
// printf("in mock func2()\n");
// }
void func3 (void)
{
printf("in mock func3()\n");
}
Output:
xxx@xxx-host:~/source/override$ ./main
in original func1()
in original func2() <====== the original func2() is invoked as a fall back
in mock func3()
My guess is, because func2 is defined in the same file/translation unit as func1. So func2 is always brought in with func1. So the linker can always resolve func2, be it from the test_target.c or override.c.
But for func3, it is defined in a separate file/translation unit (func3.c). If it is declared as weak, the consumer test_target.o will still record func3() as weak. But unfortunately the GCC linker will not check the other .o files from the same .a file to look for an implementation of func3(). Though it is indeed there.
all_weak.a:
func3.o:
0000000000000000 T func3 <========= func3 is indeed here!
U _GLOBAL_OFFSET_TABLE_
U puts
test_target_weak.o:
0000000000000013 T func1
0000000000000000 W func2
w func3
U _GLOBAL_OFFSET_TABLE_
U puts
So I must provide an override version in override.c otherwise the func3() cannot be resolved.
But I still don't know why GCC behaves like this. If someone can explain, please.
(Update 9:01 AM 8/8/2021:
this thread may explain this behavior, hopefully.)
So further conclusion is:
If you declare some symbol as weak, you'd better provide override versions of all the weak functions. Otherwise, the original version cannot be resolved unless it lives within the same file/translation unit of the caller/consumer.
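If skipping the call is acceptable when no definition gets linked in, a defensive alternative (a sketch, assuming GCC or Clang on an ELF platform such as Linux) is to test a weak reference before calling through it. An unresolved weak function has a null address, which is what the segmentation fault above ran into. `call_func3_if_present` is a hypothetical helper name.

```c
#include <stddef.h>

/* Weak declaration only: linking succeeds even if nobody defines func3. */
__attribute__((weak)) void func3(void);

/*
 * Call func3 only when the linker actually resolved it.
 * Returns 1 if func3 was called, 0 if it was absent.
 */
int call_func3_if_present(void)
{
    if (func3 != NULL) {
        func3();
        return 1;
    }
    return 0;
}
```

This does not make the linker pull func3.o out of the archive; it only turns the crash into a detectable condition.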
You could use a shared library (Unix) or a DLL (Windows) to do this as well (would be a bit of a performance penalty). You can then change the DLL/so that gets loaded (one version for debug, one version for non-debug).
I have done a similar thing in the past (not to achieve what you are trying to achieve, but the basic premise is the same) and it worked out well.
[Edit based on OP comment]
In fact one of the reasons I want to override functions is because I suspect they behave differently on different operating systems.
There are two common ways (that I know of) of dealing with that, the shared lib/dll way or writing different implementations that you link against.
For both solutions (shared libs or different linking) you would have foo_linux.c, foo_osx.c, foo_win32.c (or a better way is linux/foo.c, osx/foo.c and win32/foo.c) and then compile and link with the appropriate one.
If you are looking for both different code for different platforms AND debug -vs- release I would probably be inclined to go with the shared lib/DLL solution as it is the most flexible.
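For small differences, a lighter-weight variant (a different technique from the per-platform source files described above) is to select the implementation with predefined compiler macros inside one file. The function name `platform_name` is hypothetical; `__linux__`, `__APPLE__` and `_WIN32` are macros that the common compilers predefine on those systems.

```c
/* Return a label for the platform this file was compiled for. */
const char *platform_name(void)
{
#if defined(__linux__)
    return "linux";   /* what foo_linux.c would implement */
#elif defined(__APPLE__)
    return "osx";     /* what foo_osx.c would implement */
#elif defined(_WIN32)
    return "win32";   /* what foo_win32.c would implement */
#else
    return "unknown"; /* no platform-specific version available */
#endif
}
```

Separate files scale better once the platform-specific code grows beyond a few lines, which is why the layout above keeps one foo.c per platform.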
