Function with same names in different files - C - c

I have two sets of functions with the same names and want to use them in my application.
I looked at various answers, such as here and here, but couldn't find a clear solution.
I have following functions
// xxxx_input.h
int8_t input_system_init(InputParams params);
int8_t input_system_easy_load(uint32_t interval_ms);
// yyyy_input.h
int8_t input_system_init(InputParams params);
int8_t input_system_easy_load(uint32_t interval_ms);
The reason there are two files is that xxxx_input and yyyy_input work very differently internally.
Modifying the functions isn't easy since the code is provided by an external party and we have to keep the xxxx_input files.
What we can do is modify yyyy_input.h, but functions like input_system_easy_load have to be kept consistent as they are called from different places.
Is there a way to achieve this?
I tried replacing xxxx_input.h with yyyy_input.h, but since the include directory already contains the same functions, it gives this error:
input_system_init multiply defined (by xxxx_input.o and yyyy_input.o).

If you have the source code to the functions defined in xxxx_input.h and yyyy_input.h, you could compile both modules with command line options redefining the function names via the preprocessor:
gcc -Dinput_system_init=xxxx_input_system_init -Dinput_system_easy_load=xxxx_input_system_easy_load xxxx_input.c
gcc -Dinput_system_init=yyyy_input_system_init -Dinput_system_easy_load=yyyy_input_system_easy_load yyyy_input.c
You would then compile your code with modified prototypes and you could link all 3 modules together.
If the modules are provided in object form only, you could define wrapper functions xxxx_input_system_init and xxxx_input_system_easy_load that you would link with the xxxx_input.o to produce a dynamic library, and the same for the yyyy alternatives. You would use modified prototypes in your module and would link it with the dynamic libraries.
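A wrapper of this kind is just a uniquely named function that forwards to the vendor symbol. The following is only a sketch: the InputParams layout and the body of input_system_init are stand-ins invented so the file builds on its own, since in a real build that definition would come from xxxx_input.o. The original name would presumably also have to be kept out of each library's exported symbols (for example with a version script or -Bsymbolic), otherwise both wrappers could end up bound to the same implementation.

```c
#include <stdint.h>

/* Hypothetical stand-in for the vendor's parameter type. */
typedef struct { int channel; } InputParams;

/* Stand-in for the definition that would really live in xxxx_input.o.
   It is defined here only so that this sketch links on its own. */
int8_t input_system_init(InputParams params) {
    return (int8_t)(params.channel > 0 ? 0 : -1);
}

/* The wrapper: a uniquely named entry point that forwards to the
   vendor symbol it was linked together with. */
int8_t xxxx_input_system_init(InputParams params) {
    return input_system_init(params);
}
```

The application would then declare and call xxxx_input_system_init (and a matching yyyy_ wrapper built the same way against yyyy_input.o).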
Mike Kinghan showed a simpler approach for object files and libraries on systems where objcopy is available.
To get modified prototypes automatically, you can use this include file:
my_input_system.h:
#define input_system_init xxxx_input_system_init
#define input_system_easy_load xxxx_input_system_easy_load
#include "xxxx_input.h"
#undef input_system_init
#undef input_system_easy_load
#define input_system_init yyyy_input_system_init
#define input_system_easy_load yyyy_input_system_easy_load
#include "yyyy_input.h"
#undef input_system_init
#undef input_system_easy_load
/* prevent direct use of the redefined functions */
#define input_system_init do_not_use_input_system_init#
#define input_system_easy_load do_not_use_input_system_easy_load#
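The renaming trick can be tried out in a single file. This is a sketch under assumptions: the InputParams layout and the two function bodies are invented for illustration, and the two definitions stand in for the vendor sources compiled with the -D options shown above.

```c
#include <stdint.h>

/* Hypothetical stand-in for the vendors' parameter type. */
typedef struct { int channel; } InputParams;

/* First module: while the macro is active, every occurrence of
   input_system_init is rewritten to xxxx_input_system_init. */
#define input_system_init xxxx_input_system_init
int8_t input_system_init(InputParams params) {
    return (int8_t)(params.channel + 1);
}
#undef input_system_init

/* Second module: same source-level name, different final symbol. */
#define input_system_init yyyy_input_system_init
int8_t input_system_init(InputParams params) {
    return (int8_t)(params.channel * 2);
}
#undef input_system_init

/* The application selects an implementation by its prefixed name. */
int8_t init_both(InputParams params) {
    return (int8_t)(xxxx_input_system_init(params) +
                    yyyy_input_system_init(params));
}
```

Both definitions are written with the same identifier, yet the object code ends up with two distinct symbols, which is what lets all three modules link together.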

I'll walk through a solution you can use if your supplier has given you object files compiled
with GCC or Clang for GNU/Linux computers.
My supplier has given me a header file foo_a.h that declares the function foo
$ cat foo_a.h
#pragma once
extern void foo(void);
and the matching object file foo_a.o:
$ nm foo_a.o
0000000000000000 T foo
U _GLOBAL_OFFSET_TABLE_
U puts
that defines foo.
Likewise they've given me a header foo_b.h that also declares foo
$ cat foo_b.h
#pragma once
extern void foo(void);
and the matching object file foo_b.o
$ nm foo_b.o
0000000000000000 T foo
U _GLOBAL_OFFSET_TABLE_
U puts
that also defines foo.
The functions foo_a.o:foo and foo_b.o:foo do different
things (or different variations of the same thing). I want to do both these things in the same program,
prog.c:
$ cat prog.c
extern void foo_a(void);
extern void foo_b(void);
int main(void)
{
foo_a(); // Calls `foo_a.o:foo`
foo_b(); // Calls `foo_b.o:foo`
return 0;
}
I can make such a program as follows:
$ objcopy --redefine-sym foo=foo_a foo_a.o prog_foo_a.o
$ objcopy --redefine-sym foo=foo_b foo_b.o prog_foo_b.o
Now I have made a copy prog_foo_a.o of foo_a.o in which the symbol foo is
renamed foo_a, and a copy prog_foo_b.o of foo_b.o in which
the symbol foo is renamed foo_b.
Then I compile and link like this:
$ gcc -c -Wall -Wextra prog.c
$ gcc -o prog prog.o prog_foo_a.o prog_foo_b.o
And prog runs like:
$ ./prog
foo_a
foo_b
Perhaps my supplier has given me foo_a.o within a static library liba.a
that also contains other object files that refer to foo_a.o:foo? And similarly
with foo_b.o.
That's OK. Instead of:
$ objcopy --redefine-sym foo=foo_a foo_a.o prog_foo_a.o
$ objcopy --redefine-sym foo=foo_b foo_b.o prog_foo_b.o
I will run:
$ objcopy --redefine-sym foo=foo_a liba.a libprog_a.a
$ objcopy --redefine-sym foo=foo_b libb.a libprog_b.a
and this will give me a new static library libprog_a.a
in which foo is renamed foo_a in all the object files in the
library. Similarly foo is renamed foo_b throughout libprog_b.a.
Then I'll link prog:
$ gcc -o prog prog.o -L. -lprog_a -lprog_b
Consider a potential drawback with this solution. Possibly my supplier has
given me foo_a.o and foo_b.o with debugging information in them, and I want to
use it for debugging my prog with gdb?
I have changed the original symbol names, foo_a.o:foo to foo_a and
foo_b.o:foo to foo_b, but I haven't changed the debugging info associated
with those symbols. Debugging with gdb will still work, but some of the
debugging output will be incorrect and possibly confusing. E.g. if I put
a breakpoint on foo_a, gdb will run to it and stop, but it will say
it has stopped at foo from file foo_a.c. And if I then breakpoint at foo_b, gdb will run to
it and again say it is at foo, but from file foo_b.c. If the person doing
the debugging doesn't know how the program was built, this would certainly be
confusing.
But giving you debugging info with binaries is not far from giving you the
source code, so as you haven't got source code you likely don't have
debugging info and are not concerned about it.

Related

Rename a function without changing its references

I have an object file compiled using gcc with the -ffunction-sections option. I have access to the source file but I am not allowed to modify it.
file.c
void foo(void)
{
bar();
}
void bar(void)
{
abc();
}
What I am trying to achieve is to make all references to bar resolve to an absolute address (which I'll assign in the linker script), whereas bar itself will be placed at some other address by the linker.
A possible solution is to rename bar to file_bar without changing the call to bar inside foo(). I tried using objcopy --redefine-sym, but it seems to rename even the calls to bar.
The solution provided by busybee solves the problem unless the functions are in the same compilation unit.
foo1.c
#include <stdio.h>
extern void bar1();
void foo1(){
printf("foo1\n");
}
int main(){
printf("main\n");
foo1();
bar1();
}
bar1.c
#include <stdio.h>
void bar1(){
printf("bar1\n");
}
wrapper.c
#include <stdio.h>
void __wrap_foo1(){
printf("wrap_foo1\n");
}
void __wrap_bar1(){
printf("wrap_bar1\n");
}
Now,
$ gcc -c -ffunction-sections foo1.c bar1.c wrapper.c
$ gcc -Wl,--wrap=foo1 -Wl,--wrap=bar1 -o output foo1.o bar1.o wrapper.o
$ ./output
main
foo1
wrap_bar1
All functions to be redirected are in their own compilation unit
The linker has the option "--wrap" that replaces all references to the symbol "xxx" by "__wrap_xxx" and the symbol itself by "__real_xxx". It is used to put a wrapper function as an "interceptor" in between call and function.
But with this option you can do whatever you like with those symbols in your linker script. You just need to define "__wrap_xxx" with a symbol so that the references are resolvable.
Depending on your needs you can also write a dummy function named "__wrap_xxx()" that does not even call "__real_xxx()". Or you can place "__real_xxx" in a vector table, or... whatever you can think of.
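To make the --wrap contract concrete, here is a minimal sketch of an interceptor that does call the original. The definition of __real_foo1 and the two counters are assumptions added only so the file builds and can be checked without the linker flag. In a real build that definition would be replaced by a plain declaration, the original would be named foo1, and the program would be linked with -Wl,--wrap=foo1.

```c
#include <stdio.h>

int real_calls;  /* illustration only: counts calls reaching the original */
int wrap_calls;  /* illustration only: counts calls reaching the wrapper */

/* Stand-in for the original function. With -Wl,--wrap=foo1 the linker
   itself resolves __real_foo1 to the original foo1. */
void __real_foo1(void) {
    real_calls++;
    printf("foo1\n");
}

/* The interceptor: under --wrap, undefined references to foo1 from
   other object files are redirected to this function. */
void __wrap_foo1(void) {
    wrap_calls++;
    printf("wrap_foo1: before\n");
    __real_foo1();
    printf("wrap_foo1: after\n");
}
```

Only undefined references are redirected, which is consistent with the output shown in the question: the call to foo1 from main in the same compilation unit was not intercepted, while the call to bar1 was.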
All functions to be redirected are non-static ("global"), patching immediate values
I looked through the answers of the other question the OP posted in a comment. This gave me the idea to weaken the symbols in question and to override them with a value by the linker.
This example might give you some insight. I tested it on Linux, which has address space layout randomization, so all addresses are offsets from a random base. But for the OP's target system it should work as expected.
foo1.c
Because of arbitrary values for the redirected addresses the functions can't be called. But the program can print their addresses.
#include <stdio.h>
void foo1(void) {
}
extern void bar1(void);
int main(void) {
printf("%p\n", main);
printf("%p\n", foo1);
printf("%p\n", bar1);
return 0;
}
bar1.c
void bar1(void) {
}
wrapper.ld
This is the first alternative for giving the linker the addresses to be used: an additional linker script. For the second one see below. The standard linker script is augmented here, so there is no need to copy and patch it. Because of its simple structure, this is probably the simplest way to provide many redirected addresses, and it can easily be automated.
foo1 = 0x1000;
bar1 = 0x2000;
Note: This is not C! It is "linker script" syntax which happens to be quite similar.
How I built and tested
This command sequence can be automated and arranged to your liking. In particular, the calls of objcopy could be done by a loop over a list.
gcc -c -ffunction-sections foo1.c
objcopy --weaken-symbol=foo1 foo1.o foo2.o
gcc -c -ffunction-sections bar1.c
objcopy --weaken-symbol=bar1 bar1.o bar2.o
gcc foo1.o bar1.o -o original
echo original
./original
gcc foo2.o bar2.o -o weakened
echo weakened
./weakened
gcc foo2.o bar2.o wrapper.ld -o redirected
echo redirected
./redirected
Instead of an additional linker script the symbol definitions can be given on the command line, too. This is the mentioned second alternative.
gcc foo2.o bar2.o -Wl,--defsym=foo1=0x1000 -Wl,--defsym=bar1=0x2000 -o redirected
BTW, the linker understands @file to read all arguments from the file named file. So there's "no limit" on the size of the linker command.
All functions to be redirected are non-static ("global"), overwriting with new functions
Instead of providing immediate values you can of course just provide your alternative functions. This works like above but instead of the additional linker script or symbol definitions you write a source file.
wrapper.c
Yes, that's right, the names are equal to the names of the originals! Because we made the symbols of the original functions weak, we'll get no error message from the linker when it overwrites the references with the addresses of the new functions.
void foo1(void) {
}
void bar1(void) {
}
Build the redirected program like this (only new commands shown):
gcc -c -ffunction-sections wrapper.c
gcc foo2.o bar2.o wrapper.o -o redirected
A function to be redirected is static
Well, depending on your target architecture it will probably not be possible. This is because of the relocation entry of the reference. It will be some kind of relative relocation, telling the linker to resolve by an offset into the section of the function instead of resolving by the symbol of the function.
I didn't investigate this further.

Remove dead code when linking static library into dynamic library

Suppose I have the following files:
libmy_static_lib.c:
#include <stdio.h>
void func1(void){
printf("func1() called from a static library\n");
}
void unused_func1(void){
printf("printing from the unused function1\n");
}
void unused_func2(void){
printf("printing from unused function2\n");
}
libmy_static_lib.h:
void func1(void);
void unused_func1(void);
void unused_func2(void);
my_prog.c:
#include "libmy_static_lib.h"
#include <stdio.h>
void func_in_my_prog()
{
printf("in my prog\n");
func1();
}
And here is how I link the library:
# build the static library libmy_static_lib.a
gcc -fPIC -c -fdata-sections --function-sections -c libmy_static_lib.c -o libmy_static_lib.o
ar rcs libmy_static_lib.a libmy_static_lib.o
# build libmy_static_lib.a into a new shared library
gcc -fPIC -c ./my_prog.c -o ./my_prog.o
gcc -Wl,--gc-sections -shared -m64 -o libmy_shared_lib.so ./my_prog.o -L. -l:libmy_static_lib.a
There are 2 functions in libmy_static_lib.c that are not used, and from this post, I think
gcc -fdata-sections --function-sections
should create a section for each function, and
gcc -Wl,--gc-sections
should remove the unused sections when linking.
However, when I run
nm libmy_shared_lib.so
It is showing that these 2 unused functions are also being linked into the shared library.
Any suggestions on how to have gcc remove the unused functions automatically?
Edit:
I am able to use the above options for gcc to remove the unused functions if I am linking a static library directly to executable. But it doesn't remove the unused functions if I link the static library to a shared library.
You can use a version script to mark the entry points in combination with -ffunction-sections and --gc-sections.
For example, consider this C file (example.c):
int
foo (void)
{
return 17;
}
int
bar (void)
{
return 251;
}
And this version script, called version.script:
{
global: foo;
local: *;
};
Compile and link the sources like this:
gcc -Wl,--gc-sections -shared -ffunction-sections -Wl,--version-script=version.script example.c
If you look at the output of objdump -d --reloc a.out, you will notice that only foo is included in the shared object, but not bar.
When removing functions in this way, the linker will take indirect dependencies into account. For example, if you turn foo into this:
void *
foo (void)
{
extern int bar (void);
return bar;
}
the linker will put both foo and bar into the shared object because both are needed, even though only foo is exported.
(Obviously, this will not work on all platforms, but ELF supports this.)
You're creating a library, and your symbols aren't static, so it's normal that the linker doesn't remove any global symbols.
The --gc-sections option is designed for executables. The linker starts from the entry point (main) and discovers the function calls. It marks the sections that are used and discards the others.
A library doesn't have 1 entrypoint, it has as many entrypoints as global symbols, which explains that it cannot clean your symbols. What if someone uses your .h file in his program and calls the "unused" functions?
To find out which functions aren't "used", I'd suggest that you convert void func_in_my_prog() to int main() (or copy the source into a modified one containing a main()), then create an executable with the sources, and add -Wl,-Map=mapfile.txt option when linking to create a mapfile.
gcc -Wl,--gc-sections -Wl,--Map=mapfile.txt -fdata-sections -ffunction-sections libmy_static_lib.c my_prog.c
This mapfile contains the discarded symbols:
Discarded input sections
.drectve 0x00000000 0x54 c:/gnatpro/17.1/bin/../lib/gcc/i686-pc-mingw32/6.2.1/crt2.o
.drectve 0x00000000 0x1c c:/gnatpro/17.1/bin/../lib/gcc/i686-pc-
...
.text$unused_func1
0x00000000 0x14 C:\Users\xx\AppData\Local\Temp\ccOOESqJ.o
.text$unused_func2
0x00000000 0x14 C:\Users\xx\AppData\Local\Temp\ccOOESqJ.o
.rdata$zzz 0x00000000 0x38 C:\Users\xx\AppData\Local\Temp\ccOOESqJ.o
...
Now we see that the unused functions have been removed. They don't appear in the final executable anymore.
There are existing tools that do that (using this technique but not requiring a main), for instance Callcatcher. One can also easily create a tool to disassemble the library and check for symbols defined but not called (I've written such tools in python several times and it's so much easier to parse assembly than from high-level code)
To cleanup, you can delete the unused functions manually from your sources (one must be careful with object-oriented languages and dispatching calls when using existing/custom assembly analysis tools. On the other hand, the compiler isn't going to remove a section that could be used, so that is safe)
You can also remove the relevant sections from the library file without changing the source code (the section names are quoted so the shell does not expand the $):
$ objcopy --remove-section '.text$unused_func1' --remove-section '.text$unused_func2' libmy_static_lib.a stripped.a
$ nm stripped.a
libmy_static_lib.o:
00000000 b .bss
00000000 d .data
00000000 r .rdata
00000000 r .rdata$zzz
00000000 t .text
00000000 t .text$func1
00000000 T _func1
U _puts

C symbol visibility in static archives

I have files foo.c bar.c and baz.c, plus wrapper code myfn.c defining a function myfn() that uses code and data from those other files.
I would like to create something like an object file or archive, myfn.o or libmyfn.a, so that myfn() can be made available to other projects without also exporting a load of symbols from {foo,bar,baz}.o as well.
What's the right way to do that in Linux/gcc? Thanks.
Update: I've found one way of doing it. I should've emphasised originally that this was about static archives, not DSOs. Anyway, the recipe:
1. #define PUBLIC __attribute__ ((visibility("default"))) then mark myfn() as PUBLIC in myfn.c. Don't mark anything else PUBLIC.
2. Compile objects with gcc -c foo.c bar.c baz.c myfn.c -fvisibility=hidden, which marks everything as hidden except for myfn().
3. Create a convenience archive using ld's partial-linking switch: ld -r foo.o bar.o baz.o myfn.o -o libmyfn.a
4. Localise everything that wasn't PUBLIC like so: objcopy --localize-hidden libmyfn.a
5. Now nm says myfn is the only global symbol in libmyfn.a and subsequent linking into other programs works just fine: gcc -o main main.c -L. -lmyfn (here, the program calls myfn(); if it tried to call foo() then compilation would fail).
If I use ar instead of ld -r in step 3 then compilation fails in step 5: I guess ar hasn't linked foo etc to myfn, and no longer can once those functions are localised, whereas ld -r resolves the link before it gets localised-away.
I'd welcome any response that confirms this is the "right" way, or describes a slicker way of achieving the same.
Unfortunately, C linkage for globals is all-or-nothing, in the sense that the globals of all modules would be available in libmyfn.a's final list of external symbols.
gcc tool chain offers an extension that lets you hide symbols from outside users, while making them available to other translation units in your library:
foo.h:
void foo();
foo.c:
void foo() __attribute__ ((visibility ("hidden")));
myfn.h:
void myfn();
myfn.c:
#include <stdio.h>
#include "foo.h"
void myfn() {
printf("calling foo...\n");
foo();
printf("calling foo again...\n");
foo();
}
For portability, you would probably benefit from making a macro for __attribute__ ((visibility ("hidden"))), and placing it in a conditional compilation block conditioned on gcc.
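Such a macro might look like the sketch below. The name HIDDEN and the body of foo are made up for illustration, and the functions return values only so the behaviour can be checked. On compilers that do not define __GNUC__ the macro expands to nothing, so the code still builds but the symbol stays visible.

```c
/* Hypothetical portability macro: expands to the GCC/Clang visibility
   attribute where it is available and to nothing elsewhere. */
#if defined(__GNUC__)
#  define HIDDEN __attribute__((visibility("hidden")))
#else
#  define HIDDEN
#endif

/* foo is callable from other translation units of the library,
   but it is not exported from a shared object built from them. */
HIDDEN int foo(void);

int foo(void) {
    return 21;
}

/* myfn keeps default visibility and remains the public entry point. */
int myfn(void) {
    return foo() + foo();
}
```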
In addition, Linux offers a utility called strip, which lets you remove some of the symbols from compiled object files. Options -K and -N let you name individual symbols that you want to keep or remove, respectively.
Start with this to build a static library
gcc -c -O2 foo.c bar.c baz.c myfn.c
ar rv libmyfunctions.a foo.o bar.o baz.o myfn.o
Compile and link with other programs like:
gcc -O2 program.c -lmyfunctions -o myprogram
Now your libmyfunctions.a will ultimately have extra stuff from the source that isn't required by the code in myfn.c. But the linker should do a reasonable job of removing this when it creates the final program.
Suppose myfn.c has function myfun() which you want to use in other three files foo.c, bar.c & baz.c
Now create a shared library from code in myfn.c viz libmyf.a
Call myfun() from the other three files, declaring the function as extern in each of them. Now you can create object code for these three files and link libmyf.a in at the linking phase.
Refer to following link for using shared libraries.
http://www.cprogramming.com/tutorial/shared-libraries-linux-gcc.html

How does a C static library work?

What code goes into the final executable when using a library?
As an example, we have two files:
/*main.c*/
int main (int argc, char* argv[]){
fc(1); /*This function is defined in fc.c*/
}
Another file:
/*fc.c*/
int fc(int x){
return fe(x);
}
int fe(int y){
return y + 1;
}
We compile fc.c:
gcc -c fc.c
We then get fc.o.
Now let's build a library named test:
ar rcs libtest.a fc.o
We now have libtest.a.
Now we compile main.c
gcc -c main.c
And we obtain main.o
Let's link our main.o to our libtest.a
gcc -L. main.o -ltest
We get the desired a.out
Checking its symbols:
nm a.out
In between all the symbols, we find:
080483cc T fc
080483df T fe
Seems good.
BUT!
What if our main.c changes to this?
/*main.c*/
int main (int argc, char* argv[]){
fe(1); /*This function is defined in fc.c*/
}
After compiling main.c and linking the new main.o to our library, I will still find a symbol for fc. But I don't need that code.
Questions
-Shouldn't the library "give me" only the code I need in main.c?
-Do the functions need to be in separate modules before being added to the library?
-What if I had 300 functions? Would I need to make 300 modules?
Yes, place each function in a separate module. That way the linker will link in only the items needed.
In short, there are compiler flags to prune unused functions from the final executable code, however they are not enabled by default.
GCC can do this "garbage collection" of unused functions if these flags are added:
-ffunction-sections as a compile-time flag. It instructs the compiler to create a separate section (see object file format) for each function. There's also -fdata-sections flag with similar meaning that works for variables.
-Wl,--gc-sections as a link-time flag. The -Wl part instructs GCC to pass the following options to the linker. --gc-sections means "garbage-collect sections in which all code is unused". Since, thanks to the compile-time options, each function has its own section, this effectively performs function-level pruning.
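One way to picture what -ffunction-sections does to fc.c is to request the sections by hand. This is a sketch that assumes an ELF target and GCC or Clang. The section names follow the .text.<function> pattern the flag uses there, but normally you would just pass the flag instead of writing attributes.

```c
/* Each function is placed in its own section, which is what
   -ffunction-sections would do automatically on an ELF target. */
__attribute__((section(".text.fe")))
int fe(int y) {
    return y + 1;
}

__attribute__((section(".text.fc")))
int fc(int x) {
    return fe(x);
}
```

With the functions split like this, a link using -Wl,--gc-sections can drop .text.fc when nothing references fc, even though both functions still come from the single object fc.o.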

when dlopen one so, it's symbol is not covered by main symbol, why?

libp2.c
#include <stdio.h>
void pixman()
{
printf("pixman in libp1\n");
}
libc2.c
#include <stdio.h>
void pixman();
void cairo()
{
printf("cairo2\n");
pixman();
}
main.c
#include <stdio.h>
#include <dlfcn.h>
void pixman()
{
printf("pixman in main\n");
}
int main()
{
pixman();
void* handle=NULL;
void (*callfun)();
handle=dlopen("/home/zpeng/test/so_test/libc2.so",RTLD_LAZY);
callfun = (void(*)())dlsym(handle, "cairo");
callfun();
...
}
compile
gcc -c libp2.c -fPIC -olibp2.o
rm libp2.a
ar -rs libp2.a libp2.o
gcc -shared -fPIC libc2.c ./libp2.a -o libc2.so
gcc main.c -ldl -L. -g
the result:
pixman in main
cairo2
pixman: libp2
why the last is not "pixman in main"?
I looked at the symbol processing (LD_DEBUG=symbols); it begins with:
21180: symbol=pixman; lookup in file=./a.out
21180: symbol=pixman; lookup in file=/lib64/libdl.so.2
21180: symbol=pixman; lookup in file=/lib64/tls/libc.so.6
21180: symbol=pixman; lookup in file=/lib64/ld-linux-x86-64.so.2
21180: symbol=pixman; lookup in file=/home/zpeng/test/so_test/libc2.so
If I add -lc2 or -rdynamic to the gcc command for main, the output becomes:
pixman in main
cairo2
pixman in main
My questions:
Why is the symbol looked up in a.out without being found there, so that the search continues to libc2.so, when neither -rdynamic nor -lc2 is used?
Why the last is not "pixman in main"?
That's because shared libraries have their own global offset table or GOT. When you use the cairo function in libc2.so, the pixman function that will be called is the same function that was resolved when compiling the .so file in the first place.
That is:
# creates object file only -- contains first pixman implementation
gcc -c libp2.c -fPIC -olibp2.o
# just turns the object file into an archive
ar -rs libp2.a libp2.o
# creates the .so file -- all symbols in libc2.c are resolved here
# and you passed in the .a file for that purpose. The .a file containing the
# first pixman implementation gets put in libc2.so.
gcc -shared -fPIC libc2.c ./libp2.a -o libc2.so
After this, anyone using libc2.so will get the copy stored in libc2.so. The lookup order you post is for a.out I believe and it's right. It looks for pixman in a.out, then libc2.so, and so on.
Why lookup symbol in a.out but not get the result and continue to search libc2.so when without -rdynamic and -lc2?
The -rdynamic option adds ALL symbols to the dynamic symbol table, not just the ones the linker thinks are used (-lc2 has the same effect). When all those symbols are exported you have a conflict: the pixman function. The main.c implementation is used in this case. As others have pointed out, this will probably generate a warning.
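Whether the executable's own pixman is exported can be checked from inside the program. The sketch below uses only dlopen(NULL, ...) and dlsym. The helper name visible_to_dynamic_linker is made up for this example, and on older glibc versions the program has to be linked with -ldl. What it reports for "pixman" depends on how the executable was linked, for example with or without -rdynamic.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Same role as pixman() in main.c from the question. */
void pixman(void) {
    printf("pixman in main\n");
}

/* Returns 1 if `name` can be found through the main program's dynamic
   symbol scope, 0 otherwise. Symbols that are only in .symtab and not
   in .dynsym are not found this way. */
int visible_to_dynamic_linker(const char *name) {
    void *self = dlopen(NULL, RTLD_LAZY);  /* handle for the main program */
    if (self == NULL) {
        return 0;
    }
    int found = dlsym(self, name) != NULL;
    dlclose(self);
    return found;
}
```

Calling visible_to_dynamic_linker("pixman") before the dlopen of libc2.so shows whether a library loaded later could bind to the copy in main at all.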
You need to compile the sources that get archived into the .a file with -fvisibility=hidden, to indicate that, although they are global functions, they are not meant to be used outside the resulting library but are instead meant to resolve symbols inside the library. That will cause the symbols in the .a file to appear with the qualifier " t " in nm -a instead of " T ", which is used for symbols available to other libraries.
It was simply bound automatically to a LOCAL symbol.
Since there is no explicit __attribute__((visibility("default"))) in libp2.c, the compiler automatically binds this function call to the LOCAL .symtab instead of .dynsym.
appendix1: more about ELF header: readelf -s xxx.lib
appendix2: keyword of ld argument -Bsymbolic-functions

Resources