Rename a function without changing its references - c

I have an object file compiled using gcc with the -ffunction-sections option. I have access to the source file but I am not allowed to modify it.
file.c
void foo(void)
{
bar();
}
void bar(void)
{
abc();
}
What I am trying to achieve is to make all references to bar resolve to an absolute address (which I'll assign in the linker script), whereas bar itself will be placed at some other address by the linker.
A possible solution is to rename bar to file_bar without changing the call to bar inside foo(). I tried using objcopy --redefine-syms, but it seems to rename the calls to bar as well.
Solution provided by busybee solves the problem unless the functions are in the same compilation unit.
foo1.c
#include <stdio.h>
extern void bar1();
void foo1(){
printf("foo1\n");
}
int main(){
printf("main\n");
foo1();
bar1();
}
bar1.c
#include <stdio.h>
void bar1(){
printf("bar1\n");
}
wrapper.c
#include <stdio.h>
void __wrap_foo1(){
printf("wrap_foo1\n");
}
void __wrap_bar1(){
printf("wrap_bar1\n");
}
Now,
$ gcc -c -ffunction-sections foo1.c bar1.c wrapper.c
$ gcc -Wl,--wrap=foo1 -Wl,--wrap=bar1 -o output foo1.o bar1.o wrapper.o
$ ./output
main
foo1
wrap_bar1

All functions to be redirected are in their own compilation unit
The linker has the option "--wrap", which resolves all undefined references to the symbol "xxx" to "__wrap_xxx" and all undefined references to "__real_xxx" to the original "xxx". It is used to put a wrapper function as an "interceptor" between the call and the function.
But with this option you can do whatever you like with those symbols in your linker script. You just need to define "__wrap_xxx" with a symbol so that the references are resolvable.
Depending on your needs you can also write a dummy function named "__wrap_xxx()" that does not even call "__real_xxx()". Or you can place "__real_xxx" in a vector table, or... whatever you can think of.
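As a rough sketch of the "interceptor" idea, the wrapper below forwards to the original function through "__real_bar1". The counter, the wrap_call_count() helper and the weak declaration are assumptions of this sketch and not part of the question; the weak declaration is only there so the file also links when --wrap is not given.

```c
#include <stdio.h>

/* Counts how often the wrapper ran; a hypothetical helper for this sketch. */
static int wrap_calls = 0;

/* With -Wl,--wrap=bar1 the linker resolves this name to the original bar1().
 * Declared weak (an addition of this sketch) so the file also links without
 * --wrap; in that case the address is NULL. */
extern void __real_bar1(void) __attribute__((weak));

/* With -Wl,--wrap=bar1, undefined references to bar1() end up here. */
void __wrap_bar1(void)
{
    wrap_calls++;
    printf("wrap_bar1: before\n");
    if (__real_bar1)
        __real_bar1();          /* forward to the original, if there is one */
    printf("wrap_bar1: after\n");
}

/* Lets a caller observe how many calls were intercepted. */
int wrap_call_count(void)
{
    return wrap_calls;
}
```

Linked as in the question (gcc -Wl,--wrap=bar1 ...), calls to bar1() from other object files should go through __wrap_bar1() first.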
All functions to be redirected are non-static ("global"), patching immediate values
I looked through the answers of the other question the OP posted in a comment. This gave me the idea to weaken the symbols in question and to override them with a value by the linker.
This example might give you some insight. I tested it on Linux, which has address space layout randomization, so all addresses are offsets from a random base. But for the OP's target system it should work as expected.
foo1.c
Because of arbitrary values for the redirected addresses the functions can't be called. But the program can print their addresses.
#include <stdio.h>
void foo1(void) {
}
extern void bar1(void);
int main(void) {
printf("%p\n", main);
printf("%p\n", foo1);
printf("%p\n", bar1);
return 0;
}
bar1.c
void bar1(void) {
}
wrapper.ld
This is the first alternative for giving the linker the addresses to be used: an additional linker script. For the second one see below. The standard linker script is augmented here; there is no need to copy and patch it. Because of its simple structure this is probably the simplest way to provide many redirected addresses, and it can easily be automated.
foo1 = 0x1000;
bar1 = 0x2000;
Note: This is not C! It is "linker script" syntax which happens to be quite similar.
How I built and tested
This command sequence can be automated and reordered to your liking. In particular, the calls of objcopy could be done by a loop over a list.
gcc -c -ffunction-sections foo1.c
objcopy --weaken-symbol=foo1 foo1.o foo2.o
gcc -c -ffunction-sections bar1.c
objcopy --weaken-symbol=bar1 bar1.o bar2.o
gcc foo1.o bar1.o -o original
echo original
./original
gcc foo2.o bar2.o -o weakened
echo weakened
./weakened
gcc foo2.o bar2.o wrapper.ld -o redirected
echo redirected
./redirected
Instead of an additional linker script the symbol definitions can be given on the command line, too. This is the mentioned second alternative.
gcc foo2.o bar2.o -Wl,--defsym=foo1=0x1000 -Wl,--defsym=bar1=0x2000 -o redirected
BTW, the linker understands @file to read all arguments from the file file. So there's "no limit" on the size of the linker command.
All functions to be redirected are non-static ("global"), overwriting with new functions
Instead of providing immediate values you can of course just provide your alternative functions. This works like above but instead of the additional linker script or symbol definitions you write a source file.
wrapper.c
Yes, that's right, the names are equal to the names of the originals! Because we made the symbols of the original functions weak, we'll get no error message from the linker when it overwrites the references with the addresses of the new functions.
void foo1(void) {
}
void bar1(void) {
}
Build the redirected program like this (only new commands shown):
gcc -c -ffunction-sections wrapper.c
gcc foo2.o bar2.o wrapper.o -o redirected
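For readers who can touch the source, the same weak/strong mechanism can be sketched directly in C with GCC's weak attribute. This is only an illustration of what objcopy --weaken-symbol achieves on an existing object file; the int return values and call_bar1() are assumptions added here so the effect is observable.

```c
/* Weak definition: used only if no strong bar1() is linked in.
 * objcopy --weaken-symbol=bar1 gives a symbol in an existing object
 * file the same weak binding. */
__attribute__((weak)) int bar1(void)
{
    return 1;   /* marker value standing for "the original" */
}

/* A replacement in another object file would simply be
 *     int bar1(void) { return 2; }
 * and the linker would pick it without a duplicate-symbol error. */

/* Caller that goes through the symbol, whichever definition wins. */
int call_bar1(void)
{
    return bar1();
}
```

Built on its own, call_bar1() reaches the weak definition; adding an object file with a strong bar1() to the link should switch it to the replacement.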
A function to be redirected is static
Well, depending on your target architecture it will probably not be possible. This is because of the relocation entry of the reference. It will be some kind of relative relocation, telling the linker to resolve by an offset into the section of the function instead of resolving by the symbol of the function.
I didn't investigate this further.

Related

Why I got "clang: error: cannot specify -o when generating multiple output files"?

I am a newbie in C. I have two simple source code files f1.c and f2.c.
f1.c looks like:
#include <stdio.h>
#include "f.h"
void f1(void) {
// some code ...
}
function f2() in f2.c relies on f1() in f1.c.
#include <stdio.h>
#include "f.h"
void f2(void) {
f1();
}
f1.c and f2.c share a same header f.h,
void f1(void);
void f2(void);
There is no main() yet; I just want to compile these two files into a .o file without linking (using the -c option),
gcc -c f1.c f2.c -o f2.o
then I got,
clang: error: cannot specify -o when generating multiple output files
but when I mentioned only f2.c, it works well,
gcc -c f2.c -o f2.o
So what's the problem? Thanks!
You should look into the compilation process for C. The first stage is compiling the .c source code into .o object files. The .c files do not need to see the other .c files; they are accepting as fact what you've told them about the existence of external functions. It's not until the linker comes in that it really needs to see the function because the implementation details don't matter to your .c file, just the interface, which you've presumably given it in the header.
What you can do, if you like, is drop the -o flag specifying the output file you want to create. Just compile with
gcc -c f1.c f2.c
and it will know to create f1.o and f2.o which will be able to link against each other when the time comes that you do want to go through with the linking process.
I am curious, however, what your intentions may be for wanting to compile these without linking. I only ask as you refer to yourself as a newbie, so I am wondering if maybe there is an end goal you have in mind and perhaps aren't asking the right question.

C symbol visibility in static archives

I have files foo.c bar.c and baz.c, plus wrapper code myfn.c defining a function myfn() that uses code and data from those other files.
I would like to create something like an object file or archive, myfn.o or libmyfn.a, so that myfn() can be made available to other projects without also exporting a load of symbols from {foo,bar,baz}.o as well.
What's the right way to do that in Linux/gcc? Thanks.
Update: I've found one way of doing it. I should've emphasised originally that this was about static archives, not DSOs. Anyway, the recipe:
1. #define PUBLIC __attribute__ ((visibility("default"))) then mark myfn() as PUBLIC in myfn.c. Don't mark anything else PUBLIC.
2. Compile objects with gcc -c foo.c bar.c baz.c myfn.c -fvisibility=hidden, which marks everything as hidden except for myfn().
3. Create a convenience archive using ld's partial-linking switch: ld -r foo.o bar.o baz.o myfn.o -o libmyfn.a
4. Localise everything that wasn't PUBLIC like so: objcopy --localize-hidden libmyfn.a
5. Now nm says myfn is the only global symbol in libmyfn.a and subsequent linking into other programs works just fine: gcc -o main main.c -L. -lmyfn (here, the program calls myfn(); if it tried to call foo() then compilation would fail).
If I use ar instead of ld -r in step 3 then compilation fails in step 5: I guess ar hasn't linked foo etc to myfn, and no longer can once those functions are localised, whereas ld -r resolves the link before it gets localised-away.
I'd welcome any response that confirms this is the "right" way, or describes a slicker way of achieving the same.
Unfortunately, C linkage for globals is all-or-nothing, in the sense that the globals of all modules would be available in libmyfn.a's final list of external symbols.
gcc tool chain offers an extension that lets you hide symbols from outside users, while making them available to other translation units in your library:
foo.h:
void foo();
foo.c:
void foo() __attribute__ ((visibility ("hidden")));
myfn.h:
void myfn();
myfn.c:
#include <stdio.h>
#include "foo.h"
void myfn() {
printf("calling foo...\n");
foo();
printf("calling foo again...\n");
foo();
}
For portability, you would probably benefit from making a macro for __attribute__ ((visibility ("hidden"))), and placing it in a conditional compilation block conditioned on gcc.
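A minimal sketch of such a macro, assuming the names HIDDEN and PUBLIC (PUBLIC is taken from the question, HIDDEN is made up here) and int return values so the result can be checked:

```c
/* Expand to the GCC/Clang attribute where available, to nothing elsewhere. */
#if defined(__GNUC__)
#  define HIDDEN __attribute__((visibility("hidden")))
#  define PUBLIC __attribute__((visibility("default")))
#else
#  define HIDDEN
#  define PUBLIC
#endif

/* Internal helper: usable by the library's other translation units. */
HIDDEN int foo(void);

/* The one function meant for users of the library. */
PUBLIC int myfn(void);

int foo(void)
{
    return 41;
}

int myfn(void)
{
    return foo() + 1;
}
```

On compilers that do not define __GNUC__ both macros expand to nothing, so the code still builds, just without any hiding.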
In addition, Linux offers a utility called strip, which lets you remove some of the symbols from compiled object files. Options -K and -N let you identify individual symbols that you want to keep or remove, respectively.
Start with this to build a static library
gcc -c -O2 foo.c bar.c baz.c myfn.c
ar rv libmyfunctions.a foo.o bar.o baz.o myfn.o
Compile and link with other programs like:
gcc -O2 program.c -L. -lmyfunctions -o myprogram
Now your libmyfunctions.a will ultimately have extra stuff from the source that isn't required by the code in myfn.c. But the linker should do a reasonable job of removing this when it creates the final program.
Suppose myfn.c has a function myfun() which you want to use in three other files, foo.c, bar.c and baz.c.
Now create a library from the code in myfn.c, e.g. libmyf.a (a .a file is a static archive; a shared library would be named libmyf.so).
Call myfun() in the other three files, declaring the function as extern in each of them. Now you can compile these three files to object code and link against libmyf.a at the link stage.
Refer to the following link for using shared libraries.
http://www.cprogramming.com/tutorial/shared-libraries-linux-gcc.html

Using a function from another C file placed in another directory?

Say I have a parent directory A with two subdirectories B and C.
Sub-directory C has a helper.c and helper.h as shown:
//helper.c
void print(){
printf("Hello, World!\n");
}
//helper.h
void print();
Now, in sub directory B, I have a main.c which just calls the print function:
//main.c
#include<stdio.h>
#include"../C/helper.h"
void main(){
print();
}
I tried the following commands for compiling main.c:
Command 1: gcc main.c //Gives undefined reference to 'print' error
Command 2: gcc main.c ../C/helper.c //Compiles successfully
Now I removed the #include"../C/helper.h" from main .c and tried the Command 2 again. It still works.
So I have the following questions:
i) What difference does it make whether the helper.h file is included or
helper.c?
ii) Why command 1 fails?
iii) Is there a way to compile my C program without having to specify
helper.c everytime?
What happens when you execute:
Command 1: gcc main.c //Gives undefined reference to 'print' error
When execute gcc main.c
The compiler compiles main.c and creates an object file. This file will contain an unresolved reference to the function print(), because there is no implementation of print() in main.c.
After compilation gcc tries to build the full executable. To do this gcc combines all object files and tries to resolve all unresolved references. As you remember, there is an unresolved reference to print(); gcc can't find an implementation and raises the error.
When you execute
Command 2: gcc main.c ../C/helper.c //Compiles successfully
gcc compiles both files. Second file ../C/helper.c contains implementation of function print(), so linker can find it and resolve reference to it in function main().
i) What difference does it make whether the helper.h file is included or helper.c?
In your case helper.h contains forward declaration of function print(). This gives information to compiler how to make call of function print().
ii) Why command 1 fails?
See above.
iii) Is there a way to compile my C program without having to specify helper.c everytime?
Use the make utility. Compile helper.c into a separate object file helper.o and use it in the link command.
testprog: main.o helper.o
	gcc main.o helper.o -o testprog
helper.o: ../C/helper.c ../C/helper.h
	gcc -c ../C/helper.c
main.o: main.c ../C/helper.h
	gcc -c main.c
See make utility manual for details.
Commands should be indented by TAB.
First you need to understand that #include simply adds whatever text is in the #include parameter to the position in the file the statement is in, for example:
//file1.h
void foo();
//main.c
#include "file1.h"
int main(int argc, char **argv)
{
foo();
return 0;
}
Will cause the pre-compilation to generate this unified file for compilation:
//main.c.tmp
void foo();
int main(int argc, char **argv)
{
foo();
return 0;
}
So to answer your first and second questions:
When you include a header file (or any file) that only contains declarations (i.e function signatures) without definitions (i.e function implementations), as in the example above, the linker will fail in finding the definitions and you will get the 'undefined reference' error.
When you include a c code file (or any file) that contains definitions, these definitions will be merged to your code and the linker will have them, that's why it works.
and as for your third question
It is bad practice to include c files directly in other c files, the common approach is to keep separate c files with headers exposing the functionality they provide, include the header files and link against the compiled c files, for example in your case:
gcc main.c ../C/helper.c -o out
This lets you include helper.h in main.c and still have everything work, because you instructed the compiler to compile both files instead of just main.c; when linking occurs, the definitions from helper.c are found and you do not get the undefined reference error.
That is it in a nutshell. I abstracted away a lot of what's going on to convey the general idea. This is a nice article describing the compilation process in fair detail and this is a nice overview of the entire process.
I'll try to answer:
i) What difference does it make whether the helper.h file is included or helper.c?
When you include a file, you don't want to expose your implementation, hence it's better to include .h files, which contain only the "signatures", i.e. the API of your implementation.
ii) Why command 1 fails?
When you build, every source or object file that provides a needed definition must be given to the linker; otherwise the program won't link.
iii) Is there a way to compile my C program without having to specify
helper.c everytime?
You can use Makefile to compile your program. Maybe this tutorial can help you.
i) What difference does it make whether the helper.h file is included
or helper.c?
Including helper.c means that helper.c gets compiled each time as if it were part of main.c
Including helper.h lets the compiler know what argument types the function print() takes and returns so the compiler can give an error or warning if you call print() incorrectly
ii) Why command 1 fails?
The compiler is not being told where to find the actual code for the print function. As explained, including the .h file only helps the compiler with type checking.
iii) Is there a way to compile my C program without having to specify
helper.c everytime?
You can compile it once into an object file and optionally you can add that obj to a static or dynamically loaded library. You still need to help the compiler find that obj or library. For example,
gcc -c helper.c
gcc main.c helper.o
The correct way to avoid compiling modules that don't need compiling is to use a Makefile. A Makefile compares when a module was last compiled compared to when it was last modified and that way it knows what needs to be compiled and what doesn't.
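To make the declaration/definition split discussed in these answers concrete, here is a single-file sketch in which comments mark what would live in helper.h, main.c and helper.c. The names print_message() and greet() are hypothetical, as is the int return type; they are chosen only so the pieces can be checked.

```c
#include <stdio.h>

/* --- what helper.h would contain: a prototype, no code --- */
int print_message(const char *text);

/* --- what main.c would contain --- */
int greet(void)
{
    /* Thanks to the prototype, print_message(42) here would draw a
     * diagnostic from the compiler instead of failing at run time. */
    return print_message("Hello, World!");
}

/* --- what helper.c would contain: the definition the linker needs --- */
int print_message(const char *text)
{
    return printf("%s\n", text);   /* number of characters written */
}
```

If the last function were removed, the file would still compile with -c, and only the link step would report the undefined reference.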

How does a C static library work?

What code goes into the final executable when using a library?
As an example, we have two files:
/*main.c*/
int main (int argc, char* argv[]){
fc(1); /*This function is defined in fc.c*/
}
Another file:
/*fc.c*/
int fc(int x){
return fe(x);
}
int fe(int y){
return y + 1;
}
We compile fc.c:
gcc -c fc.c
We then get fc.o.
Now lets build a library named test:
ar rcs libtest.a fc.o
We now have libtest.a.
Now we compile main.c
gcc -c main.c
And we obtain main.o
Let's link our main.o to our libtest.a
gcc -L. main.o -ltest
We get the desired a.out
Checking its symbols:
nm a.out
In between all the symbols, we find:
080483cc T fc
080483df T fe
Seems good.
BUT!
What if our main.c changes to this?
/*main.c*/
int main (int argc, char* argv[]){
fe(1); /*This function is defined in fc.c*/
}
After compiling main.c and linking the new main.o to our library, I will still find a symbol for fc. But I don't need that code.
Questions
-Shouldn't the library "give me" only the code I need in main.c?
-Do the functions need to be in separate modules before being added to the library?
-What if I had 300 functions? Would I need to make 300 modules?
Yes, place each function in a separate module. That way the linker will link in only the items needed.
In short, there are compiler flags to prune unused functions from the final executable code, however they are not enabled by default.
GCC can do this "garbage collection" of unused functions if these flags are added:
-ffunction-sections as a compile-time flag. It instructs the compiler to create a separate section (see object file format) for each function. There's also -fdata-sections flag with similar meaning that works for variables.
-Wl,--gc-sections as a link-time flag. The -Wl part instructs GCC to pass the following options to the linker. --gc-sections means "garbage collect sections from which all code is unused". Since, due to the compile-time options, each function has got a separate section, it effectively performs function-level pruning.
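To picture what -ffunction-sections does, the sketch below places the two functions from the question into their own sections by hand with GCC's section attribute. It assumes an ELF target such as Linux; in practice the flag does this for every function automatically, so the attributes are for illustration only.

```c
/* What -ffunction-sections arranges automatically, written out by hand:
 * each function lives in its own section named after it. */
__attribute__((section(".text.fe"))) int fe(int y)
{
    return y + 1;
}

__attribute__((section(".text.fc"))) int fc(int x)
{
    return fe(x);
}
```

When linking with -Wl,--gc-sections, a program that only references fe() lets the linker discard .text.fc as a whole, which is the function-level pruning described above.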

Can I re-compile a file with new code?

I have a question. I was wondering if you could re-compile code with another piece of code. For example (theoretical):
main.c:
#include <stdio.h>
void showme();
int main()
{
showme();
}
void showme()
{
fprintf(stderr, "errtest, show me");
}
Compile this file to main. (So the main is compiled)
After this I want to add a piece of code.
addthis.c:
void test()
{
test();
}
Now I want to use the (compiled) main and re-compile it with addthis.c.
When running it (./mainWithAddthis) should show the print 2 times.
I hope I explained it clear. Anybody an idea?
You need a forward declaration for your void test() like you have one for the void showme(). Compile each .c file with -c (compile only) option:
gcc -c addthis.c -o addthis.o
gcc -c main.c -o main.o
Then link the two object files with:
gcc main.o addthis.o -o main
Then enjoy ./main :-)
Your first code will not compile since there's no definition of test();.
As I understand, you want to take the compiled main and add it with the code generated on addthis.o to create a 2nd application named mainWithAddthis. This is not possible!
You are either confused or trying to do some hardcore trick.
Building an executable is a two step process.
For every source file you specify (in your project/makefile), your compiler will build an object file
For every object file you specify (in your project/makefile), your linker will link them together and make your executable
One way to re-compile would be simply to re-build your entire project. You'd get more or less the same result.
But it sounds like what you want to do is recompile only the source file, addthis.c, then re-link the old version of main.o (the object file compiled for main.c) with the new version of addthis.o. How to do this is completely dependent on the compiler and build system you use.
Also, that solution will only work if you have main.o, addthis.c, and have the exact same compiler binaries/install, and compiler flags used to generate main.o. If this is all on your box, then you're probably okay.
If you only have the files addthis.c and main.exe, then no there is no portable way to do what you want.
You can't do what you are talking about after the fact without some hardcore time with a hex editor.
However, if you plan ahead and build it into your software, you can use dynamic loading to achieve the same effect, which is how a lot of software provides plugin functionality. Check out glib modules for a common way to do this in C.
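A minimal sketch of the plugin approach, using the POSIX dlopen()/dlsym() calls rather than glib modules (a deliberate substitution to keep the example free of extra dependencies). The helper names lookup_symbol() and run_plugin() are hypothetical, and the plugin is assumed to export a function void f(void).

```c
#include <dlfcn.h>
#include <stdio.h>

typedef void (*plugin_fn)(void);

/* Look up `name` in the shared object `path` (NULL means the running
 * program and its already-loaded libraries). Returns NULL on failure.
 * The handle is intentionally left open so the pointer stays valid. */
void *lookup_symbol(const char *path, const char *name)
{
    void *handle = dlopen(path, RTLD_NOW);
    if (handle == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return NULL;
    }
    void *sym = dlsym(handle, name);
    if (sym == NULL)
        fprintf(stderr, "dlsym: %s\n", dlerror());
    return sym;
}

/* Call void f(void) from a plugin; returns 0 on success, -1 otherwise. */
int run_plugin(const char *path)
{
    plugin_fn f = (plugin_fn)lookup_symbol(path, "f");
    if (f == NULL)
        return -1;
    f();
    return 0;
}
```

A plugin could be built with something like gcc -shared -fPIC addon1.c -o addon1.so and run with run_plugin("./addon1.so"); on older glibc versions the main program may additionally need -ldl at link time.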
main.c
void f();
int main()
{
f();
return 0;
}
addon1.c
#include <stdio.h>
void f()
{
printf("I am the ONE.\n");
}
addon2.c
#include <stdio.h>
void f()
{
printf("I am the TWO.\n");
}
Compilation
gcc -c main.c -o main.o
gcc -c addon1.c -o addon1.o
gcc -c addon2.c -o addon2.o
gcc main.o addon1.o -o main1
gcc main.o addon2.o -o main2
You will have ./main1 and ./main2 programs which will print ...ONE. and ...TWO..

Resources