Conditionally replacing a C function at runtime - c

Is it possible to conditionally replace a function at runtime in C (in particular, a function in a dynamically loaded library)?
I know that you can use LD_PRELOAD or just make a function of the same name, such as:
// Silly example intercepting exit
typedef void (*exit_func)(int code);
void exit(int code)
{
    exit_func orig_exit = (exit_func)dlsym(RTLD_NEXT, "exit");
    NSLog(@"EXIT CALLED WITH CODE %d!!!!", code);
    orig_exit(code);
}
However, is it possible to CONDITIONALLY replace a function, at runtime, after the program has loaded and is running?
if(some_condition)
{
    swap_implementations(exit, my_exit);
}
Edit: This is somewhat similar to Is it possible to swap C functions? but specifically, I am trying to intercept a call to a function from a different library that was loaded by the operating system.
What this means is that, for example, were I to intercept the exit() function from stdlib, ANY call to exit() from ANYWHERE would call my implementation instead of the original, much like my example above, except controllable at runtime.
There have been suggestions of hooking the call by overwriting the original with a jump instruction, but I was hoping for something that doesn't require stomping on executable memory, like perhaps there was something I could call in the dynamic linker to "re-link" the function after the program starts and point it somewhere else?

Use a function pointer for this purpose.
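A minimal sketch of the function-pointer approach follows. The names handler_fn, default_handler, logging_handler, select_handler and call_handler are hypothetical and exist only for illustration. Note the limitation relative to the question: only callers that go through the pointer are redirected, so code in other libraries that calls the original symbol directly is not affected.

```c
#include <stdio.h>

/* Hypothetical handler type and implementations, for illustration only. */
typedef int (*handler_fn)(int code);

static int default_handler(int code)
{
    return code;
}

static int logging_handler(int code)
{
    fprintf(stderr, "handler called with code %d\n", code);
    return code + 1000;   /* distinct result so the swap is observable */
}

/* All callers go through this pointer instead of calling a function by name. */
static handler_fn current_handler = default_handler;

/* Swap the implementation at runtime depending on the condition. */
void select_handler(int some_condition)
{
    current_handler = some_condition ? logging_handler : default_handler;
}

/* The single entry point that the rest of the program calls. */
int call_handler(int code)
{
    return current_handler(code);
}
```

Calling select_handler(1) at any point while the program runs switches every later call_handler() call to the logging version, and select_handler(0) switches back.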

Related

Best way to abstract away an init function?

I am making a low-level library that requires initialization to work properly, which I implemented with an init function. I am wondering if there is a way to have init called automatically the first time the user calls a library function, ideally with:
No overhead
No repeated calls
No exposed global variables (my current solution has one, which I don't quite like)
My current solution, as per comment request:
#include <stdbool.h>
bool isinit = false;
void init()
{
    isinit = true;
    // init code
}
void lib_function()
{
    if(!isinit) init();
    // function code
}
The compiler seems to be smart enough (using -Ofast on gcc) to not make that comparison each time lib_function is called, but this still exposes a global variable, which I don't like.
Surely your library has some state. Typically, a library exposes functions that work on a specific structure. Do not use global variables, and do not write spaghetti code. Expose the structure that holds the state of your library, and make all functions of your library take a pointer to that structure as an argument. Use a namespace: prepend a prefix to all exported symbols. The init function then looks like int lib_init(struct lib_the_struct *t); and it is self-evident that users need to initialize the structure with that function before use. For example: fopen(), pthread_create().
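A minimal sketch of this state-structure design, assuming a made-up library: lib_init and struct lib_the_struct come from the answer above, while the fields and lib_increment are hypothetical.

```c
#include <stddef.h>

/* Hypothetical library state; the "lib_" prefix acts as the namespace. */
struct lib_the_struct {
    int initialized;
    int counter;
};

/* Initialise caller-owned state. Returns 0 on success, -1 on a NULL pointer. */
int lib_init(struct lib_the_struct *t)
{
    if (t == NULL)
        return -1;
    t->initialized = 1;
    t->counter = 0;
    return 0;
}

/* Every other function takes the state explicitly, so no global is needed. */
int lib_increment(struct lib_the_struct *t)
{
    if (t == NULL || !t->initialized)
        return -1;
    return ++t->counter;
}
```

Because the caller owns the structure, several independent instances can coexist, and forgetting lib_init shows up at the call site rather than in hidden global state.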
Write an init function in your library. Write clear documentation stating that the user of your library has to call the function once before calling any other function. For example: https://curl.se/libcurl/c/curl_global_init.html .
If you're happy with a solution that is a common extension rather than part of the C standard, you can mark your init function with the constructor attribute, which ensures it will be called automatically during program initialization (or during shared library load if you eventually end up using that).
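As a rough sketch of the constructor attribute (a GCC/Clang extension, not standard C), the names lib_ready, lib_auto_init and lib_is_ready below are hypothetical:

```c
/* Hypothetical library-private state; static keeps it out of the global namespace. */
static int lib_ready = 0;

/* GCC/Clang extension: runs before main(), or when the shared library is loaded. */
__attribute__((constructor))
static void lib_auto_init(void)
{
    lib_ready = 1;
    /* init code */
}

/* Library functions can assume initialisation has already happened. */
int lib_is_ready(void)
{
    return lib_ready;
}
```

With this, the library functions need no per-call check, at the cost of depending on a compiler extension.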
I would handle this with assert, so that the check disappears in release builds, and if you forget to call the init function somewhere you get an error during development.
Also make isinit static, so that every library can have its own variable with the same name.
#include <assert.h>
#ifndef NDEBUG
static int isinit = 0;
#endif
void lib_function()
{
    assert(isinit && "library: init not called");
}
There will be overhead if you run if(!isinit) init(); each time you call a function: at least one extra branch.
As for removing global variables, do as in your example but with static bool isinit = 0;. This reduces the scope of the variable to the local translation unit (the .c file and all .h files it includes), so it is no longer "global". Note that this isn't ideal in multi-threaded scenarios: you would then have to protect the variable with a mutex.
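A small sketch of the static variant, assuming a single-threaded program. The init_calls counter is hypothetical and is only there so the behaviour can be observed. As noted above, this version is not thread-safe.

```c
#include <stdbool.h>

/* File-scope static: visible only inside this translation unit. */
static bool isinit = false;
static int init_calls = 0;   /* hypothetical counter, only to observe behaviour */

static void init(void)
{
    isinit = true;
    init_calls++;
    /* init code */
}

/* Not thread-safe: two concurrent first calls could both run init(). */
int lib_function(void)
{
    if (!isinit)
        init();
    /* function code */
    return init_calls;
}
```

However often lib_function() is called, init() runs once, and neither isinit nor init() is visible to users of the library.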
Overall though, what you are trying to do isn't a good idea. It is a very common convention for C libraries to have an init function, and the user of the library is expected to call it before calling anything else; if they don't, they are to blame, not your library. Naturally you have to make this clear to them in the source code documentation. It is common to list the prerequisites in comments next to every function declaration in the library's header file.

Can a function know what's calling it?

Can a function tell what's calling it, through the use of memory addresses maybe? For example, function foo(); gets data on whether it is being called in main(); rather than some other function?
If so, is it possible to change the content of foo(); based on what is calling it?
Example:
int foo()
{
    if (being called from main())
        printf("Hello\n");
    if (being called from some other function)
        printf("Goodbye\n");
}
This question might be kind of out there, but is there some sort of C trickery that can make this possible?
For highly optimized C it doesn't really make sense. The harder the compiler tries to optimize the less the final executable resembles the source code (especially for link-time code generation where the old "separate compilation units" problem no longer prevents lots of optimizations). At least in theory (but often in practice for some compilers) functions that existed in the source code may not exist in the final executable (e.g. may have been inlined into their caller); functions that didn't exist in the source code may be generated (e.g. compiler detects common sequences in many functions and "out-lines" them into a new function to avoid code duplication); and functions may be replaced by data (e.g. an "int abcd(uint8_t a, uint8_t b)" replaced by a abcd_table[a][b] lookup table).
For strict C (no extensions or hacks), no. It simply can't support anything like this because it can't expect that (for any compiler including future compilers that don't exist yet) the final output/executable resembles the source code.
An implementation defined extension, or even just a hack involving inline assembly, may be "technically possible" (especially if the compiler doesn't optimize the code well). The most likely approach would be to (ab)use debugging information to determine the caller from "what the function should return to when it returns".
A better way for a compiler to support a hypothetical extension like this may be for the compiler to use some of the optimizations I mentioned - specifically, split the original foo() into 2 separate versions where one version is only ever called from main() and the other version is used for other callers. This has the bonus of letting the compiler optimize out the branches too - it could become like int foo_when_called_from_main() { printf("Hello\n"); }, which could be inlined directly into the caller, so that neither version of foo exists in the final executable. Of course if foo() had other code that's used by all callers then that common code could be lifted out into a new function rather than duplicating it (e.g. so it might become like int foo_when_called_from_main() { printf("Hello\n"); foo_common_code(); }).
There probably isn't any hypothetical compiler that works like that, but there's no real reason you can't do these same optimizations yourself (and have it work on all compilers).
Note: Yes, this was just a crafty way of suggesting that you can/should refactor the code so that it doesn't need to know which function is calling it.
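The refactoring described above could look roughly like this. foo_when_called_from_main and foo_common_code are the names used in the answer. foo_when_called_from_elsewhere and the return value 42 are made up for illustration.

```c
#include <stdio.h>

/* Hypothetical shared part that every caller needs. */
static int foo_common_code(void)
{
    return 42;
}

/* Version intended for main(): caller-specific behaviour plus the common code. */
int foo_when_called_from_main(void)
{
    printf("Hello\n");
    return foo_common_code();
}

/* Version for every other caller. */
int foo_when_called_from_elsewhere(void)
{
    printf("Goodbye\n");
    return foo_common_code();
}
```

Each call site now picks the version it needs, so nothing has to be detected at run time.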
Knowing who called a specific function is essentially what a stack trace visualizes. There is no general standard way of extracting that, though. In theory one could write code that targets each system type the software would run on, and implement a stack trace function for each of them. In that case you could examine the stack and see what comes before the current function.
But with all that said and done, the question you should probably ask is why? A function that behaves in a specific way when called from a specific function is not well-isolated logic. Instead, you could consider passing a parameter to the function that selects the change in logic. That would also make the result more testable and reliable.
How to actually extract a stack trace has already received many answers here: How can one grab a stack trace in C?
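As one platform-specific illustration, glibc provides backtrace() and backtrace_symbols_fd() in <execinfo.h>. This is an extension rather than standard C, and the sketch assumes a glibc-based system. The helper names capture_callers and print_callers are hypothetical.

```c
#include <execinfo.h>   /* backtrace(): glibc extension, not standard C */

#define MAX_FRAMES 16

/* Capture return addresses of the active call chain; returns the frame count. */
__attribute__((noinline))
int capture_callers(void **frames, int max)
{
    return backtrace(frames, max);
}

/* Print the captured frames; names appear only if symbols are available. */
void print_callers(void)
{
    void *frames[MAX_FRAMES];
    int n = capture_callers(frames, MAX_FRAMES);
    backtrace_symbols_fd(frames, n, 2);   /* 2 = stderr */
}
```

The output depends on optimisation level, inlining and whether symbols were exported (for example by linking with -rdynamic), so it suits diagnostics better than program logic.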
I don't think an if statement in C can have a condition like the one you describe.
If you want different output depending on whether the function is called from main(), you have to put the printf statement in main() and in the other function instead.
I don't really know what you are trying to achieve, but as far as I understand it, each function could pass an additional argument that uniquely identifies the caller, in the form of a character array, an integer or an enumeration.
For example:
enum caller { CALLER_MAIN, CALLER_ADD, CALLER_SUB, CALLER_DIV, CALLER_MUL };
and call functions like:
add(3, 5, CALLER_MAIN); // adds 3 and 5; called from main
You will have to update the code in the usual way whenever you add more functions, but it's an easier way to do it.
No. The C language does not support obtaining the name or other information of who called a function.
As all other answers show, this can only be obtained using external tools, for example that use stack traces and compiler/linker emitted symbol tables.

__attribute__((constructor)) how it change entry point?

I know that from the point of view of the C programming language, main() is the entry point of the program.
But from the point of view of the operating system, the entry point is __start in the crt0 startup routines, which are linked into the program and perform any initialization work required before calling the program's main() function (correct me if I'm wrong here).
There are attributes we can apply to our functions. One of them is the constructor attribute, which causes the function to be called before main(). Who is responsible for calling this function?
__attribute__((constructor))
void foo(void)
{
    puts("Constructor called by ... ?\n");
}
And what would the step-by-step call stack look like? Thanks!
Functions marked as "constructor" are placed in a special section in the executable file. The "start" function will then invoke those functions one by one, before it calls main.
The same with "destructor" functions, they are again placed in a special section and called by the code executing after exit is called or main returns.
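A small sketch that makes the ordering observable, assuming GCC or Clang. The event log and the names record, before_main, after_main and first_event are hypothetical.

```c
#include <stdio.h>

/* Hypothetical event log so the call order can be inspected afterwards. */
static const char *events[4];
static int event_count = 0;

static void record(const char *what)
{
    if (event_count < 4)
        events[event_count++] = what;
}

/* Called by the startup code before main() (GCC/Clang extension). */
__attribute__((constructor))
static void before_main(void)
{
    record("constructor");
}

/* Called during exit processing, after main() returns or exit() is called. */
__attribute__((destructor))
static void after_main(void)
{
    fputs("destructor ran\n", stderr);
}

/* Returns the first recorded event, or NULL if nothing has run yet. */
const char *first_event(void)
{
    return event_count > 0 ? events[0] : NULL;
}
```

By the time main() starts, first_event() already reports the constructor, and the destructor message is written only while the process is exiting.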

Force function to return

For an analysis I'm doing, I want to be able to "catch" specific malloc calls. I therefore created a wrapper function around malloc, named malloc_wrap:
void *malloc_wrap(size_t size) {
    return malloc(size);
}
All that is left is to slightly modify the source code by replacing some malloc calls with malloc_wrap. I then use Intel Pin to capture what I need.
Unfortunately, it didn't work. I didn't see malloc_wrap being called in the assembly code, so it was probably inlined. After a quick search, I added this to the function header:
__attribute__ ((noinline))
Great, now I'm able to spot the function entry, but not the exit. I can't see any ret instruction at the end of the function. How can I force the compiler to compile my wrapper function regularly?
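The missing ret is consistent with tail-call (sibling-call) optimisation, where the compiler ends the wrapper with a jump to malloc instead of a call followed by ret. That is an assumption about this case, not something the question confirms. A minimal sketch of one workaround, assuming GCC or Clang, is to route the result through a volatile local so that work remains after the call:

```c
#include <stdlib.h>

/* noinline keeps the wrapper as a separate function in the binary. */
__attribute__((noinline))
void *malloc_wrap(size_t size)
{
    /* The volatile store and load must happen after malloc returns,
       so the call cannot be turned into a plain jump to malloc. */
    void *volatile p = malloc(size);
    return p;
}
```

With GCC, compiling the file with -fno-optimize-sibling-calls should have a similar effect without changing the source.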

change the C function contents at run-time

I have a use case in which the function contents are to be selected conditionally, and the function pointer is then passed to another module for later invocation.
void encode_function(int flag)
{
    if (flag == 1)
        encode1();
    if (flag == 2)
        encode2();
    encode_defaults();
}
Once encode_function() is populated, I will pass it to another module, where it will be invoked.
I am trying to achieve this in C, but have had no success so far. I looked at the dyncall library, but it only supports dynamic parameter changes.
I am looking for something that allows me to change the function contents at run-time.
A related existing question: Is there a way to modify the code of a function in a Linux C program at runtime?
You are basically looking for the ability to treat code as data, which is a feature rarely available in compiled, imperative, statically typed languages like C. The capability you describe generally falls under higher-order functions (you're basically trying to return a function as data somewhere in the code).
If the problem can be solved with several static functions, you can pass function pointers to the different implementations; otherwise, I don't think C has the ability to really treat code as data and modify it on the fly.
This question attempts to implement higher-order functions in C, but going that far might not be what you want.
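A sketch of the "several static functions" approach, under the assumption that every combination needed is known at compile time. encode1, encode2 and encode_defaults are named in the question, but their bodies and return values here are invented so the result can be checked. encode_fn, select_encoder and the variant functions are hypothetical.

```c
/* Hypothetical encoder signature; the int result only makes the choice observable. */
typedef int (*encode_fn)(void);

static int encode1(void)         { return 1; }
static int encode2(void)         { return 2; }
static int encode_defaults(void) { return 100; }

/* One static function per combination the flag can select. */
static int encode_variant1(void)     { return encode1() + encode_defaults(); }
static int encode_variant2(void)     { return encode2() + encode_defaults(); }
static int encode_default_only(void) { return encode_defaults(); }

/* Pick the implementation once and hand the pointer to the other module. */
encode_fn select_encoder(int flag)
{
    switch (flag) {
    case 1:  return encode_variant1;
    case 2:  return encode_variant2;
    default: return encode_default_only;
    }
}
```

The other module stores the returned pointer and calls it later without knowing which variant was chosen.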

Resources