C - Architecture independent function call interposition - c

I'm writing a piece of code where I have a function pointer that gets invoked. What I'd like to do is interpose on this function call to do something, and then invoke another function with the same arguments. I wonder if there is some way to do this without having to write assembly for each architecture I'm targeting. Perhaps there are some GCC tricks?
As an example I call my function pointer and it invokes
foo (/*arguments*/) {
do_something...
bar(/*same arguments*/);
}
In assembly this is fairly easy. At least in x86 I just make sure that my stack pointer is reset to the beginning of my stack frame and jump to function bar (not call).
EDIT: Perhaps the example isn't clear. The user expects to be calling function bar but instead I have redirected it to function foo (I don't know what arguments bar takes). I want to do something in foo before calling bar with the same arguments that were passed on. In this way, whatever I'm doing in foo is transparent to the user who thinks they just called bar.

Have a look into gcc option -finstrument-functions.

Sounds like what the GCC-specific __builtin_apply_args is for. It's an intrinsic that captures the arguments passed in, and you can call another function with those arguments using __builtin_apply.

libffi does all (or at least most) of what you need for source-level interposing.
Another option is to use dynamic binary instrumentation tools like DynamoRIO or Pin.

You could try creating a global function pointer variable that is used as a look-up for pre-binding the two function calls to one another. For instance,
typedef void (*bar_type)(int arg1, int arg2);
bar_type function_ptr; //a global function pointer used for binding
//create a bar_type function that is our "actual" function call
void __bar(int arg1, int arg2)
{
//do something else
}
//create a bar_type function called "foo" that is bound to calling whatever
//function is being pointed to by function_ptr
void foo(int arg1, int arg2)
{
//do something
function_ptr(arg1, arg2); //"foo" now calls "__bar"
}
bar_type transform_func(bar_type func_call, bar_type int_call)
{
function_ptr = func_call; //set the global function ptr variable
return int_call;
}
//create your function pointer bar that will call "foo" before calling "__bar"
//(in C, do this inside a function: a file-scope initializer must be a constant expression)
bar_type bar = transform_func(__bar, foo);
//later on in your code
bar(3, 4); //this will call foo() which will then call __bar() internally
With this approach you could also hide the rebinding from the user by defining bar as a macro that looks like
#define bar(arg1, arg2) (*(transform_func(__bar, foo)))(arg1, arg2)
Hopefully this isn't too kludgy ... there is definitely a performance hit from what could be done with assembly, but using the global function pointer would be a way to re-bind a function call.

Can you use a function-like macro to implement "do something" in the context of the calling function, and then do a normal call to the function pointer? (Yes, standard disclaimers about macros, esp. function-like macros apply...)
For example something like:
#define CALL_FUNC(fp,ARG1,ARG2) do {<do something>;fp(ARG1,ARG2);} while (0)
And then in the application, replace where you de-reference the function pointer with the macro.
It's not clear to me from the original question if foo or bar is the function called through the function pointer, so you might need to adjust the macro, but the general approach stays the same.
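As a rough sketch of the macro approach, assuming a two-argument function pointer: the names hook_count, last_sum and run_demo are made up for illustration, and bumping a counter stands in for the "do something" step.

```c
/* Hypothetical stand-in for "<do something>": count intercepted calls. */
static int hook_count;

/* Run the hook in the caller's context, then call through the pointer.
   The arguments are parenthesised so that expressions expand safely. */
#define CALL_FUNC(fp, ARG1, ARG2)  \
    do {                           \
        hook_count++;              \
        (fp)((ARG1), (ARG2));      \
    } while (0)

static int last_sum;

/* The function the user believes is being called directly. */
static void bar(int a, int b)
{
    last_sum = a + b;
}

/* Example call site: goes through the macro and returns bar's result. */
int run_demo(void)
{
    void (*fp)(int, int) = bar;
    CALL_FUNC(fp, 3, 4);
    return last_sum;
}
```

This only works when you control every call site and know the argument list, which is the main limitation compared with interposing on the callee side.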

Related

C: Declare and execute function in one line

I can do the following in GCC:
#define INIT_MODULE(name) \
({ extern int name(void); name(); })
int main(void) {
return INIT_MODULE(x);
}
Here, the function (expanded from name) is declared, executed, and its value returned via a statement expression (GCC extension). This is a minimal repro: I am actually doing some __asm__ magic to make the name function, hence the macro.
I would like to have this be a one-liner, and not call another macro to create the name function. In my use case, the caller will only call INIT_MODULE once, and does not/should not know the name of the underlying function it is calling.
Basically, I need a way to declare, run, and return the value of a function, all in one line (without using GCC extensions!).
What I DON'T want:
// ...
DECL_MODULE(x);
int main(void) {
return INIT_MODULE(x);
}
Any thoughts?
I think there's more to what you want than to simply declare, run and return the function - otherwise you could do it right away in the same manner you already achieved.
The problem that I see is that you want the function to be visible externally, as if it was declared outside of the scope of the caller. And that's clearly not possible, since #define just replaces the content of the macro (it can't move it in another point of the code).
Actually, you could use low-level goto and some arithmetic, but I wouldn't recommend that path.

Can I push/pop data to the GCC C return stack?

In GCC C, is there a way to push/pop data to the C return stack?
I'm not talking about implementing my own stack (I know how to do that); I mean using the existing C return stack to explicitly push/pop parameters (within the same level of braces, of course).
For example, something like:
extern int bar;
void foo(void) {
PUSH(bar);
bar = 12;
doSomething(); // that depends on the value of bar
bar = POP(); // restore original value of bar
}
If there were any easy way to do this, I think it would be a cleaner alternative to using a local variable like "oldBar" explicitly.
If you use a temporary variable, it's basically the same thing. The temporary variable is allocated on the stack or optimized into a register.
e.g.
extern int bar;
void foo(void) {
int tmp = bar; // save original value of bar
bar = 12;
doSomething(); // that depends on the value of bar
bar = tmp; // restore original value of bar
}
Apparently C doesn't actually require a stack structure to be used for calls, so this functionality wouldn't make sense. This is claimed in the memory layout section of this article https://www.seebs.net/c/c_tcn4e.html
Quite simply, not every compiler even has a "stack". Some systems don't really have any such feature. Every compiler for C has some kind of mechanism for handling function calls, but that doesn't mean it's a stack. More significantly, it is quite common for function parameters or local variables not to be stored on any "stack", but to be stored in CPU registers. That distinction can matter a lot, and should have been covered, rather than hand-waved away.
Technically, you could also use alloca() (declared in alloca.h) to do this, but the only way to deallocate that memory is for the function to return. It also doesn't really do what you're suggesting, and alloca isn't part of the C standard either.
If you are writing in pure C, you can use a variadic function. It is, of course, not a full solution, as the stack is cleaned up after a function call and an argument is pushed only when you call a function.
But if you need to use it as you described, it might work:
extern int bar;
void doSomething(...);
void foo(void) {
doSomething(&bar);
//you do not need here to pop to restore the bar value, as you push a copy of value
}
If you want to do some other actions after push (which I can't even imagine), you can use a wrapper function.
extern int bar;
void doSomething();
void doSomethingWrapper(...) {
//here arguments that are passed in ... are pushed
doSomething();
//and here they are popped
}
void foo(void) {
//....
doSomethingWrapper(&bar);
//....
}
If you just want to manipulate stack in a cunning way the only solution is to use inline assembly. Or you can call function void push(...) to push whatever you want and use longjmp() in body of push() to avoid popping arguments when control flow leaves the function.
OK, this does what I wanted:
#define PUSHINT(var) int old__##var = var
#define POPINT(var) var = old__##var
extern int bar;
void foo(void) {
PUSHINT(bar);
bar = 12;
doSomething(); // that depends on the value of bar
POPINT(bar); // restore original value of bar
}
It would be better if I could guarantee that the "old__" prefix was unique each time, but I don't know how to do that (it's unlikely to be a problem tho).
Also it doesn't truly do a PUSH/POP (altho it does store on the return stack, at least on my hardware). You can't push one value and pop a different one.
But I didn't want to do that in the first place.
I may rename the macros to SAVE() and RESTORE()...
(Thanks to the hint from @BobbySacamano's answer.)
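One possible way around the unique-name worry, sketched here assuming C99 is available: keep the saved copy in the scope of a for statement, so every use gets its own variable and nested uses simply shadow each other. WITH_SAVED_INT, saved_, once_ and seen_by_callee are names invented for this sketch, and bar is defined here only so the example is complete.

```c
/* Save an int on entry to the block and restore it when the block
   finishes. The loop body runs exactly once. */
#define WITH_SAVED_INT(var) \
    for (int saved_ = (var), once_ = 1; once_; (var) = saved_, once_ = 0)

int bar = 5;
static int seen_by_callee;

/* Depends on the current value of bar. */
static void doSomething(void)
{
    seen_by_callee = bar;
}

/* Returns the value of bar after the block, i.e. the restored value. */
int foo(void)
{
    WITH_SAVED_INT(bar) {
        bar = 12;
        doSomething();
    }
    return bar;
}
```

The caveat is that leaving the block early with break, return or goto skips the restore, so this is still a save/restore convenience rather than a real push/pop.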

Determining to which function a pointer is pointing in C?

I have a pointer to function, assume any signature.
And I have 5 different functions with same signature.
At run time one of them gets assigned to the pointer, and that function is called.
Without inserting any print statement in those functions, how can I come to know the name of function which the pointer currently points to?
You will have to check which of your 5 functions your pointer points to:
if (func_ptr == my_function1) {
puts("func_ptr points to my_function1");
} else if (func_ptr == my_function2) {
puts("func_ptr points to my_function2");
} else if (func_ptr == my_function3) {
puts("func_ptr points to my_function3");
} ...
If this is a common pattern you need, then use a table of structs instead of a function pointer:
typedef void (*my_func)(int);
struct Function {
my_func func;
const char *func_name;
};
#define FUNC_ENTRY(function) {function, #function}
const struct Function func_table[] = {
FUNC_ENTRY(function1),
FUNC_ENTRY(function2),
FUNC_ENTRY(function3),
FUNC_ENTRY(function4),
FUNC_ENTRY(function5)
};
const struct Function *func = &func_table[3]; //instead of func_ptr = function4;
printf("Calling function %s\n", func->func_name);
func->func(44); //instead of func_ptr(44);
Generally, in C such things are not available to the programmer.
There might be system-specific ways of getting there by using debug symbols etc., but you probably don't want to depend on the presence of these for the program to function normally.
But, you can of course compare the value of the pointer to another value, e.g.
if (ptr_to_function == some_function)
printf("Function pointer now points to some_function!\n");
The function names will not be available at runtime.
C is not a reflective language.
Either maintain a table of function pointers keyed by their name, or supply a mode of calling each function that returns the name.
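A minimal sketch of such a table, assuming all the functions share one signature. The names named_func, name_of, func_named and last_called are hypothetical, and the three functions are placeholders.

```c
#include <stddef.h>
#include <string.h>

typedef void (*my_func)(int);

struct named_func {
    my_func func;
    const char *name;
};

static int last_called;

/* Placeholder functions with distinct bodies. */
static void function1(int x) { (void)x; last_called = 1; }
static void function2(int x) { (void)x; last_called = 2; }
static void function3(int x) { (void)x; last_called = 3; }

/* #f turns the function's identifier into a string literal. */
#define FUNC_ENTRY(f) { f, #f }

static const struct named_func func_table[] = {
    FUNC_ENTRY(function1),
    FUNC_ENTRY(function2),
    FUNC_ENTRY(function3)
};

#define FUNC_COUNT (sizeof func_table / sizeof func_table[0])

/* Pointer -> name; NULL if the pointer is not in the table. */
const char *name_of(my_func fp)
{
    for (size_t i = 0; i < FUNC_COUNT; i++)
        if (func_table[i].func == fp)
            return func_table[i].name;
    return NULL;
}

/* Name -> pointer; NULL if no function has that name. */
my_func func_named(const char *name)
{
    for (size_t i = 0; i < FUNC_COUNT; i++)
        if (strcmp(func_table[i].name, name) == 0)
            return func_table[i].func;
    return NULL;
}
```

A linear scan is fine for a handful of entries; the point is only that the program, not the language, keeps the mapping.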
The debugger could tell you that (i.e. the name of a function, given its address).
The symbol table of an unstripped ELF executable could also help. See nm(1), objdump(1), readelf(1)
Another Linux GNU/libc specific approach could be to use at runtime the dladdr(3) function. Assuming your program is nicely and dynamically linked (e.g. with -rdynamic), it can find the symbol name and the shared object path given some address (of a globally named function).
Of course, if you have only five functions of a given signature, you could compare your address (to the five addresses of them).
Notice that some functions don't have any (globally visible) names, e.g. static functions.
And some functions could be dlopen-ed and dlsym-ed (e.g. inside plugins). Or their code could be synthesized at runtime by some JIT-ing framework (libjit, gccjit, LLVM, asmjit). And the optimizing compiler can (and does!) inline functions, clone them, tail-call them, etc., so your question might not make any sense in general.
See also backtrace(3) & Ian Taylor's libbacktrace inside GCC.
But in general, your quest is impossible. If you really need such reflective information in a reliable way, manage it yourself (look into Pitrat's CAIA system as an example, or somehow my MELT system), perhaps by generating some code during the build.
To know where a function pointer points is something you'll have to keep track of with your program. Most common is to declare an array of function pointers and use an int variable as index of this array.
That being said, since C99 a function can also report its own name at run time through the __func__ identifier, which tells you which function is currently executing:
#include <stdio.h>
typedef const char* func_t (void);
const char* foo (void)
{
// do foo stuff
return __func__;
}
const char* bar (void)
{
// do bar stuff
return __func__;
}
int main (void)
{
func_t* fptr;
fptr = foo;
printf("%s executed\n", fptr());
fptr = bar;
printf("%s executed\n", fptr());
return 0;
}
Output:
foo executed
bar executed
Not at all - the symbolic name of the function disappears after compilation. Unlike a reflective language, C isn't aware of how its syntax elements were named by the programmer; especially, there's no "function lookup" by name after compilation.
You can of course have a "database" (e.g. an array) of function pointers that you can compare your current pointer to.
This is utterly awful and non-portable, but assuming:
You're on Linux or some similar, ELF-based system.
You're using dynamic linking.
The function is in a shared library or you used -rdynamic when linking.
Probably a lot of other assumptions you shouldn't be making...
You can obtain the name of a function by passing its address to the nonstandard dladdr function.
set your linker to output a MAP file.
pause the program
inspect the address contained in the pointer.
look up the address in the MAP file to find out which function is being pointed to.
A pointer to a C function is an address, like any pointer. You can get the value from a debugger. You can cast the pointer to any integer type with enough bits to express it completely, and print it. Any compilation unit that can use the pointer, ie, has the function name in scope, can print the pointer values or compare them to a runtime variable, without touching anything inside the functions themselves.
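A small sketch of that last point, assuming a platform where a function pointer fits in uintptr_t (true on common desktop and server targets, but the conversion is implementation-defined). The functions add_one and twice and the helpers format_fn_address and identify are made up for the example.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef int (*op_fn)(int);

static int add_one(int x) { return x + 1; }
static int twice(int x)   { return x * 2; }

/* Write the pointer's value as hex into buf; returns snprintf's result. */
int format_fn_address(char *buf, size_t size, op_fn fp)
{
    return snprintf(buf, size, "0x%" PRIxPTR, (uintptr_t)fp);
}

/* Compare against the known candidates without touching their bodies. */
const char *identify(op_fn fp)
{
    if (fp == add_one) return "add_one";
    if (fp == twice)   return "twice";
    return "(unknown)";
}
```

The printed value can then be matched against a linker map file or the output of nm, as the other answers describe.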

Changing stdout (putch() function) on the fly in C

I'm using the XC8 compiler. For that, you have to define your own void putch(char data) function in order for functions like printf() to work, as is described here. Basically, putch() is the function which is used to write characters to stdout.
I now want to change this function on the fly. I have two different functions, putch_a() and putch_b() and want to be able to change which one is used for putch() itself, on the fly.
I thought of this:
unsigned use_a_not_b;
void putch(char data) {
if (use_a_not_b) {
putch_a(data);
} else {
putch_b(data);
}
}
However, this reduces execution speed. Would there be a way to use pointers for this? I have read this answer, and made the following code:
void putch_a(char data);
void putch_b(char data);
void (*putch)(char) = putch_a; // to switch to putch_a
void (*putch)(char) = putch_b; // to switch to putch_b
Would that work? Is there a faster or better-practice way?
No, and why not
To answer your question: no, you can't, at least not in the way you are thinking (i.e. by turning putch itself into a function pointer). A function pointer is a variable that holds the address of a function. To illustrate, consider how this works when you have a function pointer foo pointing to function bar.
int bar() {
return 0;
}
void baz(int (*foo)()) {
int x = foo(); // Calls the function pointed to by foo
}
int main() {
int (*foo)();
foo = &bar;
baz(foo); // Call baz() passing it foo, which points to bar()
}
What foo holds is the address of bar. When you pass foo to some function that expects a function pointer parameter (in this case baz()), the function dereferences the pointer, i.e. looks at the memory associated with foo, gets the address stored in it (in our case the address of bar), and then calls the function at that address (in our case, bar). To be very careful about this: in the above example baz() says
Let me look at the memory associated with foo, it has another address in it
Load that address from memory, and call a function at that address. That function returns an int and takes no parameters.
Let's contrast this with a function that calls bar() directly:
void qux() {
int x = bar(); // Call bar()
}
In this case there is no function pointer. What there is, is an address, supplied by the linker. The linker lays out all the functions in your program, and it knows, for instance, that bar() is at address 0xDEADBEEF. So in qux() there is just a call to 0xDEADBEEF. In contrast, in baz() there is something like (pseudo-assembly):
pop bar off the stack into register A
read memory address pointed to by register A into register B
jump to memory location pointed to by register B
The way putch() gets called from printf(), for instance, is exactly like the way qux() calls bar(), and not like the way baz() does: putch gets statically linked into your program, so the address of putch() is hardcoded in there, simply because printf() doesn't take a function pointer to call as a parameter.
Why #define is not the answer
#define is a preprocessor directive, that is, "symbols" defined with #define are replaced with their values before the compiler even sees your code. This means that #define makes your program less dynamically modifiable, not more. This is desirable in some cases, but in your case it will not help you. To illustrate, if you define a symbol like this:
#define Pi 3.14
Then everywhere you use Pi it is as if you typed 3.14. Because Pi does not exist as far as the compiler is concerned, you cannot even take its address to make a pointer to it.
Closest you can get to a dynamic putch
Like the others have said, you can have some sort of case statement, conditional, or a global pointer, but the putch function itself has to be there in the same form.
Global function pointer solution:
void (*myPutch)(char);
void putch(char ch) {
myPutch(ch);
}
int main() {
myPutch = putch_Type_A; // assign the function's address, don't call it
...
myPutch = putch_Type_B;
}
If/then/else solution has been provided in other answers
goto solution: This would be an ugly (but fun!) hack, and only possible on von Neumann-type machines, but in these conditions you could have your putch look like this:
void putch(char ch) {
goto PutchTypeB;
PutchTypeA:
// Code goes here
return;
PutchTypeB:
// Code goes here
return;
}
You would then overwrite the goto instruction with a goto to some other memory address. You'd have to figure out the opcodes for doing this (from disassembly, probably), and this isn't possible on Harvard architecture machines, so it is out on AVR processors, but it would be fun, if kludgy.
No. That isn't guaranteed to work due to the way code is generated and linked. However...
void (*output_function)(char) = putch_a;
void putch(char c) {
output_function(c);
}
Now you can change output_function whenever you like...
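To make the switching concrete, here is a small sketch that runs on a desktop compiler. It assumes two in-memory logs stand in for whatever hardware putch_a() and putch_b() really drive; log_a, log_b, use_putch_a and use_putch_b are names made up for the example.

```c
#include <stddef.h>

#define LOG_SIZE 16

/* Hypothetical sinks standing in for two output devices. */
static char log_a[LOG_SIZE], log_b[LOG_SIZE];
static size_t len_a, len_b;

static void putch_a(char data)
{
    if (len_a < LOG_SIZE)
        log_a[len_a++] = data;
}

static void putch_b(char data)
{
    if (len_b < LOG_SIZE)
        log_b[len_b++] = data;
}

/* The one pointer that decides where output goes. */
static void (*output_function)(char) = putch_a;

/* The fixed entry point the library calls; it only forwards. */
void putch(char data)
{
    output_function(data);
}

/* Switch targets at run time by reassigning the pointer. */
void use_putch_a(void) { output_function = putch_a; }
void use_putch_b(void) { output_function = putch_b; }
```

putch() keeps the exact signature the library expects, and the only added cost per character is one indirect call.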
There is no concept of "speed" in C. That's an attribute introduced by implementations. There are fast implementations (or rather, implementations that produce fast code, in the case of "compilers") and slow implementations (or implementations that produce slow code).
Either way, this is unlikely to be a significant bottleneck. Produce a program that solves a useful problem, profile it to determine the most significant bottlenecks and work on optimising those.
Before you optimize this, make sure what you have really does reduce execution speed noticeably. In an i/o function there's usually a lot of other stuff going on (checking if buffer space is free, calculating buffer offsets, informing hardware data is available, being interrupted while data is actually transmitted to hardware, etc.) that would make a single extra if/else inconsequential.
In most cases your first block should be fine.
In comments you mention maybe needing to extend this structure to multiple putch() functions.
Maybe try
enum PUTCH { sel_putch_a, sel_putch_b, ... };
enum PUTCH putch_select;
void putch(char c) {
switch(putch_select) {
case sel_putch_a : putch_a(c); break;
case sel_putch_b : putch_b(c); break;
/* ... */
}
}
The compiler should be able to optimize the switch statement to a simple computation and a goto. If the putch_<n> functions are inlineable, this doesn't even cost an extra call/return.
The solution using a pointer-to-function in another answer is more flexible in terms of being able to change the available putch functions on the fly, or define them in other files (for example, if you're writing a library or framework to be used by others), but it does require an extra call/return overhead (compared to the simple case of just defining a single putch function).

Set a function pointer to a static address

I'm injecting a DLL into another process and want to call a function that is in that binary based on its address (0x54315).
How can I actually declare a function, and then set it to this address?
#define FUNC 0x54315
void *myFuncPtr;
int main()
{
myFuncPtr = FUNC; // pretty sure this isn't how
myFuncPtr(); // call it?
}
The existing answers work, but you don't even need a variable for the function pointer. You can just do:
#define myfunc ((void (*)(void))0x54315)
and then call it as myfunc() just like you would an ordinary function. Note that you should change the type in the cast to match the actual argument and return types of the function.
You need to define myFuncPtr as a function pointer, a void* isn't callable.
Best to use a typedef for that:
typedef void (*funptr)(void);
funptr myFuncPtr;
(Assuming your function takes nothing and returns nothing.)
Then you'll get a warning on the assignment - use a type cast to "silence" it, since this is indeed what you need to do.
You're pretty much on your own with this though, if the signature doesn't match, the calling convention is wrong, or the address is wrong, the compiler cannot validate anything and you get to pick up the pieces.
Your code should work once the syntax is corrected to actually be a function pointer. I failed to read it properly for my first version of this answer. Sorry.
As stated by Mat, the proper syntax for a function pointer would be:
void (*myFuncPtr)(void) = (void (*)(void)) FUNC;
This is often simplified by using a typedef since the C function pointer syntax is somewhat convoluted.
Also, you must be really sure the function to be called is at that same exact address every time your injected DLL runs. I'm not sure how you can be sure of that, though ...
Also, you would need to pay attention to the calling conventions and any arguments the function at FUNC might be expecting, since if you get that wrong you will likely end up with stack corruption.
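Putting the pieces from these answers together, a sketch that can be run safely: it assumes the target takes an int and returns an int (a guess, since the real signature isn't given), and it uses the address of a local stand-in function rather than 0x54315 so the call lands somewhere valid. target_fn, fn_at, stand_in and run_demo are hypothetical names.

```c
#include <stdint.h>

/* Assumed signature of the target; adjust it to the real function. */
typedef int (*target_fn)(int);

/* Turn a raw address into a callable pointer. The compiler cannot
   check that anything callable actually lives there. */
target_fn fn_at(uintptr_t address)
{
    return (target_fn)address;
}

/* Stand-in for the function inside the other binary. */
static int stand_in(int x)
{
    return x * 2;
}

int run_demo(void)
{
    /* In the injected DLL this would be the real address instead. */
    uintptr_t address = (uintptr_t)stand_in;
    target_fn fn = fn_at(address);
    return fn(21);
}
```

On Windows the typedef would usually also need the target's calling convention, which is the point made above about stack corruption.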

Resources