I'm trying to make some improvements to an interpreter for microcontrollers that I'm working on. For executing built-in functions I currently have something like this (albeit a bit faster):
function executeBuiltin(functionName, functionArgs) {
if (functionName=="foo") foo(getIntFromArg(functionArgs[0]));
if (functionName=="bar") bar(getIntFromArg(functionArgs[0]),getBoolFromArg(functionArgs[1]),getFloatFromArg(functionArgs[2]));
if (functionName=="baz") baz();
...
}
But it is for an embedded device (ARM) with very limited resources, and I need to cut down on the code size drastically. What I'd like to do is to have a general-purpose function for calling other functions with different arguments - something like this:
function executeBuiltin(functionName, functionArgs) {
functionData = fast_lookup(functionName);
call_with_args(functionData.functionPointer, functionData.functionArgumentTypes, functionArgs);
}
So I want to be able to call a standard C function and pass it whatever arguments it needs (which could all be of different types). For this, I need a call_with_args function.
I want to avoid re-writing every function to take argc+argv. Ideally each function that was called would be an entirely standard C function.
There's a discussion about this here - but has anything changed since 1993 when that post was written? Especially as I'm running on ARM where arguments are in registers rather than on the stack. Even if it's not in standard C, is there anything GCC specific that can be done?
UPDATE: Although the spec calls the behaviour 'undefined', the way C calls work in practice seems to mean that you can pass more arguments to a function than it expects and everything will be fine. So you can unpack all the arguments into an array of uint32s and then just pass each uint32 to the function.
That makes writing 'nice' code for calls much easier, and it appears to work pretty well (on 32-bit platforms). The only problem seems to be passing 64-bit numbers when compiling for 64-bit x86, which appears to do something particularly strange in that case.
Would it be possible to do at compile time with macros?
Something along the lines of:
https://www.redhat.com/archives/libvir-list/2014-March/msg00730.html
If this has to happen at runtime, perhaps __builtin_apply_args() could be leveraged.
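One way the compile-time macro idea could look is an X-macro list that generates both a thin wrapper per built-in and the lookup table. This is only a sketch: foo, bar and baz are stand-in built-ins simplified to int arguments and an int result, and the names BUILTIN_LIST, ARGS_n and execute_builtin are invented for the example. Each wrapper still costs a little code, so whether it is smaller than the if-chain would have to be measured.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical built-ins, simplified to int arguments and int results. */
static int foo(int a)        { return a + 1; }
static int bar(int a, int b) { return a * b; }
static int baz(void)         { return 7; }

/* How to unpack an int array into 0, 1 or 2 positional arguments. */
#define ARGS_0
#define ARGS_1 args[0]
#define ARGS_2 args[0], args[1]

/* The single list that everything else is generated from. */
#define BUILTIN_LIST(X) \
    X(foo, ARGS_1)      \
    X(bar, ARGS_2)      \
    X(baz, ARGS_0)

/* Generate one wrapper per built-in; all wrappers share one signature. */
#define MAKE_WRAPPER(name, arglist)         \
    static int call_##name(const int *args) \
    {                                       \
        (void)args;                         \
        return name(arglist);               \
    }
BUILTIN_LIST(MAKE_WRAPPER)

typedef struct {
    const char *name;
    int (*call)(const int *args);
} builtin_entry;

/* Generate the name -> wrapper table from the same list. */
#define MAKE_ENTRY(name, arglist) { #name, call_##name },
static const builtin_entry builtin_table[] = { BUILTIN_LIST(MAKE_ENTRY) };

/* Returns 0 and stores the result on success, -1 if the name is unknown. */
static int execute_builtin(const char *name, const int *args, int *result)
{
    assert(name != NULL && result != NULL);
    for (size_t i = 0; i < sizeof builtin_table / sizeof builtin_table[0]; i++) {
        if (strcmp(builtin_table[i].name, name) == 0) {
            *result = builtin_table[i].call(args);
            return 0;
        }
    }
    return -1;
}
```

Adding a built-in then means adding one line to BUILTIN_LIST; the linear strcmp search is a placeholder for whatever fast lookup the interpreter already has.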
From this document, section 5.5, Parameter Passing, it seems that parameters are passed both in registers and on the stack, as on most of today's platforms.
By "non-standard C" I mean packing the parameters and calling the function as the documentation describes, with some asm(). However, you still need minimal information about the signature of the function being called (that is, how many bits each argument takes).
From this point of view I would prefer to prepare an array of function names, an array of function pointers and an array of enumerated function signatures (expressed as the number of bits of each argument... you don't need to differentiate void* from char*, for example), and then a switch/case on the signatures. So I have reported two answers here.
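The three-arrays-plus-switch idea from this answer might look roughly like the following. It is a sketch under assumptions: arg_t, sig_t and the three built-ins are made up for illustration, and the signatures are enumerated by C type rather than by bit width. Converting a function pointer to another function pointer type and back before the call is permitted by the standard, so each case casts to the real signature first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef union { int i; float f; } arg_t;          /* hypothetical argument cell */
typedef enum { SIG_VOID, SIG_I, SIG_I_F } sig_t;  /* one entry per distinct signature */
typedef void (*generic_fn)(void);                 /* storage type only, never called as-is */

static float last_value;  /* lets the example observe which function ran */
static void foo(int a)          { last_value = (float)a; }
static void bar(int a, float b) { last_value = (float)a * b; }
static void baz(void)           { last_value = -1.0f; }

/* Parallel arrays: names, function pointers, signatures. */
static const char *const names[] = { "foo", "bar", "baz" };
static const generic_fn  fns[]   = { (generic_fn)foo, (generic_fn)bar, (generic_fn)baz };
static const sig_t       sigs[]  = { SIG_I, SIG_I_F, SIG_VOID };

/* Returns 0 if the built-in was found and called, -1 otherwise. */
static int execute_builtin(const char *name, const arg_t *args)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof names / sizeof names[0]; i++) {
        if (strcmp(names[i], name) != 0)
            continue;
        /* One case per signature, not per function. */
        switch (sigs[i]) {
        case SIG_VOID:
            ((void (*)(void))fns[i])();
            return 0;
        case SIG_I:
            ((void (*)(int))fns[i])(args[0].i);
            return 0;
        case SIG_I_F:
            ((void (*)(int, float))fns[i])(args[0].i, args[1].f);
            return 0;
        }
    }
    return -1;
}
```

The code size then grows with the number of distinct signatures rather than the number of built-ins, which is the point of the approach.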
You can do a very simple serialization to pass arbitrary arguments. Create an array and memcpy sizeof(arg) bytes into it for each passed argument.
Or you can create structs for function arguments.
Every function takes a char* or a void*. Then you pass either a pointer to a struct with that function's parameters, or you define a set of macros or functions to encode and decode arbitrary data from an array and pass the pointer to that array.
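The memcpy-based serialization described in this answer could be sketched as below. The arg_buffer type, its 32-byte capacity and the helper names are assumptions made for the example; the callee has to decode the arguments in the same order and with the same types the caller used.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-size argument buffer. */
typedef struct {
    unsigned char bytes[32];
    size_t used;
} arg_buffer;

/* Append sizeof(arg) bytes; returns 0 on success, -1 if the buffer is full. */
static int push_arg(arg_buffer *buf, const void *arg, size_t size)
{
    if (size > sizeof buf->bytes - buf->used)
        return -1;
    memcpy(buf->bytes + buf->used, arg, size);
    buf->used += size;
    return 0;
}

/* Copy one argument out at the given offset; returns the next offset. */
static size_t pop_arg(const arg_buffer *buf, size_t offset, void *out, size_t size)
{
    assert(offset + size <= buf->used);
    memcpy(out, buf->bytes + offset, size);
    return offset + size;
}

/* A callee that takes only the buffer and decodes an int followed by a float. */
static float scale_count(const arg_buffer *buf)
{
    int count;
    float factor;
    size_t offset = 0;
    offset = pop_arg(buf, offset, &count, sizeof count);
    pop_arg(buf, offset, &factor, sizeof factor);
    return (float)count * factor;
}
```

Using memcpy on both sides avoids alignment problems that a direct pointer cast into the byte array could cause on ARM.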
Hey I have implemented some callbacks in my C program.
typedef void (*server_end_callback_t)(void *callbackArg);
then I have variable inside structure to store this callback
server->server_end_callback = on_server_end;
What I have noticed is that I can pass in an on_server_end callback implementation that omits void *callbackArg, and the code works correctly (no errors).
Is it correct to omit arguments such as the void * when implementing callback functions whose prototypes take them?
void on_server_end(void) {
// some code goes here
}
I believe it is an undefined behavior from the C point of view, but it works because of the calling convention you are using.
For example, the AMD64 ABI states that the first six integer or pointer arguments are passed to the called function in CPU registers, not on the stack. So neither caller nor callee needs any clean-up for those arguments, and it works fine.
For more info, please refer to Wikipedia.
The code works correctly because of the convention for passing arguments. The caller knows that the callee expects some arguments, exactly one here. So it prepares the argument(s), either in a register or on the stack, depending on the ABI of your platform. The callee then uses those parameters or not. After the callee returns, the caller cleans up the stack if necessary. That's the mystery.
However, you should not abuse this behaviour by passing an incompatible function. It is good practice to always compile your code with the options -W -Wall -Werror (clang/gcc and compatible). With those options enabled, this assignment produces a compilation error.
C allows a certain amount of playing fast and loose with function arguments. So
void (*fptr) ();
means "a pointer to a function which takes zero or more arguments". However this is for backwards compatibility, it's not wise to use it in new C code. The other way round
void callback(void *ptr)
{
/* don't use ptr */
}
/* in another scope */
void (*fptr)() = callback;
(*fptr)(); /* call with no arguments */
also works in practice, as long as you don't use the void *, though it is not guaranteed: the standard leaves a call with the wrong number of arguments undefined (on a modern machine the calling convention is to pass the first arguments in registers, so you just get a garbage register, and it will work). Again, it is a very bad idea to rely on it.
You can pass a void *, which you then cast to a structure of appropriate type containing as many arguments as you wish. That is a good idea and a sensible use of C's flexibility.
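As a minimal sketch of the void * plus struct approach, assuming a made-up resize_args structure and an invoke helper standing in for whatever code eventually fires the callback:

```c
#include <assert.h>
#include <stddef.h>

typedef void (*callback_t)(void *arg);

/* Hypothetical argument bundle: as many fields as the callback needs. */
struct resize_args {
    int width;
    int height;
};

static int last_area;  /* lets the example observe the callback's effect */

/* The callback keeps the standard one-pointer signature and unpacks the struct. */
static void on_resize(void *arg)
{
    const struct resize_args *a = arg;
    assert(a != NULL);
    last_area = a->width * a->height;
}

/* Stand-in for the code that stores and later calls the callback. */
static void invoke(callback_t cb, void *arg)
{
    cb(arg);
}
```

The function pointer type never changes, so there is no mismatch between the declared and the actual signature; only the struct behind the void * differs per callback.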
Is it correct to omit arguments such as the void * when implementing callback functions whose prototypes take them?
No, it is not. A function with a given declaration is not compatible with a function of a different declaration. This rule applies to pointers to functions too.
So if you have a function such as pthread_create(..., my_callback, ...); and it expects you to pass a function pointer of type void* (*) (void*), then you cannot pass a function pointer of a different format. This invokes undefined behavior and compilers may generate incorrect code.
That being said, function pointer compatibility is a common non-standard extension on many systems. If the calling convention of the system is specified in a way that the function format doesn't matter, and the specific compiler port supports it, then such code might work just fine.
Such code is however not portable and not standard. It is best to avoid it whenever possible.
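The portable alternative is to keep the parameter in the callback and simply ignore it. A short sketch, where the struct server layout and server_stop are assumptions since the question only shows the typedef and the assignment:

```c
#include <assert.h>
#include <stddef.h>

typedef void (*server_end_callback_t)(void *callbackArg);

/* Hypothetical server structure holding the callback and its argument. */
struct server {
    server_end_callback_t server_end_callback;
    void *callback_arg;
};

static int end_count;  /* lets the example observe that the callback ran */

/* Matches the typedef exactly; the argument is deliberately unused. */
static void on_server_end(void *callbackArg)
{
    (void)callbackArg;  /* silences unused-parameter warnings */
    end_count++;
}

/* Stand-in for the place where the server fires the callback. */
static void server_stop(struct server *server)
{
    assert(server != NULL);
    if (server->server_end_callback != NULL)
        server->server_end_callback(server->callback_arg);
}
```

This compiles cleanly with -W -Wall -Werror and does not depend on the calling convention.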
Let's say I have a function:
int foo (int A, char B){...}
One of the features I want to implement is the capability for the user to call any function on the application through the Linux terminal. So as an input for the software, in the terminal they type something like:
foo 2 'a'
Then my application parses that, and using the symbol tables it is able to find the address for foo(), as well as the type for all its parameters.
However, I'm not sure how I would pass the parameters to the function when calling it, since I can have hundreds of different parameters types combination depending on the function called.
Any hint how that could be achieved without having hundreds of nested if statements to cast the parameters to the correct types before calling the functions?
That functionality is similar to what GDB has, where you can do call foo(2,'a') and GDB calls that function to you.
There are two approaches to this. If what you described is all you want to do, then you can use the dyncall library so that you don't have to worry about platform/compiler-specific calling semantics yourself:
The dyncall library encapsulates architecture-, OS- and compiler-specific function call semantics in a virtual "bind argument parameters from left to right and then call" interface allowing programmers to call C functions in a completely dynamic manner. In other words, instead of calling a function directly, the dyncall library provides a mechanism to push the function parameters manually and to issue the call afterwards.
The other approach applies if you might want to do more: e.g. what if an argument cannot be created by a literal? What if the argument is the output of another function? Can you write f(123, g("a")) in your console? Can you write x=g("a"); f(x)? Or if(cond) x="a" else x="b"; f(x)? In this case you need to embed a scripting language such as Lua.
If you compile your binary with debug information, you can extract it using libdwarf (https://www.prevanders.net/dwarf.html), so for every function you can get a list of parameters with their types, and you will know how to interpret the user's input.
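Once the parameter types are known (from debug information or otherwise), converting the user's tokens is the easy part. The sketch below assumes the types have already been reduced to a made-up one-letter code per parameter ('i', 'c', 'd'); value_t and parse_args are illustrative names, and the actual call would still go through something like dyncall or a per-signature switch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical holder for one converted argument. */
typedef union { long i; char c; double d; } value_t;

/* types: one letter per parameter, 'i' integer, 'c' character, 'd' double.
 * tokens: the words typed after the function name.
 * Returns the number of converted arguments, or -1 on a malformed token. */
static int parse_args(const char *types, const char *const tokens[], value_t *out)
{
    size_t n;
    assert(types != NULL && tokens != NULL && out != NULL);
    for (n = 0; types[n] != '\0'; n++) {
        const char *tok = tokens[n];
        char *end = NULL;
        switch (types[n]) {
        case 'i':
            out[n].i = strtol(tok, &end, 10);
            if (end == tok || *end != '\0')
                return -1;
            break;
        case 'd':
            out[n].d = strtod(tok, &end);
            if (end == tok || *end != '\0')
                return -1;
            break;
        case 'c':  /* expects exactly 'x' including the quotes */
            if (tok[0] != '\'' || tok[1] == '\0' || tok[2] != '\'' || tok[3] != '\0')
                return -1;
            out[n].c = tok[1];
            break;
        default:
            return -1;
        }
    }
    return (int)n;
}

/* Stand-in for the question's foo(int A, char B). */
static int foo(int A, char B)
{
    return A + (B == 'a');
}
```

For the input foo 2 'a', the type string would be "ic" and the converted values can then be handed to foo with ordinary casts.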
I need to call a function in C by just knowing its address, and no information
on its prototype (I can't cast it to a C function pointer).
The information I have on this function is its address.
I also know the parameters I want to pass to it (thanks to a void pointer) and
the size of the arguments array (accessed through the void pointer).
I also want to respect the C calling convention. For x86 version, I pretty much
know how to do it (allocate the space on the stack, copy the parameters to
that space and finally call the function).
The problem is with x64 convention (Linux one for now) where parameters are
passed through registers. I have no idea of the size of each parameter to fill
appropriately registers, I only know the size of the parameter array.
Also, I don't want to depend on gcc so I can't use __builtin_apply, which seems
to be non-standard and also pretty obscure.
I want to write my own piece of code to support multi compiler and also to
learn interesting stuff.
So basically, the function I want to write has the same prototype as
__builtin_apply which is:
void *call_ptr(void (*fun)(), void *params, size_t size);
I also want to write the code in C (using inline asm) or pure x64 asm.
So is there a way to do this properly while respecting the calling
convention? Or is this impossible with the x64 convention without knowing
exactly the prototype of the function called?
Especially for x64 calling convention on Linux this will not work at all.
The reason is the very complicated calling convention.
Some examples:
void funcA(float64 x);
void funcB(int64 x);
In these two cases the value "x" is passed to the functions differently because floating point and integer are passed to the functions in different registers.
void funcC(float64 x,int64 y);
void funcD(int64 y,float64 x);
In these two cases the arguments "x" and "y" are in different order. However they are passed to the function in the same way (both functions use the same register for "x" and the same register for "y").
Conclusion: To create a function that does what you want you'd have to pass a string containing the argument types of each argument to the assembler function. The number/size of arguments is definitely not enough. However it would definitely be possible - as long as it must work only on Linux.
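To illustrate why the type string is needed, here is a simplified model of how the Linux x86-64 convention hands out registers: integer and pointer arguments take rdi, rsi, rdx, rcx, r8, r9 in order, floating-point arguments take xmm0 to xmm7 in order, and the two sequences advance independently. The one-letter type codes and assign_locations are invented for the example, and structs, long double and varargs are ignored.

```c
#include <assert.h>
#include <stddef.h>

static const char *const int_regs[] = { "rdi", "rsi", "rdx", "rcx", "r8", "r9" };
static const char *const sse_regs[] = { "xmm0", "xmm1", "xmm2", "xmm3",
                                        "xmm4", "xmm5", "xmm6", "xmm7" };
static const char stack_slot[] = "stack";

/* types: one letter per argument, 'i' for integer/pointer, 'd' for float/double.
 * locations[n] receives the name of the place argument n is passed in.
 * Anything that does not fit in a register is reported as "stack". */
static void assign_locations(const char *types, const char *locations[])
{
    size_t next_int = 0;
    size_t next_sse = 0;
    assert(types != NULL && locations != NULL);
    for (size_t n = 0; types[n] != '\0'; n++) {
        if (types[n] == 'i' && next_int < 6)
            locations[n] = int_regs[next_int++];
        else if (types[n] == 'd' && next_sse < 8)
            locations[n] = sse_regs[next_sse++];
        else
            locations[n] = stack_slot;
    }
}
```

With "di" (funcC) and "id" (funcD) the double lands in xmm0 and the integer in rdi both times, which is the effect described above; a generic caller written in assembly has to do the same bookkeeping before loading the registers.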
I think none of your solutions will work across compilers, because the mechanism for passing arguments to a function (registers, their order, stack, memory) is compiler-dependent...
I am working on some legacy C code. The original code was written in the mid-90s, targeting Solaris and Sun's C compiler of that era. The current version compiles under GCC 4 (albeit with many warnings), and it seems to work, but I'm trying to tidy it up -- I want to squeeze out as many latent bugs as possible as I determine what may be necessary to adapt it to 64-bit platforms, and to compilers other than the one it was built for.
One of my main activities in this regard has been to ensure that all functions have full prototypes (which many did not have), and in that context I discovered some code that calls a function (previously un-prototyped) with fewer arguments than the function definition declares. The function implementation does use the value of the missing argument.
Example:
impl.c:
int foo(int one, int two) {
if (two) {
return one;
} else {
return one + 1;
}
}
client1.c:
extern foo();
int bar() {
/* only one argument(!): */
return foo(42);
}
client2.c:
extern int foo();
int (*foop)() = foo;
int baz() {
/* calls the same function as does bar(), but with two arguments: */
return (*foop)(17, 23);
}
Questions: is the result of a function call with missing arguments defined? If so, what value will the function receive for the unspecified argument? Otherwise, would the Sun C compiler of ca. 1996 (for Solaris, not VMS) have exhibited a predictable implementation-specific behavior that I can emulate by adding a particular argument value to the affected calls?
EDIT: I found another thread, "C function with no parameters behavior", which gives a very succinct, specific and accurate answer. PMG's comment at the end of the answer talks about UB. Below are my original thoughts, which I think are along the same lines and explain why the behaviour is UB.
Questions: is the result of a function call with missing arguments defined?
I would say no... The reason is that I think the function will operate as if it had the second parameter, but as explained below, that second parameter could just be junk.
If so, what value will the function receive for the unspecified argument?
I think the values received are undefined. This is why you could have UB.
There are two general ways of parameter passing that I'm aware of... (Wikipedia has a good page on calling conventions)
Pass by register. I.e., the ABI (Application Binary Interface) for the platform will say that registers x & y, for example, are for passing in parameters, and any more above that get passed via the stack...
Everything gets passed via stack...
Thus when you give one module a definition of the function with "...unspecified (but not variable) number of parameters..." (the extern def), it will place only as many parameters as you give it (in this case 1) in the registers or stack locations where the real function looks for its parameter values. The slot for the second parameter, which was left out, therefore essentially contains random junk.
EDIT: Based on the other thread I found, I would amend the above: the extern did not declare a function with no parameters, it declared a function with an "unspecified (but not variable) number of parameters".
When the program jumps to the function, that function assumes the parameter passing mechanism has been correctly obeyed, so it looks in either the registers or the stack and uses whatever values it finds... assuming them to be correct.
Otherwise, would the Sun C compiler of ca. 1996 (for Solaris, not VMS) have exhibited a predictable implementation-specific behavior
You'd have to check your compiler documentation. I doubt it... the extern definition would be trusted completely so I doubt the registers or stack, depending on parameter passing mechanism, would get correctly initialised...
If the number or the types of arguments (after default argument promotions) do not match the ones used in the actual function definition, the behavior is undefined.
What will happen in practice depends on the implementation. The values of missing parameters will not be meaningfully defined (assuming the attempt to access missing arguments will not segfault), i.e. they will hold unpredictable and possibly unstable values.
Whether the program will survive such incorrect calls will also depend on the calling convention. A "classic" C calling convention, in which the caller is responsible for placing the parameters into the stack and removing them from there, will be less crash-prone in presence of such errors. The same can be said about calls that use CPU registers to pass arguments. Meanwhile, a calling convention in which the function itself is responsible for cleaning the stack will crash almost immediately.
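Since the mismatch is undefined, the tidy-up the question describes comes down to giving foo a full prototype and making every caller pass both arguments. A sketch of the end state is below; the value 0 chosen for the previously missing argument (FOO_LEGACY_TWO, a name invented here) is purely a placeholder, because what the old binary actually received cannot be determined from the source.

```c
#include <assert.h>

/* What a shared header would declare once the prototype is complete. */
int foo(int one, int two);

/* impl.c */
int foo(int one, int two)
{
    if (two) {
        return one;
    } else {
        return one + 1;
    }
}

/* Placeholder for the argument the old call omitted. ASSUMPTION: the real
 * value has to come from studying the original program's behaviour. */
#define FOO_LEGACY_TWO 0

/* client1.c: the call now states its second argument explicitly. */
int bar(void)
{
    return foo(42, FOO_LEGACY_TWO);
}

/* client2.c: the pointer type carries the full parameter list too. */
static int (*foop)(int, int) = foo;

int baz(void)
{
    return (*foop)(17, 23);
}
```

With the prototype visible, a call such as foo(42) is rejected at compile time instead of silently reading junk.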
It is very unlikely that the bar function ever gave consistent results in the past. The only thing I can imagine is that it was always called on fresh stack space and the stack space was cleared upon startup of the process, in which case the second parameter would be 0. Or the difference between returning one and one+1 didn't matter much in the bigger scope of the application.
If it really is like you depict in your example, then you are looking at a big fat bug. In the distant past there was a coding style where vararg functions were implemented by specifying more parameters than passed, but just as with modern varargs you should not access any parameters not actually passed.
I assume that this code was compiled and run on the Sun SPARC architecture. According to this ancient SPARC web page: "registers %o0-%o5 are used for the first six parameters passed to a procedure."
In your example with a function expecting two parameters, with the second parameter not specified at the call site, it is likely that register %o1 always happened to have a sensible value when the call was made.
If you have access to the original executable and can disassemble the code around the incorrect call site, you might be able to deduce what value %o1 had when the call was made. Or you might try running the original executable on a SPARC emulator, like QEMU. In any case this won't be a trivial task!
I'd like to be able to generically pass a function to a function in C. I've used C for a few years, and I'm aware of the barriers to implementing proper closures and higher-order functions. It's almost insurmountable.
I scoured StackOverflow to see what other sources had to say on the matter:
higher-order-functions-in-c
anonymous-functions-using-gcc-statement-expressions
is-there-a-way-to-do-currying-in-c
functional-programming-currying-in-c-issue-with-types
emulating-partial-function-application-in-c
fake-anonymous-functions-in-c
functional-programming-in-c-with-macro-higher-order-function-generators
higher-order-functions-in-c-as-a-syntactic-sugar-with-minimal-effort
...and none had a silver-bullet generic answer, outside of either using varargs or assembly. I have no bones with assembly, but if I can efficiently implement a feature in the host language, I usually attempt to.
Since I can't have HOF easily...
I'd love higher-order functions, but I'll settle for delegates in a pinch. I suspect that with something like the code below I could get a workable delegate implementation in C.
An implementation like this comes to mind:
enum FUN_TYPES {
GENERIC,
VOID_FUN,
INT_FUN,
UINT32_FUN,
FLOAT_FUN,
};
typedef struct delegate {
uint32 fun_type;
union function {
int (*int_fun)(int);
uint32 (*uint_fun)(uint);
float (*float_fun)(float);
/* ... etc. until all basic types/structs in the
program are accounted for. */
} function;
} delegate;
Usage Example:
void mapint(delegate f, int arr[20]) {
int i = 0;
if(f.fun_type == INT_FUN) {
for(; i < 20; i++) {
arr[i] = f.function.int_fun(arr[i]);
}
}
}
Unfortunately, there are some obvious downsides to this approach to delegates:
No type checks, save those which you do yourself by checking the 'fun_type' field.
Type checks introduce extra conditionals into your code, making it messier and more branchy than before.
The number of (safe) possible permutations of the function is limited by the size of the 'fun_type' variable.
The enum and list of function pointer definitions would have to be machine generated. Anything else would border on insanity, save for trivial cases.
Going through ordinary C, sadly, is not as efficient as, say a mov -> call sequence, which could probably be done in assembly (with some difficulty).
Does anyone know of a better way to do something like delegates in C?
Note: The more portable and efficient, the better
Also, note: I've heard of Don Clugston's very fast delegates for C++. However, I'm not interested in C++ solutions, just C.
You could add a void* argument to all your functions to allow for bound arguments, delegation, and the like. Unfortunately, you'd need to write wrappers for anything that dealt with external functions and function pointers.
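A minimal sketch of the extra void* argument, assuming a made-up int_delegate type that pairs the function pointer with its bound context:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical delegate: a function plus the context it is bound to. */
typedef struct {
    int (*fn)(void *ctx, int x);
    void *ctx;
} int_delegate;

/* A function written to the convention: context first, then the real argument. */
static int add_bound(void *ctx, int x)
{
    const int *amount = ctx;
    return x + *amount;
}

/* A wrapper for a function that has no use for the context. */
static int square(void *ctx, int x)
{
    (void)ctx;
    return x * x;
}

/* Higher-order function: applies the delegate to every element in place. */
static void map_int(int_delegate d, int *arr, size_t n)
{
    assert(d.fn != NULL && (arr != NULL || n == 0));
    for (size_t i = 0; i < n; i++)
        arr[i] = d.fn(d.ctx, arr[i]);
}
```

Every delegate of a given shape shares one function pointer type, so the compiler still checks the call, and the only untyped part is the cast of ctx inside each function.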
I have investigated techniques for something similar in my answers to two questions, each giving a slightly different version of the basic technique. The downside is that you lose compile-time checks, since the argument lists are built at run time.
The first is my answer to the question of Is there a way to do currying in C. This approach uses a proxy function to invoke a function pointer and the arguments for the function.
The second is my answer to the question C Pass arguments as void-pointer-list to imported function from LoadLibrary().
The basic idea is to have a memory area that is then used to build an argument list and to then push that memory area onto the stack as part of the call to the function. The result is that the called function sees the memory area as a list of parameters.
In C the key is to define a struct which contains an array which is then used as the memory area. When the called function is invoked, the entire struct is passed by value which means that the arguments set into the array are then pushed onto the stack so that the called function sees not a struct value but rather a list of arguments.
With the answer to the curry question, the memory area contains a function pointer as well as one or more arguments, a kind of closure. The memory area is then handed to a proxy function which actually invokes the function with the arguments in the closure.
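This works because, under a stack-based calling convention such as 32-bit x86 cdecl, a C function call pushes arguments onto the stack, calls the function, and when the function returns the caller cleans up the stack because it knows what was actually pushed onto the stack. It does not carry over unchanged to conventions that pass arguments in registers.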