Calling function with variable arguments dynamically - c

Is it possible to call a function with a variable number of arguments dynamically from C code?
For example, I have a text file that contains data written using the scheme function_name arguments, such as:
func1 1 2 3
func2 1
func3
My program, written in C, parses this file and looks up each name in a populated array (which holds each function's name as a string together with a pointer to the native function) by comparing strings, then calls the matching function pointer with the arguments from the text file. The functions look like this, for example:
void func1(int a, int b, int c) { }
void func2(int a) { }
void func3() { }
The problem is that even if I know the number of arguments, I don't know how to write a function pointer call in C with a dynamic number of arguments. Is it possible to populate a va_list (I know that this is NOT a container or a typical array!) and then pass it to the native function, or is there any other way to do this? The only way that came to mind is populating a dynarec block with x86 code that calls the native function with variable arguments, but that is not a clean solution. Is such a thing even possible in plain C?
If this is hard to understand, just say so and I'll try to explain better. And if you want to write "use va_list", then carefully read my post once again.
Thanks in advance.

I'm answering my own question, because this will be a solution for other people. If you want to call functions with variable arguments dynamically without writing machine code directly, just use the avcall library from FFCALL. A quick reference guide to the avcall library can be found here. It's a cross-platform library that makes this possible. For example, to call a function named func1 with return type void that takes three arguments, all of type int, just do this:
#include <avcall.h>
int a = 1, b = 2, c = 3;
av_alist alist;
av_start_void(alist,&func1);
av_int(alist,a);
av_int(alist,b);
av_int(alist,c);
av_call(alist);
You can of course use this for functions that return a value or take arguments of different types; for more, just look at the avcall library manual page. :)

I like your way of thinking, because obviously, you are a true hacker, but...
do not try to do it like this.
The proper way of doing this is to go alter these functions so that each one of them accepts an array of int instead of individual int parameters. But I suppose that if you had the freedom to change them, you would have done it already and you would not be asking.
The next best way of doing it is to write a number of functions, conv1(), conv2(), conv3() etc, each accepting an array of int, and a pointer to a function which accepts individual int parameters. So, convN() accepts an array of N integers, and a pointer to a function which accepts N individual int parameters. It reads each int from the array and passes it to the function as a parameter. It can do this, because it has been specifically written to work with a function of precisely that number of parameters. Then, in your table with function names and pointers to functions, add a pointer to the right convN() function, depending on the number of parameters that the target function expects.
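The convN() idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the names generic_fn, conv_fn, struct entry, call_by_name and last_sum are hypothetical, func2 is given a single parameter to match the "func2 1" line of the sample file, and the target functions merely record a sum so that the effect of a call is observable.

```c
#include <string.h>

/* Hypothetical targets; they only record a value so a call can be observed. */
int last_sum = -1;
void func1(int a, int b, int c) { last_sum = a + b + c; }
void func2(int a)               { last_sum = a; }
void func3(void)                { last_sum = 0; }

/* Generic pointer type used only for storage. It is always cast back
   to the real function type before the call is made. */
typedef void (*generic_fn)(void);
typedef void (*conv_fn)(generic_fn target, const int *args);

/* One adapter per arity: each knows exactly how many ints to pass. */
static void conv0(generic_fn f, const int *a) { (void)a; ((void (*)(void))f)(); }
static void conv1(generic_fn f, const int *a) { ((void (*)(int))f)(a[0]); }
static void conv3(generic_fn f, const int *a) { ((void (*)(int, int, int))f)(a[0], a[1], a[2]); }

struct entry {
    const char *name;    /* name as it appears in the text file */
    generic_fn  target;  /* the native function */
    conv_fn     conv;    /* adapter matching the target's arity */
    int         argc;    /* number of int arguments expected */
};

static const struct entry table[] = {
    { "func1", (generic_fn)func1, conv3, 3 },
    { "func2", (generic_fn)func2, conv1, 1 },
    { "func3", (generic_fn)func3, conv0, 0 },
};

/* Look up a function by name and call it through its adapter.
   Returns 0 on success, -1 for an unknown name or a wrong argument count. */
int call_by_name(const char *name, const int *args, int argc)
{
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (strcmp(table[i].name, name) == 0) {
            if (table[i].argc != argc)
                return -1;
            table[i].conv(table[i].target, args);
            return 0;
        }
    }
    return -1;
}
```

With a table like this, a parsed line such as "func1 1 2 3" becomes call_by_name("func1", args, 3), and supporting a new arity means adding one more convN() adapter.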
Don't hack it.

Related

Direct access to the function stack

I previously asked a question about C functions which take an unspecified number of parameters e.g. void foo() { /* code here */ } and which can be called with an unspecified number of arguments of unspecified type.
When I asked whether it is possible for a function like void foo() { /* code here */ } to get the parameters with which it was called e.g. foo(42, "random") somebody said that:
The only you can do is to use the calling conventions and knowledge of the architecture you are running at and get parameters directly from the stack. source
My question is:
If I have this function
void foo()
{
// get the parameters here
}
And I call it: foo("dummy1", "dummy2") is it possible to get the 2 parameters inside the foo function directly from the stack?
If yes, how? Is it possible to have access to the full stack? For example if I call a function recursively, is it possible to have access to each function state somehow?
If not, what's the point of functions with an unspecified number of parameters? Is this a bug in the C programming language? In which cases would anyone want foo("dummy1", "dummy2") to compile and run fine for a function whose header is void foo()?
Lots of 'if's:
You stick to one version of a compiler.
One set of compiler options.
Somehow manage to convince your compiler to never pass arguments in registers.
Convince your compiler not to treat two calls f(5, "foo") and f(&i, 3.14) with different arguments to the same function as error. (This used to be a feature of, for example, the early DeSmet C compilers).
Then the activation record of a function is predictable (ie you look at the generated assembly and assume it will always be the same): the return address will be there somewhere and the saved bp (base pointer, if your architecture has one), and the sequence of the arguments will be the same. So how would you know what actual parameters were passed? You will have to encode them (their size, offset), presumably in the first argument, sort of what printf does.
Recursion makes no difference: each instance has its own activation record (did I say you have to convince your compiler never to optimise tail calls?). But in C, unlike in Pascal, you don't have a link backwards to the caller's activation record (i.e. its local variables), since there are no nested function declarations. Getting access to the full stack, i.e. all the activation records before the current instance, is pretty tedious, error prone, and mostly of interest to writers of malicious code who would like to manipulate the return address.
So that's a lot of hassle and assumptions for essentially nothing.
Yes, you can access passed parameters directly via the stack. But no, you can't use an old-style function definition to create a function with a variable number and type of parameters. The following code shows how to access a parameter via the stack pointer. It is totally platform dependent, so I have no idea whether it will work on your machine, but you can get the idea:
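#include <stdio.h>
long foo();
int main(void)
{
/* pass a long so the argument matches the old-style definition below */
printf( "%ld\n",foo(7L));
}
long foo(x)
long x;
{
/* GNU C extension: bind a variable to the stack pointer register (x86-64 only) */
register void* sp asm("rsp");
printf("rsp = %p rsp_ value = %lx\n",sp+8, *((long*)(sp + 8)));
return *((long*)(sp + 8)) + 12;
}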
get stack head pointer (rsp register on my machine)
add the offset of passed parameter to rsp => you get pointer to long x on stack
dereference the pointer, add 12 (do whatever you need) and return the value.
The offset is the issue since it depends on compiler, OS, and who knows on what else.
For this example I simply checked it in the debugger, but if it is really important for you, I think you can come up with some "general" solution for your machine.
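If you declare void foo(void), or compile as C++ or C23 where an empty parameter list means no parameters, then you will get a compilation error for foo("dummy1", "dummy2"). In earlier C standards, void foo() leaves the parameters unspecified and the call compiles.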
You can declare a function that takes an unspecified number of arguments as follows (for example):
int func(char x,...);
As you can see, at least one argument must be specified. This is so that inside the function, you will be able to access all the arguments that follow the last specified argument.
Suppose you have the following call:
short y = 1000;
int sum = func(1,y,5000,"abc");
Here is how you can implement func and access each of the unspecified arguments:
int func(char x,...)
{
short y = (short)((int*)&x+1)[0]; // y = 1000
int z = (int )((int*)&x+2)[0]; // z = 5000
char* s = (char*)((int*)&x+3)[0]; // s[0...2] = "abc"
return x+y+z+s[0]; // 1+1000+5000+'a' = 6098
}
The problem here, as you can see, is that the type of each argument and the total number of arguments are unknown. So any call to func with an "inappropriate" list of arguments, may (and probably will) result in a runtime exception.
Hence, typically, the first argument is a string (const char*) which indicates the type of each of the following arguments, as well as the total number of arguments. In addition, there are standard macros for extracting the unspecified arguments: va_start, va_arg and va_end, declared in <stdarg.h>.
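As a portable counterpart to the pointer arithmetic shown earlier, the standard macros can be used roughly like this. It is a minimal sketch, assuming the caller passes the number of arguments first; the name sum_ints and that count convention are illustrative and not part of the original func.

```c
#include <stdarg.h>

/* Sum `count` int arguments. The leading count is this sketch's way of
   telling the callee how many unnamed arguments follow. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);              /* start after the last named parameter */
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);     /* char and short arrive promoted to int */
    va_end(ap);
    return total;
}
```

A call such as sum_ints(3, 1, y, 5000) with short y = 1000 works because y is promoted to int before being passed. Passing fewer arguments than count, or an argument of another type, is still undefined behaviour.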
For example, here is how you can implement a function similar in behavior to printf:
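#include <stdarg.h>
#include <stdio.h>
void log_printf(const char* data,...)
{
static char str[256] = {0};
va_list args;
va_start(args,data);
vsnprintf(str,sizeof(str),data,args);
va_end(args);
/* print through "%s" so that a '%' in the formatted text is not treated as a format */
fprintf(global_fp,"%s",str);
printf("%s",str);
}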
P.S.: the example above is not thread-safe, and is only given here as an example...

PostgreSQL module - How to map an array of parameters into the function?

I am trying to convert a little bit of C test code into a PostgreSQL v1 module.
The code was originally designed as a simple command-line program that takes a variable number (an array) of text arguments, from 3 to 7.
The original code's declarations are commented out; I'm now trying to convert it into a PG shared object function. All of the arguments to the function will be text strings (base command, options, etc.).
How can I declare/pass the array into the function?
PG_FUNCTION_INFO_V1(embed_0);
Datum
embed_0(PG_FUNCTION_ARGS)
// THIS was the declaration when it was a C executable:
// int main(int argc, char *argv[])
{
// don't think mapping argc to a PG type is needed here, right?
// (argc is not a parameter?)
int *argc; // = PG_GETARG_INT32(0);
char argv[] = PG_GETARG_TEXT_P(0);
int i;
Object *pName, *pCall, *pPart1, *pPart2, *pArgs, *pValue;
If you have a variable number of arguments you need to either:
Declare it VARIADIC;
Create n different signatures for the function with different numbers of arguments; or
In C create only the most general form, the 7-argument variety, and then create SQL function wrappers for the fewer-arguments cases that call the most general form.
If you really only need 3,4,5,6, and 7-argument versions I'd do something like:
CREATE FUNCTION embed0(text,text,text,text,text,text,text) RETURNS text
LANGUAGE 'C' AS 'embed0','embed0';
CREATE OR REPLACE FUNCTION embed0(text,text,text) RETURNS text AS $$
SELECT embed0($1,$2,$3,NULL,NULL,NULL,NULL);
$$ LANGUAGE 'SQL';
-- ... etc
If 7 args was just an arbitrary upper bound and you can actually take any number of arguments you should instead just write:
CREATE FUNCTION embed0(text,text,text,VARIADIC text[]) RETURNS text
LANGUAGE 'C' AS 'embed0','embed0';
and handle the variable arguments in your C function. See the PostgreSQL source code for the concat function for how. Its implementation is text_concat in src/backend/utils/adt/varlena.c on line 3842 in current git head; your line number will differ. Most of the work is done in concat_internal.
Another example is the format function, with C implementation text_format (via lookup in pg_proc.h), located in varlena.c (according to git grep '^text_format'; Pg coding style rules specify that the function name must begin on the hard left of the line), line 3953 in current git. While a more complicated function it might be better as an example for you because it does all the work in one place rather than splitting out for a helper function call. It's declared in pg_proc.h but if it were declared in SQL it'd look something like:
CREATE FUNCTION format(text, VARIADIC "any") RETURNS text AS 'internal','text_format';
There you'll see that VARIADIC arguments are accessed like any other from C, using the PG_GETARG_...(argno) macros. The PG_NARGS() macro reports the number of arguments passed. VARIADIC arguments may be null so you have to use PG_ARGISNULL(argno) and handle the case of a null argument.
So: I'd write it as a VARIADIC function using PG_NARGS, PG_GETARG_TEXT_P, PG_ARGISNULL. Because Pg's VARIADIC functions cannot be called implicitly with zero variadic arguments, I'd write a wrapper for the 3-argument case that does:
CREATE OR REPLACE FUNCTION embed_0(text,text,text) RETURNS text AS $$
SELECT embed_0($1,$2,$3, VARIADIC ARRAY[]::text[]);
$$ LANGUAGE 'SQL';
, passing an empty array as the variadic parameter. That way you can call it with 3 args too.
BTW, when coding, be aware that the string in a Pg text is not null-terminated, unlike those passed to main(). You must use the lengths PostgreSQL provides. See src/include/fmgr.h, the tutorial, and the text handling in the functions in the source code. Don't use strlen, strcat, strcpy, etc., because they expect null-terminated strings.

Generic function pointer in C

I am writing a generic test function that will accept a function address (read from a map file) and arguments as comma separated data as arguments from a socket.
I am able to implement it for known function pointers.
like
void iif(int a, int b, float f);
typedef void (*fn_t)(int a, int b, float f);
With above approach I would write function pointers for all types of function implementation in the code base. Is there any generic way to do this?
No, since the compiler needs to know how to represent the arguments. It can't know that for a function pointer type that excludes the information, and thus it can't generate the call.
Functions with a small number of parameters might pass them in CPU registers, "spilling over" to the stack when many parameters are called for, for instance.
You can use varargs to get around this; doing so essentially "locks down" the way the arguments are passed. Of course, it forces the called functions to deal with varargs, which is not very convenient.
You can do the following.
fn_t fncptr;
fncptr= MapAddress + 0x(offset);
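MapAddress is the address at which you mapped the file into memory. (You can cast it to an integer type such as DWORD first, or DWORD_PTR on 64-bit builds, if the C++ compiler refuses to add an offset to a void pointer.) The offset is where the function's code sits in the file. But remember, on Windows the memory must be executable, so the pages need a protection such as PAGE_EXECUTE_READWRITE. Then call it like this: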
fncptr(arg1, arg2, arg3);
If the compiler rejects the first snippet, do this:
fn_t fncptr;
fncptr= (fn_t)((DWORD)MapAddress + 0x(offset));

dlsym/dlopen with runtime arguments

I am trying to do something like the following
enum types {None, Bool, Short, Char, Integer, Double, Long, Ptr};
int main(int argc, char ** args) {
enum types params[10] = {0};
void* triangle = dlopen("./foo.so", RTLD_LAZY);
void * fun = dlsym(triangle, args[1]);
<<pseudo code>>
}
Where pseudo code is something like
fun = {}
for param in params:
if param == None:
fun += void
if param == Bool:
fun += Boolean
if param == Integer:
fun += int
...
returnVal = fun.pop()
funSignature = returnval + " " + funName + "(" + Riffle(fun, ",") + ")"
exec funSignature
Thank you
Actually, you can do nearly all you want. In C language (unlike C++, for example), the functions in shared objects are referenced merely by their names. So, to find--and, what is most important, to call--the proper function, you don't need its full signature. You only need its name! It's both an advantage and disadvantage --but that's the nature of a language you chose.
Let me demonstrate, how it works.
#include <dlfcn.h>
typedef void* (*arbitrary)();
// do not mix this with typedef void* (*arbitrary)(void); !!!
int main()
{
arbitrary my_function;
// Introduce already loaded functions to runtime linker's space
void* handle = dlopen(0,RTLD_NOW|RTLD_GLOBAL);
// Load the function to our pointer, which doesn't know how many arguments there should be
*(void**)(&my_function) = dlsym(handle,"something");
// Call something via my_function
(void) my_function("I accept a string and an integer!\n",(int)(2*2));
return 0;
}
In fact, you can call any function that way. However, there's one drawback. You actually need to know the return type of your function at compile time. If you omit void* in that typedef, int is assumed as the return type under C89 rules (implicit int was removed in C99). The thing is that the compiler needs to know the size of the return type to operate the stack properly.
You can work around it with tricks, for example by pre-declaring several function types with different return types in advance and then selecting the one you are actually going to call. But the easier solution is to require functions in your plugin to always return void* or int, with the actual result returned via pointers given as arguments.
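The trick of pre-declaring several function types and selecting one at run time might look like the sketch below. Everything here is illustrative: enum ret_kind, union ret_value, any_fn, call_noargs and the two plugin_* functions are hypothetical names, and only zero-argument functions are handled to keep it short.

```c
/* Which return type the caller claims the function has. */
enum ret_kind { RET_INT, RET_DOUBLE, RET_PTR };

/* Holder for whichever kind of result comes back. */
union ret_value {
    int    i;
    double d;
    void  *p;
};

/* Storage-only pointer type; cast back to the real type before calling. */
typedef void (*any_fn)(void);

/* Call a zero-argument function through the pointer type that matches
   its declared return kind. Passing the wrong kind is undefined behaviour. */
union ret_value call_noargs(any_fn f, enum ret_kind kind)
{
    union ret_value r = {0};

    switch (kind) {
    case RET_INT:    r.i = ((int (*)(void))f)();    break;
    case RET_DOUBLE: r.d = ((double (*)(void))f)(); break;
    case RET_PTR:    r.p = ((void *(*)(void))f)();  break;
    }
    return r;
}

/* Hypothetical plugin functions standing in for symbols found by dlsym(). */
int    plugin_answer(void) { return 42; }
double plugin_half(void)   { return 0.5; }
```

For example, call_noargs((any_fn)plugin_answer, RET_INT).i reads the int result. The return kind has to come from somewhere outside the symbol itself, such as a table shipped with the plugin.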
What you must ensure is that you always call the function with the exact number and types of arguments it's supposed to accept. Pay closer attention to difference between different integer types (your best option would be to explicitly cast arguments to them).
Several commenters reported that the code above is not guaranteed to work for variadic functions (such as printf).
What dlsym() returns is normally a function pointer - disguised as a void *. (If you ask it for the name of a global variable, it will return you a pointer to that global variable, too.)
You then invoke that function just as you might using any other pointer to function:
int (*fun)(int, char *) = (int (*)(int, char *))dlsym(triangle, "function");
(*fun)(1, "abc"); /* Old school - pre-C89 standard, but explicit */
fun(1, "abc"); /* New school - C89/C99 standard, but implicit */
I'm old school; I prefer the explicit notation so that the reader knows that 'fun' is a pointer to a function without needing to see its declaration. With the new school notation, you have to remember to look for a variable 'fun' before trying to find a function called 'fun()'.
Note that you cannot build the function call dynamically as you are doing - or, not in general. To do that requires a lot more work. You have to know ahead of time what the function pointer expects in the way of arguments and what it returns and how to interpret it all.
Systems that manage more dynamic function calls, such as Perl, have special rules about how functions are called and arguments are passed and do not call (arguably cannot call) functions with arbitrary signatures. They can only call functions with signatures that are known about in advance. One mechanism (not used by Perl) is to push the arguments onto a stack, and then call a function that knows how to collect values off the stack. But even if that called function manipulates those values and then calls an arbitrary other function, that called function provides the correct calling sequence for the arbitrary other function.
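The stack-based mechanism described above can be sketched as follows. This is an assumption-laden illustration, not how any particular interpreter does it: vs_push, vs_pop, stack_fn, native_add and wrap_add are hypothetical names, and the value stack holds only ints.

```c
#include <stddef.h>

#define STACK_MAX 16

/* A tiny value stack shared by the caller and the wrappers. */
static int    stack[STACK_MAX];
static size_t top = 0;

/* Push a value; returns 0 on success, -1 if the stack is full. */
int vs_push(int v)
{
    if (top == STACK_MAX)
        return -1;
    stack[top++] = v;
    return 0;
}

/* Pop a value into *out; returns 0 on success, -1 if the stack is empty. */
int vs_pop(int *out)
{
    if (top == 0)
        return -1;
    *out = stack[--top];
    return 0;
}

/* Every dynamically callable function has this one C signature and
   collects its real arguments from the value stack. */
typedef int (*stack_fn)(void);

/* A native function with an ordinary, fixed signature. */
static int native_add(int a, int b) { return a + b; }

/* Its wrapper supplies the correct calling sequence for native_add. */
int wrap_add(void)
{
    int a, b;

    if (vs_pop(&b) != 0 || vs_pop(&a) != 0)
        return -1;
    return vs_push(native_add(a, b));
}
```

The dynamic caller only ever invokes a stack_fn, so it needs no knowledge of native_add's signature. That knowledge lives in the hand-written wrapper, which is why each callable function needs one.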
Reflection in C is hard - very hard. It is not impossible - but it requires infrastructure to support it and discipline to use it, and it can only call functions that support the infrastructure's rules.
The Proper Solution
Assuming you're writing the shared libraries; the best solution I've found to this problem is strictly defining and controlling what functions are dynamically linked by:
Setting all symbols hidden
for example clang -dynamiclib Person.c -fvisibility=hidden -o libPerson.dylib when compiling with clang
Then using __attribute__((visibility("default"))) and extern "C" to selectively unhide and include functions
Profit! You know what the function's signature is. You wrote it!
I found this in Apple's Dynamic Library Design Guidelines. These docs also include other solutions to the problem; the one above was just my favorite.
The Answer to your Question
As stated in previous answers, C and C++ functions with extern "C" in their definition aren't mangled so the function's symbols simply don't include the full function signature. If you're compiling with C++ without extern "C" however functions are mangled so you could demangle them to get the full function's signature (with a tool like demangler.com or a c++ library). See here for more details on what mangling is.
Generally speaking it's best to use the first option if you're trying to import functions with dlopen.

What does ... mean in an argument list in C?

I came across the following function signature and I wondered if this (the ellipsis, or "...") is some kind of polymorphism?
#include <fcntl.h>
int fcntl(int fd, int cmd, ... );
Thanks in advance.
It's a variable argument list.
That is a variadic function. See stdarg.h for more details.
The ... means that you can pass any number of arguments to this function, as other commenters have already mentioned. Since the optional arguments are not typed, the compiler cannot check the types and you can technically pass in any argument of any type.
So does this mean you can use this to implement some kind of polymorphic function? (I.e., a function that performs some operation based on the type of its arguments.)
No.
The reason you cannot do this, is because you cannot at runtime inspect the types of the arguments passed in. The function reading in the variable argument list is expected to already know the types of the optional arguments it is going to receive.
In case of a function that really is supposed to be able to take any number of arguments of any type (i.e., printf), the types of the arguments are passed in via the format string. This means that the caller has to specify the types it is going to pass in at every invocation, removing the benefit of polymorphic functions (that the caller doesn't have to know the types either).
Compare:
// Ideal invocation
x = multiply(number_a, number_b)
y = multiply(matrix_a, matrix_b)
// Standard C invocation
x = multiply_number(number_a, number_b)
y = multiply_matrix(matrix_a, matrix_b)
// Simulated "polymorphism" with varargs
x = multiply(T_NUMBER, number_a, number_b)
y = multiply(T_MATRIX, matrix_a, matrix_b)
You have to specify the type before the varargs function can do the right thing, so this gains you nothing.
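For concreteness, the simulated "polymorphism" with a type tag could be written along these lines. It is a sketch under assumptions: the tags T_INT and T_DOUBLE are hypothetical stand-ins for T_NUMBER and T_MATRIX, and the result is always returned as a double to keep the signature simple.

```c
#include <stdarg.h>

/* Hypothetical type tags passed as the first argument. */
enum value_type { T_INT, T_DOUBLE };

/* Multiply two values whose type is named by the tag. The tag parameter
   is declared as int so it is safe to use with va_start. */
double multiply(int type, ...)
{
    va_list ap;
    double result = 0.0;

    va_start(ap, type);
    if (type == T_INT) {
        int a = va_arg(ap, int);
        int b = va_arg(ap, int);
        result = (double)a * b;
    } else if (type == T_DOUBLE) {
        double a = va_arg(ap, double);
        double b = va_arg(ap, double);
        result = a * b;
    }
    va_end(ap);
    return result;
}
```

A call looks like multiply(T_INT, 6, 7) or multiply(T_DOUBLE, 1.5, 2.0). If the tag does not match the arguments actually passed, the behaviour is undefined and the compiler will not warn about it.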
No, that's the "ellipsis" you're seeing there, assuming you're referring to the ... part of the declaration.
Basically it says that this function takes an unknown number of arguments after the first two that are specified there.
The function has to be written in such a way that it knows what to expect, otherwise strange results will ensue.
For other functions that support this, look at the printf function and its variants.
Does C support polymorphism?
No, it doesn't.
However, there are several libraries, such as the Python C API, that implement a rough variant of polymorphism using structs and pointers. Beware that the compiler cannot perform appropriate type checking in most cases.
The technique is simple:
typedef struct {
char * (*to_string)();
} Type;
#define OBJ_HEADER Type *ob_type
typedef struct {
OBJ_HEADER;
} Object;
typedef struct {
OBJ_HEADER;
long ival;
} Integer;
typedef struct {
OBJ_HEADER;
char *name;
char *surname;
} Person;
Integer and Person get a Type object with appropriate function pointers (e.g. to functions like integer_to_string and person_to_string).
Now just declare a function accepting an Object *:
void print(Object *obj) {
printf("%s", obj->ob_type->to_string(obj));
}
now you can call this function with both an Integer and a Person:
Integer *i = make_int(10);
print((Object *) i);
Person *p = make_person("dfa");
print((Object *) p);
EDIT
alternatively you can declare i and p as Object *; of course make_int and make_person will allocate space for Integer and Person and do the appropriate cast:
Object *
make_int(long i) {
Integer *ob = malloc(sizeof(Integer));
ob->ob_type = &integer_type;
ob->ival = i;
return (Object *) ob;
}
NB: I cannot compile these examples right now, please double-check them.
I came across the following function signature and I wondered if this (the ellipsis, or "...") is some kind of polymorphism?
Yes, it is a primitive form of polymorphism. With only one function signature you are able to pass various structures. However, the compiler cannot help you with detecting type errors.
Adding to what's been said: C supports polymorphism through other means. For example, take the standard library qsort function which sorts data of arbitrary type.
It is able to do so by means of untyped (void) pointers to the data. It also needs to know the size of the data to sort (provided via sizeof) and the logic that compares the objects' order. This is accomplished by passing a function pointer to the qsort function.
This is a prime example of runtime polymorphism.
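A short sketch of the qsort pattern described above: the same library routine sorts ints and strings, and only the comparison callback differs. The helper names compare_ints, compare_strings, sort_ints and sort_strings are illustrative.

```c
#include <stdlib.h>
#include <string.h>

/* Comparison callback for int elements; qsort passes pointers to elements. */
int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);   /* avoids the overflow risk of x - y */
}

/* Comparison callback for elements that are themselves char pointers. */
int compare_strings(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* qsort needs the element count, the element size, and the callback. */
void sort_ints(int *v, size_t n)
{
    qsort(v, n, sizeof v[0], compare_ints);
}

void sort_strings(const char **v, size_t n)
{
    qsort(v, n, sizeof v[0], compare_strings);
}
```

Because qsort only sees void pointers, passing a callback that does not match the element type compiles without complaint and fails at run time.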
There are other ways to implement object-oriented behaviour (in particular, virtual function calls) by managing the virtual function tables manually. This can be done by storing function pointers in structures and passing them around. Many APIs do so, e.g. the WinAPI, which even uses advanced aspects of object orientation, e.g. base class call dispatch (DefWindowProc, to simulate calling the virtual method of the base class).
I assume you are referring to the ellipsis (...)? If so this indicates that 0 or more parameters will follow. It is called varargs, defined in stdarg.h
http://msdn.microsoft.com/en-us/library/kb57fad8.aspx
printf uses this functionality. Without it you wouldn't be able to keep adding parameters to the end of the function.
C supports a crude form of polymorphism, i.e. a type being able to appear and behave as another type. It works in a similar way as in C++ under the hood (relying on memory being aligned), but you have to help the compiler out by casting. E.g. you can define a struct:
typedef struct {
char forename[20];
char surname[20];
} Person;
And then another struct:
typedef struct {
char forename[20];
char surname[20];
float salary;
char managername[20];
} Employee;
Then
#include <string.h>
void setpersonname(Person *person);
int main (int argc, char *argv[])
{
Employee Ben;
setpersonname((Person *) &Ben);
return 0;
}
void setpersonname(Person *person)
{
strcpy(person->forename,"Ben");
}
The above example shows Employee being used as a Person.
No, it is a function that is taking variable number of arguments.
That is not technically polymorphism. fcntl takes variable number of arguments & that is the reason for the ... similar to printf function.
C neither supports function overloading - which is a type of ad-hoc polymorphism based on compile-time types - nor multiple dispatch (ie overloading based on runtime types).
To simulate function overloading in C, you have to create multiple differently named functions. The functions' names often contain the type information, eg fputc() for characters and fputs() for strings.
Multiple dispatch can be implemented by using variadic functions. Again, it's the programmer's job to provide the type information, but this time via an extra argument, which will be evaluated at runtime - in contrast to the compile-time function name in case of the approach given above. The printf() family of functions might not be the best example for multiple dispatch, but I can't think of a better one right now.
Other approaches to multiple dispatch using pointers instead of variadic functions or wrapping values in structures to provide type annotations exist.
The printf declaration in the standard library is
int printf(const char*, ...);
Think about that.
You can write code that supports Polymorphic behavior in C, but the ... (ellipsis) is not going to be much help. That is for variable arguments to a function.
If you want polymorphic behavior, you can use unions and structures to construct a data structure that has a "type" section and variable fields depending on type. You can also include tables of function pointers in the structures. Poof! You've invented C++.
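A data structure with a "type" section and type-dependent fields might look like the following sketch. The shape example and all of its names (shape_kind, struct shape, shape_area) are hypothetical and chosen only to illustrate the layout.

```c
/* The tag that says which union member is currently valid. */
enum shape_kind { SHAPE_CIRCLE, SHAPE_RECT };

struct shape {
    enum shape_kind kind;                        /* the "type" section */
    union {                                      /* fields that depend on kind */
        struct { double radius; }        circle;
        struct { double width, height; } rect;
    } u;
};

/* Dispatch on the tag at run time to pick the right computation. */
double shape_area(const struct shape *s)
{
    switch (s->kind) {
    case SHAPE_CIRCLE:
        return 3.141592653589793 * s->u.circle.radius * s->u.circle.radius;
    case SHAPE_RECT:
        return s->u.rect.width * s->u.rect.height;
    }
    return 0.0;   /* unknown tag */
}
```

The caller sets kind and the matching union member, then calls shape_area(&s). Replacing the switch with a function pointer stored in the structure gives the table-of-function-pointers variant mentioned above.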
Yes, C does support polymorphism.
The code we write in C++ using virtual to implement polymorphism
was, in early compilers such as Cfront, first converted to C code by the compiler (one can find details here).
It's well known that virtual functions in C++ are implemented using function pointers.
