Calling C functions with too many arguments - c

I am currently changing the function signatures of a class of functions in an application. These functions are being stored in a function table, so I was expecting to change this function table as well. I have just realised that in certain instances, we already use the new function signature. But because everything is casted to the correct function type as it is put into the function table, no warnings are being raised.
When the function is called, it will be passed extra parameters that are not really part of the function declaration, but they are on the end of the parameter list.
I can't determine if this is guaranteed by the way function parameters are passed in C. I guess to do variadic functions like sprintf, it has to be the case that earlier arguments can be resolved correctly whatever is on the end of the parameter list?
It evidently works just fine across multiple platforms but out of curiosity I'd like to know how and why it works.

But because everything is casted to the correct function type as it is put into the function table, no warnings are being raised.
So the compiler gets to be no help to speak of. C programmers cast too much. >_<
I can't determine if this is guaranteed by the way function parameters are passed in C. I guess to do variadic functions like sprintf, it has to be the case that earlier arguments can be resolved correctly whatever is on the end of the parameter list?
Technically, you've got undefined behavior. But it's defined for your platform to use the standard C calling conventions (see Scott's answer), or something that maps directly to them (usually by mapping the first N parameters to a certain set of processor registers).
This comes up a lot with variable argument lists, too. For example, printf is declared something like:
int printf(const char* format, ...);
And its definition usually uses the stdarg system to handle the extra arguments, which looks like:
#include <stdarg.h>
int printf(const char* format, ...)
{
va_list ap;
int result;
va_start(ap, format);
result = vprintf(format, ap);
va_end(ap);
return result;
}
If you're on a platform with standard C calling conventions, that va_end(ap) macro usually turns into a do-nothing. In this case, you can get away with passing extra arguments to a function. But on some platforms, the va_end() call is required to restore the stack to a predictable state (i.e. where it was before the call to va_start); in those cases, your functions will not leave the stack the way it found it (it won't pop enough arguments back off the stack) so your calling function could, for example, crash on exit when it fetches a bogus value for a return address.

Your functions must certainly be using the cdecl calling convention (http://en.wikipedia.org/wiki/X86_calling_conventions#cdecl). This pushes arguments on the stack in reverse order, from right to left, ensuring that the last argument can be easily located (top of the stack) and used to interpret the remainder, such as a printf format string. It is also the responsibility of the caller to clean up the stack, which is a bit less compact than the function itself doing so (as in pascal/stdcall convention), but ensures that variable argument lists can be used, and implies that trailing arguments can be ignored.

Related

C callbacks - optional argument?

Hey I have implemented some callbacks in my C program.
typedef void (*server_end_callback_t)(void *callbackArg);
then I have variable inside structure to store this callback
server->server_end_callback = on_sever_end;
What I have noticed it that I can pass in on_server_end callback function implementation that skips void *callbackArg and the code works correctly (no errors).
Is it correct to skip some arguments like void * implementing callback functions which prototypes takes such arguments?
void on_server_end(void) {
// some code goes here
}
I believe it is an undefined behavior from the C point of view, but it works because of the calling convention you are using.
For example, AMD64 ABI states that the first six arguments get passed to the calling function using CPU registers, not stack. So neither caller nor callee need no clean-up for the first six arguments and it works fine.
For more info please refer the Wikipedia.
The code works correctly because of the convention of passing arguments. Caller knows that callee expects some arguments - exactly one. So, it prepares the argument(s) (either in register or on stack - depending on ABI on your platform). Then callee uses those parameters or not. After return from callee, caller cleans up the stack if necessary. That's the mistery.
However, you shall not abuse this specific behaviour by passing incompatible function. It is a good practice to always compile your code with options -W -Wall -Werror (clang/gcc and compatible). Enabling such option would provide you a compilation error.
C allows a certain amount of playing fast and loose with function arguments. So
void (*fptr) ();
means "a pointer to a function which takes zero or more arguments". However this is for backwards compatibility, it's not wise to use it in new C code. The other way round
void (*fptr)(void *ptr)
{
/* don't use the void */
}
/* in another scope */
(*fptr)(); /* call with no arguments */
also works, as long as you don't use the void *, and I believe it is guaranteed to work though I'm not completely sure about that (on a modern machine the calling convention is to pass the first arguments in registers, so you just get a garbage register, and it will work). Again, it is a very bad idea to rely on it.
You can pass a void *, which you then cast to a structure of appropriate type containing as many arguments as you wish. That is a good idea and a sensible use of C's flexibility.
Is it correct to skip some arguments like void * implementing callback functions which prototypes takes such arguments?
No it is not. Any function with a given function declaration is not compatile with a function of a different function declaration. This rule applies for pointers to functions too.
So if you have a function such as pthread_create(..., my_callback, ...); and it expects you to pass a function pointer of type void* (*) (void*), then you cannot pass a function pointer of a different format. This invokes undefined behavior and compilers may generate incorrect code.
That being said, function pointer compatibility is a common non-standard extension on many systems. If the calling convention of the system is specified in a way that the function format doesn't matter, and the specific compiler port supports it, then such code might work just fine.
Such code is however not portable and not standard. It is best to avoid it whenever possible.

How does Stack Activation Frame for Variable arguments functions works?

I am reading http://www.cs.utexas.edu/users/lavender/courses/cs345/lectures/CS345-Lecture-07.pdf to try to understand how does Stack Activation Frame for Variable arguments functions works?
Specifically how can the called function knows how many arguments are being passed?
The slide said:
The va_start procedure computes the fp+offset value following the argument
past the last known argument (e.g., const char format). The rest of the arguments are then computed by calling
va_arg, where the ‘ap’ argument to va_arg is some fp+offset value.*
My question is what is fp (frame point)? how does va_start computes the 'fp+offset' values?
and how does va_arg get 'some fp+offset values? and what does va_end supposed to do with stack?
The function doesn't know how many arguments are passed. At least not in any way that matters, i.e. in C you cannot query for the number of arguments.
That's why all varargs functions must either:
Use a non-varying argument that contains information about the number and types of all variable arguments, like printf() does; or
Use a sentinel value on the end of the variable argument list. I'm not aware of a function in the standard library that does this, but see for instance GTK+'s gtk_list_store_set() function.
Both mechanisms are risky; if your printf() format string doesn't match the arguments, you're going to get undefined behavior. If there was a way for printf() to know the number of passed arguments, it would of course protect against this, but there isn't. So it can't.
The va_start() macro takes as an argument the last non-varying argument, so it can somehow (this is compiler internals, there's no single correct or standard answer, all we can do from this side of the interface is reason from the available data) use that to know where the first varying argument is located on the stack.
The va_arg() macro gets the type as an "argument", which makes it possible to use that to compute the offset, and probably increment some state in the va_list object to point at the next varying argument.

An example of use of varargs in C

Here I found an example of how varargs can be used in C.
#include <stdarg.h>
double average(int count, ...)
{
va_list ap;
int j;
double tot = 0;
va_start(ap, count); //Requires the last fixed parameter (to get the address)
for(j=0; j<count; j++)
tot+=va_arg(ap, double); //Requires the type to cast to. Increments ap to the next argument.
va_end(ap);
return tot/count;
}
I can understand this example only to some extent.
It is not clear to me why we use va_start(ap, count);. As far as I understand, in this way we set the iterator to its first element. But why it is not set to the beginning by default?
It is not clear to me why we need to give count as an argument. Can't C automatically determine the number of the arguments?
It is not clear to me why we use va_end(ap). What does it change? Does it set the iterator to the end of the list? But is it not set to the end of the list by the loop? Moreover, why do we need it? We do not use ap anymore; why do we want to change it?
Remember that arguments are passed on the stack. The va_start function contains the "magic" code to initialize the va_list with the correct stack pointer. It must be passed the last named argument in the function declaration or it will not work.
What va_arg does is use this saved stack pointer, and extract the correct amount of bytes for the type provided, and then modify ap so it points to the next argument on the stack.
In reality these functions (va_start, va_arg and va_end) are not actually functions, but implemented as preprocessor macros. The actual implementation also depends on the compiler, as different compilers can have different layout of the stack and how it pushes arguments on the stack.
But why it is not set to the beginning by default?
Maybe because of historical reasons from when compilers weren't smart enough. Maybe because you might have a varargs function prototype which doesn't actually care about the varargs and setting up varargs happens to be expensive on that particular system. Maybe because of more complex operations where you do va_copy or maybe you want to restart working with the arguments multiple times and call va_start multiple times.
The short version is: because the language standard says so.
Second, it is not clear to me why we need to give count as an argument. Can't C++ automatically determine the number of the arguments?
That's not what all that count is. It is the last named argument to the function. va_start needs it to figure out where the varargs are. Most likely this is for historical reasons on old compilers. I can't see why it couldn't be implemented differently today.
As the second part of your question: no, the compiler doesn't know how many arguments were sent to the function. It might not even be in the same compilation unit or even the same program and the compiler doesn't know how the function will be called. Imagine a library with a varargs function like printf. When you compile your libc the compiler doesn't know when and how programs will call printf. On most ABIs (ABI is the conventions for how functions are called, how arguments are passed, etc) there is no way to find out how many arguments a function call got. It's wasteful to include that information in a function call and it's almost never needed. So you need to have a way to tell the varargs function how many arguments it got. Accessing va_arg beyond the number of arguments that were actually passed is undefined behavior.
Then it is not clear to me why do we use va_end(ap). What does it change?
On most architectures va_end doesn't do anything relevant. But there are some architectures with complex argument passing semantics and va_start could even potentially malloc memory then you'd need va_end to free that memory.
The short version here is also: because the language standard says so.
va_start initalises the list of variable arguments. You always pass the last named function argument as the second parameter. It is because you need to provide information about location in the stack, where variable arguments begin, since arguments are pushed on the stack and compiler cannot know where there's a beginning of variable argument list (there's no differentiation).
As to va_end, it is used to free resources allocated for the variable argument list during the va_start call.
It's C macroses. va_start sets internal pointer to address of first element. va_end cleanup va_list. If there is va_start in code and there is no va_end - it's UB.
The restrictions that ISO C places on the second parameter to the va_start() macro in header
are different in this International Standard. The parameter parmN is the identifier of the rightmost parameter
in the variable parameter list of the function definition (the one just before the ...). If the parameter
parmN is declared with a function, array, or reference type, or with a type that is not compatible with the
type that results when passing an argument for which there is no parameter, the behavior is undefined.

Is returning va_list safe in C?

I'd like to write a function that has return type of va_list.
example: va_list MyFunc(va_list args);
is this safe and portable?
va_list might (but is not guaranteed to) be an array type, so you can't pass or return it by value. Code that looks as if it does, might just be passing/returning a pointer to the first element, so you can use the parameter in the callee but you might be acting on the original.
Formally you can probably say that va_list is an entity type, not a value type. You copy it with va_copy, not with assignment or via function parameters/return.
While you can definitely return such a value, I am not sure if the return value can be used in a useful way.
As the handling of va_lists requires special treatment (va_end() required after va_start() and va_copy()), and va_start/copy and va_end macros are even allowed to contain { } to enforce this pairing, you cannot call one without the other.
Whatever the language standard says, this is unlikely to work in practice. A va_list is likely to be a pointer to a call record placed on the stack by a caller for the benefit of the callee. Once the callee returns, that memory on the stack is fair game for reuse.
Returning the type va_list is unlikely to actually copy the content of the list back to the caller. Although that would be a valid implementation of C, if the standard requires it be done so, that would be a defect in the specification.
Passing a pointer to another function is quite different from
returning that pointer though. Many/most implementations store the
actual variable arguments in a stack frame that is destroyed when the
vararg function returns. (i.e. returning a va_list or a pointer to
one, will leave you with pointers to local variables that's
destroyed). – nos
well in my case i'd be returning va_list back but thanks for warning –
Hayri Uğur Koltuk
If you pass a pointer to a va_list to the function MyFunc(va_list *args), you don't need to pass the modified (by va_arg(*args, type)) argument list back, since MyFunc modifies the original list.

Why is void f(...) not allowed in C?

Why doesn't C allow a function with variable length argument list such as:
void f(...)
{
// do something...
}
I think the motivation for the requirement that varargs functions must have a named parameter is for uniformity of va_start. For ease of implementation, va_start takes the name of the last named parameter. With a typical varargs calling convention, and depending on the direction arguments are stored, va_arg will find the first vararg at address (&parameter_name) + 1 or (first_vararg_type*)(&parameter_name) - 1, plus or minus some padding to ensure alignment.
I don't think there's any particular reason why the language couldn't support varargs functions with no named parameters. There would have to be an alternative form of va_start for use in such functions, that would have to get the first vararg directly from the stack pointer (or to be pedantic the frame pointer, which is in effect the value that the stack pointer had on function entry, since the code in the function might well have moved the sp since function entry). That's possible in principle -- any implementation should have access to the stack[*] somehow, at some level -- but it might be annoying for some implementers. Once you know the varargs calling convention you can generally implement the va_ macros without any other implementation-specific knowledge, and this would require also knowing how to get at the call arguments directly. I have implemented those varargs macros before, in an emulation layer, and it would have annoyed me.
Also, there's not a lot of practical use for a varargs function with no named parameters. There's no language feature for a varargs function to determine the type and number of variable arguments, so the callee has to know the type of the first vararg anyway in order to read it. So you might as well make it a named parameter with a type. In printf and friends the value of the first parameter tells the function what the types are of the varargs, and how many of them there are.
I suppose that in theory the callee could look at some global to figure out how to read the first argument (and whether there even is one), but that's pretty nasty. I would certainly not go out of my way to support that, and adding a new version of va_start with extra implementation burden is going out of my way.
[*] or if the implementation doesn't use a stack, to whatever it uses instead to pass function arguments.
With variable-length argument list you must declare the type of the first argument - that's the syntax of the language.
void f(int k, ...)
{
/* do something */
}
will work just fine. You then have to use va_list, va_start, va_end, etc. inside the function to access individual arguments.
C does allow for variable length arguments, but you need to use va_list, va_start, va_end, etc. for it. How do you think printf and friends are implemented? That said, I would recommend against it. You can usually accomplish a similar thing more cleanly using an array or struct for the parameters.
Playing around with it, made this nice implementation that I think some people might want to consider.
template<typename T>
void print(T first, ...)
{
va_list vl;
va_start(vl, first);
T temp = first;
do
{
cout << temp << endl;
}
while (temp = va_arg(vl, T));
va_end(vl);
}
It ensures you have one variable minimum, but allows you to put them all in a loop in a clean way.
There's no an intrisic reason why C can't accept void f(...). It could, but "designers" of this C feature decided not to do so.
My speculation about their motivations is that allowing void f(...) would require more "hidden" code (that can be accounted as a runtime) than not allowing it: in order to make distinguishable the case f() from f(arg) (and the others), C should provide a way to count how many args are given, and this needs more generated code (and likely a new keyword or a special variable like say "nargs" to retrieve the count), and C usually tries to be as minimalist as possible.
The ... allows for no arguments, ie: for int printf(const char *format, ...); the statement
printf("foobar\n");
is valid.
If you don't mandate at least 1 parameter (which should be used to check for more parameters), there is no way for the function to "know" how it was called.
All these statements would be valid
f();
f(1, 2, 3, 4, 5);
f("foobar\n");
f(qsort);

Resources