Below is code which includes a variadic function and calls to the variadic function. I would expect that it would output each sequence of numbers appropriately. It does when compiled as a 32-bit executable, but not when compiled as a 64-bit executable.
#include <stdarg.h>
#include <stdio.h>
#ifdef _WIN32
#define SIZE_T_FMT "%Iu"
#else
#define SIZE_T_FMT "%zu"
#endif
static void dumpargs(size_t count, ...) {
    size_t i;
    va_list args;
    printf("dumpargs: argument count: " SIZE_T_FMT "\n", count);
    va_start(args, count);
    for (i = 0; i < count; i++) {
        size_t val = va_arg(args, size_t);
        printf("Value=" SIZE_T_FMT "\n", val);
    }
    va_end(args);
}

int main(int argc, char** argv) {
    (void)argc;
    (void)argv;
    dumpargs(1, 10);
    dumpargs(2, 10, 20);
    dumpargs(3, 10, 20, 30);
    dumpargs(4, 10, 20, 30, 40);
    dumpargs(5, 10, 20, 30, 40, 50);
    return 0;
}
Here is the output when compiled for 64-bit:
dumpargs: argument count: 1
Value=10
dumpargs: argument count: 2
Value=10
Value=20
dumpargs: argument count: 3
Value=10
Value=20
Value=30
dumpargs: argument count: 4
Value=10
Value=20
Value=30
Value=14757395255531667496
dumpargs: argument count: 5
Value=10
Value=20
Value=30
Value=14757395255531667496
Value=14757395255531667506
Edit:
Please note that the reason the variadic function pulls size_t out is because the real-world use of this is for a variadic function that accepts a list of pointers and lengths. Naturally the length argument should be a size_t. And in some cases a caller might pass in a well-known length for something:
void myfunc(size_t pairs, ...) {
    size_t i;
    va_list args;
    va_start(args, pairs);
    for (i = 0; i < pairs; i++) {
        const void* ptr = va_arg(args, const void*);
        size_t len = va_arg(args, size_t);
        process(ptr, len);
    }
    va_end(args);
}
void user(void) {
    myfunc(2, ptr1, ptr1_len, ptr2, 4);
}
Note that the 4 passed into myfunc might encounter the problem described above. And yes, really the caller should be using sizeof or the result of strlen or just plain put the number 4 into a size_t somewhere. But the point is that the compiler is not catching this (a common danger with variadic functions).
The right thing to do here is to eliminate the variadic function and replace it with a better mechanism that provides type safety. However, I would like to document this problem, and collect more detailed information as to exactly why this problem exists on this platform and manifests as it does.
So basically, if a function is variadic, it must conform to a certain calling convention (most importantly, the caller must clean up the arguments, not the callee, since the callee has no idea how many arguments there will be).
The reason it starts happening at the 4th argument is the calling convention used on x86-64. To my knowledge, both Visual C++ and GCC pass the first few parameters in registers, and the rest on the stack.
I am guessing that this is the case even for variadic functions (which does strike me as odd since it would make the va_* macros more complicated).
On x86, the standard C calling convention always uses the stack.
The problem is that you're using size_t to represent the type of the values. This is incorrect: the values are actually normal 32-bit values on Win64.
size_t should only be used for values whose size changes with the 32- or 64-bit-ness of the platform (such as pointers). Change the code to use int or __int32 and this should fix your problem.
The reason this works fine on Win32 is that size_t is a different-sized type depending on the platform. On 32-bit Windows it will be 32 bits and on 64-bit Windows it will be 64 bits. So on 32-bit Windows it just happens to match the size of the data type you are using.
A variadic function is only weakly type checked. In particular, the function signature does not provide enough information for the compiler to know the type of each argument assumed by the function.
In this case, size_t is 32-bits on Win32 and 64-bits on Win64. It has to vary in size like that in order to perform its defined role. So for a variadic function to pull arguments out correctly which are of type size_t, the caller had to make certain that the compiler could tell that the argument was of that type at compile-time in the calling module.
Unfortunately 10 is a constant of type int. There is no defined suffix letter that marks a constant to be of type size_t. You could hide that fact inside a platform-specific macro, but that would be no clearer than writing (size_t)10 at the call site.
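For illustration, such a macro might look like the following sketch; SIZE_C is a name invented here, not anything standard:
/* hypothetical helper: keeps the cast in one place instead of at every call site */
#define SIZE_C(n) ((size_t)(n))
/* usage: every variadic argument is now demonstrably a size_t */
dumpargs(3, SIZE_C(10), SIZE_C(20), SIZE_C(30));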
It appears to work partially because of the actual calling convention used in Win64. From the examples given, we can tell that the first four integral arguments to a function are passed in registers, and the rest on the stack. That allowed count and the first three variadic parameters to be read correctly.
However it only appears to work. You are actually standing squarely in Undefined Behavior territory, and "undefined" really does mean "undefined": anything can happen.
On other platforms, anything can happen too.
Because variadic functions are implicitly unsafe, a special burden is placed on the coder to make certain that the type of each argument known at compile time matches the type that argument will be assumed to have at run time.
In some cases where the interfaces are well known, it is possible to warn about type mismatch. For example, gcc can often recognize that the type of an argument to printf() doesn't match the format string, and issue a warning. But doing that in the general case for all variadic functions is hard.
The reason for this is that size_t is defined as a 32-bit value on 32-bit Windows, and a 64-bit value on 64-bit Windows. When the 4th argument is passed into the variadic function, the upper bits appear to be uninitialized. The 4th and 5th values that are pulled out are actually:
Value=0xcccccccc00000028
Value=0xcccccccc00000032
I can solve this problem with a simple cast on all the arguments, such as:
dumpargs(5, (size_t)10, (size_t)20, (size_t)30, (size_t)40, (size_t)50);
This does not answer all my questions, however; such as:
Why is it the 4th argument? Likely because the first 3 are in registers?
How does one avoid this situation in a type-safe portable manner?
Does this happen on other 64-bit platforms, using 64-bit values (ignoring that size_t might be 32-bit on some 64-bit platforms)?
Should I pull out the values as 32-bit values regardless of the target platform, and will that cause problems if a 64-bit value is pushed into the variadic function?
What do the standards say about this behavior?
Edit:
I really wanted to get a quote from The Standard, but it's something that's not hyperlink-able, and costs money to purchase and download. Therefore I believe quoting it would be a copyright violation.
Referencing the comp.lang.c FAQ, it's made clear that when writing a function that takes a variable number of arguments, there's nothing you can do for type safety. It's up to the caller to make sure that each argument either perfectly matches or is explicitly cast. There are no implicit conversions.
That much should be obvious to those who understand C and printf (note that gcc has a feature to check printf-style format strings), but what's not so obvious is that not only are the types not implicitly cast, but if the size of the types don't match what's extracted, you can have uninitialized data, or undefined behavior in general. The "slot" where an argument is placed might not be initialized to 0, and there might not be a "slot"--on some platforms you could pass a 64-bit value, and extract two 32-bit values inside the variadic function. It's undefined behavior.
If you are the one writing this function, it is your job to write the variadic function correctly and/or correctly document your function's calling conventions.
You already found that C plays fast-and-loose with types (see also signedness and promotion), so explicit casting is the most obvious solution. This is frequently seen with integer constants being explicitly defined with things like UL or ULL.
Most sanity checks on passed values will be application-specific or non-portable (e.g. pointer validity). You can use hacks like mandating that pre-defined sentinel value(s) be sent as well, but that's not infallible in all cases.
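As a rough illustration of the sentinel idea (the names below are invented for this sketch, and as noted it is not infallible - a caller who passes wrong-sized arguments can still shift everything out of place):
#include <stdarg.h>
#include <stddef.h>

#define ARGS_END ((size_t)-1)   /* invented sentinel value for this sketch */

static void dumpargs_checked(size_t count, ...) {
    va_list args;
    size_t i;
    va_start(args, count);
    for (i = 0; i < count; i++) {
        size_t val = va_arg(args, size_t);
        (void)val;   /* use val here */
    }
    /* the caller promises to append ARGS_END; if it is missing,
       the argument list was probably malformed */
    if (va_arg(args, size_t) != ARGS_END) {
        /* report the error or abort here */
    }
    va_end(args);
}
Callers would then write dumpargs_checked(2, (size_t)10, (size_t)20, ARGS_END);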
Best practice would be to document heavily, perform code reviews, and/or write unit tests with this bug in mind.
Related
Is it safe and defined behaviour to read va_list like an array instead of using the va_arg function?
EX:
void func(int string_count, ...)
{
    va_list valist;
    va_start(valist, string_count);
    printf("First argument: %d\n", *((int*)valist));
    printf("Second argument: %d\n", *(((int*)valist)+1));
    va_end(valist);
}
Same question for assignment
EX:
void func(int string_count, ...)
{
    va_list valist;
    va_start(valist, string_count);
    printf("Third argument: %d\n", *(((int*)valist)+2));
    *((int*)valist+2) = 33;
    printf("New third argument: %d\n", *(((int*)valist)+2));
    va_end(valist);
}
PS: This seems to work on GCC
No, it is not; you cannot assume anything, because the implementation varies across libraries.
The only portable way to access the values is by using the macros defined in stdarg.h for accessing the ellipsis. The size of the type is important: otherwise you end up reading garbage, and if you read more bytes than were passed, you have undefined behaviour.
So, to get a value, you have to use va_arg.
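A minimal corrected version of the function from the question, using only the stdarg.h macros (and assuming every variadic argument really is an int):
#include <stdarg.h>
#include <stdio.h>

void func(int string_count, ...)
{
    va_list valist;
    int i;
    va_start(valist, string_count);
    for (i = 0; i < string_count; i++)
        printf("Argument %d: %d\n", i + 1, va_arg(valist, int));
    va_end(valist);
}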
See: STDARG documentation
You cannot rely on a guess as to how va_list works, or on a particular implementation. How va_list works depends on the ABI, the architecture, the compiler, etc. If you want a more in-depth view of va_list, see this answer.
Edit:
A couple of hours ago I wrote this answer explaining how to use the va_*-macros. Take a look at that.
No, this is not safe and well-defined. The va_list structure could be anything (you assume it is a pointer to the first argument), and the arguments may or may not be stored contiguously in the "right order" in some memory area being pointed to.
Example of va_list implementation that doesn't work for your code - in this setup some arguments are passed in registers instead of the stack, but the va_arg still has to find them.
If an implementation's documentation specifies that va_list may be used in ways beyond those given in the Standard, you may use them in such fashion on that implementation. Attempting to use arguments in other ways may have unpredictable consequences even on platforms where the layout of parameters is specified. For example, on a platform where variadic arguments are pushed on the stack in reverse order, if one were to do something like:
int test(int x, ...)
{
    if (!x)
        return *(int*)(4+(uintptr_t)&x); // Address of first argument after x
    ... some other code using va_list.
}

int test2(void)
{
    return test(0, someComplicatedComputation);
}
a compiler which is processing test2 might look at the definition of test,
notice that it (apparently) ignores its variadic arguments when the first
argument is zero, and thus conclude that it doesn't need to compute and
pass the result of someComplicatedComputation. Even if the documentation
for the platform documents the layout of variadic arguments, the fact that
the compiler can't see that they are accessed may cause it to conclude that
they are not.
I'm on programming project 4 from chapter 19 of C Programming: A Modern Approach. My code works, but I get this warning when passing the return value of a function that returns void * to printf with the %s conversion specifier.
format %s expects argument of type char *, but argument 2 has type void * [-Wformat=]
I can easily get rid of the warning by casting the return type of the function to char *, like
printf("%s\n", (char *) function(param));
but I just want to know why this is necessary, since void * is converted to other pointer types automatically.
The compiler is right to complain in this case.
By your own logic, the function returning void * could return a structure pointer cast to void *, but then %s wouldn't be able to print that, would it?
So, if you know what you're doing, you can cast the result in this case.
Also, as others pointed out, it's worth mentioning that this warning has nothing to do with the standard specification, as the standard places no restriction on the types of the arguments. (Borrowing Mr. @WhozCraig's words) This warning comes from an additional layer of type-checking performed entirely by the compiler on its own, enabled by the -Wformat flag in gcc.
As far as the pure language is concerned (not the standard library and its expectations, the actual formal language) you can push anything you want on that argument list (including something utterly incoherent in relation to the requirements of a %s format specifier of some library routine). Of course, unless whatever you pushed ultimately is, in fact, the address of a nul-terminated sequence of char, printf itself will traipse into undefined behavior at your behest.
The warning you're receiving is based on an additional layer of api-checking within the compiler, not some violation of the language itself. That api checking matches format specs with the types of the presented arguments for frequently-used standard library apis such as printf, scanf, etc. Could the author of that warning-check have been a little more forgiving and ignored void* arguments for specs expecting pointer types? Certainly, but the point of the check-feature would dwindle pretty rapidly were that the case. Consider this:
int a = 0;
void *b = &a;
printf("%s\n", b);
If that api-check feature is going to be worth any salt at all it had better bark about that mismatched type, because as far as the language itself is concerned, there is nothing wrong with this code. And that has nothing to do with what evil I just requested it do. As far as the language is concerned, printf is simply this:
int printf(char *format, ...);
And the call I set up certainly fulfills that (bad for me, and thankfully, the api-checks of my modern compiler will let me know soon enough there may be a problem).
A pointer is a variable which points to a single memory location.
The number of bytes it refers to depends on the type of the pointer. So if it is an int*, the target is typically interpreted as 4 bytes; if it is a char*, as 1 byte.
A void* has no associated type, so the compiler can't dereference it. In order for the compiler to know how to interpret the memory being dereferenced, we need a cast here.
The printf function is declared as something like this:
int printf(char *format, ...);
Here ... denotes any additional arguments the caller supplied (that is, the string you wanted to print). When printf examines these additional parameters, it uses some low-level code which has no type safety.
This code cannot determine that the parameter has type void*, and cast it to char* automatically. Instead, if the binary representation of void* and char* is the same, the parameter can be extracted from the ... part without regard to its actual type. If the representation is different, the low-level code will try to print an incorrect value (and probably crash).
Representation of void* and char* is the same on all platforms that I know of, so it's probably safe (if you trust me, that is - please don't!). However, if you compile with gcc -Werror, as some people recommend, all warnings are upgraded to errors, so you should do the cast, as the compiler indicates.
As you can see from the code snippet below, I have declared one char variable and one int variable. When the code gets compiled, it must identify the data types of variables str and i.
Why do I need to tell again during scanning my variable that it's a string or integer variable by specifying %s or %d to scanf? Isn't the compiler mature enough to identify that when I declared my variables?
#include <stdio.h>

int main ()
{
    char str [80];
    int i;

    printf ("Enter your family name: ");
    scanf ("%s",str);
    printf ("Enter your age: ");
    scanf ("%d",&i);
    return 0;
}
Because there's no portable way for variable argument functions like scanf and printf to know the types of the variable arguments, not even how many arguments are passed.
See C FAQ: How can I discover how many arguments a function was actually called with?
This is the reason there must be at least one fixed argument to determine the number, and maybe the types, of the variable arguments. And this argument (the standard calls it parmN, see C11 (ISO/IEC 9899:201x) §7.16 Variable arguments) plays this special role and is passed to the macro va_start. In other words, you can't have a function with a prototype like this in standard C:
void foo(...);
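What standard C does allow looks like this; the last named parameter (the parmN mentioned above) is the one handed to va_start:
#include <stdarg.h>

void foo(int count, ...)        /* at least one named parameter is required */
{
    va_list ap;
    va_start(ap, count);        /* count is the parmN the standard refers to */
    /* ... read arguments with va_arg(ap, type) ... */
    va_end(ap);
}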
The reason the compiler cannot provide the necessary information is simply that the compiler is not involved here. The prototypes of these functions don't specify the types, because the functions accept variable arguments. So the actual data types are not determined at compile time, but at runtime.
The function then takes arguments from the stack, one after the other. These values don't have any type information associated with them, so the only way the function knows how to interpret the data is by using the caller-provided information, which is the format string.
The functions themselves don't know which data types are passed in, nor do they know the number of arguments passed, so there is no way that printf can decide this on its own.
In C++ you can use operator overloading, but this is an entirely different mechanism: there the compiler chooses the appropriate function based on the data types and the available overloads.
To illustrate this, a call to printf, when compiled, looks something like this:
push value1
...
push valueN
push format_string
call _printf
And the prototype of printf is this:
int printf ( const char * format, ... );
So there is no type information carried over, except what is provided in the format string.
printf is not an intrinsic function. It's not part of the C language per se. All the compiler does is generate code to call printf, passing whatever parameters. Now, because C does not provide reflection as a mechanism to figure out type information at run time, the programmer has to explicitly provide the needed info.
The compiler may be smart, but functions like printf or scanf are stupid - they do not know the type of the parameters you pass on each call. This is why you need to pass %s or %d every time.
The first parameter is a format string. If you're printing a decimal number, it may look like:
"%d" (decimal number)
"%5d" (decimal number padded to width 5 with spaces)
"%05d" (decimal number padded to width 5 with zeros)
"%+d" (decimal number, always with a sign)
"Value: %d\n" (some content before/after the number)
etc.; see, for example, Format placeholders on Wikipedia to get an idea of what format strings can contain.
Also there can be more than one parameter here:
"%s - %d" (a string, then some content, then a number)
Isn't the compiler matured enough to identify that when I declared my
variable?
No.
You're using a language specified decades ago. Don't expect modern design aesthetics from C, because it's not a modern language. Modern languages will tend to trade a small amount of efficiency in compilation, interpretation or execution for an improvement in usability or clarity. C hails from a time when computer processing time was expensive and in highly limited supply, and its design reflects this.
It's also why C and C++ remain the languages of choice when you really, really care about being fast, efficient or close to the metal.
scanf, with prototype int scanf ( const char * format, ... );, stores the given data according to the parameter format into the locations pointed to by the additional arguments.
It is not related to the compiler; it is all about the interface defined for scanf. The format parameter is required to let scanf know the size of the data to be entered.
GCC (and possibly other C compilers) keep track of argument types, at least in some situations. But the language is not designed that way.
The printf function is an ordinary function which accepts variable arguments. Variable arguments require some kind of run-time-type identification scheme, but in the C language, values do not carry any run time type information. (Of course, C programmers can create run-time-typing schemes using structures or bit manipulation tricks, but these are not integrated into the language.)
When we develop a function like this:
void foo(int a, int b, ...);
we can pass "any" number of additional arguments after the second one, and it is up to us to determine how many there are and what are their types using some sort of protocol which is outside of the function passing mechanism.
For instance if we call this function like this:
foo(1, 2, 3.0);
foo(1, 2, "abc");
there is no way that the callee can distinguish the cases. There are just some bits in a parameter passing area, and we have no idea whether they represent a pointer to character data or a floating point number.
The possibilities for communicating this type of information are numerous. For example in POSIX, the exec family of functions use variable arguments which have all the same type, char *, and a null pointer is used to indicate the end of the list:
#include <stdarg.h>

void my_exec(char *progname, ...)
{
    va_list variable_args;
    va_start (variable_args, progname);
    for (;;) {
        char *arg = va_arg(variable_args, char *);
        if (arg == 0)
            break;
        /* process arg */
    }
    va_end(variable_args);
    /*...*/
}
If the caller forgets to pass a null pointer terminator, the behavior will be undefined because the function will keep invoking va_arg after it has consumed all of the arguments. Our my_exec function has to be called like this:
my_exec("foo", "bar", "xyzzy", (char *) 0);
The cast on the 0 is required because there is no context for it to be interpreted as a null pointer constant: the compiler has no idea that the intended type for that argument is a pointer type. Furthermore (void *) 0 isn't correct because it will simply be passed as the void * type and not char *, though the two are almost certainly compatible at the binary level so it will work in practice. A common mistake with that type of exec function is this:
my_exec("foo", "bar", "xyzzy", NULL);
where the compiler's NULL happens to be defined as 0 without any (void *) cast.
Another possible scheme is to require the caller to pass down a number which indicates how many arguments there are. Of course, that number could be incorrect.
In the case of printf, the format string describes the argument list. The function parses it and extracts the arguments accordingly.
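As a rough, incomplete sketch of that idea (handling only %d and %s, with no error checking - nothing like the real implementation):
#include <stdarg.h>
#include <stdio.h>

static void tiny_print(const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    for (; *fmt != '\0'; fmt++) {
        if (*fmt == '%' && fmt[1] == 'd') {
            printf("%d", va_arg(ap, int));       /* format says: next arg is an int */
            fmt++;
        } else if (*fmt == '%' && fmt[1] == 's') {
            fputs(va_arg(ap, char *), stdout);   /* format says: next arg is a char * */
            fmt++;
        } else {
            putchar(*fmt);
        }
    }
    va_end(ap);
}
A call such as tiny_print("x = %d, name = %s\n", 42, "abc"); relies entirely on the format string telling the function what to pull out next.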
As mentioned at the outset, some compilers, notably the GNU C Compiler, can parse format strings at compile time and perform static type checking against the number and types of arguments.
However, note that a format string can be other than a literal, and may be computed at run
time, which is impervious to such type checking schemes. Fictitious example:
char *fmt_string = message_lookup(current_language, message_code);
/* no type checking from gcc in this case: fmt_string could have
four conversion specifiers, or ones not matching the types of
arg1, arg2, arg3, without generating any diagnostic. */
snprintf(buffer, sizeof buffer, fmt_string, arg1, arg2, arg3);
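gcc can be told to apply the same printf-style checking to your own variadic functions via the format function attribute (a GCC/Clang extension):
/* check argument 1 as a printf format string; variadic arguments start at position 2 */
void log_message(const char *fmt, ...) __attribute__((format(printf, 1, 2)));
Of course this only helps when the format string is a literal visible at the call site; the computed-at-run-time case above remains unchecked.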
It is because this is the only way to tell functions like printf and scanf which type of value you are passing. For example:
int main()
{
    int i = 22;
    printf("%c", i);
    return 0;
}
This code will print a character, not the integer 22, because you have told printf to treat the variable as a char.
printf and scanf are I/O functions that are designed and defined to receive a control string and a list of arguments.
The functions do not know the types of the parameters passed to them, and the compiler cannot pass this information to them.
Because in printf you're not specifying a data type, you're specifying a data format. This is an important distinction in any language, and it's doubly important in C.
When you scan in a string with %s, you're not saying "parse a string input for my string variable." You can't say that in C because C doesn't have a string type. The closest thing C has to a string variable is a fixed-size character array that happens to contain characters representing a string, with the end of the string indicated by a null character. So what you're really saying is "here's an array to hold the string, I promise it's big enough for the string input I want you to parse."
Primitive? Of course. C was invented over 40 years ago, when a typical machine had at most 64K of RAM. In such an environment, conserving RAM had a higher priority than sophisticated string manipulation.
Still, the %s scanner persists in more advanced programming environments, where there are string data types. Because it's about scanning, not typing.
Is including a header file essential? This code:
main()
{
    int i=100;
    printf("%d\n",i);
}
seems to work; the output I get is 100, even without including the stdio.h header file. How is this possible?
You don't have to include the header file. Its purpose is to let the compiler know all the information about stdio, but it's by no means necessary if your compiler is smart (or lazy).
You should include it because it's a good habit to get into - if you don't, then the compiler has no real way to know if you're breaking the rules, such as with:
int main (void) {
    puts (7); // should be a string.
    return 0;
}
which compiles without issue but rightly dumps core when running. Changing it to:
#include <stdio.h>
int main (void) {
puts (7);
return 0;
}
will result in the compiler warning you with something like:
qq.c:3: warning: passing argument 1 of ‘puts’ makes pointer
from integer without a cast
A decent compiler may warn you about this, such as gcc knowing about what printf is supposed to look like, even without the header:
qq.c:7: warning: incompatible implicit declaration of
built-in function ‘printf’
How is this possible? In short: three pieces of luck.
This is possible because some compilers will make assumptions about undeclared functions. Specifically, parameters are assumed to be int, and the return type also int. Since an int is often the same size as a char* (depending on the architecture), you can get away with passing ints and strings, as the correct size parameter will get pushed onto the stack.
In your example, since printf was not declared, it was assumed to take two int parameters, and you passed a char* and an int which is "compatible" in terms of the invocation. So the compiler shrugged and generated some code that should have been about right. (It really should have warned you about an undeclared function.)
So the first piece of luck was that the compiler's assumption was compatible with the real function.
Then at the linker stage, because printf is part of the C Standard Library, the compiler/linker will automatically include this in the link stage. Since the printf symbol was indeed in the C stdlib, the linker resolved the symbol and all was well. The linking was the second piece of luck, as a function anywhere other than the standard library will need its library linked in also.
Finally, at runtime we see your third piece of luck. The compiler made a blind assumption, the symbol happened to be linked in by default. But - at runtime you could have easily passed data in such a way as to crash your app. Fortunately the parameters matched up, and the right thing ended up occurring. This will certainly not always be the case, and I daresay the above would have probably failed on a 64-bit system.
So - to answer the original question, it really is essential to include header files, because if it works, it is only through blind luck!
As paxidiablo said, it's not necessary, but this is only true for functions and variables. If your header file provides types or macros (#define) that you use, then you must include the header file, because those are needed before linking happens, i.e. during preprocessing or compilation.
This is possible because when the C compiler sees an undeclared function call (printf() in your case) it assumes that the function has the signature
int printf(...)
and tries to call it, passing the arguments as int. Since "int" and "void *" often have the same size, it works most of the time. But it is not wise to rely on such behavior.
C supports three forms of function argument declarations (shown in code after the list):
Known fixed arguments: this is when you declare function with arguments: foo(int x, double y).
Unknown fixed arguments: this is when you declare it with empty parentheses: foo() (not to be confused with foo(void), which is the first form with no arguments), or when you don't declare it at all.
Variable arguments: this is when you declare it with ellipsis: foo(int x, ...).
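In code, the three forms look like this:
void f1(int x, double y);   /* form 1: known fixed arguments */
void f2();                  /* form 2: unknown fixed arguments (not the same as void f2(void)) */
void f3(int x, ...);        /* form 3: variable arguments */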
When you see a standard function working, its definition (which is in form 1 or 3) is compatible with form 2 (they use the same calling convention). Many old standard library functions were designed to be compatible in this way, because they date from early versions of C, when there were no function prototypes and everything was effectively in form 2. Other functions may be unintentionally compatible with form 2, if their arguments match the default argument promotion rules for that form. But some may not be.
Form 2, however, requires the programmer to pass arguments of the same types everywhere, because the compiler is not able to check the arguments against a prototype and has to determine the calling convention from the actual arguments passed.
For example, on an MC68000 machine the first two integer arguments for fixed-argument functions (both forms 1 and 2) are passed in registers D0 and D1, the first two pointers in A0 and A1, and everything else on the stack. So, for example, the function fwrite(const void * ptr, size_t size, size_t count, FILE * stream); will get its arguments as: ptr in A0, size in D0, count in D1 and stream in A1 (and return its result in D0). When you include stdio.h that is how the arguments are passed, whatever you pass to it.
When you do not include stdio.h something else happens. When you call fwrite with fwrite(data, sizeof(*data), 5, myfile), the compiler looks at the arguments and sees that the function is called as fwrite(*, int, int, *). So what does it do? It passes the first pointer in A0, the first int in D0, the second int in D1 and the second pointer in A1, which is what we need.
But when you try to call it as fwrite(data, sizeof(*data), 5.0, myfile), with count being of type double, the compiler will try to pass count on the stack, as it is not an integer. But the function expects it in D1. Now D1 contains garbage rather than count, and further behaviour is unpredictable. If you use the prototype defined in stdio.h, everything is fine: the compiler automatically converts this argument to int and passes it as needed. This is not an abstract example: a double argument may simply be the result of a computation involving floating point numbers, and you may miss this while assuming the result is an int.
Another example is a variable argument function (form 3) like printf(char *fmt, ...). For it, the calling convention requires the last named argument (fmt here) to be passed on the stack regardless of its type. So, when you call printf("%d", 10), it will put a pointer to "%d" and the number 10 on the stack and call the function as needed.
But when you do not include stdio.h, the compiler will not know that printf is a vararg function and will assume that printf("%d", 10) is a call to a function with fixed arguments of type pointer and int. So the MC68000 will place the pointer in A0 and the int in D0 instead of on the stack, and the result is again unpredictable.
You may get lucky: the arguments may happen to have been on the stack already and be read from there, giving the correct result... this time... but another time it will fail. You may also be lucky in that the compiler allows for the possibility that an undeclared function is a vararg function (and somehow makes the call compatible with both forms). Or all arguments in all forms may simply be passed on the stack on your machine, so the fixed, unknown and vararg forms are all called identically.
So: do not do this, even if you feel lucky and it works. The unknown-fixed-arguments form is there only for compatibility with old code, and its use is strongly discouraged.
Also note: C++ will not allow this at all, as it requires functions to be declared with known arguments.
I'm having a C programming question: I want to write a function with a variable argument list, where the specific type of each argument is not known - only its size in bytes. That means, if I want to get an int parameter, I define (somewhere beforehand): there will be a parameter of sizeof( int ) bytes, which is handled by a callback function xyz.
My variable argument function should now collect all the information from its call; the actual data-type-specific operations (which can also involve user-defined data types) are performed only via callback functions.
With the standard va_arg macro, it is not possible to say "get me a value of X bytes from my parameter list", so I thought of doing it this way. My data type is double in this case, but it can be of any other size (and even a variable one).
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <stdarg.h>

int fn( int anz, ... )
{
    int i;
    int j;
    va_list args;
    char* val;
    int size = sizeof( double );

    va_start( args, anz );
    val = malloc( size );
    for( i = 0; i < anz; i++ )
    {
        memcpy( val, args, size );
        args += size;
        printf( "%lf\n", *( (double*)val ) );
    }
    va_end( args );
    free( val );
    return 0;
}

int main()
{
    fn( 1, (double)234.2 );
    fn( 3, (double)1234.567, (double)8910.111213, (double)1415.161718 );
    return 0;
}
It works for me, under Linux (gcc). But my question is now: Is this really portable? Or will it fail under other systems and compilers?
My alternative approach was to replace
memcpy( val, args, size );
args += size;
with
for( j = 0; j < size; j++ )
val[j] = va_arg( args, char );
but then, my values went wrong.
Any ideas or help on this?
Performing arithmetic on a va_list is on the extreme end of nonportable. You should use va_arg normally with a type of the same size as the argument and it will probably work anywhere. For the sake of being "closer to portable" you should use unsigned integer types for this purpose (uint32_t etc.).
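As a sketch of that advice (assuming the caller really does pass 64-bit integer values, e.g. explicitly cast to uint64_t, so that extracting them with a fixed-width unsigned type of the matching size is valid):
#include <stdarg.h>
#include <stdint.h>
#include <inttypes.h>
#include <stdio.h>

/* sketch: every variadic argument is promised to be a 64-bit integer */
void dump64(int count, ...)
{
    va_list args;
    int i;
    va_start(args, count);
    for (i = 0; i < count; i++) {
        uint64_t v = va_arg(args, uint64_t);
        printf("%" PRIu64 "\n", v);
    }
    va_end(args);
}
A caller would then write dump64(2, (uint64_t)10, (uint64_t)20);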
A non-scientific test:
AIX 5.3 with GCC 4.2 - works
HP-UX 11.23 with aCC 5.56 - doesn't
Linux (SUSE 10.2) with GCC 4.1 - doesn't
Solaris 10 with CC 5.9 - doesn't
All Linux, Solaris and HP-UX complained about the args += size; line.
Otherwise, it is quite obvious that va_arg() was included for a reason. E.g. on SPARC the stack is used completely differently.
May I suggest replacing variable arguments with an array of void pointers?
It is not portable, sorry. The format of va_list is compiler/platform dependent.
You have to use va_arg() to access the va_list, and you must pass the correct type of the argument to va_arg.
However, I believe it's possible that if you pass a type of the correct size to va_arg, it will work, i.e. the exact type is not usually relevant, only its size. However, even this is not guaranteed to work across all systems.
I think I'd suggest looking again at your design and seeing if you can find an alternative - are there more details on why you are trying to do this that you can share? Can you pass the va_list to the callbacks instead?
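Passing the va_list on is the same pattern the standard library uses with vprintf and friends: the variadic wrapper forwards its va_list to a worker that does the actual extraction. A rough sketch (the names are invented for the example):
#include <stdarg.h>
#include <stdio.h>

/* worker receives the va_list and pulls the arguments out itself */
static void log_v(const char *fmt, va_list args)
{
    vfprintf(stderr, fmt, args);
}

/* thin variadic wrapper, in the style of printf/vprintf */
void log_msg(const char *fmt, ...)
{
    va_list args;
    va_start(args, fmt);
    log_v(fmt, args);
    va_end(args);
}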
Update
The reason the byte-by-byte approach doesn't work is probably quite involved. As far as the C standard goes, the reason it doesn't work is that it's not allowed - you can only use va_arg to access arguments with the same types as those that were passed to the function.
But I suspect you'd like to know what's going on behind the scenes :)
The first reason is that when you pass a "char" to a function, it's actually automatically promoted to an int, so it is stored among the variable arguments as an int. So when you read a char, you're reading an "int"s worth of memory, not a "char"s - so you're not actually reading a byte at a time.
A further reason has to do with alignment - on some architectures (one example would be very recent ARM processors), a "double" must be aligned to a 64-bit (or sometimes even 128-bit) boundary. That is, for the pointer value p, p % 16 (p modulo 16, in bytes - i.e. 128 bits) must equal 0. So when these are packed into the variable argument area, the compiler will probably ensure that any double values have space (padding) added so they only occur at the correct alignment - but you're not taking account of that when you read the entries a byte at a time.
(There may be other reasons too - I'm not intimately familiar with the inner workings of va_arg.)
In this case avoiding va_args would likely be cleaner because you still end up bound to a specific number of arguments in code at the calling point. I'd go with passing arrays of arguments.
struct arg
{
    void* vptr;
    size_t len;
};

void fn( struct arg* args, int nargs );
For that matter, I'd also carry the data definition too, either an int as mentioned earlier in the comments or a pointer to a struct if it's a more complex data def.
For C99, if I may assume that all your arguments are integer types that merely differ in width or signedness, you can get away with a macro. With that you can even transform your argument list into an array without any pain for the user that calls this.
#define myFunc(...) myRealFunc(NARGS(__VA_ARGS__), (uintmax_t const[]){ __VA_ARGS__})
where
void myRealFunc(size_t len, uintmax_t const* param);
and where NARGS is a macro that gives you the length of the __VA_ARGS__ parameter. Such a thing then can be called just like a function that would receive va_list.
To explain a bit what the macro does:
it places the number of arguments in the first parameter of myRealFunc
it creates a temporary array (compound literal) of the correct size and initializes it with the arguments (all converted to uintmax_t)
if NARGS is done correctly, this evaluates your argument list only at preprocessing time, as tokens, and not at run time, so your parameters are not evaluated more than once at run time
Now your callback functions would be called by myRealFunc by using whatever magic you would like to place there. Since when called they'd need a parameter of a different integer type, the uintmax_t parameter param[i] would be cast back into that type.
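The answer leaves NARGS unspecified; one common way to write it is the classic argument-counting trick, shown here limited to 1 to 8 arguments (it does not handle an empty argument list):
#define NARGS(...)  NARGS_(__VA_ARGS__, 8, 7, 6, 5, 4, 3, 2, 1)
#define NARGS_(_1, _2, _3, _4, _5, _6, _7, _8, N, ...)  N
With that in place, myFunc(1, 2u, 3) expands to myRealFunc(3, (uintmax_t const[]){ 1, 2u, 3 }).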