I am confused about the usage of the arguments before ... in C. Some people say that the parameter before the ... is supposed to hold the number of variadic arguments. However, how does that make sense with variadic functions like printf()? Does the number of characters given equal the upper limit on how many variadic arguments can be given? For example:
printf("Hey"); // printf() is passed 4 characters (including the null one).
// So potentially 4 variadic arguments?
Or is the final parameter before ... just used to initialize va_list using va_start?
I thought that the parameter before the ... was used to hold the number of variadic arguments, but how does that work with printf() then?
The va_arg and related macros have no way to know how many arguments you provided. In the case of printf, it parses the format string until it encounters the null terminator, consumes one argument for each conversion specifier it finds, and then prays that the programmer actually passed as many arguments as the format string claims. If the programmer didn't, printf will run amok.
This is one of many reasons why variadic functions should be avoided when possible. It is one of the most brittle and dangerous features of the C language.
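To make the point concrete: the parameter before the ... means whatever the function's author decides it means. The sketch below uses a made-up function, sum_ints, whose fixed parameter happens to be a count. That is a convention of this one function, not something va_start or va_arg enforces.

```c
#include <stdarg.h>

/* Hypothetical example: `count` is a contract between caller and callee.
   Nothing in the language checks that `count` arguments were really passed. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);            /* start after the last named parameter */
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);   /* the caller promised `count` ints */
    va_end(ap);

    return total;
}
```

A call such as sum_ints(3, 10, 20, 30) adds up the three values. A call such as sum_ints(3, 10) would read arguments that were never passed, which is undefined behavior: the same trap as a printf format with more conversion specifiers than arguments.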
What is the point of a format specifier in C if we have already set the type of the variable before printf?
For example:
#include <stdio.h>
int main(void)
{
    int a = 7;
    printf("%d", a);
}
Like, it's already stated what a is: it's an integer (int). So what is the point of adding %d to specify that it's an integer?
The answer to this question really only makes sense in the context of C's history.
C is, by now, a pretty old language. Though undoubtedly a "high level language", it is famously low-level as high-level languages go. And its earliest compiler was deliberately and self-consciously small and simple.
In its first incarnation, C did not enforce type safety during function calls. For example, if you called sqrt(144), you got the wrong answer, because sqrt expects an argument of type double, but 144 is an int. It was the programmer's responsibility to call a function with arguments of the correct types: the compiler did not know (did not even attempt to keep track of) the arguments expected by each function, so it did not and could not perform automatic conversions. (A separate program, lint, could check that functions were called with the correct arguments.)
C++ corrected this deficiency by introducing the function prototype. Prototypes were then adopted by C in the first ANSI C standard in 1989. However, a function prototype only works for a function that expects a single, fixed argument list, meaning that it can't help for functions that accept a variable number of arguments, the premier example being printf.
The other thing to remember is that, in C, printf is a more or less ordinary function. ("Ordinary" other than accepting a variable number of arguments, that is.) So the compiler has no direct mechanism to notice the types of the arguments and make that list of types available to printf. printf has no way of knowing, at run time, what types were passed during any given call; it can only rely (it must rely) on the clues provided in the format string. (This is by contrast to languages, many of them, where the print statement is an explicit part of the language parsed by the compiler, meaning that the compiler can do whatever it needs to do in order to treat each argument properly according to its known type.)
So, by the rules of the language (which are constrained by backwards compatibility and the history of the language), the compiler can't do anything special with the arguments in a printf call, other than performing what is called the default argument promotions. So the compiler can't fix things (can't perform the "correct" implicit conversion) if you write something like
int a = 7;
printf("%f", a);
This is, admittedly, an uncomfortable situation. These days, programmers are used to the protections and the implicit promotions provided for by function prototypes. If, these days, you can call
int x = sqrt(144);
and have the right thing happen, why can't you similarly call
printf("%f\n", 144);
Well, you can't, although a good, modern compiler will try to help you out anyway. Although the compiler doesn't have to inspect the format string (because that's printf's job to do, at run time), and the compiler isn't allowed to insert any implicit conversions (other than the default promotions, which don't help here), a compiler can duplicate printf's logic, inspect the format string, and issue strong warnings if the programmer makes a mistake. For example, given
printf("%f\n", 144);
gcc prints "warning: format ‘%f’ expects argument of type ‘double’, but argument 2 has type ‘int’", and clang prints "warning: format specifies type 'double' but the argument has type 'int'".
In my opinion, this is a fine compromise, balancing C's legacy behavior with modern expectations.
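The default argument promotions mentioned above can be observed in a small sketch. The function name show_promotions is made up for illustration, and the sketch assumes the caller passes exactly one char followed by one float.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: the caller is expected to pass one char and one float.
   Through `...` a char arrives as int and a float arrives as double,
   so those promoted types are what va_arg has to name.
   Returns the sum of the two values so the result is easy to check. */
double show_promotions(int unused, ...)
{
    va_list ap;

    va_start(ap, unused);
    int c = va_arg(ap, int);        /* the char was promoted to int */
    double d = va_arg(ap, double);  /* the float was promoted to double */
    va_end(ap);

    printf("%c %f\n", c, d);
    return c + d;
}
```

These promotions are the only conversions applied to variadic arguments. They explain why %f serves for both float and double, and also why an int passed for %f stays an int instead of being converted.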
what is the point of adding %d to specify that it's an integer?
printf() is a function that receives a variable number of arguments of various types after the format argument. It has no direct knowledge of the number or the types of the arguments it receives.
The caller knows the count and types of the arguments it gives to printf().
To pass that information along, the caller encodes the argument count and types in the format argument. printf() decodes the format to learn the argument count and types. It is very important that the format and the arguments that follow it are consistent.
printf() accepts a variable number of arguments. To process them, it (via va_start()) needs to know which parameter is the last fixed one. It (via va_arg()) also needs to know the type of each argument so it can figure out how much data to read.
The format specifier is also a compact template (or DSL) to express how text and variables should be formatted including field width, alignment, precision, encoding.
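The decoding described above can be sketched with a toy formatter. The name mini_printf is hypothetical, and it understands only %d, %s and %%; a real printf handles far more, but the principle is the same: the format string is the only source of argument count and types.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical mini_printf: understands only %d, %s and %%.
   Returns how many variadic arguments it consumed. */
int mini_printf(const char *fmt, ...)
{
    va_list ap;
    int consumed = 0;

    va_start(ap, fmt);                  /* fmt is the last fixed parameter */
    for (const char *p = fmt; *p != '\0'; p++) {
        if (*p != '%') {
            putchar(*p);                /* ordinary character: copy it out */
            continue;
        }
        p++;                            /* look at the conversion character */
        if (*p == 'd') {
            printf("%d", va_arg(ap, int));
            consumed++;
        } else if (*p == 's') {
            const char *s = va_arg(ap, char *);
            fputs(s, stdout);
            consumed++;
        } else if (*p == '%') {
            putchar('%');
        } else if (*p == '\0') {
            break;                      /* lone '%' at the end of the format */
        }
        /* any other conversion character is silently skipped in this sketch */
    }
    va_end(ap);
    return consumed;
}
```

Calling mini_printf("x=%d y=%s\n", 7, "seven") reads one int and one string pointer because the format says so. If the caller passed the arguments in the other order, the function would have no way to notice.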
A. printf("Values: X=%s Y=%s\n", x,y,z);
B. printf("Values: x=%s, Y=%s\n", x);
Both of the above printf() statements are incorrect: one has extra parameters, other has fewer parameters. I would like to choose between the lesser evil with an explanation. Can a modern C compiler help catch such problems? If yes, how does printf() implementor need to assist the compiler?
Both of the above printf() statements are incorrect: one has extra parameters, other has fewer parameters.
The first one is not incorrect according to the C standard. The rules for function calls in general, in C 2018 6.5.2.2, do not make it an error to pass unused arguments for a ... in the function prototype. For printf specifically, C 2018 7.21.6.1 2 (about fprintf, which the specification for printf refers to) says extra arguments are harmless:
… If the format is exhausted while arguments remain, the excess arguments are evaluated (as always) but are otherwise ignored…
Certainly if a programmer writes printf("Values: X=%s. Y=%s.\n", x, y, z);, they might have made a mistake, and a compiler would be reasonable in pointing out this possibility. However, consider code such as:
printf(ComputedFormat, x, y, z);
Here it is reasonable that we wish to print different numbers of values in different circumstances, and the ComputedFormat reflects this. It would be tedious to write code for each case and dispatch to them with a switch statement. It is simpler to write one call and let the computed format determine how many values are printed. So it is not always an error to have more arguments than the conversion specifications use.
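A minimal sketch of such a computed format follows. The function print_point and its table of formats are assumptions made up for illustration; the point is only that one call site always passes three values and the selected format decides how many are consumed.

```c
#include <stdio.h>

/* Hypothetical helper: prints the first `dims` coordinates (1 to 3).
   The same three arguments are always passed; the chosen format
   decides how many of them printf actually consumes.
   Returns printf's result, or -1 if `dims` is out of range. */
int print_point(int dims, int x, int y, int z)
{
    static const char *const formats[] = {
        "(%d)\n",
        "(%d, %d)\n",
        "(%d, %d, %d)\n",
    };

    if (dims < 1 || dims > 3)
        return -1;

    /* Excess arguments are evaluated but otherwise ignored. */
    return printf(formats[dims - 1], x, y, z);
}
```

Because the format is not a string literal here, the compiler cannot compare it against the arguments, so the table of formats has to be kept consistent by hand.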
I would like to choose between the lesser evil with an explanation.
The behavior of the latter code is not defined by the C standard. C 2018 7.21.6.1 2 also says:
… If there are insufficient arguments for the format, the behavior is undefined…
Thus, no behavior may be relied on from the latter code, unless there is some guarantee from the C implementation.
Can a modern C compiler help catch such problems?
Good modern C compilers have information about the specification of printf and, when the format argument is a string literal, they compare the number and types of the arguments to the conversion specifications in the string.
If yes, how does printf() implementor need to assist the compiler?
The implementor of printf does not need to do anything except conform to the specification of printf in the C standard. The aid described above is performed by the C compiler with reference to the C standard; it does not rely on features of the particular printf implementation.
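The situation is different for a printf-like function you write yourself, because the compiler has no built-in knowledge of it. GCC and Clang provide a format function attribute for this case. It is a compiler extension rather than ISO C, and the wrapper name log_msg and the macro PRINTF_LIKE below are made up for illustration.

```c
#include <stdarg.h>
#include <stdio.h>

/* GCC/Clang extension (not ISO C): declares that parameter `fmt_index`
   is a printf-style format and that the arguments to check start at
   parameter `first_arg`. On other compilers the macro expands to nothing. */
#if defined(__GNUC__)
#define PRINTF_LIKE(fmt_index, first_arg) \
    __attribute__((format(printf, fmt_index, first_arg)))
#else
#define PRINTF_LIKE(fmt_index, first_arg)
#endif

/* Hypothetical wrapper that prints a tag and forwards to vprintf. */
int log_msg(const char *fmt, ...) PRINTF_LIKE(1, 2);

int log_msg(const char *fmt, ...)
{
    va_list ap;
    int n;

    fputs("[log] ", stdout);
    va_start(ap, fmt);
    n = vprintf(fmt, ap);           /* the v* functions accept a va_list */
    va_end(ap);
    return n;                       /* characters written by vprintf only */
}
```

With the attribute in place, a call such as log_msg("%s\n", 42) should draw the same kind of format warning that a direct printf call would, provided format warnings are enabled.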
On some platforms, information about the number of arguments passed is provided to the called routine. On such platforms, a printf implementor could check whether too few arguments are provided and signal an error in some way.
Eric Postpischil has already made a great answer that uses the most reliable source (the C standard), but I just want to post my own answer about why printf may behave as it does in both cases.
printf is a variadic function that can take a variable number of arguments. The only way it knows how many you have passed is through the format string; every time it finds a format specifier, it takes the next argument out of the list (and assumes its type from the specifier used). Nothing really happens to any extra arguments: since there is no specifier for them, the function will not even try to take them and they will not be printed. So the compiler may warn you about the extra arguments, but the behavior in the first example is well-defined.
The second, on the other hand, is definitely undefined behavior. Since there are not enough arguments to match the number of format specifiers in the string, eventually when it finds the second %s, it will try to take the next variadic argument, but the issue is that you haven't passed any. When this happens for me, it prints some garbage value in place of the format specifier that doesn't look too nice. Anything could happen in undefined behavior though. In this case, the function seems to try to take the next variadic argument from a CPU register / the stack (memory) and fetches some garbage value that happened to be there (though again, anything could happen with undefined behavior).
So in short:
printf("%s\n", "Hello", "World");
        |         |     ^^^^^^^ Ignored
        -----------
and
printf("%s\n");   ?
        |         |
        -----------
I am reading http://www.cs.utexas.edu/users/lavender/courses/cs345/lectures/CS345-Lecture-07.pdf to try to understand how the stack activation frame works for functions with variable arguments.
Specifically, how can the called function know how many arguments are being passed?
The slide said:
The va_start procedure computes the fp+offset value following the argument past the last known argument (e.g., const char *format). The rest of the arguments are then computed by calling va_arg, where the ‘ap’ argument to va_arg is some fp+offset value.
My question is: what is fp (the frame pointer)? How does va_start compute the 'fp+offset' value, and how does va_arg get 'some fp+offset value'? And what is va_end supposed to do with the stack?
The function doesn't know how many arguments are passed. At least not in any way that matters, i.e. in C you cannot query for the number of arguments.
That's why all varargs functions must either:
Use a non-varying argument that contains information about the number and types of all variable arguments, like printf() does; or
Use a sentinel value on the end of the variable argument list. I'm not aware of a function in the standard library that does this, but see for instance GTK+'s gtk_list_store_set() function.
Both mechanisms are risky; if your printf() format string doesn't match the arguments, you're going to get undefined behavior. If there was a way for printf() to know the number of passed arguments, it would of course protect against this, but there isn't. So it can't.
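The sentinel approach from the list above can be sketched as follows. The function total_length is hypothetical; it assumes every variadic argument is a char * and that the caller ends the list with (char *)NULL.

```c
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical example: adds up the lengths of the C strings passed.
   The argument list must end with a (char *)NULL sentinel, because
   the function has no other way to find the end of the list. */
size_t total_length(const char *first, ...)
{
    va_list ap;
    size_t total = 0;

    va_start(ap, first);
    for (const char *s = first; s != NULL; s = va_arg(ap, char *))
        total += strlen(s);
    va_end(ap);

    return total;
}
```

A call looks like total_length("ab", "cde", (char *)NULL). The cast matters: a bare NULL may be defined as a plain integer 0, which does not have pointer type when passed through the ... part. Forgetting the sentinel altogether makes the loop read arguments that were never passed, which is undefined behavior.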
The va_start() macro takes as an argument the last non-varying argument, so it can somehow (this is compiler internals, there's no single correct or standard answer, all we can do from this side of the interface is reason from the available data) use that to know where the first varying argument is located on the stack.
The va_arg() macro gets the type as an "argument", which makes it possible to use that to compute the offset, and probably increment some state in the va_list object to point at the next varying argument.
I'm trying to plug a hole in my knowledge. Why do variadic functions require at least two arguments? My impression comes mostly from C's main function having argc as the argument count and then argv as an array of arrays of chars. Also, Objective-C's Cocoa has NSString methods that require a format as the first argument and afterwards an array of arguments ([NSString stringWithFormat:@"%@", foo]). Why is it impossible to create a variadic function accepting only a list of arguments?
argc/argv stuff is not really variadic.
Variadic functions (such as printf()) use arguments put on the stack, and don't require at least 2 arguments, but 1.
You have void foo(char const * fmt, ...) and usually fmt gives a clue about the number of arguments.
That's minimum 1 argument (fmt).
C has very limited reflection abilities, so you must have some way to indicate what the variable arguments contain: either their number, their types, or both. That is the logic behind having one more parameter. ISO C before C23 requires at least one named parameter before the ..., so you can't omit it there. If you feel you don't need any extra parameters because the number and types of the arguments are always constant, then there is no need for variable arguments in the first place.
You could of course design other ways to encode the number / type information inside the variable arguments such as a sentinel value. If you want to do this, you can just supply a dummy value for the first argument and not use it in the method body.
And just to be pedantic about your title, variadic functions only require one argument (not two). It's perfectly valid to make a call to a variadic function without providing any optional arguments:
printf("Hello world");
I think the reason is the following:
in the macro va_start(list, param); you specify the last fixed argument - it is needed to determine the address of the beginning of the variable arguments list on the stack.
How would you then know if the user provided any arguments?
There has to be some information to indicate this, and C in general wasn't designed to do behind-your-back data manipulation. So anything you need, it makes you pass explicitly.
I'm sure if you really wanted to you could try to enforce some scheme whereby the variadic function takes only a certain type of parameter (a list of ints for example) - and then you fill some global variable indicating how many ints you had passed.
Your two examples are not variadic functions. They are functions with two arguments, but they also highlight a similar issue. How can you know the size of a C array without additional information? You can either pass the size of the array, or you can use a scheme with some sentinel value marking the end of the array (e.g. '\0' for a C string).
In both the variadic case and the array case you have the same problem, how can you know how much data you have legitimate access to? If you don't know this with the array case, you will go out of bounds. If you don't know this with the variadic case you will call va_arg too many times, or with the wrong type.
To turn the question around, how would you be able to implement a function taking a variable number of arguments without passing the extra information?
I was wondering how I could do this. I'm mostly puzzled by the N arguments part:
printf("Hello, I'm %i years old and my mom is %i .",me.age(),mom.age());
I want to make a function that will take a formatted string like this and return a std::string.
How is the N arguments part done?
printf is a variadic function; you can implement your own variadic functions using the facilities provided by <stdarg.h>.
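For the <stdarg.h> route, here is a minimal C sketch of a function that formats into a freshly allocated string. The name format_string is made up for illustration, and the sketch assumes a C99 or later library where vsnprintf with a zero size reports the required length.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a malloc'd, null-terminated string,
   or NULL on failure. The caller must free() the result. */
char *format_string(const char *fmt, ...)
{
    va_list ap, ap_copy;
    char *buf = NULL;
    int len;

    va_start(ap, fmt);
    va_copy(ap_copy, ap);                   /* second pass needs a fresh list */

    len = vsnprintf(NULL, 0, fmt, ap);      /* first pass: measure only */
    va_end(ap);

    if (len >= 0) {
        buf = malloc((size_t)len + 1);
        if (buf != NULL)
            vsnprintf(buf, (size_t)len + 1, fmt, ap_copy);  /* second pass */
    }
    va_end(ap_copy);
    return buf;
}
```

The N-arguments part is handled entirely by the va_list being forwarded to vsnprintf, which walks it according to the format. The function itself never learns how many arguments it received.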
In C++, you should avoid variadic functions wherever possible. They are quite limited in what types they can accept as arguments and they are not type safe. C++0x adds variadic templates to C++; once support for this feature is widespread, you'll be able to write type safe variadic functions.
In the meantime, it's best to use some other type safe method. Boost.Format, for example, overloads the % operator to perform formatting.