order of evaluation of variadic arguments? - c

I understand that for an ordinary C function:
g1(f1(), f2(), f3())
the order of evaluating the arguments is unspecified: f3() might be called before f1() or vice versa.
But does the same hold true for variadic functions? The definition of va_arg() says:
Each invocation of the va_arg macro modifies ap to point to the next variable argument.
Though it doesn't specify what 'next' means (left to right or right to left?), common use cases make it seem likely that it means left to right.
Furthermore, can one assume that the required first argument is evaluated before (or after) the variable arguments? Or is that also unspecified?
void g2(int a, ...);
I suppose a strict reading of this says that one cannot assume any particular order of evaluation. But it certainly would make writing functions like printf() much more difficult, if not intractable.

The order of evaluation of function arguments does not (strictly) depend on whether the function in question is variadic or not. In both cases, the order is unspecified.
The description of va_arg is just telling you what order it reads the arguments in. By the time you're inside the variadic function, the arguments were already evaluated and passed to the function.

Each invocation of the va_arg macro modifies ap to point to the next variable argument.
Here "next" means left to right, over the arguments passed to the .... It describes the order in which the variadic function reads the argument values, not the order in which the caller evaluated them.
Furthermore, can one assume that the required first argument is evaluated before (or after) the variable arguments?
This is also unspecified. There is no exception for variadic arguments when it comes to order of evaluation, AFAIK.
The arguments are all evaluated when you call the function, although the order is not specified. By the time you use the va_arg macro, all the arguments have already been evaluated.
It's the same as with a function f(int a, int b): both a and b have already been evaluated by the time execution reaches the body of f.

I understand that for an ordinary C function [...] the order of evaluating the arguments is unspecified [...] But does the same hold true for variadic functions? The definition of va_arg() says [...]
C specifies that
There is a sequence point after the evaluations of the function designator and the actual arguments but before the actual call
(C17 6.5.2.2/7)
That applies to all functions, including variadic ones. Among other things, it means that the specifications for the va_arg macro are irrelevant to the order of evaluation of the actual arguments to the function in which that macro appears. All actual arguments to the function are evaluated before execution of the function body begins.
The only distinction C draws on the calling side between variadic functions and non-variadic functions is in the argument type conversions that apply. The variable arguments to a variadic function (or to a function without an in-scope prototype) are subject to the default argument promotions, whereas that is not the case for non-variadic arguments to functions with in-scope prototypes.
Furthermore, can one assume that the required first argument is evaluated before (or after) the variable arguments? Or is that also unspecified?
It is also unspecified. Again, the only distinction C makes between the semantics of calling a variadic function and of calling a non-variadic one is to do with rules for argument type promotions. And that's not really so much a distinction as it is covering cases that don't otherwise arise, and even then it is in a manner that is consistent with other C semantics for arguments whose types are not specified via a function prototype.
I suppose a strict reading of this says that one cannot assume any particular order of evaluation. But it certainly would make writing functions like printf() much more difficult, if not intractable.
No "strict" reading is required. C does not specify any rules for the relative order of evaluation of arguments to the same function call. Period. But that causes no particular issue with the implementation of variadic functions, because all the arguments are evaluated before execution of the function body starts. Variadic functions are subject to constraints on the order in which the values of the variable arguments are read within the function, but that has nothing to do with the order in which the argument expressions are evaluated on the caller's side.

Related

atomic arguments order of execution in C

I am trying to work with the stdatomic.h functions, specifically atomic_flag_test_and_set. I am not seeing any errors, but want to know if what I am doing is always safe. I have a struct like the following:
typedef struct Mystruct {
    int somedata;
    atomic_flag flag;
} Mystruct;
Later, when I create a mystruct and use its instance of the flag, I do so like this:
if (atomic_flag_test_and_set(&mystructInstance->flag)) {
    // do something
}
Is the evaluation of &mystructInstance->flag always completed before the check for the atomic operation? I would assume so since it should be one processor instruction (or something that emulates one processor instruction), but I want to make sure.
Is the evaluation of &mystructInstance->flag always completed before the check for the atomic operation?
The answer to this question can be found in the section on "Function calls" in the C standard.
6.5.2.2 Function calls
...
4. An argument may be an expression of any complete object type. In preparing for the call to a function, the arguments are evaluated, and each parameter is assigned the value of the corresponding argument.
Also note that if a function takes more than one parameter, the order of evaluation of the arguments passed to it is unspecified. This is also mentioned in the same section of the standard.
10. There is a sequence point after the evaluations of the function designator and the actual arguments but before the actual call. Every evaluation in the calling function (including other function calls) that is not otherwise specifically sequenced before or after the execution of the body of the called function is indeterminately sequenced with respect to the execution of the called function.

Is C compiler allowed to optimize away unused function arguments?

I know that how arguments are passed to functions is not part of the C standard, and is dependent on the hardware architecture and calling convention.
I also know that an optimizing compiler may automatically inline functions to save on call overhead, and omit code that has no "side effects".
But, I have a question about a specific case:
Let's say there is a non-trivial function that cannot be inlined or removed, and must be called, that is declared to take no arguments:
int veryImportantFunc() {
    /* do some important stuff */
    return result;
}
But this function is called with arguments:
int result = veryImportantFunc(1, 2, 3);
Is the compiler allowed to call the function without passing these arguments?
Or is there some standard or technical limitation that would prevent this kind of optimization?
Also, what if argument evaluation has side effects:
int counter = 1;
int result = veryImportantFunc(1, ++counter, 3);
Is the compiler obligated to evaluate the argument even without passing the result, or would it be legal to drop the evaluation, leaving counter == 1?
And finally, what about extra arguments:
char* anotherFunc(int answer) {
/* Do stuff */
return question;
}
if this function is called like this:
char* question = anotherFunc(42, 1);
Can the 1 be dropped by the compiler based on the function declaration?
EDIT: To clarify: I have no intention of writing the kind of code that is in my examples, and I did not find this in any code I am working on.
This question is to learn about how compilers work and what the relevant standards say, so to all of you who advised me to stay away from this kind of code: thank you, but I already know that.
To begin with, "declared to take no arguments" is wrong: int veryImportantFunc() declares a function taking an unspecified number of arguments. This is obsolescent C style and shouldn't be used. For a function taking no arguments, declare it as int veryImportantFunc(void).
Is the compiler allowed to call the function without passing these arguments?
If the actual function definition does not match the number of arguments, the behavior is undefined.
Also, what if argument evaluation has side effects
Doesn't matter, since arguments are evaluated (in unspecified order) before the function is called.
Is the compiler obligated to evaluate even without passing the result, or would it be legal to drop the evaluation leaving counter == 1?
It will evaluate the arguments and then invoke undefined behavior. Anything can happen.
And finally, what about extra arguments:
Your example won't compile, as it isn't valid C.
The following quotes from the C standard are relevant to your different questions:
6.5.2.2 Function calls
...
2. If the expression that denotes the called function has a type that includes a prototype, the number of arguments shall agree with the number of parameters.
...
4. An argument may be an expression of any complete object type. In preparing for the call to a function, the arguments are evaluated, and each parameter is assigned the value of the corresponding argument.
...
6. If the expression that denotes the called function has a type that does not include a prototype, the integer promotions are performed on each argument, and arguments that have type float are promoted to double. These are called the default argument promotions. If the number of arguments does not equal the number of parameters, the behavior is undefined. If the function is defined with a type that includes a prototype, and either the prototype ends with an ellipsis (, ...) or the types of the arguments after promotion are not compatible with the types of the parameters, the behavior is undefined. If the function is defined with a type that does not include a prototype, and the types of the arguments after promotion are not compatible with those of the parameters after promotion, the behavior is undefined.
...
10. There is a sequence point after the evaluations of the function designator and the actual arguments but before the actual call. Every evaluation in the calling function (including other function calls) that is not otherwise specifically sequenced before or after the execution of the body of the called function is indeterminately sequenced with respect to the execution of the called function.
Lets say there is a non trivial function that can not be inlined or removed, and must be called, that is declared to take no arguments:
int veryImportantFunc() {
    /* do some important stuff */
    return result;
}
But this function is called with arguments:
There are two possibilities:
the function is declared with a "full" prototype such as
int veryImportantFunc(void);
in this case the call with extra arguments won't compile, as the number of parameters and arguments must match;
the function is declared as taking an unspecified number of arguments, i.e. the declaration visible to the call site is
int veryImportantFunc();
in this case, the call is undefined behavior, as the usage doesn't match the actual function definition.
All the other considerations about optimization aren't particularly interesting, as what you are trying to do is illegal however you look at it.
We can stretch this and imagine a situation where passing extra useless arguments is legal, for example a variadic function never using the extra arguments.
In this case, as always, the compiler is free to perform any such optimization as long as the observable behavior isn't impacted, and proceeds "as if" performed according to the C abstract machine.
Given that the details of argument passing aren't observable 1), the compiler could in principle optimize away the argument passing, while the argument evaluation may still need to be done if it has some observable impact on the program state.
That being said, I have a hard time imagining how such optimization may be implemented in the "classical linking model", but with LTCG it shouldn't be impossible.
1) The only observable effects according to the C standard are I/O and reads/writes on volatile objects.
Following Pascal's wager, it is better to be wrong in believing the compiler can make this optimisation than to be right in believing it doesn't. It serves no purpose to wrongly declare a function; if you really must, you can always put a stub in front of it:
enum { Dufus };                      /* placeholder return value for the sketch */
#define Used(x) ((void)(x))          /* mark a parameter as deliberately unused */

int RealSlimShady(void) {
    return Dufus;
}

int MaybeSlimShady(int Mathew, int Mathers) {
    Used(Mathew);
    Used(Mathers);
    return RealSlimShady();
}
Everyone is happy, and if your compiler is worth its salt, there will be 0 code overhead.

Guarantee of non-equality of pointers to standard functions?

Does the C language guarantee that pointers to differently-named standard functions must compare not-equal?
Per 6.5.9 Equality Operators, ¶6,
Two pointers compare equal if and only if both are null pointers, both are pointers to the same object (including a pointer to an object and a subobject at its beginning) or function, ...
I seem to recall seeing an interpretation claiming that aliases (multiple identifiers for the "same function") are permissible for standard functions, the canonical candidates for such treatment being getc==fgetc and putc==fputc; however, I don't know where I might have seen it, and I'm skeptical of the concept.
Is there any official interpretation or well-accepted argument for or against this possibility?
No, I don't believe there is any such guarantee.
I believe this whole discussion originates from the part of the standard which allows a function to also be defined as a macro with the same name.
From C17 7.1.4:
Any function declared in a header may be additionally implemented as a function-like macro defined in the header, so if a library function is declared explicitly when its header is included, one of the techniques shown below can be used to ensure the declaration is not affected by such a macro. Any macro definition of a function can be suppressed locally by enclosing the name of the function in parentheses, because the name is then not followed by the left parenthesis that indicates expansion of a macro function name. For the same syntactic reason, it is permitted to take the address of a library function even if it is also defined as a macro. 189)
189) This means that an implementation shall provide an actual function for each library function, even if it also provides a
macro for that function.
The text goes on describing how users may #undef the macro name if they want to be guaranteed that they get an actual function.
So it is allowed for the implementation to have a standard function and a macro with the same name. But what the macro then expands to is implementation-defined. It may very well be an internal function with the same address as what another library macro expands to.
Based on that, I don't believe there are any guarantees that different functions have different addresses.
In the specific case of getc, the standard says (C17 7.21.7.5):
The getc function is equivalent to fgetc, except that if it is implemented as a macro, it may evaluate stream more than once, so the argument should never be an expression with side effects.
I would say it is somewhat likely that the implementation calls the same actual function for fgetc and getc when these are implemented as macros. (Or that atoi versus strtol call the same function, etc etc). The standard library implementations I have peeked at don't seem to do it this way, but I don't think there is anything in the standard stopping them.
(As a side note, taking the address of library functions may not be a good idea for other reasons, namely that it may block inlining of that function within the same translation unit.)
You are running into an implementation detail here. The standard only specifies the behaviour of the functions of the standard library.
For getc the spec says (emphasis mine):
The getc function is equivalent to fgetc, except that if it is implemented as a macro, it may evaluate stream more than once, so the argument should never be an expression with side effects.
So the implementation may implement getc as a macro, but it may also implement it as an alias of fgetc, or as a different function with the same behaviour. Long story short, you cannot rely on &getc == &fgetc being either true or false.
The only thing that the standard requires is that &getc must be defined, per 7.1.4 § 1:
... it is permitted to take the address of a library function even if it is also defined as
a macro...
That just means that the implementation must provide a function of that name, but it could:
be the fgetc function itself - then &fgetc == &getc is true
be a distinct function with the same behaviour - then &fgetc == &getc is false
be a distinct function that merely calls fgetc - then &fgetc == &getc is still false

How does the stack activation frame for variable-argument functions work?

I am reading http://www.cs.utexas.edu/users/lavender/courses/cs345/lectures/CS345-Lecture-07.pdf to try to understand how the stack activation frame for variable-argument functions works.
Specifically, how can the called function know how many arguments are being passed?
The slide said:
The va_start procedure computes the fp+offset value following the argument past the last known argument (e.g., const char format). The rest of the arguments are then computed by calling va_arg, where the 'ap' argument to va_arg is some fp+offset value.
My question is: what is fp (the frame pointer)? How does va_start compute the fp+offset values?
How does va_arg get some fp+offset value? And what is va_end supposed to do with the stack?
The function doesn't know how many arguments are passed. At least not in any way that matters, i.e. in C you cannot query for the number of arguments.
That's why all varargs functions must either:
Use a non-varying argument that contains information about the number and types of all variable arguments, like printf() does; or
Use a sentinel value at the end of the variable argument list. I'm not aware of a function in the standard library that does this, but see for instance GTK+'s gtk_list_store_set() function.
Both mechanisms are risky; if your printf() format string doesn't match the arguments, you're going to get undefined behavior. If there was a way for printf() to know the number of passed arguments, it would of course protect against this, but there isn't. So it can't.
The va_start() macro takes as an argument the last non-varying argument, so it can somehow (this is compiler internals, there's no single correct or standard answer, all we can do from this side of the interface is reason from the available data) use that to know where the first varying argument is located on the stack.
The va_arg() macro gets the type as an "argument", which makes it possible to use that to compute the offset, and probably increment some state in the va_list object to point at the next varying argument.

Order of evaluation of arguments in function calling?

I am studying undefined behavior in C and I came across a statement saying that
there is no particular order of evaluation of function arguments
but then what about standard calling conventions like _cdecl and _stdcall, whose definitions (in a book) say that arguments are evaluated from right to left?
Now I am confused: one statement, about unspecified behavior, says something different from the other, which is about the definition of a calling convention. Please reconcile the two.
As Graznarak's answer correctly points out, the order in which arguments are evaluated is distinct from the order in which arguments are passed.
An ABI typically applies only to the order in which arguments are passed, for example which registers are used and/or the order in which argument values are pushed onto the stack.
What the C standard says is that the order of evaluation is unspecified. For example (remembering that printf returns an int result):
some_func(printf("first\n"), printf("second\n"));
the C standard says that the two messages will be printed in some order (evaluation is not interleaved), but explicitly does not say which order is chosen. It can even vary from one call to the next, without violating the C standard. It could even evaluate the first argument, then evaluate the second argument, then push the second argument's result onto the stack, then push the first argument's result onto the stack.
An ABI might specify which registers are used to pass the two arguments, or exactly where on the stack the values are pushed, which is entirely consistent with the requirements of the C standard.
But even if an ABI actually requires the evaluation to occur in a specified order (so that, for example, printing "second\n" followed by "first\n" would violate the ABI) that would still be consistent with the C standard.
What the C standard says is that the C standard itself does not define the order of evaluation. Some secondary standard is still free to do so.
Incidentally, this does not by itself involve undefined behavior. There are cases where the unspecified order of evaluation can lead to undefined behavior, for example:
printf("%d %d\n", i++, i++); /* undefined behavior! */
Argument evaluation and argument passing are related but different problems.
Arguments are passed in a fixed order and to fixed locations determined by the ABI, often with some arguments passed in registers rather than on the stack. This is what the ABI, _cdecl, and _stdcall specify.
The order of evaluation of arguments before placing them in the locations that the function call requires is unspecified. It can evaluate them left to right, right to left, or some other order. This is compiler dependent and may even vary depending on optimization level.
_cdecl and _stdcall merely specify that the arguments are pushed onto the stack in right-to-left order, not that they are evaluated in that order. Think about what would happen if calling conventions like _cdecl, _stdcall, and pascal changed the order that the arguments were evaluated.
If evaluation order were modified by calling convention, you would have to know the calling convention of the function you're calling in order to understand how your own code would behave. That's a leaky abstraction if I've ever seen one. Somewhere, buried in a header file someone else wrote, would be a cryptic key to understanding just that one line of code; but you've got a few hundred thousand lines, and the behavior changes for each one? That would be insanity.
I feel like much of the undefined behavior in C89 arose from the fact that the standard was written after multiple conflicting implementations existed. They were maybe more concerned with agreeing on a sane baseline that most implementers could accept than they were with defining all behavior. I like to think that all undefined behavior in C is just a place where a group of smart and passionate people agreed to disagree, but I wasn't there.
I'm tempted now to fork a C compiler and make it evaluate function arguments as if they're a binary tree that I'm running a breadth-first traversal of. You can never have too much fun with undefined behavior!
Check the book you mentioned for any references to "Sequence points", because I think that's what you're trying to get at.
Basically, a sequence point is a point at which you are certain that all preceding expressions have been fully evaluated and all of their side effects have completed.
For example, the end of an initializer is a sequence point. This means that after:
bool foo = !(i++ > j);
You are sure that i will be equal to i's initial value +1, and that foo has been assigned true or false. Another example:
int bar = i++ > j ? i : j;
Is perfectly predictable. It reads as follows: compare the current value of i with j, then increment i (the question mark is a sequence point, so the increment completes before the second or third operand is evaluated); if the comparison was true, assign the new value of i to bar, otherwise assign j. This is because the question mark in the ternary operator is a sequence point.
All sequence points listed in the C99 standard (Annex C) are:
The following are the sequence points described in 5.1.2.3:
— The call to a function, after the arguments have been evaluated (6.5.2.2).
— The end of the first operand of the following operators: logical AND && (6.5.13);
logical OR || (6.5.14); conditional ? (6.5.15); comma , (6.5.17).
— The end of a full declarator: declarators (6.7.5);
— The end of a full expression: an initializer (6.7.8); the expression in an expression
statement (6.8.3); the controlling expression of a selection statement (if or switch)
(6.8.4); the controlling expression of a while or do statement (6.8.5); each of the
expressions of a for statement (6.8.5.3); the expression in a return statement
(6.8.6.4).
— Immediately before a library function returns (7.1.4).
— After the actions associated with each formatted input/output function conversion
specifier (7.19.6, 7.24.2).
— Immediately before and immediately after each call to a comparison function, and
also between any call to a comparison function and any movement of the objects
passed as arguments to that call (7.20.5).
What this means, in essence, is that any expression that is not followed by a sequence point can invoke undefined behaviour, like, for example:
printf("%d, %d and %d\n", i++, i++, i--);
In this statement, the sequence point that applies is "The call to a function, after the arguments have been evaluated". After the arguments are evaluated. If we then look at the semantics, in the same standard under 6.5.2.2, point ten, we see:
10 The order of evaluation of the function designator, the actual arguments, and
subexpressions within the actual arguments is unspecified, but there is a sequence point
before the actual call.
That means for i = 1, the values that are passed to printf could be:
1, 2, 3 // left to right
But equally valid would be:
1, 0, 1 // evaluated i-- first
// or
1, 1, 2 // evaluated i-- second
What you can be sure of is that the new value of i after this call will be 2.
But all of the values listed above are, theoretically, equally valid, and 100% standard compliant.
But the appendix on undefined behaviour explicitly lists this as being code that invokes undefined behaviour, too:
Between two sequence points, an object is modified more than once, or is modified
and the prior value is read other than to determine the value to be stored (6.5).
In theory, your program could crash; instead of printing 1, 2 and 3, the output "666, 666 and 666" would be possible, too.
So finally I found it.
It is because the arguments are passed after they are evaluated, so passing arguments is a completely different story from evaluating them. A C compiler, traditionally built to maximize speed and optimization, can evaluate the expressions in any order.
So argument passing and argument evaluation are different stories altogether.
Since the C standard does not specify any order for evaluating arguments, every compiler implementation is free to adopt one. That's one reason why coding something like foo(i++, i++) is complete insanity - you may get different results when switching compilers.
One other important thing which has not been highlighted here: even if your favorite ARM compiler happens to evaluate parameters left to right, and does so consistently across versions, that ordering is merely a convention of that implementation, not something the standard guarantees.
