Order of evaluation of arguments in function calling? - c

I am studying about undefined behavior in C and I came to a statement that states that
there is no particular order of evaluation of function arguments
but then what about the standard calling conventions like _cdecl and _stdcall, whose definition (in a book) says that arguments are evaluated from right to left?
Now I am confused by these two statements: the one about UB says something different from the one in the definition of the calling conventions. Please reconcile the two.

As Graznarak's answer correctly points out, the order in which arguments are evaluated is distinct from the order in which arguments are passed.
An ABI typically applies only to the order in which arguments are passed, for example which registers are used and/or the order in which argument values are pushed onto the stack.
What the C standard says is that the order of evaluation is unspecified. For example (remembering that printf returns an int result):
some_func(printf("first\n"), printf("second\n"));
the C standard says that the two messages will be printed in some order (evaluation is not interleaved), but explicitly does not say which order is chosen. It can even vary from one call to the next, without violating the C standard. It could even evaluate the first argument, then evaluate the second argument, then push the second argument's result onto the stack, then push the first argument's result onto the stack.
An ABI might specify which registers are used to pass the two arguments, or exactly where on the stack the values are pushed, which is entirely consistent with the requirements of the C standard.
But even if an ABI actually requires the evaluation to occur in a specified order (so that, for example, printing "second\n" followed by "first\n" would violate the ABI) that would still be consistent with the C standard.
What the C standard says is that the C standard itself does not define the order of evaluation. Some secondary standard is still free to do so.
Incidentally, this does not by itself involve undefined behavior. There are cases where the unspecified order of evaluation can lead to undefined behavior, for example:
printf("%d %d\n", i++, i++); /* undefined behavior! */

Argument evaluation and argument passing are related but different problems.
Arguments tend to be passed left to right, often with some arguments passed in registers rather than on the stack. This is what is specified by the ABI and _cdecl and _stdcall.
The order of evaluation of arguments before placing them in the locations that the function call requires is unspecified. It can evaluate them left to right, right to left, or some other order. This is compiler dependent and may even vary depending on optimization level.

_cdecl and _stdcall merely specify that the arguments are pushed onto the stack in right-to-left order, not that they are evaluated in that order. Think about what would happen if calling conventions like _cdecl, _stdcall, and pascal changed the order that the arguments were evaluated.
If evaluation order were modified by calling convention, you would have to know the calling convention of the function you're calling in order to understand how your own code would behave. That's a leaky abstraction if I've ever seen one. Somewhere, buried in a header file someone else wrote, would be a cryptic key to understanding just that one line of code; but you've got a few hundred thousand lines, and the behavior changes for each one? That would be insanity.
I feel like much of the undefined behavior in C89 arose from the fact that the standard was written after multiple conflicting implementations existed. They were maybe more concerned with agreeing on a sane baseline that most implementers could accept than they were with defining all behavior. I like to think that all undefined behavior in C is just a place where a group of smart and passionate people agreed to disagree, but I wasn't there.
I'm tempted now to fork a C compiler and make it evaluate function arguments as if they're a binary tree that I'm running a breadth-first traversal of. You can never have too much fun with undefined behavior!

Check the book you mentioned for any references to "Sequence points", because I think that's what you're trying to get at.
Basically, a sequence point is a point at which you can be certain that all preceding expressions have been fully evaluated and all of their side effects are complete.
For example, the end of an initializer is a sequence point. This means that after:
bool foo = !(i++ > j);
You are sure that i will be equal to i's initial value +1, and that foo has been assigned true or false. Another example:
int bar = i++ > j ? i : j;
Is perfectly predictable. It reads as follows: compare the current value of i with j, then increment i (the question mark is a sequence point, so the increment is complete once the comparison is done). If the comparison was true, assign i (the NEW value) to bar; otherwise assign j. This works because the question mark in the ternary operator is also a valid sequence point.
All sequence points listed in the C99 standard (Annex C) are:
The following are the sequence points described in 5.1.2.3:
— The call to a function, after the arguments have been evaluated (6.5.2.2).
— The end of the first operand of the following operators: logical AND && (6.5.13);
logical OR || (6.5.14); conditional ? (6.5.15); comma , (6.5.17).
— The end of a full declarator: declarators (6.7.5);
— The end of a full expression: an initializer (6.7.8); the expression in an expression
statement (6.8.3); the controlling expression of a selection statement (if or switch)
(6.8.4); the controlling expression of a while or do statement (6.8.5); each of the
expressions of a for statement (6.8.5.3); the expression in a return statement
(6.8.6.4).
— Immediately before a library function returns (7.1.4).
— After the actions associated with each formatted input/output function conversion
specifier (7.19.6, 7.24.2).
— Immediately before and immediately after each call to a comparison function, and
also between any call to a comparison function and any movement of the objects
passed as arguments to that call (7.20.5).
What this means, in essence, is that any expression that modifies an object more than once without an intervening sequence point can invoke undefined behaviour, like, for example:
printf("%d, %d and %d\n", i++, i++, i--);
In this statement, the sequence point that applies is "The call to a function, after the arguments have been evaluated". After the arguments are evaluated. If we then look at the semantics, in the same standard under 6.5.2.2, point ten, we see:
10 The order of evaluation of the function designator, the actual arguments, and
subexpressions within the actual arguments is unspecified, but there is a sequence point
before the actual call.
That means for i = 1, the values that are passed to printf could be:
1, 2, 3//left to right
But equally valid would be:
1, 0, 1//evaluated i-- first
//or
1, 2, 1//evaluated i-- second
What you can be sure of is that the new value of i after this call will be 2.
But all of the values listed above are, theoretically, equally valid, and 100% standard compliant.
But the appendix on undefined behaviour explicitly lists this as being code that invokes undefined behaviour, too:
Between two sequence points, an object is modified more than once, or is modified
and the prior value is read other than to determine the value to be stored (6.5).
In theory, your program could crash instead of printing 1, 2 and 3; the output "666, 666 and 666" would be possible, too.

So finally I found it.
It is because the arguments are passed after they are evaluated, so passing arguments is a completely different story from evaluating them. A C compiler, traditionally built to maximize speed and optimization, can evaluate the expressions in any order.
So argument passing and argument evaluation are different stories altogether.

Since the C standard does not specify any order for evaluating parameters, every compiler implementation is free to adopt one. That's one reason why coding something like foo(i, i++) is complete insanity: you may get different results when switching compilers.
One other important thing which has not been highlighted here: if your favorite ARM compiler evaluates parameters left to right, it will most likely keep doing so, although the standard does not require that. The order in which a compiler reads parameters is merely a convention...

Related

When does a function call copy its pass-by-value arguments relative to the argument sequences?

I would like to understand this code snippet as much as the undefined behavior permits it:
int i = 0;
printf("%d %d %d", i, ++i, i++);
output:
2 2 0
From what I can tell:
a comma , defines a sequence
the actual printing happens when all of the sequences are evaluated inside the function argument call
since arguments are pass-by-value, a copy happens sometime(?!) while calling the function
the order in which the function argument sequences evaluated is undefined ( is this true? )
So as far as I can tell most of the behavior in that single line of code is undefined, still I would like to understand the parts that are NOT undefined behavior.
I know the output depends on the compiler, but what parts are there that are defined in the C Standard?
I am interested in ANSI C, C99 as well, but I believe latest C++ standards improved on this at least in some aspects, is that true?
a comma , defines a sequence
No. Don't confuse the comma operator (What does the comma operator , do?) for a function argument list. The actual , symbol is used in a lot of different places in the C syntax, but the only place where it has a well-defined order of evaluation is when it's used for the actual comma operator. Neither function call argument lists nor initializer lists have well-defined orders.
The C standard says this about function calls and arguments (C17 6.5.2.2):
There is a sequence point after the evaluations of the function designator and the actual arguments but before the actual call. Every evaluation in the calling function (including other function calls) that is not otherwise specifically sequenced before or after the execution of the body of the called function is indeterminately sequenced with respect to the execution of the called function.
In your case the 4 arguments "%d %d %d", i, ++i, i++ are unsequenced in relation to each other, and the sequence point, as mentioned above, is placed after the evaluation of the arguments.
To address your other statements:
the actual printing happens when all of the sequences are evaluated inside the function argument call
Correct, but then undefined behavior has already struck.
since arguments are pass-by-value, a copy happens sometime(?!) while calling the function
You can assume that these are pass by value indeed. printf is an icky variadic function so they end up in a va_list, which I believe has an implementation-defined representation internally.
the order in which the function argument sequences evaluated is undefined ( is this true? )
The order of evaluation of function arguments is unspecified behavior, meaning we can't know the order - it can be undocumented - and therefore we shouldn't assume a particular order. The actual undefined behavior happens because of 6.5/2:
If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined. If there are multiple allowable orderings of the subexpressions of an expression, the behavior is undefined if such an unsequenced side effect occurs in any of the orderings.
I would like to understand the parts that are NOT undefined behavior.
I'm not sure what you mean by that, since code can't be "just a little bit undefined". If it has undefined behavior, then all bets are off regarding anything in that program. You can't reason about it or it wouldn't be undefined, but something else. Therefore it doesn't make sense to reason like "undefined behavior turns this value to 55 and then it is passed by value". For example, maybe the UB causes the whole function call to be optimized away or get inlined in strange ways.

What's the consequence of a sequence-point "immediately before a library function returns"?

In this recent question, some code was shown to have undefined behavior:
a[++i] = foo(a[i-1], a[i]);
because even though the actual call of foo() is a sequence point, the assignment is unsequenced, so you don't know whether the function is called after the side-effect of ++i took place or before that.
Thinking further about this, the sequence point at a function call only guarantees that side effects from evaluating the function arguments are carried out once the function is entered, e.g.
int y = 1;
int func1(int x) { return x + y; }
int main(void)
{
int result = func1( y++ ); // guaranteed to be 3
}
But looking at the standard, there's also §7.1.4 p3 (in the chapter about the standard library):
There is a sequence point immediately before a library function returns.
My question here is: What's the consequence of this paragraph? Why does it only concern library functions and what kind of code would actually rely on that?
Simple ideas like (nonsensical code to follow)
errno = 0;
long result = ftell(file) * errno;
would still be undefined as this time, the multiplication is unsequenced. I'm looking for an example that makes use of this special guarantee §7.1.4 p3 makes for library functions.
Regarding the suggested duplicate, Sequence point after a return statement?, this is indeed closely related and I found it before asking this question. It's not a duplicate, because
it asks about normative text stating there is a sequence point immediately after a return, without asking about the consequences when there is one.
it only mentions the special rule for library functions this question is about, without further elaborating on it.
Consequently, my questions here are not answered over there. The accepted answer uses a return value in an unsequenced expression (in this case an addition) and explains how the result depends on the sequencing of this addition, only finding that if you knew the sequencing of the addition, the whole result would be defined with a sequence point immediately after return. It doesn't show an example of code that is actually defined because of this rule, and it doesn't say anything about how/why library functions are special.
Library functions don't have the code that implements them covered by the standard (they might not even be implemented in C). The standard only specifies their behaviour. So the provision about return statements does not apply to implementation of library functions.
The purpose of this clause (in combination with there being a sequence point on entry of a library function) is to say that any side-effects of the library functions are sequenced either before or after any other evaluations that might be in the code which calls the library function.
So the example in your question is not undefined behaviour (unless the multiplication overflows!): the read of errno is either sequenced before or after the modification by ftell, it's unspecified which.

Behavior of increment operator as parameter to function

Why does the following code produce this output:
Code Snippet:
int a=10;
printf("%d%d%d",++a,a++,++a);
Output:
131113
How are parameters evaluated? Does this depend upon the compiler? I use the gcc compiler. Can anybody tell me how my compiler evaluated it? If the compiler evaluates parameters of a function from right to left, will the output of this code be the following:
121111
That is Undefined Behavior. Modifying a variable more than once between two sequence points is Undefined Behavior.
From Wiki:
A sequence point defines any point in a computer program's execution
at which it is guaranteed that all side effects of previous
evaluations will have been performed, and no side effects from
subsequent evaluations have yet been performed.
Please read: Operator Precedence vs Order of Evaluation
For a thorough understanding please read: Undefined Behavior and Sequence Points
You can do this though:
int a=10;
int b = (++a,a++,++a);
Because the Comma (,) is an operator here and not just a mere separator between the arguments as in case of printf.
This is actually a combination of both unspecified behavior and undefined behavior.
It is unspecified because the order of evaluation of function arguments is not specified so in this line:
printf("%d%d%d",++a,a++,++a);
       ^        ^   ^   ^
       1        2   3   4
we do not know in which order function arguments 1 to 4 will be evaluated, and if this were in a loop the order could even differ on subsequent executions. This is covered in the C99 draft standard section 6.5.2.2 Function calls paragraph 10, which says (emphasis mine):
The order of evaluation of the function designator, the actual arguments, and
subexpressions within the actual arguments is unspecified, but there is a sequence point
before the actual call.
By itself this means the output of the program is unreliable, but we also have undefined behavior because the code above modifies a multiple times between two sequence points. This is covered by section 6.5 Expressions paragraph 2, which says (emphasis mine):
Between the previous and next sequence point an object shall have its stored value
modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored.
Undefined behavior means that anything could result from running this program including that it appears to work correctly, the term is defined in section 3.4.3 and includes the following note which explains the possible result of undefined behavior:
Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).
Beyond that even if you could obtain reliable results code like that is hard to read and would be a nightmare to maintain in a complex project. If there are alternative simpler ways to write the code that meets other requirements such as performance etc... then that is the approach you should take.

I am confused between True and False .. does a True value stands for Non-Zero and False value stands for Zero? [duplicate]

Why does the program below give me the opposite answer after the comparison operations are done?
main()
{
int k=35;
printf("%d\n%d\n%d",k==35,k=50,k<40);
}
output
0
50
1
This program is not a valid C program as per the C standard.
There are 2 problems associated with this program.
Problem 1: Unspecified Behavior
The order of evaluation of arguments to a function is Unspecified[Ref 1].
It could be left to right or
It could be right to left or
Any other magical order
Problem 2: Undefined Behavior
This has undefined behavior[Ref 2] because k is modified by k=50 and also read by k==35 and k<40 without an intervening sequence point, and those reads are not used to determine the value being stored. Note that , in the function arguments does not introduce a sequence point. Thus k is modified and read without an intervening sequence point, which causes Undefined Behavior.
So you cannot rely on the behavior to be anything specific in this case. The program is not a valid C program.
[Ref 1]
C99 Standard 6.5.2.2.10:
The order of evaluation of the function designator, the actual arguments, and
subexpressions within the actual arguments is unspecified, but there is a sequence point
before the actual call.
[Ref 2]
C99 Standard 6.5 paragraph 2:
Between the previous and next sequence point an object shall have its stored value
modified at most once by the evaluation of an expression. Furthermore, the prior value
shall be accessed only to determine the value to be stored.
Note that Unspecified and Undefined Behavior are terms defined by the standard as:
C99 Standard 3.4.4 Unspecified Behavior:
behavior where this International Standard provides two or more possibilities and
imposes no requirements on which is chosen in any instance
C99 Standard 3.4.3 Undefined Behavior:
behavior, upon use of a nonportable or erroneous program construct, of erroneous data, or
of indeterminately valued objects, for which this International Standard imposes no
requirements
Did you notice that the second argument to printf is k=50? This is undefined behavior: k is modified and also read without an intervening sequence point, and the order of evaluation of the arguments is unspecified.
The order of evaluation of function arguments is not defined by the C standard. See C99 §6.5.2.2p10:
The order of evaluation of the function designator, the actual
arguments, and subexpressions within the actual arguments is
unspecified, but there is a sequence point before the actual call.
This means that each of the comparison k==35, the assignment k=50, and the test k<40 can happen in any order. When I tried your program using MSVC, the assignment happened first. Other compilers, or even other invocations of the same compiler, may choose different orders.
I wish you'd shown your output. However, my suspicion is that the problem is that you've included an assignment as one of the arguments to printf(), and heavens knows what order the three arguments were evaluated, i.e. k might have been 50 when the k==35 was evaluated ;-)

Explain the order of evaluation in printf [duplicate]

main()
{
int i=5;
printf("%d%d%d%d%d%d",i++,i--,++i,--i,i);
}
Output is 45545, but I don't know how it is working. Some say that the arguments in a function call are pushed onto the stack from left to right.
The evaluation order of function parameters is unspecified.
From c99 standard:
6.5.2.2 Function calls
10/ The order of evaluation of the function designator, the actual
arguments, and subexpressions within the actual arguments is
unspecified, but there is a sequence point before the actual call.
This is, however, only a part of the problem. Another thing (which is actually worse, since it involves undefined behavior) is:
6.5 Expressions
2/ Between the previous and next sequence point an object shall have
its stored value modified at most once by the evaluation of an
expression. Furthermore, the prior value shall be read only to
determine the value to be stored.
In our case the evaluation of all the arguments falls between only 2 sequence points: the previous ; and the point before the function is entered but after all the arguments have been evaluated. You'd better not write code like this.
The C standard is pretty relaxed in some places to leave room for optimizations that compilers might do.
The order in which the parameters to a function are passed is not defined in the standard, and is determined by the calling convention used by the compiler.
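I think in your case the cdecl calling convention (which many C compilers use for the x86 architecture) is used, in which arguments are pushed onto the stack from right to left; compilers using it often happen to evaluate them in that order too, but nothing guarantees this.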
This function call is undefined behavior:
printf("%d%d%d%d%d%d",i++,i--,++i,--i,i);
Modifying an object more than once between two sequence points is undefined behavior in C.
It is also undefined behavior because you have 6 conversion specifications but only 5 arguments for the format.
Two points:
Function arguments are evaluated in an unspecified order. This allows the compiler to optimize however it likes.
Your particular arguments invoke undefined behavior. You're not allowed to modify i multiple times before a sequence point.
The evaluation order of printf arguments is unspecified. It depends, among other things, on the calling convention of the system you are using. Moreover, this is also undefined behavior, because you are modifying i several times without any sequence point. BTW, there is a missing argument.

Resources